Re2
Although OCaml strings and C++ strings may legally contain internal null bytes, this library does not handle them correctly because it converts via C strings. The failure mode is that the search stops early, which is tolerable given how rare internal null bytes are in practice.
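To make the failure mode concrete, here is a minimal stdlib-only sketch of what a NUL-terminated view of an OCaml string contains. The helper as_c_string is hypothetical and is not part of Re2; it only illustrates why a search would stop early.

```ocaml
(* Hypothetical helper, not part of Re2: the prefix of [s] that a
   NUL-terminated C string conversion would expose. *)
let as_c_string (s : string) : string =
  match String.index_opt s '\000' with
  | Some i -> String.sub s 0 i (* everything after the first NUL is lost *)
  | None -> s

let () =
  let input = "abc\000def" in
  (* The OCaml string keeps all 7 bytes; the C-string view keeps only 3. *)
  Printf.printf "OCaml length: %d, C-string view length: %d\n"
    (String.length input)
    (String.length (as_c_string input))
```

Under this assumption, a pattern such as def would never be found in the input above, because the text after the null byte is not visible to the search.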
The strings are considered according to Options.encoding, which is UTF-8 by default (the alternative is ISO 8859-1).
val sexp_of_t : t -> Sexplib0.Sexp.t
include Core.Comparable.S_plain with type t := t
include Base.Comparable.S with type t := t
include Base.Comparisons.S with type t := t
ascending is identical to compare. descending x y = ascending y x. These are intended to be mnemonic when used like List.sort ~compare:ascending and List.sort ~compare:descending, since they cause the list to be sorted in ascending or descending order, respectively.
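A small sketch of the mnemonic, specialised to int and using only the OCaml standard library. Note the assumption: Stdlib's List.sort takes the comparison function positionally, whereas the Base version referred to above takes it as a labelled argument.

```ocaml
(* Stand-ins for the functions described above, specialised to int. *)
let ascending : int -> int -> int = compare
let descending x y = ascending y x

(* Stdlib's List.sort takes the comparison as its first positional argument. *)
let sorted_up = List.sort ascending [ 3; 1; 2 ]
let sorted_down = List.sort descending [ 3; 1; 2 ]

let () =
  List.iter (Printf.printf "%d ") sorted_up;
  print_newline ();
  List.iter (Printf.printf "%d ") sorted_down;
  print_newline ()
```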
clamp_exn t ~min ~max returns t', the closest value to t such that between t' ~low:min ~high:max is true.
Raises if not (min <= max).
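The contract of clamp_exn can be sketched for plain integers as follows. This is an illustration of the described behaviour only, not the library's code, and the choice of Invalid_argument as the exception is an assumption.

```ocaml
(* Sketch of the clamp_exn contract for int. The labelled arguments
   shadow Stdlib.min and Stdlib.max inside the function body. *)
let clamp_exn (t : int) ~(min : int) ~(max : int) : int =
  (* Assumption: the real function raises some exception here; the
     exact exception type is not stated in the text above. *)
  if min > max then invalid_arg "clamp_exn: requires min <= max";
  if t < min then min else if t > max then max else t

let () =
  Printf.printf "%d %d %d\n"
    (clamp_exn 5 ~min:0 ~max:3)
    (clamp_exn (-1) ~min:0 ~max:3)
    (clamp_exn 2 ~min:0 ~max:3)
```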
val clamp : t -> min:t -> max:t -> t Base.Or_error.t
include Base.Comparator.S with type t := t
val comparator : (t, comparator_witness) Base.Comparator.comparator
val validate_lbound : min:t Core.Maybe_bound.t -> t Validate.check
val validate_ubound : max:t Core.Maybe_bound.t -> t Validate.check
val validate_bound :
min:t Core.Maybe_bound.t ->
max:t Core.Maybe_bound.t ->
t Validate.check
module Replace_polymorphic_compare : sig ... end
module Map : sig ... end
module Set : sig ... end
include Core.Hashable.S_plain with type t := t
include Ppx_compare_lib.Comparable.S with type t := t
include Ppx_hash_lib.Hashable.S with type t := t
val hash_fold_t : Base.Hash.state -> t -> Base.Hash.state
val hash : t -> Base.Hash.hash_value
val hashable : t Base.Hashable.t
module Table : sig ... end
module Hash_set : sig ... end
module Hash_queue : sig ... end
type regex = t
Subpatterns are referenced by name if labelled with the /(?P<...>...)/ syntax, or else by counting open-parens, with subpattern zero referring to the whole regex.
index_of_id t id resolves subpattern names and indices into indices.
The sub keyword argument means: omit location information for subpatterns with index greater than sub.
Subpatterns are indexed by the number of opening parentheses preceding them:
~sub:(`Index 0): only the whole match
~sub:(`Index 1): the whole match and the first submatch, etc.
If you only care whether the pattern does match, you can request no location information at all by passing ~sub:(`Index -1).
With one exception, I quote from re2.h:443,
Don't ask for more match information than you will use: runs much faster with nmatch == 1 than nmatch > 1, and runs even faster if nmatch == 0.
For sub > 1, re2 executes in three steps:
1. run a DFA over the entire input to get the end of the whole match
2. run a DFA backward from the end position to get the start position
3. run an NFA from the match start to match end to extract submatches
sub == 1 lets it stop after (2) and sub == 0 lets it stop after (1). (See re2.cc:692 or so.)
The one exception is for the functions get_matches, replace, and Iterator.next: since they must iterate correctly through the whole string, they need at least the whole match (subpattern 0). These functions will silently rewrite ~sub to be non-negative.
module Options : sig ... end
See re2_c/libre2/re2/re2.h for documentation of these options.
val create : ?options:Options.t -> string -> t Core.Or_error.t
val num_submatches : t -> int
num_submatches t returns 1 + the number of open-parens in the pattern.
N.B. num_submatches t == 1 + RE2::NumberOfCapturingGroups() because RE2::NumberOfCapturingGroups() ignores the whole match ("subpattern zero").
val get_named_capturing_groups : t -> Core.Int.t Core.String.Map.t
get_named_capturing_groups t returns a map from names of capturing groups in t to their indices.
val pattern : t -> string
pattern t returns the pattern from which the regex was constructed.
val find_all : ?sub:id_t -> t -> string -> string list Core.Or_error.t
find_all t input is a convenience function that returns all non-overlapping matches of t against input, in left-to-right order.
If sub is given, and the requested subpattern did not capture, then no match is returned at that position even if other parts of the regex did match.
val find_first : ?sub:id_t -> t -> string -> string Core.Or_error.t
find_first ?sub pattern input finds the first match of pattern in input, and returns the subpattern specified by sub, or an error if the subpattern didn't capture.
val find_submatches : t -> string -> string option array Core.Or_error.t
find_submatches t input finds the first match and returns all submatches. Element 0 is the whole match and element 1 is the first parenthesized submatch, etc.
val find_submatches_exn : t -> string -> string option array
val matches : t -> string -> bool
matches pattern input
val matches_substring_no_context_exn :
t ->
string ->
pos:int ->
len:int ->
bool
Same as matches, except it only matches a substring, completely ignoring the surrounding context (i.e. treating the substring as if it were the full string). Raises if pos and len specify an invalid range (negative values, or a range that extends outside the string).
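The range condition can be sketched with the standard library alone. The helper substring_no_context_exn below is hypothetical and is not part of Re2: it validates pos and len as described and returns the slice that a matcher would then treat as its entire input.

```ocaml
(* Hypothetical helper, not part of Re2: validate [pos]/[len] and return
   the slice that would be matched in isolation. *)
let substring_no_context_exn (s : string) ~(pos : int) ~(len : int) : string =
  (* Written as [pos > length - len] so the check cannot overflow. *)
  if pos < 0 || len < 0 || pos > String.length s - len then
    invalid_arg "substring_no_context_exn: invalid range";
  String.sub s pos len

let () =
  (* The slice carries no trace of the surrounding "hello ". *)
  print_endline (substring_no_context_exn "hello world" ~pos:6 ~len:5)
```

Because only the slice is seen, anything in the pattern that depends on neighbouring text would presumably be judged against the slice's own boundaries rather than those of the original string.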
val split : ?max:int -> ?include_matches:bool -> t -> string -> string list
split pattern input
If pattern never matches, the returned list has input as its one element.
val rewrite : t -> template:string -> string -> string Core.Or_error.t
rewrite pattern ~template input is a convenience function for replace: instead of requiring an arbitrary transformation as a function, it accepts a template string with zero or more substrings of the form "\\n", each of which will be replaced by submatch n. For every match of pattern against input, the template will be specialized and then substituted for the matched substring.
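The specialization step can be sketched in plain OCaml. The function specialize_template below is hypothetical and is not part of Re2. It assumes single-digit references, and it assumes that a reference to a group that did not capture (or does not exist) expands to the empty string; the real library may treat such cases differently, for example by reporting an error.

```ocaml
(* Hypothetical helper, not part of Re2: expand "\\n" references in
   [template] using the submatches of one match. Element 0 of
   [submatches] is the whole match. *)
let specialize_template ~(template : string) (submatches : string option array)
    : string =
  let n = String.length template in
  let buf = Buffer.create n in
  let rec go i =
    if i < n then
      match template.[i] with
      | '\\' when i + 1 < n && template.[i + 1] >= '0' && template.[i + 1] <= '9'
        ->
        let idx = Char.code template.[i + 1] - Char.code '0' in
        (* Assumption: missing or non-capturing groups expand to "". *)
        (if idx < Array.length submatches then
           match submatches.(idx) with
           | Some s -> Buffer.add_string buf s
           | None -> ());
        go (i + 2)
      | c ->
        (* Any other character is copied through unchanged. *)
        Buffer.add_char buf c;
        go (i + 1)
  in
  go 0;
  Buffer.contents buf

let () =
  (* Swap two captured groups: whole match "ab", group 1 "a", group 2 "b". *)
  print_endline
    (specialize_template ~template:"\\2-\\1" [| Some "ab"; Some "a"; Some "b" |])
```

In a full rewrite, this expansion would be computed once per match and spliced in place of the matched substring.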
val rewrite_exn : t -> template:string -> string -> string
val valid_rewrite_template : t -> template:string -> bool
valid_rewrite_template pattern ~template returns true iff template is a valid rewrite template for pattern.
escape nonregex returns a copy of nonregex with everything escaped (i.e., if the return value were used as a regex, it would match exactly the original input).
module Infix : sig ... end
val sexp_of_without_trailing_none :
('a -> Sexplib0.Sexp.t) ->
'a without_trailing_none ->
Sexplib0.Sexp.t
val without_trailing_none : 'a -> 'a without_trailing_none
This type marks call sites affected by a bugfix that eliminated a trailing None. When you add this wrapper, check that your call site does not still work around the bug by dropping the last element.
module Match : sig ... end
val get_matches :
?sub:id_t ->
?max:int ->
t ->
string ->
Match.t list Core.Or_error.t
get_matches pattern input returns all non-overlapping matches of pattern against input.
val to_sequence_exn : ?sub:id_t -> t -> string -> Match.t Core.Sequence.t
val first_match : t -> string -> Match.t Core.Or_error.t
first_match pattern input
val replace :
?sub:id_t ->
?only:int ->
f:(Match.t -> string) ->
t ->
string ->
string Core.Or_error.t
replace ?sub ?only ~f pattern input
module Exceptions : sig ... end
module Multiple : sig ... end
module Stable : sig ... end
module Parser : sig ... end