String.Utf32be
UTF-32 big-endian encoding. See the Utf interface.
val t_sexp_grammar : t Sexplib0.Sexp_grammar.t
t_of_sexp and of_string will raise if the input is invalid in this encoding. See sanitize below to construct a valid t from arbitrary input.
include Identifiable.S with type t := t
val hash_fold_t : Hash.state -> t -> Hash.state
val hash : t -> Hash.hash_value
include Sexplib0.Sexpable.S with type t := t
val t_of_sexp : Sexplib0.Sexp.t -> t
val sexp_of_t : t -> Sexplib0.Sexp.t
include Comparable.S with type t := t
include Comparisons.S with type t := t
compare t1 t2 returns 0 if t1 is equal to t2, a negative integer if t1 is less than t2, and a positive integer if t1 is greater than t2.
ascending is identical to compare. descending x y = ascending y x. These are intended to be mnemonic when used like List.sort ~compare:ascending and List.sort ~compare:descending, since they cause the list to be sorted in ascending or descending order, respectively.
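The relationship between the two orderings can be sketched with the standard library alone. This is an illustrative model on int lists, not Base's implementation; note that Stdlib's List.sort takes the comparison as a positional argument, whereas Base's takes it as ~compare.

```ocaml
(* Stdlib-only model: [ascending] is [compare]; [descending] swaps the arguments. *)
let ascending : int -> int -> int = compare
let descending x y = ascending y x

(* Sorting with each comparison gives the order its name suggests. *)
let sorted_up = List.sort ascending [ 3; 1; 2 ]
let sorted_down = List.sort descending [ 3; 1; 2 ]
```

Here sorted_up is in increasing order and sorted_down is the same list reversed.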
clamp_exn t ~min ~max returns t', the closest value to t such that between t' ~low:min ~high:max is true.
Raises if not (min <= max).
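As a rough illustration of the clamping rule, the following stdlib-only sketch defines a hypothetical clamp_exn over any total order passed as ~compare. The explicit ~compare argument and the Invalid_argument exception are assumptions of this sketch; the real function uses the module's own comparison and may raise a different exception.

```ocaml
(* Hypothetical helper modelling [clamp_exn] for any total order [compare]. *)
let clamp_exn ~compare t ~min ~max =
  (* The bounds must be ordered; otherwise no value can lie between them. *)
  if compare min max > 0 then invalid_arg "clamp_exn: requires min <= max";
  if compare t min < 0 then min
  else if compare t max > 0 then max
  else t

let clamped_high = clamp_exn ~compare:Int.compare 15 ~min:0 ~max:10
let clamped_low = clamp_exn ~compare:Int.compare (-5) ~min:0 ~max:10
let unchanged = clamp_exn ~compare:Int.compare 5 ~min:0 ~max:10
```

Values already inside the range are returned as they are; values outside it are replaced by the nearer bound.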
val clamp : t -> min:t -> max:t -> t Or_error.t
include Comparator.S with type t := t
val comparator : (t, comparator_witness) Comparator.comparator
include Pretty_printer.S with type t := t
val pp : Formatter.t -> t -> unit
val hashable : t Hashable.t
Interpret t as a container of Unicode scalar values, rather than of ASCII characters. Indexes, length, etc. are with respect to Uchar.t.
include Indexed_container.S0_with_creators
with type t := t
and type elt = Uchar.t
include Container.S0_with_creators with type t := t with type elt = Uchar.t
type elt = Uchar.t
E.g., append (of_list [a; b]) (of_list [c; d; e]) is of_list [a; b; c; d; e].
Concatenates a nested container. The elements of the inner containers are concatenated together in order to give the result.
map f (of_list [a1; ...; an]) applies f to a1, a2, ..., an, in order, and builds a result equivalent to of_list [f a1; ...; f an].
filter t ~f returns all the elements of t that satisfy the predicate f.
filter_map t ~f applies f to every x in t. The result contains every y for which f x returns Some y.
partition_tf t ~f returns a pair t1, t2, where t1 is all elements of t that satisfy f, and t2 is all elements of t that do not satisfy f. The "tf" suffix is mnemonic to remind readers that the result is (trues, falses).
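The behaviour of filter_map and partition_tf can be illustrated by modelling the elements of t as a plain list of Uchar.t values. This sketch uses only the standard library (List.partition and List.filter_map stand in for the functions above); the names elements and is_digit are hypothetical.

```ocaml
(* Model the elements of [t] as a plain list of scalar values. *)
let elements = List.map Uchar.of_char [ 'a'; '1'; 'b'; '2' ]

(* Hypothetical predicate: is this scalar value an ASCII digit? *)
let is_digit u =
  let c = Uchar.to_int u in
  c >= Char.code '0' && c <= Char.code '9'

(* [List.partition] plays the role of [partition_tf]: (trues, falses). *)
let digits, others = List.partition is_digit elements

(* [List.filter_map] keeps each [y] for which the function returns [Some y]. *)
let digit_values =
  List.filter_map
    (fun u -> if is_digit u then Some (Uchar.to_int u - Char.code '0') else None)
    elements
```

Both results preserve the original element order.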
include Container.S0 with type t := t with type elt := elt
val is_empty : t -> bool
iter must allow exceptions raised in f to escape, terminating the iteration cleanly. The same holds for all functions below taking an f.
fold t ~init ~f returns f (... f (f (f init e1) e2) e3 ...) en, where e1..en are the elements of t.
fold_result t ~init ~f is a short-circuiting version of fold that runs in the Result monad. If f returns an Error _, that value is returned without any additional invocations of f.
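The short-circuiting behaviour of fold_result can be sketched on ordinary lists with the standard library's result type. The function below is a hypothetical model, not Base's implementation; the calls counter exists only to show that f is not invoked again after the first Error.

```ocaml
(* Hypothetical list-based model of [fold_result]: stop at the first [Error]. *)
let rec fold_result list ~init ~f =
  match list with
  | [] -> Ok init
  | x :: rest ->
    (match f init x with
     | Error _ as e -> e
     | Ok acc -> fold_result rest ~init:acc ~f)

(* Count invocations of [f] to observe the short circuit. *)
let calls = ref 0

let sum_nonneg =
  fold_result [ 1; 2; -3; 4 ] ~init:0 ~f:(fun acc x ->
    incr calls;
    if x < 0 then Error x else Ok (acc + x))
```

In this run the fold stops at the negative element, so the trailing 4 is never passed to f.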
val fold_until :
t ->
init:'acc ->
f:('acc -> elt -> ('acc, 'final) Container.Continue_or_stop.t) ->
finish:('acc -> 'final) ->
'final
fold_until t ~init ~f ~finish is a short-circuiting version of fold. If f returns Stop _, the computation ceases and results in that value. If f returns Continue _, the fold will proceed. If f never returns Stop _, the final result is computed by finish.
Example:
type maybe_negative =
  | Found_negative of int
  | All_nonnegative of { sum : int }

(** [first_neg_or_sum list] returns the first negative number in [list], if any,
    otherwise returns the sum of the list. *)
let first_neg_or_sum =
  List.fold_until ~init:0
    ~f:(fun sum x ->
      if x < 0
      then Stop (Found_negative x)
      else Continue (sum + x))
    ~finish:(fun sum -> All_nonnegative { sum })
;;

let x = first_neg_or_sum [1; 2; 3; 4; 5]
val x : maybe_negative = All_nonnegative {sum = 15}

let y = first_neg_or_sum [1; 2; -3; 4; 5]
val y : maybe_negative = Found_negative (-3)
Returns true if and only if there exists an element for which the provided function evaluates to true. This is a short-circuiting operation.
Returns true if and only if the provided function evaluates to true for all elements. This is a short-circuiting operation.
Returns the number of elements for which the provided function evaluates to true.
val sum :
(module Container.Summable with type t = 'sum) ->
t ->
f:(elt -> 'sum) ->
'sum
Returns the sum of f i for all i in the container.
Returns as an option the first element for which f evaluates to true.
Returns the first evaluation of f that returns Some, and returns None if there is no such element.
Returns a min (resp. max) element from the collection using the provided compare function. In case of a tie, the first element encountered while traversing the collection is returned. The implementation uses fold so it has the same complexity as fold. Returns None iff the collection is empty.
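The tie-breaking rule for the min element can be sketched with a fold over a plain list. This is a hypothetical stdlib-only model; the strict comparison (< 0) is what makes the earlier of two equal elements win.

```ocaml
(* Hypothetical fold-based model of [min_elt]; ties keep the earlier element. *)
let min_elt list ~compare =
  List.fold_left
    (fun best x ->
      match best with
      | None -> Some x
      (* Replace the current best only when [x] is strictly smaller. *)
      | Some b -> if compare x b < 0 then Some x else best)
    None list

(* Compare pairs by their first component only, so (0, "b") and (0, "c") tie. *)
let by_fst (x, _) (y, _) = Int.compare x y
let smallest = min_elt [ (1, "a"); (0, "b"); (0, "c") ] ~compare:by_fst
let nothing = min_elt [] ~compare:Int.compare
```

Of the two tied pairs, the one that appears first in the list is returned, and an empty list yields None.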
These are all like their equivalents in Container except that an index starting at 0 is added as the first argument to f.
init n ~f is equivalent to of_list [f 0; f 1; ...; f (n-1)]. It raises an exception if n < 0.
mapi is like map. Additionally, it passes in the index of each element as the first argument to the mapped function.
filter_mapi is like filter_map. Additionally, it passes in the index of each element as the first argument to the mapped function.
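The indexed variants can be illustrated on plain lists with the standard library. List.init and List.mapi are Stdlib functions with positional arguments; filter_mapi below is a hypothetical helper written for this sketch, since Stdlib has no function of that name.

```ocaml
(* [List.init n f] builds [f 0; f 1; ...; f (n-1)]. *)
let squares = List.init 3 (fun i -> i * i)

(* [List.mapi] passes each element's index, starting at 0, before the element. *)
let indexed = List.mapi (fun i c -> (i, c)) [ 'x'; 'y' ]

(* Hypothetical list-based model of [filter_mapi], built from [List.mapi]. *)
let filter_mapi list ~f = List.mapi f list |> List.filter_map Fun.id

(* Keep only the elements at even indices, paired with their index. *)
let even_positions =
  filter_mapi [ 'a'; 'b'; 'c' ] ~f:(fun i c ->
    if i mod 2 = 0 then Some (i, c) else None)
```

In the module documented here the index counts Uchar.t elements, not bytes.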
val to_sequence : t -> Uchar.t Sequence.t
Produces a sequence of Unicode scalar values.
val sanitize : string -> t
Create a t from a string by replacing any byte sequences that are invalid in this encoding with Uchar.replacement_char. This can be used to decode strings that may be encoded incorrectly.
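To make the idea of sanitizing concrete, here is a stdlib-only sketch for UTF-32BE. It is a model under stated assumptions, not Base's implementation: it assumes that each 4-byte unit that is not a Unicode scalar value, and any trailing partial unit, is replaced by a single U+FFFD. The names unit_at, add_utf32be and sanitize_model are hypothetical, and the arithmetic assumes 64-bit OCaml integers.

```ocaml
(* Read the big-endian 32-bit unit starting at byte [pos]. *)
let unit_at s pos =
  (Char.code s.[pos] lsl 24)
  lor (Char.code s.[pos + 1] lsl 16)
  lor (Char.code s.[pos + 2] lsl 8)
  lor Char.code s.[pos + 3]

(* Append [u] to [buf] as four bytes, most significant byte first. *)
let add_utf32be buf u =
  let n = Uchar.to_int u in
  List.iter
    (fun shift -> Buffer.add_char buf (Char.chr ((n lsr shift) land 0xFF)))
    [ 24; 16; 8; 0 ]

(* Hypothetical model of [sanitize]. Assumption: each invalid 4-byte unit, and
   any trailing partial unit, becomes one U+FFFD (Stdlib's [Uchar.rep]). *)
let sanitize_model s =
  let len = String.length s in
  let buf = Buffer.create len in
  let pos = ref 0 in
  while !pos + 4 <= len do
    let n = unit_at s !pos in
    add_utf32be buf (if Uchar.is_valid n then Uchar.of_int n else Uchar.rep);
    pos := !pos + 4
  done;
  if !pos < len then add_utf32be buf Uchar.rep;
  Buffer.contents buf
```

Valid input passes through unchanged, while a surrogate code point such as 0xD800 is replaced, because surrogates are not scalar values.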
Decodes the Unicode scalar value at the given byte index in this encoding. Raises if byte_pos does not refer to the start of a Unicode scalar value.
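In UTF-32BE every scalar value occupies exactly four bytes, so a valid start position is a multiple of 4. The following stdlib-only sketch models decoding at a byte index under that observation; get_model is a hypothetical name, and raising Invalid_argument is an assumption of the sketch rather than the documented exception.

```ocaml
(* Hypothetical model of decoding at a byte index in UTF-32BE. *)
let get_model s ~byte_pos =
  (* A scalar value can only start at a multiple of 4 with 4 bytes available. *)
  if byte_pos < 0 || byte_pos mod 4 <> 0 || byte_pos + 4 > String.length s
  then invalid_arg "get_model: byte_pos is not the start of a scalar value";
  let n =
    (Char.code s.[byte_pos] lsl 24)
    lor (Char.code s.[byte_pos + 1] lsl 16)
    lor (Char.code s.[byte_pos + 2] lsl 8)
    lor Char.code s.[byte_pos + 3]
  in
  (* [Uchar.of_int] itself raises [Invalid_argument] on a non-scalar value. *)
  Uchar.of_int n

(* "A" followed by U+03B1 (Greek small letter alpha), encoded as UTF-32BE. *)
let second = get_model "\x00\x00\x00\x41\x00\x00\x03\xB1" ~byte_pos:4
```

Asking for byte_pos 2 in the same string would raise, since that index falls in the middle of the first scalar value.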
val of_string_unchecked : string -> t
Creates a t without sanitizing or validating the string. Other functions in this interface may raise or produce unpredictable results if the string is invalid in this encoding.
Similar to String.split, but splits on a Uchar.t in t. If you want to split on a char, first convert it with Uchar.of_char, but note that the actual byte(s) on which t is split may not be the same as the char byte, depending on both char and the encoding of t. For example, splitting on 'α' in UTF-8 or on '\n' in UTF-16 is actually splitting on a 2-byte sequence.
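For the UTF-32BE encoding specifically, the separator is always a 4-byte sequence. The stdlib-only sketch below shows the bytes that stand for '\n'; utf32be_of_uchar is a hypothetical helper, not part of the interface documented here.

```ocaml
(* Encode one scalar value as UTF-32BE: four bytes, most significant first. *)
let utf32be_of_uchar u =
  let n = Uchar.to_int u in
  String.init 4 (fun i -> Char.chr ((n lsr (8 * (3 - i))) land 0xFF))

(* The single char '\n' becomes three zero bytes followed by 0x0A. *)
let newline_bytes = utf32be_of_uchar (Uchar.of_char '\n')
```

Searching a UTF-32BE string for the lone byte 0x0A would therefore be incorrect: the byte can also occur inside the encoding of other scalar values, such as U+0A00.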
val length_in_uchars : t -> int
Counts the number of Unicode scalar values in t.
This function is not a good proxy for display width, as some scalar values have display widths > 1. Many native applications such as terminal emulators use wcwidth (see man 3 wcwidth) to compute the display width of a scalar value. See the uucp library's Uucp.Break.tty_width_hint for an implementation of wcwidth's logic. However, this is merely best-effort, as display widths will vary based on the font and underlying text shaping engine (see docs on tty_width_hint for details).
For applications that support grapheme clusters (many terminal emulators do not), t should first be split into grapheme clusters, and the display width of each cluster then needs to be computed (which is the max display width of the scalars that are in the cluster).
There are some active efforts to improve the current state of affairs:
val length : t -> int
length could be misinterpreted as counting bytes. We direct users to other, clearer options.
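The ambiguity is easy to see for UTF-32BE, where the byte count is four times the scalar-value count. A minimal stdlib-only sketch, assuming the input is already valid UTF-32BE; length_in_uchars_model is a hypothetical name:

```ocaml
(* Assumption: [s] is valid UTF-32BE, so every scalar value occupies 4 bytes. *)
let length_in_uchars_model s = String.length s / 4

(* "hi" encoded as UTF-32BE: two scalar values, eight bytes. *)
let hi = "\x00\x00\x00\x68\x00\x00\x00\x69"
let byte_count = String.length hi
let uchar_count = length_in_uchars_model hi
```

The two counts differ by a factor of four, which is why a bare length is easy to misread.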