module Scanf: sig .. end
Formatted input functions.
module Scanning: sig .. end
exception Scan_failure of string
val bscanf : Scanning.scanbuf ->
('a, Scanning.scanbuf, 'b) format -> 'a -> 'b
bscanf ib format f reads tokens from the scanning buffer ib according
to the format string format, converts these tokens to values, and
applies the function f to these values.
The result of this application of f is the result of the whole construct.
For instance, if p is the function fun s i -> i + 1, then
Scanf.sscanf "x = 1" "%s = %i" p returns 2.
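The example above can be checked directly. The following is a minimal sketch: the name p comes from the text, while result is an illustrative name, and the unused string argument is written _s only to avoid an unused-variable warning.

```ocaml
(* p ignores the scanned string and increments the scanned integer. *)
let p (_s : string) (i : int) : int = i + 1

(* "%s" reads the token "x", " = " matches the literal text,
   and "%i" reads the integer 1; p is then applied to both values. *)
let result : int = Scanf.sscanf "x = 1" "%s = %i" p

let () = Printf.printf "%d\n" result
```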
Raises Scanf.Scan_failure if the given input does not match the format.
Raises Failure if a conversion to a number is not possible.
Raises End_of_file if the end of input is encountered while scanning
and the input matches the given format so far.
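A caller will usually want to handle all three exceptions. The sketch below wraps Scanf.sscanf in a hypothetical helper, scan_int, that maps each exception to a constructor of an illustrative outcome type; neither name is part of the Scanf module. Which exception a given malformed number triggers may depend on the library version, so the handler covers every case rather than relying on one.

```ocaml
(* Hypothetical result type, used only for this illustration. *)
type outcome =
  | Value of int          (* scanning succeeded *)
  | Mismatch of string    (* Scanf.Scan_failure was raised *)
  | Bad_number of string  (* Failure was raised *)
  | Eof                   (* End_of_file was raised *)

(* Scan one decimal integer from s and classify the result. *)
let scan_int (s : string) : outcome =
  try Value (Scanf.sscanf s "%d" (fun i -> i)) with
  | Scanf.Scan_failure msg -> Mismatch msg
  | Failure msg -> Bad_number msg
  | End_of_file -> Eof
```

With this helper, an input such as "abc" is reported as a mismatch, and an empty input is reported as end of input.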
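The format is a character string which contains three types of objects:
plain characters, which are simply matched with the characters of the
input; conversion specifications, each of which causes reading and
conversion of one argument for f; and scanning indications, which
specify boundaries of tokens.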
Conversion specifications consist of the % character, followed by
an optional flag, an optional field width, and one or two conversion
characters. The conversion characters and their meanings are:
d: reads an optionally signed decimal integer.
i: reads an optionally signed integer (the usual input formats for
  hexadecimal (0x[d]+ and 0X[d]+), octal (0o[d]+), and binary (0b[d]+)
  notations are understood).
u: reads an unsigned decimal integer.
x or X: reads an unsigned hexadecimal integer.
o: reads an unsigned octal integer.
s: reads a string argument (by default strings end with a space).
S: reads a delimited string argument (delimiters and special
  escaped characters follow the lexical conventions of Caml).
c: reads a single character. To test the current input character
  without reading it, specify a null field width, i.e. use
  specification %0c. Raises Invalid_argument if the field width
  specification is greater than 1.
C: reads a single delimited character (delimiters and special
  escaped characters follow the lexical conventions of Caml).
f, e, E, g, G: reads an optionally signed floating-point number in
  decimal notation, in the style dddd.ddd e/E+-dd.
F: reads a floating-point number according to the lexical
  conventions of Caml (hence the decimal point is mandatory if the
  exponent part is not mentioned).
B: reads a boolean argument (true or false).
b: reads a boolean argument (for backward compatibility; do not use
  in new programs).
ld, li, lu, lx, lX, lo: reads an int32 argument according to
  the format specified by the second letter (decimal, hexadecimal, etc).
nd, ni, nu, nx, nX, no: reads a nativeint argument according to
  the format specified by the second letter.
Ld, Li, Lu, Lx, LX, Lo: reads an int64 argument according to
  the format specified by the second letter.
[ range ]: reads characters that match one of the characters
  mentioned in the range of characters range (or not mentioned in
  it, if the range starts with ^). Returns a string that can be
  empty, if no character in the input matches the range. Hence,
  [0-9] returns a string representing a decimal number or an empty
  string if no decimal digit is found.
  If a closing bracket appears in a range, it must occur as the
  first character of the range (or just after the ^ in case of
  range negation); hence []] matches a ] character and
  [^]] matches any character that is not ].
l: applies f to the number of lines read so far.
n: applies f to the number of characters read so far.
N: applies f to the number of tokens read so far.
!: matches the end of input condition.
%: matches one % character in the input.
After the % character introducing a conversion, there may be
the special flag _: the conversion that follows occurs as usual,
but the resulting value is discarded.
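A few of the conversions listed above can be sketched as follows. The input strings and the names mixed, digits_then_rest and second_field are illustrative assumptions, not part of the module.

```ocaml
(* Several conversions in one format: %i accepts the 0x prefix,
   %d reads a decimal integer, %B a boolean, %f a float. *)
let mixed : int * int * bool * float =
  Scanf.sscanf "0x1F 42 true 3.5" "%i %d %B %f"
    (fun a b c d -> (a, b, c, d))

(* A range conversion returns the longest prefix made of the listed
   characters, which may be the empty string. *)
let digits_then_rest (s : string) : string * string =
  Scanf.sscanf s "%[0-9]%s" (fun digits rest -> (digits, rest))

(* The _ flag scans a token as usual but does not pass it to f. *)
let second_field : int = Scanf.sscanf "id 7" "%_s %d" (fun n -> n)
```

Here digits_then_rest "2024-rest" yields the pair ("2024", "-rest"), while an input with no leading digit yields an empty first component.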
The field width is an optional integer literal indicating the maximal
width of the token to read.
For instance, %6d reads an integer having at most 6 decimal digits,
and %4f reads a float with at most 4 characters.
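The effect of a field width is easiest to see when the input is longer than the width. In this small sketch the input strings and the names split_digits and short_float are assumptions chosen for illustration.

```ocaml
(* %6d stops after six digits, so the remaining digits are left
   for the next conversion. *)
let split_digits : int * int =
  Scanf.sscanf "123456789" "%6d%d" (fun a b -> (a, b))

(* %4f reads at most four characters of the float token; the
   following %s picks up whatever is left. *)
let short_float : float * string =
  Scanf.sscanf "3.14159" "%4f%s" (fun x rest -> (x, rest))
```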
Scanning indications appear just after the string conversions s and
[ range ] to delimit the end of the token. A scanning indication is
introduced by a @ character, followed by some constant character c.
It means that the string token should end just before the next
matching c (which is skipped). If no c character is encountered, the
string token extends as far as possible. For instance, "%s@\t" reads
a string up to the next tab character. If a scanning indication @c
does not follow a string conversion, it is ignored and treated as a
plain c character.
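As an illustration of scanning indications, the sketch below splits a token at a chosen delimiter. The inputs and the names key_value and name_and_age are hypothetical.

```ocaml
(* "%s@:" reads up to the next ':' and skips it; the second %s
   reads the rest of the token. *)
let key_value : string * string =
  Scanf.sscanf "key:value" "%s@:%s" (fun k v -> (k, v))

(* With "@\t" as the indication, the string token ends just before
   the next tab character, so it can contain spaces. *)
let name_and_age : string * int =
  Scanf.sscanf "John Smith\t42" "%s@\t%d" (fun name age -> (name, age))
```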
Notes:
The syntax of Scanf format strings differs slightly from that used by
  the Printf module. However, scanning indications are similar to those
  of the Format module; hence, when producing formatted text to be
  scanned by Scanf.bscanf, it is wise to use printing functions
  from Format (or, if you need to use functions from Printf,
  banish or carefully double check the format strings that contain
  '@' characters).
'_' characters may appear inside numbers (this is reminiscent of the
  usual Caml conventions). If stricter scanning is desired, use the
  range conversion facility instead of the number conversions.
The scanf facility is not intended for heavy duty lexical
  analysis and parsing. If it appears not expressive enough for your
  needs, several alternatives exist: regular expressions (module
  Str), stream parsers, ocamllex-generated lexers,
  ocamlyacc-generated parsers.
val fscanf : in_channel ->
('a, Scanning.scanbuf, 'b) format -> 'a -> 'b
Same as Scanf.bscanf, but reads its input from the given channel.
Warning: since all scanning functions operate from a scanning
buffer, be aware that each fscanf invocation must allocate a new
fresh scanning buffer (unless the program makes careful use of
partial evaluation). Hence, some characters may seem to be skipped
(in fact they are pending in the previously used buffer). This
happens in particular when calling fscanf again after a scan
involving a format that requires some look-ahead (such as a format
that ends by skipping whitespace in the input).
To avoid confusion, consider using bscanf with an explicitly
created scanning buffer. Use for instance Scanning.from_file f
to allocate the scanning buffer reading from file f.
This method is not only clearer, it is also faster, since scanning
buffers to files are optimized for fast buffered reading.
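The recommendation above can be sketched as a loop that reuses one explicitly created scanning buffer. To stay self-contained, this sketch builds the buffer with Scanf.Scanning.from_string, which is not mentioned in the text above; a buffer obtained from Scanning.from_file would be passed to the same function. The names read_ints and numbers are hypothetical.

```ocaml
(* Read every whitespace-separated integer from one scanning buffer.
   The same buffer is reused for each call, so characters read ahead
   by one call remain available to the next. *)
let read_ints ib : int list =
  let rec loop acc =
    let next =
      try Some (Scanf.bscanf ib " %d" (fun n -> n))
      with End_of_file -> None
    in
    match next with
    | Some n -> loop (n :: acc)
    | None -> List.rev acc
  in
  loop []

(* A string-backed buffer stands in for a file here. *)
let numbers : int list =
  read_ints (Scanf.Scanning.from_string "10 20\n30\n")
```

The leading space in " %d" skips any whitespace, including newlines, before each integer, so the loop ends only when End_of_file is raised.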
val sscanf : string -> ('a, Scanning.scanbuf, 'b) format -> 'a -> 'b
Same as Scanf.bscanf, but reads its input from the given string.
val scanf : ('a, Scanning.scanbuf, 'b) format -> 'a -> 'b
Same as Scanf.bscanf, but reads from the predefined scanning
buffer Scanf.Scanning.stdib that is connected to stdin.
val kscanf : Scanning.scanbuf ->
(Scanning.scanbuf -> exn -> 'a) ->
('b, Scanning.scanbuf, 'a) format -> 'b -> 'a
Same as Scanf.bscanf, but takes an additional function argument
ef that is called in case of error: if the scanning process or
some conversion fails, the scanning function aborts and applies the
error handling function ef to the scanning buffer and the
exception that aborted the scanning process.
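One possible use of kscanf is to substitute a default value when scanning fails. In the sketch below, int_or_default is a hypothetical helper, and the buffer is created with Scanf.Scanning.from_string, which the text above does not mention.

```ocaml
(* Hypothetical helper: scan an integer from s, or return default
   when scanning fails for any reason. *)
let int_or_default (default : int) (s : string) : int =
  let ib = Scanf.Scanning.from_string s in
  (* ef receives the scanning buffer and the exception; both are
     ignored here and the default is returned instead. *)
  let ef _ib _exn = default in
  Scanf.kscanf ib ef "%d" (fun n -> n)
```

A handler that needs to report the error could inspect the exception argument rather than discarding it.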