A negative return value doesn't make sense for any of those functions.
The return types were inherited from POSIX, where they also need to have
an indicator for an error (negative values).
The Unicode spec defines much more complicated caseless matching
algorithms in its Collation spec. This implements the "basic" case
folding comparison.
Case folding rules have a similar mapping style as special casing rules,
where one code point may map to zero or more case folding rules. These
will be used for case-insensitive string comparisons. To see how case
folding can differ from other casing rules, consider "ß" (U+00DF):
>>> "ß".lower()
'ß'
>>> "ß".upper()
'SS'
>>> "ß".title()
'Ss'
>>> "ß".casefold()
'ss'
This implements FileWatcher using inotify filesystem events. Serenity's
InodeWatcher is remarkably similar to inotify, so this is almost an
identical implementation.
The existing TestLibCoreFileWatcher test is added to Lagom (currently
just for Linux).
This does not implement BlockingFileWatcher as that is currently not
used anywhere but on Serenity.
The existing `is_i32()` and friends only check if `i32` is their
internal type, but a value such as `0` could be literally any integer
type internally. `is_integer<T>()` instead determines whether the
contained value is an integer and can fit inside T.
This saves us an actual seek and rereading already stored buffer data in
cases where the seek is entirely covered by the currently buffered data.
This is especially important since we implement `discard` using `seek`
for seekable streams.
Unicode declares that to titlecase a string, the first cased code point
after each word boundary should be transformed to its titlecase mapping.
All other codepoints are transformed to their lowercase mapping.
This error was introduced by 9a7accdd and had a significant impact on
`BufferedFile` behavior. Hence, we started seeing crash in test262.
By itself, the issue was a wrong calculation of the internal reading
spans when using the `read` and `until` parameters. Which can lead to
at worse crash in VERIFY and at least weird behaviors as missed needles
or detections out of bounds.
It was also accompanied by an erroneous test.
This patch fixes the bug, the test and also provides more tests.
This parameter allows to start searching after an offset. For example,
to resume a search.
It is unfortunately a breaking change in API so this patch also modifies
one user and one test.
This implements a FlyString that will de-duplicate String instances. The
FlyString will store the raw encoded data of the String instance: If the
String is a short string, FlyString holds the String::ShortString bytes;
otherwise FlyString holds a pointer to the Detail::StringData.
FlyString itself does not know about String's storage or how to refcount
its Detail::StringData. It defers to String to implement these details.
Also add some tests that ensure that the input and output streams match
each other, because I can't wrap my head around what the internal
representation looks like.
Since AK can't refer to LibUnicode directly, the strategy here is that
if you need case transformations, you can link LibUnicode and receive
them. If you try to use either of these methods without linking it, then
you'll of course get a linker error (note we don't do any fallbacks to
e.g. ASCII case transformations). If you don't need these methods, you
don't have to link LibUnicode.
DeprecatedFlyString relies heavily on DeprecatedString's StringImpl, so
let's rename it to A) match the name of DeprecatedString, B) write a new
FlyString class that is tied to String.
This makes construction of Utf16String fallible in OOM conditions. The
immediate impact is that PrimitiveString must then be fallible as well,
as it may either transcode UTF-8 to UTF-16, or create a UTF-16 string
from ropes.
There are a couple of places where it is very non-trivial to propagate
the error further. A FIXME has been added to those locations.
This allows us to make all comparision operators on the class constexpr
without pulling in a bunch of boilerplate. We don't use the `<compare>`
header because it doesn't compile in the main serenity cross-build due
to the include paths to LibC being incompatible with how libc++ expects
them to be for clang builds.
Simplify a lot of uses of ElapsedTimer by converting the callers to
elapsed_time from elapsed, as the AK::Time returned is better for unit
conversions and comparisons against constants.
Previously, if a pattern matched the empty string (e.g. ".*"), it would
match the string twice instead of once. Among other issues, this caused
a Regex replacement to duplicate its expected output, since it would
replace "both" empty matches.
These instances were detected by searching for files that include
stdlib.h, but don't match the regex:
\\b(_abort|abort|abs|aligned_alloc|arc4random|arc4random_buf|arc4random_
uniform|atexit|atof|atoi|atol|atoll|bsearch|calloc|clearenv|div|div_t|ex
it|_Exit|EXIT_FAILURE|EXIT_SUCCESS|free|getenv|getprogname|grantpt|labs|
ldiv|ldiv_t|llabs|lldiv|lldiv_t|malloc|malloc_good_size|malloc_size|mble
n|mbstowcs|mbtowc|mkdtemp|mkstemp|mkstemps|mktemp|posix_memalign|posix_o
penpt|ptsname|ptsname_r|putenv|qsort|qsort_r|rand|RAND_MAX|random|reallo
c|realpath|secure_getenv|serenity_dump_malloc_stats|serenity_setenv|sete
nv|setprogname|srand|srandom|strtod|strtof|strtol|strtold|strtoll|strtou
l|strtoull|system|unlockpt|unsetenv|wcstombs|wctomb)\\b
(Without the linebreaks.)
This regex is pessimistic, so there might be more files that don't
actually use anything from the stdlib.
In theory, one might use LibCPP to detect things like this
automatically, but let's do this one step after another.