Previously, it ignored 'start', sorting from the array's
beginning. This caused unintended changes and slower
performance. Fix ensures sorting stays within 'start'
and 'end' indices only.
C++ will jovially select the implicit conversion operator, even if it's
complete bogus, such as for unknown-size types or non-destructible
types. Therefore, all such conversions (which incur a copy) must
(unfortunately) be explicit so that non-copyable types continue to work.
NOTE: We make an exception for trivially copyable types, since they
are, well, trivially copyable.
Co-authored-by: kleines Filmröllchen <filmroellchen@serenityos.org>
This makes them trivially copyable/movable, silencing
> "parameter is copied for each invocation"
warnings on `Optional<T&>`, which are unnecessairy,
since `Optional<T&>` is just a trivially copyable pointer.
This creates a slight change in behaviour when moving out of an
`Optional<T&>`, since the moved-from optional no longer gets cleared.
Moved-from values should be considered to be in an undefined state,
and if clearing a moved-from `Optional<T&>` is desired, you should be
using `Optional<T&>::release_value()` instead.
There was an existing check to ensure that `U` was an lvalue reference,
but when this check fails, overload resolution will just move right on
to the copy asignment operator, which will cause the temporary to be
assigned anyway.
Disallowing `Optional<T&>`s to be created from temporaries entirely
would be undesired, since existing code has valid reasons for creating
`Optional<T&>`s from temporaries, such as for function call arguments.
This fix explicitly deletes the `Optional::operator=(U&&)` operator,
so overload resolution stops.
Before this change, a StringView with a character-data pointer would
never compare as equal to one with a null pointer, even if they were
both length 0. This could happen for example if one is
default-initialized, and the other is created as a substring.
These don't have to worry about the input not being valid UTF-8 and
so can be infallible (and can even return self if no changes needed.)
We use this instead of Infra::to_ascii_{upper,lower}_case in LibWeb.
First, this isn't actually helpful, as we no longer store 32-bit values
in JsonValue. They are stored as 64-bit values anyways.
But more imporatantly, there was a bug here when trying to coerce an i64
to an i32. All negative values were cast to an i32, without checking if
the value is below NumericLimits<i32>::min.
Some callers (LibJS) will want to control the size of the output buffer,
to decode up to a maximum length. They will also want to receive partial
results in the case of an error. This patch adds a method to provide
those capabilities, and makes the existing implementation use it.
This requires pulling in some of the STL, but the result is that our
iterator is now STL Approved ™️ and our containers can be
auto-conformed to Swift protocols.
This isn't an issue now because this is only invoked from a macro that
is expanded within this file. But in an upcoming commit, it will be
invoked from a helper function in the Test namespace. At that point, the
compiler complains about the comparitor not being found (and helpfully
indicates we should move this one to the AK namespace to allow ADL to
succeed).
Otherwise, the following code would not compile:
constexpr Array<int, 3> array { 4, 5, 6 };
Vector<int> vector { 4, 5, 6 };
if (array == vector.span()) { }
We do such comparisons in tests quite a bit. But it currently doesn't
become an issue because of the way EXPECT_EQ copies its input parameters
to non-const locals. In a future patch, that copying will be removed,
and the compiler would otherwise complain about not finding a suitable
comparison operator.
The underlying CPU-specific instructions for operating on UTF-16 strings
behave differently for null inputs. Add an explicit check for this state
for consistency.
The underlying CPU-specific instructions for operating on UTF-8 strings
behave differently for null inputs. Add an explicit check for this state
for consistency.
Utf16View currently assumes host endianness. Add support for specifying
either big or little endianness (which we mostly just pipe through to
simdutf). This will allow using simdutf facilities with LibTextCodec.
The one behavior difference is that we will now actually fail on invalid
code units with Utf16View::to_utf8(AllowInvalidCodeUnits::No). It was
arguably a bug that this wasn't already the case.
We currently have 2 base64 coders: one in AK, another in LibWeb for a
"forgiving" implementation. ECMA-262 has an upcoming proposal which will
require a third implementation.
Instead, let's use the base64 implementation that is used by Node.js and
recommended by the upcoming proposal. It handles forgiving decoding as
well.
Our users of AK's implementation should be fine with the forgiving
implementation. The AK impl originally had naive forgiving behavior, but
that was removed solely for performance reasons.
Using http://mattmahoney.net/dc/enwik8.zip (100MB unzipped) as a test,
performance of our old home-grown implementations vs. the simdutf
implementation (on Linux x64):
Encode Decode
AK base64 0.226s 0.169s
LibWeb base64 N/A 1.244s
simdutf 0.161s 0.047s
There are a couple of differences here due to using ICU:
1. Titlecasing behaves slightly differently. We previously transformed
"123dollars" to "123Dollars", as we would use word segmentation to
split a string into words, then transform the first cased character
to titlecase. ICU doesn't go quite that far, and leaves the string
as "123dollars". While this is a behavior change, the only user of
this API is the `text-transform: capitalize;` CSS rule, and we now
match the behavior of other browsers.
2. There isn't an API to compare strings with case insensitivity without
allocating case-folded strings for both the left- and right-hand-side
strings. Our implementation was previously allocation-free; however,
in a benchmark, ICU is still ~1.4x faster.