Commit graph

3609 commits

Author SHA1 Message Date
Dan Klishch
2b65b62c6e AK: Use bit_cast in SIMDExtras.h/AK::Detail::byte_reverse_impl
This necessitates marking bit_cast as ALWAYS_INLINE since emitting it as
a function call there will create an unnecessary potential SSE
registers -> plain registers/memory round-trip.
2024-07-12 18:30:07 -04:00
Dan Klishch
56b7f9e404 Meta: Globally disable -Wpsabi
This warning is triggered when one accepts or returns vectors from a
function (that is not marked with [[gnu::target(...)]]) which would have
been otherwise passed in register if the current translation unit had
been compiled with more permissive flags wrt instruction selection (i.
e. if one adds -mavx2 to cmdline). This will never be a problem for us
since we (a) never use different instruction selection options across
ABI boundaries; (b) most of the affected functions are actually
TU-local.

Moreover, even if we somehow properly annotated all of the SIMD helpers,
calling them across ABI (or target) boundaries would still be very
dangerous because of inconsistent and bogus handling of
[[gnu::target(...)]] across compilers. See
https://github.com/llvm/llvm-project/issues/64706 and
https://www.reddit.com/r/cpp/comments/17qowl2/comment/k8j2odi .
2024-07-12 18:30:07 -04:00
Dan Klishch
02c66c8696 AK: Add links to relevant coroutine bugs on GCC bugzilla
Fixes #24630.
2024-07-10 05:02:05 -04:00
Diego
7712ca1eb3 AK: Add remaining method to ConstrainedStream
Simply returns how many bytes can be read from the stream.

(cherry picked from commit aee2f25929157a4e6c7f391584c4839a40beadc8)
2024-07-10 01:10:12 +02:00
Timothy Flynn
8adcbd7136 AK: Decode paired UTF-16 surrogates in a JSON string
For example, such use is seen on Twitter.

(cherry picked from commit 698a95d2dee0ba7e9a3f1c39af5459ba506c445f)
2024-07-07 18:47:09 +02:00
Timothy Flynn
c56a965126 AK: Make a couple of GenericLexer helper methods protected
We will want to use the exact behavior of these methods in JsonParser.

(cherry picked from commit c39a3fef17e913da944c35d57541298232bcea53)
2024-07-07 18:47:09 +02:00
sdomi
6403468f09 AK: Re-enable execinfo for GNU Hurd
Adds an explicit check for Hurd due to previous regression.
2024-07-05 09:24:53 -06:00
sdomi
4cc93cf80f AK: Do not build against libexecinfo on non-glibc systems
It seems that libexecinfo was broken on musl (and possibly other libcs)
for quite a while now (resulting in a segfault, or returning 0 frames).
Hence, we shouldn't include it there because it doesn't provide any
added functionality.
2024-07-05 13:50:55 +02:00
Hendiadyoin1
1b8fd5c35a AK: Add generic SIMD shuffle/reverse functions 2024-07-05 00:52:30 +02:00
Hendiadyoin1
27c386797d AK: Add generic SIMD vector load/store functions 2024-07-05 00:52:30 +02:00
Hendiadyoin1
8d6028d366 AK: Add introspection helpers to SIMD.h 2024-07-05 00:52:30 +02:00
kleines Filmröllchen
606963bfa2 AK: Move FormatParser to a new header
Since it was intentionally removed from Format.h to untangle
dependencies, this is a new small header that can be included by anyone
needing custom format parsing.
2024-07-04 22:17:11 +02:00
kleines Filmröllchen
edd85b1281 AK: Move nanoseconds_within_second to Duration 2024-07-04 22:17:11 +02:00
Zaggy1024
a0780b30f5 AK: Prevent overflow of the min when clamping unsigned values to signed
Also, add some tests for the cases that were broken before.

(cherry picked from commit bbd8a218a5bdf077fe82f958e3cc583d0131b687)
2024-07-04 22:09:32 +02:00
Nico Weber
9cc688041a AK: Give IntrusiveBinaryHeap template args named types
Works around https://gcc.gnu.org/PR70413. Without this, the next
commit would cause -Wsubobject-linkage warnings for AK/BinaryHeap.h
when used in LibCompress/Huffman.h.

We can undo this once we're on GCC 14 everywhere.

No behavior change.
2024-07-01 00:29:39 +02:00
Dan Klishch
2360df3ab8 Everywhere: Define even more destructors out of line
You guessed it, this fixes compilation with LLVM trunk.
2024-06-30 08:52:07 +02:00
Dan Klishch
c03cca7b2f AK+LibTest: Choose definition of CO_TRY and CO_TRY_OR_FAIL more robustly
There are three compiler bugs that influence this decision:

 - Clang writing to (validly) destroyed coroutine frame with -O0 and
   -fsanitize=null,address under some conditions
   (https://godbolt.org/z/17Efq5Ma5) (AK_COROUTINE_DESTRUCTION_BROKEN);

 - GCC being unable to handle statement expressions in coroutines
   (AK_COROUTINE_STATEMENT_EXPRS_BROKEN);

 - GCC being unable to deduce template type parameter for TryAwaiter
   with nested CO_TRYs (AK_COROUTINE_TYPE_DEDUCTION_BROKEN).

Instead of growing an ifdef soup in AK/Coroutine.h and
LibTest/AsyncTestCase.h, define three macros in AK/Platform.h that
correspond to these bugs and use them accordingly in the said files.
2024-06-29 20:15:05 -06:00
Liav A.
f6e01aae9a Prekernel: Add support for assertion printing
This is done by using a FixedStringBuffer as the foundation to perform
string formatting, which ensures that we avoid memory allocations in
the prekernel stage.
2024-06-29 19:56:45 +02:00
implicitfield
176999e20c AK/Span: Add a method to check if a Span is filled with a value 2024-06-29 19:16:08 +02:00
Diego
7b50f71e0e AK: Read signed LEB128 integers without 64-bit assumptions
This fixes some errors where too many bytes were allowed to be read for
signed integers of a smaller size (e.g. i32). The new parser doesn't
make 64-bit assumptions and now matches the generality of its unsigned
counterpart.

(cherry picked from commit 596dd5252d477747d466e660acca25c015ce12e4)
2024-06-26 22:13:13 +02:00
Daniel Bertalan
d1f1a6eef0 Everywhere: Remove usages of template keyword with no parameter list
These were made invalid with P1787, and Clang (19) trunk started warning
on them with https://github.com/llvm/llvm-project/pull/80801.

(cherry picked from commit 397774d42272fff8dbc6d8d53616d79667d6608a)
2024-06-25 17:42:49 +02:00
Liav A.
b91c625f8f AK: Introduce declerations for base10 size suffixes
Also, re-organize the base2 size suffixes to not use KiB repeatedly.
2024-06-25 09:24:55 +02:00
Ali Mohammad Pur
a711bb63b7 LibHTTP+LibCore+RequestServer: Use async streams for HTTP
We no longer use blocking reads and writes, yielding a nice performance
boost and a significantly cleaner (and more readable) implementation :^)
2024-06-19 15:45:02 +02:00
Ali Mohammad Pur
6edf5653a5 AK: Add a formatter for OwnPtr<T>
This formatter just prints the object out as a pointer.
2024-06-19 15:45:02 +02:00
Dan Klishch
79051e5f79 AK: Add AK::AsyncStreamBuffer
This class encapsulates all the logic required for a buffer of
asynchronous stream.
2024-06-13 17:40:24 +02:00
Dan Klishch
73b4d39189 AK: Add AK::AsyncStreamTransform
This class allows to convert an asynchronous generator which generates
chunks of data into an AsyncInputStream. This is useful in practice
because the said generator often ends up looking very similar to the
underlying synchronous algorithm it describes.
2024-06-13 17:40:24 +02:00
Dan Klishch
0f1d67f2f0 AK: Introduce AK::Generator 2024-06-13 17:40:24 +02:00
Dan Klishch
aeb17a864e AK: Instantiate GenericLexer::consume_decimal_integer<size_t> on MacOS
u64 and size_t are different types on it.
2024-06-13 17:40:24 +02:00
Dan Klishch
60476e247b AK: Introduce a few asynchronous stream helpers
These were proven to be generally useful after writing some asynchronous
streams code.
2024-06-13 17:40:24 +02:00
Dan Klishch
7fb6d24402 LibTest: Support asynchronous tests 2024-06-13 17:40:24 +02:00
Dan Klishch
6662d5de2b AK: Introduce asynchronous streams 2024-06-13 17:40:24 +02:00
Dan Klishch
b15b51f9da AK: Don't delete Badge move constructor
Otherwise it is impossible to use Badge as an argument of a coroutine.
2024-06-13 17:40:24 +02:00
Dan Klishch
8263e0a619 AK: Introduce AK::Coroutine 2024-06-13 17:40:24 +02:00
Dan Klishch
9054682536 AK: Qualify forward call in AK::exchange
Otherwise it fails to resolve ambiguity between std::forward and
AK::forward when called with the std:: type.
2024-06-13 17:40:24 +02:00
Nico Weber
bb2d80a2bb Everywhere: Gently remove the ladybird android port
With Ladybird now being its own repository, there's little reason
to keep the Ladybird Android port in the SerenityOS repository.

(The Qt port is useful to be able to test changes to LibWeb in lagom
so it'll stay around. Similar for the AppKit port, since getting
Qt on macOS is a bit annoying. But if the AppKit port is too much
pain to keep working, we should toss that too.

Eventually, the lagom browser ports should move out from Ladybird/
to Meta/Lagom/Contrib, but for now it might make sense to leave them
where they are to keep cherry-picks from ladybird easier.)
2024-06-11 19:40:08 -04:00
Diego
a8639245bf AK: Add AllowSurrogates to UTF-8 validator
The [UTF-8](https://datatracker.ietf.org/doc/html/rfc3629#page-5)
standard says to reject strings with upper or lower surrogates. However,
in many standards, ECMAScript included, unpaired surrogates (and
therefore UTF-8 surrogates) are allowed in strings. So, this commit
extends the UTF-8 validation API with `AllowSurrogates`, which will
reject upper and lower surrogate characters.
2024-06-09 16:30:09 +02:00
Timothy Flynn
fe3fde2411 AK+LibUnicode: Implement a case-insensitive variant of find_byte_offset
The existing String::find_byte_offset is case-sensitive. This variant
allows performing searches using Unicode-aware case folding.
2024-06-01 07:37:54 +02:00
Daniel Bertalan
637ccacce5 AK: Enable format string checking in Clang builds
Format string checking was disabled in Clang-based builds due to a
compiler bug: https://github.com/llvm/llvm-project/issues/51182. Now
that the requirement has been raised to Clang 17, that is no longer
necessary.

This has been tested to work correctly with Apple Clang 15.0.0 (which is
the *least modern* supported compiler), as well as CLion 2024.1's
bundled Clangd.
2024-05-29 13:34:15 -06:00
Matthew Olsson
e0d6afbabe ClangPlugins: Invert the lambda detection escape mechanism
Instead of being opt-out with NOESCAPE, it is now opt-in with ESCAPING.
Opt-out is ideal, but unfortunately this was extremely noisy when
compiling the entire codebase. Escaping functions are rarer than non-
escaping ones, so let's just go with that for now.

This also allows us to gradually add heuristics for detecting missing
ESCAPING annotations and emitting them as errors. It also nicely matches
the spelling that Swift uses (@escaping), which is where this idea
originally came from.
2024-05-22 21:55:34 -06:00
Matthew Olsson
a5f4c9a632 AK+Userland: Remove NOESCAPE
See the next commit for an explanation
2024-05-22 21:55:34 -06:00
Dan Klishch
38b51b791e AK+Kernel+LibVideo: Include workarounds for missing P0960 only in Xcode
With this change, ".*make.*" function family now does error checking
earlier, which improves experience while using clangd. Note that the
change also make them instantiate classes a bit more eagerly, so in
LibVideo/PlaybackManager, we have to first define SeekingStateHandler
and only then make() it.

Co-Authored-By: stelar7 <dudedbz@gmail.com>
2024-05-21 14:24:59 +02:00
Tim Ledbetter
d0d81e470e AK: Fix off by one error in integral ceil_log2()
Previously, certain values of `ceil_log2(x)` would be 1 smaller than
`ceil(log2(x))`.
2024-05-21 09:31:17 +02:00
Dan Klishch
be36dbce7d AK: Don't put element count next to heap-allocated data in FixedArray
This not only makes code easier to follow but also makes it faster.
2024-05-18 18:30:42 +02:00
Lucas CHOLLET
c6e4563489 AK: Export Statistics to the global namespace 2024-05-18 18:30:07 +02:00
Andreas Kling
b2e6843055 LibJS+AK: Fix integer overflow UB on (any Int32 - -2147483648)
It wasn't safe to use addition_would_overflow(a, -b) to check if
subtraction (a - b) would overflow, since it doesn't cover this case.

I don't know why we didn't have subtraction_would_overflow(), so this
patch adds it. :^)
2024-05-18 18:11:50 +02:00
Sönke Holz
b6cc95c38e AK: Add a function for frame pointer-based stack unwinding
Instead of duplicating stack unwinding code everywhere, introduce a new
AK helper to unwind the stack in a generic way.
2024-05-14 14:02:06 -06:00
ptrcnull
13e44ab035 AK: Add stack size fixup for musl libc
Fixes #16681
2024-05-14 13:56:45 -06:00
Andreas Kling
6b2b90d2b0 AK: Remove AK_HAS_CONDITIONALLY_TRIVIAL
Code behind this appears to compile nicely with Clang 17 and later.
2024-05-10 15:03:24 +00:00
implicitfield
f923016e0b AK: Add reinterpret_as_octal()
This is useful for parsing user-provided integers that should be
interpreted as octals.
2024-05-07 16:54:27 -06:00
Abuneri
b5bed37074 AK: Replace FP math in is_power_of with a purely integral algorithm
The previous naive approach was causing test failures because of
rounding issues in some exotic environments. In particular, MSVC
via MSBuild
2024-05-07 16:43:34 -06:00
Andreas Kling
ebe6ec6069 AK: Check for u32 overflow in String::repeated()
I don't know why this was checking for size_t overflow, but it was
tripping up ASAN malloc() checks by passing a way-too-large size.
2024-05-07 09:15:40 +02:00
Nico Weber
c421a3d7ce AK: Add missing using statements to Find.h 2024-05-06 17:32:19 +02:00
Sergey Bugaev
0bb37f9c0e AK: Include <features.h> before checking for platform macros
AK/Platform.h did not include any other header file, but expected
various macros to be defined. While many of the macros checked here are
predefined by the compiler (i.e. GCC's TARGET_OS_CPP_BUILTINS), some
may be defined by the system headers instead. In particular, so is
__GLIBC__ on glibc-based systems.

We have to include some system header for getting __GLIBC__ (or not).
It could be possible to include something relatively small and
innocuous, like <string.h> for example, but that would still clutter
the name space and make other code that would use <string.h>
functionality, but forget to include it, build on accident; we wouldn't
want that. At the end of the day, the header that actually defines
__GLIBC__ (or not) is <features.h>. It's typically included from other
glibc headers, and not by user code directly, which makes it unlikely
to mask other code accidentlly forgetting to include it, since it
wouldn't include it in the first place.

<features.h> is not defined by POSIX and could be missing on other
systems (but it seems to be present at least when using either glibc or
musl), so guard its inclusion with __has_include().

Specifically, this fixes AK/StackInfo.cpp not picking up the glibc code
path in the cross aarch64-gnu (GNU/Hurd on 64-bit ARM) Lagom build.
2024-05-02 07:46:53 -06:00
Tim Ledbetter
8b01abf9f7 AK: Don't move trivially copyable types in BufferedStream methods 2024-04-30 13:22:56 +02:00
Liav A.
122c82a2a1 AK: Add the SetOnce class
The SetOnce class is meant to be used as one-time set boolean flag,
which is useful for flags that change only once and then stay immutable
forever.
2024-04-26 23:46:23 -06:00
Nico Weber
88d0702763 AK: Make ceil_div() handle one argument being negative correctly
`ceil_div(-1, 2)` used to return -1.
Now it returns 0, which is the correct ceil(-0.5).

(C++'s division semantics have floor semantics for numbers > 0,
but ceil semantics for numbers < 0.)

This will be important for the JPEG2000 decoder eventually.
2024-04-27 07:09:08 +02:00
Timothy Flynn
fecd08ce64 Everywhere: Remove 'clang-format off' comments that are no longer needed 2024-04-24 16:50:01 -04:00
Timothy Flynn
ec492a1a08 Everywhere: Run clang-format
The following command was used to clang-format these files:

    clang-format-18 -i $(find . \
        -not \( -path "./\.*" -prune \) \
        -not \( -path "./Base/*" -prune \) \
        -not \( -path "./Build/*" -prune \) \
        -not \( -path "./Toolchain/*" -prune \) \
        -not \( -path "./Ports/*" -prune \) \
        -type f -name "*.cpp" -o -name "*.mm" -o -name "*.h")

There are a couple of weird cases where clang-format now thinks that a
pointer access in an initializer list, e.g. `m_member(ptr->foo)`, is a
lambda return statement, and it puts spaces around the `->`.
2024-04-24 16:50:01 -04:00
kleines Filmröllchen
8443d0a74d AK: Use common ComponentType integer type for float bitfields
This allows us to easily use an appropriate integer type when performing
float bitfield operations.

This change also adds a comment about the technically-incorrect 80-bit
extended float mantissa field.
2024-04-23 19:18:09 -06:00
Andrew Kaster
913cffe928 AK: Add workaround for faulty Sanitizer warning on gcc 13+ in Atomic
gcc can't seem to figure out that the address of a member variable of
AK::Atomic<u32> in AtomicRefCounted cannot be null when fetch_sub-ing.
Add a bogus condition to convince the compiler that it can't be null.
2024-04-23 15:37:07 -06:00
dgaston
08aaf4fb07 AK: Add methods to BufferedStream to resize the user supplied buffer
These changes allow lines of arbitrary length to be read with
BufferedStream. When the user supplied buffer is smaller than
the line, it will be resized to fit the line. When the internal
buffer in BufferedStream is smaller than the line, it will be
read into the user supplied buffer chunk by chunk with the
buffer growing accordingly.

Other behaviors match the behavior of the existing read_line method.
2024-04-21 11:46:55 +02:00
Jess
ecb7d4b40f LibJS: Throw RangeError in StringPrototype::repeat if OOM
currently crashes with an assertion failure in `String::repeated` if
malloc can't serve a `count * input_size` sized request, so add
`String::repeated_with_error` to propagate the error.
2024-04-20 19:23:46 -04:00
Andrew Kaster
1e749d023a AK: Add fallible dequeue method to Queue 2024-04-19 16:38:55 -04:00
Dan Klishch
5ed7cd6e32 Everywhere: Use east const in more places
These changes are compatible with clang-format 16 and will be mandatory
when we eventually bump clang-format version. So, since there are no
real downsides, let's commit them now.
2024-04-19 06:31:19 -04:00
implicitfield
1159cd9390 AK+Kernel+LibSanitizer: Implement __ubsan_handle_function_type_mismatch 2024-04-18 13:14:33 -06:00
Space Meyer
fdc0328ce3 Kernel: Exclude individual functions from coverage instrumentation
Sticking this to the function source has multiple benefits:
- We instrument more code, by not excluding entire files.
- NO_SANITIZE_COVERAGE can be used in Header files.
- Keeping the info with the source code, means if a function or
  file is moved around, the NO_SANITIZE_COVERAGE moves with it.
2024-04-15 21:16:22 -06:00
Space Meyer
7d8431dcfc AK: Toolchain dependend instrumentation __attribute__
GCC sometimes complains about the The `no_sanitize("address")` syntax,
and clang sometimes complains abouth the `no_sanitize_address` syntax.
Both claim to support both, so that's neat!
2024-04-15 21:16:22 -06:00
Andrew Kaster
8c5e64e686 Ladybird+LibWebView: Add mechanism to get Mach task port for helpers
On macOS, it's not trivial to get a Mach task port for your children.
This implementation registers the chrome process as a well-known
service with launchd based on its pid, and lets each child process
send over a reference to its mach_task_self() back to the chrome.

We'll need this Mach task port right to get process statistics.
2024-04-09 16:43:27 -06:00
Andrew Kaster
4a9546a7c8 AK: Add platform macro for Mach-based operating system environments 2024-04-09 16:43:27 -06:00
Matthew Olsson
76fa127cbf LibJSGCVerifier: Detect stack-allocated ref captures in lambdas
For example, consider the following code snippet:

    Vector<Function<void()>> m_callbacks;
    void add_callback(Function<void()> callback)
    {
    	m_callbacks.append(move(callback));
    }

    // Somewhere else...
    void do_something()
    {
    	int a = 10;
    	add_callback([&a] {
            dbgln("a is {}", a);
    	});
    } // Oops, "a" is now destroyed, but the callback in m_callbacks
      // has a reference to it!

We now statically detect the capture of "a" in the lambda above and flag
it as incorrect. Note that capturing the value implicitly with a capture
list of `[&]` would also be detected.

Of course, many functions that accept Function<...> don't store them
anywhere, instead immediately invoking them inside of the function. To
avoid a warning in this case, the parameter can be annotated with
NOESCAPE to indicate that capturing stack variables is fine:

    void do_something_now(NOESCAPE Function<...> callback)
    {
    	callback(...)
    }

Lastly, there are situations where the callback does generally escape,
but where the caller knows that it won't escape long enough to cause any
issues. For example, consider this fake example from LibWeb:

    void do_something()
    {
    	bool is_done = false;
    	HTML::queue_global_task([&] {
            do_some_work();
            is_done = true;
        });
    	HTML::main_thread_event_loop().spin_until([&] {
            return is_done;
        });
    }

In this case, we know that the lambda passed to queue_global_task will
be executed before the function returns, and will not persist
afterwards. To avoid this warning, annotate the type of the capture
with IGNORE_USE_IN_ESCAPING_LAMBDA:

    void do_something()
    {
   	IGNORE_USE_IN_ESCAPING_LAMBDA bool is_done = false;
    	// ...
    }
2024-04-09 09:10:44 +02:00
stelar7
3f1019b089 AK: Add XOR method to ByteBuffer 2024-04-08 09:34:49 -06:00
Shannon Booth
8c34842962 AK: Simplify and optimize ASCIICaseInsensitiveFlyStringTraits::equals
The member function `equals_ignoring_ascii_case` has a fast path which
will return early if it is the same FlyString instance.
2024-04-06 09:17:51 -04:00
Timothy Flynn
c5c5e52c24 AK: Disallow calling ByteString methods that return a view on rvalues
This prevents, for example:

    StringView view = ByteString { "foo" }.view();

This prevents a class of potential UAF.
2024-04-04 11:23:21 +02:00
Timothy Flynn
de80f544d8 AK: Disallow calling String methods that return a view on rvalues
This prevents, for example:

    StringView view = "foo"_string.bytes_as_string_view();

This prevents a class of potential UAF.
2024-04-04 11:23:21 +02:00
Timothy Flynn
b5f22b6e90 AK+Userland: Remove some needlessly explicit conversions to StringView 2024-04-04 11:23:21 +02:00
Timothy Flynn
e0bddbb65e AK: Add a Stream::write_until_depleted overload for string types
All string types currently have to invoke this function as:

    stream.write_until_depleted("foo"sv.bytes());

This isn't very ergonomic, but more importantly, this overload will
allow String/ByteString instances to be written in this manner once
e.g. `ByteString::view() &&` is deleted.
2024-04-04 11:23:21 +02:00
Timothy Flynn
c7ea710b55 AK: Return a constant reference from JsonValue::as_string
Rather than making a copy of the held string, this returns a reference
so that expressions like the following:

    do_something(json.as_string().view());

are not disallowed once `ByteString::view() &&` is deleted.
2024-04-04 11:23:21 +02:00
Andreas Kling
3881717103 LibJS+AK: Register GC memory as root regions for LeakSanitizer
This should fix the gigantic list of false positives dumped by
LeakSanitizer on exit .
2024-04-03 12:41:02 +02:00
Hendiadyoin1
877cfe1890 AK: Move generalized internals of UFixedBigIntDivision to BigIntBase
We will reuse this in LibCrypto

Co-Authored-By: Dan Klishch <danilklishch@gmail.com>
2024-03-25 14:26:29 -06:00
Hendiadyoin1
9045840e33 AK: Use correct wide integer type for qhat check in UFixedBigIntDivision
Previously, we were assuming that were always on a 64-bit platform,
which is not 100% correct
2024-03-25 14:26:29 -06:00
Hendiadyoin1
f95abe8c0e AK: Make BigIntBase more agnostic to non native word sizes
This will allow us to use it in Crypto::UnsignedBigInteger, which always
uses 32 bit words
2024-03-25 14:26:29 -06:00
Nico Weber
1ab28276f6 LibGfx: Add the start of a JPEG2000 loader
JPEG2000 is the last image format used in PDF filters that we
don't have a loader for. Let's change that.

This adds all the scaffolding, but no actual implementation yet.
2024-03-25 20:35:00 +01:00
Nico Weber
07750774cf AK: Allow creating a MaybeOwned<Superclass> from a MaybeOwned<Subclass> 2024-03-25 20:35:00 +01:00
Andreas Kling
2b8a920a7c AK: Don't blindly use SipHash as default hash function
Although it has some interesting properties, SipHash is brutally slow
compared to our previous hash function. Since its introduction, it has
been highly visible in every profile of doing anything interesting with
LibJS or LibWeb.

By switching back, we gain a 10x speedup for 32-bit hashes, and "only"
a 3x speedup for 64-bit hashes.

This comes out to roughly 1.10x faster HashTable insertion, and roughly
2.25x faster HashTable lookup. Hashing is no longer at the top of
profiles and everything runs measurably faster.

For security-sensitive hash tables with user-controlled inputs, we can
opt into SipHash selectively on a case-by-case basis. The vast majority
of our uses don't fit that description though.
2024-03-25 12:39:23 +01:00
Timothy Flynn
7e38653492 AK: Reject invalid Base64 encoded string lengths 2024-03-25 08:13:27 +01:00
Timothy Flynn
4ecf4c7617 AK: Compute the exact size of decoded Base64 strings 2024-03-25 08:13:27 +01:00
Timothy Flynn
754ff41b9c AK: Remove whitespace skipping feature from AK's Base64 decoder
This was added in commit f2663f477f as a
partial implementation of what is now LibWeb's forgiving Base64 decoder.
All use cases within LibWeb that require whitespace skipping now use
that implementation instead.

Removing this feature from AK allows us to know the exact output size of
a decoded Base64 string. We can still trim whitespace at the start and
end of the input though; for example, this is useful when reading from a
file that may have a newline at the end of the file.
2024-03-25 08:13:27 +01:00
Timothy Flynn
690db10463 AK: Convert Base64 template parameters to regular function parameters
The generated function name is otherwise very long, which makes stack
traces a bit more difficult to sift through.
2024-03-25 08:13:27 +01:00
Timothy Flynn
f292746134 AK: Convert some west-consts to east-const in Base64.cpp
Caught by clang-format-17. Note that clang-format-16 is fine with this
as well (it leaves the const placement alone), it just doesn't perform
the formatting to east-const itself.
2024-03-25 08:13:27 +01:00
Andreas Kling
3bdfca1119 AK: Make FlyString::from_utf8*() avoid allocation if possible
If we already have a FlyString instantiated for the given string,
look that up and return it instead of making a temporary String just to
use as a key into the FlyString table.
2024-03-24 13:28:24 +01:00
Andreas Kling
8d7a1e5654 LibWeb: Skip some redundant UTF-8 validation in CSS tokenizer
If we're just adding code points to a StringBuilder, there's no need to
revalidate the result.
2024-03-24 13:28:24 +01:00
Andreas Kling
a88799c032 AK: Remove excessive hashing caused by FlyString table
Before this change, the global FlyString table looked like this:

    HashMap<StringView, Detail::StringBase>

After this change, we have:

    HashTable<Detail::StringData const*, FlyStringTableHashTraits>

The custom hash traits are used to extract the stored hash from
StringData which avoids having to rehash the StringView repeatedly like
we did before.

This necessitated a handful of smaller changes to make it work.
2024-03-24 13:28:24 +01:00
Andreas Kling
8bfad24708 AK: Move AK::Detail::StringData to its own header file
This will allow us to access it from FlyString.cpp
2024-03-24 13:28:24 +01:00
Dan Klishch
45a0ba2167 AK: Introduce AK::enumerate
Co-Authored-By: Tim Flynn <trflynn89@pm.me>
2024-03-23 09:02:58 -04:00
Stanisław Wiśniewski
994fe0b89f AK: Use else if constexpr in explode_byte() 2024-03-21 14:35:20 -06:00
Timothy Flynn
81ad6de41b AK: Avoid creating an intermediate buffer when decoding a Base64 string
There's no need to copy the result. We can also avoid increasing the
size of the output buffer by 1 for each written byte.

This reduces the runtime of `./bin/base64 -d enwik8.base64 >/dev/null`
from 0.917s to 0.632s.

(enwik8 is a 100MB test file from http://mattmahoney.net/dc/enwik8.zip)
2024-03-21 15:53:46 +01:00
Timothy Flynn
0fd7ad09a0 AK: Avoid StringBuilder when creating a Base64-encoded string
We don't really need the features provided by StringBuilder here, since
we know the exact size of the output. Avoiding StringBuilder avoids the
recurring capacity/size checks both within StringBuilder itself and its
internal ByteBuffer.

This reduces the runtime of `./bin/base64 enwik8 >/dev/null` from
0.976s to 0.428s.

(enwik8 is a 100MB test file from http://mattmahoney.net/dc/enwik8.zip)
2024-03-21 15:53:46 +01:00
Timothy Flynn
5f5b8ee9bb AK: Do not perform UTF-8 validation on Base64-encoded strings
We know we are only appending ASCII characters to the StringBuilder, so
do not bother validating the result.

This reduces the runtime of `./bin/base64 enwik8 >/dev/null` from
1.192s to 0.976s.

(enwik8 is a 100MB test file from http://mattmahoney.net/dc/enwik8.zip)
2024-03-21 15:53:46 +01:00
Andrew Kaster
e9b16970fe AK: Add base64url encoding and decoding methods
This encoding scheme comes from section 5 of RFC 4648, as an
alternative to the standard base64 encode/decode methods.

The only difference is that the last two characters are replaced
with '-' and '_', as '+' and '/' are not safe in URLs or filenames.
2024-03-20 12:18:57 -04:00
Shannon Booth
e800605ad3 AK+LibURL: Move AK::URL into a new URL library
This URL library ends up being a relatively fundamental base library of
the system, as LibCore depends on LibURL.

This change has two main benefits:
 * Moving AK back more towards being an agnostic library that can
   be used between the kernel and userspace. URL has never really fit
   that description - and is not used in the kernel.
 * URL _should_ depend on LibUnicode, as it needs punnycode support.
   However, it's not really possible to do this inside of AK as it can't
   depend on any external library. This change brings us a little closer
   to being able to do that, but unfortunately we aren't there quite
   yet, as the code generators depend on LibCore.
2024-03-18 14:06:28 -04:00