Commit graph

2451 commits

Author SHA1 Message Date
kleines Filmröllchen
09a12247fb AK: Use bucket states with special bit patterns in HashTable
This simplifies some of the bucket state handling code, as there's now
an easy way of checking the basic category of bucket state.
2022-03-31 12:06:13 +02:00
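The idea behind 09a12247fb can be sketched as follows. This is a minimal illustration, not the code from the commit: the concrete values and the helper names `is_used_bucket` and `is_free_bucket` are assumptions, chosen so that the high nibble of the state byte encodes the basic category.

```cpp
#include <cstdint>

// Hypothetical bit patterns: high nibble 0x0 means "can be written to",
// high nibble 0x1 means "holds a live value".
enum class BucketState : uint8_t {
    Free = 0x00,
    Used = 0x10,
    Deleted = 0x01,
    Rehashed = 0x12,
    End = 0xFF,
};

// A single mask-and-compare answers "does this bucket hold a value?".
constexpr bool is_used_bucket(BucketState state)
{
    return (static_cast<uint8_t>(state) & 0xF0) == 0x10;
}

// Free and deleted buckets share a category: both may be overwritten.
constexpr bool is_free_bucket(BucketState state)
{
    return (static_cast<uint8_t>(state) & 0xF0) == 0x00;
}
```

With such an encoding, a new "used-like" or "free-like" state could be added without touching the category checks.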
kleines Filmröllchen
49d29c8298 AK: Rehash HashTable in-place instead of shrinking
As seen on TV, HashTable can get "thrashed", i.e. it has a bunch of
deleted buckets that count towards the load factor. This means that hash
tables which are large enough for their contents need to be resized.
This was fixed in 9d8da16 with a workaround that shrinks the HashTable
back down in these cases, as after the resize and re-hash the load
factor is very low again. However, that's not a good solution. If you
insert and remove repeatedly around a size boundary, you might get
frequent resizes, which involve frequent re-allocations.

The new solution is an in-place rehashing algorithm that I came up with.
(Do complain to me, I'm at fault.) Basically, it iterates the buckets
and re-hashes the used buckets while marking the deleted slots empty.
The issue arises with collisions in the re-hash. For this reason, there
are two kinds of used buckets during the re-hashing: the normal "used"
buckets, which are old and are treated as free space, and the
"re-hashed" buckets, which are new and treated as used space, i.e. they
trigger probing. Therefore, the procedure for relocating a bucket's
contents is as follows:
- Locate the "real" bucket of the contents with the hash. That bucket is
  the starting point for the target bucket, and the current (old) bucket
  is the bucket we want to move.
- While we still need to move the bucket:
  - If we're the target, something strange happened last iteration or we
    just re-hashed to the same location. We're done.
  - If the target is empty or deleted, just move the bucket. We're done.
  - If the target is a re-hashed full bucket, we probe by double-hashing
    our hash as usual. Henceforth, we move our target for the next
    iteration.
  - If the target is an old full bucket, we swap the target and to-move
buckets. Therefore, the bucket to move is a the correct location and the
former target, which still needs to find a new place, is now in the
bucket to move. So we can just continue with the loop; the target is
re-obtained from the bucket to move. This happens for each and every
bucket, though some buckets are "coincidentally" moved before their
point of iteration is reached. Either way, this guarantees full in-place
movement (even without stack storage) and therefore space complexity of
O(1). Time complexity is amortized O(2n) asssuming a good hashing
function.

This leads to a performance improvement of ~30% on the benchmark
introduced with the last commit.

Co-authored-by: Hendiadyoin1 <leon.a@serenityos.org>
2022-03-31 12:06:13 +02:00
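The relocation procedure described in 49d29c8298 can be sketched roughly as below. This is a simplified illustration under stated assumptions, not the AK implementation: it stores plain `unsigned` values, uses an identity hash (`home_slot`), and probes linearly instead of double-hashing. All type and function names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

enum class State : uint8_t { Free, Used, Deleted, Rehashed };

struct Bucket {
    State state { State::Free };
    unsigned value { 0 };
};

// Hypothetical hash function: identity modulo the table size.
inline std::size_t home_slot(unsigned value, std::size_t capacity)
{
    return value % capacity;
}

void rehash_in_place(std::vector<Bucket>& buckets)
{
    std::size_t const capacity = buckets.size();
    for (std::size_t i = 0; i < capacity; ++i) {
        if (buckets[i].state == State::Deleted)
            buckets[i].state = State::Free;
        // Keep relocating whatever old entry currently sits in slot i.
        while (buckets[i].state == State::Used) {
            std::size_t target = home_slot(buckets[i].value, capacity);
            for (;;) {
                if (target == i) {
                    // Re-hashed to the same location: mark it as done.
                    buckets[i].state = State::Rehashed;
                    break;
                }
                Bucket& t = buckets[target];
                if (t.state == State::Free || t.state == State::Deleted) {
                    // Empty or deleted target: just move the bucket.
                    t = { State::Rehashed, buckets[i].value };
                    buckets[i].state = State::Free;
                    break;
                }
                if (t.state == State::Used) {
                    // Old full bucket: swap. The evicted entry now sits in
                    // slot i and is handled by the next while-iteration.
                    std::swap(t.value, buckets[i].value);
                    t.state = State::Rehashed;
                    break;
                }
                // Re-hashed full bucket: probe (linearly in this sketch).
                target = (target + 1) % capacity;
            }
        }
    }
    // Re-hashing is finished; fold the temporary state back into "used".
    for (auto& bucket : buckets) {
        if (bucket.state == State::Rehashed)
            bucket.state = State::Used;
    }
}

// Lookup with the same linear probing, to check the result.
bool contains(std::vector<Bucket> const& buckets, unsigned value)
{
    std::size_t const capacity = buckets.size();
    std::size_t slot = home_slot(value, capacity);
    for (std::size_t probes = 0; probes < capacity; ++probes) {
        Bucket const& bucket = buckets[slot];
        if (bucket.state == State::Free)
            return false;
        if (bucket.state == State::Used && bucket.value == value)
            return true;
        slot = (slot + 1) % capacity;
    }
    return false;
}
```

The probe loop always terminates because slot `i` itself is never in the re-hashed state while its entry is being moved, so probing reaches either a writable bucket or `i` again.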
kleines Filmröllchen
bcb8937898 AK: Merge HashTable bucket state into one enum
The hash table buckets had three different state booleans that are in
fact exclusive. In preparation for further states, this commit
consolidates them into one enum. This has the added benefit of not
relying on the compiler's boolean packing anymore; we definitely now
only need one byte for the bucket state.
2022-03-31 12:06:13 +02:00
safarp
704e1d13f4 AK: Allow printing wide characters using %ls modifier 2022-03-30 11:30:43 +04:30
Ali Mohammad Pur
67357fe984 LibXML: Add a fairly basic XML parser
Currently this can parse XML and resolve external resources/references,
and read a DTD (but not apply or verify its rules).
That's good enough for _most_ XHTML documents as the HTML 5 spec
enforces its own rules about document well-formedness, and does not make
use of XML DTDs (aside from a list of predefined entities).

An accompanying `xml` utility is provided that can read and dump XML
documents, and can also run the XML conformance test suite.
2022-03-28 23:11:48 +02:00
Ali Mohammad Pur
06cedf5bae AK: Add a 'OneOf' concept
Similar to 'SameAs', but for multiple types.
2022-03-28 23:11:48 +02:00
Ali Mohammad Pur
2a1a619eed AK: Display SourceLocation function name in color
It's much easier to spot the function name (which is what you often
expect) like this.
2022-03-28 23:11:48 +02:00
Ali Mohammad Pur
b3c18db463 AK: Add a 'is_not_any_of' similar to 'is_any_of' to GenericLexer
It's often useful to have the negated version, so instead of making a
local lambda for it, let's just add the negated form too.
2022-03-28 23:11:48 +02:00
Ali Mohammad Pur
e21fa158dd AK: Make Vector capable of holding forward-declared types
This is pretty useful for making trees.
2022-03-28 23:11:48 +02:00
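The tree use case mentioned in e21fa158dd might look like the following. The sketch uses `std::vector` (which accepts incomplete element types since C++17) as a stand-in for AK's `Vector`; `Node` and `count_nodes` are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

struct Node {
    int value { 0 };
    // Node is still an incomplete type at this point.
    std::vector<Node> children;
};

// Counts a node plus all of its descendants.
std::size_t count_nodes(Node const& node)
{
    std::size_t count = 1;
    for (auto const& child : node.children)
        count += count_nodes(child);
    return count;
}
```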
Hendiadyoin1
6b20496758 AK: Add appendln helper to SourceGenerator 2022-03-28 23:08:08 +02:00
Hendiadyoin1
f6f7280fe3 AK: Explicitly move value String in SourceGenerator::set 2022-03-28 23:08:08 +02:00
Hendiadyoin1
14caecefb1 AK: Make SourceGenerator move constructible
This makes it possible to return one from a function.
2022-03-28 23:08:08 +02:00
Linus Groh
22308e52cf AK: Add an ArbitrarySizedEnum template
This is an enum-like type that works with arbitrarily sized storage
larger than u64, which is the limit for a regular enum class and
restricts it to 64 members when bit field behavior is needed.

Co-authored-by: Ali Mohammad Pur <mpfard@serenityos.org>
2022-03-27 18:54:56 +02:00
Linus Groh
8b2361e362 AK: Add non-const DistinctNumeric::value() getter 2022-03-27 18:54:56 +02:00
Linus Groh
76e85ebbfc AK: Remove unused String.h include from UFixedBigInt.h
This makes it usable in the Kernel. :^)
2022-03-27 18:54:56 +02:00
Idan Horowitz
5626e1b324 LibWeb: Rename PARSER_DEBUG => HTML_PARSER_DEBUG
Since this macro was created we gained a couple more parsers in the
system :^)
2022-03-24 21:37:49 +01:00
Hendiadyoin1
820e03e8d4 AK: Add a case insensitive version of is_one_of to String[View] 2022-03-21 10:48:17 +01:00
Sam Atkins
dfc02f9761 AK: Fix typo in warnln_if() 2022-03-19 11:01:49 -07:00
Sam Atkins
7e98c8eaf6 AK+Tests: Fix StringUtils::contains() being confused by repeating text
Previously, case-insensitively searching the haystack "Go Go Back" for
the needle "Go Back" would return false:

1. Match the first three characters. "Go ".
2. Notice that 'G' and 'B' don't match.
3. Skip ahead 3 characters, plus 1 for the outer for-loop.
4. Now, the haystack is effectively "o Back", so the match fails.

Reducing the skip by 1 fixes this issue. I'm not 100% convinced this
fixes all cases, but I haven't been able to find any cases where it
doesn't work now. :^)
2022-03-18 23:51:56 +00:00
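For comparison with the failure described in 7e98c8eaf6, here is a deliberately naive case-insensitive search that restarts at the very next haystack position after any mismatch, so it cannot skip past an overlapping match. It is not the AK implementation (which skips ahead as described in the commit message); the function name is hypothetical and only ASCII case folding is handled.

```cpp
#include <cctype>
#include <cstddef>
#include <string_view>

bool contains_case_insensitive(std::string_view haystack, std::string_view needle)
{
    if (needle.empty())
        return true;
    if (needle.size() > haystack.size())
        return false;
    auto lower = [](char c) { return std::tolower(static_cast<unsigned char>(c)); };
    // Try every start position; never skip ahead after a partial match.
    for (std::size_t start = 0; start + needle.size() <= haystack.size(); ++start) {
        std::size_t matched = 0;
        while (matched < needle.size()
            && lower(haystack[start + matched]) == lower(needle[matched]))
            ++matched;
        if (matched == needle.size())
            return true;
    }
    return false;
}
```

A reference implementation like this is slower in the worst case, but it can be useful in tests to cross-check an optimized search.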
Lenny Maiorani
4c5e9f5633 Everywhere: Deduplicate day/month name constants
Day and month name constants are defined in numerous places. This
pulls them together into a single place and eliminates the
duplication. It also ensures they are `constexpr`.
2022-03-18 23:48:50 +00:00
Timothy Flynn
31515a9147 AK: Mark the StringView user-defined literal as consteval
Even though the StringView(char*, size_t) constructor only runs its
overflow check when evaluated in a runtime context, the code generated
here could prevent the compiler from optimizing invocations from the
StringView user-defined literal (verified on Compiler Explorer).

This changes the user-defined literal declaration to be consteval to
ensure it is evaluated at compile time.
2022-03-18 19:56:50 +01:00
Andreas Kling
fc6b7fcd97 AK: Add const variant of Vector::in_reverse() 2022-03-18 15:18:48 +01:00
Lenny Maiorani
2844f7c333 Everywhere: Switch from EnableIf to requires
C++20 provides the `requires` clause which simplifies the ability to
limit overload resolution. Prefer it over `EnableIf`.

With all uses of `EnableIf` being removed, also remove the
implementation so future devs are not tempted.
2022-03-17 22:15:42 -07:00
Michiel Visser
3d561abe15 AK: Add constant time equality and zero check to UFixedBigInt 2022-03-18 07:56:47 +03:30
Michiel Visser
590dcb0581 AK: UFixedBigInt add efficient multiplication with full result 2022-03-18 07:56:47 +03:30
Lenny Maiorani
5b59375a56 AK: Fix implicit and narrowing conversions in Base64 2022-03-16 16:19:53 +00:00
Lenny Maiorani
8d1d4d4f09 AK: Make static constexpr variables to avoid stack copy in Base64
Alphabet and lookup table are created and copied to the stack on each
call. Create them and store them in static memory.
2022-03-16 16:19:53 +00:00
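The change in 8d1d4d4f09 follows a common pattern, sketched below with a hypothetical helper `encode_sextet`. A `static constexpr` table lives in static storage and is initialized at compile time, whereas a plain local array would be rebuilt on the stack on every call.

```cpp
#include <cstdint>

// Maps a 6-bit value to its Base64 character.
char encode_sextet(uint8_t sextet)
{
    // static constexpr: one copy in static memory, no per-call setup.
    static constexpr char alphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "abcdefghijklmnopqrstuvwxyz"
        "0123456789+/";
    return alphabet[sextet & 0x3F];
}
```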
Daniel Bertalan
e3eb68dd58 AK+Kernel: Avoid double memory clearing of HashTable buckets
Since the allocated memory is going to be zeroed immediately anyway,
let's avoid redundantly scrubbing it with MALLOC_SCRUB_BYTE just before
that.

The latest versions of gcc and Clang can automatically do this malloc +
memset -> calloc optimization, but I've seen a couple of places where it
failed to be done.

This commit also adds a naive kcalloc function to the kernel that
doesn't (yet) eliminate the redundancy like the userland does.
2022-03-15 11:56:46 +01:00
Hendiadyoin1
cd21e03225 AK+Everywhere: Add sincos and use it in some places
Calculating sin and cos at once is quite a bit cheaper than calculating
them individually.
x87 even has a dedicated instruction for it: `fsincos`.
2022-03-15 11:39:42 +01:00
Timothy Flynn
c12cfe83b7 AK: Allow creating a Vector from any Span of the same underlying type
This allows, for example, creating a Vector from a subset of another
Vector.
2022-03-14 16:33:15 +01:00
Brian Gianforcaro
390666b9fa AK: Add naive implementations of AK::timing_safe_compare
For security critical code we need to have some way of performing
constant time buffer comparisons.
2022-03-13 19:08:58 -07:00
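A naive constant-time comparison in the spirit of 390666b9fa might look like this. The name `timing_safe_equal` and its signature are assumptions for illustration, not the AK API. The key property is that the loop always visits every byte and never exits early on the first difference.

```cpp
#include <cstddef>
#include <cstdint>

bool timing_safe_equal(void const* a, void const* b, std::size_t length)
{
    auto const* lhs = static_cast<uint8_t const*>(a);
    auto const* rhs = static_cast<uint8_t const*>(b);
    unsigned difference = 0;
    // Accumulate all differing bits; no data-dependent branch in the loop.
    for (std::size_t i = 0; i < length; ++i)
        difference |= static_cast<unsigned>(lhs[i] ^ rhs[i]);
    return difference == 0;
}
```

Note that an optimizing compiler is in principle free to transform such a loop, so production code usually needs additional care beyond this sketch.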
Tim Schumacher
dd71754d10 AK: Properly parse unimplemented format length specifiers
This keeps us from stopping early and not rendering the argument at all.
2022-03-12 12:42:07 +03:30
Sam Atkins
2362cc2943 AK: Remove unused String[256] from JsonParser
This shrinks the JsonParser class from 2072 bytes to 24. :^)
2022-03-10 18:43:09 +01:00
Sam Atkins
a451810599 AK: Print a better error message when missing a SourceGenerator key
Previously, if you forgot to set a key on a SourceGenerator, you would
get this less-than-helpful error message:

> Generate_CSS_MediaFeatureID_cpp:
  /home/sam/serenity/Meta/Lagom/../../AK/Optional.h:174: T
  AK::Optional<T>::release_value() [with T = AK::String]: Assertion
  `m_has_value' failed.

Now, it instead looks like this:

> No key named `name:titlecase` set on SourceGenerator
  Generate_CSS_MediaFeatureID_cpp:
  /home/sam/serenity/Meta/Lagom/../../AK/SourceGenerator.h:44:
  AK::String AK::SourceGenerator::get(AK::StringView) const: Assertion
  `false' failed.
2022-03-09 23:06:30 +01:00
Federico Guerinoni
0aed2f0f86 AK: Add reverse iterator as member 2022-03-09 17:16:28 +01:00
Federico Guerinoni
f34fff852b AK: Implement wrapper for reverse range for loop
Now it is possible to use a range-based for loop in reverse over a
container.
```
	for (auto item : in_reverse(vector))
```
2022-03-09 17:16:28 +01:00
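A wrapper like the one from f34fff852b can be approximated with standard iterators. This is a sketch built on `std::rbegin` and `std::rend`, not the AK code; the class name `ReverseWrapper` is an assumption.

```cpp
#include <iterator>
#include <vector>

// Adapts a container so that begin()/end() walk it back to front.
template<typename Container>
class ReverseWrapper {
public:
    explicit ReverseWrapper(Container& container)
        : m_container(container)
    {
    }

    auto begin() { return std::rbegin(m_container); }
    auto end() { return std::rend(m_container); }

private:
    Container& m_container;
};

// Takes an lvalue only, so the wrapper cannot outlive a temporary.
template<typename Container>
ReverseWrapper<Container> in_reverse(Container& container)
{
    return ReverseWrapper<Container>(container);
}
```

Usage then reads like the snippet in the commit message: `for (auto item : in_reverse(vector)) { ... }`.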
Federico Guerinoni
a54e20d958 AK: Implement ReverseIterator for NonnullPtrVector 2022-03-09 17:16:28 +01:00
Federico Guerinoni
b0e74a3fd3 AK: Implement reverse iterator for Vector class 2022-03-09 17:16:28 +01:00
Federico Guerinoni
74650b4e32 AK: Add generic reverse iterator for containers 2022-03-09 17:16:28 +01:00
Tom
2f0e3da142 AK: Add IPv6Address class
This is the IPv6 counterpart to the IPv4Address class and implements
parsing strings into an in6_addr and formatting one as a string. It
supports the address compression scheme as well as IPv4 mapped
addresses.
2022-03-08 23:05:44 +01:00
Vrins
73ade62d4f AK: Add float support for JsonValue and JsonObjectSerializer 2022-03-08 22:09:52 +01:00
Vrins
ae1cd4b448 AK: Add to_double() to JsonValue 2022-03-08 22:09:52 +01:00
Andreas Kling
9d8da1697e AK: Automatically shrink HashTable when removing entries
If the utilization of a HashTable (size vs capacity) goes below 20%,
we'll now shrink the table down to capacity = (size * 2).

This fixes an issue where tables would grow infinitely when inserting
and removing keys repeatedly. Basically, we would accumulate deleted
buckets with nothing reclaiming them, and eventually decide that we
needed to grow the table (because we grow if used+deleted > limit!)

I found this because HashTable iteration was taking a suspicious amount
of time in Core::EventLoop::get_next_timer_expiration(). Turns out the
timer table kept growing in capacity over time. That made iteration
slower and slower since HashTable iterators visit every bucket.
2022-03-07 00:08:22 +01:00
Andreas Kling
eb829924da AK: Remove return value from HashTable::remove() and HashMap::remove()
This was only used by remove_all_matching(), where it's no longer used.
2022-03-07 00:08:22 +01:00
Andreas Kling
623bdd8b6a AK: Simplify HashTable::remove_all_matching()
Just walk the table from start to finish, deleting buckets as we go.
This removes the need for remove() to return an iterator, which is
preventing me from implementing hash table auto-shrinking.
2022-03-07 00:08:22 +01:00
Linus Groh
1719862d12 LibWeb: Hide some debug logging behind CANVAS_RENDERING_CONTEXT_2D_DEBUG
This can be quite noisy and isn't generally useful information.
2022-03-04 23:03:29 +01:00
Peter Ross
34108547b6 AK: Print NaN and infinite numbers in PrintfImplementation 2022-03-02 11:40:37 +01:00
Lucas CHOLLET
39bfc48ea7 AK: Add Time::from_ticks()
This helper allows Time to be constructed from a tick count and a ticks
per second value.
2022-02-28 20:09:37 +01:00
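The conversion behind a helper like the one in 39bfc48ea7 can be sketched as below. The `Duration` struct and the free function are hypothetical stand-ins for AK's `Time`, and the sketch assumes a non-negative tick count and a tick rate small enough that the intermediate multiplication does not overflow.

```cpp
#include <cstdint>

struct Duration {
    uint64_t seconds;
    uint32_t nanoseconds;
};

constexpr Duration from_ticks(uint64_t ticks, uint64_t ticks_per_second)
{
    // Whole seconds first, then scale the leftover ticks to nanoseconds.
    uint64_t const seconds = ticks / ticks_per_second;
    uint64_t const remainder = ticks % ticks_per_second;
    return { seconds, static_cast<uint32_t>(remainder * 1'000'000'000 / ticks_per_second) };
}
```

Splitting off the whole seconds before scaling keeps the multiplication bounded by `ticks_per_second * 1'000'000'000` instead of by the full tick count.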
Timur Sultanov
406b3fc3fe AK: Correctly process precision modifiers in printf 2022-02-28 14:08:24 +01:00
kleines Filmröllchen
98058f7efe AK: Add FixedPoint base 2 logarithm
The log base 2 is implemented using the binary logarithm algorithm
by Clay Turner (see the link in the comment)
2022-02-28 13:59:31 +01:00