Before this change, we were hard-coding 4 KiB. This meant that systems
with a 16 KiB native page size were wasting 12 KiB per HeapBlock on
nothing, leading to worse locality and more mmap/madvise churn.
We now query the system page size on startup and use that as the
HeapBlock size.
The only downside here is that some of the pointer math for finding the
base of a HeapBlock now has to use a runtime computed value instead of a
compile time constant. But that's a small price to pay for what we get.
These were just here to create a nicer error message for debugging.
A release build will still crash in the exact same place, but now you'll
need to get a backtrace the normal way instead.
NetBSD and FreeBSD get upset when we don't set the fd to an invalid
value when using a non-shared mapping.
Reported-By: Thomas Klausner <wiz@gatalith.at>
Before this change both ExecutionContext and CallFrame were created
before executing function/module/script with a couple exceptions:
- executable created for default function argument evaluation has to
run in function's execution context.
- `execute_ast_node()` where executable compiled for ASTNode has to be
executed in running execution context.
This change moves all members previously owned by CallFrame into
ExecutionContext, and makes two exceptions where an executable that does
not have a corresponding execution context saves and restores registers
before running.
Now, all execution state lives in a single entity, which makes it a bit
easier to reason about and opens opportunities for optimizations, such
as moving registers and local variables into a single array.
This allows to skip iterating through all allocated blocks in
`find_min_and_max_block_addresses()`.
With this change `collect_garbage()` in profiles of Discord goes down
from 17% to 8%.
This lets us avoid false positives when a GCPtr-wrapped member is only
a weak reference which is automatically updated by the GC when the
member's gc state is updated.
This avoids the potential for unwanted implicit conversions to bool.
It doesn't prevent if (m_ptr) checks though, as that invokes an explicit
conversion to bool. This is how std::unique_ptr and std::optional work.
This works very similarly to MarkedVector<T>, but instead of expecting
T to be Value or a GC-allocated pointer type, T can be anything.
Every pointer-sized value in the vector's storage will be checked during
conservative root scanning.
In other words, this allows you to put something like this in a
ConservativeVector<Foo> and it will be protected from GC:
struct Foo {
i64 number;
Value some_value;
GCPtr<Object> some_object;
};
This helps us avoid from needing to construct a Function<T> when
invoking `create_heap_function` with a lambda.
Co-Authored-By: Ali Mohammad Pur <mpfard@serenityos.org>
Instead of returning HeapBlock memory to the kernel (or a non-type
specific shared cache), we now keep a BlockAllocator per CellAllocator
and implement "deallocation" by basically informing the kernel that we
don't need the physical memory right now.
This is done with MADV_FREE or MADV_DONTNEED if available, but for other
platforms (including SerenityOS) we munmap and then re-mmap the memory
to achieve the same effect. It's definitely clunky, so I've added a
FIXME about implementing the madvise options on SerenityOS too.
The important outcome of this change is that GC types that use a
type-specific allocator become immune to use-after-free type confusion
attacks, since their virtual addresses will only ever be re-used for
the same exact type again and again.
Fixes#22274
Due to a `requires` mistake, we were always using the fallback
size-based cell allocators.
Also, now that we start using them, make them NeverDestroyed so
we don't try to deallocate them on program exit.
This commit un-deprecates DeprecatedString, and repurposes it as a byte
string.
As the null state has already been removed, there are no other
particularly hairy blockers in repurposing this type as a byte string
(what it _really_ is).
This commit is auto-generated:
$ xs=$(ack -l \bDeprecatedString\b\|deprecated_string AK Userland \
Meta Ports Ladybird Tests Kernel)
$ perl -pie 's/\bDeprecatedString\b/ByteString/g;
s/deprecated_string/byte_string/g' $xs
$ clang-format --style=file -i \
$(git diff --name-only | grep \.cpp\|\.h)
$ gn format $(git ls-files '*.gn' '*.gni')
Instead of allocating these in a mixture of ways, we now always put
them on the malloc heap, and keep an intrusive linked list of them
that we can iterate for GC marking purposes.
This patch adds two macros to declare per-type allocators:
- JS_DECLARE_ALLOCATOR(TypeName)
- JS_DEFINE_ALLOCATOR(TypeName)
When used, they add a type-specific CellAllocator that the Heap will
delegate allocation requests to.
The result of this is that GC objects of the same type always end up
within the same HeapBlock, drastically reducing the ability to perform
type confusion attacks.
It also improves HeapBlock utilization, since each block now has cells
sized exactly to the type used within that block. (Previously we only
had a handful of block sizes available, and most GC allocations ended
up with a large amount of slack in their tails.)
There is a small performance hit from this, but I'm sure we can make
up for it elsewhere.
Note that the old size-based allocators still exist, and we fall back
to them for any type that doesn't have its own CellAllocator.
The functions for registering and unregistering MarkedVector, Handle,
etc. were quite prominent in benchmark profiles.
4% speed-up on the entire Kraken benchmark :^)
(including: 7% speed-up on Kraken/imaging-gaussian-blur.js, the current
slowest subtest)
- Convert to FlatPtr instead of doing pointer arithmetic on a too-large
pointer type in find_min_and_max_block_addresses(). This makes the
range more accurate.
- Untag possible cell pointers in add_possible_value() before doing the
rejection. Otherwise we end up rejecting most pointers since the tags
sit in the highest bits!
This fixes a crash when running the Speedometer benchmark.
This change adds a check to discard pointers that are lower than the
minimum address of all allocated blocks or higher than the maximum
address of all blocks. By doing this we avoid executing plenty of set()
operations on the HashMap in the add_possible_value().
With this change gather_conservative_roots() run 10x times faster in
Speedometer React-Redux-TodoMVC test.
for_each_cell_among_possible_pointers() was taking HashTable by value
instead of by const reference for no reason.
The copying was soaking up ~4% of CPU time while loading https://x.com/
We have the right conversions to make this work, so let's make it
possible to have a `HashMap<JS::Handle<T>, V>` and look for a specific
T inside it without having to create a temporary handle.
This involves adding some operator== implementations, and some
specializations of AK::Traits.
This change introduces HeapFunction, which is intended to be used as a
replacement for SafeFunction. The new type behaves like a regular
GC-allocated object, which means it needs to be visited from
visit_edges, and unlike SafeFunction, it does not create new roots for
captured parameters.
Co-Authored-By: Andreas Kling <kling@serenityos.org>
This change introduces a very basic GC graph dumper. The `dump_graph()`
function outputs JSON data that contains information about all nodes in
the graph, including their class types and edges.
Root nodes will have a property indicating their root type or source
location if the root is captured by a SafeFunction. It would be useful
to add source location for other types of roots in the future.
Output JSON dump have following format:
```json
"4908721208": {
"class_name": "Accessor",
"edges": [
"4909298232",
"4909297976"
]
},
"4907520440": {
"root": "SafeFunction Optional Optional.h:137",
"class_name": "Realm",
"edges": [
"4908269624",
"4924821560",
"4908409240",
"4908483960",
"4924527672"
]
},
"4908251320": {
"class_name": "CSSStyleRule",
"edges": [
"4908302648",
"4925101656",
"4908251192"
]
},
```
These cases were found with GCC's `-Wsuggest-final-{types,methods}`
warnings, which catch calls that could have been devirtualized had we
declared the functions `final` in the source.
To reproduce, Link Time Optimization needs to be enabled. The easiest
way to achieve this is to set the `CMAKE_INTERPROCEDURAL_OPTIMIZATION`
cache variable to `ON`. The `.incbin` directive in LibCompress' Brotli
decompressor might needs to be changed to an absolute path for this to
work.
This commit also removes a pair of unused virtual functions.
Stop worrying about tiny OOMs. Work towards #20449.
While going through these, I also changed the function signature in many
places where returning ThrowCompletionOr<T> is no longer necessary.
This patch triggers the collector when allocated memory doubles instead
of every 100k allocations. Which can almost half (reduce by ~48%) the
time spent on collection when loading google-maps.
This dynamic approach is inspired by some other GCs like Golang's and
Lua's and improves performance in memory heavy applications because
marking must visit old objects which will dominate the marking phase if
the GC is invoked too often.
This commit also improves the Octane Splay benchmark and almost
doubles it :^)