This patch adds generic slab allocators to kmalloc. In this initial
version, the slab sizes are 16, 32, 64, 128, 256 and 512 bytes.
Slabheaps are backed by 64 KiB block-aligned blocks with freelists,
similar to what we do in LibC malloc and LibJS Heap.
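As a rough illustration of how a request could be routed to one of the slab sizes listed above, here is a minimal sketch. The helper name `slab_size_for` and the "no slab fits" fallback are assumptions made for this example, not the kernel's actual code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <optional>

// Slab sizes named in the commit message above.
static constexpr std::array<std::size_t, 6> slab_sizes { 16, 32, 64, 128, 256, 512 };

// Hypothetical helper: returns the smallest slab size that can hold `size`
// bytes, or an empty optional when the request is larger than every slab
// (a caller would then fall back to the general-purpose heap).
inline std::optional<std::size_t> slab_size_for(std::size_t size)
{
    for (auto slab_size : slab_sizes) {
        if (size <= slab_size)
            return slab_size;
    }
    return std::nullopt;
}
```

With this sketch, a 17-byte request lands in the 32-byte slab, and a 513-byte request is not served by any slab.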
There are no more users of the C-style kfree() API in the kernel,
so let's get rid of it and enjoy the new world where we always know
how much memory we are freeing. :^)
This patch does two things:
- Combines kmalloc_aligned() and kmalloc_aligned_cxx(). Templatizing
  the alignment parameter doesn't seem like a valuable enough
  optimization to justify having two almost-identical implementations.
- Stores the real allocation size of an aligned allocation along with
  the other alignment metadata, and uses it to call kfree_sized()
  instead of kfree().
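The idea of keeping the real allocation size next to the alignment metadata can be sketched in userspace as follows. Everything here is an assumption for illustration: `sized_free` stands in for kfree_sized(), std::malloc() stands in for kmalloc(), and the `AlignedHeader` layout is invented rather than taken from the kernel.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for kfree_sized(): records the size it was given.
static std::size_t g_last_freed_size = 0;
inline void sized_free(void* ptr, std::size_t size)
{
    g_last_freed_size = size;
    std::free(ptr);
}

// Metadata stored immediately before the aligned pointer (assumed layout).
struct AlignedHeader {
    std::size_t real_size; // size of the underlying allocation
    std::size_t offset;    // distance from the underlying allocation to the aligned pointer
};

// `alignment` must be a power of two.
inline void* aligned_alloc_sketch(std::size_t size, std::size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    std::size_t real_size = size + alignment + sizeof(AlignedHeader);
    auto* base = static_cast<std::uint8_t*>(std::malloc(real_size));
    if (!base)
        return nullptr;
    auto base_address = reinterpret_cast<std::uintptr_t>(base);
    auto aligned_address = (base_address + sizeof(AlignedHeader) + alignment - 1)
        & ~(static_cast<std::uintptr_t>(alignment) - 1);
    AlignedHeader header { real_size, static_cast<std::size_t>(aligned_address - base_address) };
    std::uint8_t* aligned = base + header.offset;
    std::memcpy(aligned - sizeof(AlignedHeader), &header, sizeof(header));
    return aligned;
}

// Reads the header back, so the free path knows the real size without being told.
inline void aligned_free_sketch(void* ptr)
{
    if (!ptr)
        return;
    auto* aligned = static_cast<std::uint8_t*>(ptr);
    AlignedHeader header;
    std::memcpy(&header, aligned - sizeof(AlignedHeader), sizeof(header));
    sized_free(aligned - header.offset, header.real_size);
}
```

Because the size travels with the allocation, a single runtime-parameterized implementation is enough, and the caller of the free path never has to guess how much memory is being released.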
This class was misusing the outdated Lockable template and didn't take
advantage of the lock/resource separation mechanism fully anyway.
Since the underlying PRNG has its own SpinLock, and we already use that
for synchronization everywhere anyway, we can simply remove the Lockable
inheritance from this class.
I've seen how @awesomekling changes the script to disable KVM, so
that's a useful thing to have.
An example of how to use it:
SERENITY_KVM_SUPPORT='0' ./Meta/serenity.sh run x86_64
My first commit btw :^)
Currently the APIC class is constructed irrespective of whether it
is used or not.
So, move APIC initialization from init to the InterruptManagement
class and construct the APIC class only when it is needed.
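A minimal sketch of the "construct only when needed" pattern described above. The class names `ApicSketch` and `InterruptManagementSketch`, the method names, and the construction counter are all hypothetical and exist only to make the behavior observable.

```cpp
#include <cassert>
#include <memory>

static int g_apic_constructions = 0;

// Hypothetical stand-in for the APIC class; counts how often it is built.
struct ApicSketch {
    ApicSketch() { ++g_apic_constructions; }
};

class InterruptManagementSketch {
public:
    // Legacy mode never touches the controller, so nothing is constructed.
    void use_legacy_mode() { }

    // The controller object is created on first use only.
    ApicSketch& ensure_apic()
    {
        if (!m_apic)
            m_apic = std::make_unique<ApicSketch>();
        return *m_apic;
    }

private:
    std::unique_ptr<ApicSketch> m_apic;
};
```

In this sketch a system that stays in legacy mode never pays for the controller object, and repeated calls to `ensure_apic()` reuse the same instance.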
Since we allocate the subheap in the first page of the given storage,
let's assert that the subheap can actually fit in a single page, to
prevent the possible future headache of trying to debug the cause of
random kernel memory corruption :^)
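One way such a check could look, as a sketch only: the 4 KiB page size, the `SubheapSketch` structure and the helper `fits_in_single_page` are assumptions, and the real patch may well use a runtime assertion instead of a compile-time one.

```cpp
#include <cassert>
#include <cstddef>

static constexpr std::size_t page_size = 4096; // assumption: 4 KiB pages

// Hypothetical stand-in for the subheap bookkeeping structure.
struct SubheapSketch {
    void* next;
    std::size_t capacity;
    std::size_t allocated;
};

// Compile-time form of the check: refuse to build if the structure outgrows one page.
static_assert(sizeof(SubheapSketch) <= page_size, "subheap must fit in a single page");

// The same condition as a reusable predicate.
template<typename T>
constexpr bool fits_in_single_page()
{
    return sizeof(T) <= page_size;
}
```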
This avoids getting caught with our pants down when heap expansion fails
due to missing page tables. It also avoids a circular dependency on
kmalloc() by way of HashMap::set() in MemoryManager::ensure_pte().
If the data passed to sys$write() is backed by a not-yet-paged-in inode
mapping, we could end up in a situation where we get a page fault when
trying to copy data from userspace.
If that page fault handler tried reading from an inode that someone else
had locked while waiting for the disk cache lock, we'd deadlock.
This patch fixes the issue by copying the userspace data into a local
buffer before acquiring the disk cache lock. This is not ideal since it
incurs an extra copy, and I'm sure we can think of a better solution
eventually.
This was a frequent cause of startup deadlocks on x86_64 for me. :^)
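The ordering described in the sys$write() fix above (copy first, lock second) can be sketched in plain C++ like this. The names `g_cache_lock`, `g_cache` and `write_with_staging_copy` are hypothetical, and std::mutex merely stands in for the disk cache lock.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <vector>

// Hypothetical stand-ins: a lock guarding a cache, and the cache contents.
static std::mutex g_cache_lock;
static std::vector<std::uint8_t> g_cache;

// Copies the caller's data into a local buffer first. In the scenario from
// the commit message this is the step that may page-fault, so it happens
// while no lock is held. Only afterwards is the lock taken for the write.
inline std::size_t write_with_staging_copy(std::uint8_t const* user_data, std::size_t size)
{
    std::vector<std::uint8_t> staging(user_data, user_data + size); // no lock held here
    std::lock_guard<std::mutex> guard(g_cache_lock);
    g_cache.insert(g_cache.end(), staging.begin(), staging.end());
    return size;
}
```

The cost is the extra copy into `staging`, which is the trade-off the commit message acknowledges.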
Previously, the heap expansion logic could end up calling kmalloc
recursively, which was quite messy and hard to reason about.
This patch redesigns heap expansion so that it's kmalloc-free:
- We make a single large virtual range allocation at startup
- When expanding, we bump allocate VM from that region
- When expanding, we populate page tables directly ourselves,
  instead of going via MemoryManager.
This makes heap expansion a great deal simpler. However, do note that it
introduces two new flaws that we'll need to deal with eventually:
- The single virtual range allocation is limited to 64 MiB and once
  exhausted, kmalloc() will fail. (Actually, it will PANIC for now..)
- The kmalloc heap can no longer shrink once expanded. Subheaps stay
  in place once constructed.
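Bump allocation from a single reserved range, as described above, can be sketched as follows. `BumpRange` is a hypothetical name, addresses are plain integers here, and the page table population step is deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>

// Hands out consecutive pieces of one pre-reserved address range.
class BumpRange {
public:
    BumpRange(std::uintptr_t base, std::size_t size)
        : m_next(base)
        , m_end(base + size)
    {
    }

    // Returns the start of the next `size` bytes, or nothing once the range
    // is exhausted. There is intentionally no way to give memory back.
    std::optional<std::uintptr_t> allocate(std::size_t size)
    {
        if (size > m_end - m_next)
            return std::nullopt;
        auto result = m_next;
        m_next += size;
        return result;
    }

private:
    std::uintptr_t m_next;
    std::uintptr_t m_end;
};
```

Both flaws listed above are visible in the sketch: `allocate()` fails once the range runs out, and nothing ever moves `m_next` backwards.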
Apparently Andreas found remnants of kernel module support in the
build system. Let's remove them to complete the removal of that
support, which hadn't worked for many months before it was removed.
This is an interface to downcast(), which degrades compile-time errors
into runtime errors, and allows seemingly-correct-but-not-quite
constructs like the following to compile, but fail at runtime:
Variant<NonnullRefPtr<T>, U> foo = ...;
Variant<RefPtr<T>, U> bar = foo;
The expectation here is that `foo` is converted to a RefPtr<T> if it
contains one, and remains a U otherwise, but in reality, the
NonnullRefPtr<T> variant is simply dropped on the floor, and the
resulting variant becomes invalid, failing the assertion in downcast().
This commit adds a Variant<Ts...>(Variant<NewTs...>) constructor that
ensures at compile time that no alternative can be left out, for the
users that were using this interface for merely increasing the number of
alternatives (for instance, LibSQL's Value class).
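To illustrate the compile-time guarantee described above, here is a sketch using std::variant instead of AK's Variant. The free function `widen` and the trait `is_one_of` are hypothetical names; the actual change is a converting constructor and may be implemented differently.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>
#include <variant>

// True when T is exactly one of Ts.
template<typename T, typename... Ts>
constexpr bool is_one_of = (std::is_same_v<T, Ts> || ...);

// Widens a variant into one with more alternatives. The static_assert
// rejects, at compile time, any source alternative the target cannot hold,
// so no alternative can be silently dropped.
template<typename... NewTs, typename... Ts>
std::variant<NewTs...> widen(std::variant<Ts...> const& source)
{
    static_assert((is_one_of<Ts, NewTs...> && ...),
        "every source alternative must exist in the target variant");
    return std::visit(
        [](auto const& value) {
            using Alternative = std::decay_t<decltype(value)>;
            return std::variant<NewTs...>(std::in_place_type<Alternative>, value);
        },
        source);
}
```

In this sketch `widen<int, std::string, double>(v)` compiles for a `std::variant<int, std::string>`, while a target that lacks one of the source alternatives is rejected by the static_assert rather than failing at runtime.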
This was used to return a pre-locked UDPSocket in one place, but there
was really no need for that mechanism in the first place since the
caller ends up locking the socket anyway.
I encountered a WindowServer crash due to null-pointer dereference in
this function, so let's protect against it by simply skipping over
nulled-out WeakPtrs.
I added a FIXME about how we ideally wouldn't be in this situation in
the first place, but that will require some more investigation.
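The guard described above amounts to skipping entries whose weak pointer no longer refers to a live object. A sketch with std::weak_ptr in place of AK's WeakPtr, where `WindowSketch` and `for_each_live_window` are hypothetical names:

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct WindowSketch {
    int id;
};

// Visits only the windows that are still alive; expired entries are skipped
// instead of being dereferenced.
template<typename Callback>
void for_each_live_window(std::vector<std::weak_ptr<WindowSketch>> const& windows, Callback callback)
{
    for (auto const& weak_window : windows) {
        auto window = weak_window.lock();
        if (!window)
            continue; // the entry was nulled out, so skip it
        callback(*window);
    }
}
```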
This call caused GCC 12's static analyzer to think that we perform an
out-of-bounds write to the v_key Vector. This is obviously incorrect,
and comes from the fact that GCC doesn't properly track whether we use
the inline storage, or the Vector is allocated on the heap.
While searching for a workaround, Sam pointed out that this call is
redundant as `Vector::resize()` already zeroes out the elements, so we
can completely remove it.
Co-authored-by: Sam Atkins <atkinssj@serenityos.org>
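For comparison, std::vector behaves the same way as described for `Vector::resize()` above: growing the container value-initializes the new elements, so an explicit zeroing pass afterwards is redundant. The helper name `make_zeroed_key` is hypothetical, and the sketch says nothing about AK::Vector's internals.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns a key buffer of `size` bytes. No explicit fill is needed because
// resize() value-initializes the new elements, which means zero for integers.
inline std::vector<std::uint8_t> make_zeroed_key(std::size_t size)
{
    std::vector<std::uint8_t> key;
    key.resize(size);
    return key;
}
```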
The function to protect ksyms after initialization is only used during
boot of the system, so it can be UNMAP_AFTER_INIT as well.
This requires we switch the order of the init sequence, so we now call
`MM.protect_ksyms_after_init()` before `MM.unmap_text_after_init()`.
This is a refinement of 49cbd4dcca.
Previously, the container was scanned to compute the size in the unhappy
path. Now, using `all_of`, both the happy and unhappy paths should be fast.
The following features are now available in the system, making these
patches unnecessary:
- isblank() function
- SIGSTKSZ constant
- MS_SYNC and MS_ASYNC msync() flags
- EDQUOT errno constant