The way the Process::FileDescriptions::allocate() API works today means
that two callers who allocate back to back without associating a
FileDescription with the allocated FD, will receive the same FD and thus
one will stomp over the other.
Naively tracking which FileDescriptions are allocated and moving onto
the next would introduce other bugs however, as now if you "allocate"
a fd and then return early further down the control flow of the syscall
you would leak that fd.
This change modifies this behavior by tracking which descriptions are
allocated and then having an RAII type to "deallocate" the fd if the
association is not setup the end of it's scope.
To add a new per-CPU data structure, add an ID for it to the
ProcessorSpecificDataID enum.
Then call ProcessorSpecific<T>::initialize() when you are ready to
construct the per-CPU data structure on the current CPU. It can then
be accessed via ProcessorSpecific<T>::get().
This patch replaces the existing hard-coded mechanisms for Scheduler
and MemoryManager per-CPU data structure.
This enables further work on implementing KASLR by adding relocation
support to the pre-kernel and updating the kernel to be less dependent
on specific virtual memory layouts.
This allows us to specify virtual addresses for things the kernel should
access via virtual addresses later on. By doing this we can make the
kernel independent from specific physical addresses.
Previously the kernel relied on a fixed offset between virtual and
physical addresses based on the kernel's load address. This allows us
to specify an independent offset.
Instead of doing a reset via triple-fault, let's just shutdown the QEMU
virtual machine because this is already a QEMU-specific handling code
for Self-Test CI mode.
In preparation for modifying the Kernel IOCTL API to return KResult
instead of int, we need to fix this ioctl to an argument to receive
it's return value, instead of using the actual function return value.
It's easy to forget the responsibility of validating and safely copying
kernel parameters in code that is far away from syscalls. ioctl's are
one such example, and bugs there are just as dangerous as at the root
syscall level.
To avoid this case, utilize the AK::Userspace<T> template in the ioctl
kernel interface so that implementors have no choice but to properly
validate and copy ioctl pointer arguments.
GCC and Clang allow us to inject a call to a function named
__sanitizer_cov_trace_pc on every edge. This function has to be defined
by us. By noting down the caller in that function we can trace the code
we have encountered during execution. Such information is used by
coverage guided fuzzers like AFL and LibFuzzer to determine if a new
input resulted in a new code path. This makes fuzzing much more
effective.
Additionally this adds a basic KCOV implementation. KCOV is an API that
allows user space to request the kernel to start collecting coverage
information for a given user space thread. Furthermore KCOV then exposes
the collected program counters to user space via a BlockDevice which can
be mmaped from user space.
This work is required to add effective support for fuzzing SerenityOS to
the Syzkaller syscall fuzzer. :^) :^)
When we get a COW fault and discover that whoever we were COW'ing
together with has either COW'ed that page on their end (or they have
unmapped/exited) we simplify life for ourselves by clearing the COW
bit and keeping the page we already have. (No need to COW if the page
is not shared!)
The act of doing this does not return a committed page to the pool.
In fact, that committed page we had reserved for this purpose was used
up (allocated) by our COW buddy when they COW'ed the page.
This fixes a kernel panic when running TestLibCMkTemp. :^)
This makes a kernel panic immediately fail the on-target CI job.
Otherwise the failed job looks like a test timeout unless one digs into
the details of the job.
We don't need an entirely separate VMObject subclass to influence the
location of the physical pages.
Instead, we simply allocate enough physically contiguous memory first,
and then pass it to the AnonymousVMObject constructor that takes a span
of physical pages.
VMObject already has an IntrusiveList of all the Regions that map it.
We were keeping a counter in addition to this, and only using it in
a single place to avoid iterating over the list in case it only had
1 entry.
Simplify VMObject by removing this counter and always iterating the
list even if there's only 1 entry. :^)
This was previously used for a single debug logging statement during
memory purging. There are no remaining users of this weak pointer,
so let's get rid of it.
If a purgeable VM object is in the "volatile" state when we're asked
to make a COW clone of it, make life simpler by simply "purging"
the cloned object right away.
This effectively means that a fork()'ed child process will discover
its purgeable+volatile regions to be empty if/when it tries making
them non-volatile.
This patch changes the semantics of purgeable memory.
- AnonymousVMObject now has a "purgeable" flag. It can only be set when
constructing the object. (Previously, all anonymous memory was
effectively purgeable.)
- AnonymousVMObject now has a "volatile" flag. It covers the entire
range of physical pages. (Previously, we tracked ranges of volatile
pages, effectively making it a page-level concept.)
- Non-volatile objects maintain a physical page reservation via the
committed pages mechanism, to ensure full coverage for page faults.
- When an object is made volatile, it relinquishes any unused committed
pages immediately. If later made non-volatile again, we then attempt
to make a new committed pages reservation. If this fails, we return
ENOMEM to userspace.
mmap() now creates purgeable objects if passed the MAP_PURGEABLE option
together with MAP_ANONYMOUS. anon_create() memory is always purgeable.
Right now, NE2000 NICs don't work because the link is down by default
and this will never change. Of all the NE2000 documentation I looked
at I could not find a link status indicator, so just assume the link
is up.