This commit is very invasive, because Thread likes to take a pointer and write
to it. This means that translating between timespec/timeval/Time would have been
more difficult than just changing everything that hands a raw pointer to Thread,
in bulk.
Make more of the kernel compile in 64-bit mode, and make some things
pointer-size-agnostic (by using FlatPtr.)
There's a lot of work to do here before the kernel will even compile.
(...and ASSERT_NOT_REACHED => VERIFY_NOT_REACHED)
Since all of these checks are done in release builds as well,
let's rename them to VERIFY to prevent confusion, as everyone is
used to assertions being compiled out in release.
We can introduce a new ASSERT macro that is specifically for debug
checks, but I'm doing this wholesale conversion first since we've
accumulated thousands of these already, and it's not immediately
obvious which ones are suitable for ASSERT.
The `default_signal_action(u8 signal)` function already has the
full mapping. The only caveat being that now we need to make
sure the thread constructor and clear_signals() method do the work
of resetting the m_signal_action_data array, instead or relying on
the previous logic in set_default_signal_dispositions.
There's no real system here, I just added it to various functions
that I don't believe we ever want to call after initialization
has finished.
With these changes, we're able to unmap 60 KiB of kernel text
after init. :^)
Now that we no longer need to support the signal trampolines being
user-accessible inside the kernel memory range, we can get rid of the
"kernel" and "user-accessible" flags on Region and simply use the
address of the region to determine whether it's kernel or user.
This also tightens the page table mapping code, since it can now set
user-accessibility based solely on the virtual address of a page.
The signal trampoline was previously in kernelspace memory, but with
a special exception to make it user-accessible.
This patch moves it into each process's regular address space so we
can stop supporting user-allowed memory above 0xc0000000.
This patch adds Space, a class representing a process's address space.
- Each Process has a Space.
- The Space owns the PageDirectory and all Regions in the Process.
This allows us to reorganize sys$execve() so that it constructs and
populates a new Space fully before committing to it.
Previously, we would construct the new address space while still
running in the old one, and encountering an error meant we had to do
tedious and error-prone rollback.
Those problems are now gone, replaced by what's hopefully a set of much
smaller problems and missing cleanups. :^)
Wrap thread creation in a Thread::try_create() helper that first
allocates a kernel stack region. If that allocation fails, we propagate
an ENOMEM error to the caller.
This avoids the situation where a thread is half-constructed, without a
valid kernel stack, and avoids having to do messy cleanup in that case.
There's no need for this to be generic and support running from an
arbitrary thread context. Perf events are always generated from within
the thread being profiled, so take advantage of that to simplify the
code. Also use Vector capacity to avoid heap allocations.
This allows us to get rid of the thread lists in SchedulerData.
Also, instead of iterating over all threads to find a thread by id,
just use a lookup table. In the rare case of having to iterate over
all threads, just iterate the lookup table.
Rather than walking all Thread instances and putting them into
a vector to be sorted by priority, queue them into priority sorted
linked lists as soon as they become ready to be executed.
Attempt to wake idle processors to get threads to be scheduled more quickly.
We don't want to wait until the next timer tick if we have processors that
aren't doing anything.
This allows us to determine what the previous mode (user or kernel)
was, e.g. in the timer interrupt. This is used e.g. to determine
whether a signal handler should be set up.
Fixes#5096
Let's force callers to provide a VM range when allocating a region.
This makes ENOMEM error handling more visible and removes implicit
VM allocation which felt a bit magical.
This was done with the help of several scripts, I dump them here to
easily find them later:
awk '/#ifdef/ { print "#cmakedefine01 "$2 }' AK/Debug.h.in
for debug_macro in $(awk '/#ifdef/ { print $2 }' AK/Debug.h.in)
do
find . \( -name '*.cpp' -o -name '*.h' -o -name '*.in' \) -not -path './Toolchain/*' -not -path './Build/*' -exec sed -i -E 's/#ifdef '$debug_macro'/#if '$debug_macro'/' {} \;
done
# Remember to remove WRAPPER_GERNERATOR_DEBUG from the list.
awk '/#cmake/ { print "set("$2" ON)" }' AK/Debug.h.in
It was possible to signal a process while it was paging in an inode
backed VM object. This would cause the inode read to EINTR, and the
page fault handler would assert.
Solve this by simply not unblocking threads due to signals if they are
currently busy handling a page fault. This is probably not the best way
to solve this issue, so I've added a FIXME to that effect.
These changes are arbitrarily divided into multiple commits to make it
easier to find potentially introduced bugs with git bisect.
This commit touches some dbg() calls which are enclosed in macros. This
should be fine because with the new constexpr stuff, we ensure that the
stuff actually compiles.
Problem:
- Many constructors are defined as `{}` rather than using the ` =
default` compiler-provided constructor.
- Some types provide an implicit conversion operator from `nullptr_t`
instead of requiring the caller to default construct. This violates
the C++ Core Guidelines suggestion to declare single-argument
constructors explicit
(https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#c46-by-default-declare-single-argument-constructors-explicit).
Solution:
- Change default constructors to use the compiler-provided default
constructor.
- Remove implicit conversion operators from `nullptr_t` and change
usage to enforce type consistency without conversion.
This patch merges the profiling functionality in the kernel with the
performance events mechanism. A profiler sample is now just another
perf event, rather than a dedicated thing.
Since perf events were already per-process, this now makes profiling
per-process as well.
Processes with perf events would already write out a perfcore.PID file
to the current directory on death, but since we may want to profile
a process and then let it continue running, recorded perf events can
now be accessed at any time via /proc/PID/perf_events.
This patch also adds information about process memory regions to the
perfcore JSON format. This removes the need to supply a core dump to
the Profiler app for symbolication, and so the "profiler coredump"
mechanism is removed entirely.
There's still a hard limit of 4MB worth of perf events per process,
so this is by no means a perfect final design, but it's a nice step
forward for both simplicity and stability.
Fixes#4848Fixes#4849