Commit graph

208 commits

Author SHA1 Message Date
Lenny Maiorani
c6acf64558 Kernel: Change static constexpr variables to constexpr where possible
Function-local `static constexpr` variables can be `constexpr`. This
can reduce memory consumption, binary size, and offer additional
compiler optimizations.

These changes result in a stripped x86_64 kernel binary size reduction
of 592 bytes.
2022-02-09 21:04:51 +00:00
Andreas Kling
3845c90e08 Kernel: Remove unnecessary includes from Thread.h
...and deal with the fallout by adding missing includes everywhere.
2022-01-30 16:21:59 +01:00
Idan Horowitz
e28af4a2fc Kernel: Stop using HashMap in Mutex
This commit removes the usage of HashMap in Mutex, thereby making Mutex
be allocation-free.

In order to achieve this several simplifications were made to Mutex,
removing unused code-paths and extra VERIFYs:
 * We no longer support 'upgrading' a shared lock holder to an
   exclusive holder when it is the only shared holder and it did not
   unlock the lock before relocking it as exclusive. NOTE: Unlike the
   rest of these changes, this scenario is not VERIFY-able in an
   allocation-free way, as a result the new LOCK_SHARED_UPGRADE_DEBUG
   debug flag was added, this flag lets Mutex allocate in order to
   detect such cases when debugging a deadlock.
 * We no longer support checking if a Mutex is locked by the current
   thread when the Mutex was not locked exclusively, the shared version
   of this check was not used anywhere.
 * We no longer support force unlocking/relocking a Mutex if the Mutex
   was not locked exclusively, the shared version of these functions
   was not used anywhere.
2022-01-29 16:45:39 +01:00
Andreas Kling
b56646e293 Kernel: Switch process file descriptor table from spinlock to mutex
There's no reason for this to use a spinlock. Instead, let's allow
threads to block if someone else is using the descriptor table.
2022-01-29 02:17:09 +01:00
Andreas Kling
8ebec2938c Kernel: Convert process file descriptor table to a SpinlockProtected
Instead of manually locking in the various member functions of
Process::OpenFileDescriptions, simply wrap it in a SpinlockProtected.
2022-01-29 02:17:06 +01:00
Andreas Kling
31c1094577 Kernel: Don't mess with thread state in Process::do_exec()
We were marking the execing thread as Runnable near the end of
Process::do_exec().

This was necessary for exec in processes that had never been scheduled
yet, which is a specific edge case that only applies to the very first
userspace process (normally SystemServer). At this point, such threads
are in the Invalid state.

In the common case (normal userspace-initiated exec), making the current
thread Runnable meant that we switched away from its current state:
Running. As the thread is indeed running, that's a bogus change!
This created a short time window in which the thread state was bogus,
and any attempt to block the thread would panic the kernel (due to a
bogus thread state in Thread::block() leading to VERIFY_NOT_REACHED().)

Fix this by not touching the thread state in Process::do_exec()
and instead make the first userspace thread Runnable directly after
calling Process::exec() on it in try_create_userspace_process().

It's unfortunate that exec() can be called both on the current thread,
and on a new thread that has never been scheduled. It would be good to
not have the latter edge case, but fixing that will require larger
architectural changes outside the scope of this fix.
2022-01-27 11:18:25 +01:00
Brian Gianforcaro
e954b4bdd4 Kernel: Return error from sys$execve() when called with zero arguments
There are many assumptions in the stack that argc is not zero, and
argv[0] points to a valid string. The recent pwnkit exploit on Linux
was able to exploit this assumption in the `pkexec` utility
(a SUID-root binary) to escalate from any user to root.

By convention `execve(..)` should always be called with at least one
valid argument, so lets enforce that semantic to harden the system
against vulnerabilities like pwnkit.

Reference: https://www.qualys.com/2022/01/25/cve-2021-4034/pwnkit.txt
2022-01-26 13:05:59 +01:00
Idan Horowitz
d1433c35b0 Kernel: Handle OOM failures in find_shebang_interpreter_for_executable 2022-01-26 02:37:03 +02:00
Idan Horowitz
8cf0e4a5e4 Kernel: Eliminate allocations from generate_auxiliary_vector 2022-01-26 02:37:03 +02:00
Jelle Raaijmakers
df73e8b46b Kernel: Allow program headers to align on multiples of PAGE_SIZE
These checks in `sys$execve` could trip up the system whenever you try
to execute an `.so` file. For example, double-clicking `libwasm.so` in
Terminal crashes the kernel.

This changes the program header alignment checks to reflect the same
checks in LibELF, and passes the requested alignment on to
`::try_allocate_range()`.
2022-01-23 00:11:56 +02:00
Andreas Kling
0e08763483 Kernel: Wrap much of sys$execve() in a block scope
Since we don't return normally from this function, let's make it a
little extra difficult to accidentally leak something by leaving it on
the stack in this function.
2022-01-13 23:57:33 +01:00
Andreas Kling
0e72b04e7d Kernel: Perform exec-into-new-image directly in sys$execve()
This ensures that everything allocated on the stack in Process::exec()
gets cleaned up. We had a few leaks related to the parsing of shebang
(#!) executables that get fixed by this.
2022-01-13 23:57:33 +01:00
Idan Horowitz
cfb9f889ac LibELF: Accept Span instead of Pointer+Size in validate_program_headers 2022-01-13 22:40:25 +01:00
Idan Horowitz
3e959618c3 LibELF: Use StringBuilders instead of Strings for the interpreter path
This is required for the Kernel's usage of LibELF, since Strings do not
expose allocation failure.
2022-01-13 22:40:25 +01:00
Andreas Kling
8ad46fd8f5 Kernel: Stop leaking executable path in successful sys$execve()
Since we don't return from sys$execve() when it's successful, we have to
take special care to tear down anything we've allocated.

Turns out we were not doing this for the full executable path itself.
2022-01-13 16:15:37 +01:00
Idan Horowitz
40159186c1 Kernel: Remove String use-after-free in generate_auxiliary_vector
Instead we generate the random bytes directly in
make_userspace_context_for_main_thread if requested.
2022-01-13 00:20:08 -08:00
Idan Horowitz
bc85b64a38 Kernel: Replace usages of String::formatted with KString in sys$exec 2022-01-12 16:09:09 +02:00
Brian Gianforcaro
54b9a4ec1e Kernel: Handle promise violations in the syscall handler
Previously we would crash the process immediately when a promise
violation was found during a syscall. This is error prone, as we
don't unwind the stack. This means that in certain cases we can
leak resources, like an OwnPtr / RefPtr tracked on the stack. Or
even leak a lock acquired in a ScopeLockLocker.

To remedy this situation we move the promise violation handling to
the syscall handler, right before we return to user space. This
allows the code to follow the normal unwind path, and grantees
there is no longer any cleanup that needs to occur.

The Process::require_promise() and Process::require_no_promises()
functions were modified to return ErrorOr<void> so we enforce that
the errors are always propagated by the caller.
2021-12-29 18:08:15 +01:00
Brian Gianforcaro
bad6d50b86 Kernel: Use Process::require_promise() instead of REQUIRE_PROMISE()
This change lays the foundation for making the require_promise return
an error hand handling the process abort outside of the syscall
implementations, to avoid cases where we would leak resources.

It also has the advantage that it makes removes a gs pointer read
to look up the current thread, then process for every syscall. We
can instead go through the Process this pointer in most cases.
2021-12-29 18:08:15 +01:00
Guilherme Goncalves
33b78915d3 Kernel: Propagate overflow errors from Memory::page_round_up
Fixes #11402.
2021-12-28 23:08:50 +01:00
Andreas Kling
1d08b671ea Kernel: Enter new address space before destroying old in sys$execve()
Previously we were assigning to Process::m_space before actually
entering the new address space (assigning it to CR3.)

If a thread was preempted by the scheduler while destroying the old
address space, we'd then attempt to resume the thread with CR3 pointing
at a partially destroyed address space.

We could then crash immediately in write_cr3(), right after assigning
the new value to CR3. I am hopeful that this may have been the bug
haunting our CI for months. :^)
2021-12-23 01:18:26 +01:00
Hendiadyoin1
c860e0ab95 Kernel: Add implicit auto qualifiers in Syscalls 2021-12-18 10:31:18 -08:00
Idan Horowitz
81a76a30a1 Kernel: Preserve pending signals across execve(2)s
As required by posix. Also rename Thread::clear_signals to
Thread::reset_signals_for_exec since it doesn't actually clear any
pending signals, but rather does execve related signal book-keeping.
2021-12-12 08:34:19 +02:00
Idan Horowitz
92a6c91f4e Kernel: Preserve signal mask across fork(2) and execve(2)
A child created via fork(2) inherits a copy of its parent's signal
mask; the signal mask is preserved across execve(2).
2021-12-12 08:34:19 +02:00
Andreas Kling
88b6428c25 AK: Make Vector::try_* functions return ErrorOr<void>
Instead of signalling allocation failure with a bool return value
(false), we now use ErrorOr<void> and return ENOMEM as appropriate.
This allows us to use TRY() and MUST() with Vector. :^)
2021-11-10 21:58:58 +01:00
Andreas Kling
79fa9765ca Kernel: Replace KResult and KResultOr<T> with Error and ErrorOr<T>
We now use AK::Error and AK::ErrorOr<T> in both kernel and userspace!
This was a slightly tedious refactoring that took a long time, so it's
not unlikely that some bugs crept in.

Nevertheless, it does pass basic functionality testing, and it's just
real nice to finally see the same pattern in all contexts. :^)
2021-11-08 01:10:53 +01:00
Ben Wiederhake
c05c5a7ff4 Kernel: Clarify ambiguous {File,Description}::absolute_path
Found due to smelly code in InodeFile::absolute_path.

In particular, this replaces the following misleading methods:

File::absolute_path
This method *never* returns an actual path, and if called on an
InodeFile (which is impossible), it would VERIFY_NOT_REACHED().

OpenFileDescription::try_serialize_absolute_path
OpenFileDescription::absolute_path
These methods do not guarantee to return an actual path (just like the
other method), and just like Custody::absolute_path they do not
guarantee accuracy. In particular, just renaming the method made a
TOCTOU bug obvious.

The new method signatures use KResultOr, just like
try_serialize_absolute_path() already did.
2021-10-31 12:06:28 +01:00
Ben Wiederhake
735da58d44 Kernel: Avoid OpenFileDescription::absolute_path 2021-10-31 12:06:28 +01:00
Brian Gianforcaro
e8ec1e908d Kernel: Only instantiate main_program_metadata in the scope it's needed
pvs-studio flagged this as a potential perf optimization.
2021-09-16 17:17:13 +02:00
Andreas Kling
dd82f68326 Kernel: Use KString all the way in sys$execve()
This patch converts all the usage of AK::String around sys$execve() to
using KString instead, allowing us to catch and propagate OOM errors.

It also required changing the kernel CommandLine helper class to return
a vector of KString for the userspace init program arguments.
2021-09-09 21:25:10 +02:00
Andreas Kling
4a9c18afb9 Kernel: Rename FileDescription => OpenFileDescription
Dr. POSIX really calls these "open file description", not just
"file description", so let's call them exactly that. :^)
2021-09-07 13:53:14 +02:00
Andreas Kling
dbd639a2d8 Kernel: Convert much of sys$execve() to using KString
Make use of the new FileDescription::try_serialize_absolute_path() to
avoid String in favor of KString throughout much of sys$execve() and
its helpers.
2021-09-07 13:53:14 +02:00
Andreas Kling
226383f45b LibELF: Use StringView to carry temporary strings in auxiliary vector
Let's not force clients to provide a String.
2021-09-07 13:53:14 +02:00
Andreas Kling
55b0b06897 Kernel: Store process names as KString 2021-09-07 13:53:14 +02:00
Andreas Kling
fe2e25edad Kernel: Add a comment explaining an alternate path in Process::exec()
I had to look at this for a moment before I realized that sys$execve()
and the spawning of /bin/SystemServer at boot are taking two different
paths out of exec().

Add a comment to help the next person looking at it. :^)
2021-09-07 01:34:26 +02:00
Andreas Kling
5d06ab6531 Kernel: Fix file description leak in sys$execve()
Before this patch, we were leaking a ref on the open file description
used for the interpreter (the dynamic loader) in sys$execve().

This surfaced when adapting the syscall to use TRY(), since we were now
correctly transferring ownership of the interpreter to Process::exec()
and no longer holding on to a local copy of it (in `elf_result`).

Fixing the leak uncovered another problem. The interpreter description
would now get destroyed when returning from do_exec(), which led to a
kernel panic when attempting to acquire a mutex.

This happens because we're in a particularly delicate state when
returning from do_exec(). Everything is primed for the upcoming context
switch into the new executable image, and trying to block the thread
at this point will panic the kernel.

We fix this by destroying the interpreter description earlier in
do_exec(), at the point where we no longer need it.
2021-09-07 01:18:02 +02:00
Andreas Kling
e226400dd8 Kernel: Don't seek the program executable description in sys$execve()
The dynamic loader doesn't care if the kernel has moved the file
cursor around before it gains control.
2021-09-07 01:18:02 +02:00
Andreas Kling
f4624e4ee1 Kernel: Hoist allocation of main program FD in sys$execve()
When executing a dynamically linked program, we need to pass the main
program executable via a file descriptor to the dynamic loader.

Before this patch, we were allocating an FD for this purpose long after
it was safe to do anything fallible. If we were unable to allocate an
FD we would simply panic the kernel(!)

We now hoist the allocation so it can fail before we've committed to
a new executable.
2021-09-07 01:18:02 +02:00
Andreas Kling
b141bfe53b Kernel: Reorganize ELF loading so it can use TRY()
Due to the use of ELF::Image::for_each_program_header(), we were
previously unable to use TRY() in the ELF loading code (since the return
statement inside TRY() would only return from the iteration callback.)
2021-09-07 01:18:02 +02:00
Andreas Kling
56a2594de7 Kernel: Make KString factories return KResultOr + use TRY() everywhere
There are a number of places that don't have an error propagation path
right now, so I've added FIXME's about that.
2021-09-06 19:25:36 +02:00
Andreas Kling
cd8d52e6ae Kernel: Improve API names for switching address spaces
- enter_space => enter_address_space
- enter_process_paging_scope => enter_process_address_space
2021-09-06 18:56:51 +02:00
Andreas Kling
298cd57fe7 Kernel: Allocate signal trampoline before committing to a sys$execve()
Once we commit to a new executable image in sys$execve(), we can no
longer return with an error to whoever called us from userspace.
We must make sure to surface any potential errors before that point.

This patch moves signal trampoline allocation before the commit.
A number of other things remain to be moved.
2021-09-06 18:56:51 +02:00
Andreas Kling
6863d015ec Kernel: Use TRY() more in sys$execve()
I just keep finding more and more places to make use of this. :^)
2021-09-06 18:56:51 +02:00
Andreas Kling
009ea5013d Kernel: Use TRY() in find_elf_interpreter_for_executable() 2021-09-06 18:56:51 +02:00
Andreas Kling
511ebffd94 Kernel: Improve find_elf_interpreter_for_executable() parameter names 2021-09-06 18:56:51 +02:00
Andreas Kling
645e29a88b Kernel: Don't turn I/O errors during sys$execve() into ENOEXEC
Instead, just propagate whatever the real error was.
2021-09-06 13:06:05 +02:00
Andreas Kling
84addef10f Kernel: Improve arguments retrieval error propagation in sys$execve()
Instead of turning any arguments related error into an EFAULT, we now
propagate the innermost error during arguments retrieval.
2021-09-06 13:06:05 +02:00
Andreas Kling
6e3381ac32 Kernel: Use KResultOr and TRY() for {Shared,Private}InodeVMObject 2021-09-06 13:06:05 +02:00
Andreas Kling
7981422500 Kernel: Make Threads always have a name
We previously allowed Thread to exist in a state where its m_name was
null, and had to work around that in various places.

This patch removes that possibility and forces those who would create a
thread (or change the name of one) to provide a NonnullOwnPtr<KString>
with the name.
2021-09-06 13:06:05 +02:00
Andreas Kling
75564b4a5f Kernel: Make kernel region allocators return KResultOr<NOP<Region>>
This expands the reach of error propagation greatly throughout the
kernel. Sadly, it also exposes the fact that we're allocating (and
doing other fallible things) in constructors all over the place.

This patch doesn't attempt to address that of course. That's work for
our future selves.
2021-09-06 01:55:27 +02:00