Until now, our kernel has reimplemented a number of AK classes to
provide automatic internal locking:
- RefPtr
- NonnullRefPtr
- WeakPtr
- Weakable
This patch renames the Kernel classes so that they can coexist with
the original AK classes:
- RefPtr => LockRefPtr
- NonnullRefPtr => NonnullLockRefPtr
- WeakPtr => LockWeakPtr
- Weakable => LockWeakable
The goal here is to eventually get rid of the Lock* classes in favor of
using external locking.
All users which relied on the default constructor use a None lock rank
for now. This will make it easier to in the future remove LockRank and
actually annotate the ranks by searching for None.
This matches out general macro use, and specifically other verification
macros like VERIFY(), VERIFY_NOT_REACHED(), VERIFY_INTERRUPTS_ENABLED(),
and VERIFY_INTERRUPTS_DISABLED().
If the final copy_to_user() call fails when writing the file descriptors
to the output array, we have to make sure the file descriptors don't
remain in the process file descriptor table. Otherwise they are
basically leaked, as userspace is not aware of them.
This matches the behavior of our sys$socketpair() implementation.
Each of these strings would previously rely on StringView's char const*
constructor overload, which would call __builtin_strlen on the string.
Since we now have operator ""sv, we can replace these with much simpler
versions. This opens the door to being able to remove
StringView(char const*).
No functional changes.
The extra argument to fcntl is a pointer in the case of F_GETLK/F_SETLK
and we were pulling out a u32, leading to pointer truncation on x86_64.
Among other things, this fixes Assistant on x86_64 :^)
`sigsuspend` was previously implemented using a poll on an empty set of
file descriptors. However, this broke quite a few assumptions in
`SelectBlocker`, as it verifies at least one file descriptor to be
ready after waking up and as it relies on being notified by the file
descriptor.
A bare-bones `sigsuspend` may also be implemented by relying on any of
the `sigwait` functions, but as `sigsuspend` features several (currently
unimplemented) restrictions on how returns work, it is a syscall on its
own.
This exposes the child processes for a process as a directory
of symlinks to the respective /proc entries for each child.
This makes for an easier and possibly more efficient way
to find and count a process's children. Previously the only
method was to parse the entire /proc/all JSON file.
When we lock a mutex, eventually `Thread::block` is invoked which could
in turn invoke `Process::big_lock().restore_exclusive_lock()`. This
would then try to add the current thread to a different blocked thread
list then the one in use for the original mutex being locked, and
because it's an intrusive list, the thread is removed from its original
list during the `.append()`. When the original mutex eventually
unblocks, we no longer have the thread in the intrusive blocked threads
list and we panic.
Solve this by making the big lock mutex special and giving it its own
blocked thread list. Because the process big lock is temporary and is
being actively removed from e.g. syscalls, it's a matter of time before
we can also remove the fix introduced by this commit.
Fixes issue #9401.
This makes pledge() ignore promises that would otherwise cause it to
fail with EPERM, which is very useful for allowing programs to run under
a "jail" so to speak, without having them termiate early due to a
failing pledge() call.
The obsolete ttyname and ptsname syscalls are removed.
LibC doesn't rely on these anymore, and it helps simplifying the Kernel
in many places, so it's an overall an improvement.
In addition to that, /proc/PID/tty node is removed too as it is not
needed anymore by userspace to get the attached TTY of a process, as
/dev/tty (which is already a character device) represents that as well.
POSIX requires that sigaction() and friends set a _process-wide_ signal
handler, so move signal handlers and flags inside Process.
This also fixes a "pid/tid confusion" FIXME, as we can now send the
signal to the process and let that decide which thread should get the
signal (which is the thread with tid==pid, but that's now the Process's
problem).
Note that each thread still retains its signal mask, as that is local to
each thread.
Arguments larger than 32bit need to be passed as a pointer on a 32bit
architectures. sys$profiling_enable has u64 event_mask argument,
which means that it needs to be passed as an pointer. Previously upper
32bits were filled by garbage.
Move the definitions for maximum argument and environment size to
Process.h from execve.cpp. This allows sysconf(_SC_ARG_MAX) to return
the actual argument maximum of 128 KiB to userspace.
This commit removes the usage of HashMap in Mutex, thereby making Mutex
be allocation-free.
In order to achieve this several simplifications were made to Mutex,
removing unused code-paths and extra VERIFYs:
* We no longer support 'upgrading' a shared lock holder to an
exclusive holder when it is the only shared holder and it did not
unlock the lock before relocking it as exclusive. NOTE: Unlike the
rest of these changes, this scenario is not VERIFY-able in an
allocation-free way, as a result the new LOCK_SHARED_UPGRADE_DEBUG
debug flag was added, this flag lets Mutex allocate in order to
detect such cases when debugging a deadlock.
* We no longer support checking if a Mutex is locked by the current
thread when the Mutex was not locked exclusively, the shared version
of this check was not used anywhere.
* We no longer support force unlocking/relocking a Mutex if the Mutex
was not locked exclusively, the shared version of these functions
was not used anywhere.
This ensures that everything allocated on the stack in Process::exec()
gets cleaned up. We had a few leaks related to the parsing of shebang
(#!) executables that get fixed by this.
This function is an extended version of `chmod(2)` that lets one control
whether to dereference symlinks, and specify a file descriptor to a
directory that will be used as the base for relative paths.
Previously we would crash the process immediately when a promise
violation was found during a syscall. This is error prone, as we
don't unwind the stack. This means that in certain cases we can
leak resources, like an OwnPtr / RefPtr tracked on the stack. Or
even leak a lock acquired in a ScopeLockLocker.
To remedy this situation we move the promise violation handling to
the syscall handler, right before we return to user space. This
allows the code to follow the normal unwind path, and grantees
there is no longer any cleanup that needs to occur.
The Process::require_promise() and Process::require_no_promises()
functions were modified to return ErrorOr<void> so we enforce that
the errors are always propagated by the caller.