When deleting a directory, the rmdir syscall should fail if the path was
unveiled without the 'c' permission. This matches the same behavior that
OpenBSD enforces when doing this kind of operation.
When deleting a file, the unlink syscall should fail if the path was
unveiled without the 'w' permission, to ensure that userspace is aware
of the possibility of removing a file only when the path was unveiled as
writable.
When using the userdel utility, we now unveil that directory path with
the unveil 'c' permission so removal of an account home directory is
done properly.
The Storage subsystem, like the Audio and HID subsystems, exposes Unix
device files (for example, in the /dev directory). To ensure consistency
across the repository, we should make the Storage subsystem to reside in
the Kernel/Devices directory like the two other mentioned subsystems.
The contents of the directory inode could change if we are not taking so
we must take the m_inode_lock to prevent corruption when reading the
directory contents.
This is not needed, because when we are doing this traversing, functions
that are called from this function are using proper and more "atomic"
locking.
"Wherever applicable" = most places, actually :^), especially for
networking and filesystem timestamps.
This includes changes to unzip, which uses DOSPackedTime, since that is
changed for the FAT file systems.
That's what this class really is; in fact that's what the first line of
the comment says it is.
This commit does not rename the main files, since those will contain
other time-related classes in a little bit.
The Raspberry Pi hardware doesn't support a proper software-initiated
shutdown, so this instead uses the watchdog to reboot to a special
partition which the firmware interprets as an immediate halt on
shutdown. When running under Qemu, this causes the emulator to exit.
These functions would have caused a `-Woverloaded-virtual` warning with
GCC 13, as they shadow `File::{attach,detach}(OpenFileDescription&)`.
Both of these functions had a single call site. This commit inlines
`attach` into its only caller, `FIFO::open_direction`.
Instead of explicitly checking `is_fifo()` in `~OpenFileDescription`
before running the `detach(Direction)` overload, let's just override the
regular `detach(OpenFileDescription&)` for `FIFO` to perform this action
instead.
Whenever an entry is added to the cache, the last element is removed to
make space for the new entry(if the cache is full). To make this an LRU
cache, the entry needs to be moved to the front of the list when there
is a cache hit so that the least recently used entry moves to the end
to be evicted first.
This was the last change that was needed to be able boot with the flag
of LOCK_IN_CRITICAL_DEBUG. That flag is not always enabled because there
are still other issues in which we hold a spinlock and still try to lock
a mutex.
Instead of using one global mutex we can protect internal structures of
the InodeWatcher class with SpinlockProtected wrappers. This in turn
allows the InodeWatcher code to be called from other parts in the kernel
while holding a prior spinlock properly.
`process.fds()` is protected by a Mutex, which causes issues when we try
to acquire it while holding a Spinlock. Since nothing seems to use this
value, let's just remove it entirely for now.
The existing `read_entire` is quite slow due to allocating and copying
multiple times, but it is simultaneously quite hard to get rid of in a
single step. As a replacement, add a new function that reads as much as
possible directly into a user-provided buffer.
To do this we also need to get rid of LockRefPtrs in the USB code as
well.
Most of the SysFS nodes are statically generated during boot and are not
mutated afterwards.
The same goes for general device code - once we generate the appropriate
SysFS nodes, we almost never mutate the node pointers afterwards, making
locking unnecessary.
We have a problem with the original utimensat syscall because when we
do call LibC futimens function, internally we provide an empty path,
and the Kernel get_syscall_path_argument method will detect this as an
invalid path.
This happens to spit an error for example in the touch utility, so if a
user is running "touch non_existing_file", it will create that file, but
the user will still see an error coming from LibC futimens function.
This new syscall gets an open file description and it provides the same
functionality as utimensat, on the specified open file description.
The new syscall will be used later by LibC to properly implement LibC
futimens function so the situation described with relation to the
"touch" utility could be fixed.
These were easy to pick-up as these pointers are assigned during the
construction point and are never changed afterwards.
This small change to these pointers will ensure that our code will not
accidentally assign these pointers with a new object which is always a
kind of bug we will want to prevent.
These were stored in a bunch of places. The main one that's a bit iffy
is the Mutex::m_holder one, which I'm going to simplify in a subsequent
commit.
In Plan9FS and WorkQueue, we can't make the NNRPs const due to
initialization order problems. That's probably doable with further
cleanup, but left as an exercise for our future selves.
Before starting this, I expected the thread blockers to be a problem,
but as it turns out they were super straightforward (for once!) as they
don't mutate the thread after initiating a block, so they can just use
simple const-ified NNRPs.
- Instead of taking the first new thread as an out-parameter, we now
bundle the process and its first thread in a struct and use that
as the return value.
- Make all Process factory functions return ErrorOr. Use this to convert
some places to more TRY().
- Drop the "try_" prefix on Process factory functions.
The only persistent one of these was Thread::m_process and that never
changes after initialization. Make it const to enforce this and switch
everything over to RefPtr & NonnullRefPtr.
- The host custody never changes after initialization, so there's no
need to protect it with a spinlock.
- To enforce the fact that some members don't change after
initialization, make them const.
There was only one permanent storage location for these: as a member
in the Mount class.
That member is never modified after Mount initialization, so we don't
need to worry about races there.
This commit fixes a kernel panic that happened when unmounting
a disk due to an invalid memory access.
This was because `DiskCache` initializes two linked lists that use
an argument `KBuffer` as the storage for their elements.
Since the member `KBuffer` was declared after the two lists,
when `DiskCache`'s destructor was called, then `KBuffer`'s destructor
was called before the ones of the two lists, causing a page fault in
the kernel.
This is done with 2 major steps:
1. Remove JailManagement singleton and use a structure that resembles
what we have with the Process object. This is required later for the
second step in this commit, but on its own, is a major change that
removes this clunky singleton that had no real usage by itself.
2. Use IntrusiveLists to keep references to Process objects in the same
Jail so it will be much more straightforward to iterate on this kind
of objects when needed. Previously we locked the entire Process list
and we did a simple pointer comparison to check if the checked
Process we iterate on is in the same Jail or not, which required
taking multiple Spinlocks in a very clumsy and heavyweight way.
This was mostly straightforward, as all the storage locations are
guarded by some related mutex.
The use of old-school associated mutexes instead of MutexProtected
is unfortunate, but the process to modernize such code is ongoing.
This patch switches away from {Nonnull,}LockRefPtr to the non-locking
smart pointers throughout the kernel.
I've looked at the handful of places where these were being persisted
and I don't see any race situations.
Note that the process file descriptor table (Process::m_fds) was already
guarded via MutexProtected.
Before of this patch, we looked at the unveil data of the FinalizerTask,
which naturally doesn't have any unveil restrictions, therefore allowing
an unveil bypass for a process that enabled performance coredumps.
To ensure we always check the dumped process unveil data, an option to
pass a Process& has been added to a couple of methods in the class of
VirtualFileSystem.
Since the ProcFS doesn't hold many global objects within it, the need
for a fully-structured design of backing components and a registry like
with the SysFS is no longer true.
To acommodate this, let's remove all backing store and components of the
ProcFS, so now it resembles what we had in the early days of ProcFS in
the project - a mostly-static filesystem, with very small amount of
kmalloc allocations needed.
We still use the inode index mechanism to understand the role of each
inode, but this is done in a much "static"ier way than before.
This subdirectory is meant to hold all constant data related to the
kernel. This means that this data is never meant to updated and is
relevant from system boot to system shutdown.
Move the inodes of "load_base", "cmdline" and "system_mode" to that
directory. All nodes under this new subdirectory are generated during
boot, and therefore don't require calling kmalloc each time we need to
read them. Locking is also not necessary, because these nodes and their
data are completely static once being generated.
This is considered somewhat an abstraction layer violation, because we
should always let userspace to decide on the root filesystem mount flags
because it allows the user to configure the mount table to preferences
that they desire.
Now that SystemServer is modified to re-mount the root mount with the
desired flags, we can just mount the root filesystem without assuming
special flags.