Instead of just having a giant KBuffer that is not resizeable easily, we
use multiple AnonymousVMObjects in one Vector to store them.
The idea is to not have to do giant memcpy or memset each time we need
to allocate or de-allocate memory for TmpFS inodes, but instead, we can
allocate only the desired block range when trying to write to it.
Therefore, it is also possible to have data holes in the inode content
in case of skipping an entire set of one data block or more when writing
to the inode content, thus, making memory usage much more efficient.
To ensure we don't run out of virtual memory range, don't allocate a
Region in advance to each TmpFSInode, but instead try to allocate a
Region on IO operation, and then use that Region to map the VMObjects
in IO loop.
We make these methods non-virtual because we want to ensure we properly
enforce locking of the m_inode_lock mutex. Also, for write operations,
we want to call prepare_to_write_data before the actual write. The
previous design required us to ensure the callers do that at various
places which lead to hard-to-find bugs. By moving everything to a place
where we call prepare_to_write_data only once, we eliminate a possibilty
of forgeting to call it on some code path in the kernel.
Instead of having three separate APIs (one for each timestamp),
there's now only Inode::update_timestamps() and it takes 3x optional
timestamps. The non-empty timestamps are updated while holding the inode
mutex, and the outside world no longer has to look at intermediate
timestamp states.
Until now, our kernel has reimplemented a number of AK classes to
provide automatic internal locking:
- RefPtr
- NonnullRefPtr
- WeakPtr
- Weakable
This patch renames the Kernel classes so that they can coexist with
the original AK classes:
- RefPtr => LockRefPtr
- NonnullRefPtr => NonnullLockRefPtr
- WeakPtr => LockWeakPtr
- Weakable => LockWeakable
The goal here is to eventually get rid of the Lock* classes in favor of
using external locking.
Instead of requiring each FileSystem implementation to call this method
when trying to write data, do the calls at 2 points to avoid further
calls (or lack of them due to not remembering to use it) at other files
and locations in the codebase.
Each of these strings would previously rely on StringView's char const*
constructor overload, which would call __builtin_strlen on the string.
Since we now have operator ""sv, we can replace these with much simpler
versions. This opens the door to being able to remove
StringView(char const*).
No functional changes.
As with the previous commit, we put a distinction between filesystems
that require a file description and those which don't, but now in a much
more readable mechanism - all initialization properties as well as the
create static method are grouped to create the FileSystemInitializer
structure. Then when we need to initialize an instance, we iterate over
a table of these structures, checking for matching structure and then
validating the given arguments from userspace against the requirements
to ensure we can create a valid instance of the requested filesystem.
The HashMap of InodeIndex->Inode in TmpFS only had one purpose: looking
up parent inodes by index.
Instead of using a map for this, we can simply give each inode a WeakPtr
to its parent inode. This saves us the trouble of dealing with the
fallibility of HashMap allocations, and it just generally simpler. :^)
Previously we were uncaching inodes from TmpFSInode::one_ref_left().
This was not safe, since one_ref_left() was effectively being called
on a raw pointer after decrementing the local ref count and observing
it become 1. There was a race here where someone else could trigger
the destructor by unreffing to 0 before one_ref_left() got called,
causing us to call one_ref_left() on a deleted inode.
We fix this by using the new remove_from_secondary_lists() mechanism
in ListedRefCounted and synchronizing all access to the TmpFS inode
map with the main Inode::all_instances() lock.
There's probably a nicer way to solve this.
We were doing this dance in notify_watchers():
set_metadata_dirty(true);
set_metadata_dirty(false);
This was done in order to force out inode watcher events immediately.
Unfortunately, this was racy, as if SyncTask got scheduled at the wrong
moment, it would try to flush metadata for a clean inode. This then got
trapped by the VERIFY() statement in Inode::sync_all():
VERIFY(inode.is_metadata_dirty());
This patch fixes the issue by replacing notify_watchers() with lazy
metadata notifications like all other filesystems.
We now use AK::Error and AK::ErrorOr<T> in both kernel and userspace!
This was a slightly tedious refactoring that took a long time, so it's
not unlikely that some bugs crept in.
Nevertheless, it does pass basic functionality testing, and it's just
real nice to finally see the same pattern in all contexts. :^)
Prior to this change, both uid_t and gid_t were typedef'ed to `u32`.
This made it easy to use them interchangeably. Let's not allow that.
This patch adds UserID and GroupID using the AK::DistinctNumeric
mechanism we've already been employing for pid_t/ProcessID.
We need some overflow checks due to the implementation of TmpFS.
When size_t is 32 bits and off_t is 64 bits, we might overflow our
KBuffer max size and confuse the KBuffer set_size code, causing a VERIFY
failure. Make sure that resulting offset + size will fit in a size_t.
Another constraint, we make sure that the resulting offset + size will
be less than half of the maximum value of a size_t, because we double
the KBuffer size each time we resize it.
This commit converts naked `new`s to `AK::try_make` and `AK::try_create`
wherever possible. If the called constructor is private, this can not be
done, so we instead now use the standard-defined and compiler-agnostic
`new (nothrow)`.
This patch modifies InodeWatcher to switch to a one watcher, multiple
watches architecture. The following changes have been made:
- The watch_file syscall is removed, and in its place the
create_iwatcher, iwatcher_add_watch and iwatcher_remove_watch calls
have been added.
- InodeWatcher now holds multiple WatchDescriptions for each file that
is being watched.
- The InodeWatcher file descriptor can be read from to receive events on
all watched files.
Co-authored-by: Gunnar Beutner <gunnar@beutner.name>
The error handling in all these cases was still using the old style
negative values to indicate errors. We have a nicer solution for this
now with KResultOr<T>. This change switches the interface and then all
implementers to use the new style.