...and also RangeAllocator => VirtualRangeAllocator.
This clarifies that the ranges we're dealing with are *virtual* memory
ranges and not anything else.
Now that all KResult and KResultOr are used consistently throughout the
kernel, it's no longer necessary to return negative error codes.
However, we were still doing that in some places, so let's fix all those
(bugs) by removing the minuses. :^)
Create the disk cache up front, so we can verify it succeeds.
Make the KBuffer allocation fail-able, so we can properly handle
failure when the user asks up to mount a Ext2 filesystem under
OOM conditions.
It's easy to forget the responsibility of validating and safely copying
kernel parameters in code that is far away from syscalls. ioctl's are
one such example, and bugs there are just as dangerous as at the root
syscall level.
To avoid this case, utilize the AK::Userspace<T> template in the ioctl
kernel interface so that implementors have no choice but to properly
validate and copy ioctl pointer arguments.
This patch changes the semantics of purgeable memory.
- AnonymousVMObject now has a "purgeable" flag. It can only be set when
constructing the object. (Previously, all anonymous memory was
effectively purgeable.)
- AnonymousVMObject now has a "volatile" flag. It covers the entire
range of physical pages. (Previously, we tracked ranges of volatile
pages, effectively making it a page-level concept.)
- Non-volatile objects maintain a physical page reservation via the
committed pages mechanism, to ensure full coverage for page faults.
- When an object is made volatile, it relinquishes any unused committed
pages immediately. If later made non-volatile again, we then attempt
to make a new committed pages reservation. If this fails, we return
ENOMEM to userspace.
mmap() now creates purgeable objects if passed the MAP_PURGEABLE option
together with MAP_ANONYMOUS. anon_create() memory is always purgeable.
Advisory locks don't actually prevent other processes from writing to
the file, but they do prevent other processes looking to acquire and
advisory lock on the file.
This implementation currently only adds non-blocking locks, which are
all I need for now.
We often get queried for the root inode, and it will always be cached
in memory anyway, so let's make Ext2FS::root_inode() fast by keeping
the root inode in a dedicated member variable.
To count the remaining children, we simply need to traverse the
directory and increment a counter. No need for a custom virtual that
all file systems have to implement. :^)
Unless we're accessing mutex-guarded metadata, there's no need to
acquire the inode lock.
The file system ID or inode index of a constructed inode will never
change, for example.
Reimplement directory traversal in terms of read_bytes() instead of
doing direct block access. This lets us avoid taking the inode lock
while iterating over the directory contents.
Once we've finalized all the file system metadata in flush_writes(),
we no longer need to hold the file system lock during the call to
BlockBasedFileSystem::flush_writes().
Ext2FS::get_inode() will remember unknown inode indices that it has
been asked about and put them into the inode cache as null inodes.
flush_writes() was not null-checking these while iterating, which
was a bug I finally managed to hit.
Flushing also seemed like a good time to drop unknown inodes from
the cache, since there's no good reason to hold to them indefinitely.
The file system lock is meant to protect the file system metadata
(super blocks, bitmaps, etc.) Not protect processes from reading
independent parts of the disk at once.
This patch introduces a new lock to protect the *block cache* instead,
which is the real thing that needs synchronization.
Forcing users of a FileDescription to seek before they can read/write
makes it inherently racy. This patch adds variants of read/write that
simply ignore the "current offset" of the description in favor of a
caller-supplied offset.