Commit graph

1450 commits

Author SHA1 Message Date
kleines Filmröllchen
b645f87b7a Kernel: Overhaul system shutdown procedure
For a long time, our shutdown procedure has basically been:
- Acquire big process lock.
- Switch framebuffer to Kernel debug console.
- Sync and lock all file systems so that disk caches are flushed and
  files are in a good state.
- Use firmware and architecture-specific functionality to perform
  hardware shutdown.

This naive and simple shutdown procedure has multiple issues:
- No processes are terminated properly, meaning they cannot perform more
  complex cleanup work. If they were in the middle of I/O, for instance,
  only the data that already reached the Kernel is written to disk, and
  data corruption due to unfinished writes can therefore still occur.
- No file systems are unmounted, meaning that any important unmount work
  will never happen. This is important for e.g. Ext2, which has
  facilites for detecting improper unmounts (see superblock's s_state
  variable) and therefore requires a proper unmount to be performed.
  This was also the starting point for this PR, since I wanted to
  introduce basic Ext2 file system checking and unmounting.
- No hardware is properly shut down beyond what the system firmware does
  on its own.
- Shutdown is performed within the write() call that asked the Kernel to
  change its power state. If the shutdown procedure takes longer (i.e.
  when it's done properly), this blocks the process causing the shutdown
  and prevents any potentially-useful interactions between Kernel and
  userland during shutdown.

In essence, current shutdown is a glorified system crash with minimal
file system cleanliness guarantees.

Therefore, this commit is the first step in improving our shutdown
procedure. The new shutdown flow is now as follows:
- From the write() call to the power state SysFS node, a new task is
  started, the Power State Switch Task. Its only purpose is to change
  the operating system's power state. This task takes over shutdown and
  reboot duties, although reboot is not modified in this commit.
- The Power State Switch Task assumes that userland has performed all
  shutdown duties it can perform on its own. In particular, it assumes
  that all kinds of clean process shutdown have been done, and remaining
  processes can be hard-killed without consequence. This is an important
  separation of concerns: While this commit does not modify userland, in
  the future SystemServer will be responsible for performing proper
  shutdown of user processes, including timeouts for stubborn processes
  etc.
- As mentioned above, the task hard-kills remaining user processes.
- The task hard-kills all Kernel processes except itself and the
  Finalizer Task. Since Kernel processes can delay their own shutdown
  indefinitely if they want to, they have plenty opportunity to perform
  proper shutdown if necessary. This may become a problem with
  non-cooperative Kernel tasks, but as seen two commits earlier, for now
  all tasks will cooperate within a few seconds.
- The task waits for the Finalizer Task to clean up all processes.
- The task hard-kills and finalizes the Finalizer Task itself, meaning
  that it now is the only remaining process in the system.
- The task syncs and locks all file systems, and then unmounts them. Due
  to an unknown refcount bug we currently cannot unmount the root file
  system; therefore the task is able to abort the clean unmount if
  necessary.
- The task performs platform-dependent hardware shutdown as before.

This commit has multiple remaining issues (or exposed existing ones)
which will need to be addressed in the future but are out of scope for
now:
- Unmounting the root filesystem is impossible due to remaining
  references to the inodes /home and /home/anon. I investigated this
  very heavily and could not find whoever is holding the last two
  references.
- Userland cannot perform proper cleanup, since the Kernel's power state
  variable is accessed directly by tools instead of a proper userland
  shutdown procedure directed by SystemServer.

The recently introduced Firmware/PowerState procedures are removed
again, since all of the architecture-independent code can live in the
power state switch task. The architecture-specific code is kept,
however.
2023-07-15 00:12:01 +02:00
kleines Filmröllchen
021fb3ea05 Kernel/Tasks: Allow Kernel processes to be shut down
Since we never check a kernel process's state like a userland process,
it's possible for a kernel process to ignore the fact that someone is
trying to kill it, and continue running. This is not desireable if we
want to properly shutdown all processes, including Kernel ones.
2023-07-15 00:12:01 +02:00
kleines Filmröllchen
8940552d1d Kernel/VirtualFileSystem: Allow unmounting via inode and mount path
This pair of information uniquely identifies any mount point, and it can
be used in situations where mount point custodies are not available.
2023-07-15 00:12:01 +02:00
kleines Filmröllchen
abc1eaff36 Kernel/VirtualFileSystem: Count bind mounts towards normal FS mountcount
This is correct since unmount doesn't treat bind mounts specially. If we
don't do this, unmounting bind mounts will call
prepare_for_last_unmount() on the guest FS much too early, which will
most likely fail due to a busy file system.
2023-07-15 00:12:01 +02:00
kleines Filmröllchen
251b17085b Kernel/Ext2: Check and set file system state
This is supposed to detect whether a file system was unmounted
cleanly or not.
2023-07-15 00:12:01 +02:00
kleines Filmröllchen
8fb126bec6 Kernel/FileSystem: Pass last mount point guest inode to unmount prepare
This will be important later on when we check file system busyness.
2023-07-15 00:12:01 +02:00
kleines Filmröllchen
2fe5ece449 Kernel: Add accessor for mount host custody
There's no reason this information needs to be secret.
2023-07-15 00:12:01 +02:00
Taj Morton
1d2f1abf97 FileSystem/FATFS: Convert internal FAT inode attributes to dirent types 2023-07-10 21:54:23 -06:00
Timothy Flynn
c911781c21 Everywhere: Remove needless trailing semi-colons after functions
This is a new option in clang-format-16.
2023-07-08 10:32:56 +01:00
Liav A
23a7ccf607 Kernel+LibCore+LibC: Split the mount syscall into multiple syscalls
This is a preparation before we can create a usable mechanism to use
filesystem-specific mount flags.
To keep some compatibility with userland code, LibC and LibCore mount
functions are kept being usable, but now instead of doing an "atomic"
syscall, they do multiple syscalls to perform the complete procedure of
mounting a filesystem.

The FileBackedFileSystem IntrusiveList in the VFS code is now changed to
be protected by a Mutex, because when we mount a new filesystem, we need
to check if a filesystem is already created for a given source_fd so we
do a scan for that OpenFileDescription in that list. If we fail to find
an already-created filesystem we create a new one and register it in the
list if we successfully mounted it. We use a Mutex because we might need
to initiate disk access during the filesystem creation, which will take
other mutexes in other parts of the kernel, therefore making it not
possible to take a spinlock while doing this.
2023-07-02 01:04:51 +02:00
Liav A
9b8b8c0e04 Kernel: Simplify reboot & poweroff code flow a bit
Instead of using ifdefs to use the correct platform-specific methods, we
can just use the same pattern we use for the microseconds_delay function
which has specific implementations for each Arch CPU subdirectory.

When linking a kernel image, the actual correct and platform-specific
power-state changing methods will be called in Firmware/PowerState.cpp
file.
2023-06-27 20:04:42 +02:00
Liav A
d550b09871 Kernel: Move PC BIOS-related code to the x86_64 architecture directory
All code that is related to PC BIOS should not be in the Kernel/Firmware
directory as this directory is for abstracted and platform-agnostic code
like ACPI (and device tree parsing in the future).

This fixes a problem with the aarch64 architecure, as these machines
don't have any PC-BIOS in them so actually trying to access these memory
locations (EBDA, BIOS ROM) does not make any sense, as they're specific
to x86 machines only.
2023-06-19 23:49:00 +02:00
Liav A
be16a91aec Kernel: Rename FirmwareSysFSDirectory => SysFSFirmwareDirectory
This matches how we give the pattern names for other classses for SysFS
components.
2023-06-19 23:49:00 +02:00
Robin Voetter
a433cbefbe Kernel: Fix reading expansion ROM SysFS node
Previously, reads would only be successful for offset 0. For this
reason, the maximum size that could be correctly read from the PCI
expansion ROM SysFS node was limited to the block size, and
subsequent blocks would fail. This commit fixes the computation of
the number of bytes to read.
2023-06-19 21:35:37 +02:00
Tim Ledbetter
f95dccdb45 Kernel+LibCore: Add process creation time to /sys/kernel/processes 2023-06-10 07:13:25 +02:00
Tim Ledbetter
7f855ad6b3 Kernel: Initialize ProcFS timestamps to process creation time 2023-06-09 17:15:41 +02:00
Liav A
9ee098b119 Kernel: Move all Graphics-related code into Devices/GPU directory
Like the HID, Audio and Storage subsystem, the Graphics subsystem (which
handles GPUs technically) exposes unix device files (typically in /dev).
To ensure consistency across the repository, move all related files to a
new directory under Kernel/Devices called "GPU".

Also remove the redundant "GPU" word from the VirtIO driver directory,
and the word "Graphics" from GraphicsManagement.{h,cpp} filenames.
2023-06-06 00:40:32 +02:00
Liav A
336fb4f313 Kernel: Move InterruptDisabler to the Interrupts subdirectory 2023-06-04 21:32:34 +02:00
Liav A
927926b924 Kernel: Move Performance-measurement code to the Tasks subdirectory 2023-06-04 21:32:34 +02:00
Liav A
8f21420a1d Kernel: Move all boot-related code to the new Boot subdirectory 2023-06-04 21:32:34 +02:00
Liav A
7c0540a229 Everywhere: Move global Kernel pattern code to Kernel/Library directory
This has KString, KBuffer, DoubleBuffer, KBufferBuilder, IOWindow,
UserOrKernelBuffer and ScopedCritical classes being moved to the
Kernel/Library subdirectory.

Also, move the panic and assertions handling code to that directory.
2023-06-04 21:32:34 +02:00
Liav A
f1cbfc5a6e Kernel: Move task-crash related code to the Tasks subdirectory 2023-06-04 21:32:34 +02:00
Liav A
aaa1de7878 Kernel: Move {Virtual,Physical}Address classes to the Memory directory 2023-06-04 21:32:34 +02:00
Liav A
1b04726c85 Kernel: Move all tasks-related code to the Tasks subdirectory 2023-06-04 21:32:34 +02:00
Liav A
788022d5d1 Kernel: Move Jail code to a new subdirectory 2023-06-04 21:32:34 +02:00
Liav A
b40b1c8d93 Kernel+Userland: Ensure proper unveil permissions before using rm/rmdir
When deleting a directory, the rmdir syscall should fail if the path was
unveiled without the 'c' permission. This matches the same behavior that
OpenBSD enforces when doing this kind of operation.

When deleting a file, the unlink syscall should fail if the path was
unveiled without the 'w' permission, to ensure that userspace is aware
of the possibility of removing a file only when the path was unveiled as
writable.

When using the userdel utility, we now unveil that directory path with
the unveil 'c' permission so removal of an account home directory is
done properly.
2023-06-02 17:53:55 +02:00
Liav A
500b7b08d6 Kernel: Move the Storage directory to be a new directory under Devices
The Storage subsystem, like the Audio and HID subsystems, exposes Unix
device files (for example, in the /dev directory). To ensure consistency
across the repository, we should make the Storage subsystem to reside in
the Kernel/Devices directory like the two other mentioned subsystems.
2023-06-02 11:04:37 +02:00
Ben Wiederhake
5fafd82927 AK+Everywhere: Don't crash on invalid months
Sadly, we don't have proper error propagation here. However, crashing
the Kernel just because a CDROM contains an invalid month seems like a
bad idea.
2023-05-27 12:17:50 +02:00
Ben Wiederhake
815ea06d2c AK: Test from_unix_time_parts intensively 2023-05-27 12:17:50 +02:00
Liav A
2ab657d3b5 Kernel: Make Ext2FSInode::traverse_as_directory to take m_inode_lock
The contents of the directory inode could change if we are not taking so
we must take the m_inode_lock to prevent corruption when reading the
directory contents.
2023-05-27 10:58:58 +02:00
Liav A
902dac7f5f Kernel: Don't lock ProcFS mutex when calling traverse_as_directory
This is not needed, because when we are doing this traversing, functions
that are called from this function are using proper and more "atomic"
locking.
2023-05-27 10:58:58 +02:00
Liav A
bce17d06f5 Kernel: Don't lock SysFS filesystem mutex calling traverse_as_directory
This locking is simply not needed because the associated SysFS component
will use proper and more "atomic" locking on its own.
2023-05-27 10:58:58 +02:00
kleines Filmröllchen
939600d2d4 Kernel: Use UnixDateTime wherever applicable
"Wherever applicable" = most places, actually :^), especially for
networking and filesystem timestamps.

This includes changes to unzip, which uses DOSPackedTime, since that is
changed for the FAT file systems.
2023-05-24 23:18:07 +02:00
kleines Filmröllchen
213025f210 AK: Rename Time to Duration
That's what this class really is; in fact that's what the first line of
the comment says it is.

This commit does not rename the main files, since those will contain
other time-related classes in a little bit.
2023-05-24 23:18:07 +02:00
Liav A
4617c05a08 Kernel: Move a bunch of generic devices code into new subdirectory 2023-05-19 21:49:21 +02:00
Daniel Bertalan
d9c557d0b4 Kernel: Add RPi Watchdog and use it for system shutdown
The Raspberry Pi hardware doesn't support a proper software-initiated
shutdown, so this instead uses the watchdog to reboot to a special
partition which the firmware interprets as an immediate halt on
shutdown. When running under Qemu, this causes the emulator to exit.
2023-05-17 01:32:43 -06:00
Daniel Bertalan
2123fdd678 Kernel: Remove FIFO::{attach,detach}(Direction)
These functions would have caused a `-Woverloaded-virtual` warning with
GCC 13, as they shadow `File::{attach,detach}(OpenFileDescription&)`.

Both of these functions had a single call site. This commit inlines
`attach` into its only caller, `FIFO::open_direction`.

Instead of explicitly checking `is_fifo()` in `~OpenFileDescription`
before running the `detach(Direction)` overload, let's just override the
regular `detach(OpenFileDescription&)` for `FIFO` to perform this action
instead.
2023-05-15 07:00:29 +02:00
Pankaj Raghav
ac9d60bb13 Kernel: Promote the entry to the front during a cache hit
Whenever an entry is added to the cache, the last element is removed to
make space for the new entry(if the cache is full). To make this an LRU
cache, the entry needs to be moved to the front of the list when there
is a cache hit so that the least recently used entry moves to the end
to be evicted first.
2023-05-06 08:00:55 +02:00
Timothy Flynn
3d4d0a1243 Kernel: Colorize log message for paths which haven't been unveiled
The log message can be hard to spot in a sea of debug messages. Colorize
it to make the message more immediately pop out.
2023-04-25 18:04:15 +02:00
Liav A
ee4e9b807a Kernel: Protect internal structures in InodeWatcher with spinlocks
This was the last change that was needed to be able boot with the flag
of LOCK_IN_CRITICAL_DEBUG. That flag is not always enabled because there
are still other issues in which we hold a spinlock and still try to lock
a mutex.

Instead of using one global mutex we can protect internal structures of
the InodeWatcher class with SpinlockProtected wrappers. This in turn
allows the InodeWatcher code to be called from other parts in the kernel
while holding a prior spinlock properly.
2023-04-22 07:16:41 +02:00
Tim Schumacher
12ce6ef3d7 Kernel+Userland: Remove the nfds entry from /sys/kernel/processes
`process.fds()` is protected by a Mutex, which causes issues when we try
to acquire it while holding a Spinlock. Since nothing seems to use this
value, let's just remove it entirely for now.
2023-04-21 13:55:23 +02:00
Tim Schumacher
d4e114a31e Kernel: Remove unused functions related to reading full inodes 2023-04-17 01:20:23 +02:00
Tim Schumacher
6f524e35a7 Kernel: Use purpose-sized buffers when resolving inodes as links 2023-04-17 01:20:23 +02:00
Tim Schumacher
acd8c8dba4 Kernel: Add Inode::read_until_filled_or_end
The existing `read_entire` is quite slow due to allocating and copying
multiple times, but it is simultaneously quite hard to get rid of in a
single step. As a replacement, add a new function that reads as much as
possible directly into a user-provided buffer.
2023-04-17 01:20:23 +02:00
Liav A
b02ee664e7 Kernel: Get rid of *LockRefPtr in the SysFS filesystem code
To do this we also need to get rid of LockRefPtrs in the USB code as
well.
Most of the SysFS nodes are statically generated during boot and are not
mutated afterwards.

The same goes for general device code - once we generate the appropriate
SysFS nodes, we almost never mutate the node pointers afterwards, making
locking unnecessary.
2023-04-14 19:24:54 +02:00
Liav A
cbf78975f1 Kernel: Add the futimens syscall
We have a problem with the original utimensat syscall because when we
do call LibC futimens function, internally we provide an empty path,
and the Kernel get_syscall_path_argument method will detect this as an
invalid path.

This happens to spit an error for example in the touch utility, so if a
user is running "touch non_existing_file", it will create that file, but
the user will still see an error coming from LibC futimens function.

This new syscall gets an open file description and it provides the same
functionality as utimensat, on the specified open file description.
The new syscall will be used later by LibC to properly implement LibC
futimens function so the situation described with relation to the
"touch" utility could be fixed.
2023-04-10 10:21:28 +02:00
Liav A
6c4a47d916 Kernel: Remove redundant HID name from all associated files 2023-04-09 18:11:37 +02:00
Liav A
7b745a20f1 Kernel: Mark a bunch of NonnullRefPtrs also const to ensure immutability
These were easy to pick-up as these pointers are assigned during the
construction point and are never changed afterwards.

This small change to these pointers will ensure that our code will not
accidentally assign these pointers with a new object which is always a
kind of bug we will want to prevent.
2023-04-08 13:44:21 +02:00
Andreas Kling
1c77803845 Kernel: Stop using *LockRefPtr for TTY
TTY was only stored in Process::m_tty, so make that a SpinlockProtected.
2023-04-05 11:37:27 +02:00
Andreas Kling
c3915e4058 Kernel: Stop using *LockRefPtr for Thread
These were stored in a bunch of places. The main one that's a bit iffy
is the Mutex::m_holder one, which I'm going to simplify in a subsequent
commit.

In Plan9FS and WorkQueue, we can't make the NNRPs const due to
initialization order problems. That's probably doable with further
cleanup, but left as an exercise for our future selves.

Before starting this, I expected the thread blockers to be a problem,
but as it turns out they were super straightforward (for once!) as they
don't mutate the thread after initiating a block, so they can just use
simple const-ified NNRPs.
2023-04-04 10:33:42 +02:00
Andreas Kling
a098266ff5 Kernel: Simplify Process factory functions
- Instead of taking the first new thread as an out-parameter, we now
  bundle the process and its first thread in a struct and use that
  as the return value.

- Make all Process factory functions return ErrorOr. Use this to convert
  some places to more TRY().

- Drop the "try_" prefix on Process factory functions.
2023-04-04 10:33:42 +02:00
Andreas Kling
65438d8a85 Kernel: Stop using *LockRefPtr for Process pointers
The only persistent one of these was Thread::m_process and that never
changes after initialization. Make it const to enforce this and switch
everything over to RefPtr & NonnullRefPtr.
2023-04-04 10:33:42 +02:00
Andreas Kling
19084ef743 Kernel: Simplify Mount internals
- The host custody never changes after initialization, so there's no
  need to protect it with a spinlock.

- To enforce the fact that some members don't change after
  initialization, make them const.
2023-04-04 10:33:42 +02:00
Andreas Kling
673592dea8 Kernel: Stop using *LockRefPtr for FileSystem pointers
There was only one permanent storage location for these: as a member
in the Mount class.

That member is never modified after Mount initialization, so we don't
need to worry about races there.
2023-04-04 10:33:42 +02:00
Marco Cutecchia
1b04c43690 Kernel: Initialize DiskCache's buffer before the dirty&clean lists
This commit fixes a kernel panic that happened when unmounting
a disk due to an invalid memory access.
This was because `DiskCache` initializes two linked lists that use
an argument `KBuffer` as the storage for their elements.
Since the member `KBuffer` was declared after the two lists,
when `DiskCache`'s destructor was called, then `KBuffer`'s destructor
was called before the ones of the two lists, causing a page fault in
the kernel.
2023-04-02 12:43:17 -06:00
Tim Schumacher
ecd1862859 AK: Rename Stream::write_entire_buffer to Stream::write_until_depleted
No functional changes.
2023-03-13 15:16:20 +00:00
Liav A
633006926f Kernel: Make the Jails' internal design a lot more sane
This is done with 2 major steps:
1. Remove JailManagement singleton and use a structure that resembles
    what we have with the Process object. This is required later for the
    second step in this commit, but on its own, is a major change that
    removes this clunky singleton that had no real usage by itself.
2. Use IntrusiveLists to keep references to Process objects in the same
    Jail so it will be much more straightforward to iterate on this kind
    of objects when needed. Previously we locked the entire Process list
    and we did a simple pointer comparison to check if the checked
    Process we iterate on is in the same Jail or not, which required
    taking multiple Spinlocks in a very clumsy and heavyweight way.
2023-03-12 10:21:59 -06:00
Andreas Kling
03cc45e5a2 Kernel: Use RefPtr instead of LockRefPtr for File and subclasses
This was mostly straightforward, as all the storage locations are
guarded by some related mutex.

The use of old-school associated mutexes instead of MutexProtected
is unfortunate, but the process to modernize such code is ongoing.
2023-03-10 13:15:44 +01:00
Andreas Kling
e6fc7b3ff7 Kernel: Switch LockRefPtr<Inode> to RefPtr<Inode>
The main place where this is a little iffy is in RAMFS where inodes
have a LockWeakPtr to their parent inode. I've left that as a
LockWeakPtr for now.
2023-03-09 21:54:59 +01:00
Andreas Kling
d1371d66f7 Kernel: Use non-locking {Nonnull,}RefPtr for OpenFileDescription
This patch switches away from {Nonnull,}LockRefPtr to the non-locking
smart pointers throughout the kernel.

I've looked at the handful of places where these were being persisted
and I don't see any race situations.

Note that the process file descriptor table (Process::m_fds) was already
guarded via MutexProtected.
2023-03-07 00:30:12 +01:00
Andreas Kling
7369d0ab5f Kernel: Stop using NonnullLockRefPtrVector 2023-03-06 23:46:36 +01:00
Andreas Kling
21db2b7b90 Everywhere: Remove NonnullOwnPtr.h includes 2023-03-06 23:46:35 +01:00
Andreas Kling
359d6e7b0b Everywhere: Stop using NonnullOwnPtrVector
Same as NonnullRefPtrVector: weird semantics, questionable benefits.
2023-03-06 23:46:35 +01:00
Liav A
39de5b7f82 Kernel: Actually check Process unveil data when creating perfcore dump
Before of this patch, we looked at the unveil data of the FinalizerTask,
which naturally doesn't have any unveil restrictions, therefore allowing
an unveil bypass for a process that enabled performance coredumps.

To ensure we always check the dumped process unveil data, an option to
pass a Process& has been added to a couple of methods in the class of
VirtualFileSystem.
2023-03-05 15:15:55 +00:00
Liav A
c56e1c5378 Kernel/FileSystem: Simplify the ProcFS significantly
Since the ProcFS doesn't hold many global objects within it, the need
for a fully-structured design of backing components and a registry like
with the SysFS is no longer true.

To acommodate this, let's remove all backing store and components of the
ProcFS, so now it resembles what we had in the early days of ProcFS in
the project - a mostly-static filesystem, with very small amount of
kmalloc allocations needed.
We still use the inode index mechanism to understand the role of each
inode, but this is done in a much "static"ier way than before.
2023-02-24 22:14:18 +01:00
Liav A
9216caeec2 Kernel: Fix typo proccess => process in a name of Process method 2023-02-24 22:14:18 +01:00
Liav A
12b7328c22 AK+Kernel: Add includes before removing Kernel/ProcessExposed.h
Apparently without this file, we won't be able to compile due to missing
includes to TimeManagement and KBufferBuilder.
2023-02-24 22:14:18 +01:00
Andreas Kling
68c9781299 Kernel: Fix const-correctness of PCI::DeviceIdentifier usage 2023-02-21 00:54:04 +01:00
Liav A
61f4914d6e Kernel+Userland: Add constants subdirectory at /sys/kernel directory
This subdirectory is meant to hold all constant data related to the
kernel. This means that this data is never meant to updated and is
relevant from system boot to system shutdown.
Move the inodes of "load_base", "cmdline" and "system_mode" to that
directory. All nodes under this new subdirectory are generated during
boot, and therefore don't require calling kmalloc each time we need to
read them. Locking is also not necessary, because these nodes and their
data are completely static once being generated.
2023-02-19 13:47:11 +01:00
Liav A
1acd679775 Kernel: Remove unnecessary include from SysFS PowerStateSwitch code
I added that include in 2e55956784 by a
mistake, so we should get rid of it as soon as possible.
2023-02-19 08:13:04 +00:00
Liav A
8266e40b35 Kernel/FileSystem: Don't assume flags for root filesystem mount flags
This is considered somewhat an abstraction layer violation, because we
should always let userspace to decide on the root filesystem mount flags
because it allows the user to configure the mount table to preferences
that they desire.
Now that SystemServer is modified to re-mount the root mount with the
desired flags, we can just mount the root filesystem without assuming
special flags.
2023-02-19 01:20:10 +01:00
Liav A
4a14138230 Kernel/FileSystem: Fix check of read offset for the RAMFSInode code
The check of ensuring we are not trying to read beyond the end of the
inode data buffer is already there, it's just that we need to disallow
further reading if the read offset equals to the inode data size.
2023-02-19 01:01:45 +01:00
Liav A
9790b81959 Kernel/FileSystem: Add check of read offset for the FATInode code
Apparently we lacked this important check from the beginning of this
piece of code. This check is crucial to ensure we only give back data
being related to the FATInode data buffer and nothing beyond it.
2023-02-19 01:01:45 +01:00
Peter Elliott
ae5d7f542c Kernel: Change polarity of weak ownership between Inode and LocalSocket
There was a bug in which bound Inodes would lose all their references
(because localsocket does not reference them), and they would be
deallocated, and clients would get ECONNREFUSED as a result. now
LocalSocket has a strong reference to inode so that the inode will live
as long as the socket, and Inode has a weak reference to the socket,
because if the socket stops being referenced anywhere it should not be
bound.

This still prevents the reference loop that
220b7dd779 was trying to fix.
2023-02-19 00:37:37 +01:00
Undefine
ab298ca106 Kernel: Dont crash if power states gets set to an invalid value 2023-02-18 23:52:20 +01:00
Ollrogge
361df6eff8 AK: Add conversion functions for packed DOS time format
This also adjusts the FATFS code to use the new functions and removes
the now redundant old conversion functions.
2023-02-12 13:13:15 -07:00
Timothy Flynn
52687814ea Kernel: Explicitly copy Plan9FS read errors to registered delegates 2023-02-10 09:08:52 +00:00
MacDue
63b11030f0 Everywhere: Use ReadonlySpan<T> instead of Span<T const> 2023-02-08 19:15:45 +00:00
Tim Schumacher
81863eaf57 Kernel: Use AK::Stream to write packed binary data 2023-02-08 18:50:31 +00:00
Tim Schumacher
23e10a30ad Kernel: Modernize Error handling when serializing directory entries 2023-02-08 18:50:31 +00:00
Sam Atkins
1014aefe64 Kernel: Protect Thread::m_name with a spinlock
This replaces manually grabbing the thread's main lock.

This lets us remove the `get_thread_name` and `set_thread_name` syscalls
from the big lock. :^)
2023-02-06 20:36:53 +01:00
Sam Atkins
fe7b08dad7 Kernel: Protect Process::m_name with a spinlock
This also lets us remove the `get_process_name` and `set_process_name`
syscalls from the big lock. :^)
2023-02-06 20:36:53 +01:00
MacDue
83a59396c8 Kernel: Fix CPUInfo error propagation fixme
We can now propagate the errors directly from for_each_split_view(),
which I think counts as "Make this nicer" :^)
2023-02-05 19:31:21 +01:00
Liav A
ed67a877a3 Kernel+SystemServer+Base: Introduce the RAMFS filesystem
This filesystem is based on the code of the long-lived TmpFS. It differs
from that filesystem in one keypoint - its root inode doesn't have a
sticky bit on it.

Therefore, we mount it on /dev, to ensure only root can modify files on
that directory. In addition to that, /tmp is mounted directly in the
SystemServer main (start) code, so it's no longer specified in the fstab
file. We ensure that /tmp has a sticky bit and has the value 0777 for
root directory permissions, which is certainly a special case when using
RAM-backed (and in general other) filesystems.

Because of these 2 changes, it's no longer needed to maintain the TmpFS
filesystem, hence it's removed (renamed to RAMFS), because the RAMFS
represents the purpose of this filesystem in a much better way - it
relies on being backed by RAM "storage", and therefore it's easy to
conclude it's temporary and volatile, so its content is gone on either
system shutdown or unmounting of the filesystem.
2023-02-04 15:32:45 -07:00
Tim Schumacher
ae64b68717 AK: Deprecate the old AK::Stream
This also removes a few cases where the respective header wasn't
actually required to be included.
2023-01-29 19:16:44 -07:00
Liav A
722ae35329 Kernel/FileSystem: Simplify the ProcFS inode code
This is done by merging all scattered pieces of derived classes from the
ProcFSInode class into that one class, so we don't use inheritance but
rather simplistic checks to determine the proper code for each ProcFS
inode with its specific characteristics.
2023-01-29 12:59:30 +01:00
Sam Atkins
3cbc0fdbb0 Kernel: Remove declarations for non-existent methods 2023-01-27 20:33:18 +00:00
Liav A
a7677f1d9b Kernel/PCI: Expose PCI option ROM data from the sysfs interface
For each exposed PCI device in sysfs, there's a new node called "rom"
and by reading it, it exposes the raw data of a PCI option ROM blob to
a user for examining the blob.
2023-01-26 23:04:26 +01:00
Liav A
1f9d3a3523 Kernel/PCI: Hold a reference to DeviceIdentifier in the Device class
There are now 2 separate classes for almost the same object type:
- EnumerableDeviceIdentifier, which is used in the enumeration code for
  all PCI host controller classes. This is allowed to be moved and
  copied, as it doesn't support ref-counting.
- DeviceIdentifier, which inherits from EnumerableDeviceIdentifier. This
  class uses ref-counting, and is not allowed to be copied. It has a
  spinlock member in its structure to allow safely executing complicated
  IO sequences on a PCI device and its space configuration.
  There's a static method that allows a quick conversion from
  EnumerableDeviceIdentifier to DeviceIdentifier while creating a
  NonnullRefPtr out of it.

The reason for doing this is for the sake of integrity and reliablity of
the system in 2 places:
- Ensure that "complicated" tasks that rely on manipulating PCI device
  registers are done in a safe manner. For example, determining a PCI
  BAR space size requires multiple read and writes to the same register,
  and if another CPU tries to do something else with our selected
  register, then the result will be a catastrophe.
- Allow the PCI API to have a united form around a shared object which
  actually holds much more data than the PCI::Address structure. This is
  fundamental if we want to do certain types of optimizations, and be
  able to support more features of the PCI bus in the foreseeable
  future.

This patch already has several implications:
- All PCI::Device(s) hold a reference to a DeviceIdentifier structure
  being given originally from the PCI::Access singleton. This means that
  all instances of DeviceIdentifier structures are located in one place,
  and all references are pointing to that location. This ensures that
  locking the operation spinlock will take effect in all the appropriate
  places.
- We no longer support adding PCI host controllers and then immediately
  allow for enumerating it with a lambda function. It was found that
  this method is extremely broken and too much complicated to work
  reliably with the new paradigm being introduced in this patch. This
  means that for Volume Management Devices (Intel VMD devices), we
  simply first enumerate the PCI bus for such devices in the storage
  code, and if we find a device, we attach it in the PCI::Access method
  which will scan for devices behind that bridge and will add new
  DeviceIdentifier(s) objects to its internal Vector. Afterwards, we
  just continue as usual with scanning for actual storage controllers,
  so we will find a corresponding NVMe controllers if there were any
  behind that VMD bridge.
2023-01-26 23:04:26 +01:00
Karol Kosek
8cfd445c23 Kernel: Allow to remove files from sticky directory if user owns it
It's what the Linux chmod(1) manpage says (in the 'Restricted Deletion
Flag or Sticky Bit' section), and it just makes sense to me. :^)
2023-01-24 20:13:30 +00:00
Andrew Kaster
7ab37ee22c Everywhere: Remove string.h include from AK/Traits.h and resolve fallout
A lot of places were relying on AK/Traits.h to give it strnlen, memcmp,
memcpy and other related declarations.

In the quest to remove inclusion of LibC headers from Kernel files, deal
with all the fallout of this included-everywhere header including less
things.
2023-01-21 10:43:59 -07:00
Andrew Kaster
100fb38c3e Kernel+Userland: Move LibC/sys/ioctl_numbers to Kernel/API/Ioctl.h
This header has always been fundamentally a Kernel API file. Move it
where it belongs. Include it directly in Kernel files, and make
Userland applications include it via sys/ioctl.h rather than directly.
2023-01-21 10:43:59 -07:00
Brian Gianforcaro
bfa890251c Kernel: Fix uninitialized member variable in FATFS Filesystem
Reported-by: PVS Studio
2023-01-16 09:45:46 +01:00
Taj Morton
20991a6a3c Kernel/FileSystem: Fix kernel panic during FS init or mount failure
Resolves issue where a panic would occur if the file system failed to
initialize or mount, due to how the FileSystem was already added to
VFS's list. The newly-created FileSystem destructor would fail as a
result of the object still remaining in the IntrusiveList.
2023-01-09 19:26:01 -07:00
Liav A
04221a7533 Kernel: Mark Process::jail() method as const
We really don't want callers of this function to accidentally change
the jail, or even worse - remove the Process from an attached jail.
To ensure this never happens, we can just declare this method as const
so nobody can mutate it this way.
2023-01-07 03:44:59 +03:30
Liav A
d8ebcaede8 Kernel: Add helper function to check if a Process is in jail
Use this helper function in various places to replace the old code of
acquiring the SpinlockProtected<RefPtr<Jail>> of a Process to do that
validation.
2023-01-06 17:29:47 +01:00
Liav A
a9839d7ac5 Kernel/SysFS: Don't refresh/set-values inside the Jail spinlock scope
Only do so after a brief check if we are in a Jail or not. This fixes
SMP, because apparently it is crashing when calling try_generate()
from the SysFSGlobalInformation::refresh_data method, so the fix for
this is to simply not do that inside the Process' Jail spinlock scope,
because otherwise we will simply have a possible flow of taking
multiple conflicting Spinlocks (in the wrong order multiple times), for
the SysFSOverallProcesses generation code:
Process::current().jail(), and then Process::for_each_in_same_jail being
called, we take Process::all_instances(), and Process::current().jail()
again.
Therefore, we should at the very least eliminate the first taking of the
Process::current().jail() spinlock, in the refresh_data method of the
SysFSGlobalInformation class.
2023-01-05 23:58:13 +01:00
Taj Morton
31eeea08ba Kernel/FileSystem: Fix handling of FAT names that don't fill an entry
* Fix bug where last character of a filename or extension would be
   truncated (HELLO.TXT -> HELL.TX).
 * Fix bug where additional NULL characters would be added to long
   filenames that did not completely fill one of the Long Filename Entry
   character fields.
2023-01-04 09:02:13 +00:00
Taj Morton
a91fc697bb Kernel/FileSystem: Remove FIXME about old/new path being the same
Added comment after confirming that Linux and OpenBSD implenment the
same behavior.
2023-01-04 09:02:13 +00:00
Ben Wiederhake
65b420f996 Everywhere: Remove unused includes of AK/Memory.h
These instances were detected by searching for files that include
AK/Memory.h, but don't match the regex:

\\b(fast_u32_copy|fast_u32_fill|secure_zero|timing_safe_compare)\\b

This regex is pessimistic, so there might be more files that don't
actually use any memory function.

In theory, one might use LibCPP to detect things like this
automatically, but let's do this one step after another.
2023-01-02 20:27:20 -05:00
Ben Wiederhake
143a64f9a2 Kernel: Remove unused includes of Kernel/Debug.h
These instances were detected by searching for files that include
Kernel/Debug.h, but don't match the regex:
\\bdbgln_if\(|_DEBUG\\b
This regex is pessimistic, so there might be more files that don't check
for any real *_DEBUG macro. There seem to be no corner cases anyway.

In theory, one might use LibCPP to detect things like this
automatically, but let's do this one step after another.
2023-01-02 20:27:20 -05:00
kleines Filmröllchen
a6a439243f Kernel: Turn lock ranks into template parameters
This step would ideally not have been necessary (increases amount of
refactoring and templates necessary, which in turn increases build
times), but it gives us a couple of nice properties:
- SpinlockProtected inside Singleton (a very common combination) can now
  obtain any lock rank just via the template parameter. It was not
  previously possible to do this with SingletonInstanceCreator magic.
- SpinlockProtected's lock rank is now mandatory; this is the majority
  of cases and allows us to see where we're still missing proper ranks.
- The type already informs us what lock rank a lock has, which aids code
  readability and (possibly, if gdb cooperates) lock mismatch debugging.
- The rank of a lock can no longer be dynamic, which is not something we
  wanted in the first place (or made use of). Locks randomly changing
  their rank sounds like a disaster waiting to happen.
- In some places, we might be able to statically check that locks are
  taken in the right order (with the right lock rank checking
  implementation) as rank information is fully statically known.

This refactoring even more exposes the fact that Mutex has no lock rank
capabilites, which is not fixed here.
2023-01-02 18:15:27 -05:00
Andreas Kling
16f934474f Kernel+Tests: Allow deleting someone else's file in my sticky directory
This should be allowed according to Dr. POSIX. :^)
2023-01-01 10:09:02 +01:00
Andreas Kling
47b9e8e651 Kernel: Annotate VirtualFileSystem::rmdir() errors with spec comments 2023-01-01 10:09:02 +01:00
Andreas Kling
8619f2c6f3 Kernel+Tests: Remove inaccurate FIXME in sys$rmdir()
We were already handling the rmdir("..") case by refusing to remove
directories that were not empty.

This patch removes a FIXME from January 2019 and adds a test. :^)
2023-01-01 10:09:02 +01:00
Andreas Kling
8d781d0216 Kernel+Tests: Make sys$rmdir() fail with EINVAL if basename is "."
Dr. POSIX says that we should reject attempts to rmdir() the file named
"." so this patch does exactly that. We also add a test.

This solves a FIXME from January 2019. :^)
2023-01-01 10:09:02 +01:00
Liav A
91db482ad3 Kernel: Reorganize Arch/x86 directory to Arch/x86_64 after i686 removal
No functional change.
2022-12-28 11:53:41 +01:00
Liav A
5ff318cf3a Kernel: Remove i686 support 2022-12-28 11:53:41 +01:00
Liav A
2e710de2f4 Kernel/FileSystem: Prevent symlink creation in veiled directory paths
Also, try to resolve the target path and check if it is allowed to be
accessed under the unveil rules.
2022-12-21 09:17:09 +00:00
Freakness109
1f1e58ed75 Kernel/Plan9FS: Propagate errors in Plan9FSMessage::append_data 2022-12-17 09:37:04 +00:00
sin-ack
3275015786 Kernel: Implement flock downgrading
This commit makes it possible for a process to downgrade a file lock it
holds from a write (exclusive) lock to a read (shared) lock. For this,
the process must point to the exact range of the flock, and must be the
owner of the lock.
2022-12-11 19:55:37 -07:00
sin-ack
2a502fe232 Kernel+LibC+LibCore+UserspaceEmulator: Implement faccessat(2)
Co-Authored-By: Daniel Bertalan <dani@danielbertalan.dev>
2022-12-11 19:55:37 -07:00
sin-ack
fa692e13f9 Kernel: Use real UID/GID when checking for file access
This aligns the rest of the system with POSIX, who says that access(2)
must check against the real UID and GID, not effective ones.
2022-12-11 19:55:37 -07:00
sin-ack
3472c84d14 Kernel: Remove InodeMetadata::may_{read,write,execute}(Process const&)
These have no definition and are never used.
2022-12-11 19:55:37 -07:00
sin-ack
d5fbdf1866 Kernel+LibC+LibCore: Implement renameat(2)
Now with the ability to specify different bases for the old and new
paths.
2022-12-11 19:55:37 -07:00
Liav A
aa9fab9c3a Kernel/FileSystem: Convert the mount table from Vector to IntrusiveList
The fact that we used a Vector meant that even if creating a Mount
object succeeded, we were still at a risk that appending to the actual
mounts Vector could fail due to OOM condition. To guard against this,
the mount table is now an IntrusiveList, which always means that when
allocation of a Mount object succeeded, then inserting that object to
the list will succeed, which allows us to fail early in case of OOM
condition.
2022-12-09 23:29:33 -07:00
Liav A
6a555af1f1 Kernel: Add callback on ".." directory entry for a TmpFS root directory 2022-12-09 22:59:08 -07:00
Liav A
69f41eb062 Kernel: Reject create links on paths that were not unveiled as writable
This solves one of the security issues being mentioned in issue #15996.
We simply don't allow creating hardlinks on paths that were not unveiled
as writable to prevent possible bypass on a certain path that was
unveiled as non-writable.
2022-12-03 11:00:34 -07:00
Liav A
0bb7c8f4c4 Kernel+SystemServer: Don't hardcode coredump directory path
Instead, allow userspace to decide on the coredump directory path. By
default, SystemServer sets it to the /tmp/coredump directory, but users
can now change this by writing a new path to the sysfs node at
/sys/kernel/variables/coredump_directory, and also to read this node to
check where coredumps are currently generated at.
2022-12-03 05:56:59 -07:00
Liav A
7dcf8f971b Kernel: Rename SysFSSystemBoolean => SysFSSystemBooleanVariable 2022-12-03 05:56:59 -07:00
Liav A
95d8aa2982 Kernel: Allow read access sparingly to some /sys/kernel directory nodes
Those nodes are not exposing any sensitive information so there's no
harm in exposing them.
2022-12-03 05:47:58 -07:00
Liav A
1ca0ac5207 Kernel: Disallow jailed processes to read files in /sys/kernel directory
By default, disallow reading of values in that directory. Later on, we
will enable sparingly read access to specific files.

The idea that led to this mechanism was suggested by Jean-Baptiste
Boric (also known as boricj in GitHub), to prevent access to sensitive
information in the SysFS if someone adds a new file in the /sys/kernel
directory.
2022-12-03 05:47:58 -07:00
Liav A
2e55956784 Kernel: Forbid access to /sys/kernel/power_state for Jailed processes
There's simply no benefit in allowing sandboxed programs to change the
power state of the machine, so disallow writes to the mentioned node to
prevent malicious programs to request that.
2022-12-03 05:47:58 -07:00
Andreas Kling
4dd148f07c Kernel: Add File::is_regular_file()
This makes it easy and expressive to check if a File is a regular file.
2022-11-29 11:09:19 +01:00
Liav A
718ae68621 Kernel+LibCore+LibC: Implement support for forcing unveil on exec
To accomplish this, we add another VeilState which is called
LockedInherited. The idea is to apply exec unveil data, similar to
execpromises of the pledge syscall, on the current exec'ed program
during the execve sequence. When applying the forced unveil data, the
veil state is set to be locked but the special state of LockedInherited
ensures that if the new program tries to unveil paths, the request will
silently be ignored, so the program will continue running without
receiving an error, but is still can only use the paths that were
unveiled before the exec syscall. This in turn, allows us to use the
unveil syscall with a special utility to sandbox other userland programs
in terms of what is visible to them on the filesystem, and is usable on
both programs that use or don't use the unveil syscall in their code.
2022-11-26 12:42:15 -07:00
sin-ack
3b03077abb Kernel: Update the ".." inode for directories after a rename
Because the ".." entry in a directory is a separate inode, if a
directory is renamed to a new location, then we should update this entry
the point to the new parent directory as well.

Co-authored-by: Liav A <liavalb@gmail.com>
2022-11-25 17:33:05 +01:00
Andreas Kling
a9d55ddf57 Kernel/TmpFS: Update mtime instead of ctime when asked to update mtime 2022-11-24 16:56:27 +01:00
Andreas Kling
10fa72d451 Kernel: Use AK::Time for InodeMetadata timestamps instead of time_t
Before this change, we were truncating the nanosecond part of file
timestamps in many different places.
2022-11-24 16:56:27 +01:00
Andreas Kling
fb00d3ed25 Kernel+lsirq: Track per-CPU IRQ handler call counts
Each GenericInterruptHandler now tracks the number of calls that each
CPU has serviced.

This takes care of a FIXME in the /sys/kernel/interrupts generator.

Also, the lsirq command line tool now displays per-CPU call counts.
2022-11-19 15:39:30 +01:00
Andreas Kling
9b3db63e14 Kernel: Rename GenericInterruptHandler "invoking count" to "call count" 2022-11-19 15:39:30 +01:00
Liav A
31d4c07dee Kernel: Add missing includes for Mount.h file 2022-11-11 10:25:54 +01:00
Liav A
3cc0d60141 Kernel: Split the Ext2FileSystem.{cpp,h} files into smaller components 2022-11-08 02:54:48 -07:00
Liav A
1c91881a1d Kernel: Split the ISO9660FileSystem.{cpp,h} files to smaller components 2022-11-08 02:54:48 -07:00
Liav A
fca3b7f1f9 Kernel: Split the DevPtsFS files into smaller components 2022-11-08 02:54:48 -07:00
Liav A
3fc52a6d1c Kernel: Split the Plan9FileSystem.{cpp,h} file into smaller components 2022-11-08 02:54:48 -07:00
Liav A
3906dd3aa3 Kernel: Split the ProcFS core file into smaller components 2022-11-08 02:54:48 -07:00
Liav A
e882b2ed05 Kernel: Split the FATFileSystem.{cpp,h} files into smaller components 2022-11-08 02:54:48 -07:00
Liav A
5e6101dd3e Kernel: Split the TmpFS core files into smaller components 2022-11-08 02:54:48 -07:00
Liav A
f53149d5f6 Kernel: Split the SysFS core files into smaller components 2022-11-08 02:54:48 -07:00
Liav A
5e062414c1 Kernel: Add support for jails
Our implementation for Jails resembles much of how FreeBSD jails are
working - it's essentially only a matter of using a RefPtr in the
Process class to a Jail object. Then, when we iterate over all processes
in various cases, we could ensure if either the current process is in
jail and therefore should be restricted what is visible in terms of
PID isolation, and also to be able to expose metadata about Jails in
/sys/kernel/jails node (which does not reveal anything to a process
which is in jail).

A lifetime model for the Jail object is currently plain simple - there's
simpy no way to manually delete a Jail object once it was created. Such
feature should be carefully designed to allow safe destruction of a Jail
without the possibility of releasing a process which is in Jail from the
actual jail. Each process which is attached into a Jail cannot leave it
until the end of a Process (i.e. when finalizing a Process). All jails
are kept being referenced in the JailManagement. When a last attached
process is finalized, the Jail is automatically destroyed.
2022-11-05 18:00:58 -06:00
Timon Kruiper
0475407f9f Kernel: Remove bunch of unused includes in SysFS/Processes.cpp 2022-10-26 20:01:45 +02:00
Timon Kruiper
97f1fa7d8f Kernel: Include missing headers for various files
With these missing header files, we can now build these files for
aarch64.
2022-10-26 20:01:45 +02:00
Timon Kruiper
fcbb6b79ac Kernel: Don't expose processor information for aarch64 in sysfs
We do not (yet) acquire this information for the aarch64 processors.
2022-10-26 20:01:45 +02:00
Liav A
75f01692b4 Kernel+Userland: Move /sys/firmware/power_state to /sys/kernel directory
Let's put the power_state global node into the /sys/kernel directory,
because that directory represents all global nodes and variables being
related to the Kernel. It's also a mutable node, that is more acceptable
being in the mentioned directory due to the fact that all other files in
the /sys/firmware directory are just firmware blobs and are not mutable
at all.
2022-10-25 15:33:34 -06:00
Liav A
a91589c09b Kernel: Introduce global variables and stats in /sys/kernel directory
The ProcFS is an utter mess currently, so let's start move things that
are not related to processes-info. To ensure it's done in a sane manner,
we start by duplicating all /proc/ global nodes to the /sys/kernel/
directory, then we will move Userland to use the new directory so the
old directory nodes can be removed from the /proc directory.
2022-10-25 15:33:34 -06:00
Liav A
03ae9f94cf Kernel/FileSystem: Remove hardcoded unveil path of /usr/lib/Loader.so
If a program needs to execute a dynamic executable program, then it
should unveil /usr/lib/Loader.so by itself and not rely on the Kernel to
allow using this binary without any sense of respect to unveil promises
being made by the running parent program.
2022-10-24 19:41:32 -06:00
Gunnar Beutner
ce4b66e908 Kernel: Add support for MSG_NOSIGNAL and properly send SIGPIPE
Previously we didn't send the SIGPIPE signal to processes when
sendto()/sendmsg()/etc. returned EPIPE. And now we do.

This also adds support for MSG_NOSIGNAL to suppress the signal.
2022-10-24 15:49:39 +02:00
Liav A
fea3cb5ff9 Kernel/FileSystem: Discard safely filesystems when unmounted last time
This commit reached that goal of "safely discarding" a filesystem by
doing the following:
1. Stop using the s_file_system_map HashMap as it was an unsafe measure
to access pointers of FileSystems. Instead, make sure to register all
FileSystems at the VFS layer, with an IntrusiveList, to avoid problems
related to OOM conditions.
2. Make sure to cleanly remove the DiskCache object from a BlockBased
filesystem, so the destructor of such object will not need to do that in
the destruction point.
3. For ext2 filesystems, don't cache the root inode at m_inode_cache
HashMap. The reason for this is that when unmounting an ext2 filesystem,
we lookup at the cache to see if there's a reference to a cached inode
and if that's the case, we fail with EBUSY. If we keep the m_root_inode
also being referenced at the m_inode_cache map, we have 2 references to
that object, which will lead to fail with EBUSY. Also, it's much simpler
to always ask for a root inode and get it immediately from m_root_inode,
instead of looking up the cache for that inode.
2022-10-22 16:57:52 -04:00
Liav A
24977996a6 Kernel: Append root filesystem to the VFS FileBackedFileSystem list 2022-10-22 16:57:52 -04:00
Liav A
0fd7b688af Kernel: Introduce support for using FileSystem object in multiple mounts
The idea is to enable mounting FileSystem objects across multiple mounts
in contrast to what happened until now - each mount has its own unique
FileSystem object being attached to it.

Considering a situation of mounting a block device at 2 different mount
points at in system, there were a couple of critical flaws due to how
the previous "design" worked:
1. BlockBasedFileSystem(s) that pointed to the same actual device had a
separate DiskCache object being attached to them. Because both instances
were not synchronized by any means, corruption of the filesystem is most
likely achieveable by a simple cache flush of either of the instances.
2. For superblock-oriented filesystems (such as the ext2 filesystem),
lack of synchronization between both instances can lead to severe
corruption in the superblock, which could render the entire filesystem
unusable.
3. Flags of a specific filesystem implementation (for example, with xfs
on Linux, one can instruct to mount it with the discard option) must be
honored across multiple mounts, to ensure expected behavior against a
particular filesystem.

This patch put the foundations to start fix the issues mentioned above.
However, there are still major issues to solve, so this is only a start.
2022-10-22 16:57:52 -04:00