Commit graph

8033 commits

Author SHA1 Message Date
Idan Horowitz
d1082a00b7 Kernel: Mark sys$set_mmap_name as not needing the big lock
All accesses to shared mutable data are already serialized behind the
process address space spinlock.
2023-04-06 20:30:03 +03:00
Idan Horowitz
2f79d0e8b9 Kernel: Mark sys$mprotect as not needing the big lock
All accesses to shared mutable data are already serialized behind the
process address space spinlock.
2023-04-06 20:30:03 +03:00
Idan Horowitz
3697214166 Kernel: Mark sys$mmap as not needing the big lock
All accesses to shared mutable data are already serialized behind the
process address space spinlock.
2023-04-06 20:30:03 +03:00
Idan Horowitz
dcdcab0099 Kernel: Remove unused credentials() call in validate_inode_mmap_prot
For some reason GCC did not complain about this.
2023-04-06 20:30:03 +03:00
Idan Horowitz
0b14081ae1 Kernel: Mark sys$map_time_page as not needing the big lock
All accesses to shared mutable data are already serialized behind the
process address space spinlock.
2023-04-06 20:30:03 +03:00
Idan Horowitz
0e564240a6 Kernel: Mark sys$madvise as not needing the big lock
All accesses to shared mutable data are already serialized behind the
process address space spinlock.
2023-04-06 20:30:03 +03:00
Pankaj Raghav
a65b0cbe4a Kernel/NVMeQueue: Use waitqueue in submit_sync_sqe
The current way we handle sync commands is very ugly and depends on lot
of preconditions. Now that we have an end_io handler for a request, we
can use WaitQueue to do sync commands more elegantly.

This does depend on block layer sending one request at a time but this
change is a step forward towards better IO handling.
2023-04-05 12:45:27 +02:00
Pankaj Raghav
0096eadf40 Kernel/NVMe: Redesign the tracking of requests in an NVMe Queue
There was a private variable named m_current_request which was used to
track a single request at a time. This guarantee is given by the block
layer where we wait on each IO. This design will break down in the
driver once the block layer removes that constraint.

Redesign the IO handling in a completely asynchronous way by maintaining
requests up to queue depth. NVMeIO struct is introduced to track an IO
submitted along with other information such whether the IO is still
being processed and an endio callback which will be called during the
end of a request.

A hashmap private variable is created which will key based on the
command id of a request with a value of NVMeIO. endio handler will come
in handy if we are doing a sync request and we want to wake up the wait
queue during the end.

This change also simplified the code by removing some special condition
in submit_sqe function, etc that were marked as FIXME for a long time.
2023-04-05 12:45:27 +02:00
Pankaj Raghav
3fe7bda021 Kernel/NVMe: Use an Atomic for command id instead of sq index
Using sq_tail as cid makes an inherent assumption that we send only
one IO at a time. Use an atomic variable instead for command id of a
submission queue entry.

As sq_tail is not used as cid anymore, remove m_prev_sq_tail which used
to hold the last used sq_tail value.
2023-04-05 12:45:27 +02:00
Andreas Kling
e219662ce0 Kernel: Mark sys$setpgid as not needing the big lock
This function is already serialized by access to process protected data.
2023-04-05 11:37:27 +02:00
Andreas Kling
84ac957d7a Kernel: Make Credentials the authority on process SID
The SID was duplicated between the process credentials and protected
data. And to make matters worse, the credentials SID was not updated in
sys$setsid.

This patch fixes this by removing the SID from protected data and
updating the credentials SID everywhere.
2023-04-05 11:37:27 +02:00
Andreas Kling
f764b8b113 Kernel: Mark sys$setsid as not needing the big lock
This function is now serialized by access to the process group list,
and to the current process's protected data.
2023-04-05 11:37:27 +02:00
Andreas Kling
3e30d9bc99 Kernel: Make ProcessGroup a ListedRefCounted and fix two races
This closes two race windows:

- ProcessGroup removed itself from the "all process groups" list in its
  destructor. It was possible to walk the list between the last unref()
  and the destructor invocation, and grab a pointer to a ProcessGroup
  that was about to get deleted.

- sys$setsid() could end up creating a process group that already
  existed, as there was a race window between checking if the PGID
  is used, and actually creating a ProcessGroup with that PGID.
2023-04-05 11:37:27 +02:00
Andreas Kling
37bfc36601 Kernel: Make SlavePTY store pointer to MasterPTY as NonnullRefPtr
No need for LockRefPtr here, as the pointer never changes after
initialization.
2023-04-05 11:37:27 +02:00
Andreas Kling
e69b2572a6 Kernel: Move Process's TTY pointer into protected data 2023-04-05 11:37:27 +02:00
Andreas Kling
1e2ef59965 Kernel: Move Process's process group pointer into protected data
Now that it's no longer using LockRefPtr, we can actually move it into
protected data. (LockRefPtr couldn't be stored there because protected
data is immutable at times, and LockRefPtr uses some of its own bits
for locking.)
2023-04-05 11:37:27 +02:00
Andreas Kling
1c77803845 Kernel: Stop using *LockRefPtr for TTY
TTY was only stored in Process::m_tty, so make that a SpinlockProtected.
2023-04-05 11:37:27 +02:00
Andreas Kling
350e5f9261 Kernel: Remove ancient InterruptDisabler in sys$setsid
This was some pre-SMP historical artifact.
2023-04-05 11:37:27 +02:00
Andreas Kling
ca1f8cac66 Kernel: Mark sys$faccessat as not needing the big lock 2023-04-05 11:37:27 +02:00
Andreas Kling
bd46397e1f Kernel: Mark inode watcher syscalls as not needing the big lock
These syscalls are already protected by existing locking mechanisms,
including the mutex inside InodeWatcher.
2023-04-04 10:33:42 +02:00
Andreas Kling
08d79c757a Kernel: Mark sys$killpg as not needing the big lock
Same as sys$kill, nothing here that isn't already protected by existing
locks.
2023-04-04 10:33:42 +02:00
Andreas Kling
e71b84228e Kernel: Mark sys$kill as not needing the big lock
This syscall sends a signal to other threads or itself. This mechanism
is already guarded by locking mechanisms, and widely used within the
kernel without help from the big lock.
2023-04-04 10:33:42 +02:00
Andreas Kling
3108daecc5 Kernel: Remove ancient InterruptDisablers in the kill/killpg syscalls
These are artifacts from the pre-SMP times.
2023-04-04 10:33:42 +02:00
Andreas Kling
46ab245e74 Kernel: Mark sys$getrusage as not needing the big lock
Same deal as sys$times, nothing here that needs locking at the moment.
2023-04-04 10:33:42 +02:00
Andreas Kling
3371165588 Kernel: Make the getsockname/getpeername syscall helper a bit nicer
Instead of templatizing on a bool parameter, use an enum for clarity.
2023-04-04 10:33:42 +02:00
Andreas Kling
5bc7882b68 Kernel: Make sys$times not use the big lock
...and also make the Process tick counters clock_t instead of u32.
It seems harmless to get interrupted in the middle of reading these
counters and reporting slightly fewer ticks in some category.
2023-04-04 10:33:42 +02:00
Andreas Kling
b98f537f11 Kernel+Userland: Make some of the POSIX types larger
Expand the following types from 32-bit to 64-bit:
- blkcnt_t
- blksize_t
- dev_t
- nlink_t
- suseconds_t
- clock_t

This matches their size on other 64-bit systems.
2023-04-04 10:33:42 +02:00
Andreas Kling
d8bb32117e Kernel: Mark sys$umask as not needing the big lock
The body of this syscall is already serialized by calling
with_mutable_protected_data().
2023-04-04 10:33:42 +02:00
Andreas Kling
6c02c493f1 Kernel: Mark sys$sigtimedwait as not needing the big lock
Yet another syscall that only messes with the current thread.
2023-04-04 10:33:42 +02:00
Andreas Kling
f0b5c585f2 Kernel: Mark sys$sigpending as not needing the big lock
Another one that only touches the current thread.
2023-04-04 10:33:42 +02:00
Andreas Kling
e9fe0ecbae Kernel: Mark sys$sigprocmask as not needing the big lock
Another one that only messes with the current thread.
2023-04-04 10:33:42 +02:00
Andreas Kling
d1fae8b09c Kernel: Mark sys$sigsuspend as not needing the big lock
This syscall is only concerned with the current thread.
2023-04-04 10:33:42 +02:00
Andreas Kling
374f4aeab9 Kernel: Mark sys$sigreturn as not needing the big lock
This syscall is only concerned with the current thread (except in the
case of a pledge violation, when it will add some details about that
to the process coredump metadata. That stuff is already serialized.)
2023-04-04 10:33:42 +02:00
Andreas Kling
a7212a7488 Kernel: Mark sys$open as not needing the big lock
All the individual sub-operations of this syscall are protected by their
own locking mechanisms, so it should be okay to get it off the big lock.
2023-04-04 10:33:42 +02:00
Andreas Kling
97ac4601f5 Kernel: Use custody_for_dirfd() in more syscalls
This simplifies a lot of syscalls, some of which were doing very
unnecessarily verbose things instead of calling this.
2023-04-04 10:33:42 +02:00
Andreas Kling
f0c9c5e076 Kernel: Make custody_for_dirfd() fail on files other than directories 2023-04-04 10:33:42 +02:00
Andreas Kling
41f5598516 Kernel: Make sys$getsid not require the big lock
Reorganize the code slightly to avoid creating a TOCTOU bug, then mark
the syscall as not needing the big lock anymore.
2023-04-04 10:33:42 +02:00
Andreas Kling
1382439267 Kernel: Mark sys$getpgrp as not needing the big lock
Access to the process's process group is already serialized by
SpinlockProtected.
2023-04-04 10:33:42 +02:00
Andreas Kling
2ddd69260c Kernel: Mark sys$getpgid as not needing the big lock
Access to the process's process group is already serialized by
SpinlockProtected.
2023-04-04 10:33:42 +02:00
Andreas Kling
775e6d6865 Kernel: Mark sys$fcntl as not needing the big lock
This syscall operates on the file descriptor table, and on individual
open file descriptions. Both of those are already protected by scoped
locking mechanisms.
2023-04-04 10:33:42 +02:00
Andreas Kling
6132193bd4 Kernel: Make sys$disown not require the big lock
This syscall had a TOCTOU where it checked the peer's PPID before
locking the protected data (where the PPID is stored).

After closing the race window, we can mark the syscall as not needing
the big lock.
2023-04-04 10:33:42 +02:00
Andreas Kling
5759ea19fb Kernel: Mark sys$alarm as not needing the big lock
Access to Process::m_alarm_timer is serialized via SpinlockProtected,
so there's no longer need for this syscall to use the big lock.
2023-04-04 10:33:42 +02:00
Andreas Kling
496d918e92 Kernel: Stop using *LockRefPtr for Kernel::Timer 2023-04-04 10:33:42 +02:00
Andreas Kling
83b409083b Kernel: Stop using *LockRefPtr for ProcessGroup
Had to wrap Process::m_pg in a SpinlockProtected for this to be safe.
2023-04-04 10:33:42 +02:00
Andreas Kling
ed1253ab90 Kernel: Don't ref/unref the holder thread in Mutex
There was a whole bunch of ref counting churn coming from Mutex, which
had a RefPtr<Thread> m_holder to (mostly) point at the thread holding
the mutex.

Since we never actually dereference the m_holder value, but only use it
for identity checks against thread pointers, we can store it as an
uintptr_t and skip the ref counting entirely.

Threads can't die while holding a mutex anyway, so there's no risk of
them going missing on us.
2023-04-04 10:33:42 +02:00
Andreas Kling
c3915e4058 Kernel: Stop using *LockRefPtr for Thread
These were stored in a bunch of places. The main one that's a bit iffy
is the Mutex::m_holder one, which I'm going to simplify in a subsequent
commit.

In Plan9FS and WorkQueue, we can't make the NNRPs const due to
initialization order problems. That's probably doable with further
cleanup, but left as an exercise for our future selves.

Before starting this, I expected the thread blockers to be a problem,
but as it turns out they were super straightforward (for once!) as they
don't mutate the thread after initiating a block, so they can just use
simple const-ified NNRPs.
2023-04-04 10:33:42 +02:00
Andreas Kling
a098266ff5 Kernel: Simplify Process factory functions
- Instead of taking the first new thread as an out-parameter, we now
  bundle the process and its first thread in a struct and use that
  as the return value.

- Make all Process factory functions return ErrorOr. Use this to convert
  some places to more TRY().

- Drop the "try_" prefix on Process factory functions.
2023-04-04 10:33:42 +02:00
Andreas Kling
65438d8a85 Kernel: Stop using *LockRefPtr for Process pointers
The only persistent one of these was Thread::m_process and that never
changes after initialization. Make it const to enforce this and switch
everything over to RefPtr & NonnullRefPtr.
2023-04-04 10:33:42 +02:00
Andreas Kling
19084ef743 Kernel: Simplify Mount internals
- The host custody never changes after initialization, so there's no
  need to protect it with a spinlock.

- To enforce the fact that some members don't change after
  initialization, make them const.
2023-04-04 10:33:42 +02:00
Andreas Kling
673592dea8 Kernel: Stop using *LockRefPtr for FileSystem pointers
There was only one permanent storage location for these: as a member
in the Mount class.

That member is never modified after Mount initialization, so we don't
need to worry about races there.
2023-04-04 10:33:42 +02:00