Commit graph

613 commits

Author SHA1 Message Date
Andreas Kling
ae197deb6b Kernel: Strongly typed user & group ID's
Prior to this change, both uid_t and gid_t were typedef'ed to `u32`.
This made it easy to use them interchangeably. Let's not allow that.

This patch adds UserID and GroupID using the AK::DistinctNumeric
mechanism we've already been employing for pid_t/ProcessID.
2021-08-29 01:09:19 +02:00
Andreas Kling
59335bd8ea Kernel: Rename FileDescription::create() => try_create() 2021-08-29 01:09:19 +02:00
Andrew Kaster
54161bf5b4 Kernel: Acquire reference to waitee before trying to block in sys$waitid
Previously, we would try to acquire a reference to the all processes
lock or other contended resources while holding both the scheduler lock
and the thread's blocker lock. This could lead to a deadlock if we
actually have to block on those other resources.
2021-08-28 20:53:38 +02:00
Andrew Kaster
dea62fe93c Kernel: Guard the all processes list with a Spinlock rather than a Mutex
There are callers of processes().with or processes().for_each that
require interrupts to be disabled. Taking a Mutexe with interrupts
disabled is a recipe for deadlock, so convert this to a Spinlock.
2021-08-28 20:53:38 +02:00
Andrew Kaster
70518e69f4 Kernel: Unlock ptrace lock before entering a critical section in execve
While it might not be as bad to release a mutex while interrupts are
disabled as it is to acquire one, we don't want to mess with that.
2021-08-28 20:53:38 +02:00
Andrew Kaster
8e70b85215 Kernel: Don't disable interrupts in validate_inode_mmap_prot
There's no need to disable interrupts when trying to access an inode's
shared vmobject. Additionally, Inode::shared_vmobject() acquires a Mutex
which is a recipe for deadlock if acquired with interrupts disabled.
2021-08-28 20:53:38 +02:00
Brian Gianforcaro
485f51690d Kernel: Always observe the return value of Region::map and remap
We have seen cases where the map fails, but we return the region
to the caller, causing them to page fault later on when they touch
the region.

The fix is to always observe the return code of map/remap.
2021-08-25 00:18:42 +02:00
Andreas Kling
6c16bedd69 Kernel: Remove unnecessary FutexQueue::did_remove()
This was only ever called immediately after FutexQueue::try_remove()
to VERIFY() that the state looks exactly like it should after returning
from try_remove().
2021-08-23 00:02:09 +02:00
Andreas Kling
85546af417 Kernel: Rename Thread::BlockCondition to BlockerSet
This class represents a set of Thread::Blocker objects attached to
something that those blockers are waiting on.
2021-08-23 00:02:09 +02:00
Andreas Kling
bcd2025311 Everywhere: Core dump => Coredump
We all know what a coredump is, and it feels more natural to refer to
it as a coredump (most code already does), so let's be consistent.
2021-08-23 00:02:09 +02:00
Andreas Kling
1b9916439f Kernel: Make Processor::platform_string() return StringView 2021-08-23 00:02:09 +02:00
Andreas Kling
c922a7da09 Kernel: Rename ScopedSpinlock => SpinlockLocker
This matches MutexLocker, and doesn't sound like it's a lock itself.
2021-08-22 03:34:10 +02:00
Andreas Kling
55adace359 Kernel: Rename SpinLock => Spinlock 2021-08-22 03:34:10 +02:00
Idan Horowitz
cf271183b4 Kernel: Make Process::current() return a Process& instead of Process*
This has several benefits:
1) We no longer just blindly derefence a null pointer in various places
2) We will get nicer runtime error messages if the current process does
turn out to be null in the call location
3) GCC no longer complains about possible nullptr dereferences when
compiling without KUBSAN
2021-08-19 23:49:53 +02:00
Andreas Kling
961f727448 Kernel: Consolidate a bunch of i386/x86_64 code paths
Add some arch-specific getters and setters that allow us to merge blocks
that were previously specific to either ARCH(I386) or ARCH(X86_64).
2021-08-19 23:22:02 +02:00
Andreas Kling
4226b662cd Kernel+Userland: Remove global futexes
We only ever use private futexes, so it doesn't make sense to carry
around all the complexity required for global (cross-process) futexes.
2021-08-17 01:21:47 +02:00
sin-ack
0a18425cbb Kernel: Make Memory::Region allocation functions return KResultOr
This makes for some nicer handling of errors compared to checking an
OwnPtr for null state.
2021-08-15 15:41:02 +02:00
sin-ack
4bfd6e41b9 Kernel: Make Kernel::VMObject allocation functions return KResultOr
This makes for nicer handling of errors compared to checking whether a
RefPtr is null. Additionally, this will give way to return different
types of errors in the future.
2021-08-15 15:41:02 +02:00
Andreas Kling
1b739a72c2 Kernel+Userland: Remove chroot functionality
We are not using this for anything and it's just been sitting there
gathering dust for well over a year, so let's stop carrying all this
complexity around for no good reason.
2021-08-15 12:44:35 +02:00
Andreas Kling
0f6f863382 Kernel: Convert remaining users of copy_string_from_user()
This patch replaces the remaining users of this API with the new
try_copy_kstring_from_user() instead. Note that we still convert to a
String for continued processing, and I've added FIXME about continuing
work on using KString all the way.
2021-08-15 12:44:35 +02:00
sin-ack
748938ea59 Kernel: Handle allocation failure in ProcFS and friends
There were many places in which allocation failure was noticed but
ignored.
2021-08-15 02:27:13 +02:00
Andreas Kling
7676edfb9b Kernel: Stop allowing implicit conversion from KResult to int
This patch removes KResult::operator int() and deals with the fallout.
This forces a lot of code to be more explicit in its handling of errors,
greatly improving readability.
2021-08-14 15:19:00 +02:00
Andreas Kling
d30d776ca4 Kernel: Make FileSystem::initialize() return KResult
This forced me to also come up with error codes for a bunch of
situations where we'd previously just panic the kernel.
2021-08-14 15:19:00 +02:00
Brian Gianforcaro
296452a981 Kernel: Make cloning of FileDescriptions OOM safe 2021-08-13 11:09:25 +02:00
Brian Gianforcaro
ed6d842f85 Kernel: Fix OOB read in sys$dbgputstr(..) during fuzzing
The implementation uses try_copy_kstring_from_user to allocate a kernel
string using, but does not use the length of the resulting string.
The size parameter to the syscall is untrusted, as try copy kstring will
attempt to perform a `safe_strlen(..)` on the user mode string and use
that value for the allocated length of the KString instead. The bug is
that we are printing the kstring, but with the usermode size argument.

During fuzzing this resulted in us walking off the end of the allocated
KString buffer printing garbage (or any kernel data!), until we stumbled
in to the KSym region and hit a fatal page fault.

This is technically a kernel information disclosure, but (un)fortunately
the disclosure only happens to the Bochs debug port, and or the serial
port if serial debugging is enabled. As far as I can tell it's not
actually possible for an untrusted attacker to use this to do something
nefarious, as they would need access to the host. If they have host
access then they can already do much worse things :^).
2021-08-13 11:08:11 +02:00
Brian Gianforcaro
40a942d28b Kernel: Remove char* versions of path argument / kstring copy methods
The only two paths for copying strings in the kernel should be going
through the existing Userspace<char const*>, or StringArgument methods.

Lets enforce this by removing the option for using the raw cstring APIs
that were previously available.
2021-08-13 11:08:11 +02:00
Brian Gianforcaro
5121e58d4a Kernel: Fix sys$dbgputstr(...) to take a char* instead of u8*
We always attempt to print this as a string, and it's defined as such in
LibC, so fix the signature to match.
2021-08-13 11:08:11 +02:00
Liav A
01b79910b3 Kernel/Process: Move protected values to the end of the object
The compiler can re-order the structure (class) members if that's
necessary, so if we make Process to inherit from ProcFSExposedComponent,
even if the declaration is to inherit first from ProcessBase, then from
ProcFSExposedComponent and last from Weakable<Process>, the members of
class ProcFSExposedComponent (including the Ref-counted parts) are the
first members of the Process class.

This problem made it impossible to safely use the current toggling
method with the write-protection bit on the ProcessBase members, so
instead of inheriting from it, we make its members the last ones in the
Process class so we can safely locate and modify the corresponding page
write protection bit of these values.

We make sure that the Process class doesn't expand beyond 8192 bytes and
the protected values are always aligned on a page boundary.
2021-08-12 20:57:32 +02:00
Andreas Kling
a29788e58a Kernel: Don't record sys$perf_event() if profiling is not enabled
If you want to record perf events, just enable profiling. This allows us
to add random perf events to programs without littering the file system
with perfcore files.
2021-08-12 00:03:40 +02:00
Andreas Kling
1e90a3a542 Kernel: Make sys$perf_register_string() generate the string ID's
Making userspace provide a global string ID was silly, and made the API
extremely difficult to use correctly in a global profiling context.

Instead, simply make the kernel do the string ID allocation for us.
This also allows us to convert the string storage to a Vector in the
kernel (and an array in the JSON profile data.)
2021-08-12 00:03:39 +02:00
Andreas Kling
4657c79143 Kernel+LibC: Add sys$perf_register_string()
This syscall allows userspace to register a keyed string that appears in
a new "strings" JSON object in profile output.

This will be used to add custom strings to profile signposts. :^)
2021-08-12 00:03:39 +02:00
Gunnar Beutner
3322efd4cd Kernel: Fix kernel panic when blocking on the process' big lock
Another thread might end up marking the blocking thread as holding
the lock before it gets a chance to finish invoking the scheduler.
2021-08-10 22:33:50 +02:00
Andreas Kling
fdfc66db61 Kernel+LibC: Allow clock_gettime() to run without syscalls
This patch adds a vDSO-like mechanism for exposing the current time as
an array of per-clock-source timestamps.

LibC's clock_gettime() calls sys$map_time_page() to map the kernel's
"time page" into the process address space (at a random address, ofc.)
This is only done on first call, and from then on the timestamps are
fetched from the time page.

This first patch only adds support for CLOCK_REALTIME, but eventually
we should be able to support all clock sources this way and get rid of
sys$clock_gettime() in the kernel entirely. :^)

Accesses are synchronized using two atomic integers that are incremented
at the start and finish of the kernel's time page update cycle.
2021-08-10 19:21:16 +02:00
Andreas Kling
fa64ab26a4 Kernel+UserspaceEmulator: Remove unused sys$gettimeofday()
Now that LibC uses clock_gettime() to implement gettimeofday(), we can
get rid of this entire syscall. :^)
2021-08-10 13:01:39 +02:00
Andreas Kling
0a02496f04 Kernel/SMP: Change critical sections to not disable interrupts
Leave interrupts enabled so that we can still process IRQs. Critical
sections should only prevent preemption by another thread.

Co-authored-by: Tom <tomut@yahoo.com>
2021-08-10 02:49:37 +02:00
Andreas Kling
9babb92a4b Kernel/SMP: Make entering/leaving critical sections multi-processor safe
By making these functions static we close a window where we could get
preempted after calling Processor::current() and move to another
processor.

Co-authored-by: Tom <tomut@yahoo.com>
2021-08-10 02:49:37 +02:00
Andreas Kling
c94c15d45c Everywhere: Replace AK::Singleton => Singleton 2021-08-08 00:03:45 +02:00
Andreas Kling
15d033b486 Kernel: Remove unused Process pointer in Memory::AddressSpace
Nobody was using the back-pointer to the process, so let's lose it.
2021-08-08 00:03:45 +02:00
Idan Horowitz
9d21c79671 Kernel: Disable big process lock for sys$sync
This syscall doesn't touch any intra-process shared resources and only
calls VirtualFileSystem::sync, which is self-locking.
2021-08-07 15:30:26 +02:00
sin-ack
0d468f2282 Kernel: Implement a ISO 9660 filesystem reader :^)
This commit implements the ISO 9660 filesystem as specified in ECMA 119.
Currently, it only supports the base specification and Joliet or Rock
Ridge support is not present. The filesystem will normalize all
filenames to be lowercase (same as Linux).

The filesystem can be mounted directly from a file. Loop devices are
currently not supported by SerenityOS.

Special thanks to Lubrsi for testing on real hardware and providing
profiling help.

Co-Authored-By: Luke <luke.wilde@live.co.uk>
2021-08-07 15:21:58 +02:00
Andreas Kling
5acb7e4eba Kernel: Remove outdated FIXME about ProcessHandle
ProcessHandle hasn't been a thing since Process became ref-counted.
2021-08-07 12:29:26 +02:00
Jean-Baptiste Boric
08891e82a5 Kernel: Migrate process list locking to ProtectedValue
The existing recursive spinlock is repurposed for profiling only, as it
was shared with the process list.
2021-08-07 11:48:00 +02:00
Jean-Baptiste Boric
8554b66d09 Kernel: Make process list a singleton 2021-08-07 11:48:00 +02:00
Jean-Baptiste Boric
626b99ce1c Kernel: Migrate hostname locking to ProtectedValue 2021-08-07 11:48:00 +02:00
Andreas Kling
f770b9d430 Kernel: Fix bad search-and-replace renames
Oops, I didn't mean to change every *Range* to *VirtualRange*!
2021-08-07 00:39:06 +02:00
Idan Horowitz
ad419a669d Kernel: Disable big process lock for sys$sysconf
This syscall only reads constant kernel globals, and as such does not
need to hold the big lock.
2021-08-06 23:36:12 +02:00
Idan Horowitz
efeb01e35f Kernel: Disable big process lock for sys$get_stack_bounds
This syscall only reads from the shared m_space field, but that field
is only over written to by Process::attach_resources, before the
process was initialized (aka, before syscalls can happen), by
Process::finalize which is only called after all the process' threads
have exited (aka, syscalls can not happen anymore), and by
Process::do_exec which calls all other syscall-capable threads before
doing so. Space's find_region_containing already holds its own lock,
and as such there's no need to hold the big lock.
2021-08-06 23:36:12 +02:00
Idan Horowitz
d40038a04f Kernel: Disable big process lock for sys$gettimeofday
This syscall doesn't touch any intra-process shared resources and only
accesses the time via the atomic TimeManagement::now so there's no need
to hold the big lock.
2021-08-06 23:36:12 +02:00
Idan Horowitz
3ba2449058 Kernel: Disable big process lock for sys$clock_nanosleep
This syscall doesn't touch any intra-process shared resources and only
accesses the time via the atomic TimeManagement::current_time so there's
no need to hold the big lock.
2021-08-06 23:36:12 +02:00
Idan Horowitz
fbd848e6eb Kernel: Disable big process lock for sys$clock_gettime()
This syscall doesn't touch any intra-process shared resources and
reads the time via the atomic TimeManagement::current_time, so it
doesn't need to hold any lock.
2021-08-06 23:36:12 +02:00