Commit graph

806 commits

Author SHA1 Message Date
Idan Horowitz
bc85b64a38 Kernel: Replace usages of String::formatted with KString in sys$exec 2022-01-12 16:09:09 +02:00
Idan Horowitz
dba0840942 Kernel: Remove outdated FIXME comment in sys$sethostname 2022-01-12 16:09:09 +02:00
Daniel Bertalan
182016d7c0 Kernel+LibC+LibCore+UE: Implement fchmodat(2)
This function is an extended version of `chmod(2)` that lets one control
whether to dereference symlinks, and specify a file descriptor to a
directory that will be used as the base for relative paths.
2022-01-12 14:54:12 +01:00
Andreas Kling
a62bdb0761 Kernel: Delay Process data unprotection in sys$pledge()
Don't unprotect the protected data area until we've validated the pledge
syscall inputs.
2022-01-02 18:08:02 +01:00
circl
63760603f3 Kernel+LibC+LibCore: Add lchown and fchownat functions
This modifies sys$chown to allow specifying whether or not to follow
symlinks and in which directory.

This was then used to implement lchown and fchownat in LibC and LibCore.
2022-01-01 15:08:49 +01:00
Daniel Bertalan
1d2f78682b Kernel+AK: Eliminate a couple of temporary String allocations 2021-12-30 14:16:03 +01:00
Brian Gianforcaro
54b9a4ec1e Kernel: Handle promise violations in the syscall handler
Previously we would crash the process immediately when a promise
violation was found during a syscall. This is error prone, as we
don't unwind the stack. This means that in certain cases we can
leak resources, like an OwnPtr / RefPtr tracked on the stack. Or
even leak a lock acquired in a ScopeLockLocker.

To remedy this situation we move the promise violation handling to
the syscall handler, right before we return to user space. This
allows the code to follow the normal unwind path, and grantees
there is no longer any cleanup that needs to occur.

The Process::require_promise() and Process::require_no_promises()
functions were modified to return ErrorOr<void> so we enforce that
the errors are always propagated by the caller.
2021-12-29 18:08:15 +01:00
Brian Gianforcaro
0f7fe1eb08 Kernel: Use Process::require_no_promises instead of REQUIRE_NO_PROMISES
This change lays the foundation for making the require_promise return
an error hand handling the process abort outside of the syscall
implementations, to avoid cases where we would leak resources.

It also has the advantage that it makes removes a gs pointer read
to look up the current thread, then process for every syscall. We
can instead go through the Process this pointer in most cases.
2021-12-29 18:08:15 +01:00
Brian Gianforcaro
bad6d50b86 Kernel: Use Process::require_promise() instead of REQUIRE_PROMISE()
This change lays the foundation for making the require_promise return
an error hand handling the process abort outside of the syscall
implementations, to avoid cases where we would leak resources.

It also has the advantage that it makes removes a gs pointer read
to look up the current thread, then process for every syscall. We
can instead go through the Process this pointer in most cases.
2021-12-29 18:08:15 +01:00
Brian Gianforcaro
b5367bbf31 Kernel: Clarify why ftruncate() & pread() are passed off_t const*
I fell into this trap and tried to switch the syscalls to pass by
the `off_t` by register. I think it makes sense to add a clarifying
comment for future readers of the code, so they don't fall into the
same trap. :^)
2021-12-29 05:54:04 -08:00
Brian Gianforcaro
737a11389c Kernel: Fix info leak from sockaddr_un in socket syscalls
In `sys$accept4()` and `get_sock_or_peer_name()` we were not
initializing the padding of the `sockaddr_un` struct, leading to
an kernel information leak if the
caller looked back at it's contents.

Before Fix:

    37.766 Clipboard(11:11): accept4 Bytes:
    2f746d702f706f7274616c2f636c6970626f61726440eac130e7fbc1e8abbfc
    19c10ffc18440eac15485bcc130e7fbc1549feaca6c9deaca549feaca1bb0bc
    03efdf62c0e056eac1b402d7acd010ffc14602000001b0bc030100000050bf0
    5c24602000001e7fbc1b402d7ac6bdc

After Fix:

    0.603 Clipboard(11:11): accept4 Bytes:
    2f746d702f706f7274616c2f636c6970626f617264000000000000000000000
    000000000000000000000000000000000000000000000000000000000000000
    000000000000000000000000000000000000000000000000000000000000000
    0000000000000000000000000000000
2021-12-29 03:41:32 -08:00
Daniel Bertalan
fcdd202741 Kernel: Return the actual number of CPU cores that we have
... instead of returning the maximum number of Processor objects that we
can allocate.

Some ports (e.g. gdb) rely on this information to determine the number
of worker threads to spawn. When gdb spawned 64 threads, the kernel
could not cope with generating backtraces for it, which prevented us
from debugging it properly.

This commit also removes the confusingly named
`Processor::processor_count` function so that this mistake can't happen
again.
2021-12-29 03:17:41 -08:00
Idan Horowitz
d7ec5d042f Kernel: Port Process to ListedRefCounted 2021-12-29 12:04:15 +01:00
Guilherme Goncalves
33b78915d3 Kernel: Propagate overflow errors from Memory::page_round_up
Fixes #11402.
2021-12-28 23:08:50 +01:00
Daniel Bertalan
52beeebe70 Kernel: Remove the KString::try_create(String::formatted(...)) pattern
We can now directly create formatted KStrings with KString::formatted.

:^)
2021-12-28 01:55:22 -08:00
Guilherme Gonçalves
da6aef9fff Kernel: Make msync return EINVAL when regions are too large
As a small cleanup, this also makes `page_round_up` verify its
precondition with `page_round_up_would_wrap` (which callers are expected
to call), rather than having its own logic.

Fixes #11297.
2021-12-23 17:43:12 -08:00
Daniel Bertalan
8e3d1a42e3 Kernel+UE+LibC: Store address as void* in SC_m{re,}map_params
Most other syscalls pass address arguments as `void*` instead of
`uintptr_t`, so let's do that here too. Besides improving consistency,
this commit makes `strace` correctly pretty-print these arguments in
hex.
2021-12-23 23:08:10 +01:00
Daniel Bertalan
77f9272aaf Kernel+UE: Add MAP_FIXED_NOREPLACE mmap() flag
This feature was introduced in version 4.17 of the Linux kernel, and
while it's not specified by POSIX, I think it will be a nice addition to
our system.

MAP_FIXED_NOREPLACE provides a less error-prone alternative to
MAP_FIXED: while regular fixed mappings would cause any intersecting
ranges to be unmapped, MAP_FIXED_NOREPLACE returns EEXIST instead. This
ensures that we don't corrupt our process's address space if something
is already at the requested address.

Note that the more portable way to do this is to use regular
MAP_ANONYMOUS, and check afterwards whether the returned address matches
what we wanted. This, however, has a large performance impact on
programs like Wine which try to reserve large portions of the address
space at once, as the non-matching addresses have to be unmapped
separately.
2021-12-23 23:08:10 +01:00
Andreas Kling
1d08b671ea Kernel: Enter new address space before destroying old in sys$execve()
Previously we were assigning to Process::m_space before actually
entering the new address space (assigning it to CR3.)

If a thread was preempted by the scheduler while destroying the old
address space, we'd then attempt to resume the thread with CR3 pointing
at a partially destroyed address space.

We could then crash immediately in write_cr3(), right after assigning
the new value to CR3. I am hopeful that this may have been the bug
haunting our CI for months. :^)
2021-12-23 01:18:26 +01:00
Daniel Bertalan
ce1bf3724e Kernel: Replace intersecting ranges in mmap when MAP_FIXED is specified
This behavior is mandated by POSIX and is used by software like Wine
after reserving large chunks of the address range.
2021-12-22 00:02:36 -08:00
Martin Bříza
86b249f02f Kernel: Implement sysconf(_SC_SYMLOOP_MAX)
Not much to say here, this is an implementation of this call that
accesses the actual limit constant that's used by the VirtualFileSystem
class.

As a side note, this is required for my eventual Qt port.
2021-12-21 12:54:11 -08:00
Liav A
5a649d0fd5 Kernel: Return EINVAL when specifying -1 for setuid and similar syscalls
For setreuid and setresuid syscalls, -1 means to set the current
uid/euid/gid/egid value, to be more convenient for programming.
However, for other syscalls where we pass only one argument, there's no
justification to specify -1.

This behavior is identical to how Linux handles the value -1, and is
influenced by the fact that the manual pages for the group of one
argument syscalls that handle ID operations is ambiguous about this
topic.
2021-12-20 11:32:16 +01:00
Hendiadyoin1
18013f3c06 Kernel: Remove a redundant check in Process::remap_range_as_stack
We already VERIFY that we have carved something out, so we don't need to
check that again.
2021-12-18 10:31:18 -08:00
Hendiadyoin1
2d28b441bf Kernel: Collapse a redundant boolean conditional return statement in …
validate_mmap_prot
2021-12-18 10:31:18 -08:00
Hendiadyoin1
f38d32535c Kernel: Access OpenFileDescriptions::max_open() statically in Syscalls 2021-12-18 10:31:18 -08:00
Hendiadyoin1
c860e0ab95 Kernel: Add implicit auto qualifiers in Syscalls 2021-12-18 10:31:18 -08:00
Hendiadyoin1
f5b495d92c Kernel: Remove else after return in Process::do_write 2021-12-18 10:31:18 -08:00
Andreas Kling
32aa623eff Kernel: Fix 4-byte uninitialized memory leak in sys$sigaltstack()
It was possible to extract 4 bytes of uninitialized kernel stack memory
on x86_64 by looking in the padding of stack_t.
2021-12-18 11:30:10 +01:00
Andreas Kling
0ae8702692 Kernel: Make File::stat() & friends return Error<struct stat>
Instead of making the caller provide a stat buffer, let's just return
one as a value.
2021-12-18 11:30:10 +01:00
Andreas Kling
abf2204402 Kernel: Use copy_typed_from_user() in more places :^) 2021-12-18 11:30:10 +01:00
Andreas Kling
39d9337db5 Kernel: Make sys${ftruncate,pread} take off_t as const pointer
These syscalls don't write back to the off_t value (unlike sys$lseek)
so let's take Userspace<off_t const*> instead of Userspace<off_t*>.
2021-12-18 11:30:10 +01:00
Jean-Baptiste Boric
23257cac52 Kernel: Remove sys$select() syscall
Now that the userland has a compatiblity wrapper for select(), the
kernel doesn't need to implement this syscall natively. The poll()
interface been around since 1987, any code still using select()
should be slapped silly.

Note: the SerenityOS source tree mostly uses select() and not poll()
despite SerenityOS having support for poll() since early 2019...
2021-12-12 21:48:50 +01:00
Jean-Baptiste Boric
2177c2a30b Kernel: Split off sys$poll() into Syscalls/poll.cpp 2021-12-12 21:48:50 +01:00
Idan Horowitz
762e047ec9 Kernel+LibC: Implement sigtimedwait()
This includes a new Thread::Blocker called SignalBlocker which blocks
until a signal of a matching type is pending. The current Blocker
implementation in the Kernel is very complicated, but cleaning it up is
a different yak for a different day.
2021-12-12 08:34:19 +02:00
Idan Horowitz
81a76a30a1 Kernel: Preserve pending signals across execve(2)s
As required by posix. Also rename Thread::clear_signals to
Thread::reset_signals_for_exec since it doesn't actually clear any
pending signals, but rather does execve related signal book-keeping.
2021-12-12 08:34:19 +02:00
Idan Horowitz
0ca1231d8f Kernel: Inherit alternative signal stack on fork(2)
A child process created via fork(2) inherits a copy of its parent's
alternate signal stack settings.
2021-12-12 08:34:19 +02:00
Idan Horowitz
92a6c91f4e Kernel: Preserve signal mask across fork(2) and execve(2)
A child created via fork(2) inherits a copy of its parent's signal
mask; the signal mask is preserved across execve(2).
2021-12-12 08:34:19 +02:00
Ben Wiederhake
0e6e1092f0 Kernel: Make ptrace return an error on error
Returning 'result.error().code()' erroneously creates an
ErrorOr<FlatPtr> of the positive errno code, which breaks our
error-returning convention.

This seems to be due to a forgotten minus-sign during the refactoring in
9e51e295cf. This latent bug was never
discovered, because currently the error-handling paths are rarely
exercised.
2021-12-05 22:59:09 +01:00
Ben Wiederhake
0f8483f09c Kernel: Implement new ptrace function PT_PEEKBUF
This enables the tracer to copy large amounts of data in a much saner
way.
2021-12-05 22:59:09 +01:00
Ben Wiederhake
3e223185b3 Kernel+strace: Remove unnecessary indirection for PEEK
Also, remove incomplete, superfluous check.
Incomplete, because only the byte at the provided address was checked;
this misses the last bytes of the "jerk page".
Superfluous, because it is already correctly checked by peek_user_data
(which calls copy_from_user).

The caller/tracer should not typically attempt to read non-userspace
addresses, we don't need to "hot-path" it either.
2021-12-05 22:59:09 +01:00
Idan Horowitz
265764ff2f Kernel: Add support for the POLLWRBAND poll event 2021-12-05 12:53:29 +01:00
Idan Horowitz
f415218afe Kernel+LibC: Implement sigaltstack()
This is required for compiling wine for serenity
2021-12-01 21:44:11 +02:00
Idan Horowitz
d5d0eb45bf Kernel: Clear up some comments in the sys$mprotect implementation 2021-12-01 21:44:11 +02:00
Idan Horowitz
f27bbec7b2 Kernel: Move incorrect early return in sys$mprotect
Since we're iterating over multiple regions that interesect with the
requested range, just one of them having the requested access flags
is not enough to finish the syscall early.
2021-12-01 21:44:11 +02:00
Idan Horowitz
4ca39c7110 Kernel: Move the expand_range_to_page_boundaries helper to MemoryManager
This helper can (and will) be used in more parts of the kernel besides
the mmap-family of syscalls.
2021-12-01 21:44:11 +02:00
Idan Horowitz
fc13d0782f LibC: Make the madvise advice field a value instead of a bitfield
The advices are almost always exclusive of one another, and while POSIX
does not define madvise, most other unix-like and *BSD systems also only
accept a singular value per call.
2021-12-01 21:44:11 +02:00
Hendiadyoin1
c7b90fa7d3 Kernel: Don't rewrite the whole file on sys$msync 2021-12-01 09:47:46 +01:00
Hendiadyoin1
259f78545a Kernel: Allow flushing of partial regions in sys$msync 2021-12-01 09:47:46 +01:00
Hendiadyoin1
49d6ad6633 Kernel: Handle more error cases in sys$msync 2021-12-01 09:47:46 +01:00
Ben Wiederhake
33079c8ab9 Kernel+UE+LibC: Remove unused dbgputch syscall
Everything uses the dbgputstr syscall anyway, so there is no need to
keep supporting it.
2021-11-24 22:56:39 +01:00