Commit graph

7635 commits

Author SHA1 Message Date
sin-ack
d9e1a6c566 Kernel: Bump maximum pthread stack size to 32MiB
The Zig compiler asks for this much stack on its main thread via the use
of PT_GNU_STACK.
2022-12-11 19:55:37 -07:00
sin-ack
ef6921d7c7 Kernel+LibC+LibELF: Set stack size based on PT_GNU_STACK during execve
Some programs explicitly ask for a different initial stack size than
what the OS provides. This is implemented in ELF by having a
PT_GNU_STACK header which has its p_memsz set to the amount that the
program requires. This commit implements this policy by reading the
p_memsz of the header and setting the main thread stack size to that.
ELF::Image::validate_program_headers ensures that the size attribute is
a reasonable value.
2022-12-11 19:55:37 -07:00
sin-ack
3275015786 Kernel: Implement flock downgrading
This commit makes it possible for a process to downgrade a file lock it
holds from a write (exclusive) lock to a read (shared) lock. For this,
the process must point to the exact range of the flock, and must be the
owner of the lock.
2022-12-11 19:55:37 -07:00
sin-ack
9b425b860c Kernel+LibC+Tests: Implement pwritev(2)
While this isn't really POSIX, it's needed by the Zig port and was
simple enough to implement.
2022-12-11 19:55:37 -07:00
sin-ack
70337f3a4b Kernel+LibC: Implement setregid(2)
This copies and adapts the setresgid syscall, following in the footsteps
of setreuid and setresuid.
2022-12-11 19:55:37 -07:00
sin-ack
2a502fe232 Kernel+LibC+LibCore+UserspaceEmulator: Implement faccessat(2)
Co-Authored-By: Daniel Bertalan <dani@danielbertalan.dev>
2022-12-11 19:55:37 -07:00
sin-ack
fa692e13f9 Kernel: Use real UID/GID when checking for file access
This aligns the rest of the system with POSIX, who says that access(2)
must check against the real UID and GID, not effective ones.
2022-12-11 19:55:37 -07:00
sin-ack
3472c84d14 Kernel: Remove InodeMetadata::may_{read,write,execute}(Process const&)
These have no definition and are never used.
2022-12-11 19:55:37 -07:00
sin-ack
d5fbdf1866 Kernel+LibC+LibCore: Implement renameat(2)
Now with the ability to specify different bases for the old and new
paths.
2022-12-11 19:55:37 -07:00
sin-ack
eb5389e933 Kernel+LibC+LibCore: Implement mkdirat(2) 2022-12-11 19:55:37 -07:00
sin-ack
6445a706cf Kernel+LibC: Implement readlinkat(2)
Co-Authored-By: Daniel Bertalan <dani@danielbertalan.dev>
2022-12-11 19:55:37 -07:00
sin-ack
9850a69cd1 Kernel+LibC+LibCore: Implement symlinkat(2)
Co-Authored-By: Daniel Bertalan <dani@danielbertalan.dev>
2022-12-11 19:55:37 -07:00
sin-ack
5c1d5ed51d Kernel: Implement Process::custody_for_dirfd
This allows deduplicating a bunch of code that has to work with
POSIX' *at syscall semantics.
2022-12-11 19:55:37 -07:00
kleines Filmröllchen
bfb3fc58dd Kernel: Allow dead threads to be joined
Joining dead threads is allowed for two main reasons:
- Thread join behavior should not be racy when a thread is joined and
  exiting at roughly the same time. This is common behavior when threads
  are given a signal to end (meaning they are going to exit ASAP) and
  then joined.
- POSIX requires that exited threads are joinable (at least, there is no
  language in the specification forbidding it).

The behavior is still well-defined; e.g. it doesn't allow a dead
detached thread to be joined or a thread to be joined more than once.
2022-12-11 19:07:20 -07:00
Thomas Queiroz
0380ff30aa Kernel: Use HashMap::try_ensure_capacity 2022-12-10 14:29:46 +01:00
Sergey Lisov
18af8be0e6 Kernel: Set EFLAGS/RFLAGS to a known-good value on signal entry 2022-12-10 13:11:49 +01:00
Liav A
aa9fab9c3a Kernel/FileSystem: Convert the mount table from Vector to IntrusiveList
The fact that we used a Vector meant that even if creating a Mount
object succeeded, we were still at a risk that appending to the actual
mounts Vector could fail due to OOM condition. To guard against this,
the mount table is now an IntrusiveList, which always means that when
allocation of a Mount object succeeded, then inserting that object to
the list will succeed, which allows us to fail early in case of OOM
condition.
2022-12-09 23:29:33 -07:00
Liav A
d4b65f644e Kernel: Allow opening some device nodes sparingly for jailed processes
From now on, we don't allow jailed processes to open all device nodes in
/dev, but only allow jailed processes to open /dev/full, /dev/zero,
/dev/null, and various TTY and PTY devices (and not including virtual
consoles) so we basically restrict applications to what they can do when
they are in jail.
The motivation for this type of restriction is to ensure that even if a
remote code execution occurred, the damage that can be done is very
small.
We also don't restrict reading and writing on device nodes that were
already opened, because that limit seems not useful, especially in the
case where we do want to provide an OpenFileDescription to such device
but nothing further than that.
2022-12-09 23:09:00 -07:00
Liav A
6a555af1f1 Kernel: Add callback on ".." directory entry for a TmpFS root directory 2022-12-09 22:59:08 -07:00
Filiph Sandström
83380ebebc Kernel/aarch64: Initialize components that are already working
`SysFSComponentRegistry`, `ProcFSComponentRegistry` and
`attach_null_device` "just work" already; let's include them to match
x86_64 as closely as possible.
2022-12-08 09:20:27 +00:00
Thomas Queiroz
07f1aad3dd Kernel: Add missing VERIFY in MM::allocate_committed_physical_page 2022-12-07 16:31:16 +00:00
Thomas Queiroz
c681330450 Kernel: Don't panic if MemoryManager::find_free_physical_page fails 2022-12-07 16:31:16 +00:00
Thomas Queiroz
8e8ea99bf3 Kernel: Return nullptr instead of PANICking in KmallocSlabHeap
I dared to return nullptr :^)
2022-12-07 16:31:16 +00:00
Linus Groh
6e19ab2bbc AK+Everywhere: Rename String to DeprecatedString
We have a new, improved string type coming up in AK (OOM aware, no null
state), and while it's going to use UTF-8, the name UTF8String is a
mouthful - so let's free up the String name by renaming the existing
class.
Making the old one have an annoying name will hopefully also help with
quick adoption :^)
2022-12-06 08:54:33 +01:00
Andreas Kling
d8a3e2fc4e Kernel: Don't memset() allocated memory twice in kcalloc()
This patch adds a way to ask the allocator to skip its internal
scrubbing memset operation. Before this change, kcalloc() would scrub
twice: once internally in kmalloc() and then again in kcalloc().

The same mechanism already existed in LibC malloc, and this patch
brings it over to the kernel heap allocator as well.

This solves one FIXME in kcalloc(). :^)
2022-12-05 10:29:18 +01:00
Linus Groh
babfc13c84 Everywhere: Remove 'clang-format off' comments that are no longer needed
https://github.com/SerenityOS/serenity/pull/15654#issuecomment-1322554496
2022-12-03 23:52:23 +00:00
Linus Groh
d26aabff04 Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
Vitriol1744
e3e1566fd7 Kernel: Implement PIT::set_periodic() and PIT::set_non_periodic() 2022-12-03 23:10:36 +00:00
Liav A
69f41eb062 Kernel: Reject create links on paths that were not unveiled as writable
This solves one of the security issues being mentioned in issue #15996.
We simply don't allow creating hardlinks on paths that were not unveiled
as writable to prevent possible bypass on a certain path that was
unveiled as non-writable.
2022-12-03 11:00:34 -07:00
Liav A
0bb7c8f4c4 Kernel+SystemServer: Don't hardcode coredump directory path
Instead, allow userspace to decide on the coredump directory path. By
default, SystemServer sets it to the /tmp/coredump directory, but users
can now change this by writing a new path to the sysfs node at
/sys/kernel/variables/coredump_directory, and also to read this node to
check where coredumps are currently generated at.
2022-12-03 05:56:59 -07:00
Liav A
7dcf8f971b Kernel: Rename SysFSSystemBoolean => SysFSSystemBooleanVariable 2022-12-03 05:56:59 -07:00
Liav A
95d8aa2982 Kernel: Allow read access sparingly to some /sys/kernel directory nodes
Those nodes are not exposing any sensitive information so there's no
harm in exposing them.
2022-12-03 05:47:58 -07:00
Liav A
1ca0ac5207 Kernel: Disallow jailed processes to read files in /sys/kernel directory
By default, disallow reading of values in that directory. Later on, we
will enable sparingly read access to specific files.

The idea that led to this mechanism was suggested by Jean-Baptiste
Boric (also known as boricj in GitHub), to prevent access to sensitive
information in the SysFS if someone adds a new file in the /sys/kernel
directory.
2022-12-03 05:47:58 -07:00
Liav A
2e55956784 Kernel: Forbid access to /sys/kernel/power_state for Jailed processes
There's simply no benefit in allowing sandboxed programs to change the
power state of the machine, so disallow writes to the mentioned node to
prevent malicious programs to request that.
2022-12-03 05:47:58 -07:00
Andreas Kling
4277e2d58f Kernel: Add some spec links and comments to sys$posix_fallocate() 2022-11-29 11:09:19 +01:00
Andreas Kling
961e1e590b Kernel: Make sys$posix_fallocate() fail with ENODEV on non-regular files
Previously we tried to determine if `fd` refers to a non-regular file by
doing a stat() operation on the file.

This didn't work out very well since many File subclasses don't
actually implement stat() but instead fall back to failing with EBADF.

This patch fixes the issue by checking for regular files with
File::is_regular_file() instead.
2022-11-29 11:09:19 +01:00
Andreas Kling
4dd148f07c Kernel: Add File::is_regular_file()
This makes it easy and expressive to check if a File is a regular file.
2022-11-29 11:09:19 +01:00
Andreas Kling
9249bcb5aa Kernel: Remove unnecessary FIXME in sys$posix_fallocate()
This syscall doesn't need to do anything for ENOSPC, as that is already
handled by its callees.
2022-11-29 11:09:19 +01:00
Keegan Saunders
89b23c473a LibC: Use uintptr_t for __stack_chk_guard
We used size_t, which is a type that is guarenteed to be large
enough to hold an array index, but uintptr_t is designed to be used
to hold pointer values, which is the case of stack guards.
2022-11-29 11:04:21 +01:00
Liav A
718ae68621 Kernel+LibCore+LibC: Implement support for forcing unveil on exec
To accomplish this, we add another VeilState which is called
LockedInherited. The idea is to apply exec unveil data, similar to
execpromises of the pledge syscall, on the current exec'ed program
during the execve sequence. When applying the forced unveil data, the
veil state is set to be locked but the special state of LockedInherited
ensures that if the new program tries to unveil paths, the request will
silently be ignored, so the program will continue running without
receiving an error, but is still can only use the paths that were
unveiled before the exec syscall. This in turn, allows us to use the
unveil syscall with a special utility to sandbox other userland programs
in terms of what is visible to them on the filesystem, and is usable on
both programs that use or don't use the unveil syscall in their code.
2022-11-26 12:42:15 -07:00
sin-ack
3b03077abb Kernel: Update the ".." inode for directories after a rename
Because the ".." entry in a directory is a separate inode, if a
directory is renamed to a new location, then we should update this entry
the point to the new parent directory as well.

Co-authored-by: Liav A <liavalb@gmail.com>
2022-11-25 17:33:05 +01:00
Andreas Kling
5556b27e38 Kernel: Update tv_nsec field when using utimensat() with UTIME_NOW
We were only updating the tv_sec field and leaving UTIME_NOW in tv_nsec.
2022-11-24 16:56:27 +01:00
Andreas Kling
a9d55ddf57 Kernel/TmpFS: Update mtime instead of ctime when asked to update mtime 2022-11-24 16:56:27 +01:00
Andreas Kling
10fa72d451 Kernel: Use AK::Time for InodeMetadata timestamps instead of time_t
Before this change, we were truncating the nanosecond part of file
timestamps in many different places.
2022-11-24 16:56:27 +01:00
Andreas Kling
fb00d3ed25 Kernel+lsirq: Track per-CPU IRQ handler call counts
Each GenericInterruptHandler now tracks the number of calls that each
CPU has serviced.

This takes care of a FIXME in the /sys/kernel/interrupts generator.

Also, the lsirq command line tool now displays per-CPU call counts.
2022-11-19 15:39:30 +01:00
Andreas Kling
94b514b981 Kernel: Add MAX_CPU_COUNT global constant
Instead of just hard-coding the x86 Processor array to size 64,
we now use a named constant that you can also reference elsewhere. :^)
2022-11-19 15:39:30 +01:00
Andreas Kling
9b3db63e14 Kernel: Rename GenericInterruptHandler "invoking count" to "call count" 2022-11-19 15:39:30 +01:00
Steffen Rusitschka
7725042235 Kernel: Fix includes when building aarch64
This patch fixes some include problems on aarch64. aarch64 is still
currently broken but this will get us back to the underlying problem
of FloatExtractor.
2022-11-18 16:25:33 -08:00
Liav A
9559682f5c Kernel: Disallow jail creation from a process within a jail
We now disallow jail creation from a process within a jail because there
is simply no valid use case to allow it, and we will probably not enable
this behavior (which is considered a bug) again.

Although there was no "real" security issue with this bug, as a process
would still be denied to join that jail, there's an information reveal
about the amount of jails that are or were present in the system.
2022-11-13 16:58:54 -07:00
b14ckcat
9baa521b04 Kernel/USB: Use proper verbs for Pipe transfer methods 2022-11-12 09:08:02 -07:00