serenity

mirror of https://github.com/SerenityOS/serenity.git synced 2025-01-23 09:51:57 -05:00

Author	SHA1	Message	Date
logkos	3a610931f5	Kernel/Syscalls: Require `inet` promise for AF_INET6 domain	2024-10-05 12:52:10 -04:00
Liav A.	b93ca74d81	Kernel: Add a prctl option to enter jail mode until an execve syscall In addition to the already existing option to enter jail mode (which is set indefinitely), there should be a less restrictive option that should allow exiting jail mode when doing the execve syscall. This option will be useful for programs that need this kind of security layer only in their runtime, but they're meant to actually initiate another program in the end.	2024-10-03 12:39:45 +02:00
Liav A.	b90a36d2a9	Kernel+Userland: Rename jailed => jailed_until_exit In all instances, it should be clear that the jailing of a process is ending when the process exits. This is a preparation before introducing another option to set a process as jailed until it calls the execve syscall.	2024-10-03 12:39:45 +02:00
Liav A.	fdf3e0aca1	Kernel: Don't assume sizes of needed buffers early in the execve syscall Instead, start by trying to read a buffer with size of Elf_Ehdr, and check it for the shebang sign. If it's indeed an executable with shebang then read again from the file, now with PAGE_SIZE size, which should suffice for finding the interpreter path. However, if the executable is an ELF, we quickly validate it and then pass the preliminary buffer to the find_elf_interpreter_for_executable method. That method calculates the last byte offset which is needed to read all of the program headers, so we don't just assume 4096 bytes is sufficient anymore. The same pattern is applied when loading the interpreter ELF main header and its program headers.	2024-09-01 20:52:55 +02:00
Liav A.	4aec3f4ef9	Kernel+Userland: Simplify loading of an ELF interpreter path The LibELF validate_program_headers method tried to do too many things at once, and as a result, we had an awkward return type from it. To be able to simplify it, we no longer allow passing a StringBuilder* but instead we require passing an Optional<Elf_Phdr> by reference so it can be filled with the actual ELF program header that corresponds to an INTERP header, if one is found. As a result, we ensure that only certain implementations that actually care about the ELF interpreter path will actually try to load it on their own and if they fail, they can have better diagnostics for an invalid INTERP header. This change also fixes a bug in which we failed to execute an ELF program if the INTERP header is located outside the first 4KiB page of the ELF file, as the kernel previously didn't have support for looking beyond that for that header.	2024-07-21 15:38:52 +02:00
Liav A.	c0f55d4b11	Kernel: Add a check on ELF interpreter to verify we open a regular file While extremely unlikely, it's possible to change the dynamic loader to a non regular file, which will result in a kernel panic upon VERIFY of the `interpreter_description->inode()` statement.	2024-07-21 15:38:52 +02:00
Liav A.	03ae9fdb0a	Kernel: Check condition earlier for ELF file type It makes no sense to do all of the loading work just to figure out that the ELF file is an object file that is a result of compiling and not an actual executable. In addition to that, we should disallow running coredumps as well, so the condition is changed now to only allow ET_DYN or ET_EXEC ELF files.	2024-07-21 15:38:52 +02:00
Liav A.	0e6624dc86	Kernel: Introduce the unshare syscall family These 2 syscalls are responsible for unsharing resources in the system, such as hostname, VFS root contexts and process lists. Together with an appropriate userspace implementation, these syscalls could be used for creating a sandbox environment (containers) for user programs.	2024-07-21 11:44:23 +02:00
Liav A.	e52abd4c09	Kernel: Introduce the HostnameContext class Similarly to VFSRootContext and ScopedProcessList, this class intends to form resource isolation as well. We add this class as an infrastructure preparation of hostname contexts which should allow processes to obtain different hostnames on the same machine.	2024-07-21 11:44:23 +02:00
Liav A.	3692af528e	Kernel: Move most of VirtualFileSystem code to be in a namespace There's no point in constructing an object just for the sake of keeping a state that can be touched by anything in the kernel code. Let's reduce everything to be in a C++ namespace named with the previous name "VirtualFileSystem" and keep a smaller textual-footprint struct called "VirtualFileSystemDetails". This change also cleans up old "friend class" statements that were no longer needed, and moves methods from the VirtualFileSystem code to more appropriate places as well. Please note that the method of locking all filesystems during shutdown is removed, as in that place there's no meaning to actually locking all filesystems because of running in kernel mode entirely.	2024-07-21 11:44:23 +02:00
Liav A.	4370bbb3ad	Kernel+Userland: Introduce the copy_mount syscall This new syscall will be used by the upcoming runc (run-container) utility. In addition to that, this syscall allows userspace to neatly copy RAMFS instances to other places, which was not possible in the past.	2024-07-21 11:44:23 +02:00
Liav A.	dd59fe35c7	Kernel+Userland: Reduce jails to be a simple boolean flag The whole concept of Jails was far more complicated than I actually want it to be, so let's reduce the complexity of how it works from now on. Please note that we always leaked the attach count of a Jail object in the fork syscall if it failed midway. Instead, we should have attached to the jail just before registering the new Process, so we don't need to worry about unsuccessful Process creation. The reduction of complexity in regard to jails means that instead of relying on jails to provide PID isolation, we could simplify the whole idea of them to be a simple SetOnce, and let the ProcessList (now called ScopedProcessList) be responsible for this type of isolation. Therefore, we apply the following changes to do so: - We make the Jail concept no longer a class of its own. Instead, we simplify the idea of being jailed to a simple ProtectedValues boolean flag. This means that we no longer check for matching jail pointers anywhere in the Kernel code. To set a process as jailed, a new prctl option was added to set a Kernel SetOnce boolean flag (so it cannot change ever again). - We provide Process & Thread methods to iterate over process lists. A process can either iterate on the global process list, or if it's attached to a scoped process list, then only over that list. This essentially replaces the need of checking the Jail pointer of a process when iterating over process lists.	2024-07-21 11:44:23 +02:00
Liav A.	91c87c5b77	Kernel+Userland: Prepare for considering VFSRootContext when mounting Expose some initial interfaces in the mount-related syscalls to select the desired VFSRootContext, by specifying the VFSRootContext index number. For now there's still no way to create a different VFSRootContext, so the only valid IDs are -1 (for currently attached VFSRootContext) or 1 for the first userspace VFSRootContext.	2024-07-21 11:44:23 +02:00
Liav A.	01e1af732b	Kernel/FileSystem: Introduce the VFSRootContext class The VFSRootContext class, as its name suggests, holds a context for a root directory with its mount table and the root custody/inode in the same class. The idea is derived from the Linux mount namespace mechanism. It mimics the concept of the ProcessList object, but it is adjusted for a root directory tree context. In contrast to the ProcessList concept, processes that share the default VFSRootContext can't see other VFSRootContext related properties such as the mount table and root custody/inode. To accommodate this change progressively, we internally create 2 main VFS root contexts for now - one for kernel processes (as they don't need to care about VFS root contexts for the most part), and another for all userspace programs. This separation allows us to continue pretending for userspace that everything is "normal" as it used to be, until we introduce proper interfaces in the mount-related syscalls as well as in the SysFS. We make VFSRootContext objects listed, as another preparation before we could expose interfaces to userspace. As a result, the PowerStateSwitchTask now iterates on all contexts and tears them down one by one.	2024-07-21 11:44:23 +02:00
Ryan Castellucci	a2a6bc5348	Documentation: Fix some minor ESL grammar issues There are a few instances where comments and documentation have minor grammar issues likely resulting from English being the author's second language. This PR fixes several such cases, changing to idiomatic English and resolving where it is unclear whether the user or program/code is being referred to.	2024-07-03 00:17:46 +02:00
Liav A.	ecc9c5409d	Kernel: Ignore dirfd if absolute path is given in VFS-related syscalls To be able to do this, we add a new class called CustodyBase, which can be resolved on-demand internally in the VirtualFileSystem resolving path code. When being resolved, CustodyBase will return a known custody if it was constructed with one. If that's not the case, it will provide the root custody if the original path is absolute. Lastly, if that's not the case either, it will resolve the given dirfd to provide a Custody object.	2024-06-01 19:25:15 +02:00
implicitfield	4574a8c334	Kernel+LibC+LibCore: Implement `mknodat(2)`	2024-05-14 22:30:39 +02:00
implicitfield	05cf1327ed	Kernel: Make utimensat ignore the dirfd when given an absolute path	2024-05-14 22:30:39 +02:00
Liav A.	15ddc1f17a	Kernel+Userland: Reject W->X prot region transition after a prctl call We add a prctl option which would be called once after the dynamic loader has finished doing text relocations before calling the actual program entry point. This change makes it much more obvious when we are allowed to change a region protection access from being writable to executable. The dynamic loader should be able to do this, but after a certain point it is obvious that such a mechanism should be disabled.	2024-05-14 12:41:51 -06:00
Liav A.	e756567341	Kernel+Userland: Convert process syscall region enforce flag to SetOnce This flag is set only once, and should never reset once it has been set, making it an ideal SetOnce use-case. It also simplifies the expected conditions for the enabling prctl call, as we don't expect a boolean flag, but rather the specific prctl option will always set (enable) Process' AddressSpace syscall region enforcing.	2024-05-14 12:41:51 -06:00
Dan Klishch	cc5bacf886	Kernel: Allow annotating initially loaded executable segments This allows marking regions as VirtualMemoryRangeFlags::SyscallCode in static executables.	2024-05-07 16:36:38 -06:00
Sönke Holz	243d7003a2	Kernel+LibC+LibELF: Move TLS handling to userspace This removes the allocate_tls syscall and adds an archctl option to set the fs_base for the current thread on x86-64, since you can't set that register from userspace. enter_thread_context loads the fs_base for the next thread on each context switch. This also moves tpidr_el0 (the thread pointer register on AArch64) to the register state, so it gets properly saved/restored on context switches. The userspace TLS allocation code is kept pretty similar to the original kernel TLS code, aside from a couple of style changes. We also have to add a new argument "tls_pointer" to SC_create_thread_params, as we otherwise can't prevent race conditions between setting the thread pointer register and signal handling code that might be triggered before the thread pointer was set, which could use TLS.	2024-04-19 16:46:47 -06:00
Andrew Kaster	a65c385057	Kernel: Don't try to copy empty Vector in sys$recvmsg If there are no fds to copy in a message with proper space for an SCM_RIGHTS set of cmsg headers, then don't try to copy them. This avoids a Kernel panic when recvmsg-ing, as copy_to_user(p, 0, 0) hits a VERIFY.	2024-04-19 16:38:55 -04:00
Dan Klishch	5ed7cd6e32	Everywhere: Use east const in more places These changes are compatible with clang-format 16 and will be mandatory when we eventually bump clang-format version. So, since there are no real downsides, let's commit them now.	2024-04-19 06:31:19 -04:00
Sönke Holz	04ca9f393f	Kernel/riscv64: Implement create_thread	2024-03-25 14:10:05 -06:00
Sönke Holz	65724efac3	Kernel/riscv64: Implement fork	2024-03-25 14:10:05 -06:00
Sönke Holz	faede8c93a	Kernel/riscv64: Implement execve	2024-03-25 14:10:05 -06:00
Idan Horowitz	e38ccebfc8	Kernel: Stop swallowing thread unblocks while process is stopped This easily led to kernel deadlocks if the stopped thread held an important global mutex (like the disk cache lock) while blocking. Resolve this by ensuring stopped threads have a chance to return to the userland boundary before actually stopping.	2024-02-10 08:42:53 +01:00
Idan Horowitz	458e990b7b	Kernel: Stop locking the scheduler spinlock before the ptrace mutex Locking a mutex while holding a spinlock is always wrong, but in the case of the scheduler lock, it also causes an assertion failure. (Which would be triggered by 2 separate threads trying to ptrace at the same time).	2024-02-10 08:42:53 +01:00
hanaa12G	7abda6a36f	Kernel: Add new sysconf option `_SC_GETGR_R_SIZE_MAX`	2024-01-06 04:59:50 -07:00
Idan Horowitz	519214697b	Kernel: Mark sys$getsockname as not needing the big process lock This syscall does not access any big process lock protected resources.	2023-12-26 19:20:21 +01:00
Idan Horowitz	ed5406e47d	Kernel: Mark sys$getpeername as not needing the big process lock This syscall does not access any big process lock protected resources.	2023-12-26 19:20:21 +01:00
Idan Horowitz	24a60c5a10	Kernel: Mark sys$ioctl as not needing the big process lock This syscall does not access any big process lock protected resources.	2023-12-26 19:20:21 +01:00
Idan Horowitz	d63667dbf1	Kernel: Mark sys$kill_thread as not needing the big process lock This syscall does not access any big process lock protected resources.	2023-12-26 19:20:21 +01:00
Idan Horowitz	b44628c1fb	Kernel: Mark sys$join_thread as not needing the big process lock This syscall does not access any big process lock protected resources.	2023-12-26 19:20:21 +01:00
Idan Horowitz	82e6090f47	Kernel: Mark sys$detach_thread as not needing the big process lock This syscall does not access any big process lock protected resources.	2023-12-26 19:20:21 +01:00
Idan Horowitz	b49a0e2c61	Kernel: Mark sys$create_thread as not needing the big process lock Now that the master TLS region is spinlock protected, this syscall does not access any big process lock protected resources.	2023-12-26 19:20:21 +01:00
Idan Horowitz	6a4b93b3e0	Kernel: Protect processes' master TLS with a fine-grained spinlock This moves it out of the scope of the big process lock, and allows us to wean some syscalls off it, starting with sys$allocate_tls.	2023-12-26 19:20:21 +01:00
Idan Horowitz	a49b7e92eb	Kernel: Shrink instead of expand sigaltstack range to page boundaries Since the POSIX sigaltstack manpage suggests allocating the stack region using malloc(), and many heap implementations (including ours) store heap chunk metadata in memory just before the vended pointer, we would end up zeroing the metadata, leading to various crashes.	2023-12-24 16:11:35 +01:00
Idan Horowitz	1bea780a7f	Kernel: Reject loading ELF files with no loadable segments If there are no loadable segments then there can't be any code to execute either. This resolves a crash these kinds of ELF files would cause from the directly following VERIFY statement.	2023-12-15 21:36:25 +01:00
Idan Horowitz	2a6b492c7f	Kernel: Copy over TLS region size and alignment when forking Previously we would unintentionally leave them zero-initialized, resulting in any threads created post fork (but without execve) having invalid thread local storage pointers stored in their FS register.	2023-12-15 21:36:03 +01:00
Daniel Bertalan	45d81dceed	Everywhere: Replace `ElfW(type)` macro usage with `Elf_type` This works around a `clang-format-17` bug which caused certain usages to be misformatted and fail to compile. Fixes #8315	2023-12-01 10:02:39 +02:00
Liav A	5dba1dedb7	Kernel: Don't warn when running dynamically-linked ELF without PT_INTERP We could technically copy the dynamic loader to another path and run it from there, so let's not assume paths. If the user is so determined to do such a thing, then a warning is quite meaningless.	2023-11-27 09:27:34 -07:00
Idan Horowitz	16a53c811e	Kernel: Treat a backlog argument of 0 to listen() as if it was 1 As per POSIX, the behavior of listen() with a backlog value of 0 is implementation defined: "A backlog argument of 0 may allow the socket to accept connections, in which case the length of the listen queue may be set to an implementation-defined minimum value." Since creating a socket that can't accept any connections seems relatively useless, and as other platforms (Linux, FreeBSD, etc) chose to support accepting connections with this backlog value, support it as well by normalizing it to 1.	2023-11-25 16:34:38 +01:00
Sönke Holz	da88d766b2	Kernel/riscv64: Make the kernel compile This commit inserts TODOs into all necessary places to make the kernel compile on riscv64!	2023-11-10 15:51:31 -07:00
Uku Loskit	ecbb1df01b	Kernel/Syscalls: Allow root to ptrace any process Previously root (euid=0) was not able to ptrace any dumpable process as expected. This change fixes this.	2023-11-06 10:03:07 +01:00
Romain Chardiny	6d31d81309	Kernel: Allow negative value for backlog in sys$listen	2023-11-04 17:35:54 +01:00
Liav A	1b00618fd9	Kernel+Userland: Replace the beep syscall with the new /dev/beep device There's no need to have a separate syscall for this kind of functionality, as we can just have a device node in /dev, called "beep", that allows writing tone generation packets to emulate the same behavior. In addition to that, we remove LibC sysbeep function, as this function was never used by any C program, nor was it standardized in any way. Instead, we move the userspace implementation to LibCore.	2023-11-03 15:19:33 +01:00
kleines Filmröllchen	398d271a46	Kernel: Share Processor class (and others) across architectures About half of the Processor code is common across architectures, so let's share it with a templated base class. Also, other code that can be shared in some ways, like FPUState and TrapFrame functions, is adjusted here. Functions which cannot be shared trivially (without internal refactoring) are left alone for now.	2023-10-03 16:08:29 -06:00
Liav A	cbaa3465a8	Kernel: Add jail semantics to methods iterating over thread lists We should consider whether the selected Thread is within the same jail or not. Therefore let's make it clear to callers with jail semantics if a called method checks if the desired Thread object is within the same jail. As for Thread::for_each_* methods, currently nothing in the kernel codebase needs iteration with consideration for jails, so the old Thread::for_each* were simply renamed to include "ignoring_jails" suffix in their names.	2023-09-15 11:06:48 -06:00
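
As a rough illustration of the page-boundary handling described in commit a49b7e92eb above (shrinking rather than expanding a sigaltstack range), the following sketch computes the largest page-aligned sub-range of a caller-supplied buffer. It is not the kernel's code: the 4096-byte page size, the `Range` struct, and the `shrink_to_page_boundaries` helper are all assumptions made up for this example, and address overflow is ignored.

```cpp
#include <cstdint>
#include <optional>

// Assumed page size, for illustration only.
constexpr std::uintptr_t kPageSize = 4096;

// Hypothetical type describing a half-open byte range [base, base + size).
struct Range {
    std::uintptr_t base;
    std::uintptr_t size;
};

// Hypothetical helper: returns the largest page-aligned range that lies
// entirely inside [base, base + size), or nullopt if no whole page fits.
// Rounding the start up and the end down means no byte outside the
// caller's buffer is ever included in the result.
std::optional<Range> shrink_to_page_boundaries(std::uintptr_t base, std::uintptr_t size)
{
    // Round the start address up to the next page boundary.
    std::uintptr_t start = (base + kPageSize - 1) / kPageSize * kPageSize;
    // Round the end address down to the previous page boundary.
    std::uintptr_t end = (base + size) / kPageSize * kPageSize;
    if (end <= start)
        return std::nullopt;
    return Range { start, end - start };
}
```

For a buffer at 0x1010 that is 0x3000 bytes long, this yields the range 0x2000 to 0x4000. Expanding outward instead would start the range at 0x1000 and so cover the bytes just before 0x1010, which is where the commit message says heap chunk metadata may live.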


1225 commits