Commit graph

248 commits

Author SHA1 Message Date
Andreas Kling
32e93c8808 Kernel: Mark write_cr0() and write_cr4() as UNMAP_AFTER_INIT
This removes a very useful tool for attackers trying to disable
SMAP/SMEP/etc. :^)
2021-02-19 20:23:05 +01:00
Andreas Kling
6136faa4eb Kernel: Add .unmap_after_init section for code we don't need after init
You can now declare functions with UNMAP_AFTER_INIT and they'll get
segregated into a separate kernel section that gets completely
unmapped at the end of initialization.

This can be used for anything we don't need to call once we've booted
into userspace.

There are two nice things about this mechanism:

- It allows us to free up entire pages of memory for other use.
  (Note that this patch does not actually make use of the freed
  pages yet, but in the future we totally could!)

- It allows us to get rid of obviously dangerous gadgets like
  write-to-CR0 and write-to-CR4 which are very useful for an attacker
  trying to disable SMAP/SMEP/etc.

I've also made sure to include a helpful panic message in case you
hit a kernel crash because of this protection. :^)
2021-02-19 20:23:05 +01:00
Andreas Kling
da100f12a6 Kernel: Add helpers for manipulating x86 control registers
Use read_cr{0,2,3,4} and write_cr{0,3,4} helpers instead of inline asm.
2021-02-19 20:23:05 +01:00
Andreas Kling
5f610417d0 Kernel: Remove kprintf()
There are no remaining users of this API.
2021-02-17 16:33:43 +01:00
Andreas Kling
8ee42e47df Kernel: Mark a handful of things in CPU.cpp as READONLY_AFTER_INIT 2021-02-14 18:12:00 +01:00
Andreas Kling
f0a1d9bfa5 Kernel: Mark the x86 IDT as READONLY_AFTER_INIT
We never need to modify the interrupt descriptor table after finishing
initialization, so let's make it an error to do so.
2021-02-14 18:12:00 +01:00
Andreas Kling
a10accd48c Kernel: Print a helpful panic message for READONLY_AFTER_INIT crashes 2021-02-14 18:12:00 +01:00
Andreas Kling
d8013c60bb Kernel: Add mechanism to make some memory read-only after init finishes
You can now use the READONLY_AFTER_INIT macro when declaring a variable
and we will put it in a special ".ro_after_init" section in the kernel.

Data in that section remains writable during the boot and init process,
and is then marked read-only just before launching the SystemServer.

This is based on an idea from the Linux kernel. :^)
2021-02-14 18:11:32 +01:00
Andreas Kling
0e92a80434 Kernel: Add some bits of randomness to kernel stack pointers
Since kernel stacks are much smaller (64 KiB) than userspace stacks,
we only add a small bit of randomness here (0-256 bytes, 16b aligned.)

This makes the location of the task context switch buffer not be
100% predictable. Note that we still also add extra randomness upon
syscall entry, so this patch primarily affects context switching.
2021-02-14 12:30:07 +01:00
Andreas Kling
10b7f6b77e Kernel: Mark handle_crash() as [[noreturn]] 2021-02-14 11:47:14 +01:00
Andreas Kling
09b1b09c19 Kernel: Assert if rounding-up-to-page-size would wrap around to 0
If we try to align a number above 0xfffff000 to the next multiple of
the page size (4 KiB), it would wrap around to 0. This is most likely
never what we want, so let's assert if that happens.
2021-02-14 10:01:50 +01:00
Andreas Kling
b712345c92 Kernel: Use PANIC() in a bunch of places :^) 2021-02-14 09:36:58 +01:00
Andreas Kling
34a83aba71 Kernel: Convert klog() => dbgln()/dmesgln() in Arch/i386/CPU.cpp 2021-02-13 21:51:16 +01:00
Tom
b445f15131 Kernel: Avoid flushing the tlb if there's only one thread
If we're flushing user space pointers and the process only has one
thread, we do not need to broadcast this to other processors as
they will all discard that request anyway.
2021-02-13 19:46:45 +01:00
Andreas Kling
1f277f0bd9 Kernel: Convert all *Builder::appendf() => appendff() 2021-02-09 19:18:13 +01:00
Andreas Kling
f39c2b653e Kernel: Reorganize ptrace implementation a bit
The generic parts of ptrace now live in Kernel/Syscalls/ptrace.cpp
and the i386 specific parts are moved to Arch/i386/CPU.cpp
2021-02-08 19:34:41 +01:00
AnotherTest
09a43969ba Everywhere: Replace dbgln<flag>(...) with dbgln_if(flag, ...)
Replacement made by `find Kernel Userland -name '*.h' -o -name '*.cpp' | sed -i -Ee 's/dbgln\b<(\w+)>\(/dbgln_if(\1, /g'`
2021-02-08 18:08:55 +01:00
AnotherTest
1f8a633cc7 Kernel: Make Arch/i386/CPU.cpp safe to run through clang-format
This file was far too messy, and touching it was a major pain.
Also enable clang-format linting on it.
2021-02-08 18:08:55 +01:00
Andreas Kling
b0b51c3955 Kernel: Limit the size of stack traces
Let's not allow infinitely long stack traces. Cap it at 4096 frames.
2021-02-02 18:58:26 +01:00
Tom
c531084873 Kernel: Track processor idle state and wake processors when waking threads
Attempt to wake idle processors to get threads to be scheduled more quickly.
We don't want to wait until the next timer tick if we have processors that
aren't doing anything.
2021-01-27 22:48:41 +01:00
Tom
e2f9e557d3 Kernel: Make Processor::id a static function
This eliminates the window between calling Processor::current and
the member function where a thread could be moved to another
processor. This is generally not as big of a concern as with
Processor::current_thread, but also slightly more light weight.
2021-01-27 21:12:24 +01:00
Tom
21d288a10e Kernel: Make Thread::current smp-safe
Change Thread::current to be a static function and read using the fs
register, which eliminates a window between Processor::current()
returning and calling a function on it, which can trigger preemption
and a move to a different processor, which then causes operating
on the wrong object.
2021-01-27 21:12:24 +01:00
Tom
f88a8b16d7 Kernel: Make entering and leaving critical sections atomic
We also need to store m_in_critical in the Thread upon switching,
and we need to restore it. This solves a problem where threads
moving between different processors could end up with an unexpected
value.
2021-01-27 21:12:24 +01:00
Tom
0bd558081e Kernel: Track previous mode when entering/exiting traps
This allows us to determine what the previous mode (user or kernel)
was, e.g. in the timer interrupt. This is used e.g. to determine
whether a signal handler should be set up.

Fixes #5096
2021-01-27 21:12:24 +01:00
asynts
7cf0c7cc0d Meta: Split debug defines into multiple headers.
The following script was used to make these changes:

    #!/bin/bash
    set -e

    tmp=$(mktemp -d)

    echo "tmp=$tmp"

    find Kernel \( -name '*.cpp' -o -name '*.h' \) | sort > $tmp/Kernel.files
    find . \( -path ./Toolchain -prune -o -path ./Build -prune -o -path ./Kernel -prune \) -o \( -name '*.cpp' -o -name '*.h' \) -print | sort > $tmp/EverythingExceptKernel.files

    cat $tmp/Kernel.files | xargs grep -Eho '[A-Z0-9_]+_DEBUG' | sort | uniq > $tmp/Kernel.macros
    cat $tmp/EverythingExceptKernel.files | xargs grep -Eho '[A-Z0-9_]+_DEBUG' | sort | uniq > $tmp/EverythingExceptKernel.macros

    comm -23 $tmp/Kernel.macros $tmp/EverythingExceptKernel.macros > $tmp/Kernel.unique
    comm -1 $tmp/Kernel.macros $tmp/EverythingExceptKernel.macros > $tmp/EverythingExceptKernel.unique

    cat $tmp/Kernel.unique | awk '{ print "#cmakedefine01 "$1 }' > $tmp/Kernel.header
    cat $tmp/EverythingExceptKernel.unique | awk '{ print "#cmakedefine01 "$1 }' > $tmp/EverythingExceptKernel.header

    for macro in $(cat $tmp/Kernel.unique)
    do
        cat $tmp/Kernel.files | xargs grep -l $macro >> $tmp/Kernel.new-includes ||:
    done
    cat $tmp/Kernel.new-includes | sort > $tmp/Kernel.new-includes.sorted

    for macro in $(cat $tmp/EverythingExceptKernel.unique)
    do
        cat $tmp/Kernel.files | xargs grep -l $macro >> $tmp/Kernel.old-includes ||:
    done
    cat $tmp/Kernel.old-includes | sort > $tmp/Kernel.old-includes.sorted

    comm -23 $tmp/Kernel.new-includes.sorted $tmp/Kernel.old-includes.sorted > $tmp/Kernel.includes.new
    comm -13 $tmp/Kernel.new-includes.sorted $tmp/Kernel.old-includes.sorted > $tmp/Kernel.includes.old
    comm -12 $tmp/Kernel.new-includes.sorted $tmp/Kernel.old-includes.sorted > $tmp/Kernel.includes.mixed

    for file in $(cat $tmp/Kernel.includes.new)
    do
        sed -i -E 's/#include <AK\/Debug\.h>/#include <Kernel\/Debug\.h>/' $file
    done

    for file in $(cat $tmp/Kernel.includes.mixed)
    do
        echo "mixed include in $file, requires manual editing."
    done
2021-01-26 21:20:00 +01:00
Tom
b580c005f1 Kernel: Fix possible context switch within first context switch of a thread
We were enabling interrupts too early, before the first context switch to
a thread was complete. This could then trigger another context switch
within the context switch, which lead to a crash.
2021-01-25 23:22:12 +01:00
asynts
8465683dcf Everywhere: Debug macros instead of constexpr.
This was done with the following script:

    find . \( -name '*.cpp' -o -name '*.h' -o -name '*.in' \) -not -path './Toolchain/*' -not -path './Build/*' -exec sed -i -E 's/dbgln<debug_([a-z_]+)>/dbgln<\U\1_DEBUG>/' {} \;

    find . \( -name '*.cpp' -o -name '*.h' -o -name '*.in' \) -not -path './Toolchain/*' -not -path './Build/*' -exec sed -i -E 's/if constexpr \(debug_([a-z0-9_]+)/if constexpr \(\U\1_DEBUG/' {} \;
2021-01-25 09:47:36 +01:00
asynts
1a3a0836c0 Everywhere: Use CMake to generate AK/Debug.h.
This was done with the help of several scripts, I dump them here to
easily find them later:

    awk '/#ifdef/ { print "#cmakedefine01 "$2 }' AK/Debug.h.in

    for debug_macro in $(awk '/#ifdef/ { print $2 }' AK/Debug.h.in)
    do
        find . \( -name '*.cpp' -o -name '*.h' -o -name '*.in' \) -not -path './Toolchain/*' -not -path './Build/*' -exec sed -i -E 's/#ifdef '$debug_macro'/#if '$debug_macro'/' {} \;
    done

    # Remember to remove WRAPPER_GERNERATOR_DEBUG from the list.
    awk '/#cmake/ { print "set("$2" ON)" }' AK/Debug.h.in
2021-01-25 09:47:36 +01:00
Jean-Baptiste Boric
7eaefa5aa6 Kernel: Make use of interrupts as an entropy source
Booting old computers without RDRAND/RDSEED and without a disk makes
the system severely starved for entropy. Uses interrupts as a source
to side-step that issue.

Also warn whenever the system is starved of entropy, because that's
a non-obvious failure mode.
2021-01-24 22:16:18 +01:00
asynts
9d588cc9cc Everywhere: Replace a bundle of dbg with dbgln.
These changes are arbitrarily divided into multiple commits to make it
easier to find potentially introduced bugs with git bisect.
2021-01-22 22:14:30 +01:00
Andreas Kling
928ee2c791 Kernel: Don't let signals unblock threads while handling a page fault
It was possible to signal a process while it was paging in an inode
backed VM object. This would cause the inode read to EINTR, and the
page fault handler would assert.

Solve this by simply not unblocking threads due to signals if they are
currently busy handling a page fault. This is probably not the best way
to solve this issue, so I've added a FIXME to that effect.
2021-01-21 00:14:56 +01:00
Jean-Baptiste Boric
6677ab1ccd Boot: Fix undefined Multiboot behaviors
Both ESP and GDTR are left undefined by the Multiboot specification and
OS images must not rely on these values to be valid. Fix the undefined
behaviors so that booting with PXELINUX does not triple-fault the CPU.
2021-01-19 09:03:37 +01:00
Tom
b17a889320 Kernel: Add safe atomic functions
This allows us to perform atomic operations on potentially unsafe
user space pointers.
2021-01-17 20:30:31 +01:00
asynts
872f2a3b90 Everywhere: Replace a bundle of dbg with dbgln.
These changes are arbitrarily divided into multiple commits to make it
easier to find potentially introduced bugs with git bisect.Everything:
2021-01-11 11:55:47 +01:00
asynts
11d651d447 Everywhere: Replace a bundle of dbg with dbgln.
These changes are arbitrarily divided into multiple commits to make it
easier to find potentially introduced bugs with git bisect.Everything:
2021-01-11 11:55:47 +01:00
asynts
938e5c7719 Everywhere: Replace a bundle of dbg with dbgln.
These changes are arbitrarily divided into multiple commits to make it
easier to find potentially introduced bugs with git bisect.Everything:

The modifications in this commit were automatically made using the
following command:

    find . -name '*.cpp' -exec sed -i -E 's/dbg\(\) << ("[^"{]*");/dbgln\(\1\);/' {} \;
2021-01-09 21:11:09 +01:00
Tom
0d44ee6f2b Kernel: Ignore TLB flush requests for user addresses of other processes
If a TLB flush request is broadcast to other processors and the addresses
to flush are user mode addresses, we can ignore such a request on the
target processor if the page directory currently in use doesn't match
the addresses to be flushed. We still need to broadcast to all processors
in that case because the other processors may switch to that same page
directory at any time.
2021-01-02 20:56:35 +01:00
Linus Groh
bbe787a0af Everywhere: Re-format with clang-format-11
Compared to version 10 this fixes a bunch of formatting issues, mostly
around structs/classes with attributes like [[gnu::packed]], and
incorrect insertion of spaces in parameter types ("T &"/"T &&").
I also removed a bunch of // clang-format off/on and FIXME comments that
are no longer relevant - on the other hand it tried to destroy a couple of
neatly formatted comments, so I had to add some as well.
2020-12-31 21:51:00 +01:00
Luke
865f5ed4f6 Kernel: Prevent sign bit extension when creating a PDPTE
When doing the cast to u64 on the page directory physical address,
the sign bit was being extended. This only beomes an issue when
crossing the 2 GiB boundary. At >= 2 GiB, the physical address
has the sign bit set. For example, 0x80000000.

This set all the reserved bits in the PDPTE, causing a GPF
when loading the PDPT pointer into CR3. The reserved bits are
presumably there to stop you writing out a physical address that
the CPU physically cannot handle, as the size of the reserved bits
is determined by the physical address width of the CPU.

This fixes this by casting to FlatPtr instead. I believe the sign
extension only happens when casting to a bigger type. I'm also using
FlatPtr because it's a pointer we're writing into the PDPTE.
sizeof(FlatPtr) will always be the same size as sizeof(void*).

This also now asserts that the physical address in the PDPTE is
within the max physical address the CPU supports. This is better
than getting a GPF, because CPU::handle_crash tries to do the same
operation that caused the GPF in the first place. That would cause
an infinite loop of GPFs until the stack was exhausted, causing a
triple fault.

As far as I know and tested, I believe we can now use the full 32-bit
physical range without crashing.

Fixes #4584. See that issue for the full debugging story.
2020-12-30 20:33:15 +01:00
Lenny Maiorani
b2316701a8 Everywhere: void arguments to C functions
Problem:
- C functions with no arguments require a single `void` in the argument list.

Solution:
- Put the `void` in the argument list of functions in C header files.
2020-12-26 10:10:27 +01:00
Andreas Kling
c25cf5fb56 Kernel: Panic if we're about to switch to a user thread with IOPL!=0
This is a crude protection against IOPL elevation attacks. If for
any reason we find ourselves about to switch to a user mode thread
with IOPL != 0, we'll now simply panic the kernel.

If this happens, it basically means that something tricked the kernel
into incorrectly modifying the IOPL of a thread, so it's no longer
safe to trust the kernel anyway.
2020-12-23 14:30:10 +01:00
Andreas Kling
6bfbc5f5f5 Kernel: Don't allow modifying IOPL via sys$ptrace() or sys$sigreturn()
It was possible to overwrite the entire EFLAGS register since we didn't
do any masking in the ptrace and sigreturn syscalls.

This made it trivial to gain IO privileges by raising IOPL to 3 and
then you could talk to hardware to do all kinds of nasty things.

Thanks to @allesctf for finding these issues! :^)

Their exploit/write-up: https://github.com/allesctf/writeups/blob/master/2020/hxpctf/wisdom2/writeup.md
2020-12-22 19:38:25 +01:00
Liav A
39c1783387 Kernel: Allow to install a real IRQ handler on a spurious one
IRQ 7 and 15 on the PIC architecture are used for spurious interrupts.
IRQ 7 could also be used for LPT connection, and IRQ 15 can be used for
the secondary IDE channel. Therefore, we need to allow to install a
real IRQ handler and check if a real IRQ was asserted. If so, we handle
them in the usual way.

A note on this fix - unregistering or registering a new IRQ handler
after we already registered one in the spurious interrupt handler is
not supported yet.
2020-12-21 00:19:21 +01:00
Lenny Maiorani
765936ebae
Everywhere: Switch from (void) to [[maybe_unused]] (#4473)
Problem:
- `(void)` simply casts the expression to void. This is understood to
  indicate that it is ignored, but this is really a compiler trick to
  get the compiler to not generate a warning.

Solution:
- Use the `[[maybe_unused]]` attribute to indicate the value is unused.

Note:
- Functions taking a `(void)` argument list have also been changed to
  `()` because this is not needed and shows up in the same grep
  command.
2020-12-21 00:09:48 +01:00
Tom
c455fc2030 Kernel: Change wait blocking to Process-only blocking
This prevents zombies created by multi-threaded applications and brings
our model back to closer to what other OSs do.

This also means that SIGSTOP needs to halt all threads, and SIGCONT needs
to resume those threads.
2020-12-12 21:28:12 +01:00
Tom
da5cc34ebb Kernel: Fix some issues related to fixes and block conditions
Fix some problems with join blocks where the joining thread block
condition was added twice, which lead to a crash when trying to
unblock that condition a second time.

Deferred block condition evaluation by File objects were also not
properly keeping the File object alive, which lead to some random
crashes and corruption problems.

Other problems were caused by the fact that the Queued state didn't
handle signals/interruptions consistently. To solve these issues we
remove this state entirely, along with Thread::wait_on and change
the WaitQueue into a BlockCondition instead.

Also, deliver signals even if there isn't going to be a context switch
to another thread.

Fixes #4336 and #4330
2020-12-12 21:28:12 +01:00
Tom
766db673c1 Kernel: Flush TLBs concurrently
Instead of flushing the TLB on the current processor first and then
notifying the other processors to do the same, notify the others
first, and while waiting on the others flush our own.
2020-12-02 23:49:52 +01:00
Tom
5e08ae4e14 Kernel: Fix counting interrupts
Move counting interrupts out of the handle_interrupt method so that
it is done in all cases without the interrupt handler having to
implement it explicitly.

Also make the counter an atomic value as e.g. the LocalAPIC interrupts
may be triggered on multiple processors simultaneously.

Fixes #4297
2020-12-02 23:19:59 +01:00
Tom
78f1b5e359 Kernel: Fix some problems with Thread::wait_on and Lock
This changes the Thread::wait_on function to not enable interrupts
upon leaving, which caused some problems with page fault handlers
and in other situations. It may now be called from critical
sections, with interrupts enabled or disabled, and returns to the
same state.

This also requires some fixes to Lock. To aid debugging, a new
define LOCK_DEBUG is added that enables checking for Lock leaks
upon finalization of a Thread.
2020-12-01 09:48:34 +01:00
Tom
6a620562cc Kernel: Allow passing a thread argument for new kernel threads
This adds the ability to pass a pointer to kernel thread/process.
Also add the ability to use a closure as thread function, which
allows passing information to a kernel thread more easily.
2020-11-30 13:17:02 +01:00