The behaviour of the PT_TRACEME feature has been broken for some time;
this change fixes it.
When this ptrace flag is used, the traced process should be paused
before exiting execve.
Previously, we sent the SIGSTOP signal at a stage where interrupts
were disabled, so the traced process continued executing normally
instead of pausing and waiting for the tracer.
Upon leaving a critical section (such as a SpinLock) we need to
check if we're already asynchronously invoking the Scheduler.
Otherwise we might end up triggering another context switch
as soon as we leave the scheduler lock.
Fixes #2883
For now, only the non-standard _SC_NPROCESSORS_CONF and
_SC_NPROCESSORS_ONLN are implemented.
Use them to make ninja pick a better default -j value.
While here, make the ninja package script not fail if
no other port has been built yet.
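As a rough illustration of how a build tool could use these values, here is a minimal sketch. It assumes a POSIX-like libc that exposes _SC_NPROCESSORS_ONLN through sysconf(); the helper names and the "processors + 2" heuristic are hypothetical and only show the general idea of deriving a -j default.

```cpp
#include <unistd.h>

// Returns the number of online processors, falling back to 1 when
// sysconf() reports an error (-1) or a non-positive value.
static long online_processor_count()
{
    long count = sysconf(_SC_NPROCESSORS_ONLN);
    return count > 0 ? count : 1;
}

// Hypothetical heuristic for a default job count: a couple more jobs
// than processors, so the CPUs stay busy while some jobs wait on I/O.
static long default_job_count(long processors)
{
    return processors + 2;
}
```

A caller would pass default_job_count(online_processor_count()) as the -j value when the user has not given one.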
We need to halt the BSP briefly until all APs are ready for the
first context switch, but they can't all hold the same spinlock
while doing so. So, while the APs are waiting on each other
they need to release the scheduler lock, and then once signaled
re-acquire it. This should solve some timing-dependent hangs or
crashes, most easily observed using qemu with kvm disabled.
We can now properly initialize all processors without
crashing by sending SMP IPI messages to synchronize memory
between processors.
We now initialize the APs once we have the scheduler running.
This is so that we can process IPI messages from the other
cores.
Also rework interrupt handling a bit so that it's more of a
1:1 mapping. We need to allocate non-sharable interrupts for
IPIs.
This also fixes the occasional hang/crash because all
CPUs now synchronize memory with each other.
The short-circuit path added for waiting on a queue that already had a
pending wake was able to return with interrupts disabled, which breaks
the API contract of wait_on() always returning with IF=1.
Fix this by adding a way to override the restored IF in ScopedCritical.
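The idea can be sketched in user space as follows. This is not the kernel's ScopedCritical; FakeCpu, ScopedCriticalSketch and wait_on_sketch are hypothetical stand-ins, and the sketch assumes that entering a critical section disables interrupts and that leaving it restores a saved flag.

```cpp
#include <cassert>

// Simulated per-CPU state. In a real kernel this would be EFLAGS.IF
// and a per-processor critical-section counter.
struct FakeCpu {
    bool interrupts_enabled { true };
    int critical_depth { 0 };
};

class ScopedCriticalSketch {
public:
    explicit ScopedCriticalSketch(FakeCpu& cpu)
        : m_cpu(cpu)
        , m_restore_if(cpu.interrupts_enabled) // remember the caller's IF
    {
        ++m_cpu.critical_depth;
        m_cpu.interrupts_enabled = false;
    }

    ~ScopedCriticalSketch()
    {
        assert(m_cpu.critical_depth > 0);
        --m_cpu.critical_depth;
        m_cpu.interrupts_enabled = m_restore_if;
    }

    // Override the IF value that the destructor will restore.
    void set_interrupt_flag_on_destruction(bool flag) { m_restore_if = flag; }

private:
    FakeCpu& m_cpu;
    bool m_restore_if;
};

// Sketch of a wait path whose contract is "always return with IF=1".
static void wait_on_sketch(FakeCpu& cpu, bool wake_already_pending)
{
    ScopedCriticalSketch critical(cpu);
    critical.set_interrupt_flag_on_destruction(true);
    if (wake_already_pending)
        return; // short-circuit path: still leaves with interrupts enabled
    // ... a real implementation would block the thread here ...
}
```

Without the override, a caller that entered with interrupts disabled would get them back disabled on the short-circuit path, which is the contract violation described above.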
These changes solve a number of problems with the software
context switching:
* The scheduler lock really should be held throughout context switches
* Transitioning from the initial (idle) thread to another needs to
hold the scheduler lock
* Transitioning from a dying thread to another also needs to hold
the scheduler lock
* Dying threads cannot necessarily be finalized if the processor hasn't
switched away from them yet, so flag them as active while a processor
is running them (a thread's state may change from Running to Dying
while it is actually still running)
If we're trying to walk the stack for another thread, we can
no longer retrieve the EBP register from Thread::m_tss. Instead,
we need to look at the top of the kernel stack, because all threads
not currently running were last in kernel mode. Context switches
now always trigger a brief switch to kernel mode, and Thread::m_tss
is only used to save ESP and EIP.
Fixes #2678
When delivering urgent signals to the current thread
we need to check if we should be unblocked, and if not
we need to yield to another process.
We also need to make sure that we suppress context switches
during Process::exec() so that a context switch doesn't clobber
the registers that it sets up (mainly eip). To be able to do
that we add the concept of critical sections, which are
similar to Process::m_in_irq but different in that they can be
requested at any time. While in a critical section, calls to
Scheduler::yield and Scheduler::donate_to will return instantly
without triggering a context switch, but the processor will then
asynchronously trigger a context switch once the critical section
is left.
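The deferred-yield behaviour described above can be modelled with a small state machine. This is a simplified sketch, not the kernel's code: ProcessorSketch and its members are hypothetical, and a counter stands in for the real context switch.

```cpp
#include <cassert>

struct ProcessorSketch {
    int critical_depth { 0 };
    bool yield_pending { false };
    int context_switches { 0 }; // stands in for performing a real switch

    void enter_critical() { ++critical_depth; }

    // Leaving the outermost critical section performs any deferred switch.
    void leave_critical()
    {
        assert(critical_depth > 0);
        if (--critical_depth == 0 && yield_pending) {
            yield_pending = false;
            ++context_switches;
        }
    }

    // Returns true if the switch happened immediately, false if it was
    // deferred because we are inside a critical section.
    bool yield()
    {
        if (critical_depth > 0) {
            yield_pending = true;
            return false;
        }
        ++context_switches;
        return true;
    }
};
```

Code such as exec() would bracket its register setup with enter_critical() and leave_critical(), so a yield requested in between only takes effect once the setup is complete.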
CPUs which support RDRAND do not necessarily support RDSEED. This
introduces a flag g_cpu_supports_rdseed which is set appropriately
by CPUID. This causes Haswell CPUs in particular (and probably a lot
of AMD chips) to now fail to boot with #2634, rather than with an
illegal instruction.
It seems like the KernelRng needs either an initial reseed call or
more random events added before the first call to get_good_random,
but I don't feel qualified to make that kind of change.
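For reference, RDRAND and RDSEED are reported by different CPUID leaves, which is why one can be present without the other. The sketch below decodes the two bits from raw register values. The bit positions are the architectural ones (leaf 1, ECX bit 30 for RDRAND; leaf 7 subleaf 0, EBX bit 18 for RDSEED), while the struct and function names are hypothetical.

```cpp
#include <cstdint>

// CPUID.(EAX=1):ECX bit 30 reports RDRAND support.
static constexpr uint32_t cpuid_leaf1_ecx_rdrand = 1u << 30;
// CPUID.(EAX=7,ECX=0):EBX bit 18 reports RDSEED support.
static constexpr uint32_t cpuid_leaf7_ebx_rdseed = 1u << 18;

struct RngFeatures {
    bool rdrand;
    bool rdseed;
};

// Decodes the two feature bits from register values that the caller
// obtained by executing CPUID for leaf 1 and leaf 7 (subleaf 0).
static constexpr RngFeatures decode_rng_features(uint32_t leaf1_ecx, uint32_t leaf7_ebx)
{
    return RngFeatures {
        (leaf1_ecx & cpuid_leaf1_ecx_rdrand) != 0,
        (leaf7_ebx & cpuid_leaf7_ebx_rdseed) != 0,
    };
}
```

Keeping the two flags separate lets the caller fall back to another entropy source when only RDRAND is available, instead of executing an unsupported instruction.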
We were getting a little overly memey in some places, so let's scale
things back to business-casual.
Informal language is fine in comments, commits and debug logs,
but let's keep the runtime nice and presentable. :^)
This patch adds a MappedROM abstraction to the Kernel VM subsystem.
It's basically the read-only byte buffer equivalent of a TypedMapping.
We use this in the ACPI and MP table parsers to scan for interesting
stuff in low memory instead of doing a bunch of address arithmetic.
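To give a feel for the abstraction, here is a user-space sketch of a read-only byte view with a signature scan. The class and method names are hypothetical and no real memory mapping takes place; the 16-byte stride in the usage note reflects that the ACPI RSDP signature "RSD PTR " sits on a 16-byte boundary.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <string_view>

// Read-only view over a region of bytes (hypothetical stand-in for a
// mapped ROM region).
class RomViewSketch {
public:
    RomViewSketch(const uint8_t* base, size_t size)
        : m_base(base)
        , m_size(size)
    {
    }

    // Scans the region in chunk_size steps and returns the offset of the
    // first chunk that begins with prefix, if there is one.
    std::optional<size_t> find_chunk_starting_with(std::string_view prefix, size_t chunk_size) const
    {
        if (chunk_size == 0 || prefix.size() > chunk_size)
            return std::nullopt;
        for (size_t offset = 0; offset + chunk_size <= m_size; offset += chunk_size) {
            if (std::memcmp(m_base + offset, prefix.data(), prefix.size()) == 0)
                return offset;
        }
        return std::nullopt;
    }

private:
    const uint8_t* m_base;
    size_t m_size;
};
```

A table parser would call something like rom.find_chunk_starting_with("RSD PTR ", 16) and work with the returned offset, rather than stepping raw pointers by hand.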
Let's not be paying the function call overhead for these tiny ops.
Maybe there's an argument for having fewer gadgets in the kernel but
for now we're actually seeing stac() in profiles so let's put
that above theoretical security issues.
This was supposed to be the foundation for some kind of pre-kernel
environment, but nobody is working on it right now, so let's move
everything back into the kernel and remove all the confusion.
This patch adds PageFaultResponse::OutOfMemory which informs the fault
handler that we were unable to allocate a necessary physical page and
cannot continue.
In response to this, the kernel will crash the current process. Because
we are OOM, we can't symbolicate the crash like we normally would
(since the ELF symbolication code needs to allocate), so we also
communicate to Process::crash() that we're out of memory.
Now we can survive "allocate 300 MB" (only the allocate process dies.)
This is definitely not perfect and can easily end up killing a random
innocent other process who happened to allocate one page at the wrong
time, but it's a *lot* better than panicking on OOM. :^)
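The control flow can be sketched as below. Everything here is a simplified model under stated assumptions: the enum mirrors the response named above, but the other enumerators, CrashReportSketch and both functions are hypothetical and only show how an out-of-memory result could be routed to a crash path that skips symbolication.

```cpp
// Possible outcomes of handling a page fault (sketch).
enum class PageFaultResponseSketch {
    ShouldCrash,
    OutOfMemory,
    Continue,
};

struct CrashReportSketch {
    bool crashed;
    bool symbolicated;
};

// Decide how a fault should be answered. page_available stands in for
// "a physical page could be allocated".
static PageFaultResponseSketch handle_fault(bool address_is_valid, bool page_available)
{
    if (!address_is_valid)
        return PageFaultResponseSketch::ShouldCrash;
    if (!page_available)
        return PageFaultResponseSketch::OutOfMemory;
    return PageFaultResponseSketch::Continue;
}

// React to the response. When out of memory we still crash the process,
// but skip symbolication because it would need to allocate.
static CrashReportSketch respond_to_fault(PageFaultResponseSketch response)
{
    switch (response) {
    case PageFaultResponseSketch::Continue:
        return { false, false };
    case PageFaultResponseSketch::ShouldCrash:
        return { true, true };
    case PageFaultResponseSketch::OutOfMemory:
        return { true, false };
    }
    return { true, false };
}
```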
We currently only care about debug exceptions that are triggered
by the single-step execution mode.
The debug exception is translated to a SIGTRAP, which can be caught
and handled by the tracing thread.
Also, duplicate data in dbg() and klog() calls was removed.
In addition, leakage of virtual addresses to the kernel log is prevented.
This is done by replacing the kprintf() calls that printed the leaked
data with dbg() calls.
Other kprintf() calls were replaced with klog().