This enables the APIC timer on all CPUs, which means Scheduler::timer_tick
is now called on all CPUs independently. We still don't do anything on
the APs, as doing so would instantly crash due to a number of other problems.
Similar to Process, we need to make Thread refcounted. This will solve
problems that will appear once we schedule threads on more than one
processor. This allows us to hold onto threads without necessarily
holding the scheduler lock for the entire duration.
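A minimal sketch of the idea, not the kernel code: std::shared_ptr
stands in for the kernel's ref-counting pointer and std::mutex for the
scheduler lock. The Scheduler::find/add/remove helpers are hypothetical
names used only for illustration.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for the kernel's Thread class.
struct Thread {
    int tid;
};

class Scheduler {
public:
    void add(std::shared_ptr<Thread> thread)
    {
        std::lock_guard<std::mutex> guard(m_lock);
        m_threads.push_back(std::move(thread));
    }

    // The lock is held only long enough to copy a strong reference out.
    std::shared_ptr<Thread> find(int tid)
    {
        std::lock_guard<std::mutex> guard(m_lock);
        for (auto& thread : m_threads) {
            if (thread->tid == tid)
                return thread;
        }
        return nullptr;
    }

    void remove(int tid)
    {
        std::lock_guard<std::mutex> guard(m_lock);
        for (auto it = m_threads.begin(); it != m_threads.end(); ++it) {
            if ((*it)->tid == tid) {
                m_threads.erase(it);
                return;
            }
        }
    }

private:
    std::mutex m_lock;
    std::vector<std::shared_ptr<Thread>> m_threads;
};
```

A caller that obtained a reference through find() can keep using the
Thread after remove() has dropped the scheduler's own reference, without
holding the lock in between.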
The thread joining logic hadn't been updated to account for the subtle
differences introduced by software context switching. This fixes several
race conditions related to thread destruction and joining, as well as
finalization which did not properly account for detached state and the
fact that threads can be joined after termination as long as they're not
detached.
Fixes #3596
Fixes two flaws in the thread donation logic: Scheduler::donate_to
would never really donate, but just trigger a deferred yield. And
that deferred yield never actually donated to the beneficiary.
So, when we can't immediately donate, we need to save the beneficiary
and use this information as soon as we can perform the deferred
context switch.
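Roughly, the deferred donation could look like the sketch below. The
Processor fields, leave_critical and the next_in_queue parameter are
hypothetical simplifications; a context switch is modelled as simply
assigning the current thread.

```cpp
#include <cassert>

struct Thread {
    int tid;
};

// Hypothetical per-processor state, reduced to what the sketch needs.
struct Processor {
    int critical_depth { 0 };
    bool deferred_switch_pending { false };
    Thread* pending_beneficiary { nullptr };
    Thread* current { nullptr };
};

// Returns true if we switched to the beneficiary right away.
bool donate_to(Processor& cpu, Thread& beneficiary)
{
    if (cpu.critical_depth > 0) {
        // Can't switch now: remember who should receive the time slice.
        cpu.pending_beneficiary = &beneficiary;
        cpu.deferred_switch_pending = true;
        return false;
    }
    cpu.current = &beneficiary;
    return true;
}

// Called when a critical section ends; next_in_queue is whatever the
// scheduler would pick if nobody had been donated to.
void leave_critical(Processor& cpu, Thread& next_in_queue)
{
    assert(cpu.critical_depth > 0);
    if (--cpu.critical_depth > 0 || !cpu.deferred_switch_pending)
        return;
    cpu.deferred_switch_pending = false;
    // Honor the saved beneficiary instead of doing a plain yield.
    Thread* target = cpu.pending_beneficiary ? cpu.pending_beneficiary : &next_in_queue;
    cpu.pending_beneficiary = nullptr;
    cpu.current = target;
}
```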
Fixes #3495
Since the CPU already does almost all necessary validation steps
for us, we don't really need to attempt to do this. Doing it
ourselves doesn't really work very reliably, because we'd have to
account for other processors modifying virtual memory, and we'd
have to account for e.g. pages not being able to be allocated
due to insufficient resources.
So change the copy_to/from_user (and associated helper functions)
to use the new safe_memcpy, which will return whether it succeeded
or not. The only manual validation step needed (which the CPU
can't perform for us) is making sure the pointers provided by user
mode aren't pointing to kernel mappings.
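The remaining manual check might look something like this sketch. The
kernel_base value and the is_user_range name are assumptions made for
illustration; the actual copy (and fault handling) is left to
safe_memcpy and is not modelled here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption for this sketch: kernel mappings start at 3 GiB.
constexpr std::uintptr_t kernel_base = 0xc0000000;

// Returns true only if [address, address + size) lies entirely below
// the kernel mappings. Written so that address + size cannot overflow.
bool is_user_range(std::uintptr_t address, std::size_t size)
{
    if (address >= kernel_base)
        return false;
    return size <= kernel_base - address;
}
```

Everything else (unmapped pages, read-only pages, pages that cannot be
allocated) would then surface as a failed copy rather than being
checked up front.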
To make it easier to read/write from/to either kernel or user mode
data add the UserOrKernelBuffer helper class, which will internally
either use copy_from/to_user or directly memcpy, or pass the data
through directly using a temporary buffer on the stack.
Last but not least we need to keep syscall params trivial as we
need to copy them from/to user mode using copy_from/to_user.
We need to wait until a thread is fully set up and ready for running
before attempting to deliver a signal. Otherwise we may not have a
user stack yet.
Also, remove the Skip0SchedulerPasses and Skip1SchedulerPass thread
states that we don't really need anymore with software context switching.
Fixes the kernel crash reported in #3419
We need to always return from Thread::wait_on, even when a thread
is being killed. This is necessary so that the kernel call stack
can clean up and release references held by it. Then, right before
transitioning back to user mode, we check if the thread is
supposed to die, and at that point change the thread state to
Dying to prevent further scheduling of this thread.
This addresses some possible resource leaks similar to #3073
This compiles, and contains exactly the same bugs as before.
The regex 'FIXME: PID/' should reveal all markers that I left behind, including:
- Incomplete conversion
- Issues or things that look fishy
- Actual bugs that will go wrong during runtime
Allow passing in an optional timeout to Thread::block and move
the timeout check out of Thread::Blocker. This way all Blockers
implicitly support timeouts and don't need to implement it
themselves. Do however allow them to override timeouts (e.g.
for sockets).
We need to have a Thread lock to protect threading related
operations, such as Thread::m_blocker which is used in
Thread::block.
Also, if a Thread::Blocker indicates that it should be
unblocking immediately, don't actually block the Thread
and instead return immediately in Thread::block.
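As a toy model of the resulting control flow (not the kernel
implementation): the BlockResult values, override_timeout, FlagBlocker,
SocketBlocker and the ticks_until_ready parameter are all hypothetical
and exist only to make the sketch testable outside the kernel.

```cpp
#include <cassert>
#include <optional>

enum class BlockResult { NotBlocked, WokeNormally, InterruptedByTimeout };

class Blocker {
public:
    virtual ~Blocker() = default;
    // True if the condition already holds and blocking is pointless.
    virtual bool should_unblock() const = 0;
    // Blockers may substitute their own timeout; by default the
    // caller's timeout is used unchanged.
    virtual std::optional<long> override_timeout(std::optional<long> requested) const
    {
        return requested;
    }
};

// Hypothetical blocker whose condition is decided up front.
struct FlagBlocker : Blocker {
    explicit FlagBlocker(bool is_ready) : ready(is_ready) { }
    bool should_unblock() const override { return ready; }
    bool ready;
};

// Hypothetical socket blocker that always applies its own timeout.
struct SocketBlocker : FlagBlocker {
    SocketBlocker(bool is_ready, long timeout) : FlagBlocker(is_ready), receive_timeout(timeout) { }
    std::optional<long> override_timeout(std::optional<long>) const override { return receive_timeout; }
    long receive_timeout;
};

// ticks_until_ready simulates when the awaited condition becomes true.
BlockResult block(const Blocker& blocker, std::optional<long> timeout_ticks, long ticks_until_ready)
{
    // Don't block at all if the blocker is already satisfied.
    if (blocker.should_unblock())
        return BlockResult::NotBlocked;
    // The timeout check lives here, not in each individual blocker.
    auto effective = blocker.override_timeout(timeout_ticks);
    if (effective.has_value() && *effective < ticks_until_ready)
        return BlockResult::InterruptedByTimeout;
    return BlockResult::WokeNormally;
}
```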
This fixes a regression introduced by the new software context
switching where the Kernel would not deliver a signal unless the
process is making system calls. This is because the TSS no longer
updates the CS value, so the scheduler never considered delivery
as the process always appeared to be in kernel mode. With software
context switching we can just set up the signal trampoline at
any time and when the processor returns back to user mode it'll
get executed. This should fix e.g. killing programs that are
stuck in some tight loop that doesn't make any system calls and
is only pre-empted by the timer interrupt.
Fixes #2958
By making the Process class RefCounted we don't really need
ProcessInspectionHandle anymore. This also fixes some race
conditions where a Process may be deleted while still being
used by ProcFS.
Also make sure to acquire the Process' lock when accessing
regions.
Last but not least, there's no reason why a thread can't be
scheduled while being inspected, though in practice it won't
happen anyway because the scheduler lock is held at the same
time.
Upon leaving a critical section (such as a SpinLock) we need to
check if we're already asynchronously invoking the Scheduler.
Otherwise we might end up triggering another context switch
as soon as leaving the scheduler lock.
Fixes #2883
This is something I've been meaning to do for a long time, and here we
finally go. This patch moves all sys$foo functions out of Process.cpp
and into files in Kernel/Syscalls/.
It's not exactly one syscall per file (although it could be, but I got
a bit tired of the repetitive work here..)
This makes hacking on individual syscalls a lot less painful since you
don't have to rebuild nearly as much code every time. I'm also hopeful
that this makes it easier to understand individual syscalls. :^)
We can now properly initialize all processors without
crashing by sending SMP IPI messages to synchronize memory
between processors.
We now initialize the APs once we have the scheduler running.
This is so that we can process IPI messages from the other
cores.
Also rework interrupt handling a bit so that it's more of a
1:1 mapping. We need to allocate non-sharable interrupts for
IPIs.
This also fixes the occasional hang/crash because all
CPUs now synchronize memory with each other.
These changes solve a number of problems with the software
context switching:
* The scheduler lock really should be held throughout context switches
* Transitioning from the initial (idle) thread to another needs to
hold the scheduler lock
* Transitioning from a dying thread to another also needs to hold
the scheduler lock
* Dying threads cannot necessarily be finalized if the processor hasn't
switched out of them yet, so flag them as active while a processor
is running them (the Running state may be switched to Dying while
the thread is actually still running)
The Lock class may still omit the reason, but everything else is
required to pass a reason to Thread::wait_on. This makes it
easier to diagnose why a Thread is in Queued state.
When delivering urgent signals to the current thread
we need to check if we should be unblocked, and if not
we need to yield to another process.
We also need to make sure that we suppress context switches
during Process::exec() so that we don't clobber the registers
that it sets up (eip mainly) by a context switch. To be able
to do that we add the concept of critical sections, which are
similar to Process::m_in_irq but different in that they can be
requested at any time. Calls to Scheduler::yield and
Scheduler::donate_to will return instantly without triggering
a context switch, but the processor will then asynchronously
trigger a context switch once the critical section is left.
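A sketch of how nesting could behave, under the assumption that a
per-processor depth counter is used. ScopedCritical and the Processor
fields are hypothetical names, and a context switch is modelled as a
counter increment.

```cpp
#include <cassert>

// Hypothetical per-processor state.
struct Processor {
    int critical_depth { 0 };
    bool yield_pending { false };
    int context_switches { 0 };

    void yield()
    {
        if (critical_depth > 0) {
            // Inside a critical section: note the request, return instantly.
            yield_pending = true;
            return;
        }
        ++context_switches;
    }
};

// RAII guard: enters a critical section on construction and leaves it
// on destruction. The deferred switch happens only when the outermost
// section is left.
class ScopedCritical {
public:
    explicit ScopedCritical(Processor& cpu)
        : m_cpu(cpu)
    {
        ++m_cpu.critical_depth;
    }

    ~ScopedCritical()
    {
        assert(m_cpu.critical_depth > 0);
        if (--m_cpu.critical_depth == 0 && m_cpu.yield_pending) {
            m_cpu.yield_pending = false;
            m_cpu.yield();
        }
    }

    ScopedCritical(const ScopedCritical&) = delete;
    ScopedCritical& operator=(const ScopedCritical&) = delete;

private:
    Processor& m_cpu;
};
```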
pselect() is similar to select(), but it takes its timeout
as timespec instead of as timeval, and it takes an additional
sigmask parameter.
Change the sys$select parameters to match pselect() and implement
select() in terms of pselect().
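From the userspace side, the wrapper amounts to a timeval to timespec
conversion plus a null sigmask. The sketch below uses the POSIX
pselect() declaration; the names to_timespec and select_via_pselect are
made up for illustration, and it does not try to write the remaining
time back into the caller's timeval.

```cpp
#include <cassert>
#include <ctime>
#include <sys/select.h>
#include <sys/time.h>

// Microseconds become nanoseconds; seconds carry over unchanged.
timespec to_timespec(const timeval& tv)
{
    timespec ts {};
    ts.tv_sec = tv.tv_sec;
    ts.tv_nsec = static_cast<long>(tv.tv_usec) * 1000;
    return ts;
}

// select() expressed through pselect(): same fd sets, converted
// timeout, and no signal mask change (null sigmask).
int select_via_pselect(int nfds, fd_set* readfds, fd_set* writefds, fd_set* exceptfds, const timeval* timeout)
{
    if (!timeout)
        return pselect(nfds, readfds, writefds, exceptfds, nullptr, nullptr);
    timespec ts = to_timespec(*timeout);
    return pselect(nfds, readfds, writefds, exceptfds, &ts, nullptr);
}
```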
In case WNOHANG was specified, we want to always set should_unblock to
true (which we do since commit 4402207b98), not
wait_finished -- the latter causes us to immediately return this child to our
caller, which is not what we want -- perhaps we should return another child
which has actually exited or stopped, or nobody at all.
To avoid confusion, also rename wait_finished to fits_the_spec.
This fixes service keepalive functionality in SystemServer.
We stopped using gettimeofday() in Core::EventLoop a while back,
in favor of clock_gettime() for monotonic time.
Maintaining an optimization for a syscall we're not using doesn't make
a lot of sense, so let's go back to the old-style sys$gettimeofday().
Previously, if we sent a SIGCONT to a stopped thread
and then called waitpid() with WSTOPPED on that thread before
the signal was dispatched,
then the WaitBlocker would first unblock (because the thread is stopped)
and only after that the thread would get the SIGCONT signal.
This would mean that when waitpid returns
the waitee is not stopped.
To fix this, we do not unblock the waiting thread
if the waitee thread has a pending SIGCONT.
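Expressed as a predicate, the new condition is roughly the following.
The Thread layout, the pending-signal bitmask and the function name are
simplifications assumed for this sketch.

```cpp
#include <cassert>
#include <csignal>
#include <cstdint>

enum class State { Running, Stopped };

// Hypothetical, heavily reduced view of a waitee thread.
struct Thread {
    State state { State::Running };
    std::uint32_t pending_signals { 0 };

    bool has_pending(int signal) const
    {
        return (pending_signals & (1u << (signal - 1))) != 0;
    }
};

// A stopped waitee only satisfies WSTOPPED if no SIGCONT is about to
// resume it; otherwise waitpid() would report a state that is stale by
// the time it returns.
bool should_unblock_for_stopped(const Thread& waitee)
{
    return waitee.state == State::Stopped && !waitee.has_pending(SIGCONT);
}
```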
This new subsystem includes better abstractions of how time will be
handled in the OS. We take advantage of the existing RTC timer to aid
in keeping time synchronized. This stands in contrast to how we
previously handled time-keeping in the kernel, where the PIT was
responsible for that function in addition to updating the scheduler
about ticks.
With this change, we can easily change the tick rate dynamically
and still keep the time synchronized.
In the process context, we no longer use a fixed declaration of
TICKS_PER_SECOND, but we call the TimeManagement singleton class to
provide us the right value. This allows us to use dynamic ticking in
the future, a feature known as tickless kernel.
The scheduler no longer calculates real time (Unix time) by itself,
and just calls the TimeManagement singleton class to provide
the value.
Also, we can use 2 new boot arguments:
- the "time" boot argument accepts either the value "modern", or
"legacy". If "modern" is specified, the time management subsystem will
try to set up HPET. Otherwise, for the "legacy" value, the time subsystem
will revert to using the PIT & RTC, leaving HPET disabled.
If this boot argument is not specified, the default behavior is to try
to set up HPET.
- the "hpet" boot argument accepts either the value "periodic" or
"nonperiodic". If "periodic" is specified, the HPET will scan for
periodic timers, and will assert if none are found. If only one is
found, that timer will be assigned to the time-keeping task. If more
than one is found, both the time-keeping task & the scheduler-ticking
task will be assigned to periodic timers.
If this boot argument is not specified, the default behavior is to try
to scan for HPET periodic timers. This boot argument has no effect if
HPET is disabled.
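The decision table for the two arguments can be summarized by the
sketch below. BootArguments, TimerPlan, argument_or and plan_timers are
hypothetical names, and treating an unrecognized value like the default
is an assumption of the sketch, not something stated above.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical representation of parsed boot arguments.
using BootArguments = std::map<std::string, std::string>;

struct TimerPlan {
    bool try_hpet;
    bool scan_for_periodic_timers;
};

// Returns the value for key, or fallback if the argument is absent.
static std::string argument_or(const BootArguments& args, const std::string& key, const std::string& fallback)
{
    auto it = args.find(key);
    return it == args.end() ? fallback : it->second;
}

TimerPlan plan_timers(const BootArguments& args)
{
    TimerPlan plan {};
    // "time": default is "modern", i.e. try to set up HPET.
    plan.try_hpet = argument_or(args, "time", "modern") != "legacy";
    // "hpet": default is "periodic"; ignored when HPET is disabled.
    plan.scan_for_periodic_timers = plan.try_hpet
        && argument_or(args, "hpet", "periodic") != "nonperiodic";
    return plan;
}
```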
In the hardware context, the PIT & RealTimeClock classes simply inherit
from the HardwareTimer class, and they allow us to use the old i8254 (PIT)
and RTC devices, managing them via IO ports. By default, the RTC will be
programmed to a frequency of 1024Hz. The PIT will be programmed to a
frequency close to 1000Hz.
As for the HPET, depending on whether we need to scan for periodic
timers or not, we try to set a frequency close to 1000Hz for the
time-keeping timer and scheduler-ticking timer. Also, if possible, we
try to enable the Legacy replacement feature of the HPET. This feature,
if it exists, instructs the chipset to disconnect both the i8254 (PIT)
and the RTC.
This behavior is observable on QEMU, and was verified against the source
code:
ce967e2f33
The HPETComparator class inherits from the HardwareTimer class, and is
responsible for an individual HPET comparator, which is essentially a
timer. Therefore, it needs to call the singleton HPET class to perform
HPET-related operations.
The new abstraction of hardware timers opens the door to more new
features in the foreseeable future. For example, we can change the
callback function of each hardware timer, which makes it possible to
swap missions between hardware timers, or to use a hardware
timer for other temporary missions (e.g. calibrating the LAPIC timer,
measuring the CPU frequency, etc).
Now it actually defaults to "a < b" comparison, instead of forcing you
to provide a trivial less-than comparator. Also you can pass in any
collection type that has .begin() and .end() and we'll sort it for you.
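The overload set could be shaped roughly like this. This is a
standalone sketch with a simple Lomuto partition, not the actual
implementation; it assumes random-access iterators.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Core overload: explicit range and explicit comparator.
template<typename Iterator, typename LessThan>
void quick_sort(Iterator begin, Iterator end, LessThan less_than)
{
    if (end - begin < 2)
        return;
    // Lomuto partition using the last element as the pivot.
    Iterator pivot = end - 1;
    Iterator store = begin;
    for (Iterator it = begin; it != pivot; ++it) {
        if (less_than(*it, *pivot)) {
            std::iter_swap(it, store);
            ++store;
        }
    }
    std::iter_swap(store, pivot);
    quick_sort(begin, store, less_than);
    quick_sort(store + 1, end, less_than);
}

// Default comparator: plain "a < b".
template<typename Iterator>
void quick_sort(Iterator begin, Iterator end)
{
    quick_sort(begin, end, [](const auto& a, const auto& b) { return a < b; });
}

// Convenience overload for anything with .begin() and .end().
template<typename Collection>
void quick_sort(Collection& collection)
{
    quick_sort(collection.begin(), collection.end());
}
```

With this shape, quick_sort(values) sorts ascending, and callers that
need a different order can still pass a comparator explicitly.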
We don't have to log the process name/PID/TID, dbg() automatically adds
that as a prefix to every line.
Also we don't have to do .characters() on Strings passed to dbg() :^)
This allows a process which has more than 1 thread to call exec, even
from a thread. This kills all the other threads, but it won't wait for
them to finish; it just makes sure that they are not in a running/runnable
state.
In the case where a thread does exec, the new program PID will be the
thread TID, to keep the PID == TID in the new process.
This introduces a new function inside the Process class,
kill_threads_except_self which is called on exit() too (exit with
multiple threads wasn't properly working either).
Inside the Lock class, there is the need for a new function,
clear_waiters, which removes all the waiters from the
Process::big_lock. This is needed since after an exit/exec, there should
be no other threads waiting for this lock; the threads should simply be
killed. Only queued threads should wait for this lock at this point,
since blocked threads are handled in set_should_die.