If init crashes, all other userspace processes exit too, thus rendering
the system unusable. Previously, the kernel would still keep running
even without a userland, showing just a black screen without any
indication of the issue.
We now panic the kernel, which shows a message on the console. In the
case of the CI runners, it shuts down the virtual machine, so we don't
have to wait for the 1 hour timeout if an issue arises with
SystemServer.
To declare that we don't have a PCI bus in the system we do two things:
1. Probe IO ports before enabling access -
In case we are using the QEMU ISA-PC machine type, IO probing results in
floating bus condition (returning 0xFF values), thus, we know we don't
have PCI bus on the system.
2. Allow the user to specify to not use the PCI bus at all in the kernel
commandline.
This change allow the user to request the kernel to not use any PCI
resources/devices at all.
Also, don't try to initialize devices that rely on PCI if disabled.
Arguments larger than 32bit need to be passed as a pointer on a 32bit
architectures. sys$profiling_enable has u64 event_mask argument,
which means that it needs to be passed as an pointer. Previously upper
32bits were filled by garbage.
We have 3 new components:
1. The AudioManagement singleton. This class like in other subsystems,
is responsible to find hardware audio controllers and keep a reference
to them.
2. AudioController class - this class is the parent class for hardware
controllers like the Sound Blaster 16 or Intel 82801AA (AC97). For now,
this class has simple interface for getting and controlling sample rate
of audio channels, as well a write interface for specific audio channel
but not reading from it. One AudioController object might have multiple
AudioChannel "child" objects to hold with reference counting.
3. AudioChannel class - this is based on the CharacterDevice class, and
represents hardware PCM audio channel. It facilitates an ioctl interface
which should be consistent across all supported hardware currently.
It has a weak reference to a parent AudioController, and when trying to
write to a channel, it redirects the data to the parent AudioController.
Each audio channel device should be added into a new directory under the
/dev filesystem called "audio".
If the bootloader that loaded us is providing a framebuffer details from
the Multiboot protocol then we can instantiate a framebuffer console.
Otherwise, we should use a text mode console, assuming that the BIOS and
the bootloader didn't try to modeset the screen resolution so we have is
a VGA 80x25 text mode being displayed on screen.
Since "boot_framebuffer_console" is no longer a good representative as a
global variable name, it's changed to g_boot_console to match the fact
that it can be assigned with a text mode console and not framebuffer
console if needed.
Instead of seeing a black screen until GraphicsManagement was fully
initialized, this allows us to see the console output much earlier.
So, if the bootloader provided us with a framebuffer, set up a console
as early as possible.
Add polling support to NVMe so that it does not use interrupt to
complete a IO but instead actively polls for completion. This probably
is not very efficient in terms of CPU usage but it does not use
interrupts to complete a IO which is beneficial at the moment as there
is no MSI(X) support and it can reduce the latency of an IO in a very
fast NVMe device.
The NVMeQueue class has been made the base class for NVMeInterruptQueue
and NVMePollQueue. The factory function `NVMeQueue::try_create` will
return the appropriate queue to the controller based on the polling
boot parameter.
The polling mode can be enabled by adding an extra boot parameter:
`nvme_poll`.
There's no need to perform it this early, and until the MemoryManager
is initialized we have very limited kmalloc capacity, so let's try and
keep anything that's not required to be there out of there.
This device will assist userspace to manage hotplug events.
A userspace application reads a DeviceEvent entry until the return value
is zero which indicates no events that are queued and waiting for
processing.
Trying to read with a buffer smaller than sizeof(DeviceEvent) results in
EOVERFLOW.
For now, there's no ioctl mechanism for this device but in the future an
acknowledgement mechanism can be implemented via ioctl(2) interface.
Currently the APIC class is constructed irrespective of whether it
is used or not.
So, move APIC initialization from init to the InterruptManagement
class and construct the APIC class only when it is needed.
The function to protect ksyms after initialization, is only used during
boot of the system, so it can be UNMAP_AFTER_INIT as well.
This requires we switch the order of the init sequence, so we now call
`MM.protect_ksyms_after_init()` before `MM.unmap_text_after_init()`.
The Prekernel's memory is only accessed until MemoryManager has been
initialized. Keeping them around afterwards is both unnecessary and bad,
as it prevents the userland from using the 0x100000-0x155000 virtual
address range.
Co-authored-by: Idan Horowitz <idan.horowitz@gmail.com>
Since this range is mapped in already in the kernel page directory, we
can initialize it before jumping into the first kernel process which
lets us avoid mapping in the range into init_stage2's address space.
This brings us half-way to removing the shared bottom 2 MiB mapping in
every process, leaving only the Prekernel.
We can leave the .ksyms section mapped-but-read-only and then have the
symbols index simply point into it.
Note that we manually insert null-terminators into the symbols section
while parsing it.
This gets rid of ~950 KiB of kmalloc_eternal() at startup. :^)
This small change allows to use the IOAPIC by default without to enable
SMP mode, which emulates Uni-Processor setup with IOAPIC instead of
using the PIC.
This opens the opportunity to utilize other types of interrupts like MSI
and MSI-X interrupts.
The platform independent Processor.h file includes the shared processor
code and includes the specific platform header file.
All references to the Arch/x86/Processor.h file have been replaced with
a reference to Arch/Processor.h.
This singleton simplifies many aspects that we struggled with before:
1. There's no need to make derived classes of Device expose the
constructor as public anymore. The singleton is a friend of them, so he
can call the constructor. This solves the issue with try_create_device
helper neatly, hopefully for good.
2. Getting a reference of the NullDevice is now being done from this
singleton, which means that NullDevice no longer needs to use its own
singleton, and we can apply the try_create_device helper on it too :)
3. We can now defer registration completely after the Device constructor
which means the Device constructor is merely assigning the major and
minor numbers of the Device, and the try_create_device helper ensures it
calls the after_inserting method immediately after construction. This
creates a great opportunity to make registration more OOM-safe.
Both should reside in the SysFS firmware directory which is normally
located in /sys/firmware.
Also, apply some OOM-safety patterns when creating the BIOS and ACPI
directories.
This will somwhat help unify them also under the same SysFS directory in
the commit.
Also, it feels much more like this change reflects the reality that both
ACPI and the BIOS are part of the firmware on x86 computers.
Let's remove the DynamicParser class, as it really did nothing yet in
the Kernel. Instead, when we add support for AML parsing, we can figure
out how to do it properly without the need of a derived class that just
complicates everything for no good reason.
This is really a basic support for AHCI hotplug events, so we know how
to add a node representing the device in /sys/dev/block and removing it
according to the event type (insertion/removal).
This change doesn't take into account what happens if the device was
mounted or a read/write operation is being handled.
For this to work correctly, StorageManagement now uses the Singleton
container, as it might be accessed simultaneously from many CPUs
for hotplug events. DiskPartition holds a WeakPtr instead of a RefPtr,
to allow removal of a StorageDevice object from the heap.
StorageDevices are now stored and being referenced to via an
IntrusiveList to make it easier to remove them on hotplug event.
In future changes, all of the stated above might change, but for now,
this commit represents the least amount of changes to make everything
to work correctly.
These files are not marked as block devices or character devices so they
are not meant to be used as device nodes. The filenames are formatted to
the pattern "major:minor", but a Userland program need to call the parse
these format and inspect the the major and minor numbers and create the
real device nodes in /dev.
Later on, it might be a good idea to ensure we don't create new
SysFSComponents on the heap for each Device, but rather generate
them only when required (and preferably to not create a SysFSComponent
at all if possible).
This function is currently only ever used to create the init process
(SystemServer). It had a few idiosyncratic things about it that this
patch cleans up:
- Errors were returned in an int& out-param.
- It had a path for non-0 process PIDs which was never taken.
Prior to this change, both uid_t and gid_t were typedef'ed to `u32`.
This made it easy to use them interchangeably. Let's not allow that.
This patch adds UserID and GroupID using the AK::DistinctNumeric
mechanism we've already been employing for pid_t/ProcessID.
This has several benefits:
1) We no longer just blindly derefence a null pointer in various places
2) We will get nicer runtime error messages if the current process does
turn out to be null in the call location
3) GCC no longer complains about possible nullptr dereferences when
compiling without KUBSAN
This patch does three things:
- Convert the global thread list from a HashMap to an IntrusiveList
- Combine the thread list and its lock into a SpinLockProtectedValue
- Customize Thread::unref() so it locks the list while unreffing
This closes the same race window for Thread as @sin-ack's recent changes
did for Process.
Note that the HashMap->IntrusiveList conversion means that we lose O(1)
lookups, but the majority of clients of this list are doing traversal,
not lookup. Once we have an intrusive hashing solution, we should port
this to use that, but for now, this gets rid of heap allocations during
a sensitive time.
We are not using this for anything and it's just been sitting there
gathering dust for well over a year, so let's stop carrying all this
complexity around for no good reason.
This removes Pipes dependency on the UHCIController by introducing a
controller base class. This will be used to implement other controllers
such as OHCI.
Additionally, there can be multiple instances of a UHCI controller.
For example, multiple UHCI instances can be required for systems with
EHCI controllers. EHCI relies on using multiple of either UHCI or OHCI
controllers to drive USB 1.x devices.
This means UHCIController can no longer be a singleton. Multiple
instances of it can now be created and passed to the device and then to
the pipe.
To handle finding and creating these instances, USBManagement has been
introduced. It has the same pattern as the other management classes
such as NetworkManagement.
The Clang error message reads like this (`-Wdeprecated-array-compare`):
> error: comparison between two arrays is deprecated; to compare
> array addresses, use unary '+' to decay operands to pointers.
When I laid down the foundation for the start of the big process lock
separation, I added asserts to all system call implementations to
validate we hold the big process lock in the locations we think we
should be.
Adding that assert to sys$profiling_enable broke boot time profiling as
we were never holding the lock on boot. Even though it's not technically
required, lets make sure to hold the lock while enabling to appease the
assert.
This enables further work on implementing KASLR by adding relocation
support to the pre-kernel and updating the kernel to be less dependent
on specific virtual memory layouts.
This allows us to specify virtual addresses for things the kernel should
access via virtual addresses later on. By doing this we can make the
kernel independent from specific physical addresses.
Previously the kernel relied on a fixed offset between virtual and
physical addresses based on the kernel's load address. This allows us
to specify an independent offset.