We can't use deferred functions for anything that may require preemption,
such as copying from/to user or accessing the disk. For those purposes
we should use a work queue, which is essentially a kernel thread that
may be preempted or blocked.
These errors are classed as fatal, so we need to recover from them.
Found while trying to debug AHCI boot on VMware Player,
where I got TFES.
From the spec: "Fatal errors (signified by the setting of PxIS.HBFS,
PxIS.HBDS, PxIS.IFS, or PxIS.TFES) will cause the HBA to enter
the ERR:Fatal state"
We were already recovering from IFS.
Instead of blindly resetting every AHCI port, let's just reset only the
controller by default. The user can still request to reset everything
with a new kernel boot argument called ahci_reset_mode which is set
by default to "controller", so the code will only invoke an HBA reset.
This kernel boot argument can be set to 3 different values:
1. "controller" - reset the HBA and skip resetting AHCI ports
2. "none" - don't reset anything, so we rely on the firmware to
initialize the AHCI HBA and ports for us.
3. "complete" - reset the AHCI HBA and ports.
Instead of waiting for the AHCI HBA to reset the signature after SATA
reset sequence, let's just check if the Port x Serial ATA Status
register was set to value 3, indicating that device was detected
and phy communication was established.
The intention is to make the boot to be faster, therefore we should
decrease the time deltas in timeout loops to allow earlier break
from these.
Also, there's no need to wait 10 milliseconds before setting
the interface state to "no action request" during the reset sequence.
This class is used in the AHCI code to handle a big request of
read/write to the disk. If we happen to encounter such request,
we will get the needed amount of physical pages from the
already-allocated physical pages in AHCIPort, and with that we
will create a ScatterList that will create a Region that maps
all of these pages in a contiguous virtual memory range.
Then, we could easily copy to/from this range, before and after
calling the operation on the StorageDevice as needed with
read or write operations.
The hierarchy is AHCIController, AHCIPortHandler, AHCIPort and
SATADiskDevice. Each AHCIController has at least one AHCIPortHandler.
An AHCIPortHandler is an interrupt handler that takes care of
enumeration of handled AHCI ports when an interrupt occurs. Each
AHCIPort takes care of one SATADiskDevice, and later on we can add
support for Port multiplier.
When we implement support of Message signalled interrupts, we can spawn
many AHCIPortHandlers, and allow each one of them to be responsible for
a set of AHCIPorts.
Previously all of the CommandLine parsing was spread out around the
Kernel. Instead move it all into the Kernel CommandLine class, and
expose a strongly typed API for querying the state of options.
This may seem like a no-op change, however it shrinks down the Kernel by a bit:
.text -432
.unmap_after_init -60
.data -480
.debug_info -673
.debug_aranges 8
.debug_ranges -232
.debug_line -558
.debug_str -308
.debug_frame -40
With '= default', the compiler can do more inlining, hence the savings.
I intentionally omitted some opportunities for '= default', because they
would increase the Kernel size.
(...and ASSERT_NOT_REACHED => VERIFY_NOT_REACHED)
Since all of these checks are done in release builds as well,
let's rename them to VERIFY to prevent confusion, as everyone is
used to assertions being compiled out in release.
We can introduce a new ASSERT macro that is specifically for debug
checks, but I'm doing this wholesale conversion first since we've
accumulated thousands of these already, and it's not immediately
obvious which ones are suitable for ASSERT.
If we try to align a number above 0xfffff000 to the next multiple of
the page size (4 KiB), it would wrap around to 0. This is most likely
never what we want, so let's assert if that happens.
This change can be actually seen as two logical changes, the first
change is about to ensure we only read the ATA Status register only
once, because if we read it multiple times, we acknowledge interrupts
unintentionally. To solve this issue, we always use the alternate Status
register and only read the original status register in the IRQ handler.
The second change is how we handle interrupts - if we use DMA, we can
just complete the request and return from the IRQ handler. For PIO mode,
it's more complicated. For PIO write operation, after setting the ATA
registers, we send out the data to IO port, and wait for an interrupt.
For PIO read operation, we set the ATA registers, and wait for an
interrupt to fire, then we just read from the data IO port.
This replaces the current disk detection and disk access code with
code based on https://wiki.osdev.org/IDE
This allows the system to boot on VirtualBox with serial debugging
enabled and VMWare Player.
I believe there were several issues with the current code:
- It didn't utilise the last 8 bits of the LBA in 24-bit mode.
- {read,write}_sectors_with_dma was not setting the obsolete bits,
which according to OSdev wiki aren't used but should be set.
- The PIO and DMA methods were using slightly different copy
and pasted access code, which is now put into a single
function called "ata_access"
- PIO mode doesn't work. This doesn't fix that and should
be looked into in the future.
- The detection code was not checking for ATA/ATAPI.
- The detection code accidentally had cyls/heads/spt as 8-bit,
when they're 16-bit.
- The capabilities of the device were not considered. This is now
brought in and is currently used to check if the device supports
LBA. If not, use CHS.
This was done with the help of several scripts, I dump them here to
easily find them later:
awk '/#ifdef/ { print "#cmakedefine01 "$2 }' AK/Debug.h.in
for debug_macro in $(awk '/#ifdef/ { print $2 }' AK/Debug.h.in)
do
find . \( -name '*.cpp' -o -name '*.h' -o -name '*.in' \) -not -path './Toolchain/*' -not -path './Build/*' -exec sed -i -E 's/#ifdef '$debug_macro'/#if '$debug_macro'/' {} \;
done
# Remember to remove WRAPPER_GERNERATOR_DEBUG from the list.
awk '/#cmake/ { print "set("$2" ON)" }' AK/Debug.h.in
Since devices are enumerable and can compute their own name inside the
/dev hierarchy, there is no need to try and parse "root=/dev/xxx" by
hand.
This also makes any block device a candidate for the boot device, which
now includes ramdisk devices, so SerenityOS can now boot diskless too.
The disk image generated for QEMU is suitable, as long as it fits in
memory with room to spare for the rest of the system.
Besides removing the monolithic DevFSDeviceInode::determine_name()
method, being able to determine a device's name inside the /dev
hierarchy outside of DevFS has its uses.
..and allow implicit creation of KResult and KResultOr from ErrnoCode.
This means that kernel functions that return those types can finally
do "return EINVAL;" and it will just work.
There's a handful of functions that still deal with signed integers
that should be converted to return KResults.
Problem:
- Many constructors are defined as `{}` rather than using the ` =
default` compiler-provided constructor.
- Some types provide an implicit conversion operator from `nullptr_t`
instead of requiring the caller to default construct. This violates
the C++ Core Guidelines suggestion to declare single-argument
constructors explicit
(https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#c46-by-default-declare-single-argument-constructors-explicit).
Solution:
- Change default constructors to use the compiler-provided default
constructor.
- Remove implicit conversion operators from `nullptr_t` and change
usage to enforce type consistency without conversion.