For "destructive" disallowance of allocations throughout the system,
Thread gains a member that controls whether allocations are currently
allowed or not. kmalloc checks this member on both allocations and
deallocations (with the exception of early boot) and panics the kernel
if allocations are disabled. This allows critical sections that must
not allocate to fail fast, making for easier debugging.
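The mechanism boils down to something like this user-space analog (the
names `s_allocation_enabled`, `NoAllocationGuard` and `checked_alloc`
are illustrative, not the kernel's actual API):

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Per-thread flag checked by the allocator; a RAII guard clears it
// for the duration of a no-allocation critical section.
static thread_local bool s_allocation_enabled = true;

struct NoAllocationGuard {
    bool previous { s_allocation_enabled };
    NoAllocationGuard() { s_allocation_enabled = false; }
    ~NoAllocationGuard() { s_allocation_enabled = previous; }
};

static void* checked_alloc(size_t size)
{
    // Fail fast if an allocation sneaks into a no-allocation section.
    // In the kernel this is a panic rather than abort().
    if (!s_allocation_enabled) {
        fprintf(stderr, "Allocation while allocations are disabled!\n");
        abort();
    }
    return malloc(size);
}

int main()
{
    free(checked_alloc(16)); // fine: allocations are enabled
    NoAllocationGuard guard;
    checked_alloc(16); // aborts, mirroring the panic described above
}
```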
PS: My first proper Kernel commit :^)
The purpose of the PageDirectory::m_page_tables map was really just
to act as ref-counting storage for PhysicalPage objects that were
being used for the directory's page tables.
However, this was basically redundant, since we can find the physical
address of each page table from the page directory, and we can find the
PhysicalPage object from MemoryManager::get_physical_page_entry().
So if we just manually ref() and unref() the pages when they go in and
out of the directory, we no longer need PageDirectory::m_page_tables!
Not only does this remove a bunch of kmalloc() traffic, it also solves
a race condition that would occur when lazily adding a new page table
to a directory:
Previously, when MemoryManager::ensure_pte() would call HashMap::set()
to insert the new page table into m_page_tables, if the HashMap had to
grow its internal storage, it would call kmalloc(). If that kmalloc()
would need to perform heap expansion, it would end up calling
ensure_pte() again, which would clobber the page directory mapping used
by the outer invocation of ensure_pte().
The net result of the above bug would be that any invocation of
MemoryManager::ensure_pte() could erroneously return a pointer into
a kernel page table instead of the correct one!
This whole problem goes away when we remove the HashMap, as ensure_pte()
no longer does anything that allocates from the heap.
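In sketch form, the manual ref-counting described above looks like this
(shapes and method names simplified; not verbatim kernel code):

```cpp
// Installing a page table: the directory takes a reference directly on
// the PhysicalPage, so no container is needed to keep it alive.
void install_page_table(PageDirectoryEntry& pde, PhysicalPage& table_page)
{
    table_page.ref();
    pde.set_page_table_base(table_page.paddr().get());
}

// Removing a page table: recover the PhysicalPage from the physical
// address stored in the PDE (via MemoryManager::get_physical_page_entry())
// and drop the directory's reference. Neither path touches the heap.
void remove_page_table(PageDirectoryEntry& pde)
{
    auto paddr = PhysicalAddress(pde.page_table_base());
    pde.clear();
    MM.get_physical_page_entry(paddr).physical_page().unref();
}
```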
This was broken in commit 0a1b34c753 / PR #11687 since the buffer
descriptor list size was not page-aligned, and the new
`MM.allocate_dma_buffer_pages` expects a page-aligned size.
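The fix boils down to rounding the size up to a page boundary first;
roughly (the helper below is illustrative - the kernel has its own
page-rounding utilities):

```cpp
// Round a byte count up to a whole number of pages.
constexpr size_t page_round_up(size_t size)
{
    return (size + PAGE_SIZE - 1) & ~static_cast<size_t>(PAGE_SIZE - 1);
}

// The buffer descriptor list size is not page-aligned, so round it up
// before asking for whole pages:
auto region = TRY(MM.allocate_dma_buffer_pages(
    page_round_up(buffer_descriptor_list_size),
    "Buffer descriptor list"sv, Memory::Region::Access::ReadWrite));
```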
Add them in `<Kernel/API/Device.h>` and use these to provide
`{makedev,major,minor}` in `<sys/sysmacros.h>`. This aims to be more in
line with other Unix implementations and avoids code duplication in
userland.
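For illustration, the `<sys/sysmacros.h>` side could look like the
sketch below; the bit layout shown is a common convention, not
necessarily the exact dev_t encoding in `<Kernel/API/Device.h>`:

```cpp
// Sketch of <sys/sysmacros.h> built on kernel-provided definitions.
static inline dev_t makedev(unsigned int major, unsigned int minor)
{
    return (static_cast<dev_t>(major) << 20) | (minor & 0xfffff);
}

static inline unsigned int major(dev_t dev) { return dev >> 20; }
static inline unsigned int minor(dev_t dev) { return dev & 0xfffff; }
```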
Not all drivers need the PhysicalPage output parameter while creating
a DMA buffer. This overload spares the caller from having to declare a
temporary variable.
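The overload can simply forward to the existing function with a local
temporary; roughly (signature simplified, not verbatim kernel code):

```cpp
// Out-parameter-free convenience overload: the temporary PhysicalPage
// handle lives here instead of at every call site.
ErrorOr<NonnullOwnPtr<Memory::Region>> MemoryManager::allocate_dma_buffer_page(
    StringView name, Memory::Region::Access access)
{
    RefPtr<Memory::PhysicalPage> ignored_dma_buffer_page;
    return allocate_dma_buffer_page(name, access, ignored_dma_buffer_page);
}
```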
The cacheable parameter to allocate_kernel_region should be explicitly
set to No, as this region is used for physical memory transfers. Even
though most architectures ignore this attribute anyway, it is better to
make the intent explicit.
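Spelled out at the call site, the intent is something like this
(argument order and names illustrative):

```cpp
// Explicitly request an uncached mapping for a region that is used
// for physical memory transfers.
auto region = TRY(MM.allocate_kernel_region(
    size, "DMA transfer buffer"sv, Memory::Region::Access::ReadWrite,
    AllocationStrategy::AllocateNow, Memory::Region::Cacheable::No));
```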
Two classes are added - HostBridge and MemoryBackedHostBridge - which
both derive from the HostController class. This allows the kernel to
map busses from different PCI domains at the same time. A
HostController implementation doesn't take an Address object to address
PCI devices; instead it takes the distinct PCI bus, device and function
numbers. This lets us keep arbitrary PCI domains in the Address
structure and still reach the correct PCI devices. It also matches the
hardware behavior of PCI domains - the host bridge merely takes memory
or IO operations and translates them into the addressing of three
components: PCI bus, device and function.
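The rough shape of the hierarchy (methods and type names simplified for
illustration):

```cpp
class HostController {
public:
    virtual ~HostController() = default;
    // Devices are addressed by distinct bus/device/function numbers,
    // not by a pre-built Address object.
    virtual u32 read32_field(BusNumber, DeviceNumber, FunctionNumber, u32 field) = 0;
    virtual void write32_field(BusNumber, DeviceNumber, FunctionNumber, u32 field, u32 value) = 0;
};

// Accesses the PCI configuration space through IO ports.
class HostBridge final : public HostController { /* ... */ };

// Accesses the PCI configuration space through a memory-mapped region.
class MemoryBackedHostBridge final : public HostController { /* ... */ };
```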
These changes also greatly simplify how enumeration of host bridges
works - scanning the hardware now depends on what the host bridges can
do for us, so if multiple host bridges expose a memory-mapped region or
IO ports for accessing the PCI configuration space, we simply let the
host bridge code figure out how to fetch the data for us.
Another semantic change is that a PCI domain structure is no longer
attached to a PhysicalAddress, so even if a machine doesn't implement
PCI domains, we still treat it as containing one PCI domain, handling
its single host bridge the same way as on a machine with one or more
PCI domains.
FixedArray no longer exposes any infallible constructors.
Rather, it exposes fallible methods. Therefore, it can be used for
OOM-safe code.
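Usage now looks roughly like this (treat the exact signature as
illustrative):

```cpp
// Fallible construction: OOM surfaces as an error instead of crashing.
ErrorOr<void> make_buffer()
{
    auto buffer = TRY(FixedArray<u8>::try_create(4 * KiB));
    // ... use buffer ...
    return {};
}
```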
This commit also converts the rest of the system to use the new API.
However, as an example, VMObject can't take advantage of this yet,
as we would have to endow VMObject with a fallible static
construction method, which would require a very fundamental change
to VMObject's whole inheritance hierarchy.
When writing into a block via an O_DIRECT open file description, we
would first flush every dirty block *except* the one we're about to
write into.
The purpose of flushing is to ensure coherency when mixing direct and
indirect accesses to the same file. This patch fixes the issue by only
flushing the affected block.
Our behavior was totally backwards and could end up doing a lot of
unnecessary work while avoiding the work that actually mattered.
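In sketch form (helper name illustrative): before a direct write to a
block, write back only that block's dirty cache entry, not everything
else.

```cpp
// Coherency for O_DIRECT writes: flush the one block we're about to
// write, so cached (indirect) users and the direct write agree on its
// contents. (Helper name illustrative.)
if (is_direct_write)
    TRY(flush_specific_block_if_needed(block_index));
```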
Since this function always returns a CacheEntry& (after potentially
evicting someone else to make room), let's call it "ensure" instead of
"get" to match how we usually use these terms.
When doing the last unref() on a listed-ref-counted object, we keep
the list locked while mutating the ref count. The destructor itself
is invoked after unlocking the list.
This was racy with weakable classes, since their weak pointer factory
still pointed to the object after we'd decided to destroy it. That
opened a small time window where someone could try to strong-ref a weak
pointer to an object after it was removed from the list, but just before
the destructor got invoked.
This patch closes the race window by explicitly revoking all weak
pointers while the list is locked.
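In sketch form (simplified; the real ref-count and locking plumbing
differ, but the ordering is the point):

```cpp
void ListedRefCounted::unref() const
{
    m_list.with_locked([&](auto& list) {
        if (drop_last_reference()) {
            list.remove(*this);
            // Revoke weak pointers while the list is still locked, so
            // nobody can upgrade a WeakPtr between removal from the
            // list and destruction.
            revoke_weak_ptrs();
        }
    });
    // The destructor runs only after the list lock is released.
}
```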
Although this currently can't happen - the mmap syscall entry code
already verifies that the specified offset is page-aligned - it's still
good practice to VERIFY that the start address we take is page-aligned
when doing mmap on /dev/mem.
As for read(2) on /dev/mem, we don't map anything into userspace, so
it's safe to read from whatever offset userspace specifies, as long as
that doesn't break the usual rules of reading physical memory from
/dev/mem.
So far we only had mmap(2) functionality on the /dev/mem device, but now
we can also do read(2) on it.
The unit test was updated to check that we do this safely.
As was pointed out by Idan Horowitz, the rest of the method doesn't
assume we have any reserved ranges for mmap(2) to work on, so the
VERIFY is not needed at all.
We should only look at the framebuffer structure members if the
MULTIBOOT_INFO_FRAMEBUFFER_INFO bit is set in the flags field.
Also add some logging if we ignored the fbdev command line argument
due to either not having a framebuffer provided by the bootloader, or
because we don't support the framebuffer format.
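Roughly (the flag and framebuffer members are from the Multiboot
specification; the surrounding code is illustrative):

```cpp
if (multiboot_info->flags & MULTIBOOT_INFO_FRAMEBUFFER_INFO) {
    // Only now is it valid to look at the framebuffer members.
    initialize_framebuffer(multiboot_info->framebuffer_addr,
        multiboot_info->framebuffer_pitch,
        multiboot_info->framebuffer_type);
} else {
    dbgln("Ignoring fbdev: bootloader provided no framebuffer info");
}
```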
This allows forcing the use of only the framebuffer set up by the
bootloader and skips instantiating devices for any other graphics
cards that may be present.
Use the same trick as SlavePTY and override unref() to provide safe
removal from the sockets_by_tuple table when destroying a TCPSocket.
This should fix the TCPSocket::from_tuple() flake seen on CI.
Previously we were using this vector to store the inodes as we iterated.
However, we don't need to store all of them, just the previous inode, as
we know it will be safe to remove it once we've iterated past that
element.
The first byte of the EBDA structure contains the size of the EBDA
in 1 KiB units. We were incorrectly using the word at offset 0x413
of the BDA which specifies the number of KiB before the EBDA structure.
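In sketch form (real-mode physical addresses; the mapping into kernel
virtual memory is omitted):

```cpp
// BDA word 0x40E holds the EBDA segment; the EBDA's first byte holds
// the EBDA size in 1 KiB units. BDA word 0x413 is the base memory size
// in KiB - the memory *before* the EBDA - not the EBDA's size.
u16 ebda_segment = *reinterpret_cast<u16 const*>(0x40E);
u8 const* ebda = reinterpret_cast<u8 const*>(ebda_segment << 4);
size_t ebda_size_in_bytes = ebda[0] * KiB;
```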
Contrary to the comment above it, this while loop was actually clearing
the selectors above or equal to the edited one, instead of the
selectors that were skipped when the GDT was extended. This wasn't
really an issue so far, as all calls to this function did extend the
GDT, which meant this condition was always false - but future calls
that try to edit an existing entry would fail.
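The intended behavior, sketched (structure simplified):

```cpp
// When writing entry `index` extends the GDT, zero only the entries
// that were skipped over - those between the old end and the new entry
// - not everything at or above `index`.
if (index >= m_gdt_length) {
    for (u16 i = m_gdt_length; i < index; ++i)
        m_gdt[i] = {};
    m_gdt_length = index + 1;
}
```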