serenity

mirror of https://github.com/SerenityOS/serenity.git synced 2025-01-26 19:32:06 -05:00

Author	SHA1	Message	Date
Andreas Kling	3623e35978	Kernel: Oops, actually enable CR4.PGE (page table global bit) Turns out we were setting the wrong bit here. Now we will actually keep kernel memory mappings in the TLB across context switches.	2019-12-24 22:45:27 +01:00
Andreas Kling	ae2d72377d	Kernel: Enable the x86 WP bit to catch invalid memory writes in ring 0 Setting this bit will cause the CPU to generate a page fault when writing to read-only memory, even if we're executing in the kernel. Seemingly the only change needed to make this work was to have the inode-backed page fault handler use a temporary mapping for writing the read-from-disk data into the newly-allocated physical page.	2019-12-21 16:21:13 +01:00
Andreas Kling	62c2309336	Kernel: Fix some warnings about passing non-POD to kprintf	2019-12-20 20:19:46 +01:00
Andreas Kling	b6ee8a2c8d	Kernel: Rename vmo => vmobject everywhere	2019-12-19 19:15:27 +01:00
Andreas Kling	1d4d6f16b2	Kernel: Add a specific-page variant of Region::commit()	2019-12-18 22:43:32 +01:00
Andreas Kling	0a75a46501	Kernel: Make sure the kernel info page is read-only for userspace To enforce this, we create two separate mappings of the same underlying physical page. A writable mapping for the kernel, and a read-only one for userspace (the one returned by sys$get_kernel_info_page.)	2019-12-15 22:21:28 +01:00
Andreas Kling	931e4b7f5e	Kernel+SystemMonitor: Prevent userspace access to process ELF image Every process keeps its own ELF executable mapped in memory in case we need to do symbol lookup (for backtraces, etc.) Until now, it was mapped in a way that made it accessible to the program, despite the program not having mapped it itself. I don't really see a need for userspace to have access to this right now, so let's lock things down a little bit. This patch makes it inaccessible to userspace and exposes that fact through /proc/PID/vm (per-region "user_accessible" flag.)	2019-12-15 20:11:57 +01:00
Andreas Kling	05a441afb2	Kernel: Don't turn private read-only regions into shared ones on fork Even if they are read-only now, they can be mprotect(PROT_WRITE)'d in the future, so we have to make sure they are CoW mapped.	2019-12-15 16:53:46 +01:00
Andreas Kling	3fbc50a350	Kernel+SystemMonitor: Expose the number of set CoW bits in each Region This number tells us how many more pages in a given region will trigger a CoW fault if written to.	2019-12-15 16:53:00 +01:00
Andreas Kling	9ad151c665	Kernel: Improve comment about the system virtual memory map a bit	2019-12-15 16:13:08 +01:00
Andreas Kling	65229a4082	Kernel: Move VMObject::for_each_region() to MemoryManager.h It can't be in VMObject.h since it depends on MemoryManager.h	2019-12-09 20:06:03 +01:00
Andreas Kling	a22b7f96fc	Kernel: Remap all regions referring to a PurgeableVMObject on purge Otherwise we won't get page faults next time you try to access the purged memory.	2019-12-09 20:05:04 +01:00
Andreas Kling	dbb644f20c	Kernel: Start implementing purgeable memory support It's now possible to get purgeable memory by using mmap(MAP_PURGEABLE). Purgeable memory has a "volatile" flag that can be set using madvise(): - madvise(..., MADV_SET_VOLATILE) - madvise(..., MADV_SET_NONVOLATILE) When in the "volatile" state, the kernel may take away the underlying physical memory pages at any time, without notifying the owner. This gives you a guilt discount when caching very large things. :^) Setting a purgeable region to non-volatile will return whether or not the memory has been taken away by the kernel while being volatile. Basically, if madvise(..., MADV_SET_NONVOLATILE) returns 1, that means the memory was purged while volatile, and whatever was in that piece of memory needs to be reconstructed before use.	2019-12-09 19:12:38 +01:00
Andreas Kling	05c65fb4f1	Kernel: Don't CoW non-writable pages A page fault in a page marked for CoW should not trigger a CoW if the page is non-writable. I think this makes sense.	2019-12-02 19:20:09 +01:00
Andreas Kling	f41ae755ec	Kernel: Crash on memory access in non-readable regions This patch makes it possible to make memory regions non-readable. This is enforced using the "present" bit in the page tables. A process that hits a not-present page fault in a non-readable region will be crashed.	2019-12-02 19:18:52 +01:00
Andreas Kling	7dc9c90f83	Kernel: Fix bug where mprotect() would ignore setting PROT_WRITE A typo in Region::set_writable() caused us to update the readable flag rather than the writable flag.	2019-12-02 18:15:36 +01:00
Andreas Kling	cde0a1eeb5	Kernel: Put some debug spam behind PAGE_FAULT_DEBUG	2019-12-01 16:03:24 +01:00
Andreas Kling	e56daf547c	Kernel: Disallow syscalls from writeable memory Processes will now crash with SIGSEGV if they attempt making a syscall from PROT_WRITE memory. This neat idea comes from OpenBSD. :^)	2019-11-29 16:30:05 +01:00
Andreas Kling	2d1bcce34a	Kernel: Fix triple-fault when clicking on SystemServer in SystemMonitor The fault was happening when retrieving a current backtrace for the SystemServer process. To generate a backtrace, we go into the paging scope of the process, meaning we temporarily switch to using its page directory as our own. Because kernel VM is allocated on demand, it's possible for a process's mappings above the 3GB mark to be out-of-date. Normally this just gets fixed up transparently by the page fault handler (which simply copies the PDE from the canonical MM.kernel_page_directory() into the current process.) However, if the current kernel stack is in a piece of memory that the backtraced process lacks up-to-date PDE's for, we still get a page fault, but are unable to handle it, since the CPU wants to push to the stack as part of calling the page fault handler. So we're screwed and it's a triple-fault. Fix this by always updating the kernel VM mappings before switching into a paging scope. In practical terms, this is a 1KB memcpy() that happens when generating a backtrace, or doing exec().	2019-11-27 12:40:42 +01:00
Andreas Kling	5b8cf2ee23	Kernel: Make syscall counters and page fault counters per-thread Now that we show individual threads in SystemMonitor and "top", it's also very nice to have individual counters for the threads. :^)	2019-11-26 21:37:38 +01:00
Andreas Kling	3dc87be891	Kernel: Mark mmap()-created regions with a special bit Then only allow regions with that bit to be manipulated via munmap() and mprotect(). This prevents messing with non-mmap()ed regions in a process's address space (stacks, shared buffers, ...)	2019-11-24 12:26:21 +01:00
Andreas Kling	9a157b5e81	Revert "Kernel: Move Kernel mapping to 0xc0000000" This reverts commit `bd33c66273`. This broke the network card drivers, since they depended on kmalloc addresses being identity-mapped.	2019-11-23 17:27:09 +01:00
Jesse Buhagiar	bd33c66273	Kernel: Move Kernel mapping to 0xc0000000 The kernel is no longer identity mapped to the bottom 8MiB of memory, and is now mapped at the higher address of `0xc0000000`. The lower ~1MiB of memory (from GRUB's mmap), however, is still identity mapped to provide an easy way for the kernel to get physical pages for things such as DMA. These could later be mapped to the higher address too, as I'm not too sure how to go about doing this elegantly without a lot of address subtractions.	2019-11-22 16:23:23 +01:00
Andreas Kling	794758df3a	Kernel: Implement some basic stack pointer validation VM regions can now be marked as stack regions, which is then validated on syscall, and on page fault. If a thread is caught with its stack pointer pointing into anything that's not a Region with its stack bit set, we'll crash the whole process with SIGSTKFLT. Userspace must now allocate custom stacks by using mmap() with the new MAP_STACK flag. This mechanism was first introduced in OpenBSD, and now we have it too, yay! :^)	2019-11-17 12:15:43 +01:00
Liav A	bce510bf6f	Kernel: Fix the search method of free userspace physical pages (#742) Now the userspace page allocator will search through physical regions, and stop the search as soon as it finds an available page. Also remove an "address of" sign since we don't need that when counting the size of physical regions	2019-11-08 22:39:29 +01:00
supercomputer7	c3c905aa6c	Kernel: Removing hardcoded offsets from Memory Manager Now the kernel page directory and the page tables are located at a safe address, to prevent paging data from colliding with garbage.	2019-11-08 17:38:23 +01:00
Andreas Kling	19398cd7d5	Kernel: Reorganize memory layout a bit Move the kernel image to the 1 MB physical mark. This prevents it from colliding with stuff like the VGA memory. This was causing us to end up with the BIOS screen contents sneaking into kernel memory sometimes. This patch also bumps the kmalloc heap size from 1 MB to 3 MB. It's not the perfect permanent solution (obviously) but it should get the OOM monkey off our backs for a while.	2019-11-04 12:04:35 +01:00
Andreas Kling	a6e9119537	Kernel: Tweak some outdated kprintfs in Region	2019-11-04 00:48:45 +01:00
Andreas Kling	d67c6a92db	Kernel: Move page fault handling from MemoryManager to Region After the page fault handler has found the region in which the fault occurred, do the rest of the work in the region itself. This patch also makes all fault types consistently crash the process if a new page is needed but we're all out of pages.	2019-11-04 00:47:03 +01:00
Andreas Kling	0e8f1d7cb6	Kernel: Don't expose a region's page directory to the outside world Now that region manages its own mapping/unmapping, there's no need for the outside world to be able to grab at its page directory.	2019-11-04 00:26:00 +01:00
Andreas Kling	6ed9cc4717	Kernel: Remove Region API's for setting/unsetting the page directory This is done implicitly by mapping or unmapping the region.	2019-11-04 00:24:20 +01:00
Andreas Kling	e3dda4e87b	Kernel: Fix weird Region constructor that took nullable RefPtr<Inode> It's never valid to construct a Region with a null Inode pointer using this constructor, so just take a NonnullRefPtr<Inode> instead.	2019-11-04 00:21:08 +01:00
Andreas Kling	9b2dc36229	Kernel: Merge MemoryManager::map_region_at_address() into Region::map()	2019-11-04 00:05:57 +01:00
Andreas Kling	98b328754e	Kernel: Fix bad setup of CoW faults for offset regions Regions with an offset into their VMObject were incorrectly adding the page offset when indexing into the CoW bitmap.	2019-11-03 23:54:35 +01:00
Andreas Kling	5b7f8634e3	Kernel: Set the G (global) bit for kernel page tables Since the kernel page tables are shared between all processes, there's no need to (implicitly) flush the TLB for them on every context switch. Setting the G bit on kernel page tables allows the CPU to keep the translation caches around.	2019-11-03 23:51:55 +01:00
Andreas Kling	4bf1a72d21	Kernel: Teach Region how to remap itself Now remapping (i.e flushing kernel metadata to the CPU page tables) is done by simply calling Region::remap().	2019-11-03 21:11:08 +01:00
Andreas Kling	3dce0f23f4	Kernel: Regions should be mapped into a PageDirectory, not a Process This patch changes the parameter to Region::map() to be a PageDirectory since that matches how we think about the memory model: Regions are views onto VMObjects, and are mapped into PageDirectories. Each Process has a PageDirectory. The kernel also has a PageDirectory.	2019-11-03 21:11:08 +01:00
Andreas Kling	2cfc43c982	Kernel: Move region map/unmap operations into the Region class The more Region can take care of itself, the better.	2019-11-03 21:11:08 +01:00
Andreas Kling	a221cddeec	Kernel: Clean up a bunch of wrong-looking Region/VMObject code Since a Region is merely a "window" onto a VMObject, it can both begin and end at a distance from the VMObject's boundaries. Therefore, we should always be computing indices into a VMObject's physical page array by adding the Region's "first_page_index()". There was a whole bunch of code that forgot to do that. This fixes many wrong behaviors for Regions that start part-way into a VMObject.	2019-11-03 15:44:13 +01:00
Andreas Kling	fe455c5ac4	Kernel: Move page remapping into Region::remap_page(index) Let Region deal with this, instead of everyone calling MemoryManager.	2019-11-03 15:32:11 +01:00
Andreas Kling	b0321bf290	Kernel: Zero-fill faults should not temporarily enable interrupts We were doing a temporary STI/CLI in MemoryManager::zero_page() to be able to acquire the VMObject's lock before zeroing out a page. This logic was inherited from the inode fault handler, where we need to enable interrupts anyway, since we might need to interact with the underlying storage device. Zero-fill faults don't actually need to lock the VMObject, since they are already guaranteed exclusivity by interrupts being disabled when entering the fault handler. This is different from inode faults, where a second thread can often get an inode fault for the same exact page in the same VMObject before the first fault handler has received a response from the disk. This is why the lock exists in the first place, to prevent this race. This fixes an intermittent crash in sys$execve() that was made much more visible after I made userspace stacks lazily allocated.	2019-11-01 17:59:47 +01:00
Tom	00a7c48d6e	APIC: Enable APIC and start APs	2019-10-16 19:14:02 +02:00
Andreas Kling	35138437ef	Kernel+SystemMonitor: Add fault counters This patch adds three separate per-process fault counters: - Inode faults An inode fault happens when we've memory-mapped a file from disk and we end up having to load 1 page (4KB) of the file into memory. - Zero faults Memory returned by mmap() is lazily zeroed out. Every time we have to zero out 1 page, we count a zero fault. - CoW faults VM objects can be shared by multiple mappings that make their own unique copy iff they want to modify it. The typical reason here is memory shared between a parent and child process.	2019-10-02 14:13:49 +02:00
Andreas Kling	d481ae95b5	Kernel: Defer creation of Region CoW bitmaps until they're needed Instead of allocating and populating a Copy-on-Write bitmap for each Region up front, wait until we actually clone the Region for sharing with another process. In most cases, we never need any CoW bits and we save ourselves a lot of kmalloc() memory and time.	2019-10-01 19:58:41 +02:00
Andreas Kling	c58d1868cb	Kernel: Fix munmap() bad splitting of already-split Regions When splitting a Region that's already the result of an earlier split, we have to take the Region's offset-in-VMObject into account since it may be non-zero.	2019-10-01 11:40:40 +02:00
Andreas Kling	ac20919b13	Kernel: Make it possible to turn off VM guard pages at compile time This might be useful for debugging since guard pages introduce a fair amount of noise in the virtual address space.	2019-09-30 17:22:16 +02:00
Conrad Pankoff	fa20a447a9	Kernel: Repair unaligned regions supplied by the boot loader We were just blindly trusting that the bootloader would only give us page-aligned memory regions. This is apparently not always the case, so now we can try to repair those regions. Fixes #601	2019-09-28 09:23:52 +02:00
Andreas Kling	2584636d19	Kernel: Fix partial munmap() deallocating still-in-use VM We were always returning the full VM range of the partially-unmapped Region to the range allocator. This caused us to re-use those addresses for subsequent VM allocations. This patch also skips creating a new VMObject in partial munmap(). Instead we just make split regions that point into the same VMObject. This fixes the mysterious GCC ICE on large C++ programs.	2019-09-27 20:21:52 +02:00
Andreas Kling	7f9a33dba1	Kernel: Make Region single-owner instead of ref-counted This simplifies the ownership model and makes Region easier to reason about. Userspace Regions are now primarily kept by Process::m_regions. Kernel Regions are kept in various OwnPtr<Region>s. Regions now only ever get unmapped when they are destroyed.	2019-09-27 14:25:42 +02:00
Andreas Kling	9c549c178a	Kernel: Pad virtual address space allocations with guard pages Put one unused page on each side of VM allocations to make invalid accesses more likely to generate crashes. Note that we will not add this guard padding for mmap() at a specific memory address, only to "mmap it anywhere" requests.	2019-09-22 15:12:29 +02:00
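
Several of the commits above (a221cddeec, 98b328754e, c58d1868cb) turn on the same rule: a Region is a window onto a VMObject, so lookups into the VMObject's physical page array must add the Region's first_page_index(), while the per-Region CoW bitmap is indexed without that offset. A minimal sketch of the rule follows. `VMObjectSketch`, `RegionSketch`, and their members are hypothetical stand-ins for illustration, not the kernel's actual classes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a VMObject: just an array of physical page slots.
struct VMObjectSketch {
    std::vector<int> physical_pages; // one entry per page; the value is a fake frame number
};

// Hypothetical stand-in for a Region: a window of page_count pages that
// starts first_page_index pages into the VMObject.
struct RegionSketch {
    VMObjectSketch& vmobject;
    size_t first_page_index;
    size_t page_count;
    std::vector<bool> cow_map; // one bit per page *of the region*

    // Region-relative index -> slot in the VMObject's page array.
    // The window offset must be added here.
    int& physical_page(size_t page_index_in_region)
    {
        assert(page_index_in_region < page_count);
        return vmobject.physical_pages[first_page_index + page_index_in_region];
    }

    // The CoW bitmap is sized to the region, so no offset is added here.
    bool should_cow(size_t page_index_in_region) const
    {
        assert(page_index_in_region < page_count);
        return cow_map[page_index_in_region];
    }
};
```

With a six-page object and a three-page region starting at page 2, region page 0 resolves to object slot 2, and the CoW bits run from 0 to 2 regardless of where the window starts.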

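The paging-bit commits (3623e35978, ae2d72377d, 5b7f8634e3) refer to three architectural x86 flags: CR0.WP, CR4.PGE, and the G bit in a page table entry. The sketch below only shows the bit arithmetic and is not the kernel's code. The helper names are hypothetical, and a real kernel would write the computed values back to CR0/CR4 from ring 0.

```cpp
#include <cstdint>

// Architectural bit positions for 32-bit x86 paging.
constexpr uint32_t CR0_WP = 1u << 16;      // write protect: ring 0 faults on read-only pages
constexpr uint32_t CR4_PGE = 1u << 7;      // page global enable
constexpr uint32_t PTE_PRESENT = 1u << 0;
constexpr uint32_t PTE_WRITABLE = 1u << 1;
constexpr uint32_t PTE_GLOBAL = 1u << 8;   // entry is kept across CR3 reloads when CR4.PGE is set

// Hypothetical helpers: compute the new control register values.
constexpr uint32_t with_write_protect(uint32_t cr0) { return cr0 | CR0_WP; }
constexpr uint32_t with_global_pages(uint32_t cr4) { return cr4 | CR4_PGE; }

// Hypothetical helper: build a page table entry for a kernel page,
// marked global so its translation can stay in the TLB.
constexpr uint32_t make_kernel_pte(uint32_t physical_address, bool writable)
{
    uint32_t pte = (physical_address & 0xfffff000u) | PTE_PRESENT | PTE_GLOBAL;
    if (writable)
        pte |= PTE_WRITABLE;
    return pte;
}

static_assert(with_global_pages(0) == 0x80u, "CR4.PGE is bit 7");
static_assert(with_write_protect(0) == 0x10000u, "CR0.WP is bit 16");
```

The G bit in an entry has no effect until CR4.PGE is enabled, which is why setting the wrong CR4 bit left the earlier global-bit change without effect.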
142 commits