Setting this bit will cause the CPU to generate a page fault when
writing to read-only memory, even if we're executing in the kernel.
Seemingly the only change needed to make this work was to have the
inode-backed page fault handler use a temporary mapping for writing
the read-from-disk data into the newly-allocated physical page.
Use simple stack cookies to try to provoke an assertion failure on
stack overflow.
This is far from perfect, since we use a constant cookie instead of
generating a random one on startup, but it can still help us catch
bugs, which is the primary concern right now. :^)
Allow everything to be built from the top level directory with just
'make', cleaned with 'make clean', and installed with 'make
install'. Also support these in any particular subdirectory.
Specifying 'make VERBOSE=1' will print each ld/g++/etc. command as
it runs.
Kernel and early host tools (IPCCompiler, etc.) are built as
object.host.o so that they don't conflict with other things built
with the cross-compiler.
Toolchain build makes git repo out of toolchain to allow patching
Fix Makefiles to use new libstdc++
Parameterize BuildIt with default TARGET of i686 but arm is experimental
While setting up the main thread stack for a new process, we'd incur
some zero-fill page faults. This was to be expected, since we allocate
a huge stack but lazily populate it with physical pages.
The problem is that page fault handlers may enable interrupts in order
to grab a VMObject lock (or to page in from an inode.)
During exec(), a process is reorganizing itself and will be in a very
unrunnable state if the scheduler should interrupt it and then later
ask it to run again. Which is exactly what happens if the process gets
pre-empted while the new stack's zero-fill page fault grabs the lock.
This patch fixes the issue by creating new main thread stacks before
disabling interrupts and going into the critical part of exec().
This patch introduces the second MenuApplet: Audio. To make this work,
menu applet windows now also receive mouse events.
There's still some problem with mute/unmute via clicking not actually
working, but the call goes from the applet program over IPC to the
AudioServer, where something goes wrong with the state change message.
Need to look at that separately.
Anyways, it's pretty cool to have more applets running in their own
separate processes. :^)
To enforce this, we create two separate mappings of the same underlying
physical page. A writable mapping for the kernel, and a read-only one
for userspace (the one returned by sys$get_kernel_info_page.)
This patch adds a single "kernel info page" that is mappable read-only
by any process and contains the current time of day.
This is then used to implement a version of gettimeofday() that doesn't
have to make a syscall.
To protect against race condition issues, the info page also has a
serial number which is incremented whenever the kernel updates the
contents of the page. Make sure to verify that the serial number is the
same before and after reading the information you want from the page.
Every process keeps its own ELF executable mapped in memory in case we
need to do symbol lookup (for backtraces, etc.)
Until now, it was mapped in a way that made it accessible to the
program, despite the program not having mapped it itself.
I don't really see a need for userspace to have access to this right
now, so let's lock things down a little bit.
This patch makes it inaccessible to userspace and exposes that fact
through /proc/PID/vm (per-region "user_accessible" flag.)
Currently only Ext2FS and TmpFS supports InodeWatchers. We now fail
with ENOTSUPP if watch_file() is called on e.g ProcFS.
This fixes an issue with FileManager chewing up all the CPU when /proc
was opened. Watchers don't keep the watched Inode open, and when they
close, the watcher FD will EOF.
Since nothing else kept /proc open in FileManager, the watchers created
for it would EOF immediately, causing a refresh over and over.
Fixes#879.
I had to change the layout of RegisterDump a little bit to make the new
IRQ entry points work. This broke get_register_dump_from_stack() which
was expecting the RegisterDump to be badly aligned due to a goofy extra
16 bits which are no longer there.
Instead of having a common entry point and looking at the PIC ISR to
figure out which IRQ we're servicing, just make a separate entryway
for each IRQ that pushes the IRQ number and jumps to a common routine.
This fixes a weird issue where incoming network packets would sometimes
cause the mouse to stop working. I didn't track it down further than
realizing we were sometimes EOI'ing the wrong IRQ.
The majority of the time in NetworkTask was being spent in allocating
and deallocating KBuffers for each incoming packet.
We'll now keep up to 100 buffers around and reuse them for new packets
if the next incoming packet fits in an old buffer. This is pretty
naively implemented but definitely cuts down on time spent here.
Since stream sockets don't actually need to deliver packets-at-a-time
data in recvfrom(), they can just buffer the payload bytes instead.
This avoids keeping one KBuffer per incoming packet in the receive
queue, which was a big performance issue in ProtocolServer.
This code is definitely not perfect and is something we should keep
improving over time.
We begin with a simple treeview that shows a recorded profile.
To record and view a profile of a process with <PID>, simply do this:
$ profile <PID> on
... wait while PID does something interesting ...
$ profile <PID> off
$ cat /proc/profile > my-profile.prof
$ ProfileViewer my-profile.prof
This is pretty large but since it's not eagerly committed to physical
pages, it's probably okay. I'm bumping this after running out of space
for a ProtocolServer profile run :^)
The kernel now supports basic profiling of all the threads in a process
by calling profiling_enable(pid_t). You finish the profiling by calling
profiling_disable(pid_t).
This all works by recording thread stacks when the timer interrupt
fires and the current thread is in a process being profiled.
Note that symbolication is deferred until profiling_disable() to avoid
adding more noise than necessary to the profile.
A simple "/bin/profile" command is included here that can be used to
start/stop profiling like so:
$ profile 10 on
... wait ...
$ profile 10 off
After a profile has been recorded, it can be fetched in /proc/profile
There are various limits (or "bugs") on this mechanism at the moment:
- Only one process can be profiled at a time.
- We allocate 8MB for the samples, if you use more space, things will
not work, and probably break a bit.
- Things will probably fall apart if the profiled process dies during
profiling, or while extracing /proc/profile
Okay, one "dunce hat" point for me. The new PTY majors conflicted with
PATAChannel. Now they are 200 for master and 201 for slave, not used
by anything else.. I hope!
The 1st master pseudoterminal had the same device ID as /dev/psaux
which was caught by an assertion in Device VFS registration.
This would cause us to overwrite the PS/2 mouse device registration
which was definitely not good.
This patch makes SharedBuffer use a PurgeableVMObject as its underlying
memory object.
A new syscall is added to control the volatile flag of a SharedBuffer.
It's now possible to get purgeable memory by using mmap(MAP_PURGEABLE).
Purgeable memory has a "volatile" flag that can be set using madvise():
- madvise(..., MADV_SET_VOLATILE)
- madvise(..., MADV_SET_NONVOLATILE)
When in the "volatile" state, the kernel may take away the underlying
physical memory pages at any time, without notifying the owner.
This gives you a guilt discount when caching very large things. :^)
Setting a purgeable region to non-volatile will return whether or not
the memory has been taken away by the kernel while being volatile.
Basically, if madvise(..., MADV_SET_NONVOLATILE) returns 1, that means
the memory was purged while volatile, and whatever was in that piece
of memory needs to be reconstructed before use.
Using int was a mistake. This patch changes String, StringImpl,
StringView and StringBuilder to use size_t instead of int for lengths.
Obviously a lot of code needs to change as a result of this.
The main thread of each kernel/user process will take the name of
the process. Extra threads will get a fancy new name
"ProcessName[<tid>]".
Thread backtraces now list the thread name in addtion to tid.
Add the thread name to /proc/all (should it get its own proc
file?).
Add two new syscalls, set_thread_name and get_thread_name.
Now that we have proper wait queues to drive waiter wakeup, we can use
the wake actions to break out of the scheduler's idle loop when we've
got a thread to run.
This patch makes it possible to make memory regions non-readable.
This is enforced using the "present" bit in the page tables.
A process that hits an not-present page fault in a non-readable
region will be crashed.
This patch introduces code generation for the WindowServer IPC with
its clients. The client/server endpoints are defined by the two .ipc
files in Servers/WindowServer/: WindowServer.ipc and WindowClient.ipc
It now becomes significantly easier to add features and capabilities
to WindowServer since you don't have to know nearly as much about all
the intricate paths that IPC messages take between LibGUI and WSWindow.
The new system also uses significantly less IPC bandwidth since we're
now doing packed serialization instead of passing fixed-sized structs
of ~600 bytes for each message.
Some repaint coalescing optimizations are lost in this conversion and
we'll need to look at how to implement those in the new world.
The old CoreIPC::Client::Connection and CoreIPC::Server::Connection
classes are removed by this patch and replaced by use of ConnectionNG,
which will be renamed eventually.
Goodbye, old WindowServer IPC. You served us well :^)
This patch adds these I/O counters to each thread:
- (Inode) file read bytes
- (Inode) file write bytes
- Unix socket read bytes
- Unix socket write bytes
- IPv4 socket read bytes
- IPv4 socket write bytes
These are then exposed in /proc/all and seen in SystemMonitor.
Instead of using the generic block mechanism, wait-queued threads now
go into the special Queued state.
This fixes an issue where signal dispatch would unblock a wait-queued
thread (because signal dispatch unblocks blocked threads) and cause
confusion since the thread only expected to be awoken by the queue.
Instead of waking up repeatedly to check if a disk operation has
finished, use a WaitQueue and wake it up in the IRQ handler.
This simplifies the device driver a bit, and makes it more responsive
as well :^)
There was a race window between instantiating a WaitQueueBlocker and
setting the thread state to Blocked. If a thread was preempted between
those steps, someone else might try to wake the wait queue and find an
unblocked thread in a wait queue, which is not sane.
The kernel's Lock class now uses a proper wait queue internally instead
of just having everyone wake up regularly to try to acquire the lock.
We also keep the donation mechanism, so that whenever someone tries to
take the lock and fails, that thread donates the remainder of its
timeslice to the current lock holder.
After unlocking a Lock, the unlocking thread calls WaitQueue::wake_one,
which unblocks the next thread in queue.
Kernel modules can now be unloaded via a syscall. They get a chance to
run some code of course. Before deallocating them, we call their
"module_fini" symbol.
It's now possible to load a .o file into the kernel via a syscall.
The kernel will perform all the necessary ELF relocations, and then
call the "module_init" symbol in the loaded module.
This defaults to 1500 for all adapters, but LoopbackAdapter increases
it to 65536 on construction.
If an IPv4 packet is larger than the MTU, we'll need to break it into
smaller fragments before transmitting it. This part is a FIXME. :^)
Turns out we can use abi::__cxa_demangle() for this, and all we need to
provide is sprintf(), realloc() and free(), so this patch exposes them.
We now have fully demangled C++ backtraces :^)
The fault was happening when retrieving a current backtrace for the
SystemServer process.
To generate a backtrace, we go into the paging scope of the process,
meaning we temporarily switch to using its page directory as our own.
Because kernel VM is allocated on demand, it's possible for a process's
mappings above the 3GB mark to be out-of-date. Normally this just gets
fixed up transparently by the page fault handler (which simply copies
the PDE from the canonical MM.kernel_page_directory() into the current
process.)
However, if the current kernel *stack* is in a piece of memory that
the backtraced process lacks up-to-date PDE's for, we still get a page
fault, but are unable to handle it, since the CPU wants to push to the
stack as part of calling the page fault handler. So we're screwed and
it's a triple-fault.
Fix this by always updating the kernel VM mappings before switching
into a paging scope. In practical terms, this is a 1KB memcpy() that
happens when generating a backtrace, or doing exec().
Previously it was not possible to see what each thread in a process was
up to, or how much CPU it was consuming. This patch fixes that.
SystemMonitor and "top" now show threads instead of just processes.
"ps" is gonna need some more fixing, but it at least builds for now.
Fixes#66.
Then only allow regions with that bit to be manipulated via munmap()
and mprotect(). This prevents messing with non-mmap()ed regions in
a process's address space (stacks, shared buffers, ...)
This patch adds ProtocolServer, a server that handles network requests
on behalf of its clients. The first protocol implemented is HTTP.
The idea here is to use a plug-in architecture where any number of
protocols can be added and implemented without having to mess around
with each client program that wants to use the protocol.
A simple client API is provided through LibProtocol::Client. :^)