46 commits

Author SHA1 Message Date
Nico Weber
4a57cbc98f LibX86: Remove some allocations from Instruction::to_string
Instruction::to_string used to copy a string literal into a String,
and then the String into a StringBuilder. Copy it to the StringBuilder
directly.

No measurable performance benefit, but it's also less code.
2020-08-16 19:38:55 +02:00
Nico Weber
fd73de684e X86+Profiler+UserspaceEmulator: Deduplicate ELFSymbolProvider to LibX86
From a layering perspective, it's maybe a bit surprising that the
X86::SymbolProvider implementation also lives in LibX86, but since
everything depends on LibELF via LibC, and since all current
LibX86-based disassemblers want to use ELFSymbolProvider, it makes
some amount of sense to put it there.
2020-08-16 19:37:58 +02:00
asynts
05abfc0e1f AK: Rename MakeUnsigned::type to MakeUnsigned::Type.
Also renames MakeSigned::type to MakeSigned::Type.
2020-08-06 10:33:16 +02:00
Nico Weber
5892e5a300 LibX86: Remove unused private fields m_segment, m_handler 2020-08-04 17:42:08 +02:00
Nico Weber
8e8cbe6a12 LibX86: FPU instructions never have a lock prefix 2020-07-30 16:53:33 +02:00
Nico Weber
8593bdb711 LibX86: Disassemble most remaining FPU instructions
Some of the remaining instructions have different behavior for
register and non-register ops. Since we already have the
two-level flags tables, model this by setting all handlers in
the two-level table to the register op handler, while the
first-level flags table stores the action for the non-reg handler.
2020-07-30 16:53:33 +02:00
Nico Weber
c99a3efc5b LibX86: Disassemble most FPU instructions starting with D9
Some of these don't just use the REG bits of the mod/rm byte
as slashes, but also the R/M bits to have up to 9 different
instructions per opcode/slash combination (1 opcode requires
that MOD != 11, the other 8 have MOD == 11).

This is done by making the slashes table two levels deep for
these cases.

Some of this is cosmetic (e.g. "FST st0" already has no effect,
but its bit pattern gets disassembled as "FNOP"), but for
most uses it isn't.

FSTENV and FSTCW have an extraordinary 0x9b prefix. This is
not yet handled in this patch.
2020-07-28 18:55:29 +02:00
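The field split described in c99a3efc5b can be sketched as follows. This is an illustration only, not the LibX86 code: `ModRM`, `decode_modrm`, `uses_second_level` and `table_slot` are hypothetical names, and the flat slot numbering is an assumed stand-in for the real two-level slashes table.

```cpp
#include <cstdint>

// The three fields of an x86 ModR/M byte:
// mod (bits 7-6), reg (bits 5-3), rm (bits 2-0).
struct ModRM {
    uint8_t mod;
    uint8_t reg;
    uint8_t rm;
};

constexpr ModRM decode_modrm(uint8_t byte)
{
    return ModRM {
        static_cast<uint8_t>((byte >> 6) & 3),
        static_cast<uint8_t>((byte >> 3) & 7),
        static_cast<uint8_t>(byte & 7),
    };
}

// Hypothetical helper: the second level (indexed by rm) only applies to
// register forms, i.e. when mod is 11 in binary.
constexpr bool uses_second_level(ModRM modrm)
{
    return modrm.mod == 3;
}

// Hypothetical flat numbering for one opcode: slots 0..7 hold the memory
// forms (selected by reg), slots 8..71 hold the register forms (selected
// by reg and rm), giving 1 + 8 entries per opcode/slash combination.
constexpr unsigned table_slot(ModRM modrm)
{
    return uses_second_level(modrm) ? 8u + modrm.reg * 8u + modrm.rm : modrm.reg;
}
```

With this split, the byte pair D9 D0 (FNOP) has mod == 3, reg == 2 and rm == 0, so it lands in a register-form slot, while D9 10 (a memory form of slash 2) lands in slot 2.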
Nico Weber
f6db97b8a9 LibX86: Support disassembling a few FPU opcodes better 2020-07-26 11:29:03 +02:00
Andreas Kling
be5f42adea UserspaceEmulator+LibX86: Start tracking uninitialized memory :^)
This patch introduces the concept of shadow bits. For every byte of
memory there is a corresponding shadow byte that contains metadata
about that memory.

Initially, the only metadata is whether the byte has been initialized
or not. That's represented by the least significant shadow bit.

Shadow bits travel together with regular values throughout the entire
CPU and MMU emulation. There are two main helper classes to facilitate
this: ValueWithShadow and ValueAndShadowReference.

ValueWithShadow<T> is basically a struct { T value; T shadow; } whereas
ValueAndShadowReference<T> is struct { T& value; T& shadow; }.

The latter is used as a wrapper around general-purpose registers, since
they can't use the plain ValueWithShadow memory as we need to be able
to address individual 8-bit and 16-bit subregisters (EAX, AX, AL, AH).

Whenever a computation is made using uninitialized inputs, the result
is tainted and becomes uninitialized as well. This allows us to track
this state as it propagates throughout memory and registers.

This patch doesn't yet keep track of tainted flags; that will be an
important upcoming improvement to this.

I'm sure I've messed up some things here and there, but it seems to
basically work, so we have a place to start! :^)
2020-07-21 02:37:29 +02:00
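A minimal sketch of the taint propagation described in be5f42adea. Only the `ValueWithShadow` name and its two-field layout come from the commit message. The encoding (0x01 in every shadow byte means "initialized") and the helpers `fully_initialized_shadow`, `is_initialized` and `tainted_add` are assumptions made for illustration.

```cpp
#include <cstdint>

// Layout as described in the commit message: a value plus same-sized shadow.
template<typename T>
struct ValueWithShadow {
    T value;
    T shadow;
};

// Assumed encoding: the least significant bit of each shadow byte is set
// when the corresponding value byte has been initialized.
template<typename T>
constexpr T fully_initialized_shadow()
{
    return static_cast<T>(0x0101010101010101ULL);
}

template<typename T>
constexpr bool is_initialized(ValueWithShadow<T> v)
{
    return (v.shadow & fully_initialized_shadow<T>()) == fully_initialized_shadow<T>();
}

// Hypothetical operation: if either input is uninitialized, the whole
// result is tainted and marked uninitialized as well.
template<typename T>
constexpr ValueWithShadow<T> tainted_add(ValueWithShadow<T> a, ValueWithShadow<T> b)
{
    return {
        static_cast<T>(a.value + b.value),
        static_cast<T>((is_initialized(a) && is_initialized(b)) ? fully_initialized_shadow<T>() : T(0)),
    };
}
```

Adding two fully initialized 32-bit values yields an initialized result. If one operand has a cleared shadow, the sum is still computed but its shadow is cleared too.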
Andreas Kling
036ce64cef LibX86: Don't cache whether instructions have a sub-opcode
We can just check if the first opcode byte is 0x0f to know this.
2020-07-15 13:42:15 +02:00
Andreas Kling
6a926a8c61 LibX86+UserspaceEmulator: Don't store a32 in MemoryOrRegisterReference
The a32 bit tells us whether a memory address is 32-bit or not.
We already have this information in Instruction, so just plumb that
around instead of double-caching the bit.
2020-07-15 13:42:15 +02:00
Andreas Kling
bc66221ee3 LibX86: Don't store the prefix/imm1/imm2 byte counts individually
We can shrink and simplify Instruction a bit by combining these into
a single "extra bytes" count.
2020-07-15 13:42:15 +02:00
Andreas Kling
4f8e86ad67 LibX86: Remove Instruction::m_handler
We can fetch the handler via Instruction::m_descriptor.
2020-07-15 13:42:15 +02:00
Andreas Kling
ef84865c8c LibX86+UserspaceEmulator: Devirtualize and inline more instruction code
Use some template hacks to force GCC to inline more of the instruction
decoding stuff into the UserspaceEmulator main execution loop.

This is my last optimization for today, and we've gone from ~60 seconds
when running "UserspaceEmulator UserspaceEmulator id" to ~8 seconds :^)
2020-07-13 21:00:51 +02:00
Andreas Kling
7ea36f5ed0 LibX86: Don't build_opcode_table_if_needed() every instruction decode
Instead, just do this once at startup. :^)
2020-07-13 20:42:37 +02:00
Andreas Kling
868db2313f LibX86: Apply aggressive inlining to Instruction decoding functions
These functions really benefit from being inlined together instead
of being separated.

This yields roughly a ~2x speedup.
2020-07-13 20:34:54 +02:00
Andreas Kling
a27473cbc2 UserspaceEmulator+LibX86: Turn on -O3 optimization for emulation code
Since this code is performance-sensitive, let's have the compiler do
whatever it can to help us with the most important files.

This yields a ~8% speedup.
2020-07-13 20:23:00 +02:00
Andreas Kling
f1bbc39148 LibX86: ALWAYS_INLINE some Instruction members 2020-07-13 13:50:22 +02:00
Andreas Kling
97f4cebc8d UserspaceEmulator+LibX86: Implement the LEA instruction
This piggybacks nicely on Instruction's ModR/M resolution code. :^)
2020-07-11 23:57:14 +02:00
Andreas Kling
0cf7fd5268 UserspaceEmulator+LibX86: Implement all the forms of XOR
And they're all generic, which will make it easy to support more ops.
2020-07-10 20:20:27 +02:00
Andreas Kling
45bfdd0063 LibX86: Add a templatized way to resolve ModR/M memory references
Hopefully this will be flexible enough for our SoftCPU. :^)
2020-07-10 20:20:27 +02:00
Andreas Kling
3a1cf9505d LibX86: Store Instruction's segment prefix as Optional<SegmentRegister>
Instead of having a dedicated enum value for the empty state.
2020-07-10 20:20:27 +02:00
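The idea in 3a1cf9505d, expressed with the standard library rather than AK: `std::optional` stands in for AK's `Optional` here, and `effective_segment` is a hypothetical helper, not a LibX86 function. The sketch assumes C++17.

```cpp
#include <optional>

// Segment registers in their x86 encoding order. No "None" enumerator is
// needed because the empty state lives in the optional.
enum class SegmentRegister { ES, CS, SS, DS, FS, GS };

// Hypothetical helper: an absent prefix falls back to the instruction's
// default segment.
constexpr SegmentRegister effective_segment(std::optional<SegmentRegister> prefix, SegmentRegister default_segment)
{
    return prefix.value_or(default_segment);
}
```

Callers can then test for a prefix with `has_value()` instead of comparing against a sentinel enumerator.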
Andreas Kling
4d8683b632 UserspaceEmulator: Tidy up SoftCPU's general purpose registers
This patch adds a PartAddressableRegister type, which divides a 32-bit
value into separate parts needed for the EAX/AX/AL/AH register splits.

Clean up the code around register access to make it a little less
cumbersome to use.
2020-07-09 23:27:50 +02:00
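The EAX/AX/AL/AH split from 4d8683b632 can be illustrated with shifts and masks. This is a sketch under assumptions: `Register32` and its members are hypothetical names, and the actual PartAddressableRegister type may lay out its parts differently.

```cpp
#include <cstdint>

// Hypothetical stand-in for a part-addressable 32-bit register.
struct Register32 {
    uint32_t full; // e.g. EAX

    // Low 16 bits, e.g. AX.
    constexpr uint16_t low_u16() const { return static_cast<uint16_t>(full & 0xffff); }
    // Bits 0-7, e.g. AL.
    constexpr uint8_t low_u8() const { return static_cast<uint8_t>(full & 0xff); }
    // Bits 8-15, e.g. AH.
    constexpr uint8_t high_u8() const { return static_cast<uint8_t>((full >> 8) & 0xff); }

    // Returns a copy with bits 8-15 replaced; all other bits are untouched.
    constexpr Register32 with_high_u8(uint8_t value) const
    {
        return Register32 { (full & ~uint32_t(0xff00)) | (uint32_t(value) << 8) };
    }
};
```

For a register holding 0x12345678, the parts are AX = 0x5678, AL = 0x78 and AH = 0x56. Writing 0xAB to AH gives 0x1234AB78.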
Andreas Kling
6440e59ead LibX86: Expose some more things on X86::Instruction 2020-07-07 22:44:58 +02:00
Andreas Kling
7ab2a4dde7 LibX86: Add an abstract X86::Interpreter class
This abstract class has a pure virtual member function for all of the
X86 instructions. This can be used to implement.. something. :^)
2020-07-07 22:44:58 +02:00
Andreas Kling
b2a7943b4e LibX86: Disassemble the XADD instruction
This makes functrace usable again :^)
2020-06-28 21:10:53 +02:00
Sergey Bugaev
450a2a0f9c Build: Switch to CMake :^)
Closes https://github.com/SerenityOS/serenity/issues/2080
2020-05-14 20:15:18 +02:00
Linus Groh
57f68ac5d7 LibX86: Rename build0FSlash() to build_0f_slash()
As spotted by @heavyk this was missed in 57b2b96.
2020-05-07 12:22:36 +02:00
Andreas Kling
a75af443d4 LibX86: Simplify "register index to string" functions a bit 2020-05-04 13:49:15 +02:00
Andreas Kling
57b2b96a67 LibX86: Remove accidental camelCase in some names 2020-05-04 13:49:15 +02:00
Andreas Kling
fec52fa94b LibX86: Disassemble BSWAP 2020-04-30 22:15:16 +02:00
Andreas Kling
3cdf4cd204 LibX86: Use MakeUnsigned<T> from AK instead of making a custom one 2020-04-15 16:58:46 +02:00
Andreas Kling
e880e4c2d2 LibX86: Add a way for Instruction::to_string() to symbolicate addresses
This patch adds a pure virtual X86::SymbolProvider that can be passed
to Instruction::to_string(). If the instruction contains what appears
to be a program address, stringification will try to symbolicate that
address via the SymbolProvider.

This makes it possible (and very flexible) to add symbolication to
clients of the disassembler. :^)
2020-04-12 14:20:04 +02:00
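A rough sketch of the pattern from e880e4c2d2. Only the SymbolProvider name comes from the commit message. The `symbolicate` signature, `SingleSymbolProvider` and `format_address` are hypothetical and use std::string instead of AK types, so the real interface may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>
#include <utility>

// Hypothetical interface; the real X86::SymbolProvider may look different.
class SymbolProvider {
public:
    virtual ~SymbolProvider() = default;
    // Returns the name of the symbol containing `address` (empty if unknown)
    // and stores the distance from the symbol's start in `offset`.
    virtual std::string symbolicate(uint32_t address, uint32_t& offset) const = 0;
};

// Toy provider that knows exactly one symbol.
class SingleSymbolProvider final : public SymbolProvider {
public:
    SingleSymbolProvider(std::string name, uint32_t start, uint32_t size)
        : m_name(std::move(name)), m_start(start), m_size(size)
    {
        assert(size > 0);
    }

    std::string symbolicate(uint32_t address, uint32_t& offset) const override
    {
        if (address < m_start || address - m_start >= m_size)
            return {};
        offset = address - m_start;
        return m_name;
    }

private:
    std::string m_name;
    uint32_t m_start;
    uint32_t m_size;
};

// Formats an address and appends "<symbol+offset>" when a provider is given
// and it recognizes the address. Passing nullptr disables symbolication.
std::string format_address(uint32_t address, const SymbolProvider* provider)
{
    char buffer[16];
    std::snprintf(buffer, sizeof(buffer), "0x%08x", static_cast<unsigned>(address));
    std::string result = buffer;
    if (!provider)
        return result;
    uint32_t offset = 0;
    std::string symbol = provider->symbolicate(address, offset);
    if (symbol.empty())
        return result;
    result += " <" + symbol;
    if (offset != 0)
        result += "+" + std::to_string(offset);
    result += ">";
    return result;
}
```

Because the formatter only sees the abstract interface, each client (a profiler, an emulator) could supply its own lookup without the disassembler knowing where symbols come from.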
Andreas Kling
34d07e35bd LibX86: Decode RDRAND instruction
I was looking at Kernel::get_good_random_bytes() and wondering where
the RDRAND instruction was. :^)
2020-04-11 23:37:00 +02:00
Andreas Kling
8daddcfa0a LibX86: Fix duplicate '+' in SIB byte disassembly
For SIB bytes with base but no index, we were emitting two '+' chars
which looked very off.
2020-04-11 23:11:10 +02:00
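The kind of formatting that 8daddcfa0a fixes can be sketched like this. It is not the LibX86 implementation: `reg32_name` and `sib_to_string` are hypothetical, and the sketch deliberately ignores the "no base, disp32" case (base == 5 with mod == 0) as a simplifying assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// 32-bit register names indexed by their 3-bit register number.
static const char* reg32_name(unsigned index)
{
    static const char* const names[8] = { "eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi" };
    assert(index < 8);
    return names[index];
}

// Formats the base+index*scale part of a SIB byte:
// scale in bits 7-6, index in bits 5-3, base in bits 2-0.
std::string sib_to_string(uint8_t sib)
{
    unsigned scale = 1u << ((sib >> 6) & 3);
    unsigned index = (sib >> 3) & 7;
    unsigned base = sib & 7;

    std::string result = reg32_name(base);
    // Index 4 (binary 100) encodes "no index": emit nothing, not even a '+'.
    if (index != 4) {
        result += '+';
        result += reg32_name(index);
        if (scale != 1) {
            result += '*';
            result += std::to_string(scale);
        }
    }
    return result;
}
```

Emitting the '+' only inside the branch that also emits the index keeps the separator and the operand together, so a missing index can never leave a dangling '+'.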
Andreas Kling
4eceea7c62 LibX86: When there are multiple REPZ/REPNZ prefixes, the last one wins 2020-04-11 14:05:10 +02:00
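A sketch of the "last one wins" rule from 4eceea7c62, assuming 32-bit mode and the standard set of legacy prefix bytes. `RepPrefix`, `is_prefix` and `scan_rep_prefix` are hypothetical names, not taken from LibX86.

```cpp
#include <cstddef>
#include <cstdint>

enum class RepPrefix : uint8_t { None = 0, RepNZ = 0xf2, RepZ = 0xf3 };

// Legacy prefix bytes in 32-bit mode: LOCK, REPNZ, REPZ, the six segment
// overrides, and the operand-size and address-size overrides.
constexpr bool is_prefix(uint8_t byte)
{
    return byte == 0xf0 || byte == 0xf2 || byte == 0xf3
        || byte == 0x26 || byte == 0x2e || byte == 0x36 || byte == 0x3e
        || byte == 0x64 || byte == 0x65 || byte == 0x66 || byte == 0x67;
}

// Scans the leading prefix bytes and reports the REP prefix in effect.
// Each REPZ/REPNZ overwrites the previous one, so the last one wins.
constexpr RepPrefix scan_rep_prefix(const uint8_t* bytes, std::size_t length)
{
    RepPrefix rep = RepPrefix::None;
    for (std::size_t i = 0; i < length && is_prefix(bytes[i]); ++i) {
        if (bytes[i] == 0xf2)
            rep = RepPrefix::RepNZ;
        else if (bytes[i] == 0xf3)
            rep = RepPrefix::RepZ;
    }
    return rep;
}
```

For the byte sequence F3 F2 A4, the scan stops at the opcode byte A4 and reports REPNZ, since F2 came after F3.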
Andreas Kling
1924112d7d LibX86: Tolerate invalid segment register indices when disassembling
While #6 and #7 are not valid segment registers, they can still be
encoded in otherwise-valid instructions, so let's tolerate it.
2020-04-11 14:00:20 +02:00
Andreas Kling
d7d7a32d47 LibX86: Disassemble unknown opcodes as "db %#02x" 2020-04-11 13:57:28 +02:00
Andreas Kling
95df0847c5 LibX86: Decode PADDB, PADDW and PADDD 2020-04-11 13:57:20 +02:00
Andreas Kling
16455e91db LibX86: Don't choke on invalid LOCK prefixes for now
This might be interesting information later, but I'm not sure how to
encode it at the moment.
2020-04-11 13:53:12 +02:00
Andreas Kling
f115416db3 LibX86: Fix backwards arguments to ENTER imm16,imm8 2020-04-11 13:51:00 +02:00
Andreas Kling
cf7d042e0f LibX86: Add 8-bit CMPXCHG and allow LOCK CMPXCHG 2020-04-11 13:46:30 +02:00
Andreas Kling
2ce38d4699 LibX86: Support decoding basic MMX instructions like MOVQ 2020-04-11 13:42:18 +02:00
Andreas Kling
e5cde0082a LibX86: Run the instruction decoder in 32-bit mode by default
Let's assume a 32-bit execution environment unless otherwise specified.
2020-04-11 13:24:55 +02:00
Andreas Kling
8f503da93f LibX86: Remove some unnecessary stuff from Disassembler.h 2020-04-11 13:23:52 +02:00
Andreas Kling
32d83fdee4 LibX86: Add an X86 instruction decoder library + basic disassembler
This will be very useful for developer tools like ProfileView, and also
for future tools like debuggers and such. :^)
2020-04-11 13:16:17 +02:00