Commit graph

10 commits

Author SHA1 Message Date
Andreas Kling
0fecdb7904 disasm: Use make<X86::ELFSymbolProvider> instead of naked new 2020-08-17 13:12:46 +02:00
Nico Weber
f025204dfe disasm: For ELF inputs, pass an ELFSymbolProvider to disassembler
This lets disasm output contain the symbol names of call and jump
destinations:

    8048111:	e8 88 38 01 00       	call   805b99e <__cxa_atexit>
    ...
    8048150:	74 15                	je     8048167 <_start+0x4c>

The latter (the symbol of the current function with an offset) is
arguably more distracting than useful because you usually want to look
at the instruction at the absolute offset in this case, but the former
is very nice to have.

For reasons I do not understand, this cuts the time to run
`disasm /bin/id` in half, from ~1s to ~0.5s.
2020-08-16 19:37:58 +02:00
Nico Weber
a2b99dd3ea disasm: Print correct offset-relative jumps in ELF file disassembly 2020-08-14 10:29:41 +02:00
Nico Weber
6613a4cb8c disasm: Insert symbol names in disassembly stream
The symbol name insertion scheme is different from objdump -d's.
Compare the output on Build/Userland/id:

* disasm:

        ...
        _start (08048305-0804836b):
        08048305  push ebp
        ...
        08048366  call 0x0000df56

        0804836b  o16 nop
        0804836d  o16 nop
        0804836f  nop

        (deregister_tm_clones (08048370-08048370))

        08048370  mov eax, 0x080643e0
        ...
        _ZN2AK8Utf8ViewC1ERKNS_6StringE (0805d9b2-0805d9b7):
        _ZN2AK8Utf8ViewC2ERKNS_6StringE (0805d9b2-0805d9b7):
        0805d9b2  jmp 0x00014ff2

        0805d9b7  nop

* objdump -d:

        08048305 <_start>:
         8048305:	55                   	push   %ebp
        ...
         8048366:	e8 9b dc 00 00       	call   8056006 <exit>
         804836b:	66 90                	xchg   %ax,%ax
         804836d:	66 90                	xchg   %ax,%ax
         804836f:	90                   	nop

        08048370 <deregister_tm_clones>:
         8048370:	b8 e0 43 06 08       	mov    $0x80643e0,%eax
        ...
        0805d9b2 <_ZN2AK8Utf8ViewC1ERKNS_6StringE>:
         805d9b2:	e9 eb f6 ff ff       	jmp    805d0a2 <_ZN2AK10StringViewC1ERKNS_6StringE>
         805d9b7:	90                   	nop

Differences:

1. disasm can show multiple symbols that cover the same instructions.
   I've only seen this happen for C1/C2 (and D1/D2) ctor/dtor pairs,
   but it could conceivably happen with ICF as well.

2. disasm separates instructions that do not belong to a symbol with
   a newline, so that nop padding isn't shown as part of a function
   when it technically isn't.

3. disasm shows symbols that are skipped (due to having size 0)
   in parenthesis, separated from preceding and following instructions.
2020-08-10 11:48:10 +02:00
Nico Weber
9c136be08b disasm: For ELF files, disassemble .text section
Since disasm is built in lagom, this requires adding LibELF to lagom.
2020-08-09 21:12:54 +02:00
Linus Groh
c8d690d840 Userland: Use Core::ArgsParser for 'disasm' 2020-08-06 20:41:13 +02:00
Ben Wiederhake
538a3a2579 Userland: Add missing checks for MappedFile.is_valid() 2020-07-31 11:34:06 +02:00
Linus Groh
3153739d60 Userland: Add missing copyright header to disasm.cpp 2020-05-09 23:45:16 +02:00
Andreas Kling
e5cde0082a LibX86: Run the instruction decoder in 32-bit mode by default
Let's assume a 32-bit execution environment unless otherwise specified.
2020-04-11 13:24:55 +02:00
Andreas Kling
32d83fdee4 LibX86: Add an X86 instruction decoder library + basic disassembler
This will be very useful for developer tools like ProfileView, and also
for future tools like debuggers and such. :^)
2020-04-11 13:16:17 +02:00