Each of these strings would previously rely on StringView's char const*
constructor overload, which would call __builtin_strlen on the string.
Since we now have operator ""sv, we can replace these with much simpler
versions. This opens the door to being able to remove
StringView(char const*).
No functional changes.
It was fragile to use the address of the body of the memory management
functions to disable memory auditing within them. Functions called from
these did not get exempted from the audits, so in some cases
UserspaceEmulator reported bogus heap buffer overflows.
Memory auditing did not work at all on Clang because when querying the
addresses, their offset was taken relative to the base of `.text` which
is not the first segment in the `R/RX/RW(RELRO)/RW(non-RELRO)` layout
produced by LLD.
Similarly to when setting metadata about the allocations, we now use the
`emuctl` system call to selectively suppress auditing when we reach
these functions. This ensures that functions called from `malloc` are
affected too, and no issues occur because of the inconsistency between
Clang and GCC memory layouts.
Previously each malloc size class would keep around a limited number of
unused blocks which were marked with MADV_SET_VOLATILE which could then
be reinitialized when additional blocks were needed.
This changes malloc() so that it also keeps around a number of blocks
without marking them with MADV_SET_VOLATILE. I termed these "hot"
blocks whereas blocks which were marked as MADV_SET_VOLATILE are called
"cold" blocks because they're more expensive to reinitialize.
In the worst case this could increase memory usage per process by
1MB when a program requests a bunch of memory and frees all of it.
Also, in order to make more efficient use of these unused blocks
they're now shared between size classes.
Problem:
- `size_classes` is a C-style array which makes it difficult to use in
algorithms.
- `all_of` algorithm is re-written for the specific implementation.
Solution:
- Change `size_classes` to be an `Array`.
- Directly use the generic `all_of` algorithm instead of
reimplementing.
SPDX License Identifiers are a more compact / standardized
way of representing file license information.
See: https://spdx.dev/resources/use/#identifiers
This was done with the `ambr` search and replace tool.
ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *
Previous a mallocation was marked as 'reachable' when any other
mallocation or memory region had a pointer to that mallocation. However
there could be the situation that two mallocations have pointers to each
other while still being unreachable from anywhere else. They would be
marked as 'reachable' regardless.
This patch replaces the old way of detemining whether a mallocation is
reachable by analyzing the dependencies of the different mallocations
using a graph-approach. Now mallocations are only reachable if pointed
to by other reachable mallocations or other memory regions.
A nice bonus is that this gets rid of a nested for_each_mallocation, so
the complexity of leak finding becomes linear instead of quadratic.
Accesses in the header (or trailing padding) of a malloc block should
not be associated with any mallocation since only the chunk-sized slots
actually get returned by malloc.
Basically, allow address-to-chunk lookup to fail, and handle such
failures gracefully at call sites.
Fixes#5706.
We don't want to audit accesses into the region *while* we're setting
up malloc tracking for it. Fetching the chunk size from the header
was tripping up the auditing code.
(...and ASSERT_NOT_REACHED => VERIFY_NOT_REACHED)
Since all of these checks are done in release builds as well,
let's rename them to VERIFY to prevent confusion, as everyone is
used to assertions being compiled out in release.
We can introduce a new ASSERT macro that is specifically for debug
checks, but I'm doing this wholesale conversion first since we've
accumulated thousands of these already, and it's not immediately
obvious which ones are suitable for ASSERT.