We have a new, improved string type coming up in AK (OOM aware, no null
state), and while it's going to use UTF-8, the name UTF8String is a
mouthful - so let's free up the String name by renaming the existing
class.
Making the old one have an annoying name will hopefully also help with
quick adoption :^)
Previously this was handled implicitly, as our implementation of Tar
would just stop processing input as soon as it found something invalid.
However, since we now error out as soon as something is found to be
wrong, we require proper handling for zero blocks, which aren't actually
fatal.
Otherwise, we end up propagating those dependencies into targets that
link against that library, which creates unnecessary link-time
dependencies.
Also included are changes to readd now missing dependencies to tools
that actually need them.
Before this change the behavior was, confusingly:
- never null-terminate if set_field() is passed a StringView.
- which would also not fail if the StringView is too large.
- require null-termination if set_field() is passed a String.
Not only are both of these wrong, having different behavior for those is
very confusing, and creating a String copy to force a type checker to
cause a string to be null-terminated is extremely weird.
The new behavior is to always null-terminate when possible, never
null-terminate if the last byte is used, and always verify that the
string will fit.
Each of these strings would previously rely on StringView's char const*
constructor overload, which would call __builtin_strlen on the string.
Since we now have operator ""sv, we can replace these with much simpler
versions. This opens the door to being able to remove
StringView(char const*).
No functional changes.
This commit moves the length calculations out to be directly on the
StringView users. This is an important step towards the goal of removing
StringView(char const*), as it moves the responsibility of calculating
the size of the string to the user of the StringView (which will prevent
naive uses causing OOB access).
glibc before 2.28 defines major() and minor() macros from sys/types.h.
This triggers a Lagom warning for old distros that use versions older
than that, such as Ubuntu 18.04. This fixes a break in the
compiler-explorer Lagom build, which is based off 18.04 docker
containers.
Since 8209c2b570 was added the requires
check for copy_characters_to_buffer matched StringViews as well, which
caused unexpected null bytes to be inserted for non null-terminated
fields.
Benefits:
- Braced-initialization prevents unknown narrowing conversions.
- Using designated initializers will result in a compiler error when a
member is skipped or forgotten.
Problem:
- `memset` is used to initialize data instead of using default
initialization.
Solution:
- Default initialize all member variables.
- Eliminate use of `memset` in favor of C++ braced initialization.
Problem:
- The getters and setters duplicate code for conversions.
- Getters are returning `const StringView` rather than non-`const`.
Solution:
- Factor out common code to helper functions.
- Return `StringView` as non-`const`.
SPDX License Identifiers are a more compact / standardized
way of representing file license information.
See: https://spdx.dev/resources/use/#identifiers
This was done with the `ambr` search and replace tool.
ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *
Since the central directory offset in the end of central directory
record and the local file offset in each central directory header
are user-controlled arbitary data, we have to bounds check them
before using them.
We now also check we have enough space in the incoming buffer for the
various signatures and optional (length specified) fields. This helps
prevents a possible heap overflow read.