Commit graph

57681 commits

Author SHA1 Message Date
Timothy Flynn
1b4a23095c AK: Add a Utf16View::starts_with method
Based heavily on Utf8View::starts_with.
2024-01-04 12:43:10 +01:00
Timothy Flynn
c46ba7e68d AK: Allow constructing a UTF-16 view from a UTF-16 string literal
UTF-16 string literals are a language-level feature. It is convenient to
be able to construct a Utf16View from these strings.
2024-01-04 12:43:10 +01:00
Nico Weber
e16345555b LibPDF: Port 59b50fa43f8c2 to xref and object streams
0000440.pdf contains an xref stream object (at offset 3643676) starting:

```
294 0 obj <<
/Type /XRef
/Index [0 295]
/Size 295
```

and an object stream object (at offset 3640121) starting:

```
230 0 obj <<
/Type /ObjStm
/N 73
/First 614
```

In both cases, the `obj` and the `<<` are separated by non-newline
whitespace.

633e1632d0 made parse_indirect_value() tolerate this, but it updated
neither parse_xref_stream() (which parses xref streams) nor
parse_compressed_object_with_index() (which parses object streams),
despite all three changes being part of #14873.

Make parse_xref_stream() and parse_compressed_object_with_index()
call parse_indirect_value() to pick up the fix over there. It's a bit
less code too.

(0000440.pdf is the only PDF in my 1000 test PDFs that this helps,
somewhat surprisingly.)
2024-01-04 11:27:24 +01:00
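As an illustration of the tolerance described in e16345555b, here is a minimal Python sketch of an object-header matcher that accepts any PDF whitespace (or none) between `obj` and `<<`. The name `parse_object_header` and the regex are assumptions made for this example; they are not LibPDF's actual parser.

```python
import re

# PDF whitespace characters: NUL, tab, LF, form feed, CR, space.
PDF_WS = rb"[\x00\t\n\x0c\r ]"

# Hypothetical matcher: "<num> <gen> obj", optional whitespace, then "<<".
HEADER = re.compile(
    rb"(\d+)" + PDF_WS + rb"+(\d+)" + PDF_WS + rb"+obj" + PDF_WS + rb"*<<"
)


def parse_object_header(data: bytes):
    """Return (object number, generation) if data starts with an object header."""
    match = HEADER.match(data)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# `obj` and `<<` separated by a space, as in 0000440.pdf.
header_same_line = parse_object_header(b"294 0 obj <<\n/Type /XRef\n")
# `obj` and `<<` separated by a newline, the more common layout.
header_next_line = parse_object_header(b"230 0 obj\n<<\n/Type /ObjStm\n")
```

Both calls return the object number and generation, so a single shared routine covers both layouts.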
Shannon Booth
a545935997 LibWeb: Create XML Documents in DOMParser.parseFromString 2024-01-04 11:23:20 +01:00
Shannon Booth
cd156bad6b LibWeb: Create XMLDocuments in DOMImplementation.createDocument 2024-01-04 11:23:20 +01:00
Shannon Booth
36c145b197 LibWeb: Fix crash in DOMImplementation.createDocument for null namespace
We were blindly assuming that the namespace was non-null instead of
simply passing it through.
2024-01-04 11:23:20 +01:00
Shannon Booth
a028c87069 LibWeb: Add a default URL of about:blank to DOM::XMLDocument
This matches DOM::Document.
2024-01-04 11:23:20 +01:00
Nico Weber
9d69c5d434 LibPDF: Tolerate trailing whitespace after %%EOF marker
At first I tried implementing the quirk from PDF 1.7 Appendix H,
3.4.4, "File Trailer": """Acrobat viewers require only that the %%EOF
marker appear somewhere within the last 1024 bytes of the file."""
This would've been like #22548 but at end-of-file instead of at
start-of-file.

This helped a bunch of files, but also broke a bunch of files that
had more than 1024 bytes of stuff at the end, and it wouldn't have
helped 0000059.pdf, which has over 40k of \0 bytes after the %%EOF.
So just tolerate whitespace after the %%EOF line, and keep ignoring
an arbitrary amount of other stuff after that like before.

This helps:
* 0000599.pdf
  One trailing \0 byte after %%EOF. Due to that byte, the
  is_linearized() check fails and we go down the non-linearized
  codepath. But with this fix, that code path succeeds.
* 0000937.pdf
  Same.
* 0000055.pdf
  Has one space followed by a \n after %%EOF
* 0000059.pdf
  Has over 40kB of trailing \0 bytes

The following files keep working with it:
* 0000242.pdf
  5586 bytes of trailing HTML
* 0000336.pdf
  5586 bytes of trailing HTML fragment
* 0000136.pdf
  2054 bytes of trailing space characters
  This one kind of only worked by accident before since it found
  the %%EOF block before the final %%EOF block. Maybe this is
  even an intentional XRefStm compat hack? Anyways, now it
  finds the final block instead.
* 0000327.pdf
  11044 bytes of trailing HTML
2024-01-04 11:19:15 +01:00
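The whitespace tolerance from 9d69c5d434 can be sketched in a few lines of Python. This is an assumption-laden illustration, not the LibPDF code: `ends_with_eof_marker` is a hypothetical helper, and it only models the "whitespace after %%EOF" case, not the existing fallback that ignores other trailing data.

```python
# PDF whitespace characters: NUL, tab, LF, form feed, CR, space.
PDF_WHITESPACE = b"\x00\t\n\x0c\r "


def ends_with_eof_marker(data: bytes) -> bool:
    """Return True if data ends with %%EOF once trailing PDF whitespace is ignored."""
    return data.rstrip(PDF_WHITESPACE).endswith(b"%%EOF")


one_trailing_nul = ends_with_eof_marker(b"startxref\n123\n%%EOF\x00")
space_then_newline = ends_with_eof_marker(b"startxref\n123\n%%EOF \n")
many_trailing_nuls = ends_with_eof_marker(b"startxref\n123\n%%EOF" + b"\x00" * 40000)
trailing_html = ends_with_eof_marker(b"startxref\n123\n%%EOF\n<html></html>")
```

The first three inputs mirror the kinds of files listed under "This helps". The last one is not matched by this helper, which is why the commit keeps the older handling of arbitrary trailing data.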
Nico Weber
2d12647e29 LibPDF: Add FIXME for "was linearized PDF incrementally updated" check
It's pretty tricky to do, and also tricky with respect to skipping
trailing bytes after %%EOF: The check requires knowing the full size of
the PDF (which means web servers not sending content lengths are out),
but that size has to be after stripping trailing bytes, which normal
static file servers won't do. So PDF viewers would have to download the
last couple bytes of the PDF unconditionally, then strip trailing bytes
and use the count to figure out the final actual PDF size.

Luckily, we don't incrementally download PDFs from the net but
instead require all data to be available in one chunk, so it's
not currently a problem.
2024-01-04 11:19:15 +01:00
Nico Weber
1b45c3e127 LibPDF: Tolerate whitespace after xref and startxref
The spec isn't super clear on whether this is allowed:

"""Each cross-reference section shall begin with a line containing the
keyword xref. Following this line..."""

"""The two preceding lines shall contain, one per line and in order, the
keyword startxref and..."""

It kind of sounds like anything goes on both lines as long as they
contain `xref` and `startxref`.

In practice, both seem to always occur at the start of their line,
but in 0000780.pdf (and nowhere else), there's one space after each
keyword before the following linebreak, and this makes that file load.
2024-01-04 10:14:30 +01:00
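A rough Python sketch of the behavior 1b45c3e127 describes: after a keyword such as `xref` or `startxref`, skip whitespace that is not a line break, then require the end of the line. The function `match_keyword_line` is hypothetical and only loosely mirrors what a helper like Reader::consume_non_eol_whitespace() would enable.

```python
# Whitespace that does not end a line: NUL, tab, form feed, space.
NON_EOL_WHITESPACE = b"\x00\t\x0c "
EOL = b"\r\n"


def match_keyword_line(data: bytes, offset: int, keyword: bytes):
    """Return the offset just past the keyword's line, or None if it does not match."""
    if not data.startswith(keyword, offset):
        return None
    pos = offset + len(keyword)
    # Tolerate trailing spaces and similar after the keyword.
    while pos < len(data) and data[pos] in NON_EOL_WHITESPACE:
        pos += 1
    # The keyword must still be followed by a line break (CR, LF, or CRLF).
    if pos >= len(data) or data[pos] not in EOL:
        return None
    if data[pos:pos + 2] == b"\r\n":
        return pos + 2
    return pos + 1


after_xref = match_keyword_line(b"xref \n0 1\n", 0, b"xref")
after_startxref = match_keyword_line(b"startxref \r\n123\n", 0, b"startxref")
```

Here `after_xref` points at the `0 1` subsection header and `after_startxref` at the offset digits, even though each keyword is followed by a space.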
Nico Weber
efb37f7252 LibPDF: Add Reader::consume_non_eol_whitespace() 2024-01-04 10:14:30 +01:00
Nico Weber
c59e08123b LibPDF: Add a FIXME and a spec comment to Encoding::from_object() 2024-01-04 10:12:11 +01:00
Nico Weber
ad5fc0eda1 LibPDF: An Encoding's /Differences entry is optional
Per "TABLE 5.11 Entries in an encoding dictionary", /Differences is
optional.

(Per "Encodings for TrueType Fonts" in 5.5.5 Character Encoding,
nonsymbolic truetype fonts are even recommended to have "no Differences
array." But in practice, most seem to have it.)

Fixes crashes on:
* 0000001.pdf
* 0000574.pdf
* 0000337.pdf

All three don't render super great, but at least they no longer crash.
2024-01-04 10:12:11 +01:00
Shannon Booth
e9dfa61588 LibWeb: Use UTF-16 code unit offsets in Range::to_string
Similar to another problem we had in CharacterData, we were assuming
that the offsets were raw utf8 byte offsets into the data, instead of
utf16 code units. Fix this by using the substring helpers in
CharacterData to get the text data from the Range.

There are more instances of this issue around the place that we will
need to track down and add tests for, but this fixes one of them :^)

For the test included in this commit, we were previously returning:

llo💨😮

Instead of the expected:

llo💨😮 Wo
2024-01-04 10:10:44 +01:00
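To make the offset mismatch in e9dfa61588 concrete, the following Python sketch slices the same text once by UTF-16 code units and once by UTF-8 bytes. The input string and the offsets 2 and 12 are assumptions chosen for illustration; they are not taken from the actual test, and `utf16_substring` is a hypothetical helper, not the CharacterData API.

```python
def utf16_substring(text: str, start: int, end: int) -> str:
    """Slice text using UTF-16 code unit offsets, as DOM Range offsets are defined."""
    units = text.encode("utf-16-le")  # two bytes per code unit
    return units[2 * start:2 * end].decode("utf-16-le")


text = "Hello💨😮 World"

# Each emoji occupies two UTF-16 code units, so offsets 2..12 cover "llo💨😮 Wo".
by_utf16_units = utf16_substring(text, 2, 12)

# Treating the same offsets as UTF-8 byte offsets cuts the second emoji in half.
by_utf8_bytes = text.encode("utf-8")[2:12].decode("utf-8", errors="replace")
```

The two results differ as soon as the text contains characters outside the ASCII range, which is the class of bug the commit fixes.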
Shannon Booth
ee431e6911 LibWeb: Use WebIDL typedefs in Range/AbstractRange
In the public APIs which have their types exposed through IDL.
2024-01-04 10:10:44 +01:00
Aliaksandr Kalenik
b6123df492 LibWeb: Add support for start, center and end justify-content in GFC
Fixes https://github.com/SerenityOS/serenity/issues/22555
2024-01-04 09:47:20 +01:00
Aliaksandr Kalenik
56ff9bffae LibWeb: Support "normal" and "stretch" justify-content in CSS parser 2024-01-04 09:47:20 +01:00
Aliaksandr Kalenik
b395cfccb0 LibWeb: Add support for "align-content: normal" in CSS parser 2024-01-04 09:47:20 +01:00
Nico Weber
fa24fbf120 LibGfx/OpenType: Survive simple glyphs with 0 contours
These are valid per spec, and do sometimes occur in practice, e.g.
in embedded fonts in 0000550.pdf and 0000246.pdf in 0000.zip in the
PDFA test set.
2024-01-04 03:32:46 +01:00
Hugh Davenport
486c562c7e Taskbar: Use name of Ladybird as default QuickLaunch
The Browser app is now titled Ladybird, so the old default resulted in a
duplicate QuickLaunch entry after a fresh install followed by a reboot (and
likely after an upgrade too). Fix this by using the Ladybird title.
2024-01-03 21:30:14 +01:00
Tim Schumacher
707a36dd79 LibCompress/Brotli: Update the lookback buffer with uncompressed data
We previously skipped updating the lookback buffer when copying
uncompressed data, which resulted in a wrong total byte count.
With a wrong total byte count, our decompressor implementation
ended up choosing a wrong offset into the dictionary.
2024-01-03 17:54:36 +01:00
Ali Mohammad Pur
c3167afa3a LibTLS: Notify the client for app data as soon as some data is available
Previously we were waiting until the socket was no longer immediately
readable to notify the client, resulting in large buffers and longer
latency.
2024-01-03 14:59:59 +01:00
Ali Mohammad Pur
b1297a267c LibCrypto: Avoid branching in galois_multiply()
This makes GHash a little more than twice as fast.
2024-01-03 14:59:59 +01:00
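One common way to remove data-dependent branches from a GF(2^128) multiplication, as b1297a267c's title suggests, is to turn each tested bit into an all-ones or all-zero mask. The Python sketch below follows the standard GCM multiplication algorithm and is only an assumption about the general technique; the actual galois_multiply() in LibCrypto may be structured differently.

```python
# GCM reduction constant: 11100001 followed by 120 zero bits.
R = 0xE1 << 120


def galois_multiply_branchy(x: int, y: int) -> int:
    """Reference GF(2^128) multiply with data-dependent branches."""
    z, v = 0, y
    for i in range(127, -1, -1):  # GCM numbers bits from the most significant end
        if (x >> i) & 1:
            z ^= v
        if v & 1:
            v = (v >> 1) ^ R
        else:
            v >>= 1
    return z


def galois_multiply_branchless(x: int, y: int) -> int:
    """Same computation, with each condition replaced by a mask."""
    z, v = 0, y
    for i in range(127, -1, -1):
        z ^= v & -((x >> i) & 1)       # mask is -1 (all ones) or 0
        v = (v >> 1) ^ (R & -(v & 1))  # conditionally fold in R
    return z


x = 0x0123456789ABCDEF0123456789ABCDEF
y = 0xFEDCBA9876543210FEDCBA9876543210
same_result = galois_multiply_branchy(x, y) == galois_multiply_branchless(x, y)
```

In Python the mask trick has no performance benefit; it only shows the shape of the transformation that helps compiled code avoid unpredictable branches.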
Andreas Kling
27a294547d LibTLS: Add segmentation to the application buffer to avoid memcpy churn
We were previously doing a *lot* of unnecessary memcpy work when
transferring large files.

This patch addresses the issue by introducing a simple segmented buffer
with no additional work when appending new data, or when transferring out
of the buffer.
2024-01-03 14:59:59 +01:00
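The idea behind 27a294547d can be illustrated with a small segmented buffer in Python. This is a sketch under assumptions: `SegmentedBuffer` and its methods are hypothetical names, and unlike the buffer the commit describes, `take()` here joins segments into one bytes object for convenience.

```python
from collections import deque


class SegmentedBuffer:
    """Queue of chunks: appending never moves or copies data already stored."""

    def __init__(self) -> None:
        self._segments = deque()
        self._size = 0

    def __len__(self) -> int:
        return self._size

    def append(self, chunk: bytes) -> None:
        if chunk:
            self._segments.append(chunk)  # store the chunk as its own segment
            self._size += len(chunk)

    def take(self, max_bytes: int) -> bytes:
        """Remove and return up to max_bytes from the front of the buffer."""
        parts = []
        remaining = max_bytes
        while remaining > 0 and self._segments:
            head = self._segments[0]
            if len(head) <= remaining:
                parts.append(self._segments.popleft())
                remaining -= len(head)
            else:
                # Split only the segment that straddles the boundary.
                parts.append(head[:remaining])
                self._segments[0] = head[remaining:]
                remaining = 0
        data = b"".join(parts)
        self._size -= len(data)
        return data


buffer = SegmentedBuffer()
buffer.append(b"hello")
buffer.append(b"world")
first = buffer.take(7)
rest = buffer.take(10)
```

Compared with a single contiguous buffer, appending does not reallocate and copy everything stored so far, which is where the memcpy churn for large transfers comes from.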
Andreas Kling
40f87f0954 LibWeb: Stop timers when finalizing a Window or WorkerGlobalScope
This avoids an assertion that timers are not active when destroyed.
2024-01-03 12:56:18 +01:00
MacDue
b4eb66d9fe LibGfx: Simplify condition
This is just an XOR. No behaviour change.
2024-01-03 12:56:01 +01:00
MacDue
db51e80d50 LibGfx: Fix typo 2024-01-03 12:56:01 +01:00
MacDue
a9502396ee LibGfx: Remove somewhat outdated comment
Most of these optimizations have been tried now, so this comment is a
bit misleading.
2024-01-03 12:56:01 +01:00
MacDue
096bdb142b LibGfx: Speed up filling solid colors in path rasterizer
For solid color fills (with alpha = 255), the rasterizer now tracks
spans of solid colors within a scanline and fills the entire span with
a single call to fast_u32_fill().

This gave up to a 1.5x speedup drawing the Ghostscript Tiger within
SerenityOS.
2024-01-03 12:56:01 +01:00
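A simplified Python sketch of the span tracking described in 096bdb142b: runs of fully covered pixels are filled in one operation, while partially covered pixels are blended individually. The function `fill_scanline`, the coverage representation, and the `blend` callback are all assumptions for illustration; the slice assignment merely stands in for a call like fast_u32_fill().

```python
def fill_scanline(dest, coverage, color, blend):
    """Fill one scanline; coverage holds 0..255 per pixel, color is fully opaque."""
    x = 0
    width = len(coverage)
    while x < width:
        if coverage[x] == 255:
            # Extend the span while pixels stay fully covered.
            start = x
            while x < width and coverage[x] == 255:
                x += 1
            # One bulk fill for the whole span.
            dest[start:x] = [color] * (x - start)
        else:
            if coverage[x] > 0:
                dest[x] = blend(dest[x], color, coverage[x])
            x += 1


scanline = [0] * 6
# Toy blend that just records the coverage value, to keep the example checkable.
fill_scanline(scanline, [0, 128, 255, 255, 255, 64], 7, lambda dst, src, alpha: alpha)
```

After the call, the three fully covered pixels hold the solid color and the edge pixels hold whatever the blend function produced.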
MacDue
2fa488cfa9 LibGfx: Skip horizontal edges in path rasterizer
Only the vertical parts of edges are plotted (then accumulated
horizontally). Fully horizontal edges won't be plotted (and just result
in NaNs).
2024-01-03 12:56:01 +01:00
Nico Weber
4380be9d01 Tests/LibPDF: Add a PDF using the standard 14 fonts
Hand-written (with offsets fixed up by `mutool clean`).
Uses the default encoding for each font.  Manual test for now.

Byte strings generated with:

    python3 -c "for i in range(4):
        print('<' +
              ''.join('%02x' % r for r in range(i * 64, (i + 1) * 64)) +
              '>')"
2024-01-03 10:19:24 +01:00
Timothy Flynn
34160743dc LibIPC: Avoid redundant copy of every transferred IPC message
For every IPC message sent, we currently prepend the message size to the
IPC message buffer. This incurs the cost of copying the entire message
to its newly allocated position. Instead, reserve the bytes for the size
at the front of the buffer upon creation. Prevent dangerous access to
the buffer with specific public methods.
2024-01-03 10:17:00 +01:00
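The approach in 34160743dc, reserving room for the size at the front of the buffer instead of prepending it later, can be sketched in Python as follows. `SketchMessageBuffer` is a hypothetical stand-in rather than IPC::MessageBuffer, and the little-endian byte order is an assumption. The size check reflects the u32 limit mentioned in f2db700ae7.

```python
import struct

HEADER_SIZE = 4          # the message size is encoded as a u32
U32_MAX = 0xFFFF_FFFF


class SketchMessageBuffer:
    """Buffer that reserves its size header up front, so no prepend is needed."""

    def __init__(self) -> None:
        # Reserve the header bytes at creation time.
        self._data = bytearray(HEADER_SIZE)

    def append(self, payload: bytes) -> None:
        # Callers can only add payload; the header is not exposed.
        self._data += payload

    def finalize(self) -> bytearray:
        """Write the payload size into the reserved header and return the buffer."""
        size = len(self._data) - HEADER_SIZE
        if size > U32_MAX:
            raise ValueError("message too large for a u32 size field")
        struct.pack_into("<I", self._data, 0, size)
        return self._data


message = SketchMessageBuffer()
message.append(b"abc")
wire_bytes = message.finalize()
```

Because the header space already exists, writing the size touches four bytes instead of shifting the whole payload.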
Timothy Flynn
f2db700ae7 LibIPC: Ensure message sizes do not exceed the limits of u32
We encode the size as a u32, so let's be sure the size does not exceed
that storage. This is unlikely to happen, but no reason not to check.
2024-01-03 10:17:00 +01:00
Timothy Flynn
91558fa381 LibIPC+LibWeb: Add an IPC helper to transfer an IPC message buffer
This large block of code is repeated nearly verbatim in LibWeb. Move it
to a helper function that both LibIPC and LibWeb can defer to. This will
let us make changes to this method in a singular location going forward.

Note this is a bit of a regression for the MessagePort. It now suffers
from the same performance issue that IPC messages face - we prepend the
message size to the message buffer. This degradation is very temporary
though, as a fix is imminent, and this change makes that fix easier.
2024-01-03 10:17:00 +01:00
Timothy Flynn
bf15b66117 LibIPC: Use a simpler encoding for arithmetic values
This is less code, but mostly serves to reduce the number of methods to
be added to IPC::MessageBuffer in an upcoming patch.
2024-01-03 10:17:00 +01:00
Timothy Flynn
3adf01b816 LibIPC: Move MessageBuffer forward declaration from Stub.h to Forward.h
The type of MessageBuffer will be changing, and it was a bit awkward to
look around to find where the forward declaration was. This patch just
moves it to the obvious forwarding header.
2024-01-03 10:17:00 +01:00
Shannon Booth
fa1ef30985 LibWeb: Port Element::set_attribute_value from ByteString
Also make set_attribute_ns take a String instead of a FlyString, as
this is only used as an Attr value and no FlyString properties are used.
2024-01-03 10:13:47 +01:00
Shannon Booth
285bca1633 LibWeb: Use Optional<FlyString> const& in Element and NamedNodeMap
This is enabled with the newly added IDL generator support for
FlyStrings.
2024-01-03 10:13:47 +01:00
Shannon Booth
8ba3caf6ab LibWeb: Add support for generating FlyString parameters from IDL
We would previously always generate string parameters to pass through
to functions as a `String`. This works fine if the argument is a
`FlyString const&`, but falls apart for optional types where we need to
accept an `Optional<FlyString> const&`.

Support this by implementing a [FlyString] extended attribute which
if present results in the parameter for the function being generated
as a FlyString.
2024-01-03 10:13:47 +01:00
Shannon Booth
f32185420d LibWeb: Use FlyString where possible in NamedNodeMap
We cannot port over Optional<FlyString> until the IDL generator supports
passing that through as an argument (as opposed to an Optional<String>).

Change to FlyString where possible, and resolve any fallout as a result.
2024-01-03 10:13:47 +01:00
Nico Weber
0bb0c7dac2 LibPDF: Scan for PDF file start in first 1024 bytes
Other readers do this too, and files depend on this.

Fixes opening these four files from the PDFA 0000.zip dataset:

* 0000015.pdf
  Starts with `C:\web\webeuncet\_cat\_docs\_publics\` before header
* 0000408.pdf
  Starts with UTF-8 BOM
* 0000524.pdf
  Starts with 867 bytes of HTML containing a PHP backtrace
* 0000680.pdf
  Starts with `C:\web\webeuncet\_cat\_docs\_publics\` too
2024-01-03 10:12:35 +01:00
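A minimal Python sketch of the scan described in 0bb0c7dac2. The helper name `find_pdf_start` is hypothetical, and this version assumes the whole `%PDF-` marker must lie within the first 1024 bytes, which may not match LibPDF's exact boundary handling.

```python
def find_pdf_start(data: bytes, window: int = 1024):
    """Return the offset of the %PDF- header within the first `window` bytes, or None."""
    index = data[:window].find(b"%PDF-")
    return None if index < 0 else index


# A UTF-8 BOM before the header, as in 0000408.pdf.
with_bom = find_pdf_start(b"\xef\xbb\xbf%PDF-1.4\n")
# Leading junk longer than the window is not accepted.
too_far = find_pdf_start(b"x" * 2000 + b"%PDF-1.4\n")
```

Everything before the returned offset would then be skipped before the normal header parsing starts.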
Nico Weber
9495f64f91 LibPDF: Improve hex string parsing
A local (non-public) PDF I have lying around contains this in
a page's operator stream:

```
[<00b4003e> 3 <002600480051> 3 <005700550044004f0003> -29
<00330044> 3 <0055> -3 <004e0040> 4 <0003> -29 <004c00560003> -31
<0057004b> 4 <00480003> -37 <0050
>] TJ
```

That is, there's a newline in a hexstring after a character.

This led to `Parser error at offset 5184: Unexpected character`.

The spec says in 3.2.3 String Objects, Hexadecimal Strings:
"""Each pair of hexadecimal digits defines one byte of the string.
White-space characters (such as space, tab, carriage return, line feed,
and form feed) are ignored."""

But we didn't ignore whitespace before or after a character, only
in between the bytes.

The spec also says:
"""If the final digit of a hexadecimal string is missing—that is, if
there is an odd number of digits—the final digit is assumed to be 0."""

In that case, we were skipping the closing `>` twice -- or, more
accurately, we ignored the character after it too. This has been
wrong all the way back in #6974.

Add a test that fails if either of the two changes isn't present.
2024-01-02 22:13:21 +01:00
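The two spec rules quoted in 9495f64f91 can be illustrated with a small Python parser: whitespace is ignored anywhere inside the hex string, and a missing final digit is treated as 0. `parse_hex_string` is a hypothetical function written for this example, not the LibPDF implementation.

```python
# PDF whitespace characters: NUL, tab, LF, form feed, CR, space.
PDF_WHITESPACE = b"\x00\t\n\x0c\r "
HEX_DIGITS = b"0123456789abcdefABCDEF"


def parse_hex_string(data: bytes, offset: int = 0):
    """Parse a <...> hex string; return (decoded bytes, offset just past '>')."""
    if data[offset:offset + 1] != b"<":
        raise ValueError("expected '<'")
    digits = []
    pos = offset + 1
    while True:
        if pos >= len(data):
            raise ValueError("unterminated hex string")
        ch = data[pos]
        pos += 1
        if ch == ord(">"):
            break  # consume the closing '>' exactly once
        if ch in PDF_WHITESPACE:
            continue  # whitespace is ignored before, between, and after digits
        if ch not in HEX_DIGITS:
            raise ValueError("unexpected character in hex string")
        digits.append(int(chr(ch), 16))
    if len(digits) % 2 == 1:
        digits.append(0)  # a missing final digit is assumed to be 0
    decoded = bytes((digits[i] << 4) | digits[i + 1] for i in range(0, len(digits), 2))
    return decoded, pos


# A newline after the last digit, like the operator stream in the commit message.
newline_case = parse_hex_string(b"<0050\n>] TJ")
# An odd number of digits: the trailing 'A' becomes 0xA0.
odd_case = parse_hex_string(b"<901FA>")
```

In both cases the returned offset points at the byte directly after `>`, so the character following the string is not swallowed.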
Timothy Flynn
d11c7a19da LibWebView: Properly decode Base64-encoded strings as UTF-8
In the UI process, we encode generated HTML as Base64 to avoid having to
deal with things like arbitrarily nested quotes. The HTML is encoded as
UTF-8, and the raw bytes of that encoding are transcoded to Base64.

In the Inspector process, we are decoding the Base64 string using atob,
which has awkward non-Unicode limitations. The resulting string is only
a byte string. We must further decode the bytes as UTF-8, which we do
using TextDecoder.
2024-01-02 22:09:25 +01:00
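The two-step decode from d11c7a19da has a direct analogue in Python. The sketch below is not the Inspector's JavaScript: decoding as Latin-1 stands in for the byte string that atob produces, and the UTF-8 decode stands in for the TextDecoder step. The sample HTML is an assumption.

```python
import base64

html = "<p>héllo 💨</p>"

# UI process side: UTF-8 encode, then Base64 encode.
encoded = base64.b64encode(html.encode("utf-8")).decode("ascii")

# Inspector side, step 1: Base64 decode yields raw bytes.
raw = base64.b64decode(encoded)

# Interpreting each byte as one character (what a byte string amounts to) garbles the text.
as_byte_string = raw.decode("latin-1")

# Step 2: decoding the bytes as UTF-8 recovers the original HTML.
as_utf8 = raw.decode("utf-8")
```

Only the UTF-8 decode round-trips text containing non-ASCII characters; pure ASCII content would look correct either way, which is why this kind of bug can go unnoticed.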
Andreas Kling
0a05be69cf LibWeb: Update create_new_child_navigable() after spec fix
Now that https://github.com/whatwg/html/issues/9686 is fixed, let's
fix it the exact same way in our implementation. :^)
2024-01-02 21:47:36 +01:00
Aliaksandr Kalenik
49fcc5dcd8 LibWeb: Do not require box to be positioned to create stacking context
Instead of implementing stacking context painting order exactly as it
is defined in CSS2.2 "Appendix E. Elaborate description of Stacking
Contexts" we need to account for changes in the latest standards where
a box can establish a stacking context without being positioned, for
example, by having an opacity different from 1.

Fixes https://github.com/SerenityOS/serenity/issues/21137
2024-01-02 21:45:05 +01:00
Nico Weber
aa769d95df Meta/gn: Port f900957d26, d748edd994 2024-01-02 12:36:17 -05:00
Nico Weber
40138a4a3b Meta/gn: Build WebContent Qt bits only if enable_qt is set
With this, it's possible to build Ladybird without having Qt installed.
(Previously, the build required `moc` to exist.)

In fact, it's possible to build Ladybird without anything off `brew`
as long as you have `ninja` and `gn` (both of which don't have any
dependencies themselves and are easy to build).
2024-01-02 12:36:17 -05:00
Nico Weber
9621e77a31 Meta/gn: Move enable_qt arg into dedicated gni file
No behavior change.
2024-01-02 12:36:17 -05:00
Torstennator
82e85172e5 PixelPaint: Fix crash when started with path
This change fixes the initial tool selection when PixelPaint is started
with a path. Previously an already existing editor was expected when
the default tool was initially propagated - which was not the case if
PixelPaint was launched to directly load an existing image.
2024-01-02 17:14:38 +01:00
Lucas CHOLLET
4e09ee1f2f LibGfx/TIFF: Reject images that declare a sample with abnormal bit depth
Anything with a bit depth of zero or greater than 32 is outside our
working range, so let's reject them.
2024-01-02 06:52:50 -07:00