ladybird

mirror of https://github.com/LadybirdBrowser/ladybird.git synced 2025-01-23 17:52:26 -05:00

Author	SHA1	Message	Date
Max Wipfli	519a1cdc22	LibWeb: Change HTMLToken storage architecture This completely changes how HTMLTokens store their data. Previously, space was allocated for all token types separately. Now, the HTMLToken's data is stored in just a String, two booleans and a Variant. This change reduces sizeof(HTMLToken) from 68 to 32. Also, this reduces raw tokenization time by around 20 to 50 percent, depending on the page. Full document parsing time (with HTMLDocumentParser, on a local HTML page without any dependency files) is reduced by between 4 and 20 percent, depending on the page. Since tokenizing HTML pages can easily generate 50'000 tokens and more, the storage has been designed in a way that avoids heap allocations where possible, while trying to reduce the size of the tokens. The only tokens which need to allocate on the heap are thus DOCTYPE tokens (max. 1 per document), and tag tokens (but only if they have attributes). This way, only around 5 percent of all tokens generated need to allocate on the heap (except for StringImpl allocations).	2021-07-17 16:24:57 +04:30
Max Wipfli	8b31e41692	LibWeb: Change HTMLToken::m_doctype into named DoctypeData struct This is in preparation for an upcoming storage change of HTMLToken. In contrast to the other token types, the accessor can hand out a mutable reference to allow users to change parts of the DoctypeData easily.	2021-07-17 16:24:57 +04:30
Max Wipfli	918bde98b1	LibWeb: Hide implementation details of HTMLToken attribute list Previously, HTMLToken would expose the Vector<Attribute> directly to its users. In preparation for a future change, all users now use implementation-agnostic APIs which do not expose the Vector directly.	2021-07-17 16:24:57 +04:30
Max Wipfli	15d8635afc	LibWeb: Use getter+setter for HTMLToken tag name and self-closing flag	2021-07-17 16:24:57 +04:30
Gunnar Beutner	c3ad8e9a52	LibWeb: Remove StringBuilder from HTMLToken::m_comment_or_character	2021-07-14 23:03:36 +02:00
Gunnar Beutner	3aa202c432	LibWeb: Remove StringBuilder from HTMLToken::m_tag	2021-07-14 23:03:36 +02:00
Gunnar Beutner	901d71148b	LibWeb: Remove StringBuilders from HTMLToken::AttributeBuilder	2021-07-14 23:03:36 +02:00
Gunnar Beutner	992964aa7d	LibWeb: Remove StringBuilders from HTMLToken::m_doctype	2021-07-14 23:03:36 +02:00
Max Wipfli	932161e581	LibWeb: Be more forgiving when adding source positions in HTMLTokenizer This patch changes HTMLTokenizer::nth_last_position to not fail if the requested position is not available. Rather, it will just return (0-0). While this is not the correct solution, it prevents the tokenizer from crashing just because it cannot find a source position. This should only affect SyntaxHighlighter.	2021-06-05 00:32:28 +04:30
Ali Mohammad Pur	aa7939bc6c	LibWeb: Add position tracking information to HTML tokens	2021-05-20 22:06:45 +02:00
Brian Gianforcaro	1682f0b760	Everything: Move to SPDX license identifiers in all files. SPDX License Identifiers are a more compact / standardized way of representing file license information. See: https://spdx.dev/resources/use/#identifiers This was done with the `ambr` search and replace tool. ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *	2021-04-22 11:22:27 +02:00
Andreas Kling	5d180d1f99	Everywhere: Rename ASSERT => VERIFY (...and ASSERT_NOT_REACHED => VERIFY_NOT_REACHED) Since all of these checks are done in release builds as well, let's rename them to VERIFY to prevent confusion, as everyone is used to assertions being compiled out in release. We can introduce a new ASSERT macro that is specifically for debug checks, but I'm doing this wholesale conversion first since we've accumulated thousands of these already, and it's not immediately obvious which ones are suitable for ASSERT.	2021-02-23 20:56:54 +01:00
Andreas Kling	13d7c09125	Libraries: Move to Userland/Libraries/	2021-01-12 12:17:46 +01:00

13 commits
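
The storage layout described in commit 519a1cdc22 (one string, two booleans and a variant, with heap allocation only for DOCTYPE data and attribute lists) can be illustrated with a rough sketch. This is not the LibWeb code: it uses the C++ standard library rather than AK types, and every name in it (SketchToken, ensure_doctype_data, uses_heap_payload, the choice of which two booleans are stored) is a hypothetical stand-in chosen for illustration.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical types; not the actual LibWeb definitions.
struct Attribute {
    std::string name;
    std::string value;
};

struct DoctypeData {
    std::string name;
    std::string public_identifier;
    std::string system_identifier;
    bool force_quirks { false };
};

class SketchToken {
    using AttributeList = std::unique_ptr<std::vector<Attribute>>;
    using DoctypePtr = std::unique_ptr<DoctypeData>;

public:
    enum class Type { Character, Comment, StartTag, EndTag, Doctype };

    explicit SketchToken(Type type) : m_type(type) {}

    Type type() const { return m_type; }

    // One string serves as tag name, comment text or character data.
    const std::string& text() const { return m_text; }
    void set_text(std::string text) { m_text = std::move(text); }

    bool is_self_closing() const { return m_self_closing; }
    void set_self_closing(bool value) { m_self_closing = value; }

    // Attributes are reached through accessors, never by exposing the vector.
    std::size_t attribute_count() const
    {
        auto const* list = std::get_if<AttributeList>(&m_data);
        return (list && *list) ? (*list)->size() : 0;
    }

    // The attribute vector is allocated lazily, on the first attribute only.
    void add_attribute(Attribute attribute)
    {
        if (!std::holds_alternative<AttributeList>(m_data))
            m_data = std::make_unique<std::vector<Attribute>>();
        std::get<AttributeList>(m_data)->push_back(std::move(attribute));
    }

    // DOCTYPE data is handed out as a mutable reference and allocated on demand.
    DoctypeData& ensure_doctype_data()
    {
        if (!std::holds_alternative<DoctypePtr>(m_data))
            m_data = std::make_unique<DoctypeData>();
        return *std::get<DoctypePtr>(m_data);
    }

    // True once the token owns a separately allocated payload.
    bool uses_heap_payload() const
    {
        return !std::holds_alternative<std::monostate>(m_data);
    }

private:
    Type m_type;
    bool m_self_closing { false };              // assumed: first of the two booleans
    bool m_self_closing_acknowledged { false }; // assumed: second of the two booleans
    std::string m_text;
    std::variant<std::monostate, AttributeList, DoctypePtr> m_data;
};
```

In this sketch a character, comment or attribute-less tag token leaves the variant empty, so only DOCTYPE tokens and tags with attributes pay for a separate allocation, which mirrors the trade-off the commit message describes. The string member may still allocate for its own contents, comparable to the StringImpl allocations the commit excludes from its 5 percent figure.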