ladybird

mirror of https://github.com/LadybirdBrowser/ladybird.git synced 2025-01-23 17:52:26 -05:00

Author	SHA1	Message	Date
Luke Wilde	c0a64f7317	LibWeb: Check for HTML integration points in the tree constructor This particularly implements these two points: - "If the adjusted current node is an HTML integration point and the token is a start tag" - "If the adjusted current node is an HTML integration point and the token is a character token" This also adds spec comments to the tree constructor.	2021-10-01 12:26:41 +02:00
Andreas Kling	831fdcaabc	LibWeb: Add the PageTransitionEvent interface and fire "pageshow" events We now fire "pageshow" events at the appropriate time during document loading (done by the parser.) Note that there are no corresponding "pagehide" events yet.	2021-09-26 12:47:51 +02:00
Andreas Kling	508edcd217	LibWeb: Add a "page showing" flag to documents This will be used to determine whether "pageshow" and "pagehide" events are appropriate. We won't actually make use of it until we implement more of history traversal and document unloading.	2021-09-26 12:47:51 +02:00
Andreas Kling	a2f77a2e39	LibWeb: Implement "update the current document readiness" from spec The only difference from what we were already doing is that setting the same ready state twice no longer fires a "readystatechange" event. I don't think that could happen in practice though.	2021-09-26 12:47:51 +02:00
Andreas Kling	8496024756	LibWeb: Store HTML document ready state as an enum	2021-09-26 12:47:51 +02:00
Andreas Kling	dbba0a520f	LibWeb: Allow HTML parser to delay delivery of the document "load" event We will now spin in "the end" until there are no more "things delaying the load event". Of course, nothing actually uses this yet, and there are a lot of things that need to.	2021-09-26 02:00:00 +02:00
Andreas Kling	e7af6af626	LibWeb: Implement more of HTMLParser::the_end() and bring closer to spec	2021-09-26 00:52:19 +02:00
Andreas Kling	e452550fda	LibWeb: Split out "The end" from the HTML parsing spec to a function Also add a spec link and some comments.	2021-09-26 00:04:33 +02:00
Andreas Kling	f67648f872	LibWeb: Rename HTMLDocumentParser => HTMLParser	2021-09-25 23:36:43 +02:00
Ben Wiederhake	32e98d0924	Libraries: Use AK::Variant default initialization where appropriate	2021-09-21 04:22:52 +04:30
Andreas Kling	c34da16089	LibWeb: Make <script src> loads partially async (by following the spec) Instead of firing up a network request and synchronously blocking for it to finish via a nested event loop, we now start an asynchronous request when encountering <script src>. Once the script load finishes (or fails), it gets executed at one of the synchronization points in the HTML parser. This solves some long-standing issues with random unexpected events getting dispatched in the middle of parsing.	2021-09-20 17:22:25 +02:00
Andreas Kling	e11ae33c66	LibWeb: Pop entire stack of open elements at the end of parsing	2021-09-20 17:22:25 +02:00
Andreas Kling	cb895edad4	LibWeb: Move Attribute into the DOM namespace	2021-09-16 01:39:47 +02:00
Andreas Kling	70398645f3	LibWeb: Improvements to error handling in HTML foreign content parsing Follow the spec more closely when encountering an invalid start or end tag during foreign content parsing.	2021-09-14 23:49:45 +02:00
Luke Wilde	f62477c093	LibWeb: Implement HTML fragment serialisation and use it in innerHTML The previous implementation was about a half implementation and was tied to Element::innerHTML. This separates it and puts it into HTMLDocumentParser, as this is in the parsing section of the spec. This provides a near finished HTML fragment serialisation algorithm, bar namespaces in attributes and the `is` value.	2021-09-14 02:09:18 +02:00
Idan Horowitz	4629f2e4ad	LibWeb: Add the Web::URL namespace and move URLEncoder to it This namespace will be used for all interfaces defined in the URL specification, like URL and URLSearchParams. This has the unfortunate side-effect of requiring us to use the fully qualified AK::URL name whenever we want to refer to the AK class, so this commit also fixes all such references.	2021-09-13 01:43:10 +02:00
Andreas Kling	882c7b1295	LibWeb: Spin the event loop in HTML parser until scripts can run Call HTML::EventLoop::spin_until() from the HTML parser when deciding whether we can run a script yet. Note that spin_until() actually doesn't do any work yet.	2021-09-09 02:30:54 +02:00
TheFightingCatfish	08359ba578	LibWeb: Fix regression of "contenteditable" attribute	2021-07-31 17:39:28 +02:00
ovf	898b8ffcb6	LibWeb: Avoid assertion failure on parsing numeric character references	2021-07-28 18:32:22 +02:00
ovf	13c7d55320	LibWeb: Fix parsing of character references in attribute values	2021-07-27 00:03:43 +02:00
Max Wipfli	ccae0cae45	LibWeb: Rename HTMLToken::doctype_data() => ensure_doctype_data() This renames the accessor to better reflect what it does, as this will allocate a DoctypeData struct if there is none.	2021-07-17 16:24:57 +04:30
Max Wipfli	519a1cdc22	LibWeb: Change HTMLToken storage architecture This completely changes how HTMLTokens store their data. Previously, space was allocated for all token types separately. Now, the HTMLToken's data is stored in just a String, two booleans and a Variant. This change reduces sizeof(HTMLToken) from 68 to 32. Also, this reduces raw tokenization time by around 20 to 50 percent, depending on the page. Full document parsing time (with HTMLDocumentParser, on a local HTML page without any dependency files) is reduced by between 4 and 20 percent, depending on the page. Since tokenizing HTML pages can easily generate 50'000 tokens and more, the storage has been designed in a way that avoids heap allocations where possible, while trying to reduce the size of the tokens. The only tokens which need to allocate on the heap are thus DOCTYPE tokens (max. 1 per document), and tag tokens (but only if they have attributes). This way, only around 5 percent of all tokens generated need to allocate on the heap (except for StringImpl allocations).	2021-07-17 16:24:57 +04:30
Max Wipfli	8a4c44db8c	LibWeb: Make HTMLTokens non-copyable	2021-07-17 16:24:57 +04:30
Max Wipfli	7eb294df0d	LibWeb: Move HTMLToken in HTMLDocumentParser This replaces a copy construction of an HTMLToken with a move(). This allows HTMLToken to be made non-copyable in a further commit.	2021-07-17 16:24:57 +04:30
Max Wipfli	2532bdfabf	LibWeb: Remove friend class declarations from HTMLToken Since all interaction with the HTMLToken class now happens over getters and setters, there is no more need for HTMLTokenizer and HTMLDocumentParser to have direct access to the members.	2021-07-17 16:24:57 +04:30
Max Wipfli	25cba4387b	LibWeb: Add HTMLToken(Type) constructor and use it	2021-07-17 16:24:57 +04:30
Max Wipfli	f2e3c770f9	LibWeb: Use setter for HTMLToken::m_{start,end}_position	2021-07-17 16:24:57 +04:30
Max Wipfli	8b31e41692	LibWeb: Change HTMLToken::m_doctype into named DoctypeData struct This is in preparation for an upcoming storage change of HTMLToken. In contrast to the other token types, the accessor can hand out a mutable reference to allow users to change parts of the DoctypeData easily.	2021-07-17 16:24:57 +04:30
Max Wipfli	918bde98b1	LibWeb: Hide implementation details of HTMLToken attribute list Previously, HTMLToken would expose the Vector<Attribute> directly to its users. In preparation for a future change, all users now use implementation-agnostic APIs which do not expose the Vector directly.	2021-07-17 16:24:57 +04:30
Max Wipfli	15d8635afc	LibWeb: User getter+setter for HTMLToken tag name and self-closing flag	2021-07-17 16:24:57 +04:30
Max Wipfli	1aeafcc58b	LibWeb: Use getter and setter for Character type HTMLTokens While storing the code point in a UTF-8 encoded String is horrendously inefficient, this problem will be addressed at a later stage.	2021-07-17 16:24:57 +04:30
Max Wipfli	e8e9426b4f	LibWeb: User getter and setter for Comment type HTMLTokens	2021-07-17 16:24:57 +04:30
Max Wipfli	f886aa15b8	LibWeb: Rename HTMLToken::AttributeBuilder struct to Attribute This does not contain StringBuilders anymore, so it can do with a simpler name: Attribute.	2021-07-17 16:24:57 +04:30
Max Wipfli	d82f3eb085	LibWeb: Make HTMLToken::{Position,AttributeBuilder} structs public There was and is no reason for those to be private. Making them public also allows us to explicitly specify the return type of some getters.	2021-07-17 16:24:57 +04:30
Max Wipfli	e22a34badb	LibWeb: Fix assertion failures in HTMLTokenizer The *TagName states are all very similar, so it seems to be correct to apply the fix from #8761 to all of those states. This fixes #8788.	2021-07-16 11:55:55 +02:00
Max Wipfli	2404ad6897	LibWeb: Fix assertion failure when tokenizing JS regex literals This fixes parsing the following regular expression: /</g; It also adds a simple script element to the HTMLTokenizer regression test, which also contains that specific regex.	2021-07-15 01:47:22 +02:00
Max Wipfli	bb2aed7d76	LibWeb: Correct behavior of Comment* states in HTMLTokenizer Previously, this would lead to assertion failures when parsing HTML comments. This fixes #8757.	2021-07-15 00:48:45 +02:00
Max Wipfli	af0b483123	LibWeb: VERIFY an empty builder when emitting tokens in HTMLTokenizer	2021-07-15 00:48:45 +02:00
Max Wipfli	045a6a566b	LibWeb: Remove unused HTMLTokenizer::m_input member variable	2021-07-14 23:03:36 +02:00
Max Wipfli	35f32ac170	LibWeb: Change HTMLToken.h to east const style	2021-07-14 23:03:36 +02:00
Max Wipfli	125982943a	LibWeb: Change HTMLTokenizer.{cpp,h} to east const style	2021-07-14 23:03:36 +02:00
Gunnar Beutner	300823c314	LibWeb: Use move() when enqueuing tokens in HTMLTokenizer We're not using the current token anymore once it's enqueued so let's use move() when enqueuing the tokens.	2021-07-14 23:03:36 +02:00
Gunnar Beutner	c3ad8e9a52	LibWeb: Remove StringBuilder from HTMLToken::m_comment_or_character	2021-07-14 23:03:36 +02:00
Gunnar Beutner	3aa202c432	LibWeb: Remove StringBuilder from HTMLToken::m_tag	2021-07-14 23:03:36 +02:00
Gunnar Beutner	901d71148b	LibWeb: Remove StringBuilders from HTMLToken::AttributeBuilder	2021-07-14 23:03:36 +02:00
Gunnar Beutner	992964aa7d	LibWeb: Remove StringBuilders from HTMLToken::m_doctype	2021-07-14 23:03:36 +02:00
Gunnar Beutner	2150609590	LibWeb: Remove more unused StringBuilders in HTMLToken These fields aren't read anywhere but I didn't feel like removing them outright.	2021-07-14 23:03:36 +02:00
Gunnar Beutner	d9e52997e2	LibWeb: Use an Optional<String> to track the last HTML start tag Using an HTMLToken object here is unnecessary because the only attribute we're interested in is the tag_name.	2021-07-14 23:03:36 +02:00
Luke	e9eae9d880	LibWeb: Add extracting character encoding from a meta content attribute Some Gmail emails contain this.	2021-07-13 20:23:44 +02:00
Andreas Kling	ee3a73ddbb	AK: Rename downcast<T> => verify_cast<T> This makes it much clearer what this cast actually does: it will VERIFY that the thing we're casting is a T (using is<T>()).	2021-06-24 19:57:01 +02:00

77 commits