mirror of https://github.com/LadybirdBrowser/ladybird.git (synced 2025-01-22 17:24:48 -05:00)
LibWeb: Fix numeric character reference at EOF leaking its last digit
Previously, if the NumericCharacterReferenceEnd state was reached when current_input_character was None, then the DONT_CONSUME_NEXT_INPUT_CHARACTER macro would restore back before the EOF, and allow the next state (after the SWITCH_TO_RETURN_STATE) to proceed with the last digit of the numeric character reference.

For example, with something like `&#1111`, before this commit the output would incorrectly be `<code point with the value 1111>1` instead of just `<code point with the value 1111>`.

Instead of putting the `if (current_input_character.has_value())` check inside NumericCharacterReferenceEnd directly, it was instead added to DONT_CONSUME_NEXT_INPUT_CHARACTER, because all usages of the macro benefit from this check, even if the other existing usage sites don't exhibit any bugs without it:

- In MarkupDeclarationOpen, if the current_input_character is EOF, then the previous character is always `!`, so restoring and then checking forward for strings like `--`, `DOCTYPE`, etc won't match and the BogusComment state will run one extra time (once for `!` and once for EOF) with no practical consequences. With the `has_value()` check, BogusComment will only run once with EOF.
- In AfterDOCTYPEName, ConsumeNextResult::RanOutOfCharacters can only occur when stopping at the insertion point, and because of how the code is structured, it is guaranteed that current_input_character is either `P` or `S`, so the `has_value()` check is irrelevant.
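The leak described above can be reproduced in a small standalone model. The sketch below is not Ladybird's tokenizer: `ToyCursor`, `Result`, and `tokenize_numeric_reference` are hypothetical names made up for illustration, the input is assumed to be ASCII, and a terminating `;` is not modelled. It only assumes, as the commit message implies, that reading EOF does not advance the saved "previous" position, so an unconditional restore at EOF rewinds to just before the last digit.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical toy cursor (not Ladybird code): remembers where the last
// successfully read character started, similar in spirit to m_prev_utf8_iterator.
struct ToyCursor {
    std::string_view input;
    std::size_t position = 0;
    std::size_t previous_position = 0;

    // Returns the next character, or nullopt at EOF.
    // At EOF nothing is consumed, so previous_position is left untouched.
    std::optional<char> next()
    {
        if (position >= input.size())
            return std::nullopt;
        previous_position = position;
        return input[position++];
    }

    void restore_to_previous() { position = previous_position; }
};

struct Result {
    unsigned code_point;  // value parsed from the decimal digits
    std::string trailing; // what the "return state" sees after the reference
};

// Parses input of the form "&#<digits>" followed by optional trailing text.
// guard_against_eof = false models the old macro (always restore);
// guard_against_eof = true models the has_value() check added by this commit.
Result tokenize_numeric_reference(std::string_view input, bool guard_against_eof)
{
    ToyCursor cursor { input };
    cursor.next(); // '&'
    cursor.next(); // '#'

    unsigned value = 0;
    std::optional<char> current;
    while ((current = cursor.next()) && *current >= '0' && *current <= '9')
        value = value * 10 + static_cast<unsigned>(*current - '0');

    // "Don't consume the next input character": put back whatever ended the digits.
    // At EOF there is nothing to put back, and restoring would rewind over the last digit.
    if (!guard_against_eof || current.has_value())
        cursor.restore_to_previous();

    std::string trailing;
    while (auto c = cursor.next())
        trailing.push_back(*c);
    return { value, trailing };
}
```

In this model, `tokenize_numeric_reference("&#1111", false)` yields code point 1111 with a trailing `"1"`, while passing `true` yields the same code point with nothing trailing. For input such as `"&#1111x"`, where a real character ends the digits, both modes leave `"x"` for the next state, which matches the commit's point that the guard only changes behavior at EOF.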
parent 752deaf6ef
commit df87a9689c
Notes: github-actions[bot], 2025-01-06 23:44:49 +00:00
  Author: https://github.com/squeek502
  Commit: https://github.com/LadybirdBrowser/ladybird/commit/df87a9689c1
  Pull-request: https://github.com/LadybirdBrowser/ladybird/pull/3163
  Reviewed-by: https://github.com/gmta ✅
3 changed files with 15 additions and 6 deletions

@@ -94,9 +94,10 @@ namespace Web::HTML {
         } \
     } while (0)
 
-#define DONT_CONSUME_NEXT_INPUT_CHARACTER \
-    do {                                  \
-        restore_to(m_prev_utf8_iterator); \
+#define DONT_CONSUME_NEXT_INPUT_CHARACTER        \
+    do {                                         \
+        if (current_input_character.has_value()) \
+            restore_to(m_prev_utf8_iterator);    \
     } while (0)
 
 #define ON(code_point) \

@@ -199,6 +199,15 @@ TEST_CASE(character_reference_in_attribute)
     END_ENUMERATION();
 }
 
+TEST_CASE(numeric_character_reference)
+{
+    auto tokens = run_tokenizer("&#1111"sv);
+    BEGIN_ENUMERATION(tokens);
+    EXPECT_CHARACTER_TOKEN(1111);
+    EXPECT_END_OF_FILE_TOKEN();
+    END_ENUMERATION();
+}
+
 TEST_CASE(comment)
 {
     auto tokens = run_tokenizer("<p><!-- This is a comment --></p>"sv);

@@ -2,8 +2,7 @@ Harness status: OK
 
 Found 63 tests
 
-62 Pass
-1 Fail
+63 Pass
 Pass html5lib_tests2.html e070301fb578bd639ecbc7ec720fa60222d05826
 Pass html5lib_tests2.html aaf24dabcb42470e447d241a40def0d136c12b93
 Pass html5lib_tests2.html b6c1142484570bb90c36e454ee193cca17bb618a
@@ -27,7 +26,7 @@ Pass html5lib_tests2.html 73b97cd984a62703ec54ec4a876ec32aa5fd3b8c
 Pass html5lib_tests2.html 2db9616ed62fc2a26056f3395459869cf556974d
 Pass html5lib_tests2.html b59aa1c714892618eaccd51696658887fcbd2045
 Pass html5lib_tests2.html 98818e7fda2506603bd208662613edb40297c2d3
-Fail html5lib_tests2.html e0c43080cf61c0696031bdb097bea4f2a647cfc2
+Pass html5lib_tests2.html e0c43080cf61c0696031bdb097bea4f2a647cfc2
 Pass html5lib_tests2.html f7753d80a422c40b5fa04d99e52d8ae83369757a
 Pass html5lib_tests2.html 7cbd584aef9508a90c98f80040078149a92ec869
 Pass html5lib_tests2.html e0f7f130b1e3653dd06f10f3492e4f0bf4cd3cfa