LibWeb: Fix off-by-one in HTMLTokenizer::restore_to()

The difference should be taken between m_utf8_iterator and the
new position. If m_prev_utf8_iterator is used instead, one fewer
source position is popped than required.

This issue was not apparent on most pages, since restore_to is
used for tokens such as <!doctype> that are normally
followed by a newline, which resets the column to zero,
but it can be seen on pages with minified HTML.

MacDue 2022-02-13 14:08:53 +00:00 committed by Linus Groh
parent 62ad33af93
commit b193351a99

@@ -2726,15 +2726,13 @@ bool HTMLTokenizer::consumed_as_part_of_an_attribute() const
 void HTMLTokenizer::restore_to(Utf8CodePointIterator const& new_iterator)
 {
-    if (new_iterator != m_prev_utf8_iterator) {
-        auto diff = m_prev_utf8_iterator - new_iterator;
-        if (diff > 0) {
-            for (ssize_t i = 0; i < diff; ++i)
-                m_source_positions.take_last();
-        } else {
-            // Going forwards...?
-            TODO();
-        }
+    auto diff = m_utf8_iterator - new_iterator;
+    if (diff > 0) {
+        for (ssize_t i = 0; i < diff; ++i)
+            m_source_positions.take_last();
+    } else {
+        // Going forwards...?
+        TODO();
     }
     m_utf8_iterator = new_iterator;
 }