LibWeb: Fix off-by-one in HTMLTokenizer::restore_to()

The difference should be between m_utf8_iterator and the
new position; if m_prev_utf8_iterator is used, one fewer
source position is popped than required.

This issue was not apparent on most pages since restore_to
is used for tokens such as <!doctype> that are normally
followed by a newline that resets the column to zero,
but it can be seen on pages with minified HTML.
MacDue 2022-02-13 14:08:53 +00:00 committed by Linus Groh
parent 62ad33af93
commit b193351a99

@@ -2726,15 +2726,13 @@ bool HTMLTokenizer::consumed_as_part_of_an_attribute() const
 void HTMLTokenizer::restore_to(Utf8CodePointIterator const& new_iterator)
 {
-    if (new_iterator != m_prev_utf8_iterator) {
-        auto diff = m_prev_utf8_iterator - new_iterator;
-        if (diff > 0) {
-            for (ssize_t i = 0; i < diff; ++i)
-                m_source_positions.take_last();
-        } else {
-            // Going forwards...?
-            TODO();
-        }
-    }
+    auto diff = m_utf8_iterator - new_iterator;
+    if (diff > 0) {
+        for (ssize_t i = 0; i < diff; ++i)
+            m_source_positions.take_last();
+    } else {
+        // Going forwards...?
+        TODO();
+    }
     m_utf8_iterator = new_iterator;
 }
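
The off-by-one described in the commit message can be reproduced with a small toy model. This is only a sketch: ToyTokenizer, consume() and positions_left() are hypothetical names, not LibWeb code. It assumes one source position is recorded per consumed code point and uses plain indices instead of Utf8CodePointIterator.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only: the members mirror the roles of the LibWeb members,
// but this struct is hypothetical and is not the real HTMLTokenizer.
struct ToyTokenizer {
    std::size_t current = 0;                   // role of m_utf8_iterator
    std::size_t previous = 0;                  // role of m_prev_utf8_iterator
    std::vector<std::size_t> source_positions; // assumed: one entry per consumed code point

    // Consume one code point and record the column reached.
    void consume()
    {
        previous = current;
        ++current;
        source_positions.push_back(current);
    }

    // Rewind to new_position, measuring the distance from `from`.
    void restore_to(std::size_t new_position, std::size_t from)
    {
        auto diff = static_cast<std::ptrdiff_t>(from) - static_cast<std::ptrdiff_t>(new_position);
        for (std::ptrdiff_t i = 0; i < diff; ++i)
            source_positions.pop_back();
        current = new_position;
    }
};

// Returns how many recorded positions remain after consuming `consumed`
// code points and rewinding to `target` (requires target <= consumed).
std::size_t positions_left(std::size_t consumed, std::size_t target, bool measure_from_previous)
{
    ToyTokenizer tokenizer;
    for (std::size_t i = 0; i < consumed; ++i)
        tokenizer.consume();
    auto from = measure_from_previous ? tokenizer.previous : tokenizer.current;
    tokenizer.restore_to(target, from);
    return tokenizer.source_positions.size();
}
```

In this model, consuming five code points and rewinding to index 2 leaves three recorded positions when the distance is measured from the previous position, and two when it is measured from the current one. Only the second result matches the rewound position, which is the behaviour the commit restores.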