WordPress/wp-includes/html-api
Bernhard Reiter 8e5db640de HTML API: Avoid processing incomplete tokens.
Currently the Tag Processor assumes that an input document is a ''full'' HTML document. Because of this, if there's lingering content after the last tag match it will treat that content as plaintext and skip over it. This is fine for the Tag Processor because if there is lingering content that isn't a valid tag then there's nothing for `next_tag()` to match.

However, in order to support a number of feature expansions it is important to recognize that the remaining content ''may'' involve partial syntax elements, such as incomplete tags, attributes, or comments.

In this patch we're adding a mode inside the Tag Processor which will flip when we start parsing HTML syntax but the document finishes before the token does. This will provide the ability to:

- extend the input document,
- avoid misinterpreting syntax as text, and
- guess whether we have a complete document, and know for certain when we have an incomplete one.
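
As a rough illustration of the mode described above, the following is a minimal, language-neutral sketch written in Python. It is not the Tag Processor's code: `ToyTagScanner`, `State`, and `append()` are hypothetical names, and the toy only recognizes opening tags and ignores quoting rules inside attributes. It shows a scanner that stops at the start of an unfinished tag instead of treating it as text, and resumes once the input is extended.

```python
from enum import Enum
from typing import Optional


class State(Enum):
    READY = "ready"            # scanning may continue
    COMPLETE = "complete"      # end of input reached with no partial syntax left
    INCOMPLETE = "incomplete"  # input ended in the middle of a tag


class ToyTagScanner:
    """Hypothetical toy scanner; not the WP_HTML_Tag_Processor implementation."""

    def __init__(self, html: str) -> None:
        self.html = html
        self.pos = 0
        self.state = State.READY

    def append(self, chunk: str) -> None:
        """Extend the input document and allow scanning to resume."""
        self.html += chunk
        self.state = State.READY

    def next_tag(self) -> Optional[str]:
        """Return the next opening tag name, or None if nothing can be matched."""
        while True:
            start = self.html.find("<", self.pos)
            if start == -1:
                # Only plain text remains, so the document looks complete.
                self.pos = len(self.html)
                self.state = State.COMPLETE
                return None

            if start + 1 >= len(self.html):
                # A lone "<" at the end might be the start of a tag.
                self.pos = start
                self.state = State.INCOMPLETE
                return None

            if not self.html[start + 1].isalpha():
                # Not an opening tag in this toy (e.g. "</a>" or "1 < 2"); skip it.
                self.pos = start + 1
                continue

            end = self.html.find(">", start)
            if end == -1:
                # Syntax started but the input ended first: stay at the "<"
                # so the same token is re-scanned after more input arrives.
                self.pos = start
                self.state = State.INCOMPLETE
                return None

            self.pos = end + 1
            return self.html[start + 1:end].split()[0].rstrip("/").lower()
```

Under these assumptions, scanning `<p>Hello <a hre` yields `p`, then `None` with the state set to `State.INCOMPLETE`; after `append('f="#">link</a>')` the next call yields `a`. The key design choice is that the cursor does not move past the unfinished token, so no syntax is misread as text.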

In the process of building this patch a few bugs were identified and fixed in the Tag Processor, namely in the handling of incomplete syntax elements.

Props dmsnell, jonsurrell.
Fixes #60122, #60108.
Built from https://develop.svn.wordpress.org/trunk@57211


git-svn-id: http://core.svn.wordpress.org/trunk@56717 1a063a9b-81f0-0310-95a4-ce76da25c4cd
2023-12-20 17:52:30 +00:00
..
class-wp-html-active-formatting-elements.php HTML API: Apply linting changes to @TODO comments. 2023-12-20 12:36:31 +00:00
class-wp-html-attribute-token.php HTML API: Track spans of text with (offset, length) instead of (start, end). 2023-12-10 13:19:28 +00:00
class-wp-html-open-elements.php HTML API: Add support for H1-H6 elements in the HTML Processor. 2023-12-13 17:53:19 +00:00
class-wp-html-processor-state.php HTML API: Store current token reference in HTML Processor state. 2023-09-12 15:12:17 +00:00
class-wp-html-processor.php HTML API: Apply linting changes to @TODO comments. 2023-12-20 12:36:31 +00:00
class-wp-html-span.php HTML API: Track spans of text with (offset, length) instead of (start, end). 2023-12-10 13:19:28 +00:00
class-wp-html-tag-processor.php HTML API: Avoid processing incomplete tokens. 2023-12-20 17:52:30 +00:00
class-wp-html-text-replacement.php HTML API: Track spans of text with (offset, length) instead of (start, end). 2023-12-10 13:19:28 +00:00
class-wp-html-token.php HTML-API: Prevent unintended behavior when WP_HTML_Token is unserialized. 2023-12-06 16:05:19 +00:00
class-wp-html-unsupported-exception.php HTML-API: Introduce minimal HTML Processor. 2023-07-20 13:43:25 +00:00