Commit Graph

2 Commits

Author SHA1 Message Date
dmsnell
9fc546b2ce HTML API: Add custom text decoder.
Provides a custom decoder for strings coming from HTML attributes and
markup. This custom decoder is necessary because of deficiencies in
PHP's `html_entity_decode()` function:

  - It isn't aware of 720 of the possible named character references in
    HTML, leaving many out that should be translated.

  - It isn't aware of the ambiguous ampersand rule, which allows
    conversion of character references in certain contexts when they
    are missing their closing `;`.

  - It doesn't draw a distinction for the ambiguous ampersand rule
    when decoding attribute values instead of markup values.

  - Use of `html_entity_decode()` requires manually passing non-default
    parameter values to ensure it decodes properly.
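
As a rough illustration of the attribute/markup distinction described above,
here is a minimal Python sketch. It is not the WordPress implementation: the
three-entry table and the function names are hypothetical, and a real decoder
needs the full HTML reference table with longest-match lookup.

```python
# Hypothetical, tiny subset of the legacy references that HTML permits
# without a closing ';'. The real table is far larger.
LEGACY_NO_SEMICOLON = {"amp": "&", "not": "¬", "copy": "©"}


def decode_reference(text, at, in_attribute):
    """Decode a named reference at text[at] == '&'.

    Returns (replacement, consumed_length), or None to keep the '&' literal.
    """
    for name, replacement in LEGACY_NO_SEMICOLON.items():
        if not text.startswith(name, at + 1):
            continue
        end = at + 1 + len(name)
        next_char = text[end:end + 1]
        if next_char == ";":
            return replacement, len(name) + 2
        # In attribute values, a semicolon-less match followed by '=' or an
        # ASCII alphanumeric is left as literal text.
        if in_attribute and (
            next_char == "=" or (next_char.isascii() and next_char.isalnum())
        ):
            return None
        return replacement, len(name) + 1
    return None


def decode(text, in_attribute):
    """Decode the references known to the sketch table in a single pass."""
    out = []
    i = 0
    while i < len(text):
        if text[i] == "&":
            match = decode_reference(text, i, in_attribute)
            if match is not None:
                out.append(match[0])
                i += match[1]
                continue
        out.append(text[i])
        i += 1
    return "".join(out)
```

Under these assumptions, `decode("?a=1&copy=2", in_attribute=True)` leaves the
query string untouched, while the same input decoded as markup text turns
`&copy` into `©`.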

This decoder also provides some conveniences, such as making a
single-pass, interruptible decode operation possible. This opens up
opportunities to optimize operations such as detecting whether a value
starts with a given prefix or contains a given substring without
decoding the entire value first.

Developed in https://github.com/WordPress/wordpress-develop/pull/6387
Discussed in https://core.trac.wordpress.org/ticket/61072

Props dmsnell, gziolo, jonsurrell, jorbin, westonruter, zieladam.
Fixes #61072.

Built from https://develop.svn.wordpress.org/trunk@58281


git-svn-id: http://core.svn.wordpress.org/trunk@57741 1a063a9b-81f0-0310-95a4-ce76da25c4cd
2024-06-02 15:16:13 +00:00
dmsnell
cbc1c955d8 Introduce Token Map: An optimized static translation class.
This patch introduces a new class: `WP_Token_Map`, designed for efficient
lookup and translation of static mappings between string keys or tokens, and
string replacements (for example, HTML character references).

The Token Map imposes certain restrictions on the byte length of the lookup
tokens and their replacements, but is a highly-optimized data structure for
mappings with a very high number of tokens.
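
To give a feel for this kind of lookup, here is a simplified Python sketch of
a prefix-grouped token table. The class name, the `key_length` parameter, and
the storage layout are assumptions made for illustration; `WP_Token_Map`
itself is written in PHP and its internals may differ.

```python
class TokenMapSketch:
    """Hypothetical static map from string tokens to string replacements."""

    def __init__(self, mappings, key_length=2):
        self.key_length = key_length
        self.groups = {}  # fixed-length prefix -> [(rest_of_token, replacement)]
        self.small = {}   # tokens shorter than key_length
        for token, replacement in mappings.items():
            if len(token) < key_length:
                self.small[token] = replacement
            else:
                prefix, rest = token[:key_length], token[key_length:]
                self.groups.setdefault(prefix, []).append((rest, replacement))
        # Longest remainder first, so the longest token wins on lookup.
        for entries in self.groups.values():
            entries.sort(key=lambda entry: len(entry[0]), reverse=True)

    def read_token(self, text, at=0):
        """Return (token, replacement) for the longest token at text[at:], or None."""
        prefix = text[at:at + self.key_length]
        for rest, replacement in self.groups.get(prefix, ()):
            if text.startswith(rest, at + self.key_length):
                return prefix + rest, replacement
        # Fall back to short tokens, longest first.
        for length in range(self.key_length - 1, 0, -1):
            candidate = text[at:at + length]
            if len(candidate) == length and candidate in self.small:
                return candidate, self.small[candidate]
        return None
```

Grouping by a short fixed-length prefix means a lookup only scans the handful
of tokens sharing that prefix rather than the whole table, which is the
general idea this sketch tries to convey.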

Developed in https://github.com/WordPress/wordpress-develop/pull/5373
Discussed in https://core.trac.wordpress.org/ticket/60698

Fixes #60698.
Props: dmsnell, gziolo, jonsurrell, jorbin.

Built from https://develop.svn.wordpress.org/trunk@58188


git-svn-id: http://core.svn.wordpress.org/trunk@57651 1a063a9b-81f0-0310-95a4-ce76da25c4cd
2024-05-23 19:56:08 +00:00