Emoji: Port the Twemoji regex to PHP.

Previously, `wp_encode_emoji()` and `wp_staticize_emoji()` used inaccurate regular expressions to find emoji, and transform then into HTML entities or `<img>`s, respectively. This would result in emoji not being correctly transformed, or occasionally, non-emoji being incorrectly transformed.

This commit adds a new `grunt` task - `grunt precommit:emoji`. It finds the regex in `twemoji.js`, transforms it into a PHP-friendly version, and adds it to `formatting.php`. This task is also automatically run by `grunt precommit`, when it detects that `twemoji.js` has changed.

The new regex requires features introduced in PCRE 8.32, which was introduced in PHP 5.4.14, though it was also backported to later releases of the PHP 5.3 series. For versions of PHP that don't support this, it will fall back to an updated version of the loose-matching regex.

For short posts, the performance difference between the old and new regex is negligible. As the posts get longer, however, the new method is exponentially faster.

Merges [41043], [41045], and [41046] to the 4.8 branch.

Fixes #35293.


Built from https://develop.svn.wordpress.org/branches/4.8@41069


git-svn-id: http://core.svn.wordpress.org/branches/4.8@40921 1a063a9b-81f0-0310-95a4-ce76da25c4cd
This commit is contained in:
Gary Pendergast 2017-07-18 03:48:38 +00:00
parent e875520cec
commit 471cb97374

File diff suppressed because one or more lines are too long