Permalinks: Sanitize non-visible characters inside `sanitize_title_with_dashes()`.

This change prevents non-visible characters in titles from creating encoded values in permalinks, opting instead for the following replacement strategy:

* Non-visible non-zero-width characters are replaced with hyphens
* Non-visible zero-width characters are removed entirely
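
The two rules above can be sketched in isolation. This is only an illustration, not the code from this commit: `sketch_replace_non_visible()` is a hypothetical helper, and it assumes `rawurlencode()` and a case-insensitive replace are an acceptable stand-in for the encoding that `sanitize_title_with_dashes()` performs before it reaches these replacements. The octet lists are copied from the diff below.

```php
<?php
// Hypothetical helper, not part of WordPress. It mirrors only the two
// replacement steps described in this commit message.
function sketch_replace_non_visible( string $title ): string {
	// Percent-encode the title so each multibyte character becomes a run
	// of %XX octets (an assumption standing in for the real encoding step).
	$encoded = rawurlencode( $title );

	// Non-visible characters that display without a width: removed entirely.
	$zero_width = array(
		'%e2%80%8b', '%e2%80%8c', '%e2%80%8d', '%e2%80%8e', '%e2%80%8f',
		'%e2%80%aa', '%e2%80%ab', '%e2%80%ac', '%e2%80%ad', '%e2%80%ae',
		'%ef%bb%bf',
	);
	// rawurlencode() emits uppercase hex, so match case-insensitively.
	$encoded = str_ireplace( $zero_width, '', $encoded );

	// Non-visible characters that display with a width: replaced with a hyphen.
	$with_width = array(
		'%e2%80%80', '%e2%80%81', '%e2%80%82', '%e2%80%83', '%e2%80%84',
		'%e2%80%85', '%e2%80%86', '%e2%80%87', '%e2%80%88', '%e2%80%89',
		'%e2%80%8a', '%e2%80%a8', '%e2%80%a9', '%e2%80%af',
	);
	$encoded = str_ireplace( $with_width, '-', $encoded );

	// Decode again so the result is readable in this sketch.
	return rawurldecode( $encoded );
}

// U+200B (zero-width space) is dropped; U+2003 (em space) becomes a hyphen.
echo sketch_replace_non_visible( "hello\u{200B}world" ), "\n"; // helloworld
echo sketch_replace_non_visible( "hello\u{2003}world" ), "\n"; // hello-world
```

The sketch skips everything else the real function does (lowercasing, entity handling, collapsing repeated hyphens), so its output is not a finished slug.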

Included with this change are 64 additional PHPUnit assertions to confirm that only the targeted non-visible characters are sanitized as intended.

Before this change, these non-visible characters ended up in URLs as percent-encoded values. After this change, they are intentionally removed or replaced with hyphens.

Props costdev, dhanendran, hellofromtonya, paaljoachim, peterwilsoncc, poena, sergeybiryukov.

Fixes #47912.
Built from https://develop.svn.wordpress.org/trunk@51984


git-svn-id: http://core.svn.wordpress.org/trunk@51573 1a063a9b-81f0-0310-95a4-ce76da25c4cd
johnjamesjacoby 2021-11-02 18:47:57 +00:00
parent 3ab8d52d78
commit 43644069ea
2 changed files with 35 additions and 1 deletions


@@ -2288,11 +2288,45 @@ function sanitize_title_with_dashes( $title, $raw_title = '', $context = 'displa
'%cc%80',
'%cc%84',
'%cc%8c',
// Non-visible characters that display without a width.
'%e2%80%8b',
'%e2%80%8c',
'%e2%80%8d',
'%e2%80%8e',
'%e2%80%8f',
'%e2%80%aa',
'%e2%80%ab',
'%e2%80%ac',
'%e2%80%ad',
'%e2%80%ae',
'%ef%bb%bf',
),
'',
$title
);
// Convert non-visible characters that display with a width to hyphen.
$title = str_replace(
array(
'%e2%80%80',
'%e2%80%81',
'%e2%80%82',
'%e2%80%83',
'%e2%80%84',
'%e2%80%85',
'%e2%80%86',
'%e2%80%87',
'%e2%80%88',
'%e2%80%89',
'%e2%80%8a',
'%e2%80%a8',
'%e2%80%a9',
'%e2%80%af',
),
'-',
$title
);
// Convert &times to 'x'.
$title = str_replace( '%c3%97', 'x', $title );
}


@@ -16,7 +16,7 @@
*
* @global string $wp_version
*/
-$wp_version = '5.9-alpha-51982';
+$wp_version = '5.9-alpha-51984';
/**
* Holds the WordPress DB revision, increments when changes are made to the WordPress DB schema.