wpseek.com
A WordPress-centric search engine for devs and theme authors



wp_has_noncharacters › WordPress Function

Since: 6.9.0
Deprecated: n/a
wp_has_noncharacters ( $text )
Parameters:
  • (string) $text Are there noncharacters in this string?
    Required: Yes
Returns:
  • (bool) Whether noncharacters were found in the string.

Returns whether the given string contains Unicode noncharacters.

XML recommends against using noncharacters and HTML forbids their use in attribute names. Unicode recommends that they not be used in open exchange of data.

Noncharacters are code points within the following ranges:
  • U+FDD0–U+FDEF
  • U+FFFE–U+FFFF
  • U+1FFFE, U+1FFFF, U+2FFFE, U+2FFFF, …, U+10FFFE, U+10FFFF
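The ranges above can also be read as a numeric test on a single code point. The following is a minimal sketch, not part of WordPress: the helper name `is_noncharacter_code_point` is hypothetical, and it works on integer code points rather than on UTF-8 strings.

```php
<?php
/**
 * Hypothetical helper (not a WordPress function): reports whether a single
 * code point, given as an integer, is a Unicode noncharacter.
 */
function is_noncharacter_code_point( int $code_point ): bool {
	// Anything outside U+0000..U+10FFFF is not a code point at all.
	if ( $code_point < 0 || $code_point > 0x10FFFF ) {
		return false;
	}

	// The contiguous block U+FDD0..U+FDEF (32 code points).
	if ( $code_point >= 0xFDD0 && $code_point <= 0xFDEF ) {
		return true;
	}

	// The last two code points of each of the 17 planes: U+nFFFE and U+nFFFF.
	return 0xFFFE === ( $code_point & 0xFFFE );
}

// Count every noncharacter in the Unicode code space.
$noncharacter_count = 0;
for ( $cp = 0; $cp <= 0x10FFFF; $cp++ ) {
	if ( is_noncharacter_code_point( $cp ) ) {
		$noncharacter_count++;
	}
}
// 32 in the U+FDD0 block plus 2 in each of 17 planes gives 66 in total.
```

Masking with `0xFFFE` ignores the lowest bit, so a single comparison covers both `U+nFFFE` and `U+nFFFF` in every plane.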


Source

function wp_has_noncharacters( string $text ): bool {
	/*
	 * Match the UTF-8 byte sequences directly so malformed UTF-8 elsewhere
	 * in the subject does not cause PCRE's Unicode mode to reject the string.
	 */
	return 1 === preg_match(
		'~
			# U+FDD0-U+FDEF, U+FFFE-U+FFFF
			\xEF(?:\xB7[\x90-\xAF]|\xBF[\xBE\xBF])
			|
			# U+nFFFE/U+nFFFF
			(?:\xF0[\x9F\xAF\xBF]|[\xF1-\xF3][\x8F\x9F\xAF\xBF]|\xF4\x8F)\xBF[\xBE\xBF]
		~x',
		$text
	);
}
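
The comment in the source says the pattern matches raw UTF-8 bytes so that malformed input elsewhere in the string does not defeat the check. The standalone sketch below illustrates that point outside WordPress. The variable names are made up for the example, and the single-code-point patterns are simplified stand-ins for the full pattern above.

```php
<?php
// Check a few of the byte sequences named in the pattern's comments.
$bytes_fdd0   = bin2hex( "\u{FDD0}" );   // "efb790"   -> \xEF\xB7\x90
$bytes_fdef   = bin2hex( "\u{FDEF}" );   // "efb7af"   -> \xEF\xB7\xAF
$bytes_1fffe  = bin2hex( "\u{1FFFE}" );  // "f09fbfbe" -> \xF0\x9F\xBF\xBE
$bytes_10ffff = bin2hex( "\u{10FFFF}" ); // "f48fbfbf" -> \xF4\x8F\xBF\xBF

// A stray 0xFF byte makes the whole subject invalid UTF-8,
// although it still contains the noncharacter U+FFFE.
$subject = "\xFF" . "abc\u{FFFE}";

// Unicode mode (/u): PCRE validates the subject first and fails outright.
$unicode_result = preg_match( '/\x{FFFE}/u', $subject );
$unicode_error  = preg_last_error(); // Read before the next preg_* call resets it.

// Byte mode: no validation, so the three bytes of U+FFFE are still found.
$byte_result = preg_match( '/\xEF\xBF\xBE/', $subject );
```

Here `$unicode_result` is `false` with `PREG_BAD_UTF8_ERROR`, while `$byte_result` is `1`. A byte-level pattern therefore reports the noncharacter even when the surrounding text is not valid UTF-8.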