wpseek.com
A WordPress-centric search engine for devs and theme authors



wp_check_invalid_utf8 › WordPress Function

Since: 2.8.0
Deprecated: n/a
wp_check_invalid_utf8 ( $text, $strip = false )
Parameters: (2)
  • (string) $text String which is expected to be encoded as UTF-8 unless `blog_charset` is another encoding.
    Required: Yes
  • (bool) $strip Optional. Whether to replace invalid sequences of bytes with the Unicode replacement character (U+FFFD `�`). Default `false` returns an empty string for invalid UTF-8 inputs.
    Required: No
    Default: false
Returns:
  • (string) The checked text.
Defined at:
Codex:
Change Log:
  • 6.9.0

Checks for invalid UTF-8 in a string.

Note! This function only performs its work if the blog_charset is set to UTF-8. For all other values it returns the input text unchanged.

Note! Unless requested, this returns an empty string if the input contains any invalid UTF-8 sequences. To replace invalid byte sequences with the replacement character instead, pass true as the optional $strip parameter.

Consider using {@see} instead, which does not depend on the value of blog_charset.

Example:

    // If the blog_charset is latin1, this returns the input unchanged.
    $every_possible_input === wp_check_invalid_utf8( $every_possible_input );

    // The remaining examples assume a UTF-8 blog_charset.
    // Valid strings come through unchanged.
    'test' === wp_check_invalid_utf8( 'test' );

    $invalid = "the byte \xC0 is never allowed in a UTF-8 string.";

    // Invalid strings are rejected outright.
    '' === wp_check_invalid_utf8( $invalid );

    // “Stripping” invalid sequences produces the replacement character instead.
    "the byte \u{FFFD} is never allowed in a UTF-8 string." === wp_check_invalid_utf8( $invalid, true );
    'the byte � is never allowed in a UTF-8 string.' === wp_check_invalid_utf8( $invalid, true );
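The three outcomes described above (pass through, reject, replace) can be tried outside WordPress with a rough stand-in. This is only a sketch: `sketch_check_invalid_utf8()` and its `$site_is_utf8` argument are hypothetical names, not WordPress APIs, and it assumes the mbstring extension is loaded. It uses `mb_check_encoding()` and `mb_scrub()` in place of the WordPress helpers, so the exact number of replacement characters produced for a multi-byte invalid span may differ from what WordPress emits.

```php
<?php
/**
 * Hypothetical stand-in for wp_check_invalid_utf8(), NOT WordPress code.
 *
 * $site_is_utf8 is an assumption that replaces the blog_charset lookup.
 */
function sketch_check_invalid_utf8( string $text, bool $strip = false, bool $site_is_utf8 = true ): string {
	if ( '' === $text ) {
		return '';
	}

	// Non-UTF-8 sites and valid input both get the text back untouched.
	if ( ! $site_is_utf8 || mb_check_encoding( $text, 'UTF-8' ) ) {
		return $text;
	}

	// Default behavior: reject the whole string.
	if ( ! $strip ) {
		return '';
	}

	// mb_scrub() substitutes "?" by default, so temporarily switch the
	// substitute character to U+FFFD and restore the old setting afterwards.
	$previous = mb_substitute_character();
	mb_substitute_character( 0xFFFD );
	$scrubbed = mb_scrub( $text, 'UTF-8' );
	mb_substitute_character( $previous );

	return $scrubbed;
}

$invalid = "the byte \xC0 is never allowed in a UTF-8 string.";

var_dump( sketch_check_invalid_utf8( 'test' ) );                   // string(4) "test"
var_dump( sketch_check_invalid_utf8( $invalid ) );                 // string(0) ""
var_dump( sketch_check_invalid_utf8( $invalid, false, false ) === $invalid ); // bool(true)
var_dump( sketch_check_invalid_utf8( $invalid, true ) );           // valid UTF-8 containing U+FFFD
```

Passing the charset decision in as an argument is a choice made for this sketch so that each branch can be exercised in isolation; the real function reads it from the site configuration.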


Source

function wp_check_invalid_utf8( $text, $strip = false ) {
	$text = (string) $text;

	if ( 0 === strlen( $text ) ) {
		return '';
	}

	// Store the site charset as a static to avoid multiple calls to get_option().
	static $is_utf8 = null;
	if ( ! isset( $is_utf8 ) ) {
		$is_utf8 = is_utf8_charset();
	}

	if ( ! $is_utf8 || wp_is_valid_utf8( $text ) ) {
		return $text;
	}

	return $strip
		? wp_scrub_utf8( $text )
		: '';
}
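
The static variable in the source above means the charset lookup runs once per request, and every later call reuses the stored answer. A minimal sketch of that caching pattern follows. `sketch_lookup_charset_is_utf8()`, `sketch_cached_is_utf8()`, and the `$sketch_lookup_calls` counter are hypothetical names used only for illustration; the real lookup is `is_utf8_charset()`.

```php
<?php
// Hypothetical stand-in for is_utf8_charset(); counts how often it is called.
function sketch_lookup_charset_is_utf8(): bool {
	global $sketch_lookup_calls;
	$sketch_lookup_calls = ( $sketch_lookup_calls ?? 0 ) + 1;
	return true;
}

// Same caching shape as the source: look up once, then reuse the static value.
function sketch_cached_is_utf8(): bool {
	static $is_utf8 = null;
	if ( ! isset( $is_utf8 ) ) {
		$is_utf8 = sketch_lookup_charset_is_utf8();
	}
	return $is_utf8;
}

sketch_cached_is_utf8();
sketch_cached_is_utf8();
sketch_cached_is_utf8();

echo $sketch_lookup_calls, "\n"; // prints 1: only the first call did the lookup
```

One consequence of this design is that a change to the charset setting in the middle of a request would not be noticed by later calls in that same request, since the cached value persists until the PHP process finishes handling it.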