Günther Bosch - 2009-12-19 16:31:48
Hi,
I want to strip the characters that are invalid in XML, based on the spec on the W3C website:
function strip_invalid_xml_chars( $in ) {
    $out = "";
    $length = strlen($in);
    for ( $i = 0; $i < $length; $i++) {
        // ord() returns a single byte (0-255), so this loop checks bytes, not code points
        $current = ord($in[$i]);
        if ( ($current == 0x9) || ($current == 0xA) || ($current == 0xD) || (($current >= 0x20) && ($current <= 0x7E)) || (($current >= 0xA0) && ($current <= 0xD7FF)) || (($current >= 0xE000) && ($current <= 0xFFFD)) || (($current >= 0x10000) && ($current <= 0x10FFFF))) {
            $out .= chr($current);
        } else {
            $out .= " ";
        }
    }
    return $out;
}
But it performs poorly, so I decided to use a regex instead:
$input_string = "abcdefg ™ ´ ®";
$clean_string = preg_replace('/[^\x9\xA\xD\x20-\x7E\xA0-\{xD7FF}\x{E000}-x{FFFD}\x{10000}-\x{10FFFF}]/u', '', $input_string);
But I get a warning and an empty $clean_string:
Compilation failed: range out of order in character class at offset 26
Could someone fix this? Thanks
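
A possible fix, offered as a sketch rather than a definitive answer. Offset 26 in the pattern is the `\{xD7FF}` escape: the backslash and the brace are swapped, so PCRE reads `\xA0-\{` as a range from 0xA0 down to a literal `{` (0x7B), which is out of order. The later `-x{FFFD}` is also missing its backslash and would trigger the same error once the first typo is fixed. The version below keeps the ranges from the post unchanged and only repairs the escapes. The function name is hypothetical, and it assumes the input is valid UTF-8: with the `/u` modifier, preg_replace() returns NULL for input that is not.

```php
<?php
// Hypothetical helper name; the character ranges are copied from the post above.
// Every code point is written as \x{...} so a misplaced brace is easier to spot.
function strip_invalid_xml_chars_regex(string $in): ?string
{
    $pattern = '/[^\x{9}\x{A}\x{D}\x{20}-\x{7E}\x{A0}-\x{D7FF}'
             . '\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}]/u';

    // Returns NULL if $in is not valid UTF-8 (a consequence of the /u modifier).
    return preg_replace($pattern, '', $in);
}

$input_string = "abcdefg \u{2122} \u{B4} \u{AE}"; // "abcdefg ™ ´ ®" written with escapes
$clean_string = strip_invalid_xml_chars_regex($input_string);
```

If the script file or the incoming data is stored as Latin-1 or Windows-1252 rather than UTF-8, the NULL return would be another way to end up with an empty result, so it may be worth checking preg_last_error() after the call.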