Quick PHP Tip: How to parse CDATA sections using SimpleXML
Applies to: simplexml_load_string and simplexml_load_file
Problem: Text inside CDATA sections of an XML document does not show up when the parsed SimpleXML object is printed.
Consider the XML below:
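$str = '<root>'; // "root" is a placeholder; the original root element name was lost
$str .= '<childNode>some text goes here</childNode>';
$str .= '</root>';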
To parse it, we use the following syntax:
$xml = simplexml_load_string($str);
Printing $xml with print_r outputs:
SimpleXMLElement Object
(
[childNode] => some text goes here
)
That's OK. Now take the same XML, but this time with the text enclosed in a CDATA section.
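$str = '<root>'; // "root" is a placeholder; the original root element name was lost
$str .= '<childNode><![CDATA[some text goes here]]></childNode>';
$str .= '</root>';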
Printing this gives the following output:
SimpleXMLElement Object
(
[childNode] => SimpleXMLElement Object
(
)
)
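Yes, it's empty. This is because, by default, the parser keeps a CDATA section as a separate CDATA node instead of a text node, and print_r does not display CDATA nodes. The text is not lost (casting the element to a string still returns it), but it is missing from the printed structure and from any array built from it.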
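A quick way to see what is going on is to compare the print_r dump with a string cast of the same element. The sketch below assumes a placeholder root element named "root" (only childNode is known from the output above).

```php
<?php
// "root" is an assumed placeholder name for the root element.
$str  = '<root>';
$str .= '<childNode><![CDATA[some text goes here]]></childNode>';
$str .= '</root>';

// Default parse, no libxml options.
$xml = simplexml_load_string($str);

// The dump shows childNode as an empty SimpleXMLElement object.
print_r($xml);

// Casting the element to a string still yields the CDATA content.
$text = (string) $xml->childNode;
echo $text, PHP_EOL;
```

If only a single value is needed, the string cast may be enough; the option described next matters when the whole structure is dumped or converted.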
Solution: Set the 3rd parameter to LIBXML_NOCDATA while parsing.
simplexml_load_string (and simplexml_load_file too) accepts more than one parameter. The first three are:
- The string to parse (for simplexml_load_file, the path to the file)
- Optional: the name of the class of the returned object. By default it returns a SimpleXMLElement object.
- Also optional: libxml option constants, which can be combined with the bitwise OR operator (|). This parameter provides the solution to our CDATA problem.
Pass LIBXML_NOCDATA as the third parameter and the parser will merge CDATA sections into ordinary text nodes, so their content shows up like any other text.
$xml = simplexml_load_string($str,'SimpleXMLElement', LIBXML_NOCDATA);
This will output as desired:
SimpleXMLElement Object
(
[childNode] => some text goes here
)
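For a side-by-side check, the following sketch parses the same document twice, once with default options and once with LIBXML_NOCDATA, and captures both dumps as strings. The root element name "root" and the variable names are assumptions made for illustration.

```php
<?php
// "root" is an assumed placeholder name for the root element.
$str = '<root><childNode><![CDATA[some text goes here]]></childNode></root>';

// Default parse: the CDATA section stays a separate CDATA node.
$plain = simplexml_load_string($str);

// With LIBXML_NOCDATA: CDATA sections are merged into text nodes.
$merged = simplexml_load_string($str, 'SimpleXMLElement', LIBXML_NOCDATA);

// print_r() with true as second argument returns the dump instead of printing it.
$plainDump  = print_r($plain, true);
$mergedDump = print_r($merged, true);

echo "Without LIBXML_NOCDATA:\n", $plainDump;
echo "With LIBXML_NOCDATA:\n", $mergedDump;
```

Only the second dump should contain the text of childNode.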
Please note that using the third parameter requires PHP >=5.1 compiled with libxml.
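The same option works with simplexml_load_file. The sketch below first checks that the function and the constant are available, then writes a small sample document to a temporary file and parses it. The file content, the root element name "root" and the temp-file prefix are assumptions made for illustration.

```php
<?php
// Guard against builds without SimpleXML or libxml support.
if (!function_exists('simplexml_load_file') || !defined('LIBXML_NOCDATA')) {
    exit("SimpleXML with libxml support is required\n");
}

// Create an illustrative XML file; "root" is an assumed placeholder name.
$path = tempnam(sys_get_temp_dir(), 'cdata');
file_put_contents(
    $path,
    '<root><childNode><![CDATA[some text goes here]]></childNode></root>'
);

// Same three parameters as simplexml_load_string, but the first one is a path.
$xml = simplexml_load_file($path, 'SimpleXMLElement', LIBXML_NOCDATA);
unlink($path); // remove the temporary file again

if ($xml === false) {
    exit("Could not parse the file\n");
}

$dump = print_r($xml, true);
echo $dump;
```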
I will be publishing a new post shortly on how to use SimpleXML to parse xml and extract data.
3 Comments to “Quick PHP Tip: How to parse CDATA sections using SimpleXML”
-
This has proved very helpful thanks!
-
Have following PHP which gives me car hire location, time and date but i cannot seem to extract the CDATA? All data comes from the following and included in this is CDATA
$string="http://www.thrifty.co.uk/cgi-bin/gen5?runprog=thxml&xsrc=7qhfqou3&mode=quote";
Any ideas????
$value )
{
$value = ( array ) $value ;
if ( isset ( $value [ 0 ] ) )
{
$newArray [ $key ] = trim ( $value [ 0 ] ) ;
}
else
{
$newArray [ $key ] = XML2Array ( $value , true ) ;
}
}
return $newArray ;
}

function disp_date($str)
{
$y=substr($str,0,4);
$m=substr($str,4,2);
$d=substr($str,6,2);
//echo $y."-".$m."-".$d;
return date("M d, Y",strtotime($y."-".$m."-".$d));
}

$handle = fopen($string, "r");
$xml_string="";
// If there is something, read and return
if ($handle) {
while (!feof($handle)) {
$buffer = fgets($handle, 4096);
$xml_string.=$buffer;
}
fclose($handle);
}

//$xmlDoc = new DOMDocument();
$xmlDoc=simplexml_load_string ( $xml_string ) ;
//print_r($xmlDoc);
echo "Zipcode: ".$xmlDoc->hire->loccode."Location Name: ".$xmlDoc->hire->locname."Pickup Time: ".disp_date($xmlDoc->hire->pickupdate)." ".$xmlDoc->hire->pickuptime."Dropback Time: ".disp_date($xmlDoc->hire->dropbackdate)." ".$xmlDoc->hire->dropbacktime."";
echo "Car TypeDescriptionRate";
foreach($xmlDoc->car as $car)
{
echo “carimage.”‘ align=’right’ style=’padding:5px; width:64px’>“.$car->cartype.”“.$car->carsipp.”".$car->transmission.”“.$car->carexample.”£”.$car->price.”/dayUnlimited Miles”;
}
echo "";
?>

thanks