vijayjoshi.org - php | javascript | ajax | and all things web

Quick PHP Tip: How to parse CDATA sections using SimpleXML

Applies to:  simplexml_load_string and simplexml_load_file

Problem: SimpleXML appears to skip text inside CDATA sections of an XML document.

Consider the XML below:


// The root element name here is only a placeholder.
$str = '<rootNode>';
$str.='<childNode>some text goes here</childNode>';
$str.='</rootNode>';

To parse it, we use the following syntax:

$xml = simplexml_load_string($str);

Printing $xml with print_r outputs:


SimpleXMLElement Object
(
    [childNode] => some text goes here
)

That's OK. Now take the same XML, but this time with the text enclosed in a CDATA section.


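// The root element name here is only a placeholder.
$str = '<rootNode>';
$str.='<childNode><![CDATA[some text goes here]]></childNode>';
$str.='</rootNode>';

Parsing and printing this gives the following output: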


SimpleXMLElement Object
(
    [childNode] => SimpleXMLElement Object
        (
        )

)

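Yes, it's empty. The parser keeps the CDATA section as a separate CDATA node rather than a plain text node, and print_r does not show the content of such nodes, so it looks as if SimpleXML had ignored everything enclosed in the CDATA section.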
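
The empty print_r output does not necessarily mean the text is unreachable. Here is a minimal sketch, assuming the CDATA sample above with rootNode as a placeholder root element name: casting the child element to a string returns its text without any extra options.

```php
<?php
// Assumed sample document; "rootNode" is a placeholder root element name.
$str  = '<rootNode>';
$str .= '<childNode><![CDATA[some text goes here]]></childNode>';
$str .= '</rootNode>';

// Parse with default options (no LIBXML_NOCDATA).
$xml = simplexml_load_string($str);

// Casting an element to a string returns its text content,
// including text held in a CDATA section.
$text = (string) $xml->childNode;

echo $text, PHP_EOL;
```

This is enough when you only need single values. The option described next is more convenient when you want print_r, or an array cast, to show the text as well.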

Solution: Set the 3rd parameter to LIBXML_NOCDATA while parsing.

simplexml_load_string (and simplexml_load_file too) accepts several parameters. The first three matter here:

  • The string to parse (for simplexml_load_file, the path to the XML file)
  • Optional: the name of the class of the returned object, which should extend SimpleXMLElement. (By default a SimpleXMLElement object is returned.)
  • Also optional: libxml options. This parameter provides the solution to our CDATA problem.
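
To illustrate the second parameter, here is a small sketch. The FeedElement class, its trimmedText() method and the rootNode element name are made up for this example and are not part of SimpleXML.

```php
<?php
// Hypothetical subclass, used only to illustrate the second parameter.
class FeedElement extends SimpleXMLElement
{
    // Convenience helper: the element's text with surrounding whitespace removed.
    public function trimmedText()
    {
        return trim((string) $this);
    }
}

// Assumed sample document; "rootNode" is a placeholder root element name.
$str = '<rootNode><childNode>  some text goes here  </childNode></rootNode>';

// The second argument names the class of the returned object.
$xml = simplexml_load_string($str, 'FeedElement');

echo get_class($xml), PHP_EOL;               // class of the root object
echo $xml->childNode->trimmedText(), PHP_EOL; // child elements use the same class
```

If you do not need a custom class, pass 'SimpleXMLElement' as the second argument so that you can still supply the third.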

Pass LIBXML_NOCDATA as the third parameter and the CDATA nodes are merged into text nodes, so SimpleXML treats their content as ordinary text.


$xml = simplexml_load_string($str,'SimpleXMLElement', LIBXML_NOCDATA);

Printing $xml now gives the desired output:


SimpleXMLElement Object
(
    [childNode] => some text goes here
)
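
The same option should work with simplexml_load_file. The sketch below assumes the CDATA sample from above (rootNode is a placeholder root element name) and writes it to a temporary file first, purely so that there is a file to load.

```php
<?php
// Assumed sample document; "rootNode" is a placeholder root element name.
$str = '<rootNode><childNode><![CDATA[some text goes here]]></childNode></rootNode>';

// Write the XML to a temporary file so there is something to load.
$path = tempnam(sys_get_temp_dir(), 'xml');
file_put_contents($path, $str);

// Same second and third arguments as with simplexml_load_string().
$xml = simplexml_load_file($path, 'SimpleXMLElement', LIBXML_NOCDATA);

// The document is already parsed, so the temporary file can go.
unlink($path);

print_r($xml);
```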

Please note that using the third parameter requires PHP >=5.1 compiled with libxml.
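
If you are not sure whether your installation meets these requirements, a quick check along the following lines may help. It is only a sketch: the variable names are mine, and it reports what is available without attempting any parsing.

```php
<?php
// Is the PHP version at least 5.1.0?
$phpOk = version_compare(PHP_VERSION, '5.1.0', '>=');

// Is the libxml extension loaded, and is the option constant defined?
$hasLibxml  = extension_loaded('libxml');
$hasNoCdata = defined('LIBXML_NOCDATA');

// Is SimpleXML itself available?
$hasSimpleXml = function_exists('simplexml_load_string');

echo 'PHP version: ', PHP_VERSION, ($phpOk ? ' (OK)' : ' (too old)'), PHP_EOL;
echo 'libxml loaded: ', ($hasLibxml ? 'yes' : 'no'), PHP_EOL;
echo 'LIBXML_NOCDATA defined: ', ($hasNoCdata ? 'yes' : 'no'), PHP_EOL;
echo 'SimpleXML available: ', ($hasSimpleXml ? 'yes' : 'no'), PHP_EOL;

if ($hasLibxml) {
    // LIBXML_DOTTED_VERSION holds the libxml version PHP was built against.
    echo 'libxml version: ', LIBXML_DOTTED_VERSION, PHP_EOL;
}
```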

I will shortly publish a new post on how to use SimpleXML to parse XML and extract data.

Posted in : FAQ, PHP

3 Comments to “Quick PHP Tip: How to parse CDATA sections using SimpleXML”


  1. Rajaraman says:

    thanks

  2. dVyper says:

    This has proved very helpful thanks!

  3. andrew says:

    I have the following PHP, which gives me the car hire location, time and date, but I cannot seem to extract the CDATA. All of the data comes from the following URL, and it includes CDATA:

    $string="http://www.thrifty.co.uk/cgi-bin/gen5?runprog=thxml&xsrc=7qhfqou3&mode=quote";

    Any ideas?

    $value )
    {
    $value = ( array ) $value ;
    if ( isset ( $value [ 0 ] ) )
    {
    $newArray [ $key ] = trim ( $value [ 0 ] ) ;
    }
    else
    {
    $newArray [ $key ] = XML2Array ( $value , true ) ;
    }
    }
    return $newArray ;
    }

    function disp_date($str)
    {
        $y=substr($str,0,4);
        $m=substr($str,4,2);
        $d=substr($str,6,2);
        //echo $y."-".$m."-".$d;
        return date("M d, Y",strtotime($y."-".$m."-".$d));
    }

    $handle = fopen($string, "r");
    $xml_string="";
    // If there is something, read and return
    if ($handle) {
        while (!feof($handle)) {
            $buffer = fgets($handle, 4096);
            $xml_string.=$buffer;
        }
        fclose($handle);
    }

    //$xmlDoc = new DOMDocument();
    $xmlDoc=simplexml_load_string ( $xml_string ) ;

    //print_r($xmlDoc);

    echo "Zipcode: ".$xmlDoc->hire->loccode."Location Name: ".$xmlDoc->hire->locname."Pickup Time: ".disp_date($xmlDoc->hire->pickupdate)." ".$xmlDoc->hire->pickuptime."Dropback Time: ".disp_date($xmlDoc->hire->dropbackdate)." ".$xmlDoc->hire->dropbacktime."";

    echo "Car TypeDescriptionRate";
    foreach($xmlDoc->car as $car)
    {
        echo "carimage."' align='right' style='padding:5px; width:64px'>".$car->cartype."".$car->carsipp."".$car->transmission."".$car->carexample."£".$car->price."/dayUnlimited Miles";
    }
    echo "";
    ?>
