Parse XML file into array (PHP)

XML is increasingly used to store information, so sooner or later you will have to parse such files. If you need a PHP function for the job, take a look at this one:

<code>
<?php
/**
 * xml2array() will convert the given XML text to an array in the XML structure.
 * Link: http://www.bin-co.com/php/scripts/xml2array/
 * Arguments : $contents - The XML text
 *                $get_attributes - 1 or 0. If this is 1 the function will get the attributes as well as the tag values - this results in a different array structure in the return value.
 *                $priority - Can be 'tag' or 'attribute'. This changes the structure of the resulting array. For 'tag', the tags are given more importance.
 * Return: The parsed XML in an array form. Use print_r() to see the resulting array structure.
 * Examples: $array =  xml2array(file_get_contents('feed.xml'));
 *              $array =  xml2array(file_get_contents('feed.xml'), 1, 'attribute');
 */
function xml2array($contents, $get_attributes=1, $priority = 'tag') {
    if(!$contents) return array();
 
    if(!function_exists('xml_parser_create')) {
        //print "'xml_parser_create()' function not found!";
        return array();
    }
 
    //Get the XML parser of PHP - PHP must have this module for the parser to work
    $parser = xml_parser_create('');
    xml_parser_set_option($parser, XML_OPTION_TARGET_ENCODING, "UTF-8"); # http://minutillo.com/steve/weblog/2004/6/17/php-xml-and-character-encodings-a-tale-of-sadness-rage-and-data-loss
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
    xml_parser_set_option($parser, XML_OPTION_SKIP_WHITE, 1);
    xml_parse_into_struct($parser, trim($contents), $xml_values);
    xml_parser_free($parser);
 
    if(!$xml_values) return array();//Nothing could be parsed; return an empty array like the other early exits
 
    //Initializations
    $xml_array = array();
    $parent = array();//Maps a nesting level to a reference to the array of the enclosing tag
    $opened_tags = array();
    $arr = array();
 
    $current = &$xml_array; //Reference
 
    //Go through the tags.
    $repeated_tag_index = array();//Multiple tags with same name will be turned into an array
    foreach($xml_values as $data) {
        unset($attributes,$value);//Remove existing values, or there will be trouble
 
        //This command will extract these variables into the foreach scope
        // tag(string), type(string), level(int), attributes(array).
        extract($data);//We could use the array by itself, but this is cooler.
 
        $result = array();
        $attributes_data = array();
 
        if(isset($value)) {
            if($priority == 'tag') $result = $value;
            else $result['value'] = $value; //Put the value in an assoc array if we are in the 'attribute' mode
        }
 
        //Set the attributes too.
        if(isset($attributes) and $get_attributes) {
            foreach($attributes as $attr => $val) {
                if($priority == 'tag') $attributes_data[$attr] = $val;
                else $result['attr'][$attr] = $val; //Set all the attributes in an array called 'attr'
            }
        }
 
        //See tag status and do the needed.
        if($type == "open") {//The starting of the tag '<tag>'
            $parent[$level-1] = &$current;
            if(!is_array($current) or (!in_array($tag, array_keys($current)))) { //Insert New tag
                $current[$tag] = $result;
                if($attributes_data) $current[$tag. '_attr'] = $attributes_data;
                $repeated_tag_index[$tag.'_'.$level] = 1;
 
                $current = &$current[$tag];
 
            } else { //There was another element with the same tag name
 
                if(isset($current[$tag][0])) {//If there is a 0th element it is already an array
                    $current[$tag][$repeated_tag_index[$tag.'_'.$level]] = $result;
                    $repeated_tag_index[$tag.'_'.$level]++;
                } else {//This section will make the value an array if multiple tags with the same name appear together
                    $current[$tag] = array($current[$tag],$result);//This will combine the existing item and the new item together to make an array
                    $repeated_tag_index[$tag.'_'.$level] = 2;
 
                    if(isset($current[$tag.'_attr'])) { //The attribute of the last(0th) tag must be moved as well
                        $current[$tag]['0_attr'] = $current[$tag.'_attr'];
                        unset($current[$tag.'_attr']);
                    }
 
                }
                $last_item_index = $repeated_tag_index[$tag.'_'.$level]-1;
                $current = &$current[$tag][$last_item_index];
            }
 
        } elseif($type == "complete") { //Tags with no child elements, e.g. '<tag />' or '<tag>text</tag>'
            //See if the key is already taken.
            if(!isset($current[$tag])) { //New Key
                $current[$tag] = $result;
                $repeated_tag_index[$tag.'_'.$level] = 1;
                if($priority == 'tag' and $attributes_data) $current[$tag. '_attr'] = $attributes_data;
 
            } else { //If taken, put all things inside a list(array)
                if(isset($current[$tag][0]) and is_array($current[$tag])) {//If it is already an array...
 
                    // ...push the new element into that array.
                    $current[$tag][$repeated_tag_index[$tag.'_'.$level]] = $result;
 
                    if($priority == 'tag' and $get_attributes and $attributes_data) {
                        $current[$tag][$repeated_tag_index[$tag.'_'.$level] . '_attr'] = $attributes_data;
                    }
                    $repeated_tag_index[$tag.'_'.$level]++;
 
                } else { //If it is not an array...
                    $current[$tag] = array($current[$tag],$result); //...Make it an array using the existing value and the new value
                    $repeated_tag_index[$tag.'_'.$level] = 1;
                    if($priority == 'tag' and $get_attributes) {
                        if(isset($current[$tag.'_attr'])) { //The attribute of the last(0th) tag must be moved as well
 
                            $current[$tag]['0_attr'] = $current[$tag.'_attr'];
                            unset($current[$tag.'_attr']);
                        }
 
                        if($attributes_data) {
                            $current[$tag][$repeated_tag_index[$tag.'_'.$level] . '_attr'] = $attributes_data;
                        }
                    }
                    $repeated_tag_index[$tag.'_'.$level]++; //0 and 1 index is already taken
                }
            }
 
        } elseif($type == 'close') { //End of tag '</tag>'
            $current = &$parent[$level-1];
        }
    }
 
    return($xml_array);
}
</code>
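
To see what the foreach loop above actually walks through, it helps to dump the flat list that xml_parse_into_struct() produces. The following is only a small sketch: the sample document is made up for illustration, and it sets the same case-folding and whitespace options as xml2array().

```php
<?php
// Minimal sketch: inspect the flat node list that xml2array() loops over.
// The sample document is invented for illustration.
$sample = '<a><b id="1">x</b><c/></a>';

$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0); // keep tag names as written
xml_parser_set_option($parser, XML_OPTION_SKIP_WHITE, 1);   // skip whitespace-only values
xml_parse_into_struct($parser, $sample, $values);

// Each entry carries 'tag', 'type' and 'level', plus 'value' and
// 'attributes' when the element has them.
foreach ($values as $node) {
    $indent = str_repeat('  ', $node['level'] - 1);
    printf("%s%s (%s)\n", $indent, $node['tag'], $node['type']);
}
```

The element with children shows up twice, once as "open" and once as "close", while elements without children appear once as "complete". Those are the three cases xml2array() branches on.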

An example of how to use it:

Suppose you have an XML file like:

<object><name>Objeto3</name><id>003</id><referenceNumber></referenceNumber><groupIdentifier></groupIdentifier><persistentIdentifier></persistentIdentifier><masterCreationDate locale="CEST"><date format="yyyyMMdd">20090612</date><time format="HHmmssSSS">103530703</time></masterCreationDate><objectComposition>simple</objectComposition><structuralType><name></name><extension></extension></structuralType><hardwareEnvironment>x86</hardwareEnvironment><softwareEnvironment>OS: Windows XP 5.1, JVM:Sun Microsystems Inc. 1.6.0_13</softwareEnvironment><installationRequirements></installationRequirements><accessInhibitors></accessInhibitors><accessFacilitators></accessFacilitators><quirks></quirks><metadataRecordCreator></metadataRecordCreator><metadataCreationDate locale="CEST"><date format="yyyyMMdd">20090612</date><time format="HHmmssSSS">103530734</time></metadataCreationDate><comments></comments><files><file xmlns:nz_govt_natlib_xsl_XSLTFunctions="nz.govt.natlib.xsl.XSLTFunctions">
<fileIdentifier/>
<path>D:\00077-70\00077-70_0001.jpg</path>
<filename>
<name>00077-70_0001.jpg</name>
<extension>jpg</extension>
</filename>
<size>162096</size>
<fileDateTime>
<date format="yyyyMMdd">20090303</date>
<time format="HHmmssSSS">133008000</time>
</fileDateTime>
<mimetype>image/jpeg</mimetype>
<fileFormat>
<format>JPEG</format>
</fileFormat>
<image>
<imageResolution>
<samplingFrequencyUnit>2</samplingFrequencyUnit>
<xsamplingFrequency>150</xsamplingFrequency>
<ysamplingFrequency>150</ysamplingFrequency>
</imageResolution>
<imageDimension>
<width>998</width>
<length>1321</length>
</imageDimension>
<bitsPerSample>8</bitsPerSample>
<photometricInterpretation>YCbCr</photometricInterpretation>
<iccprofileName/>
<colorMap/>
<orientation>0degrees</orientation>
<compression>
<scheme>6</scheme>
<level/>
</compression>
</image>
</file>
</files></object>

Then you'll parse it like:

<code>
$xml = xml2array($contents);
// Case folding is off in xml2array(), so the keys match the tag names exactly as written in the XML.
$file = $xml['object']['files']['file'];
$query1 = "INSERT into `master` values (" .
 		$ultimoID . ", " .
 		$file['size'] . ", '" .
 		$file['filename']['extension'] . "', '" .
 		$file['mimetype'] . "', '" .
 		md5_file($listado[$i]) . "', " .
 		$file['fileDateTime']['date'] . ", " .
 		substr($file['fileDateTime']['time'],0,6) . ", '" .
 		$file['filename']['name'] . "', '" .
 		addslashes($file['path']) . "', " .
 		$file['image']['imageResolution']['xsamplingFrequency'] . ", " .
 		$file['image']['imageResolution']['ysamplingFrequency'] . ", " .
 		$file['image']['imageDimension']['width'] . ", " .
 		$file['image']['imageDimension']['length'] . ", '" .
 		$file['fileFormat']['format'] . "', '" .
 		$file['fileFormat']['version'] . "', " . // <version> does not appear in the sample XML above
 		$file['image']['bitsPerSample']. ")";
</code>
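
Reading deeply nested keys like this raises a warning as soon as one tag is missing from the file. One possible way around that is a small lookup helper. The sketch below is an assumption on my side: array_path() is a hypothetical function, not part of xml2array(), and the $xml array is a hand-written stand-in for a fragment of the parsed result.

```php
<?php
/**
 * Hypothetical helper (not part of xml2array): follow a list of keys into a
 * nested array and return $default as soon as one of them is missing.
 */
function array_path(array $data, array $keys, $default = null) {
    $node = $data;
    foreach ($keys as $key) {
        if (!is_array($node) || !array_key_exists($key, $node)) {
            return $default;
        }
        $node = $node[$key];
    }
    return $node;
}

// Hand-written stand-in for part of the array xml2array() would return
$xml = array('object' => array('files' => array('file' => array(
    'size'       => '162096',
    'fileFormat' => array('format' => 'JPEG'),
))));

$size    = array_path($xml, array('object', 'files', 'file', 'size'));
$version = array_path($xml, array('object', 'files', 'file', 'fileFormat', 'version'), '');
```

Here $size holds the value of the existing tag, while $version falls back to the empty string because the fragment has no version entry.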
