
I have an XML document with the following structure:

<posts>
<user id="1222334">
  <post>
    <message>hello</message>
    <client>client</client>
    <time>time</time>
  </post>
  <post>
    <message>hello client how can I help?</message>
    <client>operator</client>
    <time>time</time>
  </post>
</user>
<user id="2333343">
  <post>
    <message>good morning</message>
    <client>client</client>
    <time>time</time>
  </post>
  <post>
    <message>good morning how can I help?</message>
    <client>operator</client>
    <time>time</time>
  </post>
</user>
</posts>

I am able to create the parser and print out the whole document. The problem, however, is that I want to print only the <user> node, together with its children, that has a specific id attribute.

My PHP code is:

if (!empty($_GET['id'])) {
    $id = $_GET['id'];
    $parser = xml_parser_create();

    function start($parser, $element_name, $element_attrs)
    {
        switch ($element_name) {
            case "USER":
                echo "-- User --<br>";
                break;
            case "CLIENT":
                echo "Name: ";
                break;
            case "MESSAGE":
                echo "Message: ";
                break;
            case "TIME":
                echo "Time: ";
                break;
            case "POST":
                echo "--Post<br> ";
        }
    }

    function stop($parser, $element_name) { echo "<br>"; }
    function char($parser, $data) { echo $data; }

    xml_set_element_handler($parser, "start", "stop");
    xml_set_character_data_handler($parser, "char");

    $file = "test.xml";
    $fp = fopen($file, "r");
    while ($data = fread($fp, filesize($file))) {
        xml_parse($parser, $data, feof($fp)) or
            die(sprintf("XML Error: %s at line %d",
                xml_error_string(xml_get_error_code($parser)),
                xml_get_current_line_number($parser)));
    }
    xml_parser_free($parser);
}

Using the following condition in the start() function selects the right node, but it doesn't have any effect on the reading process:

    if(($element_name == "USER") && $element_attrs["ID"] && ($element_attrs["ID"] == "$id"))

Any help would be appreciated.

UPDATE: XMLReader works, but when I add an if statement it stops working:

foreach ($filteredUsers as $user) {
    echo "<table border='1'>";
    foreach ($user->getChildElements('post') as $index => $post) {
        if ($post->getChildElements('client') == "operator") {
            printf("<tr><td class='blue'>%s</td><td class='grey'>%s</td></tr>", $post->getChildElements('message'), $post->getChildElements('time'));
        } else {
            printf("<tr><td class='green'>%s</td><td class='grey'>%s</td></tr>", $post->getChildElements('message'), $post->getChildElements('time'));
        }
    }
    echo "</table>";
}

 Answers

1

As suggested in a comment earlier, you can alternatively use XMLReader.

The XMLReader extension is an XML Pull parser. The reader acts as a cursor going forward on the document stream and stopping at each node on the way.

It is a class (with the same name: XMLReader) which can open a file. You normally call read() to move to the next node; next() also advances, but skips the subtree of the current node. You then check whether the current position is at an element and whether that element has the name you're looking for, and if so you process it, for example by reading the outer XML of the element with XMLReader::readOuterXml().
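The cursor loop described above can be sketched with the built-in XMLReader alone. This is only a minimal illustration: the function name findUsersById() is made up for this example, and the sample data is a shortened version of the document from the question, loaded from a string instead of a file.

```php
<?php
// Sample data modeled on the question's structure (shortened).
$xml = <<<XML
<posts>
  <user id="1222334">
    <post><message>hello</message><client>client</client><time>time</time></post>
  </user>
  <user id="2333343">
    <post><message>good morning</message><client>client</client><time>time</time></post>
  </user>
</posts>
XML;

/**
 * Hypothetical helper: return the outer XML of every <user> element
 * whose id attribute equals $id.
 */
function findUsersById(string $xml, string $id): array
{
    $reader = new XMLReader();
    $reader->XML($xml);              // for a file, use $reader->open($path) instead

    $matches = [];
    $more = $reader->read();         // move the cursor to the first node
    while ($more) {
        if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'user') {
            if ($reader->getAttribute('id') === $id) {
                $matches[] = $reader->readOuterXml();
            }
            // next() skips the children of the current <user>
            $more = $reader->next();
        } else {
            $more = $reader->read();
        }
    }
    $reader->close();

    return $matches;
}

foreach (findUsersById($xml, '2333343') as $userXml) {
    echo $userXml, "\n";
}
```

Calling next() after each <user> keeps the reader from stepping through the posts of users that are not of interest.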

Compared with the callbacks in the Expat parser, this is a little burdensome. To gain more flexibility with XMLReader, I normally write my own iterators that work on the XMLReader object and provide the steps I need.
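For comparison, the callback style can also be restricted to one user by keeping a flag between the handlers, which is the piece missing from the condition shown in the question. The following is a rough sketch, not a drop-in fix: collectUserText() is a hypothetical helper, the sample data is shortened from the question, and the handlers collect values into an array instead of echoing HTML.

```php
<?php
// Sample data modeled on the question's structure (shortened).
$xml = <<<XML
<posts>
  <user id="1222334">
    <post><message>hello</message><client>client</client><time>time</time></post>
  </user>
  <user id="2333343">
    <post><message>good morning</message><client>client</client><time>time</time></post>
  </user>
</posts>
XML;

/**
 * Hypothetical helper: collect "name: value" lines for the <user>
 * whose id attribute equals $wantedId, ignoring all other users.
 */
function collectUserText(string $xml, string $wantedId): array
{
    $inside  = false;   // true while the parser is within the matching <user>
    $current = '';      // text gathered for the element being read
    $lines   = [];

    $parser = xml_parser_create();

    xml_set_element_handler(
        $parser,
        // Start handler: Expat upper-cases element and attribute names by default.
        function ($parser, $name, $attrs) use (&$inside, &$current, $wantedId) {
            if ($name === 'USER' && ($attrs['ID'] ?? null) === $wantedId) {
                $inside = true;
            }
            $current = '';
        },
        // End handler: emit a line for leaf elements, leave the user on </user>.
        function ($parser, $name) use (&$inside, &$current, &$lines) {
            if ($inside && in_array($name, ['MESSAGE', 'CLIENT', 'TIME'], true)) {
                $lines[] = strtolower($name) . ': ' . trim($current);
            }
            if ($name === 'USER') {
                $inside = false;
            }
            $current = '';
        }
    );

    // Character data may arrive in several chunks, so append instead of assigning.
    xml_set_character_data_handler(
        $parser,
        function ($parser, $data) use (&$inside, &$current) {
            if ($inside) {
                $current .= $data;
            }
        }
    );

    if (xml_parse($parser, $xml, true) !== 1) {
        throw new RuntimeException(sprintf(
            'XML error: %s at line %d',
            xml_error_string(xml_get_error_code($parser)),
            xml_get_current_line_number($parser)
        ));
    }

    return $lines;
}

print_r(collectUserText($xml, '2333343'));
```

The parser still reads the whole document; the flag only decides which parts are kept.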

Such iterators allow you to iterate over the concrete elements directly with foreach. Here is an example:

require('xmlreader-iterators.php'); // https://gist.github.com/hakre/5147685

$xmlFile = '../data/posts.xml';

$ids = array(3, 8);

$reader = new XMLReader();
$reader->open($xmlFile);

/* @var $users XMLReaderNode[] - iterate over all <user> elements */
$users = new XMLElementIterator($reader, 'user');

/* @var $filteredUsers XMLReaderNode[] - iterate over elements with id="3" or id="8" */
$filteredUsers = new XMLAttributeFilter($users, 'id', $ids);

foreach ($filteredUsers as $user) {
    printf("---------------\nUser with ID %d:\n", $user->getAttribute('id'));
    echo $user->readOuterXml(), "\n";
}

I have created an XML file that contains some more posts like those in your question, with the id attribute numbered from one upward:

$xmlFile = '../data/posts.xml';

Then I created an array with the two ID values of the users I am interested in:

$ids = array(3, 8);

It will be used in the filter-condition later. Then the XMLReader is created and the XML file is opened by it:

$reader = new XMLReader();
$reader->open($xmlFile);

The next step creates an iterator over all <user> elements of that reader:

$users = new XMLElementIterator($reader, 'user');

These are then filtered by the id attribute values stored in the array earlier:

$filteredUsers = new XMLAttributeFilter($users, 'id', $ids);

The rest is iterating with foreach now as all conditions have been formulated:

foreach ($filteredUsers as $user) {
    printf("---------------\nUser with ID %d:\n", $user->getAttribute('id'));
    echo $user->readOuterXml(), "\n";
}

This outputs the XML of the users with the IDs 3 and 8:

---------------
User with ID 3:
<user id="3">
        <post>
            <message>message</message>
            <client>client</client>
            <time>time</time>
        </post>
    </user>
---------------
User with ID 8:
<user id="8">
        <post>
            <message>message 8.1</message>
            <client>client</client>
            <time>time</time>
        </post>
        <post>
            <message>message 8.2</message>
            <client>client</client>
            <time>time</time>
        </post>
        <post>
            <message>message 8.3</message>
            <client>client</client>
            <time>time</time>
        </post>
    </user>

The XMLReaderNode, which is part of the XMLReader iterators, also provides a SimpleXMLElement in case you want to easily read values inside the <user> element.

The following example shows how to get the count of <post> elements inside the <user> element:

foreach ($filteredUsers as $user) {
    printf("---------------\nUser with ID %d:\n", $user->getAttribute('id'));
    echo $user->readOuterXml(), "\n";
    echo "Number of posts: ", $user->asSimpleXML()->post->count(), "\n";
}

This would then display Number of posts: 1 for the user ID 3 and Number of posts: 3 for the user ID 8.
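Without the iterator library, a similar count can be obtained by handing the outer XML string to SimpleXML. The snippet below is a small sketch under that assumption; the $userXml string is a hand-written stand-in for what readOuterXml() would return for user 8.

```php
<?php
// Stand-in for the outer XML of one <user> element (assumed sample data).
$userXml = '<user id="8">'
    . '<post><message>message 8.1</message><client>client</client><time>time</time></post>'
    . '<post><message>message 8.2</message><client>client</client><time>time</time></post>'
    . '<post><message>message 8.3</message><client>client</client><time>time</time></post>'
    . '</user>';

$user = simplexml_load_string($userXml);

// Attributes are read with array syntax, child elements with property syntax.
echo 'User with ID ', (string) $user['id'], ":\n";
echo 'Number of posts: ', $user->post->count(), "\n";

foreach ($user->post as $post) {
    // Cast to string to get the text content of an element.
    echo ' * ', (string) $post->message, "\n";
}
```

This parses the whole <user> fragment into memory, so it suits fragments of moderate size.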

However, if that outer XML is large, you don't want to do that and you want to continue to iterate inside that element:

// rewind
$reader->open($xmlFile);

foreach ($filteredUsers as $user) {
    printf("---------------\nUser with ID %d:\n", $user->getAttribute('id'));
    foreach ($user->getChildElements('post') as $index => $post) {
        printf(" * #%d: %s\n", ++$index, $post->getChildElements('message'));
    }
    echo "Number of posts: ", $index, "\n";
}

Which produces the following output:

---------------
User with ID 3:
 * #1: message 3
Number of posts: 1
---------------
User with ID 8:
 * #1: message 8.1
 * #2: message 8.2
 * #3: message 8.3
Number of posts: 3

This example shows that, depending on how large the nested children are, you can either traverse further with the iterators available via getChildElements(), or use a common XML parser such as SimpleXML or even DOMDocument on a subset of the XML.
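One way to use SimpleXML on just a subset is to expand the current XMLReader node into a DOM fragment and import it. The sketch below assumes the document from the question; postsForUser() is a hypothetical helper name, and it returns plain arrays so the caller can decide how to render client and operator rows.

```php
<?php
// The two users from the question, inlined as a string for this sketch.
$xml = <<<XML
<posts>
<user id="1222334">
  <post><message>hello</message><client>client</client><time>time</time></post>
  <post><message>hello client how can I help?</message><client>operator</client><time>time</time></post>
</user>
<user id="2333343">
  <post><message>good morning</message><client>client</client><time>time</time></post>
  <post><message>good morning how can I help?</message><client>operator</client><time>time</time></post>
</user>
</posts>
XML;

/**
 * Hypothetical helper: return the posts of the first <user> with the given id.
 */
function postsForUser(string $xml, string $id): array
{
    $reader = new XMLReader();
    $reader->XML($xml);              // for a file, use $reader->open($path) instead
    $doc  = new DOMDocument();
    $rows = [];

    while ($reader->read()) {
        $isWantedUser = $reader->nodeType === XMLReader::ELEMENT
            && $reader->name === 'user'
            && $reader->getAttribute('id') === $id;
        if (!$isWantedUser) {
            continue;
        }

        // expand() builds a DOM subtree for the current element only;
        // importing it into a document makes it usable with SimpleXML.
        $user = simplexml_import_dom($doc->importNode($reader->expand(), true));

        foreach ($user->post as $post) {
            $rows[] = [
                'from'    => (string) $post->client,   // cast before comparing
                'message' => (string) $post->message,
            ];
        }
        break;                        // stop at the first match
    }
    $reader->close();

    return $rows;
}

foreach (postsForUser($xml, '1222334') as $row) {
    $class = $row['from'] === 'operator' ? 'blue' : 'green';
    printf("[%s] %s\n", $class, $row['message']);
}
```

Casting the SimpleXML values to strings before the comparison keeps the operator check a plain string comparison.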

Wednesday, November 16, 2022
 
2

Using SimpleXML:

$xml = new SimpleXMLElement($xmlstr);
echo $xml->file['path']."\n";

Output:

http://www.thesite.com/download/eysjkss.zip
Friday, August 26, 2022
1

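WideDonkey, have you considered using the DOM instead? You can easily do:

$dom = new DOMDocument();
$dom->loadXML(file_get_contents('php://input'));

// getElementsByTagName() returns a DOMNodeList of DOMElement objects;
// DOM nodes have no asXML() method, so serialize through the document.
$data = $dom->getElementsByTagName('data');
$data = $dom->saveXML($data->item(0));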
Tuesday, October 4, 2022
 
5

This might be a rather old thread, but I will post anyway. I had the same problem (I needed to deserialize about 10 KB of data from a file of more than 1 MB). In the main object (which has an InnerObject that needs to be deserialized) I implemented the IXmlSerializable interface and then changed the ReadXml method.

We have the xmlTextReader as input; the first step is to read up to the XML tag:

reader.ReadToDescendant("InnerObjectTag"); // tag which matches the InnerObject

Then create an XmlSerializer for the type of the object we want to deserialize, and deserialize it:

XmlSerializer serializer = new XmlSerializer(typeof(InnerObject));

// ReadSubtree() gives the serializer only the part of the XML that holds the innerObject data
this.innerObject = (InnerObject)serializer.Deserialize(reader.ReadSubtree());

reader.Close(); // now skip the rest

This saved me a lot of deserialization time and allows me to read just a part of the XML (some details that describe the file, which may help the user decide whether it is the file they want to load).

Monday, December 12, 2022
 
vitalie
 
5

Give LINQ to XML a try:

XElement result = XElement.Load(@"C:\div_kid.xml");

Querying in LINQ is brilliant but admittedly a little weird at the start. You select nodes from the document in a SQL-like syntax, or using lambda expressions. You then create anonymous objects (or use existing classes) containing the data you are interested in.

Best is to see it in action.

  • miscellaneous examples of LINQ to XML
  • simple sample using xquery and lambdas
  • sample denoting namespaces
  • There is a lot more on MSDN. Search for LINQ to XML.

Based on your sample XML and code, here's a specific example:

var element = XElement.Load(@"C:\div_kid.xml");
var shopsQuery =
    from shop in element.Descendants("shop")
    select new
    {
        Name = (string) shop.Descendants("name").FirstOrDefault(),
        Company = (string) shop.Descendants("company").FirstOrDefault(),
        Categories = 
            from category in shop.Descendants("category")
            select new {
                Id = category.Attribute("id").Value,
                Parent = category.Attribute("parentId").Value,
                Name = category.Value
            },
        Offers =
            from offer in shop.Descendants("offer")
            select new { 
                Price = (string) offer.Descendants("price").FirstOrDefault(),
                Picture = (string) offer.Descendants("picture").FirstOrDefault()
            }

    };

foreach (var shop in shopsQuery){
    Console.WriteLine(shop.Name);
    Console.WriteLine(shop.Company);
    foreach (var category in shop.Categories)
    {
        Console.WriteLine(category.Name);
        Console.WriteLine(category.Id);
    }
    foreach (var offer in shop.Offers)
    {
        Console.WriteLine(offer.Price);
        Console.WriteLine(offer.Picture);
    }
}  

As an extra: Here's how to deserialize the tree of categories from the flat category elements. You need a proper class to house them, for the list of Children must have a type:

class Category
{
    public int Id { get; set; }
    public int? ParentId { get; set; }
    public List<Category> Children { get; set; }
    public IEnumerable<Category> Descendants {
        get
        {
            return (from child in Children
                    select child.Descendants).SelectMany(x => x).
                    Concat(new Category[] { this });
        }
    }
}

To create a list containing all distinct categories in the document:

var categories = (from category in element.Descendants("category")
                    orderby int.Parse( category.Attribute("id").Value )
                    select new Category()
                    {
                        Id = int.Parse(category.Attribute("id").Value),
                        ParentId = category.Attribute("parentId") == null ?
                            null as int? : int.Parse(category.Attribute("parentId").Value),
                        Children = new List<Category>()
                    }).Distinct().ToList();

Then organize them into a tree (heavily borrowed from "flat list to hierarchy"):

var lookup = categories.ToLookup(cat => cat.ParentId);
foreach (var category in categories)
{
    category.Children = lookup[category.Id].ToList();
}
var rootCategories = lookup[null].ToList();

To find the root which contains theCategory:

var root = (from cat in rootCategories
            where cat.Descendants.Contains(theCategory)
            select cat).FirstOrDefault();
Sunday, November 13, 2022