
Possible Duplicate:
How to extract a node attribute from XML using PHP's DOM Parser

How do I extract an HTML tag value?

HTML:

<input type="hidden" name="text1" id="text1" value="need to get this">

PHP:

$homepage = file_get_contents('http://www.example.com');
$doc = new DOMDocument;
$doc->preserveWhiteSpace = false;
@$doc->loadHTML($homepage);
$xpath = new DOMXpath($doc);
$filtered = $xpath->query("//input[@name='text1']");

How do I read the value attribute of that input, i.e. the string "need to get this"?

Update:

I got it working and hope this helps others too. After the code above, I read the value with:

echo $filtered->item(0)->getAttribute('value');
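For reference, here is a minimal self-contained sketch of the whole flow. It assumes an inline HTML string in place of the page fetched with file_get_contents(), and it uses libxml_use_internal_errors() rather than the @ operator to keep parser warnings quiet; the variable names $html, $input and $value are only for illustration.

```php
<?php
// Assumption: this inline markup stands in for the downloaded page.
$html = '<html><body><form>'
      . '<input type="hidden" name="text1" id="text1" value="need to get this">'
      . '</form></body></html>';

$doc = new DOMDocument();

// Collect HTML parser warnings internally instead of suppressing them with @.
$previous = libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();
libxml_use_internal_errors($previous);

$xpath = new DOMXPath($doc);
$filtered = $xpath->query("//input[@name='text1']");

// item(0) returns null when nothing matched, so check before calling getAttribute().
$input = $filtered->item(0);
$value = $input instanceof DOMElement ? $input->getAttribute('value') : null;

echo $value, PHP_EOL;
```

The instanceof check matters on real pages: if the input is missing, calling getAttribute() on null would raise an error.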

 Answers

1

XPath can select the value attribute directly: $xpath->query("//input[@name='text1']/@value"). The result is a node list of attribute nodes; iterate over it and read the value property (or nodeValue) of each attribute node.
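A short sketch of that approach, assuming a small inline HTML fragment as test input. The second part uses DOMXPath::evaluate() with the XPath string() function, which is an alternative to the approach in the answer and returns a plain PHP string.

```php
<?php
// Assumption: inline fragment used as test input.
$html = '<form><input type="hidden" name="text1" value="need to get this"></form>';

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // keep parser warnings out of the output
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Select the attribute nodes themselves and read each one's value property.
$values = [];
foreach ($xpath->query("//input[@name='text1']/@value") as $attr) {
    $values[] = $attr->value;
}

// Alternative: let XPath convert the first match to a string.
$first = $xpath->evaluate("string(//input[@name='text1']/@value)");

print_r($values);
echo $first, PHP_EOL;
```

With evaluate(), a missing attribute yields an empty string instead of an empty node list, so use query() when you need to tell "absent" from "empty".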

Sunday, October 16, 2022
 
javiyu
 
4

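According to the documentation, the "$elements != null" check is unnecessary. For a well-formed expression, DOMXPath::query() always returns a DOMNodeList; it may be of zero length, which the foreach loop handles without any problem. (It returns false only if the expression is malformed or the context node is invalid.)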

Also, note the use of the nodeValue property to get the element's textual representation:

$elements = $xPath->query("//*[@class='nombrecomplejo']");

foreach ($elements as $e) {
  echo $e->nodeValue;
}
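To illustrate the point about empty results, here is a small sketch with an assumed test document that contains no element of class nombrecomplejo. The $seen counter is only there to show that the loop body never runs.

```php
<?php
// Assumption: a tiny test document with no matching element.
$doc = new DOMDocument();
$doc->loadXML('<root><p class="other">text</p></root>');

$xPath = new DOMXPath($doc);
$elements = $xPath->query("//*[@class='nombrecomplejo']");

// The result is still a DOMNodeList, just an empty one.
$seen = 0;
foreach ($elements as $e) {
    $seen++;
    echo $e->nodeValue, PHP_EOL;
}

echo 'matches: ', $elements->length, PHP_EOL;
```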

The reason for the error you got is that parse_str() accepts only a string, and you tried passing in a DOMElement.

Friday, September 23, 2022
5

Just skip (continue past) the elements whose target attribute isn't fruit, and add the textContent of the remaining elements to an array.

$nodes = array();

for ($i = 0; $i < $a->length; $i++) {
    $attr = $a->item($i)->getAttribute('target');

    if ($attr != 'fruit') {
        continue;
    }

    $nodes[] = $a->item($i)->textContent;
}

$nodes now contains the text content of every element whose target attribute is set to fruit.
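A runnable sketch of the same filter, assuming a small test document of <a> elements (the original markup is not shown, so the element names and values here are made up). The second half lets XPath do the filtering, which is an alternative to the loop in the answer.

```php
<?php
// Assumption: hypothetical test data; the real document is not shown in the question.
$doc = new DOMDocument();
$doc->loadXML(
    '<list>'
    . '<a target="fruit">apple</a>'
    . '<a target="animal">dog</a>'
    . '<a target="fruit">pear</a>'
    . '</list>'
);

$a = $doc->getElementsByTagName('a');

// Loop version: skip every element whose target is not "fruit".
$nodes = [];
foreach ($a as $element) {
    if ($element->getAttribute('target') !== 'fruit') {
        continue;
    }
    $nodes[] = $element->textContent;
}

// XPath version: select only the matching elements in the first place.
$xpath = new DOMXPath($doc);
$viaXpath = [];
foreach ($xpath->query("//a[@target='fruit']") as $element) {
    $viaXpath[] = $element->textContent;
}

print_r($nodes);
```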

Wednesday, September 7, 2022
 
dsimard
 
2

Ok, let’s try this complete example of use:

function CatRemove($myXML, $id) {
    $xmlDoc = new DOMDocument();
    $xmlDoc->load($myXML);
    $xpath = new DOMXpath($xmlDoc);
    $nodeList = $xpath->query('//category[@id="'.(int)$id.'"]');
    if ($nodeList->length) {
        $node = $nodeList->item(0);
        $node->parentNode->removeChild($node);
    }
    $xmlDoc->save($myXML);
}

// test data
$xml = <<<XML
<?xml version="1.0"?>
<details>
 <person>name</person>
 <data1>some data</data1>
 <data2>some data</data2>
 <data3>some data</data3>
 <category id="0">
  <categoryName>Cat 1</categoryName>
  <categorydata1>some data</categorydata1>
 </category>
 <category id="1">
  <categoryName>Cat 2</categoryName>
  <categorydata1>some data</categorydata1>
  <categorydata2>some data</categorydata2>
  <categorydata3>some data</categorydata3>
  <categorydata4>some data</categorydata4>
 </category>
</details>
XML;
// write test data into file
file_put_contents('untitled.xml', $xml);
// remove category node with the id=1
CatRemove('untitled.xml', 1);
// dump file content
echo '<pre>', htmlspecialchars(file_get_contents('untitled.xml')), '</pre>';

Monday, December 19, 2022
5

A SimpleXMLElement can only represent elements and attributes, either individually or a collection of siblings of the same type. The ->xpath() method returns an array of SimpleXMLElement objects, which allows them to be non-siblings, but does not allow for any other node type.

Consequently, the expression /td/span/text() matches the two text nodes, but returns them as objects representing their parent element, which in this case happens to be the same <span> element, giving you an array that contains the same object twice.

The remaining part of the puzzle is that when you cast a SimpleXML element to string, it combines all of its direct child text and CDATA nodes into one string, so the 193 and 120 get stuck together.

Thus the output is 193120, twice.

(This is definitely unintuitive behaviour, although it's hard to know quite what SimpleXML should do in this situation; perhaps it would be better to produce an error if the XPath expression resolves to something other than elements or attributes).
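The behaviour described above can be reproduced with a small sketch. The markup is an assumption (the original question's HTML is not shown here): a <span> whose two text nodes are separated by a <br/> element.

```php
<?php
// Assumption: hypothetical markup with two text nodes inside one <span>.
$td = simplexml_load_string('<td><span>193<br/>120</span></td>');

// The expression matches two text nodes, but SimpleXML hands back
// an object for the parent <span> of each match.
$result = $td->xpath('/td/span/text()');

// Casting each entry to string concatenates the span's child text nodes.
$strings = array_map('strval', $result);

echo count($result), PHP_EOL;
print_r($strings);
```

Following the explanation above, both entries should cast to the combined string rather than to 193 and 120 separately.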


Since the DOM API has objects for every kind of node that can possibly exist in XML, and PHP includes a full implementation of that API, the XPath expression will work as expected there. What's more, the SimpleXML and DOM objects are actually both wrappers around the same internal memory structures, so you can write operations combining the two using dom_import_simplexml() and simplexml_import_dom().

As a slightly inelegant example, if you wanted to run an XPath expression in the context of an element you'd already traversed to with SimpleXML, you could do something like this:

$dom_node = dom_import_simplexml($simplexml_node);
$dom_xpath = new DOMXPath($dom_node->ownerDocument);
$dom_xpath_result = $dom_xpath->query('span/text()', $dom_node);

foreach($dom_xpath_result as $xnode){
    echo "<br /><br />NodeValue: " . $xnode->nodeValue;
}

Obviously, you could wrap this up into a function as desired. Also note that since your expression starts at the document root (leading /) the actual context is irrelevant, which is why I've used a slightly different expression above.
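One possible way to wrap this into a function is sketched below. The helper name simplexml_xpath_strings() is hypothetical, and the test markup is assumed, since the original document is not shown.

```php
<?php
/**
 * Hypothetical helper: run an XPath expression relative to a SimpleXML
 * element via DOM and return the string value of every matching node.
 */
function simplexml_xpath_strings(SimpleXMLElement $context, string $expression): array
{
    // Both APIs share the same underlying document, so this is not a copy.
    $domNode = dom_import_simplexml($context);
    $xpath = new DOMXPath($domNode->ownerDocument);

    $strings = [];
    foreach ($xpath->query($expression, $domNode) as $node) {
        $strings[] = $node->nodeValue;
    }
    return $strings;
}

// Assumption: hypothetical markup with two text nodes inside one <span>.
$td = simplexml_load_string('<td><span>193<br/>120</span></td>');
$parts = simplexml_xpath_strings($td, 'span/text()');

print_r($parts);
```

Because DOM has a node type for text, each text node comes back on its own instead of being folded into its parent element.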

Wednesday, November 9, 2022