
I fetch the HTML of another site with file_get_contents. How can I get the value of a specific tag?

Let's say I have:

<div id="global"><p class="paragraph">1800</p></div>

How can I get the paragraph's value? Thanks.

 Answers

2

If the example is really that trivial you could just use a regular expression. For generic HTML parsing though, PHP has DOM support:

$dom = new DOMDocument();
// Single quotes around the markup, so the double quotes inside it need no escaping.
$dom->loadHTML('<div id="global"><p class="paragraph">1800</p></div>');
echo $dom->getElementsByTagName('p')->item(0)->nodeValue;
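
If the page contains more than one paragraph, taking the first p element may not be enough. A minimal sketch using DOMXPath to select the paragraph by its class inside the div with id "global" could look like this. The XPath expression and the $value variable are illustrative, and the class test is an exact match on the attribute, so it would not match an element carrying several classes.

```php
<?php
$html = '<div id="global"><p class="paragraph">1800</p></div>';

$dom = new DOMDocument();
// Collect libxml warnings internally instead of printing them;
// real-world HTML is often not well-formed.
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
// Select <p class="paragraph"> elements that are children of <div id="global">.
$nodes = $xpath->query('//div[@id="global"]/p[@class="paragraph"]');

// Take the text of the first match, or null when nothing matched.
$value = $nodes->length > 0 ? $nodes->item(0)->nodeValue : null;
echo $value; // 1800
```

In practice $html would be the string returned by file_get_contents for the remote page.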
Sunday, November 6, 2022
 
4

A few years ago I benchmarked the two, and cURL was faster. With cURL you create one handle that can be reused for every request, and it maps directly to the very fast libcurl library. With file_get_contents you have the overhead of the protocol wrappers, and the initialization code is executed for every single request.

I will dig out my benchmark script and run it on PHP 5.3, but I suspect that cURL will still be faster.
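
The handle reuse described above can be sketched roughly as follows. This is not the benchmark script, only an illustration; fetch_all is a hypothetical helper name, and it assumes the cURL extension is installed.

```php
<?php
// Hypothetical helper: fetch several URLs through one shared cURL handle.
function fetch_all(array $urls): array
{
    $ch = curl_init();
    // Return the response body as a string instead of printing it.
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);

    $bodies = [];
    foreach ($urls as $url) {
        // Only the URL changes between requests; the handle and its
        // other options are reused.
        curl_setopt($ch, CURLOPT_URL, $url);
        $result = curl_exec($ch);
        $bodies[$url] = ($result === false) ? null : $result;
    }
    // The handle is released when $ch goes out of scope.
    return $bodies;
}
```

Calling fetch_all with a list of URLs returns an array keyed by URL, with null for any request that failed.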

Friday, December 16, 2022
 
3

In your url try:

http://user:password@example.com/

(append whatever the rest of the URL for your API should be)

Friday, November 4, 2022
 
matoran
 
2

Hooray!!!

I found this source code:

1) Create Readability.php

2) Create JSLikeHTMLElement.php

3) Create index.php with this code:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
    <head>
        <title>!</title>
        <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
    </head>
<body dir="rtl">
<?php
include_once 'Readability.php';


// get latest Medialens alert 
// (change this URL to whatever you'd like to test)
$url = 'http://';
$html = file_get_contents($url);

// Note: PHP Readability expects UTF-8 encoded content.
// If your content is not UTF-8 encoded, convert it 
// first before passing it to PHP Readability. 
// Both iconv() and mb_convert_encoding() can do this.

// If we've got Tidy, let's clean up input.
// This step is highly recommended - PHP's default HTML parser
// often doesn't do a great job and results in strange output.
if (function_exists('tidy_parse_string')) {
    $tidy = tidy_parse_string($html, array(), 'UTF8');
    $tidy->cleanRepair();
    $html = $tidy->value;
}

// give it to Readability
$readability = new Readability($html, $url);
// print debug output? 
// useful to compare against Arc90's original JS version - 
// simply click the bookmarklet with FireBug's console window open
$readability->debug = false;
// convert links to footnotes?
$readability->convertLinksToFootnotes = true;
// process it
$result = $readability->init();
// does it look like we found what we wanted?
if ($result) {
    echo "== Title =====================================\n";
    echo $readability->getTitle()->textContent, "\n\n";
    echo "== Body ======================================\n";
    $content = $readability->getContent()->innerHTML;
    // if we've got Tidy, let's clean it up for output
    if (function_exists('tidy_parse_string')) {
        $tidy = tidy_parse_string($content, array('indent'=>true, 'show-body-only' => true), 'UTF8');
        $tidy->cleanRepair();
        $content = $tidy->value;
    }
    echo $content;
} else {
    echo "Looks like we couldn't find the content. :(";
}
?>
</body>
</html>

In the line $url = 'http://'; set the URL of your site.

Thank you ;)

Saturday, October 15, 2022
 
purag
 
5

As per the manual, set ignore_errors to true:

$opts = array(
  'http' => array(
      'method' => "GET",
      'header' => "Accept-language: en\r\n",
      'ignore_errors' => true
  )
);
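
For completeness, here is a sketch of how such an options array is typically passed to file_get_contents through a stream context. The fetch_with_status helper is a hypothetical name introduced for illustration. With ignore_errors enabled, the body is returned even when the server answers with a 4xx or 5xx status, so the status line has to be checked separately.

```php
<?php
// Hypothetical helper: return [status line, body] for a URL,
// including responses with an error status.
function fetch_with_status(string $url): array
{
    $opts = [
        'http' => [
            'method' => 'GET',
            'header' => "Accept-language: en\r\n",
            'ignore_errors' => true,
        ],
    ];
    $context = stream_context_create($opts);
    $body = file_get_contents($url, false, $context);

    // The HTTP wrapper fills $http_response_header in the local scope;
    // its first entry is the status line, e.g. "HTTP/1.1 404 Not Found".
    $status = $http_response_header[0] ?? null;

    return [$status, $body];
}
```

The caller can then inspect the status line to decide whether the body is the requested page or an error document.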
Monday, December 26, 2022
 
igrimpe
 