
I'm using file_get_contents() to access a URL.

file_get_contents('http://somenotrealurl.com/notrealpage');

If the URL is not real, it returns the error message below. How can I get it to fail gracefully, so that I know the page doesn't exist and can act accordingly without displaying this error message?

file_get_contents('http://somenotrealurl.com/notrealpage') 
[function.file-get-contents]: 
failed to open stream: HTTP request failed! HTTP/1.0 404 Not Found 
in myphppage.php on line 3

For example, in Zend you can say: if ($request->isSuccessful())

$client = new Zend_Http_Client();
$client->setUri('http://someurl.com/somepage');

$request = $client->request();

if ($request->isSuccessful()) {
 //do stuff with the result
}

 Answers

1

You need to check the HTTP response code:

function get_http_response_code($url) {
    // get_headers() performs its own request and returns the raw response headers
    $headers = get_headers($url);
    // e.g. "HTTP/1.1 200 OK" -> "200"
    return substr($headers[0], 9, 3);
}

if (get_http_response_code('http://somenotrealurl.com/notrealpage') != "200") {
    echo "error";
} else {
    $contents = file_get_contents('http://somenotrealurl.com/notrealpage');
}
Sunday, October 9, 2022
4

A few years ago I benchmarked the two and cURL was faster. With cURL you create a single handle that can be reused for every request, and it maps directly onto the very fast libcurl library. With file_get_contents you pay the overhead of the stream wrappers and their initialization code on every single request.
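For illustration, a minimal sketch of that reuse pattern (the example.com URLs are placeholders):

// Sketch: one cURL handle reused across several requests
$ch = curl_init();
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);     // return the body instead of printing it

foreach (array('http://example.com/a', 'http://example.com/b') as $url) {
    curl_setopt($ch, CURLOPT_URL, $url);
    $body = curl_exec($ch);
    $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);  // HTTP status of the last transfer
    // ... use $body / $code ...
}

curl_close($ch);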

I will dig out my benchmark script and run it on PHP 5.3, but I suspect cURL will still be faster.

Friday, December 16, 2022
 
3

In your URL, try:

http://user:password@site/ 

(append whatever the rest of the URL for your API should be)
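With file_get_contents that could look roughly like this (user, password, host and path are placeholders); the http wrapper normally turns the embedded credentials into an HTTP Basic auth header, and you can also send the header explicitly via a stream context:

// Sketch: HTTP Basic auth with the credentials embedded in the URL (placeholders)
$data = file_get_contents('http://user:password@site.example.com/api/endpoint');

// Equivalent sketch with an explicit Authorization header instead
$context = stream_context_create(array(
    'http' => array(
        'header' => 'Authorization: Basic ' . base64_encode('user:password'),
    ),
));
$data = file_get_contents('http://site.example.com/api/endpoint', false, $context);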

Friday, November 4, 2022
 
matoran
 
2

That webserver appears to return a 403 Forbidden error when your HTTP request does not include a user-agent string. RCurl by default does not pass a user-agent. You can set one with the useragent= parameter.

myurl<-"http://www.transfermarkt.es/liga-mx-apertura/startseite/wettbewerb/MEXA"
url.exists(myurl, useragent="curl/7.39.0 Rcurl/1.95.4.5")
# [1] TRUE
htmlTreeParse(getURL(myurl, useragent="curl/7.39.0 Rcurl/1.95.4.5"))

The httr package is a bit nicer than RCurl for making HTTP requests, in my opinion (and it sets a user-agent string by default). Here's the corresponding code:

library(httr)
GET(myurl)
Sunday, November 20, 2022
1

STD 66, Percent-Encoding:

A percent-encoding mechanism is used to represent a data octet in a component when that octet's corresponding character is outside the allowed set or is being used as a delimiter of, or within, the component.

So percent-encoding is a kind of escape mechanism: Some characters have a special meaning in URI components (→ they are reserved). If you want to use such a character without its special meaning, you percent-encode it.

Unreserved characters (like a, b, c, …) can always be used directly, but it’s also allowed to percent-encode them. Such URIs would be equivalent:

URIs that differ in the replacement of an unreserved character with its corresponding percent-encoded US-ASCII octet are equivalent: they identify the same resource.
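For instance, %61 is just the percent-encoded octet of the unreserved character a, so a URI containing %61 and one containing a plain a identify the same resource. A small PHP illustration (example.com is a placeholder):

// "a" is unreserved, so these two URIs identify the same resource:
//   http://example.com/path/a
//   http://example.com/path/%61
var_dump(rawurldecode('%61') === 'a');   // bool(true)

// rawurlencode() leaves unreserved characters alone and percent-encodes the rest
echo rawurlencode('a');     // a
echo rawurlencode('a/b?');  // a%2Fb%3F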

Why is it allowed to percent-encode unreserved characters in the first place? The obsolete RFC 2396 says (emphasis mine):

Unreserved characters can be escaped without changing the semantics of the URI, but this should not be done unless the URI is being used in a context that does not allow the unescaped character to appear.

I can’t think of an example for such a "context", but this sentence suggests that there may be some.

Also, maybe some people/implementations prefer to simply percent-encode everything (except for delimiters etc.), so they don’t have to check which characters actually need percent-encoding in the corresponding component.

Sunday, September 11, 2022
 
gianmt
 