I'm teaching myself some basic scraping and I've found that sometimes the URLs that I feed into my code return 404, which gums up all the rest of my code.
So I need a test at the top of the code to check if the URL returns 404 or not.
This would seem like a pretty straightforward task, but Google's not giving me any answers. I worry I'm searching for the wrong stuff.
One blog recommended I use this:
$valid = @fsockopen($url, 80, $errno, $errstr, 30);
and then test to see whether $valid is empty or not.
But I think the URL that's giving me problems has a redirect on it, so $valid is coming up empty for all values. Or perhaps I'm doing something else wrong.
I've also looked into a "head request" but I've yet to find any actual code examples I can play with or try out.
Suggestions? And what's this about curl?
If you are using PHP's curl bindings, you can check the error code using
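As a minimal sketch of a curl-based check, assuming the curl extension is installed: the code below sends a HEAD request, follows redirects, and reads the final status code with curl_getinfo and CURLINFO_HTTP_CODE. The function names fetch_status_code and is_not_found are hypothetical, chosen for illustration, and the timeout and redirect limits are arbitrary.

```php
<?php
// Sketch only: fetch_status_code() and is_not_found() are hypothetical helper names.

/**
 * Return the HTTP status code of the final response for $url,
 * or 0 if no response was received at all (DNS failure, timeout, ...).
 */
function fetch_status_code(string $url): int
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_NOBODY         => true, // HEAD request: headers only, no body
        CURLOPT_FOLLOWLOCATION => true, // follow redirects to the final URL
        CURLOPT_MAXREDIRS      => 5,    // arbitrary cap on redirect hops
        CURLOPT_RETURNTRANSFER => true, // never echo the response
        CURLOPT_TIMEOUT        => 10,   // arbitrary timeout in seconds
    ]);
    curl_exec($ch);

    // With redirects followed, this is the status of the last response.
    return (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
}

/** True when the status code means "404 Not Found". */
function is_not_found(int $status): bool
{
    return $status === 404;
}
```

Calling is_not_found(fetch_status_code($url)) at the top of the scraper lets it skip dead pages before doing any other work. Because redirects are followed, a URL that redirects to a live page reports the final status rather than 301 or 302. Some servers reject HEAD requests, so a result such as 405 may mean the check should be retried with a normal GET.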