Simple PHP script to check for valid links

A friend asked me for this and it turns out to be a bit trickier than you’d think.

There are plenty of tools for crawling a website and reporting broken links.  You can get 90% of this with just wget or curl.

But checking a list of links is a bit trickier.

The basic test is pretty simple:


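foreach ($urllist as $url) {
   // fopen() requires a mode argument; @ hides the warning when the open fails
   $fh = @fopen($url, 'r');
   if ($fh) { print "valid\n"; fclose($fh); }
}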

But it requires allow_url_fopen to be enabled, doesn’t check for redirects, chokes if you’re behind a proxy, etc.

Using curl solves these particular problems, but it requires libcurl to be built into your PHP, and it is quite clunky to use.
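To give a feel for the clunkiness, here is a minimal sketch of a curl-based check. It is not the code from the gist below. The function names `check_url` and `is_valid_status`, the timeout and redirect limits, and the rule that any 2xx or 3xx final status counts as valid are all my own assumptions for illustration.

```php
<?php
// Hypothetical helper: returns the final HTTP status code for $url,
// or 0 if the request failed (DNS error, timeout, too many redirects, ...).
function check_url(string $url, ?string $proxy = null): int
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_NOBODY         => true,  // send a HEAD request, skip the body
        CURLOPT_RETURNTRANSFER => true,  // don't echo the response
        CURLOPT_FOLLOWLOCATION => true,  // follow redirects...
        CURLOPT_MAXREDIRS      => 5,     // ...but not forever (assumed limit)
        CURLOPT_CONNECTTIMEOUT => 5,     // assumed timeouts, in seconds
        CURLOPT_TIMEOUT        => 10,
    ]);
    if ($proxy !== null) {
        // e.g. "proxy.example.com:3128" (placeholder value)
        curl_setopt($ch, CURLOPT_PROXY, $proxy);
    }
    $ok = curl_exec($ch);
    return $ok === false ? 0 : (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
}

// Assumed policy: treat any 2xx or 3xx final status as a valid link.
function is_valid_status(int $code): bool
{
    return $code >= 200 && $code < 400;
}
```

Something like `is_valid_status(check_url($url))` would then replace the `fopen()` test in the loop. Some servers reject HEAD requests, so a real checker may need to fall back to GET.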

Anyway, here’s what I came up with: https://gist.github.com/1508261

You’ll still run into URL parsing problems. For example, a URL needs a trailing slash after the hostname (the curl command line handles this fine, but PHP does not).  Building the list of URLs is an exercise left to the reader.
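One way to work around the trailing-slash problem is to normalize each URL before checking it. This is a rough sketch and not part of the gist. `normalize_url` is a name I made up, and it only handles the simple case of a URL with a hostname and no path.

```php
<?php
// Hypothetical helper: append "/" when a URL has a host but no path,
// e.g. "http://example.com" becomes "http://example.com/".
function normalize_url(string $url): string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['host'])) {
        return $url; // not something we can safely rewrite
    }
    if (isset($parts['path'])) {
        return $url; // already has a path, leave it alone
    }
    // Insert "/" right after the hostname, before any query or fragment.
    $pos = strcspn($url, '?#');
    return substr($url, 0, $pos) . '/' . substr($url, $pos);
}
```

URLs without a scheme (plain `example.com`) come back unchanged, because `parse_url()` treats them as a path rather than a host.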

If anyone wants it, I can put up a simple web UI wrapper with a text area, file upload button, or REST web service for scanning URLs.
