There are any number of reasons to interact with resources on the web programmatically. The best-known programs that do so are browsers such as Internet Explorer, Firefox, and Safari; search engine spiders such as Googlebot; and RSS readers. Less famous are automated testing tools that check for 404 (Not Found) and other errors. As people who maintain sites delve into IIS, cPanel, .htaccess, or custom authentication code, it becomes important to verify that pages return an HTTP status code of 200 (OK) when you expect them to, or the right type of redirect (temporary vs. permanent). Testing software aims to surface unknown problems on larger sites that have become difficult to check by hand.
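To make the status codes mentioned above concrete, here is a minimal C# sketch that turns a code into a short label for a test report. The names `StatusKind` and `Describe` are hypothetical, invented for this illustration; they are not part of the .NET Framework. Only a handful of common codes are covered.

```csharp
using System;
using System.Net;

public static class StatusKind
{
    // Hypothetical helper: maps an HTTP status code to a short report label.
    // The switch works on the numeric value so that codes missing from older
    // versions of the HttpStatusCode enum (such as 308) are still handled.
    public static string Describe(HttpStatusCode code)
    {
        switch ((int)code)
        {
            case 200:
                return "ok";
            case 301:
            case 308:
                return "permanent redirect";
            case 302:
            case 307:
                return "temporary redirect";
            case 404:
                return "not found";
            default:
                return "other (" + (int)code + ")";
        }
    }
}
```

A caller would pass in the `StatusCode` property of an `HttpWebResponse`, for example `StatusKind.Describe(response.StatusCode)`.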
While some of these tools are worth the price, they can easily become overkill, especially considering how easy it is to gather this information yourself. Using the built-in functionality of the Microsoft .NET Framework, we can gather a wealth of information about any reachable URL. From there, testing an entire site involves little more than writing a loop. The code provided below doesn’t check whether the host machine is connected to the internet; if it is not, a suitable error will be returned to the caller. For an example of how to test a machine’s connectivity, see this ASP Emporium article.
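As a rough illustration of the "little more than a loop" idea, the sketch below asks for the status code of each URL in a list using `HttpWebRequest`. It is not the WebStatus class from this article; `SiteCheck`, `GetStatus`, the example.com URLs, the ten-second timeout, and the use of -1 for "no HTTP response" are all assumptions made for the example. Newer .NET versions mark `WebRequest` as obsolete in favor of `HttpClient`, so treat this as a .NET Framework-era sketch.

```csharp
using System;
using System.Net;

public static class SiteCheck
{
    // Returns the numeric HTTP status code for a URL, or -1 when no HTTP
    // response was received (DNS failure, timeout, no connectivity).
    public static int GetStatus(string url)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "HEAD";            // headers only; some servers reject HEAD
        request.AllowAutoRedirect = false;  // report redirects instead of following them
        request.Timeout = 10000;            // milliseconds (assumed value)

        try
        {
            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
            {
                return (int)response.StatusCode;
            }
        }
        catch (WebException ex)
        {
            // Error statuses such as 404 or 500 surface as exceptions that
            // still carry the server's response.
            HttpWebResponse error = ex.Response as HttpWebResponse;
            if (error == null)
            {
                return -1;
            }
            using (error)
            {
                return (int)error.StatusCode;
            }
        }
    }

    public static void Main()
    {
        // Placeholder URLs; substitute the pages of the site under test.
        string[] urls = { "http://example.com/", "http://example.com/missing-page" };
        foreach (string url in urls)
        {
            Console.WriteLine("{0} -> {1}", url, GetStatus(url));
        }
    }
}
```

Turning off `AllowAutoRedirect` is a deliberate choice here: with it left on, a redirected page would report the status of its final destination, hiding whether the redirect itself was temporary or permanent.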
The WebStatus object defined below is reusable: once an instance of the class has been created, you can either use it to cache state information until it falls out of scope, or use it again and again to query new URLs. This has nothing to do with object-oriented programming; it reduces pressure on the managed heap, so your application uses less memory and triggers fewer garbage collections.