When we first implemented the CMS here at Middlebury I was a summer intern working in a closet in Warner basement between stacks of printer cartridges and Dell driver CDs. One of my job responsibilities at that time was to fix any broken links on the CMS. To “help” me out, a script was created that sent an email to a central mailbox account every time somebody tried to access a page that didn’t exist on the CMS. In web terms, this is known as a 404 error. I received several thousand of these messages each day.
The volume of emails was too great for anyone to keep up with, so the script was turned off. Since then we have relied on the sub-site content providers to maintain their own sites and fix any broken links they might hear about from peers. Rather, that was the case until last night, when I rewrote the 404 Email script for our new CMS framework. Stephen Kiel, Technology Coordinator for the Career Services Office, noticed a line in our CMS documentation which said:
“404 email notifications can be delivered to Content Providers upon request. Know about broken links on your sub-site as users discover them, very handy for large sub-sites with many channels/postings.”
As far as I can figure, this was meant to be a new feature of the MiddCMS framework that made it into the documentation but never into the code. There is a section of the code that tries to log each 404 request to the server, but it is commented out with a note about performance concerns, and even that doesn’t do what our documentation says we do. Still, the functionality is simple enough. We just need a configuration file listing the email address of each person to notify and the part of the site they’re concerned with. Every time the server handles a 404, we parse the configuration file and email the details of the request to anyone whose part of the site it falls under. This took me about half an hour to put together, including some time to add error handling for parsing the configuration file and other concerns.
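For the curious, here is roughly what that logic looks like. This is a sketch in Python, not the actual MiddCMS code: the function names, the sender address, and the rule that a path matches everything beneath it are assumptions made for illustration. Only the element and attribute names (ErrorMails, ErrorMail, path, email) come from our configuration file.

```python
import xml.etree.ElementTree as ET
from email.message import EmailMessage

# Same shape as the real configuration file; the helpers below are hypothetical.
SAMPLE_CONFIG = """
<ErrorMails>
  <ErrorMail path="/administration/cso" email="imcbride@middlebury.edu"></ErrorMail>
</ErrorMails>
"""


def load_rules(xml_text):
    """Return a list of (path, email) pairs read from the configuration XML."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        # A broken configuration file should never break the 404 page itself.
        return []
    rules = []
    for node in root.iter("ErrorMail"):
        path = node.get("path")
        email = node.get("email")
        if path and email:  # skip entries missing either attribute
            rules.append((path.rstrip("/"), email))
    return rules


def recipients_for(requested_path, rules):
    """Return the addresses whose configured path contains the requested path."""
    return [
        email
        for path, email in rules
        # Match the path itself or anything beneath it, but not a sibling
        # such as /administration/csoffice.
        if requested_path == path or requested_path.startswith(path + "/")
    ]


def build_message(requested_url, referer, to_addr,
                  from_addr="webmaster@example.edu"):  # placeholder sender
    """Build the notification email; the referer may be blank."""
    msg = EmailMessage()
    msg["Subject"] = "404 Not Found: " + requested_url
    msg["From"] = from_addr
    msg["To"] = to_addr
    msg.set_content(
        "Requested URL: " + requested_url + "\n"
        "Referer: " + (referer or "(none)") + "\n"
    )
    return msg
```

In a real handler, each message would then be passed to the mail server, for example with smtplib's send_message. Re-reading the file on every 404 keeps the sketch simple and means a new entry takes effect immediately.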
Now we can send our content providers an email letting them know that a user wasn’t able to get to a page within their sub-site, just by adding a line to an XML document like this:
<ErrorMails>
<ErrorMail path="/administration/cso" email="imcbride@middlebury.edu"></ErrorMail>
</ErrorMails>
With that in place, I would receive an email for every bad request to anything under “/administration/cso”, and each email would contain two URLs. The first is the address the user tried to reach, such as http://www.middlebury.edu/administration/cso/does_not_exist.htm. The second is the referer of the page request, if it exists. Typically, the referer will be the page the user just came from. This is useful because the content provider can go to that page and fix any bad links to content that doesn’t exist. Sometimes the referer will be blank. The most common cases are:
- If the user typed the page address directly into their address bar
- If they used a bookmark
- If they clicked a link in an email
There are certainly other scenarios, but those are by far the most common for our site. So if you want this functionality for your sub-site, just let us know by sending an email to the Helpdesk requesting it. I’ll be glad to add your email address to the list!
