Apache

Handling 404 errors on hosted CMS

A general housekeeping task for CMS platforms such as WordPress and Drupal, and for any other website, is to make sure you handle missing pages (404 errors) gracefully. It is also good practice for keeping your site's SEO ranking high.
One routine task to carry out is checking for crawl errors in Google Webmaster Tools. If you see any missing pages in the list, it is worth making sure you have measures in place to handle them, ideally by issuing a 301 redirect so that Google and other search engines update their indexes.
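As a rough illustration of the kind of rule involved, here is a minimal .htaccess sketch. The paths and the example.com host are placeholders, not taken from any real site, and it assumes your host allows mod_alias directives and ErrorDocument in .htaccess (AllowOverride FileInfo).

```apache
# Placeholder URLs: substitute the missing page reported in your crawl errors
# and the page that replaces it.
# Redirect (mod_alias) sends a permanent 301 so search engines update their index.
Redirect 301 /old-page.html https://www.example.com/new-page/

# For anything still missing, serve a friendly page instead of the default error.
# /404.html is a hypothetical local path.
ErrorDocument 404 /404.html
```

Keeping the ErrorDocument target as a local path rather than a full URL matters here: with a full URL Apache issues a redirect and the client never sees the 404 status.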

Safeguard your web site with routine web log analysis and forensics

Whether you are running Drupal, WordPress, ExpressionEngine, Joomla or in fact any web site, one of the regular tasks you should carry out is a bit of log analysis. Protecting your web site is often left to modules, plug-ins or someone else until it is too late.
We all rely on Google Analytics to tell us about visitors, and maybe use log analysis software (AWStats, Webalizer etc.) to report on log entries, but it is always worth using tools locally to dig deeper into your logs. These can range from simple reports on accesses to your site to more detailed forensic analysis of site activity.
By doing this we get to know better how visitors are accessing our site and can uncover some interesting answers to questions such as:

  • How often is Google actually spidering my site?
  • How many errors am I getting and what are they?
  • Who is stealing my content?
  • Is anyone trying to crack my site?

In this post I will briefly cover some useful techniques to analyse your logs and see if anyone is abusing your hospitality.

Securing access to files on your website

It is easy to forget that the files in your web site are visible to anyone, even if they are not linked to or are not the kind of files normally requested. In this post we look at how to use the .htaccess file to control access to your site.
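A minimal sketch of what such access control can look like. The extension list is purely illustrative, and the syntax assumes Apache 2.4 (on Apache 2.2 the equivalent is `Order allow,deny` followed by `Deny from all`). Your host must also permit these directives in .htaccess.

```apache
# Refuse direct requests for files that should never be served to visitors.
# The extensions below are examples only; adjust them to your own site.
<FilesMatch "\.(inc|sql|bak|log)$">
    Require all denied
</FilesMatch>

# Do not list directory contents when a folder has no index file.
Options -Indexes
```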

htaccess Rewrites - Discarding the unwanted Querystring

I use the .htaccess file a lot on hosted servers. On our own servers I prefer to use httpd.conf, as it performs better and is not re-evaluated on every request. But if you are on a hosted server, the .htaccess file is your first port of call for handling incoming traffic, and it can be more efficient than using modules for certain tasks. One common gotcha is how to discard the query string for a redirect.
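For reference, a short sketch of the two usual ways to drop the query string with mod_rewrite. The `old-page` and `/new-page` paths are hypothetical, and the rule is written for a .htaccess file in the site root.

```apache
RewriteEngine On

# Apache 2.4 and later: the QSD flag discards the incoming query string,
# so /old-page?foo=bar redirects to /new-page with nothing appended.
RewriteRule ^old-page$ /new-page [R=301,L,QSD]

# Apache 2.2 has no QSD flag; a trailing "?" on the target does the same job.
# RewriteRule ^old-page$ /new-page? [R=301,L]
```

Without either of these, mod_rewrite passes the original query string through to the redirect target unchanged, which is the gotcha mentioned above.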