Securing access to files on your website

It is easy to forget that the files in your web site can be requested by anyone who knows or guesses the URL, even if nothing links to them and they are not files normally requested. In this post we look at how to use the .htaccess file to control access to your site.

Most CMS installations include .txt files and the like that often contain information about your installation that could help malicious visitors to hack your site. The simplest way to deal with them is to delete them. However, that can be a headache, since they exist in many of the sub-folders and will be replaced when you upgrade or add to your CMS installation. There are also files, such as cron and update, that you need on your site but don't want anyone else to access. This is where the .htaccess file helps us out. You can write rules that block access to specific files and folders, limit access to specific IP addresses and much more. In this post we look at a simple way to secure a Drupal installation, but the approach is just as relevant to WordPress, Joomla, ExpressionEngine or any other web site.

The FilesMatch directive of the .htaccess file

The .htaccess file gives us a lot of control over access to our website. It is used mostly to limit access to folders and files, rewrite URLs and control access to the site generally. The FilesMatch directive provides a useful way to limit access to files whose names match a regular expression pattern. In this example we are going to prevent anyone but the site owners from accessing critical Drupal files such as cron, install and update, and prevent everyone else from viewing any text files on the site. Add the following block near the top of the .htaccess file, preferably before any rewrite rules.

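# Block the sensitive scripts and all .txt files, except for the listed IPs
<FilesMatch "(xmlrpc\.php|install\.php|bbcron\.php|cron\.php|update\.php|\.txt)$">
  Order deny,allow
  Deny from all
  Allow from 127.0.0.1
  Allow from 123.123.123.123
</FilesMatch>
# Re-open robots.txt to everyone (this later block overrides the one above)
<FilesMatch "^robots\.txt$">
  Order allow,deny
  Allow from all
</FilesMatch>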

This says "deny access to the files listed (xmlrpc.php, install.php, cron.php, update.php and any .txt files) to all but the two IP addresses".
The IP addresses to add are the local server IP and your own remote IP, so that you can still access these files and they can still be run from the server.
If you use the httpd.conf file you can use an include file to hold the list of IPs (we'll explore that in a separate post).
Unfortunately you can't have includes in .htaccess.
The next block overrides the rule for text files to allow anyone access to the robots.txt file.
The robots.txt exception is critical, as you don't want to block access to that file!
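The Order, Deny and Allow directives used above are the Apache 2.2 way of doing this. Apache 2.4 replaced them with the Require directive, and the old ones only keep working there through mod_access_compat. As a rough, untested sketch, assuming a 2.4 server and the same two placeholder IP addresses, the two blocks might look like this:

```apache
# Apache 2.4 sketch: only the listed IPs may request these files
<FilesMatch "(xmlrpc\.php|install\.php|bbcron\.php|cron\.php|update\.php|\.txt)$">
  Require ip 127.0.0.1 123.123.123.123
</FilesMatch>
# robots.txt stays readable by everyone
<FilesMatch "^robots\.txt$">
  Require all granted
</FilesMatch>
```

As with the 2.2 version, the robots.txt block comes second so that it takes precedence over the general .txt rule.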
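If you are brave enough to use regular expressions you could achieve the whole thing in one block. The trick is a negative lookahead, (?!robots\.txt$), which lets the .txt alternative match every text file except robots.txt.

# The pattern is anchored with ^ so the lookahead is tested against the whole file name
<FilesMatch "^(xmlrpc\.php|install\.php|bbcron\.php|cron\.php|update\.php|(?!robots\.txt$).*\.txt)$">
  # deny,allow lets the Allow lines override "Deny from all" for the listed IPs
  Order deny,allow
  Deny from all
  Allow from 127.0.0.1
  Allow from 123.123.123.123
</FilesMatch>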

Be sure to back up the .htaccess file first and to test, preferably locally! Regular expressions can be mind-numbing at the best of times, and a syntax error in the .htaccess file will cause a 500 error and take your site down. The two-block solution above, which explicitly allows the robots.txt file, is probably the safest.

Robots.txt

The robots.txt file is a valuable way to "control" which files search engines look at, so you don't want to prevent them from reading it.
Remember, though, that the robots.txt file doesn't limit access to files; it is only a guide for well-behaved search engine spiders.
Malicious intruders don't respect the Disallow lines in your robots.txt file and, moreover, may use that file to hunt down interesting-looking folders!
So always protect files that you don't want intruders to access.
If you can, avoid keeping such files on your web site at all; if you can't, consider using the .htaccess file.
Windows servers allow similar control in IIS7, or you can use the excellent IsapiRewrite filter.
If you control your own server, migrate such rules into your httpd.conf, as that is much more efficient.
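
As a rough illustration of that last point, the same rules can sit inside a Directory block in httpd.conf, where the Include directive (not available in .htaccess) can pull the IP list in from a separate file. The document root /var/www/html and the file name conf/allowed-ips.conf are made-up placeholders, and this sketch has not been tested on a live server:

```apache
# httpd.conf (or a virtual host file); both paths below are hypothetical
<Directory "/var/www/html">
  <FilesMatch "(xmlrpc\.php|install\.php|bbcron\.php|cron\.php|update\.php|\.txt)$">
    Order deny,allow
    Deny from all
    # allowed-ips.conf holds one "Allow from x.x.x.x" line per trusted address
    Include conf/allowed-ips.conf
  </FilesMatch>
  <FilesMatch "^robots\.txt$">
    Order allow,deny
    Allow from all
  </FilesMatch>
</Directory>
```

A relative Include path is resolved against the ServerRoot. Unlike .htaccess changes, edits to httpd.conf or to the included file only take effect after Apache is reloaded.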

Exactly what I'm trying to have

But it doesn't work :(
I use the first part of your .htaccess to protect some *.avi files from direct download - it works fine for direct downloads

BUT

those *.avi files, which should be displayed as video via flowplayer, aren't played: I see the flowplayer and a message telling me that the files can't be loaded or played :(

Any ideas?

Best regards
D

Check for Flowplayer accessing from your site

You will need to put in an exception to allow Flowplayer access to the files.
I would check to see if you can determine the referrer and use a rewrite rule instead.
Only let *.avi files be accessed by requests that come from your own site.

Something like this (I haven't tested it):
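RewriteEngine On
# Forbid the request unless the referrer is a page on yoursite.com (with or without www)
# Requests that send no referrer at all, such as direct downloads, are forbidden too
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?yoursite\.com(/|$) [NC]
RewriteRule \.(avi|swf)$ - [F,NC]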