Whitelisting File Extensions in Apache
Last week, Xavier published a great diary about the dangers of leaving behind backup files on your web server. There are a few different ways to avoid this issue, and as usual, defense in depth applies: one should consider multiple controls to prevent these files from hurting you. Many approaches blocklist specific extensions, but as always with blocklists, this is dangerous because the list may miss some files. For example, different editors use different extensions to mark backup files, and Emacs (yes... I am an Emacs fan) may not only leave a backup file by appending a ~ at the end, but may also leave a second file with a '#' prefix and postfix if you abort the editor.
For all these reasons, it is nice if you can actually whitelist the extensions that your application requires.
As a first step, enumerate what file extensions are in use on your site (I am assuming that "/srv/www/html" is the document root):
find /srv/www/html -type f | sort | sed 's/.*\.//' | sort | uniq -c | sort -n

     19 html~
     20 css
     20 pdf
     23 js
     50 gif
     93 html
    737 png
   3012 jpg
As you see in the abbreviated output above, most of the extensions are what you would expect from a normal web server. We also got a few Emacs backup HTML files (html~).
We will set up a simple text file "goodext.txt" with a list of all allowed extensions. This file will then help us create the Apache configuration, and we can use it for other configuration files as well (does anybody know how to do this well in mod_security?). The output of the command above can be used to get us started, but of course, we have to remove the extensions we don't want to see.
find . -type f | sort | sed 's/.*\.//' | sort -u > ~/goodext.txt
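As a rough first pass at removing unwanted entries, the candidate list can be filtered so that only purely alphanumeric extensions survive. This is a minimal sketch, not part of the scripts mentioned in this diary; the function name filter_extensions and the output file name are made up for illustration. It drops entries like "html~" or "html#", as well as the full paths that the sed expression emits for files without any dot.

```shell
# Hypothetical helper: read candidate extensions on stdin (one per line)
# and print only those consisting solely of letters and digits, deduplicated.
filter_extensions() {
    grep -E '^[A-Za-z0-9]+$' | sort -u
}

# Example: write a filtered copy for review instead of overwriting the list.
# filter_extensions < ~/goodext.txt > ~/goodext.reviewed.txt
```

The result still needs a manual look: extensions such as "bak" or "old" are alphanumeric and would pass this filter.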
Next, let's run a script to delete all the files that do not match these extensions. I posted a script that I have used in the past on GitHub.
The script uses the "goodext.txt" file we created above. The first couple of lines can be used to configure it. Of course, make a backup of your site first, and run the script in "debug" mode to see which files would be deleted!
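To illustrate the idea of a dry run, here is a minimal sketch that only lists the files whose extension is not in the allowed list. It is not the script from the GitHub repository; the function name list_bad_files and its arguments are assumptions made for this example, and it assumes file names do not contain newlines.

```shell
# Hypothetical sketch, not the script from the GitHub repository.
# Usage: list_bad_files DOCROOT GOODEXT_FILE
# Prints every regular file under DOCROOT whose extension is not listed
# in GOODEXT_FILE (one extension per line). Nothing is deleted.
list_bad_files() {
    docroot=$1
    goodext=$2
    find "$docroot" -type f | while IFS= read -r path; do
        name=${path##*/}                # strip leading directories
        case $name in
            *.*) ext=${name##*.} ;;     # text after the last dot
            *)   ext= ;;                # no dot at all: no extension
        esac
        # grep -x: match the whole line, -F: fixed string, -q: no output
        if [ -z "$ext" ] || ! grep -qxF -- "$ext" "$goodext"; then
            printf '%s\n' "$path"
        fi
    done
}

# Example: review the list before deleting anything.
# list_bad_files /srv/www/html ~/goodext.txt
```

Note that this sketch treats files without any extension as "bad" too, which may or may not be what you want for your site.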
Next, we create an Apache configuration file. Currently, the script only works for Apache 2.2. Apache 2.4 changed the syntax somewhat, and I need to test if the order of the directives needs to change. Include it as part of the Directory section of the configuration file:
Order allow,deny
Allow from all
Include www.goodext
(I don't name the extension file ".conf" so it will not be included automatically but only in this one specific spot).
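One way such an include file could be generated from "goodext.txt" is sketched below. This is an assumption about a workable approach, not necessarily what the scripts in the repository produce: it emits a FilesMatch block (Apache 2.2 syntax) that denies any file name whose last extension is not in the list, using a negative lookahead. The function name make_goodext_conf is made up for this example.

```shell
# Hypothetical sketch, not the script from the GitHub repository.
# Usage: make_goodext_conf GOODEXT_FILE > www.goodext
make_goodext_conf() {
    # Join the alphanumeric extensions into an alternation, e.g. "css|gif|html".
    alternation=$(grep -E '^[A-Za-z0-9]+$' "$1" | sort -u | paste -sd '|' -)
    # Refuse to write a config if no usable extension was found.
    [ -n "$alternation" ] || return 1
    # Match a last dot followed by anything other than an allowed extension.
    printf '<FilesMatch "\\.(?!(%s)$)[^.]*$">\n' "$alternation"
    printf '    Deny from all\n'
    printf '</FilesMatch>\n'
}
```

With "Order allow,deny" a matching Deny overrides the "Allow from all", so only files with a listed extension remain accessible. File names without any dot are not matched by this pattern and stay allowed, so test the result against your own site before relying on it.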
The two rather simple bash scripts, one to delete the "bad files" and one to create the Apache configuration file, can be found here: https://github.com/jullrich/fixbadwebfiles
Why use a script for this vs. just editing the files manually?
- typos
- faster if you have multiple servers
- there are two kinds of sysadmins: those that script, and those that will be replaced by a script.
Note that the scripts are strictly in the "works for me" state. Any bug reports and comments are welcome (use GitHub for bugs).