Drupal: is that site hacked?

Some months back I needed to find out whether a site had been hacked, and how much of its code had been modified. It was not straightforward to install a copy of the site out of the box (it had hardcoded absolute paths and so on).

http://dgo.to/hacked in conjunction with http://dgo.to/diff is a well-known way to find out whether core or contrib modules in a Drupal site have been modified. As I said, it was difficult to install the site locally, and I did not want to deal with its cumbersome admin area. So I thought of an elegant solution based on drush_make.

drush_make also works in reverse: it can generate a makefile from a given site. That is: drush generate-makefile sitename.make

So the approach is as follows:

First, generate a makefile from the original site. The resulting makefile is a description of the projects (and their versions) used to build the site, but it may not be complete in some cases. For example, it doesn't include libraries or external projects not hosted on drupal.org. It may also be less accurate if projects were git clones (and git_deploy is not enabled) or, even worse, CVS checkouts.

Secondly, build the makefile in another location to get a vanilla copy of the same code.

Lastly, diff both directories. Here I've used two different approaches: plain unix diff (diff -r -q hackedsite vanillasite) or a Python script I wrote for this same purpose. The Python script may be of interest if you want to customize the diff (for example, excluding files/) or add any scripting you need.
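To give an idea of what such a script could look like, here is a minimal sketch. It is not the original script: the function name diff_trees and the default exclude list are my own assumptions for illustration. It walks both trees, skips any directory whose name is in the exclude list (at any depth), and compares the remaining files byte by byte.

```python
import filecmp
import os


def diff_trees(left, right, exclude=("files", ".git")):
    """Compare two directory trees.

    Returns three sorted lists of relative paths:
    (only in left, only in right, present in both but different).
    The `exclude` names are an assumption; adjust them to your site.
    """
    def list_files(root):
        found = set()
        for dirpath, dirnames, filenames in os.walk(root):
            # Prune excluded directory names in place so os.walk skips them.
            dirnames[:] = [d for d in dirnames if d not in exclude]
            for name in filenames:
                full = os.path.join(dirpath, name)
                found.add(os.path.relpath(full, root))
        return found

    left_files = list_files(left)
    right_files = list_files(right)

    # shallow=False forces a content comparison instead of trusting
    # size and modification time, which differ on a fresh checkout anyway.
    changed = sorted(
        rel for rel in left_files & right_files
        if not filecmp.cmp(os.path.join(left, rel),
                           os.path.join(right, rel),
                           shallow=False)
    )
    only_left = sorted(left_files - right_files)
    only_right = sorted(right_files - left_files)
    return only_left, only_right, changed
```

Calling diff_trees("hackedsite", "vanillasite") from /var/www would then return the files that exist only in the suspect site, the files that exist only in the vanilla build, and the files that differ. Files that exist only in the suspect site deserve as much attention as modified ones.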

In a nutshell:

jonhattan@jengibre:/var/www$ cd hackedsite
jonhattan@jengibre:/var/www/hackedsite$ drush generate-makefile ../hackedsite.make
jonhattan@jengibre:/var/www/hackedsite$ cd ..
jonhattan@jengibre:/var/www$ drush make hackedsite.make vanillasite
jonhattan@jengibre:/var/www$ diff -r -q hackedsite vanillasite