Category: "Security"

Linux - Using file -i instead of the input accept attribute

The file input allows an accept attribute to indicate what type of file may be submitted. The type is the client’s MIME type, which may vary by operating system, installed applications, and end user configuration.

A sample set of MIME types used for an accept attribute is:


The browser usually doesn’t enforce the accept attribute.

The MIME type sent from the client is unreliable, since many clients derive it from the file extension, and that is the MIME type the browser sends to the server.

An alternative is to ignore the client-supplied MIME type and run the Linux file command on the uploaded file instead, using its result for validation.

In the example below, there are three identical files of raw audio, with the extensions pdf, raw, and txt. The file command determines the type from the file content rather than the extension, so all three report the same type.

[tmp]$ file -i audio.*
audio.pdf: application/octet-stream
audio.raw: application/octet-stream
audio.txt: application/octet-stream
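
As a minimal sketch of how that check might be used for validation, the functions below compare the detected type against an allowlist. The allowlist contents and the function names (detect_mime, is_allowed_mime, validate_upload) are assumptions made for illustration. The sketch uses file --brief --mime-type, which prints only the type, so the result can be compared directly.

```shell
#!/bin/sh
# Hypothetical allowlist; adjust to the types the application accepts.
ALLOWED_TYPES="image/png image/jpeg application/pdf"

# Print the MIME type that file(1) detects from the content of "$1".
detect_mime() {
        file --brief --mime-type -- "$1"
}

# Succeed only if the MIME type in "$1" is on the allowlist.
is_allowed_mime() {
        for allowed in $ALLOWED_TYPES
        do
                [ "$1" = "$allowed" ] && return 0
        done
        return 1
}

# Succeed only if the content of the file "$1" is an allowed type.
# Usage: validate_upload /path/to/uploaded/file || echo "rejected"
validate_upload() {
        is_allowed_mime "$(detect_mime "$1")"
}
```

With this allowlist, the three audio files above would all be rejected, whatever their extension.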

Mozilla/4.0 (compatible;)

This user agent was in the middle of many page requests in my Apache logs, requesting content referenced by link tags in the head section.

After a bit of research on one of the link tag URLs, I ran this script:

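# Collect the unique client IPs (first field) of the matching requests
IPS=$(grep Author access_log | cut -f 1 -d ' ' | sort -u)
for IP in $IPS
do
        echo Testing "$IP"
        # Reverse DNS lookup to see who owns the address
        host "$IP"
done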

In almost every case, the requests came from large organizations - corporations, government agencies, and the military.

These institutions often use proxy servers, and Mozilla/4.0 (compatible;) must be a common user agent setting for the proxy server requests.

In the one case where it wasn’t a large organization, it was a blacklisted IP, and the user agent was Java.

The sample set was limited, but the pattern was clear.
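
The script above selects requests by a string in the requested URL. A variation that selects by the user agent itself could look like the sketch below. It assumes the Apache combined log format, where the user agent is the last quoted field, and the function name ips_for_agent is hypothetical. Log lines containing escaped quotes would confuse this simple field split.

```shell
#!/bin/sh
# Print the unique client IPs whose user agent is exactly "$2",
# reading a combined-format access log at "$1".
# Splitting on double quotes makes the user agent field 6 and leaves
# the client IP as the first word of field 1.
ips_for_agent() {
        awk -F'"' -v ua="$2" '$6 == ua { split($1, f, " "); print f[1] }' "$1" | sort -u
}

# Usage:
#   for IP in $(ips_for_agent access_log 'Mozilla/4.0 (compatible;)')
#   do
#           host "$IP"
#   done
```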

HTTP Blacklist - Http:BL PHP Code - Generic

This is a generic PHP script that can be used with Http:BL. Http:BL can be used to block requests to a web site based on the IP address. There are several configuration settings that allow you to adjust its behavior. In the code below, an IP address that Project Honey Pot lists as anything other than a search engine is flagged if it has a threat score of 100 or greater, or has been active within the past 30 days.

The easiest way to use it is to include it at the top level of the application, for example:

require_once 'bl.php';

This code just logs the requests and the scores. Once you’re comfortable with it, you can use it to redirect unwanted visitors to a 403 page, or down the rabbit hole.


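The Http:BL response is formatted like an IP address, with each octet carrying a value:

Octet 1: 127 (any other value indicates an error)
Octet 2: # of days since last activity
Octet 3: Threat score (0=No threat, 255=Extreme threat)
Octet 4: Visitor type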

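define ('httpBL_API_key','!-- YOUR KEY HERE --!');
define ('httpBL_URL','');                /* Http:BL DNS zone, as given in the Http:BL API documentation */
/* These are the settings which control which visitors are blocked */
define ('DAYS_SINCE_LAST_ACTIVITY',30);  /* Active within this many days prior will be blocked */
define ('MAX_THREAT_SCORE',100);         /* This threat score or higher will be blocked */
define ('MAX_TYPE_VALUE',1);             /* Type of visitor - this isn't really bitmapped */
/* Position of each value in the response (octets are indexed from 0) */
$aOctetMap=array('DAYS_SINCE_LAST_ACTIVITY'=>1,'MAX_THREAT_SCORE'=>2,'VISITOR_MAP'=>3);
/* Visitor types, for reference when reading output.txt */
$aVisitorMap=array(
0=>'Search Engine',
1=>'Suspicious',
2=>'Harvester',
4=>'Comment Spammer',
8=>'[Reserved for Future Use]',
16=>'[Reserved for Future Use]',
32=>'[Reserved for Future Use]',
64=>'[Reserved for Future Use]',
128=>'[Reserved for Future Use]'
);
$sBL=httpBL($_SERVER['REMOTE_ADDR']);
if ($sBL!==null)
{
        /* Write out the information to a text file so you can see what is happening */
        file_put_contents('output.txt',$_SERVER['REMOTE_ADDR'].' '.$sBL.PHP_EOL,FILE_APPEND);
        /* Once you are comfortable with your code and settings, you can redirect unwanted visitors elsewhere */
}
function httpBL($sIP)
{
        global $aOctetMap;
        /* The lookup name is the API key, the IP address with its octets reversed, then the DNS zone */
        $sQuery=httpBL_API_key.'.'.implode('.',array_reverse(explode('.',$sIP))).'.'.httpBL_URL;
        $aResult=dns_get_record($sQuery,DNS_A);
        if (isset($aResult[0]) && isset($aResult[0]['ip']))
        {
                $sResult=$aResult[0]['ip'];
                $aResultOctet=explode('.',$sResult);
                /* Search engines are never blocked */
                if ((int)$aResultOctet[$aOctetMap['VISITOR_MAP']]<MAX_TYPE_VALUE) return null;
                if ((int)$aResultOctet[$aOctetMap['MAX_THREAT_SCORE']]>=MAX_THREAT_SCORE) return $sResult;
                if ((int)$aResultOctet[$aOctetMap['DAYS_SINCE_LAST_ACTIVITY']]<=DAYS_SINCE_LAST_ACTIVITY) return $sResult;
        }
        return null;
}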
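
To experiment with the settings before touching the PHP, the lookup name and the three rules in httpBL() can be sketched in shell. The function names are hypothetical, the zone is passed in by the caller rather than hard-coded, and the thresholds 100 and 30 are copied from the settings above. Unlike the PHP, this sketch also treats a first octet other than 127 as an error, as the octet list describes.

```shell
#!/bin/sh
# Build the DNS name to look up: API key, IPv4 address with its
# octets reversed, then the DNS zone. Arguments: key, ip, zone.
httpbl_query_name() {
        reversed=$(printf '%s\n' "$2" | awk -F. '{ print $4 "." $3 "." $2 "." $1 }')
        printf '%s.%s.%s\n' "$1" "$reversed" "$3"
}

# Decide whether a response such as 127.3.5.1 should be blocked.
# Exit status 0 means "block", 1 means "allow".
httpbl_should_block() {
        printf '%s\n' "$1" | awk -F. '
                $1 != 127 || $4 < 1   { exit 1 }  # error response, or a search engine
                $3 >= 100 || $2 <= 30 { exit 0 }  # high threat score, or recently active
                                      { exit 1 }'
}

# Usage (zone.example stands in for the real Http:BL zone):
#   httpbl_query_name "$KEY" 192.0.2.10 zone.example
#   httpbl_should_block 127.3.5.1 && echo "would block"
```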

The advantage of this approach is that after an IP address has been cleared or cleaned up, access is restored without admin action, so blocked addresses aren’t blocked forever, only for a month or so while they are potentially harmful. The .htaccess Allow,Deny configuration can also be used, but it must be manually maintained, by checking the stats frequently and determining the owner and extent of the IP address block.

Blocking Site Visitors by User Agent

One of my sites was receiving a lot of hits with a user agent of Mozilla/4.0 (compatible; ICS), within a very short timeframe (requests within the same second), from a huge variety of IP addresses.

A quick look around showed ‘compatible; ICS’ is probably not a person or search engine.

I checked several IP addresses that used that user agent at Project Honey Pot, and most of them were listed as potential sources of dictionary attacks and spam.

A common reaction to this is to block the requests by user agent, and that’s what I did, using:

RewriteCond %{HTTP_USER_AGENT} (compatible;\ ICS)
RewriteRule ^ - [F]

These must go before any other RewriteRules.

To test the site and ensure it still runs properly, I used Bots vs. Browsers, which allows you to request pages using a specific user agent string. Be sure to check the page with user agents that include the string you want to block, and those which should be allowed.
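
Before deploying, it can also help to preview locally which user agent strings the pattern would catch. The sketch below does this with grep; the function name ua_is_blocked is hypothetical. It assumes the pattern stays a plain literal like the one above, since grep -E and Apache's regular expressions differ on more complex patterns. The escaped space in the RewriteCond is a plain space here.

```shell
#!/bin/sh
# Succeed if the user agent string in "$1" contains the blocked pattern.
ua_is_blocked() {
        printf '%s\n' "$1" | grep -Eq 'compatible; ICS'
}

# Usage:
#   ua_is_blocked 'Mozilla/4.0 (compatible; ICS)' && echo "blocked"
```

Run it against a handful of user agents from the access log, both ones that should be blocked and ones that should not.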

Login Access Limits

After reviewing the log files for this blog, I noticed many attempts to log in to it and to send bogus contact form data.

This is my blog; registration and comments are disabled. To all those who would post helpful comments and legitimate information, I’m sorry.

I access the blog administration from a very limited set of IP addresses, so, instead of wasting my time blocking access from IPs that shouldn’t be logging in, I decided to block all accesses to the administration interface, except my IP address.

This is done using server configuration directives. Refer to the appropriate documentation on blocking access.

After making the changes, be sure to test the effect. A proxy service that lets you visit your pages from a different IP address is useful for this. The pages should display fine for all navigation through the blog, except things like logging in, and perhaps the contact form. Check anything that’s important to you.

This works if you have a site, blog, or system where the authorized users are from a limited set of IP addresses. It can’t be used to protect against ‘bots and spammers on a forum or contact form. In those cases, I recommend BotScout.

For all those who have been trying to log in, please go away.