A customer in the web hosting business was experiencing repeated attempts at password guessing for a subscription site on his servers. The source IP addresses were scattered all over the world, and a bit of investigation showed that they were unrelated anonymous web proxies: clearly the perpetrator was trying to avoid detection.
He had methods to detect hacked accounts by scanning the web logs, so in practice this wasn't impacting his business too much, but it was an annoyance nevertheless. He asked me to investigate.
In the process we developed a technique which we have not seen described before, and we're writing it up here in the hope that it proves helpful to others. Though proxy abuse for the purposes of attacking a webserver may seem like a narrow problem to solve, we've found unrelated problems which this approach seems to apply to, such as tracking IRC bots to their masters.
The RPAT technique involves three components cooperating to produce a unified output.
The key to RPAT working correctly is to detect the abusive activity immediately, and not all circumstances lend themselves to this. The web-site hacking that prompted this research project, however, did.
We found that the abusers were brute-forcing passwords via HTTP "HEAD" requests, each of which failed with a "401" error ("Unauthorized"). By using a HEAD request with authentication information, they could crank through passwords in relatively short order. It also made them very easy to detect from the logs.
... "HEAD /Members/index.html HTTP/1.0" 401 0 "-" "-"
We created a small perl program that did a "tail -F" on the logfile, and when it found a HEAD request that failed with a 401, it sent the responsible IP address over to the RPAT daemon for processing. The -F "followed" the logfile and accounted for log rotation: if the web log was rotated out, tail would detect this and reopen the file to get the new one.
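The matching step of such a detector can be sketched as follows. This is a minimal Python illustration rather than the perl program described above; it assumes the Apache common/combined log format with the client address as the first field, and the function name failed_head_ip is hypothetical.

```python
import re

# Assumption: Apache common/combined log format, client address first, e.g.
#   203.0.113.7 - - [date] "HEAD /Members/index.html HTTP/1.0" 401 0 "-" "-"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] '
    r'"(?P<method>[A-Z]+) \S+[^"]*" (?P<status>\d{3}) '
)

def failed_head_ip(line):
    """Return the client IP if the line is a HEAD request answered with 401, else None."""
    m = LOG_LINE.match(line)
    if m and m.group("method") == "HEAD" and m.group("status") == "401":
        return m.group("ip")
    return None
```

Each address returned by a function like this would then be handed to the RPAT daemon as a work request.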
All IP stacks maintain an internal table of open connections, and this is normally queried via the netstat command. Under UNIX or Windows NT/2000, the netstat -n command displays this table. The -n parameter suppresses the normal IP-to-name DNS lookup, and below we show only the TCP connections (netstat usually shows UDP and raw IP connections as well).
$ netstat -n
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address         Foreign Address        State
tcp        0      0 172.27.217.6:1037     220.127.116.11:80      TIME_WAIT
tcp        0     93 172.27.217.6:25       172.27.217.2:3903      ESTABLISHED
tcp        0      0 172.27.217.6:22       172.27.217.9:4261      ESTABLISHED
tcp        0      0 172.27.217.6:143      172.27.217.9:4229      ESTABLISHED
tcp        0      0 172.27.217.6:80       18.104.22.168:4227     ESTABLISHED
The first connection is an outbound one to the dslreports.com web server, and the next three are inbound to the mail and secure shell servers. The last one is from a random (made-up) cable modem address to the web server.
Note that the customary -a parameter is not included. That parameter adds all "listening" sockets to the list of current connections, which is normally important data. For our purposes, though, only actual connections are interesting, so the listening endpoints are not.
Most "interesting" connections identified by RPAT will be in a closing state, usually TIME_WAIT. After a TCP/IP connection is closed, the IP stack must keep track of the connection for a time to allow straggling ACK and FIN packets to filter through the network. This is known as the "2MSL Wait", and it's usually between 30 seconds and two minutes. This timer starts the moment the connection is closed, and we must be able to gather the interesting data before this ages out of the connection table.
It's possible to demonstrate the use of SNMP with standard Linux tools. Initial research was done under Red Hat Linux using the UC Davis SNMP toolkit (ucd-snmp-utils-4.0.1-4): this should be widely available from your favorite depot of RPMs. In particular, snmpwalk is required.
The command "snmpwalk -s 10.1.1.1 public tcpConnTable" can be used to fetch the entire TCP connection table, though in practice this returns more data than is actually useful (as we'll see shortly). We provide the remote IP address, the SNMP community string, and the starting point in the tree, and via a series of GET-NEXT requests, the TCP connection table is fetched one entry at a time to the standard output.
The full OID (Object ID) for the TCP connection table is 1.3.6.1.2.1.6.13, but this is more often abbreviated to just tcpConnTable. There are five columns in the table: tcpConnState, tcpConnLocalAddress, tcpConnLocalPort, tcpConnRemAddress, and tcpConnRemPort.
Though it's customary to fetch tables in their entirety a column at a time, it turns out that the first column (tcpConnState) carries everything we need: the other four columns form the table's index, so their values are encoded into the OID of every tcpConnState entry. This means we need only fetch that one column and split each returned OID into its constituent parts. This dramatically reduces the network I/O required when fetching the connection table from the proxy.
Fetching a representative table from a random machine shows how much data there is to wade through. The state is the actual return value, and the local IP/port and remote IP/port are encoded into the returned OID.
$ snmpwalk -s 188.8.131.52 public tcpConnState
tcpConnState.0.0.0.0.184.108.40.206.0.0 = listen(2)
tcpConnState.0.0.0.0.220.127.116.11.0.0 = listen(2)
tcpConnState.0.0.0.0.18.104.22.168.0.0 = listen(2)
tcpConnState.0.0.0.0.22.214.171.124.0.0 = listen(2)
tcpConnState.0.0.0.0.126.96.36.199.0.0 = listen(2)
tcpConnState.0.0.0.0.188.8.131.52.0.0 = listen(2)
tcpConnState.0.0.0.0.184.108.40.206.0.0 = listen(2)
tcpConnState.0.0.0.0.220.127.116.11.0.0 = listen(2)
tcpConnState.0.0.0.0.18.104.22.168.0.0 = listen(2)
tcpConnState.0.0.0.0.722.214.171.124.0.0 = listen(2)
tcpConnState.0.0.0.0.7126.96.36.199.0.0 = listen(2)
tcpConnState.0.0.0.0.2049.0.0.0.0.0 = listen(2)
tcpConnState.0.0.0.0.2401.0.0.0.0.0 = listen(2)
tcpConnState.0.0.0.0.6000.0.0.0.0.0 = listen(2)
tcpConnState.0.0.0.0.6188.8.131.52.0.0 = listen(2)
tcpConnState.0.0.0.0.8080.0.0.0.0.0 = listen(2)
tcpConnState.0.0.0.0.327184.108.40.206.0.0 = listen(2)
tcpConnState.0.0.0.0.327220.127.116.11.0.0 = listen(2)
tcpConnState.0.0.0.0.32718.104.22.168.0.0 = listen(2)
tcpConnState.0.0.0.0.32722.214.171.124.0.0 = listen(2)
tcpConnState.0.0.0.0.327126.96.36.199.0.0 = listen(2)
tcpConnState.127.0.0.1.188.8.131.52.1.32770 = established(5)
tcpConnState.127.0.0.1.327184.108.40.206.1.199 = established(5)
tcpConnState.220.127.116.11.1318.104.22.168.0.0 = listen(2)
tcpConnState.22.214.171.124.4242.0.0.0.0.0 = listen(2)
tcpConnState.126.96.36.199.4242.132.254.73.1.32860 = established(5)
tcpConnState.188.8.131.52.8080.61.196.48.65.4544 = established(5)
tcpConnState.184.108.40.206.8080.61.203.129.112.65347 = established(5)
tcpConnState.220.127.116.11.8080.65.94.248.56.2607 = finWait1(6)
tcpConnState.18.104.22.168.8080.66.27.83.62.26147 = timeWait(11)
tcpConnState.22.214.171.124.8080.66.27.83.62.26156 = timeWait(11)
tcpConnState.126.96.36.199.8080.81.22.206.49.44179 = finWait2(7)
tcpConnState.188.8.131.52.8080.216.37.222.113.1633 = synReceived(4)
tcpConnState.184.108.40.206.8080.216.164.251.60.50059 = timeWait(11)
tcpConnState.220.127.116.11.8080.216.164.251.60.50060 = established(5)
tcpConnState.18.104.22.168.8080.217.233.105.110.3487 = timeWait(11)
tcpConnState.22.214.171.124.328126.96.36.199.1.4242 = established(5)
tcpConnState.188.8.131.52.486184.108.40.206.15.80 = finWait1(6)
tcpConnState.220.127.116.11.54318.104.22.168.146.80 = lastAck(9)
tcpConnState.22.214.171.124.546126.96.36.199.8.80 = established(5)
tcpConnState.188.8.131.52.546184.108.40.206.108.80 = finWait2(7)
tcpConnState.220.127.116.11.54818.104.22.168.146.80 = lastAck(9)
tcpConnState.22.214.171.124.55029.210.222.20.14.80 = established(5)
tcpConnState.126.96.36.199.55188.8.131.52.212.80 = established(5)
tcpConnState.184.108.40.206.55220.127.116.11.174.80 = synSent(3)
tcpConnState.18.104.22.168.5522.214.171.124.25.80 = closeWait(8)
tcpConnState.126.96.36.199.55188.8.131.52.152.80 = established(5)
tcpConnState.184.108.40.206.55220.127.116.11.9.1984 = synSent(3)
tcpConnState.18.104.22.168.5522.214.171.124.85.80 = established(5)
tcpConnState.126.96.36.199.55188.8.131.52.162.80 = synSent(3)
tcpConnState.184.108.40.206.55220.127.116.11.85.80 = synSent(3)
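Splitting one returned entry into its parts can be sketched as below. This is an illustrative Python fragment, not the daemon's own perl code; it assumes the textual "name = state(number)" form that snmpwalk prints, and the function name parse_entry and the sample addresses are hypothetical.

```python
import re

# The instance part of the OID holds ten numbers:
# four for the local IP, one local port, four for the remote IP, one remote port.
ENTRY = re.compile(r'^tcpConnState\.((?:\d+\.){9}\d+) = (\w+)\((\d+)\)$')

def parse_entry(line):
    """Return (local_ip, local_port, remote_ip, remote_port, state) or None."""
    m = ENTRY.match(line.strip())
    if not m:
        return None
    parts = m.group(1).split(".")
    local_ip = ".".join(parts[0:4])
    local_port = int(parts[4])
    remote_ip = ".".join(parts[5:9])
    remote_port = int(parts[9])
    return (local_ip, local_port, remote_ip, remote_port, m.group(2))

# Example with made-up documentation addresses:
entry = parse_entry("tcpConnState.192.0.2.10.8080.198.51.100.7.4544 = established(5)")
```

Lines that do not have exactly ten numeric components after the column name are rejected rather than guessed at.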
The great bulk of this table can be discarded out of hand by the data-gathering procedure, leaving the remainder for later processing. Some of the rules for processing the full table include:
After filtering with the above rules, the connection table is left as (shown with the "interesting" port in bold):
Local IP:Port         Remote IP:Port        State
--------------------  --------------------  ------------
18.104.22.168:8080    22.214.171.124:4544   established
126.96.36.199:8080    188.8.131.52:65347    established
184.108.40.206:8080   220.127.116.11:2607   finWait1
18.104.22.168:8080    22.214.171.124:26147  timeWait
126.96.36.199:8080    188.8.131.52:26156    timeWait
184.108.40.206:8080   220.127.116.11:44179  finWait2
18.104.22.168:8080    22.214.171.124:1633   synReceived
126.96.36.199:8080    188.8.131.52:50059    timeWait
184.108.40.206:8080   220.127.116.11:50060  established
18.104.22.168:8080    22.214.171.124:3487   timeWait
126.96.36.199:48625   188.8.131.52:80       finWait1
184.108.40.206:54357  220.127.116.11:80     lastAck
18.104.22.168:54651   22.214.171.124:80     established
126.96.36.199:54691   188.8.131.52:80       finWait2
184.108.40.206:54847  220.127.116.11:80     lastAck
18.104.22.168:55029   22.214.171.124:80     established
126.96.36.199:55114   188.8.131.52:80       established
184.108.40.206:55115  220.127.116.11:80     synSent
18.104.22.168:55143   22.214.171.124:80     closeWait
126.96.36.199:55145   188.8.131.52:80       established
184.108.40.206:55147  220.127.116.11:80     established
18.104.22.168:55148   22.214.171.124:80     synSent
126.96.36.199:55149   188.8.131.52:80       synSent
Correlation of this data is done later.
For a single remote proxy server, it's straightforward enough to call the snmpwalk program for the IP in question, but this does not scale well as things get busy. The fetch itself is fully blocking for the whole request, and this can take many seconds (most of which is spent waiting for I/O). For an active run of password hacking, there is no way that sequential snmpwalk requests could keep up with the attempts.
The RPAT daemon is largely just an SNMP engine that accepts "work requests" over a UDP socket, and it "cranks" multiple simultaneous SNMP sessions to all the targets at once. Written in perl, it is a single-threaded event loop that maintains a list of all active sessions while it fetches one TCP connection entry at a time via SNMP GET-NEXT datagrams.
When a work request is received over the control socket, these steps are taken:
The work-to-do list contains an object for each target we're trying to query, and each object has a bit of state attached. This state records the last query made to the proxy's SNMP TCP connection table, and whenever a response is received the appropriate GET-NEXT request is sent. When the "no more data" response is received from the proxy, the entire job is removed from the work-to-do list.
Timeouts and retries are handled automatically, and when too many of these occur, the job is also removed from the work-to-do list.
All proxies that provide no response of any kind are put on the stop list, and they will never be queried again. This stop list can be manipulated from the control channel (see next section).
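The bookkeeping described above can be sketched roughly as follows. This Python fragment is only an illustration of the idea, not the daemon's perl implementation: the class and method names are hypothetical, the actual SNMP datagram I/O is replaced by a caller-supplied send function, and the limit of three attempts is an assumption.

```python
MAX_TRIES = 3  # assumption: give up after three unanswered attempts

class WorkList:
    """Tracks one in-progress column walk per target proxy."""

    def __init__(self, send):
        self.send = send          # callable(ip, oid) that emits one GET-NEXT
        self.jobs = {}            # ip -> per-target state
        self.stop_list = set()    # targets that never answered at all

    def add_work(self, ip):
        """Start a walk unless the target is stopped or already being queried."""
        if ip in self.stop_list or ip in self.jobs:
            return False
        self.jobs[ip] = {"last_oid": "tcpConnState", "tries": 1, "answered": False}
        self.send(ip, "tcpConnState")
        return True

    def on_response(self, ip, oid, value):
        """Record one entry and ask for the next; drop the job at end of column."""
        job = self.jobs.get(ip)
        if job is None:
            return None
        if not oid.startswith("tcpConnState."):
            del self.jobs[ip]     # walked past the column: "no more data"
            return None
        job.update(last_oid=oid, tries=1, answered=True)
        self.send(ip, oid)
        return (oid, value)

    def on_timeout(self, ip):
        """Retry the last request, or give up after too many attempts."""
        job = self.jobs.get(ip)
        if job is None:
            return
        if job["tries"] >= MAX_TRIES:
            del self.jobs[ip]
            if not job["answered"]:
                self.stop_list.add(ip)   # never responded: do not query again
            return
        job["tries"] += 1
        self.send(ip, job["last_oid"])
```

Because each job carries only its last OID and a retry counter, a single-threaded loop can interleave many targets without blocking on any one of them.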
The RPAT daemon does the great bulk of the work, but it still must be notified of IP addresses to query. When the abuse-detector finds an entry worthy of investigation, it must notify the daemon: the RPAT client is used for this.
The daemon can listen on a UDP socket for messages that are simply ASCII strings, and it extracts the needed data from them. These message strings come entirely from the rpatc command line (usually invoked from a perl or shell script), and the supported commands are:
For instance, if the abuse-detector finds an IP address to check, it runs the command
rpatc --dest=10.1..3:1234 work 192.168.7.3
Here, the --dest parameter tells the client where the RPAT daemon can be found (both IP address and port), and everything after it (work 192.168.7.3) is the literal message sent to the RPAT daemon.
Because UDP datagrams are used for client/server communications, message delivery is not guaranteed. In circumstances where the client and the server are both on the same machine, localhost (the default) should be used to minimize the network resources used, but for machines that are "close" to each other topologically, even regular UDP should be reliable enough.
If the abuse frequency is so low that the loss of a single work request can jeopardize the results, there is probably not enough data to work with in the first place.
The RPAT daemon never replies to or acknowledges these messages.
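For a detector that prefers to talk to the daemon directly instead of running rpatc, the fire-and-forget exchange amounts to a single datagram. The following is a minimal Python sketch under that assumption; the function name send_work is hypothetical, and the defaults simply mirror the localhost and port 1234 conventions mentioned in this document.

```python
import socket

def send_work(message, host="127.0.0.1", port=1234):
    """Send one ASCII message as a single UDP datagram; no reply is expected."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode("ascii"), (host, port))
```

A call such as send_work("work 192.168.7.3") would carry the same literal message as the rpatc example, with the same lack of delivery guarantee.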
Once the daemon has gathered sufficient data reflecting diversity of abused intermediate proxies, the output logs can be processed in an attempt to "triangulate" to find the IP address of the abuser. By considering all connections to the proxy servers, the common source IP addresses may very well show up.
For the sake of discussion, we define these terms:
The goal is to find the Remote that is attacking us, and this will show up as the same IP address connecting to multiple unrelated proxy servers. This requires that we filter out a lot of extraneous data (indeed, most of the data are extraneous).
While scanning the data one line at a time, we look for reasons to eliminate entries that won't contribute to finding the abuser. Though some filtering has already been performed by the RPAT daemon (throwing away data that could not possibly be useful), we prefer for it not to be so aggressive: when in doubt, save the data. It's much easier to disregard data later found not to be useful than it is to regenerate data discarded too soon.
Once an SNMP entry line is to be kept, it must be filed: perl's associative arrays are ideal for this. Each Remote IP address is the key to a large hash, and this itself contains a list of all proxies that were used by it.
The entire RPAT daemon output log is processed by triangulate, and a tally is kept of each time a Remote had a connection to a Proxy. The result is sorted by the number of proxies used per Remote (showing the "busy" Remotes first), with all the proxies found for each one listed beneath it.
REMOTES HITTING MULTIPLE PROXIES
REMOTE 184.108.40.206 pc-62-30-89-34-tr.blueyonder.co.uk
  PROXY 220.127.116.11 55
  PROXY 18.104.22.168 63
  PROXY 22.214.171.124 121 mail.aplusmicro.com
  PROXY 126.96.36.199 40 bdsl.188.8.131.52.gte.net
  PROXY 184.108.40.206 99
  PROXY 220.127.116.11 90
  PROXY 18.104.22.168 50
  PROXY 22.214.171.124 300 saturn.littleb.com
  PROXY 126.96.36.199 70
  PROXY 188.8.131.52 65
REMOTE 184.108.40.206 option.cds.ne.jp
  PROXY 220.127.116.11 50
  PROXY 18.104.22.168 10
  PROXY 22.214.171.124 16
  PROXY 126.96.36.199 12 fwall.digitex.net
  PROXY 188.8.131.52 11
  PROXY 184.108.40.206 167
  PROXY 220.127.116.11 21 pandora.teimes.gr
  PROXY 18.104.22.168 12
  PROXY 22.214.171.124 18
  PROXY 126.96.36.199 14
REMOTE 188.8.131.52 cable1-26.shenhgts.net
  PROXY 184.108.40.206 13
  PROXY 220.127.116.11 36
  PROXY 18.104.22.168 69 mail.aplusmicro.com
  PROXY 22.214.171.124 15 bdsl.126.96.36.199.gte.net
  PROXY 188.8.131.52 46
  PROXY 184.108.40.206 43
  PROXY 220.127.116.11 77
  PROXY 18.104.22.168 83 saturn.littleb.com
  PROXY 22.214.171.124 36
  PROXY 126.96.36.199 45
...
The --minproxy and --minhits parameters can be given to the triangulate program to limit the output. For instance, --minhits=10 discards any proxy/count pair for a remote where the count is less than 10: this tosses out proxies with very few uses that may be no more than distractions. Likewise, --minproxy=5 suppresses any remote that has fewer than five proxy servers associated with it (this factor is considered after the --minhits exclusions).
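The tally and the two thresholds can be sketched in a few lines. This is a Python illustration of the idea, not the triangulate program itself: it assumes the log has already been reduced to (proxy, remote) pairs, and the function and parameter names are hypothetical stand-ins for the command-line options.

```python
from collections import Counter, defaultdict

def triangulate(observations, min_hits=1, min_proxy=1):
    """Group (proxy, remote) pairs by remote and rank remotes by proxies used."""
    tally = defaultdict(Counter)              # remote -> Counter(proxy -> hits)
    for proxy, remote in observations:
        tally[remote][proxy] += 1

    report = []
    for remote, proxies in tally.items():
        # Drop proxy/count pairs below the hit threshold first ...
        kept = {p: n for p, n in proxies.items() if n >= min_hits}
        # ... then drop remotes left with too few proxies.
        if len(kept) >= min_proxy:
            report.append((remote, kept))

    # Remotes seen on the most distinct proxies come first.
    report.sort(key=lambda item: len(item[1]), reverse=True)
    return report
```

Applying the hit threshold before the proxy threshold matches the order described above, so a remote with many single-hit proxies does not survive on volume alone.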
Adding --namelookup will try to look up the inverse DNS name of each IP address in the output display. This adds substantially to processing time because so many inverse DNS servers are misconfigured, but triangulate does caching of names to minimize resolver traffic.
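The caching idea can be sketched as a small wrapper around the system resolver. This Python fragment is an assumption about how such a cache might look, not a description of triangulate's internals; the class name NameCache and the injectable resolver argument are hypothetical.

```python
import socket

class NameCache:
    """Remember reverse-DNS answers, including failures, so each IP is asked once."""

    def __init__(self, resolver=None):
        self._resolver = resolver or self._system_lookup
        self._names = {}

    @staticmethod
    def _system_lookup(ip):
        try:
            return socket.gethostbyaddr(ip)[0]
        except OSError:          # covers socket.herror and socket.gaierror
            return ""            # no name available

    def lookup(self, ip):
        if ip not in self._names:
            self._names[ip] = self._resolver(ip)
        return self._names[ip]
```

Caching the empty answers matters as much as caching the names, since the slow lookups are usually the ones that fail.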
The RPAT daemon itself can run on any UNIX/Linux machine that has network connectivity and a working perl installation. In realtime mode (the only really interesting way to use this program) a UDP port must be chosen for the daemon to listen on and accept work requests. We typically use 1234 for no good reason. Only root can bind to ports below 1024.
To launch the server:
$ rpatd --workport=1234 > rpatd.out &
$ tail -f rpatd.out
This runs the server with its output directed to the rpatd.out logfile, which will include both debugging and recorded SNMP information. We "tail" the logfile to watch what it's doing.
Then, the abuse must be monitored, and this is highly dependent on the particular application in question. When abuse is detected that might be from a remote proxy, the rpatc command must be run to send the work IPAddress command to the daemon. Though it would be possible to build the socket communication into the abuse detector directly, for our volumes we've typically just called the system() function.
Though not an official part of the release, the Download section includes a link to the rpatwatch program. This was our ad-hoc program used to monitor an Apache web log for a particular kind of traffic, and when suspicious activity was seen, it notified the daemon. This perl program may be tuned to local needs. Be sure to set up the --dest parameter properly to reflect the IP address and port of the RPAT daemon.
Once the daemon has been running for a time, it's possible to run a triangulation procedure on that file even while the daemon is still running. The triangulate program knows to ignore the debugging information, considering only the "real" SNMP data.
$ triangulate < rpatd.out > abusers.txt
The resulting abusers.txt file contains the list of potential abusers, ordered by the number of distinct proxies each one used, widest range first.
We have identified some potential pitfalls that may arise while trying to use this system.
There are three programs delivered with the RPAT system.
rpatd - RPAT Daemon
This is the main daemon that talks to the network making SNMP queries to remote servers, and it requires a source of work. This is done by either passing a list of IP addresses on the command line or by using the --workport parameter to specify a command listener port. The latter method is the only way to achieve "realtime" triangulation, but the IP-address-list method can be used for testing.
$ rpatd [options] [IPAddress ...]
rpatc - RPAT Client
This small program is used to send a message to the RPAT daemon, and the general form is:
$ rpatc [options] message here
The "message" is a literal text string, and rpatc does not inspect the message at all - it is merely passed unchanged to the daemon. See the RPAT Client/Server Communications section for details on the supported messages.
triangulate - postprocess RPAT daemon logs
This perl program scans the logs produced by the RPAT daemon and winnows out the interesting IP addresses.
$ triangulate [options] < logfile
This is an ongoing research project, and from time to time I update the web site with the latest. I provide an overall bundle, plus individual files for download. Internally for development we use separate main files and helper perl modules, but for release we use an "inliner" tool that bundles all the required local modules into a single file: thus the unusual constitution of some of the modules, especially the rpatd daemon.
Nearly every time this technique is described, somebody points out that we are effectively "hacking" the anonymous proxies to fetch the SNMP data, and this is a point well taken. Whether a large web company could employ this on a large scale is one question, but small-time or occasional use is another story.
Since any SNMP failures - for whatever reason - cause the RPAT daemon to put the IP address on the stop list, sites without open SNMP will see no more than three attempts. There is no kind of "banging away" on sites that do not respond to us. And any sites that do have open SNMP are unlikely to notice that we are making these requests. Our experience has shown that these sites often have open NETBIOS as well, so read-only access to SNMP data seems very minor compared with the damage that a malicious user could inflict.
It's been argued that since SNMP is open to the public, it must indeed be a public service, so we can fetch the data without fear of having this activity interpreted as a criminal act.
But we are very clear that this document contains zero legal advice.
During this process we have come across quite a few ideas that might make this system a bit more full-featured.