This is a web request listener for those curious about exploring the spread of the Code Red worm (and potentially others). It listens on port 80 like a web server does, logs each request, and closes the connection to the remote.

Code Red attempts to infect systems randomly, and those who wish to monitor this activity often use an IDS to do so. This works only when there are real web servers on the network, because connection attempts to nonexistent web servers yield only unanswered ARP requests or unanswered SYN connection attempts. In order to detect actual probes by Code Red (or the Next Big Worm), one has to simulate a real web server. This program is for that purpose. It is not a security tool and is useful only to the curious.

Websnarf is a perl program that listens on port 80 and accepts all connection requests. It notes the local and remote IP addresses, waits a short time for the request (GET, POST, whatever), and closes the connection. Then the connection attempt is logged and the whole process repeats. We've managed to collect a fair amount of data with this tool so far.
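The loop described above can be sketched in a few lines. This is a Python stand-in written for illustration, not the websnarf perl source; the function name serve, the log list, and the max_requests parameter are our own assumptions.

```python
import socket


def serve(listener, log, timeout=5.0, max_requests=None):
    """Accept connections one at a time, note the first request line, hang up.

    `listener` is an already-bound, listening TCP socket. `log` is any list;
    each probe is appended as (remote_ip, local_ip, first_request_line).
    `max_requests` is a hypothetical knob so the loop can stop during testing.
    """
    handled = 0
    while max_requests is None or handled < max_requests:
        conn, remote = listener.accept()
        with conn:                          # connection is closed on exit
            local = conn.getsockname()      # address the probe was aimed at
            conn.settimeout(timeout)        # a silent peer must not hang us
            try:
                data = conn.recv(4096)
            except OSError:                 # timeout or reset: log an empty request
                data = b""
        lines = data.decode("latin-1").splitlines()
        first_line = lines[0] if lines else ""
        log.append((remote[0], local[0], first_line))
        handled += 1
```

Handling one connection at a time keeps the sketch simple; the timeout is what stops a single stalled connection from blocking everything behind it.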

Setting it up

On a UNIX system this program must be run as root in order to bind to port 80 for accepting requests: this is a traditional requirement of all listener ports less than 1024. It also won't work properly if other programs are listening on port 80: it does not work and play well with other web servers on the same box, though it's likely possible to modify it to listen only on specified interfaces.
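For the curious, here is a rough Python sketch (not part of websnarf) of what listening only on a specified interface could look like. The helper names needs_privilege and make_listener are hypothetical.

```python
import socket


def needs_privilege(port):
    """Ports below 1024 are traditionally reserved for root on UNIX."""
    return 0 < port < 1024


def make_listener(address="", port=80, backlog=5):
    """Bind a TCP listener to one local address ("" means every interface).

    bind() raises OSError if we lack permission for the port, and typically
    also if another program already holds that address and port.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((address, port))
    except OSError:
        sock.close()
        raise
    sock.listen(backlog)
    return sock
```

Binding to a single address rather than to "" is what would let a listener like this share a box with a real web server that is bound to a different address.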

When running, the program appends to the logfile, so it can be run multiple times without smashing the previous contents. Time is always recorded in GMT because this simplifies the time display code. The "regular" logfile format records everything we know about the request (except headers past the first), and a sample looks like this (we've truncated the right-hand part of the log so it doesn't overflow too badly).

# ./websnarf --log=logfile.txt
websnarf v1.03 listening on port 80 (timeout=5 secs)
08/02 00:33:14   -> : {GET /default.ida?NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN}
08/02 00:33:17  -> : {GET /default.ida?NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN}
08/02 00:33:43  -> : {GET /default.ida?NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN}
08/02 00:34:03   -> : {GET /default.ida?NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN}
08/02 00:35:26  -> : {GET /default.ida?NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN}
08/02 00:37:18  -> : {GET /default.ida?NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN}
08/02 00:41:57  -> : {GET /default.ida?NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN}
08/02 00:41:59   -> : {GET /default.ida?NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN}
08/02 00:43:01   -> : {GET /default.ida?NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN}
08/02 00:47:27   -> : {GET /default.ida?NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN}
08/02 00:48:07   -> : {GET /default.ida?NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN}
08/02 00:49:33     -> : {GET /default.ida?NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN}
08/02 00:50:08   -> : {GET /default.ida?NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN}
08/02 00:52:17   -> : {GET /default.ida?NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN}
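A log entry in roughly this shape can be produced as follows. This is a Python sketch under assumptions: the two address fields are blank in the sample above, so we guess they are the source followed by the destination, and the function names are ours.

```python
import time


def gmt_stamp(epoch_seconds):
    """Month/day and time of day in GMT, matching the sample's layout."""
    return time.strftime("%m/%d %H:%M:%S", time.gmtime(epoch_seconds))


def format_entry(epoch_seconds, source, dest, request):
    """Build one log line; only the first line of the request is kept.

    The "source -> dest" field order is an assumption, not taken from websnarf.
    """
    lines = request.splitlines()
    first_line = lines[0] if lines else ""
    return f"{gmt_stamp(epoch_seconds)} {source} -> {dest} : {{{first_line}}}"


def append_entry(path, entry):
    """Open in append mode so repeated runs never clobber earlier entries."""
    with open(path, "a") as logfile:
        logfile.write(entry + "\n")
```

Using gmtime rather than localtime is the whole of the "simplified time display": no timezone or daylight-saving logic is needed.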

Logs can also be written in Apache format with the --apache command-line parameter. This allows easy submission of Code Red logfiles to the centralized logging sites (at least one of which can apparently accept Apache logs via email). Apache logfile output looks like this (wrapped for ease of presentation here):

# ./websnarf --log=logfile.txt --apache
- - [02/Aug/2001:15:15:34 -0000] "GET /default.ida?NNN
78%u0000%u00=a  HTTP/1.0" 404 100

Note that Apache logfiles do not record the destination IP address, so this information is lost. We might find a way to override one of the otherwise-unused fields (auth?) to provide this anyway. It's the lack of the destination IP that keeps --apache from being the default.
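To make the field layout concrete, here is a Python sketch of an Apache-style line. It is not the websnarf code: the function name and the optional dest parameter (which tries out the "override the auth field" idea above) are our own, and the 404 status and size of 100 are simply copied from the sample.

```python
import time


def apache_line(epoch_seconds, source, request_line, dest=None,
                status=404, size=100):
    """Format one entry as: host ident authuser [date] "request" status bytes.

    The ident field is always "-". The authuser field is "-" unless `dest` is
    given, in which case it carries the destination IP (a hypothetical use).
    Month names from %b assume the default C locale.
    """
    stamp = time.strftime("%d/%b/%Y:%H:%M:%S", time.gmtime(epoch_seconds))
    authuser = dest if dest is not None else "-"
    return (f'{source} - {authuser} [{stamp} -0000] '
            f'"{request_line}" {status} {size}')
```

Whether a log collector would tolerate an IP address in the authuser field is an open question; the sketch only shows where it would go.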

Listening on multiple IP addresses

For machines on a network that can listen on multiple interfaces, this can increase the amount of information reported. Under Red Hat Linux (and probably many others), we can use the ifconfig command to add aliases to the existing Ethernet driver, allowing it to listen on more than one address.

For instance, assuming the Ethernet interface is eth0, aliases can be added as root with

# ifconfig eth0:0 netmask broadcast
# ifconfig eth0:1 netmask broadcast
# ifconfig eth0:2 netmask broadcast
# ifconfig eth0:3 netmask broadcast
# ifconfig eth0:4 netmask broadcast
# ifconfig eth0:5 netmask broadcast
# ifconfig eth0:6 netmask broadcast

We're not aware of specific limits on how many IP addresses one machine can listen on, but we'd be surprised if there were none.

Update: previous versions of this document omitted the suggested netmask and broadcast keywords from the ifconfig command. If your netmask is nonstandard, you'll probably need them.
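Typing out a long run of aliases by hand is error-prone, so a small generator can help. The Python sketch below is ours, not part of websnarf; the 192.0.2.0/24 documentation network used in the usage note is a placeholder for your own addresses.

```python
import ipaddress


def alias_commands(interface, network, first_host, count):
    """Return ifconfig commands for `count` consecutive alias addresses.

    The netmask and broadcast values are derived from `network`, so they stay
    consistent across every alias. Addresses outside the network, or equal to
    its network or broadcast address, are rejected.
    """
    net = ipaddress.ip_network(network)
    start = ipaddress.ip_address(first_host)
    commands = []
    for n in range(count):
        addr = start + n
        if addr not in net or addr in (net.network_address, net.broadcast_address):
            raise ValueError(f"{addr} is not a usable host address in {net}")
        commands.append(
            f"ifconfig {interface}:{n} {addr} "
            f"netmask {net.netmask} broadcast {net.broadcast_address}"
        )
    return commands
```

For example, alias_commands("eth0", "192.0.2.0/24", "192.0.2.10", 7) yields seven lines, eth0:0 through eth0:6, ready to be reviewed and run as root.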

It should go without saying that any external firewalls have to permit inbound traffic to the local machine's port 80 in order for those requests to be logged here. Open up external firewalls at your own risk.

Command Line

We support a handful of command line parameters, some of which are only useful for testing.

Show a brief summary of the command-line usage.
Append to the logfile named "FILE". It's created anew if required.
Listen on TCP port ## instead of port 80. This is really only useful while testing the program itself so you can add features to the "new" version while the "old" one runs along collecting data.
Wait at most ## seconds for data once a connection is established. This way a dropped connection or other confusion won't hang the whole program.
Save logs in Apache format, presumably so we can send the logfiles into the centralized recordkeepers.
Only show up to ## chars of the request, with longer ones all being truncated. Code Red requests are really long, and we don't really need to see them all. Ignored if the --apache parameter is used.
Save each full header in a file in directory DIR. The filename is named for the source and destination IP -- otherwise this would not be available -- and it's not so clear that this is really all that interesting. We need to find a better format for the filenames or decide if this functionality even belongs here.
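The truncation and header-saving behaviors described above might look like this in outline. This is a Python sketch, not the websnarf implementation; in particular the "source-dest.txt" filename pattern is a hypothetical choice, since the real naming scheme is still an open question.

```python
import os


def truncate(request, max_chars):
    """Keep at most max_chars characters of a (possibly very long) request."""
    return request[:max_chars]


def save_header(directory, source, dest, raw_request):
    """Append the full raw request to a file named for both IP addresses.

    The filename pattern is an assumption made for this sketch. Appending
    means repeat probes between the same pair of hosts pile up in one file.
    """
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, f"{source}-{dest}.txt")
    with open(path, "ab") as handle:
        handle.write(raw_request)
    return path
```

Putting both addresses in the filename preserves the destination IP even when the log itself is in Apache format, at the cost of one file per source and destination pair.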

Open issues

For a program written in three hours, it's not surprising that there are open issues.

Security and Porting Issues

This program is written in perl, which has a good history of proper buffer management: it's unlikely that running this program itself will add any risk to the local security environment. We don't run subprograms of any kind (no system or ``, for instance), so there aren't too many avenues for shenanigans.

We have made a drive-by test of this program on a Windows NT 4.0 machine and were surprised to see that it appeared to work. We're using the positively outstanding perl from ActiveState, and you can get ActivePerl for free right here.

We promise that this program has no overtly malicious code. We certainly don't promise that it has no bugs.

Note that this was written after the Aug 1 infestation, but before the current Code Red II variant started wandering around. Websnarf needs plenty of additions, but I was so consumed with analysing the worm itself that I left this alone. I'll attend to websnarf soon.

Known to work on:

Others having any luck with this are encouraged to notify me at [Email address]

Revision History

1.04 - just changed the comments to no longer discourage --save. No changes to the perl code.

1.03 - added --apache option to keep the logs in Apache format, plus the --save parameter for just fooling around.


The perl code can be downloaded from here. It was built and tested under Red Hat Linux 6.2 (2.2.14 kernel) and perl 5.005_03.

Thanks to Jeffrey for help with the perl. Feedback welcome.