Thumbprint work for ARPA

Introduction

At present, it is difficult to trace intruders across the Internet to their point of origin. Several factors contribute to this. The main one is that many sites have poor security, so intruders are able to amass a collection of accounts on such sites. They then mask their origin by logging into a chain of these accounts in turn before attacking a target site.

This means that it is extremely difficult to hold intruders accountable for their actions.

The idea behind thumbprinting is that the content of the connections at different points in the chain is the same: if an ls command is given by the attacker, those two characters (and their echoes) traverse the entire length of the chain. Similarly, the target machine's response travels through every connection involved. Thus, if small summaries of the content of the connections were kept at two points, they could later be compared to establish whether the two connections were links in the same chain. One can think of such a summary as (loosely) a checksum of the data in the connection (or some piece of it). In practice these summaries are formed by passive monitoring of network traffic.

We have characterized the properties required of such summaries and explored a class of linear algorithms that have these properties. Our experiments give preliminary evidence that the method is feasible.
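To make the idea of comparing summaries concrete, here is a minimal sketch in Python. It is not the project's algorithm: the weight table, the function names, and the distance measure are all assumptions chosen for illustration. It only shows one simple way a summary can be linear (a weighted sum of byte counts per time interval) and how two sequences of such summaries might be compared.

```python
from collections import Counter

# Hypothetical weight table: any fixed mapping from byte value to a number
# would do for this sketch. It is NOT taken from the thumbprinting project.
WEIGHTS = [(31 * b + 7) % 257 for b in range(256)]


def thumbprint(data: bytes) -> int:
    """Summarize one time interval of connection content.

    The summary is a weighted sum of byte counts, so it is linear:
    the thumbprint of two concatenated pieces equals the sum of their
    individual thumbprints.
    """
    counts = Counter(data)
    return sum(WEIGHTS[b] * n for b, n in counts.items())


def thumbprint_sequence(intervals):
    """Summarize a connection given as a list of per-interval byte strings."""
    return [thumbprint(chunk) for chunk in intervals]


def distance(a, b):
    """Illustrative comparison: sum of absolute differences per interval."""
    return sum(abs(x - y) for x, y in zip(a, b))


# The same content observed on two links of a chain, and an unrelated session.
link_one = [b"ls\r\n", b"file1 file2\r\n"]
link_two = [b"ls\r\n", b"file1 file2\r\n"]
unrelated = [b"pwd\r\n", b"/home/u\r\n"]

same = distance(thumbprint_sequence(link_one), thumbprint_sequence(link_two))
diff = distance(thumbprint_sequence(link_one), thumbprint_sequence(unrelated))
print(same, diff)
```

In this sketch, identical content on two links gives a distance of zero, while the unrelated session gives a nonzero distance. A real monitor would also have to cope with issues this example ignores, such as clock differences between monitoring points and content split differently across intervals.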

Additional information

A white paper giving a one-page overview of the project.
The slides of Stuart Staniford-Chen's talk given at the 1995 IEEE Oakland conference.
The text of the conference paper.
The Master's Thesis of Stuart Staniford-Chen, which is similar to the previous item, but with more detail and more results.

Stuart Staniford-Chen 8/28/95