Intrusion Detection Analysis Project

Research Goals

The goals of this research are to develop a model of data sanitization that describes the relationship between the requirements of security analysis and privacy, and to study the features of attacks launched over a network in an academic environment.


Data is sanitized when some set of sensitive information is removed or disguised. Sensitive data is identified either by pattern (specific words) or by position within the data. If left intact, the sensitive data would reveal information that some party requires to be kept secret.
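The two ways of identifying sensitive data can be illustrated with a minimal sketch. It assumes a whitespace-separated text record and a fixed alias string; the function names, the sample record, and the "XXXX" alias are hypothetical and are not part of the project's tool.

```python
import re

def sanitize_by_pattern(line, patterns, alias="XXXX"):
    """Replace every match of any sensitive pattern with the alias."""
    for pattern in patterns:
        line = re.sub(pattern, alias, line)
    return line

def sanitize_by_position(line, field_indexes, alias="XXXX", sep=" "):
    """Replace whole fields, selected by their position, with the alias."""
    fields = line.split(sep)
    for i in field_indexes:
        if 0 <= i < len(fields):
            fields[i] = alias
    return sep.join(fields)

# Hypothetical log record: address, user, event, result.
record = "10.1.2.3 alice login ok"
by_pattern = sanitize_by_pattern(record, [r"\balice\b"])   # hide a known word
by_position = sanitize_by_position(record, [0])             # hide the first field
```

Pattern-based sanitization finds a sensitive word wherever it occurs, while position-based sanitization hides a field regardless of its contents.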

Other work in this area has focused on the algorithms used to transform sensitive data into non-sensitive information (aliases). These algorithms include one-way hash functions, which derive pseudonyms that cannot be inverted to obtain the unsanitized words. The weakness of this approach is that, if the set of words to be sanitized is known (or can be guessed), a straightforward dictionary attack will reveal the mapping without inverting the hash function: the attacker hashes each candidate word and compares the result with the pseudonyms. Some work has also examined reconstruction of the original data, which usually uses shadow key schemes. That work focuses on both the scheme and the system mechanisms that prevent unauthorized rederivation of the original data. Finally, some work has explored the relationship between privacy and security analysis, but only in a qualitative way.
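The dictionary attack on hash-derived pseudonyms can be sketched as follows. This is an illustration under stated assumptions: the pseudonym is an unsalted, truncated SHA-256 digest (our choice for the example, not a scheme taken from the cited work), and the user names are invented.

```python
import hashlib

def pseudonym(word):
    """Derive an alias with an unsalted one-way hash (assumed scheme)."""
    return hashlib.sha256(word.encode("utf-8")).hexdigest()[:12]

def dictionary_attack(aliases, candidate_words):
    """Recover alias -> word by hashing guesses; the hash is never inverted."""
    table = {pseudonym(w): w for w in candidate_words}
    return {a: table[a] for a in aliases if a in table}

# Hypothetical sanitized data containing two pseudonyms.
sanitized_log = [pseudonym("alice"), pseudonym("bob")]

# The attacker only needs a list that contains the original words.
guesses = ["carol", "bob", "alice", "dave"]
recovered = dictionary_attack(sanitized_log, guesses)
```

The attack costs one hash computation per guess, so it is practical whenever the space of original words, such as user names or host names, is small.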

Our approach is to express the requirements for security analysis and the requirements for privacy as properties of the data. Under sanitization, these properties must be preserved. This reduces the problem of balancing privacy and security analysis to a policy decision. Given the proper form of expression, we can analyze the properties to discover inconsistencies (where privacy requires that some data be sanitized and security analysis requires that the same data be present) and resolve them.
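A greatly simplified sketch of the inconsistency check follows. It assumes that each requirement can be reduced to a set of field names, which is far coarser than the properties the project has in mind; the field names and the policy table are hypothetical.

```python
def find_conflicts(privacy_hides, analysis_needs):
    """Fields that privacy wants sanitized but security analysis needs intact."""
    return sorted(set(privacy_hides) & set(analysis_needs))

def resolve(conflicts, policy):
    """Apply the policy decision for each conflicting field."""
    return {field: policy.get(field, "unresolved") for field in conflicts}

# Hypothetical requirements, each reduced to a set of field names.
privacy_hides = {"user_name", "source_ip", "payload"}
analysis_needs = {"source_ip", "dest_port", "timestamp"}

conflicts = find_conflicts(privacy_hides, analysis_needs)

# The policy decides which requirement wins for each conflict.
decisions = resolve(conflicts, {"source_ip": "sanitize"})
```

Once the conflicts are explicit, choosing between privacy and analysis for each one is a policy decision rather than a technical one.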

The specific goals of this part of the project are:

1. Develop a little language to sanitize data that is amenable to such an analysis; and

2. Prove the feasibility of this approach by building a tool to use the language to sanitize network data.

Data Correlation
Intrusion detection systems are designed to detect attacks against hosts throughout the network. This requires a characterization of the signatures of each attack.

To understand attacks better, we need to be able to describe them and to correlate information from data sensors with attacks, so that the descriptions can be characterized in low-level terms. Because attacks are usually multi-stage, the description of an attack consists of descriptions of its stages.

Consider an attack to be a sequence of goals. Each intermediate goal corresponds to the successful completion of a stage of the attack. Our hypothesis is that attack tools constructed by composing tools that achieve each goal will generate signatures indistinguishable from those of attack tools available on the Internet. If this hypothesis is true, then collecting attack tools becomes unnecessary. We need only describe the attack in this way, and we can generate the tool and the relevant signature.
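The idea of describing an attack as a sequence of goals and deriving a signature from that description can be sketched abstractly. The sketch assumes that each stage is annotated with the trace events it is expected to produce and that a signature is an ordered list of such events; the stage names, event names, and matching rule are hypothetical and stand in for whatever the project's description language provides.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Stage:
    goal: str      # intermediate goal reached when the stage succeeds
    events: tuple  # abstract trace events the stage is expected to produce

def compose_signature(stages):
    """Concatenate per-stage events, in order, into one expected signature."""
    signature = []
    for stage in stages:
        signature.extend(stage.events)
    return signature

def matches(signature, observed):
    """True if the signature's events occur, in order, within the trace."""
    remaining = iter(observed)
    return all(event in remaining for event in signature)

# Hypothetical three-stage attack description.
attack = [
    Stage("find target", ("port_scan",)),
    Stage("gain access", ("login_failure", "login_success")),
    Stage("raise privilege", ("setuid_exec",)),
]
signature = compose_signature(attack)

# Hypothetical sensor trace with unrelated events interleaved.
observed = ["dns_query", "port_scan", "login_failure", "login_failure",
            "login_success", "file_read", "setuid_exec"]
detected = matches(signature, observed)
```

Comparing a signature generated in this way against traces of existing tools is one way to test the hypothesis.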

This part of the project addresses two specific questions:

1. Do attack tools constructed from this type of description have the same traces as existing attack tools? (If so, this validates the hypothesis.)

2. Can this description be used to generate variants on known attacks?


This work is funded by Promia, Incorporated.

Project Members
Matt Bishop, P. I.
Rick Crawford
Patty Graves, Administrative Assistant
Karl Levitt, co-P.I.
Brennen Reynolds



For Further Information:

Administrative information:
Patty Graves
(530) 752-2149
fax: (530) 752-4767

Technical Information:
Matt Bishop
phone: (530) 752-8060
fax: (530) 752-4767

web page:

last modified 4/15/02