GLOBAL GUARD MEETING
November 24, 1998
3085 ENG II
9:15 - 10:00
In attendance:
Karl Levitt (KL), David Klotz (DK), David O'Brien (DOB), Jeff Rowe
(JR), and Steven Templeton (ST)
Todd Heberlein (TH), Chris Wee (CW), and Jason Schatz (JS) arrive near
end.
TOPICS
False positive rates for Wall Street Journal
Statistical Correlation
Direction for the Project

False positive rates for Wall Street Journal

Karl mentioned that the reporter from WSJ was told by industry that there
were no false positives

ST: If you consider it in terms of looking for anomalies, then there are
no false alarms

Statistical Correlation

JR: Yes/No type interrupt alarm - Yemini's systems can handle variables/hostname

DOB: Let's look at other correlations: Statistical Correlation - covariance
matrix; countable random variables - how do we model/count them?

Large sampling over time

Time - key measuring factor
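
Illustrative sketch (not from the meeting) of one way to approach DOB's question about counting: bucket events into fixed time windows, count each event type per window, and compute a covariance matrix over those counts. The event names, timestamps, and window size are hypothetical, chosen only to make the arithmetic easy to follow.

```python
from collections import Counter

def window_counts(events, window, kinds):
    """Count events of each kind per fixed-size time window.

    events: list of (timestamp, kind) pairs; window: window length in
    the same units as the timestamps. Returns one row of counts per window.
    """
    if not events:
        return []
    last = max(t for t, _ in events)
    counts = [Counter() for _ in range(int(last // window) + 1)]
    for t, kind in events:
        counts[int(t // window)][kind] += 1
    return [[c[k] for k in kinds] for c in counts]

def covariance_matrix(rows):
    """Sample covariance matrix (divides by n - 1) of the count columns."""
    n, dims = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    return [
        [
            sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
            for j in range(dims)
        ]
        for i in range(dims)
    ]

# Hypothetical event stream: (seconds, event type).
kinds = ["login_fail", "port_scan"]
events = [
    (0, "login_fail"), (1, "port_scan"),
    (12, "login_fail"), (13, "login_fail"), (14, "port_scan"), (15, "port_scan"),
    (25, "login_fail"),
]
rows = window_counts(events, window=10, kinds=kinds)
cov = covariance_matrix(rows)
print(rows)       # per-window counts, one column per event kind
print(cov[0][1])  # covariance between the two event kinds
```

A positive off-diagonal entry suggests the two event types tend to rise together in the same windows; with so few windows this is only a toy, which is why large sampling over time matters.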

KL: Attacks from sites

ST: Concept of Distance Measure/Similar to Clusters

New appropriate metric for distance

Correlate activity - method of distilling activity to binary features

Activity -> Cluster Center; closer to Cluster A than Cluster B

Activities in disparate places find commonality

Correlating attribute values to normal

Bottom Level counting
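
Illustrative sketch (not from the meeting) of ST's distance idea: an activity is distilled to binary features and assigned to whichever cluster center it is closer to. The feature meanings, the center values, and the choice of Euclidean distance are all assumptions; the notes leave the appropriate metric open.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_cluster(features, centers):
    """Return the name of the cluster center closest to the feature vector."""
    return min(centers, key=lambda name: euclidean(features, centers[name]))

# Hypothetical cluster centers (e.g. the mean of binary feature vectors
# previously assigned to each cluster).
centers = {
    "A": [0.9, 0.1, 0.8],
    "B": [0.1, 0.9, 0.2],
}

# Hypothetical activity distilled to three binary features.
activity = [1, 0, 1]
print(nearest_cluster(activity, centers))
```

Activities observed in disparate places that map to the same center would be candidates for having something in common.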

DOB: Stat 131 - stat correlation - will we use any of it?

DK: Packet maximum - percentage of packets is high (in ratio to packet
maximum) from certain hosts

KL: Example: Assume correlation, but not independent events - near simultaneous -
form of anomaly detection

ST: A -> B correlation - C enters, sniffing traffic

DOB: Codebook approach - solution for Global Guard?

KL: Worth a try.

ST: Hamming Distance - why is this an appropriate distance measure?

Answer: unweighted and unbiased

KL: Reduction process - try to eliminate bias and weighting (but not all
is eliminated)

DOB: Paper: A Coding Approach to Event Correlation

ST: Closest match - calculate distance for your example with every reference?

DOB: Precomputes every combination. Vector of length 3 - enumerate 8 meanings
with the vector.

DK: Hash table to lookup quickly.
P = Problem

Problem   Code Book Vector   Incoming Vector
P1        110                000 -> P2;  P1 -> 2, P2 -> 2
P2        100                In -> P1, P2;  P1 -> 2, P2 -> 2
          Sparse             Not sparse



ST: Works for small vector
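
Illustrative sketch (not from the meeting) of the precompute-and-look-up idea discussed above, using the two codebook vectors from the example (P1 = 110, P2 = 100). Every possible 3-bit incoming vector is enumerated once, its Hamming distance to each codebook vector is computed, and the closest problem(s) are stored in a hash table. The function names and the tie-handling (return all closest problems) are assumptions.

```python
from itertools import product

def hamming(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))

def build_lookup(codebook):
    """Precompute the closest problem(s) for every possible incoming vector.

    codebook: dict mapping problem name -> bit string, all the same length.
    Returns a dict (hash table) mapping each bit string to the sorted list
    of problems at minimum Hamming distance.
    """
    n = len(next(iter(codebook.values())))
    table = {}
    for bits in product("01", repeat=n):
        vec = "".join(bits)
        dists = {p: hamming(vec, code) for p, code in codebook.items()}
        best = min(dists.values())
        table[vec] = sorted(p for p, d in dists.items() if d == best)
    return table

codebook = {"P1": "110", "P2": "100"}
lookup = build_lookup(codebook)

print(len(lookup))    # 2**3 = 8 entries for a vector of length 3
print(lookup["000"])  # closest problem(s) to incoming vector 000
```

The table has 2^n entries for vectors of length n, which is why this works for small vectors only; longer vectors would need distances computed on demand or some other indexing scheme.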

ST: Reed-Solomon - locality of where errors will occur. Bursts as opposed
to a lot. Arrange features to get things together.

KL: Yemini requires human modeling. With statistical methods, only get
profiles

DOB: Correlation - decide whether to group events

DK: Numbers that give you statistical correlation

TH: Ad hoc statistical correlation

KL: Correlation vs. Anticorrelation: People who wear T-shirts in their
20s are millionaires in their 50s.

JR: Correlation without stats

ST: Accounting - move away from inferencing

TH: Statistical study of power lines and cancer rate - more correlated to
the poverty level than the power lines

Networks - to run one machine, another machine must be running

Direction for the Project

JR: Symptoms - what extra information do we need?

JR: Yemini Yes/No interrupts - sensors on the network are not enough. We need
to generate our own interrupts

DOB: Model attacks

JR: Continuous variable - may not know ahead of time

KL: Unknown attacks - anomaly detection

TH: Codebook - flexible signature mechanism?