Misuse Project Meeting Agenda & Notes

Agenda for Misuse Detection Project Meeting: Monday 21-Oct-96, 5-6pm

Technical Paper (0:15) Raymond
    "Statistical Foundations of Audit Trail Analysis
    for the Detection of Computer Misuse", Paul Helman and Gunar Liepins

Milestones (0:15) Julie

Report on collaboration w/ Stanford (0:05) Karl

Debrief meeting with VMTH programmers (0:15) Brant

Med Informatics Survey (0:05) Steven

NSA Project milestones (0:05) Raymond

Topics for next agenda (0:05) Chris
        Chris' paper

Meeting Notes for Misuse Project - 10/21/96

Attendees: Chris, Brant, Raymond, Steven, Julie, Karl

Notes taken by Julie

Technical Paper

Statistical Modeling

* N(t): generates a normal activity x at time t
* M(t): generates a misuse activity x at time t
* D(t): determines whether transaction x at time t is normal or misuse
* N, M, D are pairwise independent => no temporal activities

Misuse Detector (MD)
* graded (a continuous spectrum) / binary (0 or 1)

* define error as a weighted sum of overestimation and underestimation
  (i.e., if x is a misuse, what the MD says about it)

Theorem 1: A binary MD minimizes the error if it is defined as
MD = 0 if r(x) <= (a * lambda) / ((1 - lambda) * b), and MD = 1 otherwise, where
lambda is the a priori probability of a normal transaction being
generated, n(x) is the prob. of x being generated by the normal process,
and m(x) is the probability of x being generated by the misuse process.
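
A minimal Python sketch of the threshold rule in Theorem 1. The notes do not define r(x), a, or b, so the sketch assumes r(x) is the likelihood ratio m(x)/n(x) and that a and b are the weights on the two kinds of error. The function names and the numbers in the usage lines are hypothetical.

```python
def likelihood_ratio(m_x, n_x):
    """Assumed definition r(x) = m(x) / n(x); the notes do not define r."""
    if n_x == 0:
        # x was never seen under normal behavior: treat as maximally suspicious.
        return float("inf")
    return m_x / n_x


def binary_md(m_x, n_x, lam, a, b):
    """Return 0 (normal) if r(x) <= a*lam / ((1 - lam)*b), else 1 (misuse).

    lam is the a priori probability of a normal transaction (0 < lam < 1);
    a and b are assumed to be the error weights from the error definition.
    """
    threshold = (a * lam) / ((1.0 - lam) * b)
    return 0 if likelihood_ratio(m_x, n_x) <= threshold else 1


# Made-up numbers: 90% of transactions are normal, equal error weights.
print(binary_md(m_x=0.5, n_x=0.01, lam=0.9, a=1.0, b=1.0))
print(binary_md(m_x=0.01, n_x=0.5, lam=0.9, a=1.0, b=1.0))
```

With lam close to 1 the threshold grows, so a transaction has to look much more like misuse than like normal activity before it is flagged.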

When this information is not available, we fall back to ranking the
transactions according to their degree of suspicion; i.e., MD(x1) <
MD(x2) => x2 is more suspicious than x1;

define prioritization penalty/error as the error of misranking two
transactions 

Theorem 2: a graded detector minimizes the prioritization error iff MDg
is consistent with r, i.e., r(x1) < r(x2) => MD(x1) < MD(x2)
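
The consistency condition in Theorem 2 can be checked mechanically on a small set of transactions. The sketch below is an illustration only: it counts misranked pairs without weighting them by probability, which is a simplification of the prioritization error, and the transaction names and scores are made up.

```python
from itertools import permutations


def misranked_pairs(transactions, md, r):
    """Return ordered pairs (u, v) with r(u) < r(v) but not md(u) < md(v).

    The detector md is consistent with r exactly when this list is empty.
    """
    return [(u, v) for u, v in permutations(transactions, 2)
            if r(u) < r(v) and not md(u) < md(v)]


# Hypothetical r-values and two hypothetical graded detectors.
r_values = {"x1": 0.2, "x2": 1.5, "x3": 4.0}
md_good = {"x1": 0.1, "x2": 0.5, "x3": 0.9}
md_bad = {"x1": 0.5, "x2": 0.1, "x3": 0.9}

txns = list(r_values)
print(misranked_pairs(txns, md_good.get, r_values.get))
print(misranked_pairs(txns, md_bad.get, r_values.get))
```

The second detector swaps the order of x1 and x2, so it is not consistent with r even though it still ranks x3 highest.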

still we need to know n(x) and m(x)

E.g., the frequentist estimator of n(x) is the number of times x occurs /
total number of transactions
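
A small Python sketch of the frequentist estimator, assuming each transaction is a tuple of attribute values and that the history contains only normal activity. The sample records are made up.

```python
from collections import Counter


def frequentist_estimate(history):
    """Estimate n(x) as (number of times x occurs) / (total transactions)."""
    counts = Counter(history)
    total = len(history)
    return {x: c / total for x, c in counts.items()}


# Hypothetical audit records: (user, action).
history = [
    ("alice", "read"),
    ("alice", "read"),
    ("bob", "write"),
    ("alice", "write"),
]
n = frequentist_estimate(history)
print(n[("alice", "read")])
print(n.get(("carol", "read"), 0.0))  # unseen transactions get estimate 0
```

Any transaction that never appears in the history gets an estimate of zero, which is one way the sample size limits the estimator.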

for m(x), use a surrogate function, e.g., the uniform and independent
models. The uniform model treats every transaction as equally likely; the
independent model treats the distributions of the individual attributes
(L of them) of a transaction as independent, where m(x) is the product of
ni(x) for i = 1, ..., L
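
The two surrogate models for m(x) can be sketched in Python as follows. This rests on an assumption: ni is read as the observed frequency of the i-th attribute's value in the history, which the notes do not spell out. The attribute domains and records are made up.

```python
from collections import Counter
from math import prod


def uniform_m(domains):
    """Uniform surrogate: every transaction in the space is equally likely."""
    size = prod(len(d) for d in domains)
    return lambda x: 1.0 / size


def independent_m(history):
    """Independent surrogate: product of per-attribute frequencies.

    Assumes ni is the observed frequency of attribute i's value in history.
    """
    total = len(history)
    n_attrs = len(history[0])
    marginals = [Counter(x[i] for x in history) for i in range(n_attrs)]

    def m(x):
        return prod(marginals[i][x[i]] / total for i in range(n_attrs))

    return m


# Hypothetical records: (user, action), so L = 2.
history = [
    ("alice", "read"),
    ("alice", "read"),
    ("bob", "write"),
    ("alice", "write"),
]
m_uni = uniform_m([{"alice", "bob"}, {"read", "write"}])
m_ind = independent_m(history)
print(m_uni(("bob", "read")))
print(m_ind(("bob", "read")))
```

Note that ("bob", "read") never occurs in the history, yet the independent model still assigns it nonzero mass because each attribute value has been seen separately.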

Overcome the sample size limitation by transforming the samples. Two
approaches: attribute selection, and value aggregation (partitioning the
attribute domain). The goals are

1) can still distinguish N and M process;

2) a good spread of r-values

3) preserve the structure of the space of all the transactions, S

4) small mass of unseen transactions

Theorem 3: it is NP-complete to choose a subset of the attributes to
project onto while still maintaining a certain number of singleton
transactions (=> resembling the S space)
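
To make the two transformations concrete, here is a hedged Python sketch of attribute selection (projection), value aggregation, and a singleton count. It assumes "singleton" means a transaction that occurs exactly once in the sample; the records and the day/night bucketing are made up.

```python
from collections import Counter


def project(history, keep):
    """Attribute selection: keep only the attribute positions in `keep`."""
    return [tuple(x[i] for i in keep) for x in history]


def aggregate(history, attr, bucket):
    """Value aggregation: replace attribute `attr` by its partition label."""
    return [x[:attr] + (bucket(x[attr]),) + x[attr + 1:] for x in history]


def singleton_count(history):
    """Number of distinct transactions that occur exactly once."""
    return sum(1 for c in Counter(history).values() if c == 1)


# Hypothetical records: (user, action, hour of day).
history = [
    ("alice", "read", 3),
    ("alice", "read", 14),
    ("bob", "write", 15),
    ("alice", "write", 22),
]


def day_or_night(hour):
    return "day" if 8 <= hour < 18 else "night"


print(singleton_count(history))
print(singleton_count(project(history, (0, 1))))
print(singleton_count(aggregate(history, 2, day_or_night)))
```

Trying every subset of L attributes means examining 2^L projections, which fits with Theorem 3's hardness result; a practical tool would presumably need a heuristic search.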

Nonmodel Based Approach

rule-based (simple pattern matching)

Theorem 4: in the presence of nonmaximal rules (rules not covering all the
attributes in the transaction), it is possible that the scoring
function (sf) is not consistent with the ranking function (r)
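
A toy Python illustration of Theorem 4. The scoring function here, a sum of the weights of all matching rules, is an assumption made for the example and may differ from the paper's sf; the rule, the transactions, and the r-values are all made up.

```python
def rule_score(x, rules):
    """Sum the weights of all rules whose attribute constraints x satisfies.

    Each rule is (pattern, weight), where pattern maps an attribute
    position to the value it must have.
    """
    return sum(w for pattern, w in rules
               if all(x[i] == v for i, v in pattern.items()))


# A nonmaximal rule: it constrains attribute 0 only and ignores attribute 1.
rules = [({0: "guest"}, 1.0)]

x1 = ("guest", "read")
x2 = ("guest", "delete")
r = {x1: 0.5, x2: 8.0}  # hypothetical r-values

print(rule_score(x1, rules), rule_score(x2, rules))
print(r[x1] < r[x2] and not rule_score(x1, rules) < rule_score(x2, rules))
```

Because the rule cannot see the second attribute, it gives both transactions the same score even though r ranks x2 as more suspicious, so this sf is not consistent with r.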

Meeting with VMTH

Paul B. expressed some apprehension about the project. Concerns are

Allocation of personnel
more interested in intrusion detection than misuse detection

We should emphasize:

Security features will increase marketability if application is commercialized
Security is more than protecting confidentiality of information
Security important for planned WWW site

Other information:

Paul B solely involved in policy
Paul needs to inform supervisor about our source of funding
They believe the development system is secure because of the obscurity of Mumps and the privacy of modem numbers. Virus checking is done on the development system.
Transaction journal - updates and modifications only, no reads, can be used to regenerate database, database is needed in order to interpret pointers and internal forms
System - central server in basement, 3 client servers (handle interactive logins), 24 users/server, 48 terminals/server. 3 client servers in other buildings. Support 40 other PCs.
Production environment uses NT server, RAS (remote access for programmers).

Approach:

hypothesize how each user can attack system.
explain difference between audit and enforcement
need supervised clean misuse data
think about how to translate from Mumps to Windows NT
ask for org charts, description of audit logs
Karl will call Paul to discuss his concerns
VMTH programmers can't read PostScript, need to find Ghostscript for Windows/DOS.

Milestones

I'm just going to list the milestones here without showing the order. We aren't finished with them so we can reconstruct the order at next week's meeting. I have attempted to roughly order them from last to first.

final milestone - prototype, NT executable, crunches NT audit trail with respect to a policy
Tool/language to specify a policy
Knowledge discovery - learns from training data, does not use an explicit policy
system - interacts with tool/language (provides expert knowledge)
Research report - previous papers cited
resolution engine
policy discovery system
audit logs
RE data filter
compromised software detection mechanism
PPWF (anyone remember what this was?)
literature search (ongoing)
list of attributes - in the problem domain, and on NT

NSA Milestones

literature search
see what's going on

Items for Next Week's Agenda

Med informatics survey
Debrief NT dept of hydrology site visit
NSA project milestones
Karl report on conversations with Paul and Stanford
Problem definition - Raymond

Technical paper - overview of knowledge discovery and databases - Steven