The key point of GrIDS is to detect large-scale automated attacks on networked systems. The mechanism that we propose is to build activity graphs. The nodes of an activity graph correspond to hosts in a system, while edges in the graph correspond to network activity between those hosts.
GrIDS's core component is a graph-based language for analyzing network connection activity in a LAN-MAN sized system. GrIDS contains a language for specifying how network connection activity is represented in graphical form, incorporating information from host-based IDSs, network sniffers and connection analysis components.
The same language allows users to specify graphical structures which require notification of system administrators. This language is suitable for implementing network access control policies as well as identifying attack signatures.
Next the worm spreads to D, E, and so on. The aggregation engine incorporates them too. The graph soon looks like:
A--> B--> D | | | `-> E | `-> C--> F | `-> GAt this point, the aggregation engine notices the tree like structure of the graph, and it reports a suspected worm.
In addition to forming graphs from network connections, we allow the graphs to be "decorated". That is, both nodes and edges in an activity graph may have attributes which supply additional information about the connection/host, or document other significant events which occurred in the same time frame. For example, if an NSM reported that the B->D connection appeared to contain a password file, an annotation to that effect could be added to that link. Similarly, if a tripwire module on C reported a change in a critical system file in the right activity time-frame, that would become an an attribute of the C node in this graph.
In the worm example, we detected the worm based soley on the tree like structure of the graph. In a more complicated algorithm, we could also use the node attributes ("decorations") in detecting a worm. A tree structure may not, in some other algorithm, be suspicious until some node in the tree is decorated by some worm-related attribute.
A difficulty in the simple worm example is that only fast spreading worms will create large graphs (because graphs built at each stage of a slower worm will time-out before the next stage of infection.) Slower worms will create only small trees, each tree (possibly only a single connection) timing-out before another tree is created. Without the large trees which are used to detect worms, a slow worm won't be detected.
One solution is to allow graphs longer latency periods between connections. This, however, creates a serious problem: legitimate user activity clumps together into one huge graph, obscuring possible worm activity.
A better solution is to refine what kinds of network activity should be associated with a particular network abuse. If a worm is known to spread using rlogin and telnet connections, then a graph can be built that only contains rlogin and telnet connections. Since legitimate rlogin and telnet activity is less profuse than the sum of all legitimate activity, the latency time of this new graph may be increased without creating a huge (obscured) graph of legitimate activity.
Of course, for numerous types of network abuse we may search, so we need to be able to build numerous kinds of graphs. The graph engine is capable of maintaining multiple graph spaces, where each graph space contains only graphs of one type. The "type" of graph is specified by a set of rules which specify how graphs in the graph space are built (including rules specifying which kinds of connections to represent).