Database Seminar

Monday, October 10, 2016 - 4:45pm to 6:00pm

Location:

8102 Gates & Hillman Centers

Speaker:

EMAAD MANZOOR, Senior Oracle Technical Consultant https://www.linkedin.com/in/emadmansour

Event Website:

http://db.cs.cmu.edu/events/db-seminar-fall-2016-emaad-manzoor/

For More Information, Contact:

hyunahs@cs.cmu.edu

Given a stream of heterogeneous graphs containing different types of nodes and edges, how can we spot anomalous ones in real time while consuming bounded memory? This problem is motivated by, and generalizes from, its application in security to host-level advanced persistent threat (APT) detection. We propose StreamSpot, a clustering-based anomaly detection approach that addresses challenges on two key fronts: (1) heterogeneity and (2) the streaming nature of the data. We introduce a new similarity function for heterogeneous graphs that compares two graphs based on the relative frequency of their local substructures, represented as short strings. This function lends itself to a vector representation of a graph that is (a) fast to compute and (b) amenable to a sketched version with bounded size that preserves similarity.
StreamSpot exhibits the properties that a streaming application requires. It is (i) fully streaming, processing the stream one edge at a time as it arrives; (ii) memory-efficient, requiring constant space for the sketches and the clustering; (iii) fast, taking constant time to update the graph sketches and the cluster summaries, and processing over 100K edges per second; and (iv) online, scoring and flagging anomalies in real time. Experiments on datasets containing simulated system-call flow graphs from normal browser activity and various attack scenarios (ground truth) show that StreamSpot achieves above 95% detection accuracy with small delay, along with competitive time and memory usage.
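
To make the idea of comparing graphs by the relative frequency of short-string substructures more concrete, here is a minimal Python sketch. It is not the speaker's implementation. Everything in it is an assumption chosen for illustration: the edge tuple layout, the one-hop string per source node, the cosine similarity, the 64-entry sign sketch, and all function and class names.

```python
import hashlib
import math
from collections import Counter, defaultdict

SKETCH_BITS = 64  # assumed fixed sketch length; memory per graph stays bounded


def shingle_counts(edges):
    """Count one short string per source node (an assumed, simplified substructure).

    Each edge is (src_id, src_type, dst_id, dst_type, edge_type), with
    single-character type labels. A node's string is its own type followed by
    the (edge type, target type) pairs of its outgoing edges in arrival order.
    """
    parts = defaultdict(list)
    for src, src_type, _dst, dst_type, edge_type in edges:
        if not parts[src]:
            parts[src].append(src_type)
        parts[src].append(edge_type + dst_type)
    return Counter("".join(p) for p in parts.values())


def cosine(a, b):
    """Cosine similarity between two string-frequency vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def sign(shingle, i):
    """Deterministic pseudo-random +1/-1 for a (string, position) pair."""
    digest = hashlib.sha256(f"{i}:{shingle}".encode()).digest()
    return 1 if digest[0] & 1 else -1


class Sketch:
    """Fixed-size sign sketch of a frequency vector (a SimHash-style assumption)."""

    def __init__(self):
        self.projection = [0] * SKETCH_BITS

    def update(self, shingle, delta=1):
        # Cost depends only on SKETCH_BITS, not on the size of the graph.
        for i in range(SKETCH_BITS):
            self.projection[i] += delta * sign(shingle, i)

    def bits(self):
        return [p >= 0 for p in self.projection]


def sketch_similarity(s, t):
    """Estimate cosine similarity from the fraction of agreeing sketch bits."""
    agree = sum(x == y for x, y in zip(s.bits(), t.bits()))
    return math.cos(math.pi * (1 - agree / SKETCH_BITS))


def sketch_of(counts):
    """Build a sketch from a full frequency vector."""
    sk = Sketch()
    for shingle, count in counts.items():
        sk.update(shingle, count)
    return sk


# Hypothetical toy graphs: P = process, F = file, S = socket; r/w/c = edge types.
g1 = [(1, "P", 2, "F", "r"), (1, "P", 3, "S", "w"), (4, "P", 5, "F", "r")]
g2 = [(7, "P", 8, "F", "r")]
exact = cosine(shingle_counts(g1), shingle_counts(g2))
approx = sketch_similarity(sketch_of(shingle_counts(g1)), sketch_of(shingle_counts(g2)))
print(f"exact={exact:.3f} approx={approx:.3f}")
```

In a streaming setting under these assumptions, a new edge changes the string of its source node, so the sketch would be maintained by calling `update(old_string, -1)` followed by `update(new_string, +1)` rather than being rebuilt from scratch.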