Computer Science Thesis Proposal

Thursday, April 28, 2016 - 10:00am

Location:

Traffic 21 Classroom, 6501 Gates & Hillman Centers

Speaker:

MIGUEL ARAUJO, Ph.D. Student http://www.cs.cmu.edu/~maraujo/

The identification of anomalies and communities of nodes in real-world graphs has applications in a wide range of domains, from the automatic categorization of Wikipedia articles or websites to bank fraud detection. While recent and ongoing research is supplying tools for the analysis of simple unlabeled data, it is still a challenge to find patterns and anomalies in large labeled datasets, such as time-evolving networks. What do real communities identified in big datasets look like? How is their structure affected by their size? How can we find realistic communities in labeled data?

The completed work of this proposal addresses three related problems in this area. First, we explore the shape and structure of real communities in large networks and introduce the concept of "hyperbolic communities", providing two different algorithms for finding such structures in large datasets. Second, we find communities in edge-labeled networks, where labels can be timesteps or any other categorical information, and we describe efficient algorithms for this task. Lastly, we study anomalies in bank transaction networks, where both nodes and edges are labeled. We describe parallel algorithms that automatically find locations where bank accounts were compromised in billion-scale networks.

We also detail future work (1) on the distributed detection of edge-labeled communities, (2) on forecasting how communities evolve, predicting which members are going to join and finding the most common community profiles, and (3) on the existence of hyperbolic communities in word networks, merging community detection and the known heavy-tailed distribution of word frequencies.

Thesis Committee:

Christos Faloutsos (Co-Chair)
Pedro Ribeiro (Co-Chair, University of Porto)
William Cohen
Aarti Singh
Tina Eliassi-Rad (Northeastern University)
Beatriz Santos (University of Aveiro)
Alexandre Francisco (University of Lisbon)

For More Information, Contact:

deb@cs.cmu.edu

Keywords:

Thesis Proposal