This thesis introduces CANDID, a passive, NetFlow-based network traffic analysis platform for inferring relationships and dependencies among services running on hosts in enterprise networks. The scale, complexity, and constant change of these networks hinder administrators' ability to maintain insight into the relationships that exist within them. Consequently, administrators do not always know how best to proceed when a network failure occurs. CANDID aims to empower administrators by illuminating these relationships, so that they are prepared to remedy complex service failures. The solutions presented here take first steps toward understanding these in-network relationships, with a special focus on inferring one class of dependencies and detecting load-balanced services. The focal point of this thesis is a pair of radically different, yet complementary, strategies for inferring the presence of load balancing for pairs of systems. A case study using real NetFlow data from the network at Lawrence Berkeley National Lab validates these strategies. Promising results indicate that this problem space is rich with unanswered research questions and worthy of further exploration.



