Enterprise networks are becoming more complex and more vital to daily operations. To cope with these changes, network administrators need new tools for troubleshooting problems quickly in the face of ever more sophisticated adversaries. Passive network monitoring with declarative queries can provide the combination of responsiveness, focus, and flexibility that administrators need. But networks are subject to high-speed bursts of data, and keeping the cost of passive monitoring hardware under control is a major problem.

In this dissertation, I propose an approach to passive network monitoring in which the monitor is provisioned for the average data rate on the network. This average rate is generally at least an order of magnitude lower than the peak rate. I describe Data Triage, an architecture that wraps a general-purpose streaming query processor with a software fallback mechanism that uses approximate query processing to provide timely answers during bursts. I analyze the policy issues that this architecture exposes and present Delay Constraints, an API and associated scheduling algorithm for managing Data Triage. I then describe my work on novel query approximation techniques to make Data Triage's fallback mechanism work with an important class of monitoring queries. Finally, I describe a deployment study of Data Triage in the context of a prototype end-to-end network monitoring system at Lawrence Berkeley National Laboratory.




