Description
In this dissertation, I propose an approach to passive network monitoring in which the monitor is provisioned for the average data rate on the network rather than the peak rate. The average rate is generally at least an order of magnitude lower than the peak. I describe Data Triage, an architecture that wraps a general-purpose streaming query processor with a software fallback mechanism that uses approximate query processing to provide timely answers during bursts. I analyze the policy issues that this architecture exposes and present Delay Constraints, an API and associated scheduling algorithm for managing Data Triage. I then describe my work on novel query approximation techniques that allow Data Triage's fallback mechanism to work with an important class of monitoring queries. Finally, I describe a deployment study of Data Triage in the context of a prototype end-to-end network monitoring system at Lawrence Berkeley National Laboratory.