We explore techniques to reduce the sensitivity of large-scale data aggregation networks to data loss. Our approach leverages multi-level modeling and prediction techniques to account for missing data points and is enabled by the temporal correlation present in typical data aggregation applications. The resulting system tolerates significant involuntary data loss while minimizing the overall impact on accuracy. Further, this technique permits nodes to probabilistically remove themselves from the network in order to reduce overall resource usage, such as bandwidth or power consumption. In simulation, we explore the tradeoff between algorithmic complexity and prediction performance across a variety of data sets with different dynamic properties. We quantify the temporal correlation in several real-world datasets and achieve more than 50% resource savings in an environment with significant loss, while maintaining high accuracy.
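The report itself is not reproduced here, so the exact prediction model is unspecified. As an illustrative sketch only (the class names, the exponentially weighted moving average predictor, and the report-probability parameter are assumptions, not the authors' design), the core idea — nodes probabilistically suppress reports, and the aggregator fills gaps with a prediction that exploits temporal correlation — might look like this:

```python
import random


class NodeStream:
    """A sensor node that probabilistically suppresses its reports
    to save bandwidth/power (hypothetical parameterization)."""

    def __init__(self, report_prob):
        self.report_prob = report_prob  # probability of transmitting

    def report(self, value):
        # Return the reading, or None if the node elects to stay silent.
        return value if random.random() < self.report_prob else None


class EwmaPredictor:
    """Aggregator-side gap filler: an exponentially weighted moving
    average stands in for the paper's (unspecified) prediction model."""

    def __init__(self, alpha=0.5):
        self.alpha = alpha
        self.estimate = None

    def update(self, observed):
        # On a received report, blend it into the running estimate;
        # on a missing report (None), reuse the last estimate.
        if observed is not None:
            if self.estimate is None:
                self.estimate = observed
            else:
                self.estimate = (self.alpha * observed
                                 + (1 - self.alpha) * self.estimate)
        return self.estimate
```

Under this sketch, an aggregate such as a network-wide sum is computed over per-node estimates rather than raw reports, so a silent node contributes its predicted value instead of a gap; higher temporal correlation in the data makes that substitution cheaper in accuracy.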
Title
Probabilistic Data Aggregation In Distributed Networks
Published
2006-02-06
Full Collection Name
Electrical Engineering & Computer Sciences Technical Reports
Other Identifiers
EECS-2006-11
Type
Text
Extent
15 p.
Archive
The Engineering Library
Usage Statement
Researchers may make free and open use of the UC Berkeley Library’s digitized public domain materials. However, some materials in our online collections may be protected by U.S. copyright law (Title 17, U.S.C.). Use or reproduction of materials protected by copyright beyond that allowed by fair use (Title 17, U.S.C. § 107) requires permission from the copyright owners. The use or reproduction of some materials may also be restricted by terms of University of California gift or purchase agreements, privacy and publicity rights, or trademark law. Responsibility for determining rights status and permissibility of any use or reproduction rests exclusively with the researcher. To learn more or make inquiries, please see our permissions policies (https://www.lib.berkeley.edu/about/permissions-policies).