The availability of inexpensive wireless sensor motes with imaging capability has made possible wireless camera networks that can be cheaply deployed for applications such as environment monitoring, surveillance and 3DTV. However, the mismatch between high-bandwidth image sensors and the combination of low-bandwidth channels, lossy communications and limited processing capabilities makes realizing such applications especially challenging and forces tradeoffs between rate, reliability, performance and complexity. In this dissertation, we attempt to bridge that gap and address those tradeoffs by drawing on both signal processing and computer vision.

First, we focus on the robust, distributed compression and transmission of video from multiple camera sensors over wireless packet erasure channels. We develop a video streaming solution, based on distributed source coding, that offers low encoding complexity, low latency and error resilience, and that effectively uses information from all available camera views. We also investigate two models of the correlation between camera views: one based on view synthesis and the other on epipolar geometry.

Second, we examine the problem of establishing visual correspondences between multiple cameras under rate constraints. This is a critical step in many computer vision tasks such as extrinsic calibration and multi-view object recognition, yet the wireless medium requires that any information exchange be done in a rate-efficient manner. We pose this as a rate-constrained distributed distance testing problem and propose two novel and complementary solutions: one uses distributed source coding to exploit the statistical correlation between descriptors of corresponding features, and the other uses random projections to construct low bit-rate, distance-preserving descriptors.
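
To illustrate the random projection idea, the following is a minimal sketch and not the construction developed in the dissertation. It assumes one particular, well-known variant (taking the sign of Gaussian random projections), in which the fraction of differing bits between two binary descriptors approximates the angle between the original feature vectors divided by pi. The function names, the descriptor dimension and the bit budget are all hypothetical choices made for this example.

```python
import math
import random


def make_projections(dim, num_bits, seed=0):
    """Draw num_bits Gaussian random directions in R^dim.

    Assumption: all cameras share the seed, so they build identical
    projections without exchanging them.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def binary_descriptor(x, projections):
    """Keep one bit per projection: the sign of the inner product <r, x>."""
    return [
        1 if sum(r_i * x_i for r_i, x_i in zip(r, x)) >= 0 else 0
        for r in projections
    ]


def hamming(a, b):
    """Number of positions in which two bit lists differ."""
    return sum(u != v for u, v in zip(a, b))


def estimated_angle(a, b):
    """Estimate the angle (radians) between the original vectors."""
    return math.pi * hamming(a, b) / len(a)


# Hypothetical 8-dimensional descriptors compressed to 256 bits each.
projections = make_projections(dim=8, num_bits=256)
query = binary_descriptor([1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], projections)
near = binary_descriptor([1.0, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], projections)
far = binary_descriptor([0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], projections)
```

In this sketch a camera would transmit only the 256-bit descriptor, and the receiver would declare a correspondence when the Hamming distance falls below a threshold; here `hamming(query, near)` should come out much smaller than `hamming(query, far)`.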

Third, we study the problem of analyzing the multiple video streams generated by deployed camera networks, where methods that can run in real time at a back-end server are needed. We develop computer vision techniques that exploit information that can be efficiently extracted from compressed videos. We apply these techniques to the task of detecting occurrences of human actions at specific times and locations, and study the effects of video compression parameters on recognition performance. We also consider their use in the analysis of meetings to perform tasks such as slide change detection and dominance modeling of participants.



