However, whether sensor data comes from novel sensor networks or legacy instrumentation, extracting meaningful information from it and acting on that information depends fundamentally on the metadata available to interpret it. There are more than 5 million commercial buildings in the US, and the sensors in each building are set up with customized and obscure metadata. Software analytics and applications cannot be deployed at scale across buildings if each deployment requires vendors and domain experts to spend hundreds of hours on every building. Today, even well-established applications are not deployed at scale for this very reason. Thus, the major challenge is scalability, i.e., a paradigm in which an application can be written once and deployed on hundreds or thousands of buildings.
This thesis evaluates the challenges with existing sensor metadata in smart buildings and proposes ways to normalize it to a uniform standard that would allow scalable (write-once, deploy-everywhere) application development. We develop three empirical criteria for successful metadata schemas: (a) completeness, or the ability to capture all sensors; (b) the ability to capture all relationships between sensors required by state-of-the-art applications; and (c) flexibility in incorporating novel sensors and applications, and usability. We empirically demonstrate that no existing smart-building sensor metadata schema satisfies these criteria, and we develop Brick, a schema based on an underlying graph data model, that does. We validate Brick across six large and diverse commercial buildings (comprising more than 17,000 sensors) on two continents, set up by different BMS vendors.
We also develop a human-in-the-loop synthesis technique that uses syntactic and data-driven steps to parse legacy metadata into a common schema. This technique allows building experts, who might not be conversant with sophisticated regular expression programs, to parse more than 70% of the legacy metadata in a building into a common schema by providing example parses for only about 1% of the sensors. We also show how active perturbations of subsystems in a building can be used to reconstruct, with an accuracy of about 80%, functional relationships between subsystems that may be missing from or incorrectly captured in legacy metadata. Finally, we demonstrate the power of our normalized smart-building metadata schema paradigm by using the standardized sensor and relationship representations to implement both simple applications (e.g., finding errant zones and identifying inefficient air handling subsystems to save energy) and sophisticated applications (e.g., finding rough occupancy estimates) that scale across real-world smart buildings.
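As a rough illustration of the write-once, deploy-everywhere idea, the sketch below stores normalized metadata as (subject, predicate, object) triples and runs one errant-zone check unchanged on two buildings. It is a minimal sketch under stated assumptions, not the thesis implementation: the class names (`Zone_Temperature_Sensor`, `Zone_Temperature_Setpoint`), the predicates (`type`, `isPointOf`), the point identifiers, the readings, and the 3-degree tolerance are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical normalized metadata for two buildings, as
# (subject, predicate, object) triples. All names are illustrative.
BUILDING_A = {
    ("a/zt1", "type", "Zone_Temperature_Sensor"),
    ("a/sp1", "type", "Zone_Temperature_Setpoint"),
    ("a/zt1", "isPointOf", "a/zone1"),
    ("a/sp1", "isPointOf", "a/zone1"),
}
BUILDING_B = {
    ("b/t101", "type", "Zone_Temperature_Sensor"),
    ("b/s101", "type", "Zone_Temperature_Setpoint"),
    ("b/t101", "isPointOf", "b/rm101"),
    ("b/s101", "isPointOf", "b/rm101"),
    ("b/t102", "type", "Zone_Temperature_Sensor"),
    ("b/s102", "type", "Zone_Temperature_Setpoint"),
    ("b/t102", "isPointOf", "b/rm102"),
    ("b/s102", "isPointOf", "b/rm102"),
}

# Hypothetical latest readings, keyed by point identifier.
READINGS = {
    "a/zt1": 78.0, "a/sp1": 72.0,
    "b/t101": 65.0, "b/s101": 70.0,
    "b/t102": 71.0, "b/s102": 72.0,
}


def points_by_zone(triples, point_type):
    """Map each zone to the points of the given type attached to it."""
    typed = {s for s, p, o in triples if p == "type" and o == point_type}
    zones = defaultdict(list)
    for s, p, o in triples:
        if p == "isPointOf" and s in typed:
            zones[o].append(s)
    return zones


def errant_zones(triples, readings, tolerance=3.0):
    """Return zones whose mean temperature deviates from the mean setpoint
    by more than `tolerance`. Depends only on the shared vocabulary, not on
    any building-specific naming convention."""
    sensors = points_by_zone(triples, "Zone_Temperature_Sensor")
    setpoints = points_by_zone(triples, "Zone_Temperature_Setpoint")
    errant = []
    for zone in sorted(set(sensors) & set(setpoints)):
        temp = mean([readings[p] for p in sensors[zone]])
        target = mean([readings[p] for p in setpoints[zone]])
        if abs(temp - target) > tolerance:
            errant.append(zone)
    return errant


if __name__ == "__main__":
    for name, triples in [("A", BUILDING_A), ("B", BUILDING_B)]:
        print(name, errant_zones(triples, READINGS))
```

The same `errant_zones` function runs on both buildings because it refers only to the shared types and relationships; the building-specific effort is confined to producing the triples.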