Description
We first design a set of binary machine learning classifiers that take as input pairs of Wi-Fi RSSI fingerprints. These classifiers distinguish between pairs of RSSI fingerprints recorded at most 2 meters apart and pairs recorded farther apart but still within Bluetooth range. We empirically verify that a single classifier does not generalize well across environments with vastly different numbers of detectable Wi-Fi Access Points (APs). However, specialized classifiers, each tailored to situations where the number of detectable APs falls within a prescribed range, detect physical proximity significantly more accurately. We therefore design three classifiers, for situations with low, medium, and high numbers of detectable APs. Their balanced accuracy for proximity detection ranges from 66.8% to 77.8%.
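As a minimal illustration of the two ideas in this paragraph, the sketch below routes a fingerprint pair to one of three specialized classifiers based on the number of detectable APs, and computes balanced accuracy (the mean of the true-positive and true-negative rates). The function names and the thresholds `low_max` and `medium_max` are hypothetical placeholders; the actual AP-count ranges are not stated here, and this is not the authors' implementation.

```python
def select_regime(num_aps, low_max=5, medium_max=15):
    """Pick which specialized classifier to use for a fingerprint pair.

    low_max and medium_max are hypothetical thresholds chosen only for
    illustration; the real AP-count ranges are not given in the text.
    """
    if num_aps < 0:
        raise ValueError("num_aps must be non-negative")
    if num_aps <= low_max:
        return "low"
    if num_aps <= medium_max:
        return "medium"
    return "high"


def balanced_accuracy(y_true, y_pred):
    """Mean of true-positive rate and true-negative rate.

    Labels are binary: 1 = pair recorded in proximity, 0 = not in proximity.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return 0.5 * (tp / (tp + fn) + tn / (tn + fp))
```

Balanced accuracy is a natural choice when the two classes are unevenly represented: a classifier that always predicts "not in proximity" scores 0.5 regardless of the class ratio.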
Next, we design a second set of binary machine learning classifiers, which take as input pairs of 10-second traces of smartphone magnetometer readings. These classifiers distinguish between pairs of trace segments for which the two recording devices are at most 2 meters apart for at least 75% of the segment duration and pairs for which the two devices are farther apart but still within Bluetooth range. We first evaluate these classifiers’ performance on traces from the MagPIE dataset, a dataset for evaluating magnetometer-based localization algorithms; their balanced accuracy for homogeneous-device proximity detection ranges from 89.3% to 93.3%. We show that our classifiers generalize well to buildings whose traces are not present in their training data. We then introduce a simple method of compensating for the differing magnetometer biases of heterogeneous devices, and evaluate our approach with this added mitigation by training and evaluating classifiers on disjoint subsets of traces from four smartphone models. Their balanced accuracy for heterogeneous-device proximity detection with non–tilt-compensated traces ranges from 93.8% to 96.9%; these results indicate that our classifiers generalize well to devices whose traces are not present in their training data.
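The labeling rule for trace segments described above can be sketched as follows. This is an illustrative reading of the rule, not the authors' code: it assumes ground-truth 2D positions in meters are available for both devices at the same, uniformly spaced timestamps, so that the fraction of samples approximates the fraction of the segment duration. The function name and parameters are hypothetical, and the additional requirement that negative pairs remain within Bluetooth range is omitted.

```python
import math


def label_segment(positions_a, positions_b, radius_m=2.0, min_fraction=0.75):
    """Return 1 if the devices are within radius_m for at least min_fraction
    of the segment's samples, else 0.

    positions_a, positions_b: equal-length lists of (x, y) tuples in meters,
    assumed to be sampled at the same uniformly spaced timestamps.
    """
    if len(positions_a) != len(positions_b) or not positions_a:
        raise ValueError("position lists must be non-empty and equal in length")
    close = sum(
        1
        for (xa, ya), (xb, yb) in zip(positions_a, positions_b)
        if math.hypot(xa - xb, ya - yb) <= radius_m
    )
    return 1 if close / len(positions_a) >= min_fraction else 0
```

For example, a segment with four samples in which the devices are within 2 meters at three of them meets the 75% threshold and is labeled as in proximity.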