A model must be able to adapt itself to generalize to new data. Deep networks have achieved great success in the past decade, especially when training and testing data come from the same distribution. Unfortunately, performance suffers when the training (source) data differ from the testing (target) data, a condition known as domain shift. Models should update themselves to deal with such unexpected natural and adversarial perturbations, including weather changes, sensor degradation, and adversarial attacks. If some labeled target data are available, transfer learning methods such as fine-tuning and few-shot learning can be used to optimize the model in a supervised way. However, requiring target labels is impractical in most real-world scenarios. Therefore, we instead focus on unsupervised approaches for generalizing the model to the target domain.

In this dissertation, we study the setting of fully test-time adaptation, in which the model is updated to fit an uncontrollable target data distribution without access to target labels or source data. In other words, the model has only its parameters and unlabeled target data in this setting. The core idea is to use a test-time optimization objective, entropy minimization, as a feedback mechanism that closes the loop for the learnable model at test time. We optimize the model for confidence, as measured by the entropy of its outputs, in either an online or offline manner. This simple yet effective method can reduce the generalization error for image classification on naturally corrupted and adversarially perturbed images. The adaptive nature of the semantic segmentation model can also be exploited to cope with dynamic scale inference for scene understanding. With the help of contrastive learning and diffusion models, we can learn target domain features and generate source-style images to further boost recognition performance in dynamic environments.
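As a rough illustration of the entropy objective described above, the following minimal sketch computes the Shannon entropy of a softmax prediction and takes one gradient step that lowers it. It is not the dissertation's implementation: for simplicity it updates the logits directly as a stand-in for learnable model parameters, and the function names, the example logits, and the learning rate are all assumptions made for this example.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy H(p) = -sum_c p_c log p_c, in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def entropy_grad_wrt_logits(logits):
    """Analytic gradient dH/dz_k = -p_k * (log p_k + H) for p = softmax(z)."""
    probs = softmax(logits)
    h = entropy(probs)
    return [-p * (math.log(p) + h) if p > 0.0 else 0.0 for p in probs]

def adapt_step(logits, lr):
    """One gradient-descent step on the entropy.

    Assumption: the logits themselves are treated as the adaptable
    parameters; a real model would update its own weights instead.
    """
    grad = entropy_grad_wrt_logits(logits)
    return [z - lr * g for z, g in zip(logits, grad)]

# Hypothetical prediction on one unlabeled target sample.
logits = [2.0, 1.0, 0.1]
before = entropy(softmax(logits))
after = entropy(softmax(adapt_step(logits, lr=1.0)))
print(f"entropy before: {before:.4f}, after one step: {after:.4f}")
```

No label is needed at any point: the step only pushes probability mass toward the class the model already favors, which is why the objective can be applied at test time.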



