Recognition of general visual categories requires a diverse set of feature types, but not all are equally relevant to individual categories; efficient recognition arises by learning the potentially sparse features for each class and understanding the relationship between features common to related classes. This paper describes hierarchical discriminative probabilistic techniques for learning visual object category models. Our method recovers a nested set of object categories with chosen kernel combinations for discrimination at each level of the tree. We use a Gaussian Process based framework, with a parameterized sparsity penalty to favor compact classification hierarchies. We exploit structural properties of Gaussian Processes in a multi-class setting to gain computational efficiency and employ evidence maximization to optimally infer kernel weights from training data. Experiments on benchmark datasets show that our hierarchical probabilistic kernel combination scheme offers a benefit in both computational efficiency and performance: we report a significant improvement in accuracy compared to the current best whole-image kernel combination schemes on Caltech 101, as well as a two order-of-magnitude improvement in efficiency.