This paper describes a new technique for object recognition based on learning appearance models. The image is decomposed into local regions which are described by a new texture representation derived from the output of multiscale, multiorientation filter banks. We call this representation "Generalized Second Moments" as it can be viewed as a generalization of the windowed second moment matrix representation used by Garding & Lindeberg. Class-characteristic local texture features and their global composition is learned by a hierarchical mixture of experts architecture. The technique is applied to a vehicle database consisting of 5 general car categories (Sedan, Van with back-doors, Van without back-doors, old Sedan, and Volkswagen Bug). This is a difficult problem with considerable in-class variation. Our technique has a 6.5% misclassification rate, compared to eigen-images which give 17.4% misclassification rate, and nearest neighbors which give 15.7% misclassification rate.