The standard mechanisms for query selectivity estimation used in relational database systems rely on properties specific to the attribute types. For example, histograms and quantile values rely on the ability to define a well-ordering over the attribute values. The query optimizer in an object-relational database system will, in general, be unable to exploit these mechanisms for user-defined types, requiring the user to create entirely new estimation mechanisms. Worse, writing the selectivity estimation routines is extremely difficult because the software interfaces provided by vendors are relatively low-level. In this paper, we discuss extensions of the generalized search tree, or GiST, to support user-defined selectivity estimation in a variety of ways. We discuss the computation of selectivity estimates with confidence intervals over arbitrary data types using indices, give methods for combining this technique with random sampling, and present results from an experimental comparison of these methods with several estimators from the literature.




Download Full History