Kernel machines (including support vector machines) offer powerful new methods for improving the accuracy and robustness of fundamental data mining operations on challenging (e.g. high-dimensional) data, including classification, regression, dimensionality reduction, and outlier detection.
However, a key tradeoff to this power is that kernel machines typically compute their outputs in terms of a large fraction of the training data, making it difficult to scale them up to train and run over massive data sets typically tackled in data mining contexts.
We recently demonstrated 2 to 64-fold querytime speedups of SVM and Kernel Fisher classifiers via a new computational geometry method for anytime output bounds . This new paper refines our approach in two key ways.
First, we introduce a simple linear algebra formulation based on standard Cholesky factorization, yielding simpler equations and lower computational overhead. Second, this new formulation suggests new methods for achieving additional speedups, including tuning on query samples. We demonstrate effectiveness on three benchmark datasets.