N. H. Bshouty,
Y. Li
and P. M. Long. Using the doubling dimension to
analyze the generalization of learning algorithms. JCSS, 75(6):323-335, 2009.
Abstract
Given a set F of classifiers and a probability distribution over
their domain, one can define a metric by taking the distance between a
pair of classifiers to be the probability that they classify a random
item differently. We prove bounds on the sample complexity of PAC
learning in terms of the doubling dimension of this metric. These
bounds imply
known bounds on the sample complexity of
learning
halfspaces with respect to the uniform distribution
that are optimal up to a constant factor.
We then prove a bound that holds for any
algorithm that outputs a classifier with zero error whenever this is
possible; this bound is in terms of the maximum of the doubling
dimension and the VC-dimension of F and strengthens the best known
bound in terms of the VC-dimension alone.
Finally, we show that there is no bound on the doubling dimension of
halfspaces in n dimensions in terms of n that holds
independently of the domain distribution. This implies that there is
no such a bound in terms of the VC-dimension of F (in contrast with
the metric dimension).