Curse of dimensionality

The curse of dimensionality is the problem caused by the exponential increase in volume associated with adding extra dimensions to a (mathematical) space. The term was coined by Richard Bellman.

For example, 100 evenly spaced sample points suffice to sample a unit interval with no more than 0.01 distance between points; an equivalent sampling of a 10-dimensional unit hypercube with a lattice of spacing 0.01 between adjacent points would require 10^20 sample points: thus, in some sense, the 10-dimensional hypercube can be said to be a factor of 10^18 "larger" than the unit interval. (Adapted from an example by R. E. Bellman; see below.)
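
To make this growth concrete, here is a minimal Python sketch (the function name and the choice of 100 points per axis are illustrative, matching the example above) that counts the lattice points needed to keep the same spacing as the dimension rises:

    # Lattice points needed to sample the d-dimensional unit hypercube
    # with 100 evenly spaced points per axis (spacing of about 0.01).
    def lattice_points(d, points_per_axis=100):
        return points_per_axis ** d

    for d in (1, 2, 3, 10):
        print(d, lattice_points(d))
    # 100, 10000, 1000000, ..., and 100**10 = 10^20 for d = 10.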

Another way to envisage the "vastness" of high-dimensional Euclidean space is to compare the volume of the unit ball with that of the unit cube: as the dimension of the space increases, the unit ball's volume becomes insignificant relative to the unit cube's. Thus, in some sense, nearly all of the high-dimensional space is "far away" from the centre; put another way, the high-dimensional unit cube can be said to consist almost entirely of the "corners" of the hypercube, with almost no "middle". (This is an important intuition for understanding the chi-squared distribution.)
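
A minimal Python sketch of this shrinkage, using the standard closed form V_d = pi^(d/2) / Gamma(d/2 + 1) for the volume of the d-dimensional unit ball (the side-1 cube has volume 1 in every dimension):

    import math

    def unit_ball_volume(d):
        # Volume of the radius-1 ball in d dimensions:
        # V_d = pi^(d/2) / Gamma(d/2 + 1)
        return math.pi ** (d / 2) / math.gamma(d / 2 + 1)

    for d in (2, 5, 10, 20, 50):
        print(d, unit_ball_volume(d))
    # The volume peaks around d = 5 and then decays rapidly towards zero,
    # so the ball occupies a vanishing fraction of the surrounding space.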

The curse of dimensionality in optimization and learning

The curse of dimensionality is a significant obstacle to solving dynamic optimization problems by numerical backwards induction when the dimension of the state variable is large. It also complicates machine learning problems that involve learning a "state of nature" (possibly an infinite distribution) from a limited number of data samples in a high-dimensional feature space, as well as nearest neighbor search in high-dimensional spaces.
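
One way to see why nearest neighbor search degrades is that, for points drawn uniformly at random in a high-dimensional cube, the distances from a query point to its nearest and farthest neighbors become nearly indistinguishable. A minimal Python sketch (assuming uniform random data and Euclidean distance; the sample sizes are arbitrary):

    import numpy as np

    rng = np.random.default_rng(0)

    def distance_contrast(d, n=1000):
        # Draw n points and one query uniformly in the d-dimensional
        # unit cube, then compare nearest and farthest Euclidean distances.
        points = rng.random((n, d))
        query = rng.random(d)
        dists = np.linalg.norm(points - query, axis=1)
        return dists.min() / dists.max()

    for d in (2, 10, 100, 1000):
        print(d, distance_contrast(d))
    # The ratio climbs towards 1 as d grows: the "nearest" neighbor is
    # barely nearer than the farthest one, so distance-based indexing
    # loses its discriminating power.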

The curse of dimensionality in Bayesian statistics

The curse of dimensionality is a significant issue in computing numerical solutions in Bayesian statistics for models with more than a few parameters. This is especially true for hierarchical models because they often have a number of parameters proportional to the amount of data.
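
To see why, note that evaluating a posterior on a regular grid with k points per parameter takes k^d evaluations, so each extra parameter multiplies the work by k. A minimal Python sketch (the standard-normal log-posterior stub and the grid range are illustrative assumptions, not part of the article):

    import numpy as np

    def grid_normalising_constant(log_post, d, k=20):
        # Riemann-sum the posterior over a regular grid on [-5, 5]^d;
        # this needs k ** d evaluations of log_post.
        axes = [np.linspace(-5.0, 5.0, k)] * d
        grid = np.meshgrid(*axes, indexing="ij")
        theta = np.stack([g.ravel() for g in grid], axis=-1)  # (k**d, d)
        vals = np.exp(np.array([log_post(t) for t in theta]))
        cell_volume = (10.0 / (k - 1)) ** d
        return vals.sum() * cell_volume

    def log_post(theta):
        # Hypothetical stand-in posterior: an independent standard normal,
        # whose normalising constant should come out close to 1.
        return -0.5 * theta @ theta - 0.5 * len(theta) * np.log(2 * np.pi)

    for d in (1, 2, 3):
        print(d, 20 ** d, grid_normalising_constant(log_post, d))
    # Evaluation counts grow as 20**d (20, 400, 8000, ...); at d = 10 the
    # same grid would already need 20**10, about 10^13, evaluations.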

References

  • Bellman, R. E. (1957). Dynamic Programming. Princeton University Press, Princeton, NJ.
  • Bellman, R. E. (1961). Adaptive Control Processes. Princeton University Press, Princeton, NJ.
  • Powell, Warren B. (2007). Approximate Dynamic Programming: Solving the Curses of Dimensionality. Wiley. ISBN 0470171553.