Probabilistic models of data sets often exhibit salient geometric structure. This phenomenon is summarized by the manifold distribution hypothesis and can be exploited in probabilistic learning. Here we present normal-bundle bootstrap (NBB), a method that generates new data that preserve the geometric structure of a given data set. Inspired by manifold learning algorithms and concepts from differential geometry, our method decomposes the underlying probability measure into a marginalized measure on a learned data manifold and conditional measures on the normal spaces. The algorithm estimates the data manifold as a density ridge and constructs new data by bootstrapping projection vectors and adding them to the ridge. We apply our method to inference of the density ridge and related statistics, and to data augmentation to reduce overfitting.
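
The two algorithmic steps named in the abstract (ridge estimation, then bootstrapping projection vectors) can be illustrated with a rough Python sketch. It is not the implementation from the paper. It assumes a Gaussian kernel density estimate, approximates the density ridge with subspace-constrained mean shift, and resamples projection vectors from nearby ridge points without aligning normal spaces between neighbors, which may matter when the ridge is strongly curved. The names `scms_project`, `normal_bundle_bootstrap`, `bandwidth`, and `n_neighbors` are hypothetical and chosen only for this example.

```python
import numpy as np


def scms_project(data, bandwidth, ridge_dim=1, n_iter=200, tol=1e-7):
    """Move every data point toward a ridge of a Gaussian KDE (assumed estimator)."""
    data = np.asarray(data, dtype=float)
    n, d = data.shape
    h2 = bandwidth ** 2
    x = data.copy()
    for _ in range(n_iter):
        max_step = 0.0
        for i in range(n):
            diff = data - x[i]                               # (n, d) offsets to the data
            w = np.exp(-0.5 * np.sum(diff ** 2, axis=1) / h2)  # Gaussian kernel weights
            shift = w @ diff / w.sum()                       # mean-shift vector
            # KDE Hessian at x[i], up to a positive constant factor.
            hess = (diff * w[:, None]).T @ diff / h2 - w.sum() * np.eye(d)
            # eigh sorts eigenvalues in ascending order, so the first
            # d - ridge_dim eigenvectors span the estimated normal space.
            _, vecs = np.linalg.eigh(hess)
            normal = vecs[:, : d - ridge_dim]
            step = normal @ (normal.T @ shift)               # shift restricted to normal space
            x[i] += step
            max_step = max(max_step, float(np.linalg.norm(step)))
        if max_step < tol:
            break
    return x


def normal_bundle_bootstrap(data, bandwidth, n_neighbors=10, ridge_dim=1, seed=0):
    """Return (new_data, ridge): one resampled point per estimated ridge point."""
    data = np.asarray(data, dtype=float)
    rng = np.random.default_rng(seed)
    ridge = scms_project(data, bandwidth, ridge_dim)
    residuals = data - ridge                                 # projection vectors
    # Indices of the n_neighbors nearest ridge points (the point itself included).
    dist2 = np.sum((ridge[:, None, :] - ridge[None, :, :]) ** 2, axis=-1)
    neighbors = np.argsort(dist2, axis=1)[:, :n_neighbors]
    # Simplification: draw one neighbor's projection vector per ridge point.
    cols = rng.integers(neighbors.shape[1], size=len(ridge))
    picks = neighbors[np.arange(len(ridge)), cols]
    return ridge + residuals[picks], ridge


if __name__ == "__main__":
    # Toy data: a noisy circle, used only to exercise the sketch.
    rng = np.random.default_rng(0)
    theta = rng.uniform(0.0, 2.0 * np.pi, 100)
    data = np.c_[np.cos(theta), np.sin(theta)] + 0.1 * rng.standard_normal((100, 2))
    new_data, ridge = normal_bundle_bootstrap(data, bandwidth=0.3, n_neighbors=8)
    print(new_data.shape, ridge.shape)
```

In this sketch the bandwidth controls how smooth the estimated ridge is, and `n_neighbors` controls how local the resampling of projection vectors is; both would need tuning for real data.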


Keywords

  1. probabilistic learning
  2. data manifold
  3. dynamical systems
  4. resampling
  5. data augmentation

MSC codes

  1. 37M22
  2. 53-08
  3. 53A07
  4. 62F40
  5. 62G09


Published In

SIAM Journal on Mathematics of Data Science
Pages: 573 - 592
ISSN (online): 2577-0187


Submitted: 28 July 2020
Accepted: 12 February 2021
Published online: 4 May 2021


Funding Information

National Science Foundation (https://doi.org/10.13039/100000001): DMS-1638521

