Proceedings of the 2019 SIAM International Conference on Data Mining

Intrinsic Dimensionality Estimation within Tight Localities

Abstract

Accurate estimation of Intrinsic Dimensionality (ID) is of crucial importance in many data mining and machine learning tasks, including dimensionality reduction, outlier detection, similarity search and subspace clustering. However, since their convergence generally requires sample sizes (that is, neighborhood sizes) on the order of hundreds of points, existing ID estimation methods may have only limited usefulness for applications in which the data consists of many natural groups of small size. In this paper, we propose a local ID estimation strategy stable even for ‘tight’ localities consisting of as few as 20 sample points. The estimator applies MLE techniques over all available pairwise distances among the members of the sample, based on a recent extreme-value-theoretic model of intrinsic dimensionality, the Local Intrinsic Dimension (LID). Our experimental results show that our proposed estimation technique can achieve notably smaller variance, while maintaining comparable levels of bias, at much smaller sample sizes than state-of-the-art estimators.
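For context, the sketch below shows the commonly used maximum-likelihood (Hill-type) estimator of LID computed from the distances between a single query point and its neighbors. It is not the estimator proposed in this paper, which the abstract describes as using all pairwise distances within the sample. The function name `lid_mle` and the handling of zero distances are assumptions made for illustration only.

```python
import numpy as np

def lid_mle(distances):
    """Standard MLE (Hill-type) LID estimate from query-to-neighbor distances.

    Illustrative baseline only; not the pairwise-distance estimator
    described in the abstract.
    """
    r = np.asarray(distances, dtype=float)
    r = r[r > 0]                      # assumption: ignore zero distances (duplicates)
    if r.size < 2:
        raise ValueError("need at least two positive distances")
    w = r.max()                       # neighborhood radius = largest distance
    mean_log = np.mean(np.log(r / w)) # average log-ratio, always <= 0
    if mean_log == 0.0:
        raise ValueError("all distances are equal; estimate is undefined")
    return -1.0 / mean_log

# Usage: distances from the center of a 5-dimensional unit ball to points
# drawn uniformly inside it follow r = U**(1/5) for uniform U on (0, 1).
rng = np.random.default_rng(0)
sample = rng.random(20000) ** (1.0 / 5.0)
estimate = lid_mle(sample)
```

In practice the input would be the distances returned by a k-nearest-neighbor query around the point of interest. With only a few tens of neighbors, this single-query estimate tends to fluctuate strongly, which is the small-sample regime the abstract targets.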

Published In

Proceedings of the 2019 SIAM International Conference on Data Mining
Pages: 181 - 189
Editors: Tanya Berger-Wolf, University of Illinois, USA and Nitesh Chawla, University of Notre Dame
ISBN (Online): 978-1-61197-567-3

History

Published online: 6 May 2019

Authors

Affiliations

Ken-ichi Kawarabayashi

Notes

* M. E. H. was supported by JSPS Kakenhi Kiban (B) Research Grant 18H03296. K. K. was supported by the JST ERATO Kawarabayashi Large Graph Project JPMJER1201 and by JSPS Kakenhi JP18H05291. M. R. thanks Serbian national project OI174023.
