Abstract

In this paper we apply the previously introduced approximation method based on the analysis of variance (ANOVA) decomposition and Grouped Transformations to synthetic and real data. The advantage of this method is the interpretability of the approximation, i.e., the ability to rank the importance of attribute interactions or variable couplings. Moreover, we are able to generate an attribute ranking that identifies unimportant variables and reduces the dimensionality of the problem. We compare the method to other approaches on publicly available benchmark datasets.
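
For context, the interpretability claim rests on the classical ANOVA decomposition and the associated global sensitivity indices; the display below recalls these standard definitions (cf. Sobol' [45, 46] and Kuo et al. [27] in the references). The attribute score r(i) is a plausible formalization of the ranking mentioned above, stated here as an assumption rather than as the paper's exact normalization.

```latex
% Classical ANOVA decomposition of f over d variables, the global
% sensitivity index \rho(u, f) of a term, and a plausible attribute
% score r(i) (the last is our assumption, not necessarily the
% paper's exact definition).
f(\mathbf{x}) = \sum_{u \subseteq \{1,\dots,d\}} f_u(\mathbf{x}_u), \qquad
\rho(u, f) = \frac{\sigma^2(f_u)}{\sigma^2(f)}, \qquad
r(i) = \sum_{u \ni i} \rho(u, f).
```

Each term f_u depends only on the variables indexed by u, the indices rho(u, f) of the nonempty terms sum to one, and a small score r(i) marks variable i as a candidate for removal.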

Keywords

  1. ANOVA
  2. high-dimensional
  3. approximation
  4. interpretability
  5. fast Fourier methods

MSC codes

  1. 65T
  2. 42B05
  3. 62-07
  4. 65D15

Supplementary Material


PLEASE NOTE: These supplementary files have not been peer-reviewed.


Index of Supplementary Materials

Title of paper: Interpretable Approximation of High-Dimensional Data

Authors: D. Potts and M. Schmischke

File: code.zip

Type: Compressed Code Files

Contents: The zip file contains several folders with code to replicate our numerical experiments.
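
To give a flavor of the quantities such experiments compute, the following is a minimal, self-contained Julia sketch (our own illustration, not taken from code.zip and not using the ANOVAapprox package) that estimates first-order global sensitivity indices with the classical Sobol' pick-freeze Monte Carlo estimator; the toy function and all names are assumptions made for this example.

```julia
# Minimal sketch: Monte Carlo estimation of first-order Sobol' indices
# (global sensitivity indices, cf. refs. [45, 46]) for a toy function
# whose ANOVA decomposition has one main effect and one pure coupling.
using Random, Statistics

# Toy function on [0,1]^3: a main effect in x1 plus a pure (x2, x3) coupling.
f(x) = x[1] + 4 * (x[2] - 0.5) * (x[3] - 0.5)

function first_order_sobol(f, d, N; rng = Random.default_rng())
    A = rand(rng, N, d)                      # base sample
    B = rand(rng, N, d)                      # independent resample
    yA = [f(view(A, j, :)) for j in 1:N]
    μ, v = mean(yA), var(yA)
    map(1:d) do i
        ABi = copy(B)
        ABi[:, i] = A[:, i]                  # "pick-freeze": keep column i from A
        yABi = [f(view(ABi, j, :)) for j in 1:N]
        (mean(yA .* yABi) - μ^2) / v         # estimate of S_i (noisy, may dip below 0)
    end
end

S = first_order_sobol(f, 3, 200_000)
println(round.(S; digits = 3))               # ≈ [0.429, 0.0, 0.0]
```

For this toy function the exact values are S_1 = 3/7 ≈ 0.429 and S_2 = S_3 = 0: the coupling of x2 and x3 is invisible to first-order indices, which is exactly why ranking interaction terms, as the paper does, carries additional information.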

References

1.
C. C. Aggarwal, ed., Data Classification: Algorithms and Applications, CRC Press, Boca Raton, FL, 2015.
2.
F. Bartel, D. Potts, and M. Schmischke, Grouped Transformations in High-Dimensional Explainable ANOVA Approximation, preprint, https://arxiv.org/abs/2010.10199, 2020.
3.
F. Bartel and M. Schmischke, ANOVAapprox Julia package, https://github.com/NFFT/ANOVAapprox/, 2020.
4.
G. Beylkin, J. Garcke, and M. Mohlenkamp, Multivariate regression and machine learning with sums of separable functions, SIAM J. Sci. Comput., 31 (2009), pp. 1840--1857, https://doi.org/10.1137/070710524.
5.
P. Binev, W. Dahmen, and P. Lamby, Fast high-dimensional approximation with sparse occupancy trees, J. Comput. Appl. Math., 235 (2011), pp. 2063--2076, https://doi.org/10.1016/j.cam.2010.10.005.
6.
C. M. Bishop, Pattern Recognition and Machine Learning, Springer, New York, 2006.
7.
R. Caflisch, W. Morokoff, and A. Owen, Valuation of mortgage-backed securities using Brownian bridges to reduce effective dimension, J. Comput. Finance, 1 (1997), pp. 27--46, https://doi.org/10.21314/jcf.1997.005.
8.
R. Chitta, R. Jin, and A. K. Jain, Efficient kernel clustering using random Fourier features, in Proceedings of the 12th International IEEE Conference on Data Mining, IEEE, Washington, DC, 2012, pp. 161--170, https://doi.org/10.1109/icdm.2012.61.
9.
P. G. Constantine, E. Dow, and Q. Wang, Active subspace methods in theory and practice: Applications to kriging surfaces, SIAM J. Sci. Comput., 36 (2014), pp. A1500--A1524, https://doi.org/10.1137/130916138.
10.
P. G. Constantine, A. Eftekhari, J. Hokanson, and R. A. Ward, A near-stationary subspace for ridge approximation, Comput. Methods Appl. Mech. Engrg., 326 (2017), pp. 402--421, https://doi.org/10.1016/j.cma.2017.07.038.
11.
R. DeVore, G. Petrova, and P. Wojtaszczyk, Approximation of functions of few variables in high dimensions, Constr. Approx., 33 (2010), pp. 125--143, https://doi.org/10.1007/s00365-010-9105-8.
12.
D. Dua and C. Graff, UCI Machine Learning Repository, 2017, http://archive.ics.uci.edu/ml.
13.
M. Fornasier, K. Schnass, and J. Vybiral, Learning functions of few arbitrary linear parameters in high dimensions, Found. Comput. Math., 12 (2012), pp. 229--262, https://doi.org/10.1007/s10208-012-9115-y.
14.
M. Goyal, M. Pandey, and R. Thakur, Exploratory analysis of machine learning techniques to predict energy efficiency in buildings, in Proceedings of the 8th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO), IEEE, Washington, DC, 2020, pp. 1033--1037, https://doi.org/10.1109/icrito48877.2020.9197976.
15.
I. G. Graham, F. Y. Kuo, J. A. Nichols, R. Scheichl, C. Schwab, and I. H. Sloan, Quasi-Monte Carlo finite element methods for elliptic PDEs with lognormal random coefficients, Numer. Math., 131 (2014), pp. 329--368, https://doi.org/10.1007/s00211-014-0689-y.
16.
I. G. Graham, F. Y. Kuo, D. Nuyens, R. Scheichl, and I. H. Sloan, Circulant embedding with QMC: Analysis for elliptic PDE with lognormal coefficients, Numer. Math., 140 (2018), pp. 479--511, https://doi.org/10.1007/s00211-018-0968-0.
17.
C. Gu, Smoothing Spline ANOVA Models, 2nd ed., Springer, New York, 2013, https://doi.org/10.1007/978-1-4614-5369-7.
18.
I. Guyon and A. Elisseeff, An introduction to variable and feature selection, J. Mach. Learn. Res., 3 (2003), pp. 1157--1182.
19.
A. Hashemi, H. Schaeffer, R. Shi, U. Topcu, G. Tran, and R. Ward, Generalization Bounds for Sparse Random Feature Expansions, preprint, https://arxiv.org/abs/2103.03191, 2021.
20.
T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed., Springer, New York, 2009.
21.
M. Holtz, Sparse Grid Quadrature in High Dimensions with Applications in Finance and Insurance, Lect. Notes Comput. Sci. Engrg. 77, Springer-Verlag, Berlin, 2011, https://doi.org/10.1007/978-3-642-16004-2.
22.
J. Keiner, S. Kunis, and D. Potts, Using NFFT 3---a software library for various nonequispaced fast Fourier transforms, ACM Trans. Math. Software, 36 (2009), 19, https://doi.org/10.1145/1555386.1555388.
23.
Y. Kokkinos and K. G. Margaritis, Multithreaded local learning regularization neural networks for regression tasks, in Engineering Applications of Neural Networks, Springer, Cham, 2015, pp. 129--138, https://doi.org/10.1007/978-3-319-23983-5_13.
24.
A. Kraskov, H. Stögbauer, and P. Grassberger, Estimating mutual information, Phys. Rev. E, 69 (2004), 066138, https://doi.org/10.1103/physreve.69.066138.
25.
F. Y. Kuo and D. Nuyens, Application of quasi-Monte Carlo methods to elliptic PDEs with random diffusion coefficients: A survey of analysis and implementation, Found. Comput. Math., 16 (2016), pp. 1631--1696, https://doi.org/10.1007/s10208-016-9329-5.
26.
F. Y. Kuo, C. Schwab, and I. H. Sloan, Quasi-Monte Carlo finite element methods for a class of elliptic partial differential equations with random coefficients, SIAM J. Numer. Anal., 50 (2012), pp. 3351--3374, https://doi.org/10.1137/110845537.
27.
F. Y. Kuo, I. H. Sloan, G. W. Wasilkowski, and H. Woźniakowski, On decompositions of multivariate functions, Math. Comp., 79 (2010), pp. 953--966, https://doi.org/10.1090/s0025-5718-09-02319-9.
28.
Z. Li, J.-F. Ton, D. Oglic, and D. Sejdinovic, Towards a unified analysis of random Fourier features, in Proceedings of the 36th International Conference on Machine Learning, K. Chaudhuri and R. Salakhutdinov, eds., Proceedings of Machine Learning Research 97, PMLR, 2019, pp. 3905--3914, http://proceedings.mlr.press/v97/li19k.html.
29.
R. Liu and A. B. Owen, Estimating mean dimensionality of analysis of variance decompositions, J. Amer. Statist. Assoc., 101 (2006), pp. 712--721, https://doi.org/10.1198/016214505000001410.
30.
D. Meyer, F. Leisch, and K. Hornik, The support vector machine under test, Neurocomputing, 55 (2003), pp. 169--186, https://doi.org/10.1016/S0925-2312(03)00431-4.
31.
M. Moeller and T. Ullrich, ${L}_2$-norm sampling discretization and recovery of functions from RKHS with finite trace, Sampl. Theory Signal Process. Data Anal., 19 (2021), 13, https://doi.org/10.1007/s43670-021-00013-3.
32.
G. Montavon, W. Samek, and K.-R. Müller, Methods for interpreting and understanding deep neural networks, Digital Signal Process., 73 (2018), pp. 1--15, https://doi.org/10.1016/j.dsp.2017.10.011.
33.
A. Owen, Effective dimension of some weighted pre-Sobolev spaces with dominating mixed partial derivatives, SIAM J. Numer. Anal., 57 (2019), pp. 547--562, https://doi.org/10.1137/17m1158975.
34.
C. C. Paige and M. A. Saunders, LSQR: An algorithm for sparse linear equations and sparse least squares, ACM Trans. Math. Software, 8 (1982), pp. 43--71, https://doi.org/10.1145/355984.355989.
35.
F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay, Scikit-learn: Machine learning in Python, J. Mach. Learn. Res., 12 (2011), pp. 2825--2830.
36.
G. Plonka, D. Potts, G. Steidl, and M. Tasche, Numerical Fourier Analysis, Appl. Numer. Harmon. Anal., Birkhäuser, Cham, 2018, https://doi.org/10.1007/978-3-030-04306-3.
37.
D. Potts and M. Schmischke, Approximation of high-dimensional periodic functions with Fourier-based methods, SIAM J. Numer. Anal., 59 (2021), pp. 2393--2429, https://arxiv.org/abs/1907.11412.
38.
D. Potts and M. Schmischke, Learning Multivariate Functions with Low-Dimensional Structures Using Polynomial Bases, preprint, https://arxiv.org/abs/1912.03195, 2019.
39.
H. Rabitz and O. F. Alis, General foundations of high dimensional model representations, J. Math. Chem., 25 (1999), pp. 197--233, https://doi.org/10.1023/A:1019188517934.
40.
A. Rahimi and B. Recht, Random features for large-scale kernel machines, in Advances in Neural Information Processing Systems, Vol. 20, J. Platt, D. Koller, Y. Singer, and S. Roweis, eds., Curran Associates, Red Hook, NY, 2008, https://proceedings.neurips.cc/paper/2007/file/013a006f03dbc5392effeb8f18fda755-Paper.pdf.
41.
B. C. Ross, Mutual information between discrete and continuous data sets, PLoS ONE, 9 (2014), e87357, https://doi.org/10.1371/journal.pone.0087357.
42.
A. Saltelli, M. Ratto, T. Andres, F. Campolongo, J. Cariboni, D. Gatelli, M. Saisana, and S. Tarantola, Global Sensitivity Analysis: The Primer, John Wiley & Sons, Chichester, UK, 2008.
43.
W. Samek, T. Wiegand, and K.-R. Müller, Explainable Artificial Intelligence: Understanding, Visualizing and Interpreting Deep Learning Models, preprint, https://arxiv.org/abs/1708.08296, 2017.
44.
M. Schmischke, ANOVAapprox Numerical Experiments, https://github.com/NFFT/AttributeRankingExamples, 2021.
45.
I. M. Sobol', On sensitivity estimation for nonlinear mathematical models, Matem. Mod., 2 (1990), pp. 112--118.
46.
I. M. Sobol', Global sensitivity indices for nonlinear mathematical models and their Monte Carlo estimates, Math. Comput. Simul., 55 (2001), pp. 271--280, https://doi.org/10.1016/s0378-4754(00)00270-6.
48.
J. A. Tropp, User-friendly tail bounds for sums of random matrices, Found. Comput. Math., 12 (2011), pp. 389--434, https://doi.org/10.1007/s10208-011-9099-z.
49.
C. F. J. Wu and M. S. Hamada, Experiments: Planning, Analysis, and Optimization, John Wiley & Sons, New York, 2011.
50.
T. Yang, Y.-F. Li, M. Mahdavi, R. Jin, and Z.-H. Zhou, Nyström method vs random Fourier features: A theoretical and empirical comparison, in Proceedings of the 25th International Conference on Neural Information Processing Systems (NIPS'12), Vol. 1, Curran Associates, Red Hook, NY, 2012, pp. 476--484.

Information & Authors

Information

Published In

SIAM Journal on Mathematics of Data Science
Pages: 1301--1323
ISSN (online): 2577-0187

History

Submitted: 25 March 2021
Accepted: 7 September 2021
Published online: 30 November 2021

Funding Information

Bundesministerium für Bildung und Forschung (BMBF): 01|S20053A
Deutsche Forschungsgemeinschaft (DFG): 416228727 (SFB 1410)
