SIAM Journal on Mathematics of Data Science: Table of Contents
Society for Industrial and Applied Mathematics
Language: en-US
List of articles from both the latest and ahead-of-print issues.
https://epubs.siam.org/loi/sjmdaq?af=R
Cover image: https://epubs.siam.org/na101/home/literatum/publisher/siam/journals/covergifs/sjmdaq/cover.jpg

Nonasymptotic Bounds for Adversarial Excess Risk under Misspecified Models
Changyu Liu, Yuling Jiao, Junhui Wang, and Jian Huang
SIAM Journal on Mathematics of Data Science, Volume 6, Issue 4, Pages 847-868, December 2024
DOI: 10.1137/23M1598210
https://epubs.siam.org/doi/abs/10.1137/23M1598210?af=R
Published online 2024-10-01; issue date 2024-12-31
© 2024 Society for Industrial and Applied Mathematics

Abstract. We propose a general approach to evaluating the performance of robust estimators based on adversarial losses under misspecified models. We first show that adversarial risk is equivalent to the risk induced by a distributional adversarial attack under certain smoothness conditions. This ensures that the adversarial training procedure is well-defined. To evaluate the generalization performance of the adversarial estimator, we study the adversarial excess risk. Our proposed analysis method includes investigations on both generalization error and approximation error. We then establish nonasymptotic upper bounds for the adversarial excess risk associated with Lipschitz loss functions. In addition, we apply our general results to adversarial training for classification and regression problems. For the quadratic loss in nonparametric regression, we show that the adversarial excess risk bound can be improved over that for a general loss.
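The pointwise adversarial loss underlying this kind of analysis can be made concrete. A minimal sketch (our illustration, not the paper's estimator): for a linear predictor with the Lipschitz logistic loss, the inner supremum over an L2-bounded perturbation has a closed form, because the worst-case perturbation simply reduces the margin by eps times the weight norm.

```python
import numpy as np

def logistic_loss(margin):
    # Lipschitz, decreasing in the margin y * f(x).
    return np.log1p(np.exp(-margin))

def adversarial_logistic_loss(w, x, y, eps):
    """sup over ||delta||_2 <= eps of the logistic loss for f(x) = <w, x>."""
    margin = y * w.dot(x)
    # For a linear model the worst case reduces the margin by eps * ||w||_2.
    return logistic_loss(margin - eps * np.linalg.norm(w))

rng = np.random.default_rng(0)
w = rng.normal(size=5)
x = rng.normal(size=5)
y = 1.0
clean = logistic_loss(y * w.dot(x))
adv = adversarial_logistic_loss(w, x, y, eps=0.1)
```

By construction the adversarial loss dominates the clean loss and the loss at any feasible perturbation; the nonlinear case studied in the paper replaces this closed form with a well-posed distributional attack.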

Finite-Time Analysis of Natural Actor-Critic for POMDPs
Semih Cayci, Niao He, and R. Srikant
SIAM Journal on Mathematics of Data Science, Volume 6, Issue 4, Pages 869-896, December 2024
DOI: 10.1137/23M1587683
https://epubs.siam.org/doi/abs/10.1137/23M1587683?af=R
Published online 2024-10-01; issue date 2024-12-31
© 2024 Society for Industrial and Applied Mathematics

Abstract. We study the reinforcement learning problem for partially observed Markov decision processes (POMDPs) with large state spaces. We consider a natural actor-critic method that employs an internal memory state for policy parameterization to address partial observability, function approximation in both actor and critic to address the curse of dimensionality, and a multistep temporal difference learning algorithm for policy evaluation. We establish nonasymptotic error bounds for actor-critic methods for partially observed systems under function approximation. In particular, in addition to the function approximation and statistical errors that also arise in MDPs, we explicitly characterize the error due to the use of finite-state controllers. This additional error is stated in terms of the total variation distance between the belief state in POMDPs and the posterior distribution of the hidden state when using a finite-state controller. Further, in the specific case of sliding-window controllers, we show that this inference error can be made arbitrarily small by using larger window sizes under certain ergodicity conditions.
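A sliding-window controller of the kind analyzed here conditions the policy on the last m observations instead of the full history, so the window acts as the finite memory state. A toy tabular sketch (our own construction, not the paper's parameterization, which uses function approximation):

```python
from collections import deque
import numpy as np

class SlidingWindowController:
    """Softmax policy over the last m observations (tabular, for illustration only)."""

    def __init__(self, num_obs, num_actions, m, seed=0):
        self.m = m
        self.num_obs = num_obs
        rng = np.random.default_rng(seed)
        # One row of logits per possible observation window.
        self.logits = rng.normal(size=(num_obs ** m, num_actions))
        self.window = deque([0] * m, maxlen=m)  # finite memory state

    def act(self, obs, rng):
        self.window.append(obs)           # update the memory state
        idx = 0
        for o in self.window:             # encode the window as a base-num_obs integer
            idx = idx * self.num_obs + o
        z = self.logits[idx]
        p = np.exp(z - z.max())
        p /= p.sum()
        return rng.choice(len(p), p=p)

rng = np.random.default_rng(1)
ctrl = SlidingWindowController(num_obs=3, num_actions=2, m=4)
action = ctrl.act(2, rng)
```

Growing m shrinks the gap between the window's induced posterior and the true belief state, which is exactly the inference error the paper bounds.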

Subgradient Langevin Methods for Sampling from Nonsmooth Potentials
Andreas Habring, Martin Holler, and Thomas Pock
SIAM Journal on Mathematics of Data Science, Volume 6, Issue 4, Pages 897-925, December 2024
DOI: 10.1137/23M1591451
https://epubs.siam.org/doi/abs/10.1137/23M1591451?af=R
Published online 2024-10-01; issue date 2024-12-31
© 2024 Society for Industrial and Applied Mathematics

Abstract. This paper is concerned with sampling from probability distributions [math] on [math] admitting a density of the form [math], where [math], with [math] being a linear operator and [math] being nondifferentiable. Two different methods are proposed, both employing a subgradient step with respect to [math] but, depending on the regularity of [math], either an explicit or an implicit gradient step with respect to [math]. For both methods, nonasymptotic convergence proofs are provided, with improved convergence results for more regular [math]. Further, numerical experiments are conducted for simple 2D examples, illustrating the convergence rates, and for examples of Bayesian imaging, showing the practical feasibility of the proposed methods for high-dimensional data.
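The basic iteration behind such methods can be sketched in a few lines. This is a toy instance under our own assumptions (identity operator, potential U(x) = ||x||^2/2 + ||x||_1, a fixed step size), not the authors' refined explicit/implicit schemes: an unadjusted Langevin step using a subgradient of the nonsmooth potential.

```python
import numpy as np

# Target: p(x) proportional to exp(-U(x)) with U(x) = ||x||^2/2 + lam * ||x||_1.

def subgrad_U(x, lam=1.0):
    # A subgradient of U; we pick 0 in the subdifferential of |.| at kinks.
    return x + lam * np.sign(x)

def subgradient_langevin(x0, tau=0.01, n_steps=20000, lam=1.0, seed=0):
    """x_{k+1} = x_k - tau * g(x_k) + sqrt(2 tau) * xi_k, g a subgradient of U."""
    rng = np.random.default_rng(seed)
    x = x0.copy()
    samples = []
    for _ in range(n_steps):
        noise = rng.normal(size=x.shape)
        x = x - tau * subgrad_U(x, lam) + np.sqrt(2.0 * tau) * noise
        samples.append(x.copy())
    return np.array(samples)

samples = subgradient_langevin(np.zeros(1))
```

Because the target here is symmetric about the origin, the long-run sample mean should hover near zero; the paper quantifies how fast such chains approach the target and handles a general linear operator inside the nonsmooth term.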

Equivariant Neural Networks for Indirect Measurements
Matthias Beckmann and Nick Heilenkötter
SIAM Journal on Mathematics of Data Science, Volume 6, Issue 3, Pages 579-601, September 2024
DOI: 10.1137/23M1582862
https://epubs.siam.org/doi/abs/10.1137/23M1582862?af=R
Published online 2024-07-03; issue date 2024-09-30
© 2024 Society for Industrial and Applied Mathematics

Abstract. In recent years, deep learning techniques have shown great success in various tasks related to inverse problems, where a target quantity of interest can only be observed through indirect measurements by a forward operator. Common approaches apply deep neural networks in a postprocessing step to the reconstructions obtained by classical reconstruction methods. However, the latter methods can be computationally expensive and introduce artifacts that are not present in the measured data and, in turn, can deteriorate the performance on the given task. To overcome these limitations, we propose a class of equivariant neural networks that can be directly applied to the measurements to solve the desired task. To this end, we build appropriate network structures by developing layers that are equivariant with respect to data transformations induced by well-known symmetries in the domain of the forward operator. We rigorously analyze the relation between the measurement operator and the resulting group representations and prove a representer theorem that characterizes the class of linear operators that translate between a given pair of group actions. Based on this theory, we extend the existing concepts of Lie group equivariant deep learning to inverse problems and introduce new representations that result from the involved measurement operations. This allows us to efficiently solve classification, regression, or even reconstruction tasks based on indirect measurements also for very sparse data problems, where a classical reconstruction-based approach may be hard or even impossible. We illustrate the effectiveness of our approach in numerical experiments and compare with existing methods.
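The equivariance property such layers are built around has a familiar special case: circular convolution commutes with cyclic shifts, i.e. conv(shift(x)) equals shift(conv(x)). A quick numerical check of that identity (illustrative only; the paper generalizes it to the group actions induced by the forward operator):

```python
import numpy as np

def circ_conv(x, k):
    """Circular convolution of signal x with kernel k."""
    n = len(x)
    return np.array([sum(k[j] * x[(i - j) % n] for j in range(len(k)))
                     for i in range(n)])

rng = np.random.default_rng(0)
x = rng.normal(size=8)
k = rng.normal(size=3)

shift = lambda v: np.roll(v, 1)           # cyclic shift by one position
lhs = circ_conv(shift(x), k)              # transform first, then convolve
rhs = shift(circ_conv(x, k))              # convolve first, then transform
```

The two orders of operations agree, which is the defining property an equivariant layer must satisfy for the relevant group action.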

Gradient Descent in the Absence of Global Lipschitz Continuity of the Gradients
Vivak Patel and Albert S. Berahas
SIAM Journal on Mathematics of Data Science, Volume 6, Issue 3, Pages 602-626, September 2024
DOI: 10.1137/22M1527210
https://epubs.siam.org/doi/abs/10.1137/22M1527210?af=R
Published online 2024-07-05; issue date 2024-09-30
© 2024 Vivak Patel

Abstract. Gradient descent (GD) is a collection of continuous optimization methods that have achieved immeasurable success in practice. Owing to data science applications, GD with diminishing step sizes has become a prominent variant. While this variant of GD has been well studied in the literature for objectives with globally Lipschitz continuous gradients or by requiring bounded iterates, objectives from data science problems do not satisfy such assumptions. Thus, in this work, we provide a novel global convergence analysis of GD with diminishing step sizes for differentiable nonconvex functions whose gradients are only locally Lipschitz continuous. Through our analysis, we generalize what is known about gradient descent with diminishing step sizes, including interesting topological facts, and we elucidate the varied behaviors that can occur in the previously overlooked divergence regime. Thus, we provide a general global convergence analysis of GD with diminishing step sizes under realistic conditions for data science problems.
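A minimal example of the regime in question (our toy choice, not taken from the paper): f(x) = x^4/4 has gradient f'(x) = x^3, which is locally but not globally Lipschitz continuous, yet GD with diminishing step sizes a_k = c/(k+1) still drives the iterate toward the minimizer x* = 0 from a moderate starting point.

```python
def gd_diminishing(x0, c=0.5, n_iter=100_000):
    """Gradient descent on f(x) = x**4 / 4 with step sizes c / (k + 1)."""
    x = x0
    for k in range(n_iter):
        x = x - (c / (k + 1)) * x ** 3   # gradient is only locally Lipschitz
    return x

x_final = gd_diminishing(1.0)
```

From a large starting point the same recursion can overshoot wildly in early iterations, which hints at the divergence regime the paper characterizes; the analysis makes precise when convergence like the run above is guaranteed without assuming bounded iterates.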

Three-Operator Splitting for Learning to Predict Equilibria in Convex Games
D. McKenzie, H. Heaton, Q. Li, S. Wu Fung, S. Osher, and W. Yin
SIAM Journal on Mathematics of Data Science, Volume 6, Issue 3, Pages 627-648, September 2024
DOI: 10.1137/22M1544531
https://epubs.siam.org/doi/abs/10.1137/22M1544531?af=R
Published online 2024-07-11; issue date 2024-09-30
© 2024 Society for Industrial and Applied Mathematics

Abstract. Systems of competing agents can often be modeled as games. Assuming rationality, the most likely outcomes are given by an equilibrium, e.g., a Nash equilibrium. In many practical settings, games are influenced by context, i.e., additional data beyond the control of any agent (e.g., weather for traffic and fiscal policy for market economies). Often, the exact game mechanics are unknown, yet vast amounts of historical data consisting of (context, equilibrium) pairs are available, raising the possibility of learning a solver that predicts the equilibria given only the context. We introduce Nash fixed-point networks (NFPNs), a class of neural networks that naturally output equilibria. Crucially, NFPNs employ a constraint decoupling scheme to handle complicated agent action sets while avoiding expensive projections. Empirically, we find that NFPNs are compatible with the recently developed Jacobian-free backpropagation technique for training implicit networks, making them significantly faster and easier to train than prior models. Our experiments show that NFPNs are capable of scaling to problems orders of magnitude larger than existing learned game solvers. All code is available online.
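The idea of "outputting an equilibrium as a fixed point" can be sketched with a plain projected iteration (our simplification, not the authors' three-operator scheme or learned model): for a strongly monotone game operator F(x; d) = Ax + d, where d plays the role of context, the equilibrium over a box of admissible actions is the fixed point of a projected gradient step.

```python
import numpy as np

def predict_equilibrium(A, d, lo=-1.0, hi=1.0, gamma=0.1, tol=1e-8, max_iter=10_000):
    """Iterate x <- proj_[lo,hi](x - gamma * (A x + d)) to a fixed point."""
    x = np.zeros(len(d))
    for _ in range(max_iter):
        x_new = np.clip(x - gamma * (A @ x + d), lo, hi)
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x

A = np.array([[2.0, 0.5],
              [0.5, 2.0]])     # positive definite, so F is strongly monotone
d = np.array([1.0, -3.0])      # "context" vector
x_star = predict_equilibrium(A, d)
```

At the fixed point, x_star satisfies the variational-inequality optimality condition x_star = proj(x_star - gamma F(x_star; d)). An NFPN-style model replaces the hand-specified A with learned components and differentiates through the fixed point to fit (context, equilibrium) data.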

ABBA Neural Networks: Coping with Positivity, Expressivity, and Robustness
Ana Neacşu, Jean-Christophe Pesquet, Vlad Vasilescu, and Corneliu Burileanu
SIAM Journal on Mathematics of Data Science, Volume 6, Issue 3, Pages 649-678, September 2024
DOI: 10.1137/23M1589591
https://epubs.siam.org/doi/abs/10.1137/23M1589591?af=R
Published online 2024-07-15; issue date 2024-09-30
© 2024 Society for Industrial and Applied Mathematics

Abstract. We introduce ABBA networks, a novel class of (almost) nonnegative neural networks, which are shown to possess a series of appealing properties. In particular, we demonstrate that these networks are universal approximators while enjoying the advantages of nonnegative weighted networks. We derive tight Lipschitz bounds in both the fully connected and convolutional cases. We propose a strategy for designing ABBA nets that are robust against adversarial attacks, by finely controlling the Lipschitz constant of the network during the training phase. We show that our method outperforms other state-of-the-art defenses against adversarial white-box attackers. Experiments are performed on image classification tasks on four benchmark datasets.

Memory Capacity of Two Layer Neural Networks with Smooth Activations
Liam Madden and Christos Thrampoulidis
SIAM Journal on Mathematics of Data Science, Volume 6, Issue 3, Pages 679-702, September 2024
DOI: 10.1137/23M1599355
https://epubs.siam.org/doi/abs/10.1137/23M1599355?af=R
Published online 2024-07-16; issue date 2024-09-30
© 2024 Society for Industrial and Applied Mathematics

Abstract. Determining the memory capacity of two layer neural networks with [math] hidden neurons and input dimension [math] (i.e., [math] total trainable parameters), which refers to the largest size of general data the network can memorize, is a fundamental machine learning question. For activations that are real analytic at a point and, if restricting to a polynomial there, have sufficiently high degree, we establish a lower bound of [math] and optimality up to a factor of approximately 2. All practical activations, such as sigmoids, Heaviside, and the rectified linear unit (ReLU), are real analytic at a point. Furthermore, the degree condition is mild, requiring, for example, that [math] if the activation is [math]. Analogous prior results were limited to Heaviside and ReLU activations; our result covers almost everything else. In order to analyze general activations, we derive the precise generic rank of the network's Jacobian, which can be written in terms of Hadamard powers and the Khatri–Rao product. Our analysis extends classical linear algebraic facts about the rank of Hadamard powers. Overall, our approach differs from prior works on memory capacity and holds promise for extending to deeper models and other architectures.
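The central object here, the Jacobian of a two-layer network with respect to its parameters, is easy to form numerically. A small sanity check (ours, not a proof, and with sizes chosen for illustration): for f(x) = sum_j a_j s(w_j . x) with a smooth activation like tanh, the Jacobian at random data and parameters is an n x (md + m) matrix, and at these generic points its numerical rank is min(n, md + m).

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, d = 12, 3, 4                 # n data points, m hidden neurons, input dim d
X = rng.normal(size=(n, d))
W = rng.normal(size=(m, d))        # first-layer weights
a = rng.normal(size=m)             # second-layer weights

pre = X @ W.T                      # (n, m) preactivations
s = np.tanh(pre)
s_prime = 1.0 - s ** 2             # tanh'(t) = 1 - tanh(t)**2

# d f(x_i) / d w_j = a_j * s'(w_j . x_i) * x_i ;  d f(x_i) / d a_j = s(w_j . x_i)
J_w = (a * s_prime)[:, :, None] * X[:, None, :]          # (n, m, d)
J = np.concatenate([J_w.reshape(n, m * d), s], axis=1)   # (n, m*d + m)
rank = np.linalg.matrix_rank(J)
```

Full generic rank of this Jacobian is what lets the network interpolate generic data of the corresponding size, which is the mechanism behind the memory-capacity lower bound.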

Spectral Triadic Decompositions of Real-World Networks
Sabyasachi Basu, Suman Kalyan Bera, and C. Seshadhri
SIAM Journal on Mathematics of Data Science, Volume 6, Issue 3, Pages 703-730, September 2024
DOI: 10.1137/23M1586926
https://epubs.siam.org/doi/abs/10.1137/23M1586926?af=R
Published online 2024-08-06; issue date 2024-09-30
© 2024 SIAM. Published by SIAM under the terms of the Creative Commons 4.0 license

Abstract. A fundamental problem in mathematics and network analysis is to find conditions under which a graph can be partitioned into smaller pieces. A ubiquitous tool for this partitioning is the Fiedler vector or discrete Cheeger inequality. These results relate the graph spectrum (eigenvalues of the normalized adjacency matrix) to the ability to break a graph into two pieces, with few edge deletions. An entire subfield of mathematics, called spectral graph theory, has emerged from these results. Yet these results do not say anything about the rich community structure exhibited by real-world networks, which typically have a significant fraction of edges contained in numerous densely clustered blocks. Inspired by the properties of real-world networks, we discover a new spectral condition that relates eigenvalue powers to a network decomposition into densely clustered blocks. We call this the spectral triadic decomposition. Our relationship exactly predicts the existence of community structure, as commonly seen in real networked data. Our proof provides an efficient algorithm to produce the spectral triadic decomposition. We observe on numerous social, coauthorship, and citation network datasets that these decompositions have significant correlation with semantically meaningful communities.
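For readers unfamiliar with the baseline the abstract contrasts against: the Fiedler vector is the eigenvector of the graph Laplacian for its second-smallest eigenvalue, and its sign pattern suggests a two-way cut. A toy demonstration (ours, using the combinatorial rather than normalized Laplacian): two triangles joined by a single bridge edge, where the sign pattern recovers the two triangles.

```python
import numpy as np

# Two triangles {0,1,2} and {3,4,5} joined by the bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
n = 6
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

deg = A.sum(axis=1)
L = np.diag(deg) - A                 # combinatorial graph Laplacian
vals, vecs = np.linalg.eigh(L)       # eigenvalues in ascending order
fiedler = vecs[:, 1]                 # eigenvector of the second-smallest eigenvalue
side = fiedler > 0                   # sign pattern proposes a two-way partition
```

This gives only a single bipartition with few cut edges; the spectral triadic decomposition instead targets the many densely clustered blocks seen in real networks.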

Optimal Dorfman Group Testing for Symmetric Distributions
https://epubs.siam.org/doi/abs/10.1137/23M1595138?af=R
SIAM Journal on Mathematics of Data Science, <a href="https://epubs.siam.org/toc/sjmdaq/6/3">Volume 6, Issue 3</a>, Page 731760, September 2024. <br/> Abstract.We study Dorfman’s classical group testing protocol in a novel setting where individual specimen statuses are modeled as exchangeable random variables. We are motivated by infectious disease screening. In that case, specimens which arrive together for testing often originate from the same community and so their statuses may exhibit positive correlation. Dorfman’s protocol screens a population of [math] specimens for a binary trait by partitioning it into nonoverlapping groups, testing these, and only individually retesting the specimens of each positive group. The partition is chosen to minimize the expected number of tests under a probabilistic model of specimen statuses. We relax the typical assumption that these are independent and identically distributed and instead model them as exchangeable random variables. In this case, their joint distribution is symmetric in the sense that it is invariant under permutations. We give a characterization of such distributions in terms of a function [math] where [math] is the marginal probability that any group of size [math] tests negative. We use this interpretable representation to show that the set partitioning problem arising in Dorfman’s protocol can be reduced to an integer partitioning problem and efficiently solved. We apply these tools to an empirical dataset from the COVID19 pandemic. The methodology helps explain the unexpectedly high empirical efficiency reported by the original investigators.
SIAM Journal on Mathematics of Data Science, Volume 6, Issue 3, Page 731760, September 2024. <br/> Abstract.We study Dorfman’s classical group testing protocol in a novel setting where individual specimen statuses are modeled as exchangeable random variables. We are motivated by infectious disease screening. In that case, specimens which arrive together for testing often originate from the same community and so their statuses may exhibit positive correlation. Dorfman’s protocol screens a population of [math] specimens for a binary trait by partitioning it into nonoverlapping groups, testing these, and only individually retesting the specimens of each positive group. The partition is chosen to minimize the expected number of tests under a probabilistic model of specimen statuses. We relax the typical assumption that these are independent and identically distributed and instead model them as exchangeable random variables. In this case, their joint distribution is symmetric in the sense that it is invariant under permutations. We give a characterization of such distributions in terms of a function [math] where [math] is the marginal probability that any group of size [math] tests negative. We use this interpretable representation to show that the set partitioning problem arising in Dorfman’s protocol can be reduced to an integer partitioning problem and efficiently solved. We apply these tools to an empirical dataset from the COVID19 pandemic. The methodology helps explain the unexpectedly high empirical efficiency reported by the original investigators. <p><img src="https://epubs.siam.org/na101/home/literatum/publisher/siam/journals/covergifs/sjmdaq/cover.jpg" alttext="cover image"/></p>
Optimal Dorfman Group Testing for Symmetric Distributions
10.1137/23M1595138
SIAM Journal on Mathematics of Data Science
2024-08-09T07:00:00Z
© 2024 Society for Industrial and Applied Mathematics
Nicholas C. Landolfi
Sanjay Lall
Optimal Dorfman Group Testing for Symmetric Distributions
6
3
731
760
2024-09-30T07:00:00Z
2024-09-30T07:00:00Z
10.1137/23M1595138
https://epubs.siam.org/doi/abs/10.1137/23M1595138?af=R
© 2024 Society for Industrial and Applied Mathematics

New Equivalences between Interpolation and SVMs: Kernels and Structured Features
https://epubs.siam.org/doi/abs/10.1137/23M1568764?af=R
SIAM Journal on Mathematics of Data Science, <a href="https://epubs.siam.org/toc/sjmdaq/6/3">Volume 6, Issue 3</a>, Page 761–787, September 2024. <br/> Abstract. The support vector machine (SVM) is a supervised learning algorithm that finds a maximum-margin linear classifier, often after mapping the data to a high-dimensional feature space via the kernel trick. Recent work has demonstrated that in certain sufficiently overparameterized settings, the SVM decision function coincides exactly with the minimum-norm label interpolant. This phenomenon of support vector proliferation (SVP) is especially interesting because it allows us to understand SVM performance by leveraging recent analyses of harmless interpolation in linear and kernel models. However, previous work on SVP has made restrictive assumptions on the data/feature distribution and spectrum. In this paper, we present a new and flexible analysis framework for proving SVP in an arbitrary reproducing kernel Hilbert space with a flexible class of generative models for the labels. We present conditions for SVP for features in the families of general bounded orthonormal systems (e.g., Fourier features) and independent sub-Gaussian features. In both cases, we show that SVP occurs in many interesting settings not covered by prior work, and we leverage these results to prove novel generalization results for kernel SVM classification.
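The coincidence of SVM and interpolant can be probed numerically. A dual characterization from the SVP literature the abstract builds on says the hard-margin SVM solution equals the minimum-norm interpolant X^T (X X^T)^{-1} y exactly when every dual coefficient shares the sign of its label; the Gaussian-feature setup below is an illustrative stand-in for the independent sub-Gaussian features treated in the paper, and the function name is ours.

```python
import numpy as np

def svp_holds(X, y):
    # The min-norm interpolant w = X^T alpha with alpha = (X X^T)^{-1} y
    # already meets every margin constraint with equality (y_i w.x_i = 1).
    # It is therefore the SVM solution, with all points as support vectors,
    # precisely when y_i * alpha_i > 0 for every training point.
    alpha = np.linalg.solve(X @ X.T, y)
    return bool(np.all(y * alpha > 0))

rng = np.random.default_rng(0)
n, d = 5, 2000                    # heavily overparameterized regime, d >> n
X = rng.standard_normal((n, d))   # stand-in for independent sub-Gaussian features
y = rng.choice([-1.0, 1.0], size=n)
print(svp_holds(X, y))
```

In this regime the Gram matrix X X^T concentrates near d times the identity, so each y_i * alpha_i is close to 1/d and SVP holds with high probability, matching the overparameterized picture the abstract describes.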
New Equivalences between Interpolation and SVMs: Kernels and Structured Features
10.1137/23M1568764
SIAM Journal on Mathematics of Data Science
2024-08-14T07:00:00Z
© 2024 Society for Industrial and Applied Mathematics
Chiraag Kaushik
Andrew D. McRae
Mark Davenport
Vidya Muthukumar
New Equivalences between Interpolation and SVMs: Kernels and Structured Features
6
3
761
787
2024-09-30T07:00:00Z
2024-09-30T07:00:00Z
10.1137/23M1568764
https://epubs.siam.org/doi/abs/10.1137/23M1568764?af=R
© 2024 Society for Industrial and Applied Mathematics

Improving the Accuracy-Robustness Trade-Off of Classifiers via Adaptive Smoothing
https://epubs.siam.org/doi/abs/10.1137/23M1564560?af=R
SIAM Journal on Mathematics of Data Science, <a href="https://epubs.siam.org/toc/sjmdaq/6/3">Volume 6, Issue 3</a>, Page 788–814, September 2024. <br/> Abstract. While prior research has proposed a plethora of methods that build neural classifiers robust against adversarial attacks, practitioners are still reluctant to adopt them due to their unacceptably severe clean accuracy penalties. Real-world services based on neural networks are thus still unsafe. This paper significantly alleviates the accuracy-robustness trade-off by mixing the output probabilities of a standard classifier and a robust classifier, where the standard network is optimized for clean accuracy and is not robust in general. We show that the robust base classifier’s confidence difference for correct and incorrect examples is the key to this improvement. In addition to providing empirical evidence, we theoretically certify the robustness of the mixed classifier under realistic assumptions. We then adapt an adversarial input detector into a mixing network that adaptively adjusts the mixture of the two base models, further reducing the accuracy penalty of achieving robustness. The proposed flexible mixture-of-experts framework, termed “adaptive smoothing,” works in conjunction with existing or even future methods that improve clean accuracy, robustness, or adversary detection. We use strong attack methods, including AutoAttack and adaptive attacks, to evaluate our models’ robustness. On the CIFAR-100 dataset, we achieve an [math] clean accuracy while maintaining a [math] [math]AutoAttacked ([math]) accuracy, becoming the second most robust method on the RobustBench benchmark as of submission, while improving the clean accuracy by 10 percentage points over all listed models. Code implementation is available at https://github.com/Bai-YT/AdaptiveSmoothing.
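The probability-mixing idea at the core of the abstract can be sketched in a few lines. The convex combination of class probabilities is as described; the sigmoid of a hypothetical detector score is only a stand-in for the paper's learned mixing network, and all names and numbers below are illustrative.

```python
import math

def mix_probs(p_std, p_rob, lam):
    # Convex combination of the two base models' output probabilities;
    # lam = 0 recovers the clean-accurate standard network, lam = 1 the robust one.
    return [(1.0 - lam) * s + lam * r for s, r in zip(p_std, p_rob)]

def adaptive_lam(detector_score):
    # Stand-in for the adaptive mixing network: a sigmoid of a hypothetical
    # attack-detector score (more positive = more likely adversarial input).
    return 1.0 / (1.0 + math.exp(-detector_score))

p_std = [0.85, 0.10, 0.05]   # confident standard classifier
p_rob = [0.40, 0.35, 0.25]   # robust classifier, typically less confident
lam = adaptive_lam(2.0)      # detector suspects an attack, so lean robust
mixed = mix_probs(p_std, p_rob, lam)
print(mixed)
```

Because both inputs are probability vectors, the mixture is one as well; on inputs the detector deems clean, lam stays small and the accurate standard model dominates, which is how the trade-off is softened.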
Improving the Accuracy-Robustness Trade-Off of Classifiers via Adaptive Smoothing
10.1137/23M1564560
SIAM Journal on Mathematics of Data Science
2024-08-14T07:00:00Z
© 2024 Society for Industrial and Applied Mathematics
Yatong Bai
Brendon G. Anderson
Aerin Kim
Somayeh Sojoudi
Improving the Accuracy-Robustness Trade-Off of Classifiers via Adaptive Smoothing
6
3
788
814
2024-09-30T07:00:00Z
2024-09-30T07:00:00Z
10.1137/23M1564560
https://epubs.siam.org/doi/abs/10.1137/23M1564560?af=R
© 2024 Society for Industrial and Applied Mathematics

Topological Fingerprints for Audio Identification
https://epubs.siam.org/doi/abs/10.1137/23M1605090?af=R
SIAM Journal on Mathematics of Data Science, <a href="https://epubs.siam.org/toc/sjmdaq/6/3">Volume 6, Issue 3</a>, Page 815–841, September 2024. <br/> Abstract. We present a topological audio fingerprinting approach for robustly identifying duplicate audio tracks. Our method applies persistent homology on local spectral decompositions of audio signals, using filtered cubical complexes computed from mel spectrograms. By encoding the audio content in terms of local Betti curves, our topological audio fingerprints enable accurate detection of time-aligned audio matchings. Experimental results demonstrate the accuracy of our algorithm in the detection of tracks with the same audio content, even when subjected to various obfuscations. Our approach outperforms existing methods in scenarios involving topological distortions, such as time stretching and pitch shifting.
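To see what a Betti curve records, here is a deliberately simplified 1-D analogue: the paper filters cubical complexes built from 2-D mel spectrograms, whereas the sketch below just counts connected components (Betti-0) of sublevel sets of a 1-D signal across thresholds. The function name and example values are ours.

```python
def betti0_curve(values, thresholds):
    # Number of connected components of the sublevel set {i : values[i] <= t}
    # of a 1-D signal, for each threshold t. A new component starts wherever
    # an included index is not preceded by another included index.
    curve = []
    for t in thresholds:
        mask = [v <= t for v in values]
        comps = sum(1 for i, m in enumerate(mask)
                    if m and (i == 0 or not mask[i - 1]))
        curve.append(comps)
    return curve

sig = [3, 1, 4, 1, 5, 9, 2, 6]   # stand-in for one spectrogram column
print(betti0_curve(sig, [0, 1, 4, 9]))
```

The resulting vector of component counts is the kind of compact, permutation- and deformation-stable summary that makes such fingerprints robust to time stretching and pitch shifting.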
Topological Fingerprints for Audio Identification
10.1137/23M1605090
SIAM Journal on Mathematics of Data Science
2024-08-30T07:00:00Z
© 2024 Society for Industrial and Applied Mathematics
Wojciech Reise
Ximena Fernández
Maria Dominguez
Heather A. Harrington
Mariano Beguerisse-Díaz
Topological Fingerprints for Audio Identification
6
3
815
841
2024-09-30T07:00:00Z
2024-09-30T07:00:00Z
10.1137/23M1605090
https://epubs.siam.org/doi/abs/10.1137/23M1605090?af=R
© 2024 Society for Industrial and Applied Mathematics

Corrigendum: Post-training Quantization for Neural Networks with Provable Guarantees
https://epubs.siam.org/doi/abs/10.1137/24M1635582?af=R
SIAM Journal on Mathematics of Data Science, <a href="https://epubs.siam.org/toc/sjmdaq/6/3">Volume 6, Issue 3</a>, Page 842–846, September 2024. <br/> Abstract. We correct an error in Lemma A.5 of [J. Zhang, Y. Zhou, and R. Saab, SIAM J. Math. Data Sci., 5 (2023), pp. 373–399]; this lemma propagates to the proofs of Theorem 3.1, Theorem 3.2, and Theorem C.2. We restate these theorems and give their corrected proofs. The main results in the original paper still hold, with minor modifications.
Corrigendum: Post-training Quantization for Neural Networks with Provable Guarantees
10.1137/24M1635582
SIAM Journal on Mathematics of Data Science
2024-09-26T07:00:00Z
© 2024 Society for Industrial and Applied Mathematics
Jinjie Zhang
Yixuan Zhou
Rayan Saab
Corrigendum: Post-training Quantization for Neural Networks with Provable Guarantees
6
3
842
846
2024-09-30T07:00:00Z
2024-09-30T07:00:00Z
10.1137/24M1635582
https://epubs.siam.org/doi/abs/10.1137/24M1635582?af=R
© 2024 Society for Industrial and Applied Mathematics