Diffusion condensation is a dynamic process that yields a sequence of multiscale data representations that aim to encode meaningful abstractions. It has proven effective for manifold learning, denoising, clustering, and visualization of high-dimensional data. Diffusion condensation is constructed as a time-inhomogeneous process where each step first computes a diffusion operator and then applies it to the data. We theoretically analyze the convergence and evolution of this process from geometric, spectral, and topological perspectives. From a geometric perspective, we obtain convergence bounds based on the smallest transition probability and the radius of the data, whereas from a spectral perspective, our bounds are based on the eigenspectrum of the diffusion kernel. Our spectral results are of particular interest since most of the literature on data diffusion is focused on homogeneous processes. From a topological perspective, we show that diffusion condensation generalizes centroid-based hierarchical clustering. We use this perspective to obtain a bound based on the number of data points, independent of their location. To understand the evolution of the data geometry beyond convergence, we use topological data analysis. We show that the condensation process itself defines an intrinsic condensation homology. We use this intrinsic topology, as well as the ambient persistent homology, of the condensation process to study how the data changes over diffusion time. We demonstrate both types of topological information in well-understood toy examples. Our work gives theoretical insight into the convergence of diffusion condensation and shows that it provides a link between topological and geometric data analysis.
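
The step structure described in the abstract (compute a diffusion operator from the current data, then apply it to the data) can be illustrated with a minimal sketch. The Gaussian kernel, the fixed bandwidth `epsilon`, the stopping tolerance `tol`, and the function names `diffusion_operator`, `diameter`, and `condense` are assumptions made for illustration; they are not specified in this text and are not necessarily the authors' implementation.

```python
import numpy as np


def diffusion_operator(X, epsilon):
    """Row-stochastic diffusion operator built from a Gaussian kernel (assumed kernel)."""
    sq = np.sum(X ** 2, axis=1)
    # Pairwise squared Euclidean distances, clipped at 0 to absorb round-off.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    K = np.exp(-d2 / epsilon)
    # Normalize each row so it is a probability distribution over the points.
    return K / K.sum(axis=1, keepdims=True)


def diameter(X):
    """Largest pairwise Euclidean distance in the point set."""
    diff = X[:, None, :] - X[None, :, :]
    return float(np.sqrt((diff ** 2).sum(axis=-1)).max())


def condense(X, epsilon=1.0, n_steps=50, tol=1e-4):
    """Run the condensation iteration and return the datasets X_0, X_1, ..."""
    X = np.asarray(X, dtype=float)
    history = [X.copy()]
    for _ in range(n_steps):
        # The operator is recomputed from the current points at every step,
        # which is what makes the process time-inhomogeneous.
        P = diffusion_operator(X, epsilon)
        X = P @ X  # each new point is a weighted average of the old points
        history.append(X.copy())
        if diameter(X) < tol:  # assumed stopping rule: points have collapsed
            break
    return history
```

Because every row of the operator is a probability vector, each updated point lies in the convex hull of the previous points, so the diameter of the data cannot grow from one step to the next. Calling `condense(points)` on an `(n, d)` array returns one array per diffusion time, which can be inspected to see how the points merge.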


Keywords

  1. diffusion
  2. time-inhomogeneous process
  3. topological data analysis
  4. persistent homology
  5. hierarchical clustering

MSC codes

  1. 57M50
  2. 57R40
  3. 62R40
  4. 37B25
  5. 68

Supplementary Materials

Index of Supplementary Materials
PLEASE NOTE: These supplementary files have not been peer-reviewed.
Title of paper: Time-Inhomogeneous Diffusion Geometry and Topology
Authors: Guillaume Huguet, Alexander Tong, Bastian Rieck, Jessie Huang, Manik Kuchroo, Matthew Hirn, Guy Wolf, and Smita Krishnaswamy
File: supplementary_material_dc.pdf
Type: PDF
Contents: additional proofs and brief review of relevant topological data analysis.


Information & Authors


Published In

SIAM Journal on Mathematics of Data Science
Pages: 346 - 372
ISSN (online): 2577-0187


Submitted: 28 March 2022
Accepted: 6 January 2023
Published online: 22 May 2023


Guillaume Huguet
Department of Mathematics and Statistics, Université de Montréal, Montréal, QC H3T 1J4, Canada, and Mila - Quebec AI Institute, Montréal, QC H2S 3H1, Canada.
Alexander Tong
Department of Computer Science and Operations Research, Université de Montréal, Montréal, QC H3T 1J4, Canada, and Mila - Quebec AI Institute, Montréal, QC H2S 3H1, Canada.
Bastian Rieck
Helmholtz Munich, Ingolstädter Landstr. 1, 85764 Neuherberg, Germany, and Technical University of Munich, Arcisstr. 21, 80333 Munich, Germany.
Jessie Huang
Departments of Computer Science and Genetics, Yale University, New Haven, CT 06520 USA.
Manik Kuchroo
Department of Neuroscience, Yale University, New Haven, CT 06520 USA.
Matthew Hirn
Departments of CMSE and Mathematics, Michigan State University, East Lansing, MI 48824 USA.
Guy Wolf
Corresponding coauthor. Department of Mathematics and Statistics, Université de Montréal, Montréal, QC H3T 1J4, Canada, and Mila - Quebec AI Institute, Montréal, QC H2S 3H1, Canada.
Smita Krishnaswamy
Corresponding coauthor. Departments of Computer Science and Genetics, Yale University, New Haven, CT 06520 USA, and Departments of CMSE and Mathematics, Michigan State University, East Lansing, MI 48824 USA.

Funding Information

Funding: The sixth author was partially supported by NSF grant DMS-1845856. The seventh author was partially funded by IVADO Professor funds, CIFAR AI Chair, and NSERC Discovery grant 03267. The sixth, seventh, and eighth authors were partially supported by NIH grant NIGMS-R01GM135929. The content provided here is solely the responsibility of the authors and does not necessarily represent the official views of the funding agencies.

