7  Persistent homology

“The essence of this architecture is movement synchronized towards a precise objective. We observe a fraction of the process, like hearing the vibration of a single string in an orchestra of supergiants. We know, but cannot grasp, that above and below, beyond the limits of perception or imagination, thousands and millions of simultaneous transformations are at work, interlinked like a musical score by mathematical counterpoint. It has been described as a symphony in geometry, but we lack the ears to hear it.”
— Stanisław Lem, in “Solaris”

In the chapter on simplicial complexes, we saw that we can build a Vietoris-Rips complex from a point cloud by connecting points within distance \(\epsilon\). But we faced a fundamental problem: how to choose \(\epsilon\)?

Too small, and we see only isolated points. Too large, and everything collapses into a single blob. The brilliant insight of persistent homology is: don’t choose — look at all scales simultaneously and track which topological features persist.

7.1 Filtrations

Definition 7.1 A filtration is a nested sequence of simplicial complexes:

\[ \emptyset = K_0 \subseteq K_1 \subseteq K_2 \subseteq \cdots \subseteq K_m = K. \]

Each \(K_i\) is called a level of the filtration, and the parameter controlling the nesting is called the filtration value.

7.1.1 The Vietoris-Rips filtration

The most common filtration in TDA is the Vietoris-Rips filtration.

Definition 7.2 Let \((X,d)\) be a finite metric space and \(\epsilon \geq 0\). The Vietoris-Rips complex at scale \(\epsilon\) is

\[ \mathrm{VR}(X,\epsilon)=\left\{\sigma \subseteq X \;\middle|\; d(x,x') \leq \epsilon \text{ for all } x,x' \in \sigma \right\}. \]

In words: a finite subset of points is a simplex if all pairwise distances are at most \(\epsilon\).

Why this gives a filtration:

  • If \(\epsilon \leq \epsilon'\), then \(\mathrm{VR}(X,\epsilon) \subseteq \mathrm{VR}(X,\epsilon')\).
  • So increasing \(\epsilon\) can only add simplices, never remove them.

For a finite point cloud, the complex changes only at finitely many scales (critical values from pairwise distances). So we can write:

\[ K_0=\mathrm{VR}(X,\epsilon_0)\subseteq K_1=\mathrm{VR}(X,\epsilon_1)\subseteq \cdots \subseteq K_m=\mathrm{VR}(X,\epsilon_m). \]

Equivalent simplex-level view: each simplex \(\sigma\) appears at

\[ \mathrm{filt}(\sigma)=\max_{x,x' \in \sigma} d(x,x'). \]

Then \(K_t=\{\sigma \mid \mathrm{filt}(\sigma)\leq t\}\). Faces always appear no later than the simplex itself, so this really is a filtration.
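The simplex-level view is easy to compute directly. A minimal sketch in Python (the point cloud is illustrative, and `filt` is our own helper, not a library function):

```python
from itertools import combinations
from math import dist

# Toy point cloud (illustrative data): the corners of a unit square
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

def filt(sigma):
    """Filtration value of a simplex: the largest pairwise distance
    among its vertices (0 for a single vertex)."""
    return max((dist(points[i], points[j]) for i, j in combinations(sigma, 2)),
               default=0.0)

# All simplices up to dimension 2, with their filtration values
simplices = [s for d in (1, 2, 3) for s in combinations(range(len(points)), d)]
values = {s: filt(s) for s in simplices}

# Every face appears no later than the simplex itself, so sorting
# the simplices by filtration value yields a valid filtration.
assert all(values[f] <= values[s]
           for s in simplices for f in combinations(s, len(s) - 1) if f)
```

For the square, vertices enter at 0, the four side edges at 1, and the diagonals and triangles at \(\sqrt{2}\), matching \(\mathrm{filt}(\sigma)=\max_{x,x'\in\sigma} d(x,x')\).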

This is the Vietoris-Rips filtration. Operationally:

  1. Start with \(\epsilon = 0\): the complex \(\mathrm{VR}(X, 0)\) has only vertices (isolated points).
  2. Increase \(\epsilon\): edges appear between nearby points, then triangles, tetrahedra, etc.
  3. Eventually, for large enough \(\epsilon\), everything is connected into one giant simplex.

As \(\epsilon\) grows, topological features are born and die:

  • Birth: A new connected component appears (at \(\epsilon = 0\), every point is its own component), or a new loop or cavity forms.
  • Death: A component merges with another (two clusters become one), or a loop gets filled in by a triangle.

The persistence of a feature is its lifetime: \(\text{death} - \text{birth}\).
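For \(k=0\) this birth/death bookkeeping amounts to a union-find sweep over the edges sorted by length: every union merges two components, i.e. kills one of them. A hedged sketch (the point cloud is made up for illustration):

```python
from itertools import combinations
from math import dist

points = [(0.0, 0.0), (0.1, 0.0), (3.0, 0.0), (3.1, 0.0)]  # two tight pairs

parent = list(range(len(points)))
def find(i):
    while parent[i] != i:
        parent[i] = parent[parent[i]]  # path halving
        i = parent[i]
    return i

# Sweep edges in order of increasing length; each union kills one component.
edges = sorted((dist(points[i], points[j]), i, j)
               for i, j in combinations(range(len(points)), 2))
components = len(points)  # at epsilon = 0, each point is its own component
history = [(0.0, components)]
for eps, i, j in edges:
    ri, rj = find(i), find(j)
    if ri != rj:
        parent[ri] = rj            # two components merge: one dies
        components -= 1
        history.append((eps, components))
```

Here `history` records the component count at each critical scale: four components at \(\epsilon=0\), two once each tight pair connects, and one after the big merge.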

7.1.2 From filtrations to persistent homology

The filtration is geometric/combinatorial data. Persistent homology is what we get after applying homology to it.

Fix a homology dimension \(k\) and a field \(\mathbb{F}\). For each inclusion \(K_i \hookrightarrow K_j\), functoriality of homology gives a linear map:

\[ H_k(K_i;\mathbb{F}) \longrightarrow H_k(K_j;\mathbb{F}). \]

So a filtration becomes a sequence of vector spaces and linear maps:

\[ H_k(K_0) \to H_k(K_1) \to \cdots \to H_k(K_m). \]

This is the \(k\)-th persistence module (or persistence vector space).

Intuitively:

  • A generator of \(H_k(K_i)\) is a \(k\)-dimensional cycle class (component for \(k=0\), loop for \(k=1\), cavity for \(k=2\), …).
  • As \(i\) increases, the image of that class can survive, merge with others, or become trivial.
  • The persistent homology group between levels \(i \leq j\) is

\[ H_k^{i,j} := \mathrm{im}\!\left(H_k(K_i) \to H_k(K_j)\right), \]

which captures classes born by time \(i\) that are still alive at time \(j\).
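Concretely, \(\dim H_k^{i,j}\) is the rank of the composite linear map. As a toy illustration (the maps below are made up for a three-point example in which \(\dim H_0\) drops \(3 \to 2 \to 1\)), we can represent the maps as matrices over \(\mathbb{Q}\) and compute the rank of their composite:

```python
from fractions import Fraction

def rank(M):
    """Rank of a small matrix, via Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(M[0]) if M else 0):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

# H_0(K_0) -> H_0(K_1): three components; the last two merge into one.
f01 = [[1, 0, 0],
       [0, 1, 1]]
# H_0(K_1) -> H_0(K_2): the two remaining components merge.
f12 = [[1, 1]]

# dim H_0^{0,2} = rank of the composite H_0(K_0) -> H_0(K_2)
print(rank(matmul(f12, f01)))
```

The composite has rank 1: of the three classes born at level 0, a single component class is still alive at level 2.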

7.1.3 Sublevel set filtrations and simplicial filtrations

Another important type of filtration comes from a function \(f: X \to \mathbb{R}\). The sublevel set at threshold \(t\) is:

\[ X_t = \{x \in X \mid f(x) \leq t\}. \]

As \(t\) increases, these sets are nested: \(X_s \subseteq X_t\) for \(s \leq t\).

But for computation we usually need simplicial complexes. A standard conversion is:

  1. Start with a simplicial complex \(K\) (for example, from a triangulation or from a Rips complex).
  2. Give each vertex \(v\) a value \(f(v)\).
  3. Extend to each simplex by

\[ F(\sigma) := \max_{v \in \sigma} f(v). \]

  4. Define

\[ K_t := \{\sigma \in K \mid F(\sigma) \leq t\}. \]

Then \((K_t)_{t \in \mathbb{R}}\) is a simplicial filtration (often called a lower-star filtration).

Why is this valid? If \(\tau \subseteq \sigma\), then \(F(\tau) \leq F(\sigma)\), so every face of a simplex in \(K_t\) is also in \(K_t\). Hence each \(K_t\) is a simplicial complex, and \(K_s \subseteq K_t\) for \(s \leq t\).

This is how sublevel-set ideas become computable persistent homology.
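The max-extension in the steps above can be sketched directly. Here `K` is a small illustrative complex (a filled triangle) with made-up vertex values:

```python
from itertools import combinations

# A filled triangle: vertices, edges, and one 2-simplex
K = [(0,), (1,), (2,), (0, 1), (0, 2), (1, 2), (0, 1, 2)]
f = {0: 0.3, 1: 0.1, 2: 0.7}   # illustrative vertex values

def F(sigma):
    """Lower-star extension: a simplex enters when its last vertex does."""
    return max(f[v] for v in sigma)

def K_t(t):
    """Sublevel complex at threshold t."""
    return [s for s in K if F(s) <= t]

# A face has a subset of the vertices, hence no larger max,
# so every K_t is closed under taking faces:
for s in K:
    for face in combinations(s, len(s) - 1):
        if face:
            assert F(face) <= F(s)
```

For instance, `K_t(0.3)` contains only the vertices 0 and 1 and the edge between them; at `t = 0.7` the whole triangle is present.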

7.2 Persistence diagrams

We can summarize the birth and death of every topological feature in a single picture.

Definition 7.3 A persistence diagram is a multiset of points \((b_i, d_i) \in \mathbb{R}^2\), where \(b_i\) is the birth time and \(d_i\) is the death time of the \(i\)-th feature. Since \(b_i \leq d_i\), all points lie on or above the diagonal \(y = x\).

Points far from the diagonal represent features with long persistence — these are the “real” topological features. Points near the diagonal are short-lived and likely correspond to noise.

A persistence diagram for dimension \(k\) captures:

  • \(k = 0\): Connected components. A point \((b, d)\) means a component was born at scale \(b\) and merged with another at scale \(d\). The component that never dies (the last one standing) has \(d = \infty\).
  • \(k = 1\): Loops. A loop appeared at scale \(b\) and was filled in at scale \(d\).
  • \(k = 2\): Cavities. And so on for higher dimensions.
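In code, a diagram is just a multiset of pairs, and separating long-lived features from near-diagonal noise is one comparison per point. The diagram below is made up for illustration:

```python
# Illustrative diagram in dimension 1: (birth, death) pairs
diagram = [(0.1, 0.15), (0.2, 1.8), (0.3, 0.35), (0.5, 2.1)]

def persistence(point):
    b, d = point
    return d - b

def split_by_persistence(dgm, threshold):
    """Points far from the diagonal (long-lived) vs. near-diagonal noise."""
    signal = [p for p in dgm if persistence(p) >= threshold]
    noise = [p for p in dgm if persistence(p) < threshold]
    return signal, noise

signal, noise = split_by_persistence(diagram, threshold=0.5)
print(signal)  # the two long-lived loops
```

Choosing the threshold is itself a scale choice, but unlike choosing \(\epsilon\), it is made after seeing all scales at once.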

7.3 Barcodes

An equivalent representation is the barcode: each feature becomes an interval (drawn as a horizontal bar) from \(b_i\) to \(d_i\).

Each bar corresponds to one generator/class in the persistence module. Birth means that class first appears; death means it becomes trivial or merges into an older class.

The key algebraic theorem (the Zomorodian-Carlsson structure theorem; extended in full generality by Crawley-Boevey to pointwise finite-dimensional modules) says that any such persistence module \(\mathcal{V}\) decomposes as:

\[ \mathcal{V} \cong \bigoplus_{\ell} I[b_\ell,d_\ell), \]

where \(I[b_\ell,d_\ell)\) is an interval module. So every persistence module decomposes into interval pieces, and those intervals are exactly the barcode.

So:

  • Barcode and persistence diagram carry the same information (up to the usual conventions for infinite intervals).
  • Diagram: good for geometry and distances between summaries.
  • Barcode: good for interval counting and reading lifetimes directly.

The barcode makes it easy to see:

  • How many features exist at any given scale (count the bars crossing a vertical line)
  • Which features are long-lived (long bars) vs. noise (short bars)
  • Natural thresholds for choosing a scale parameter
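Counting the bars that cross a vertical line is a one-liner; at scale \(t\), the count is the Betti number \(\beta_k\) of \(K_t\). Sketch with made-up intervals (using the half-open convention \([b, d)\) from the structure theorem):

```python
from math import inf

# Illustrative k = 0 barcode: every component is born at 0;
# one essential bar never dies.
barcode = [(0.0, 0.4), (0.0, 0.4), (0.0, 1.7), (0.0, inf)]

def bars_at(bars, t):
    """Number of bars alive at scale t (the k-th Betti number of K_t)."""
    return sum(1 for b, d in bars if b <= t < d)

print([bars_at(barcode, t) for t in (0.0, 1.0, 2.0)])
```

Sweeping `t` across the barcode reproduces the component counts: 4, then 2, then 1.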

7.4 Bottleneck distance

To compare two persistence diagrams \(D_1\) and \(D_2\), we use the bottleneck distance.

We consider bijections \(\gamma: D_1 \cup \Delta \to D_2 \cup \Delta\) between the two diagrams after adding infinitely many copies of the diagonal \(y=x\) to each (so points can be matched to the diagonal, interpreted as “feature created/destroyed as noise”). The cost of a matching \(\gamma\) is the largest \(L^\infty\) move:

\[ \mathrm{cost}(\gamma) = \sup_{p \in D_1 \cup \Delta} \|p-\gamma(p)\|_\infty, \]

where \(\Delta=\{(x,x)\mid x\in\mathbb{R}\}\) is the diagonal.

Then

\[ d_B(D_1,D_2)=\inf_{\gamma}\mathrm{cost}(\gamma). \]

Intuition:

  • It is the minimum worst-case displacement needed to transform one diagram into the other.
  • Points near the diagonal are cheap to match to the diagonal, so short-lived features are treated as low-cost noise.
  • Points far from the diagonal are expensive to remove, so long-persistent features dominate the distance.
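For tiny diagrams the definition can be evaluated by brute force: augment each diagram with one diagonal slot per point of the other, then minimize the worst \(L^\infty\) move over all bijections. A sketch (factorial cost, toy sizes only; real implementations use geometric matching algorithms instead):

```python
from itertools import permutations

def bottleneck(D1, D2):
    """Exact bottleneck distance between two small diagrams, by brute
    force over all matchings in the diagonal-augmented bipartite graph."""
    A = [(p, False) for p in D1] + [(None, True)] * len(D2)
    B = [(q, False) for q in D2] + [(None, True)] * len(D1)

    def cost(p, p_diag, q, q_diag):
        if p_diag and q_diag:
            return 0.0                  # diagonal matched to diagonal: free
        if p_diag:
            return (q[1] - q[0]) / 2    # L-inf distance of q to the diagonal
        if q_diag:
            return (p[1] - p[0]) / 2
        return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

    return min(max((cost(*A[i], *B[j]) for i, j in enumerate(perm)),
                   default=0.0)
               for perm in permutations(range(len(B))))

# One long bar, slightly shifted, plus an extra short bar near the diagonal
print(bottleneck([(0.0, 10.0), (3.0, 3.4)], [(0.0, 9.0)]))
```

The long bars match each other at cost 1, and the short bar is pushed to the diagonal at cost 0.2, so the bottleneck distance is 1: the long-persistent feature dominates, exactly as the intuition above says.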

7.5 Clustering revisited: dendrograms as 0-dimensional persistence

Here’s a beautiful connection between classical clustering and persistent homology.

Consider agglomerative (hierarchical) clustering with single linkage: we start with \(n\) clusters (one per point) and repeatedly merge the two clusters whose closest points are nearest. The dendrogram records this merging process.

Now consider the 0-dimensional Vietoris-Rips persistence: as we increase \(\epsilon\), connected components merge. The birth of a component is at \(\epsilon = 0\) (each point starts as its own component), and the death is the scale at which it merges with another.

These are the same thing! The 0-dimensional persistence barcode is precisely the dendrogram rewritten in the language of topology. The “gap” in the dendrogram that tells you how many clusters to choose corresponds to the gap between long and short bars in the barcode.

This is the first example of how persistence provides a principled, multi-scale view of structure that classical methods handle in an ad hoc way.
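The correspondence is easy to see in code: a Kruskal-style union-find sweep over edges sorted by length is simultaneously single-linkage clustering and the 0-dimensional persistence computation. Sketch with an illustrative two-cluster point cloud:

```python
from itertools import combinations
from math import dist, inf

# Two well-separated clusters of two points each (illustrative)
points = [(0.0, 0.0), (0.2, 0.0), (5.0, 0.0), (5.2, 0.0)]

parent = list(range(len(points)))
def find(i):
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i

# Each dendrogram merge height is a death time in the 0-dim barcode.
deaths = []
edges = sorted((dist(points[i], points[j]), i, j)
               for i, j in combinations(range(len(points)), 2))
for eps, i, j in edges:
    ri, rj = find(i), find(j)
    if ri != rj:
        parent[ri] = rj
        deaths.append(eps)

barcode = [(0.0, d) for d in deaths] + [(0.0, inf)]  # one essential class
print(sorted(deaths))
```

The large gap between the two short deaths (about 0.2) and the last one (4.8) is exactly the dendrogram's "cut here for two clusters" signal, read off from the bar lengths.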

7.6 The stability theorem

One of the most important theoretical results in TDA is the stability theorem (Cohen-Steiner, Edelsbrunner, Harer; later generalized by Chazal, Cohen-Steiner, Glisse, Guibas, and Oudot):

If two datasets \(X\) and \(Y\) are “close” (in a precise sense), then their persistence diagrams are also close.

More formally, for sublevel-set filtrations of two (tame) functions \(f\) and \(g\) on the same space, the bottleneck distance between their persistence diagrams is bounded by the \(L^\infty\) distance between the functions:

\[ d_B(\text{Dgm}(f), \text{Dgm}(g)) \leq \|f - g\|_\infty. \]

This says that if the input changes a little (in sup norm), the diagram changes a little in bottleneck distance. So persistent homology is not only descriptive, but also quantitatively robust.

7.7 Summary

Persistent homology gives us:

  1. A multi-scale view of topological features (no need to choose a single \(\epsilon\))
  2. A way to distinguish signal from noise (long-lived features vs. short-lived ones)
  3. A stable summary of data shape (robust to perturbations)
  4. A unifying framework that connects clustering, homology, and data analysis

In the next chapter, we will compute persistent homology in Julia using Ripserer.jl and work with persistence diagrams using PersistenceDiagrams.jl.