using MetricSpaces
using MetricSpaces.Datasets
using TDAmapper
using TDAmapper.DomainCovers
using TDAmapper.Refiners
using TDAmapper.Nerves12 Mapper: the general case
“The real voyage of discovery consists not in seeking new landscapes, but in having new eyes.”
— Marcel Proust, in “In Search of Lost Time”
We have already seen two specific mapper-like constructions:
- classical Mapper, based on a filter function and overlapping intervals;
- Ball Mapper, based on balls around landmark points.
The local TDAmapper.jl API makes the general pattern explicit: a mapper algorithm is a combination of
- a cover of the data;
- a refinement rule for each cover element;
- a nerve construction that turns the refined cover into a graph.
The beauty of this decomposition is that each step can be changed independently. That gives us a whole family of Mapper-like algorithms rather than a single fixed recipe.
12.1 The three building blocks
12.1.1 Covering step
Examples of covers include:
- overlapping intervals in the image of a filter function;
- fixed-radius balls around landmarks;
- adaptive covers whose radius depends on local density.
In classical Mapper, the cover lives in the image of a filter function. In Ball Mapper, the cover lives directly in the metric space through balls around landmarks.
12.1.2 Refinement step
Once we have a raw cover, we may refine each element further:
DBscanclusters each cover element into connected dense pieces;Trivial()keeps each cover element as a single node.
This is where “connectedness” enters the discrete picture: in the classical Reeb-graph story we would take connected components, but in data analysis we often replace that step by a clustering method.
12.1.3 Nerve step
Finally, we build a graph from the refined cover:
SimpleNerve()adds an edge whenever two cover elements overlap;- more restrictive nerve constructions can require larger intersections or weighted overlaps.
So the full algorithm is: cover, refine, then take a nerve.
12.2 Creating a mapper directly
The generic mapper function takes one object for each of those stages:
X_demo = annulus(300)
L_demo = farthest_points_sample_ids(X_demo, 30)
M_demo = mapper(
X_demo,
EpsilonBall(X = X_demo, L = L_demo, epsilon = 0.2),
Trivial(),
SimpleNerve(),
)
M_demoMapper graph with 30 vertices and 29 edges
That is exactly the pattern Ball Mapper uses under the hood. Writing it this way makes it easier to see what can be customized and why.
12.3 Interpreting Mapper graphs
Whatever choices we make, the output is still a graph, and we need to know how to read it:
- nodes represent local groups of data points;
- edges connect groups that overlap;
- connected components suggest distinct macroscopic pieces of the data;
- loops suggest circular or toroidal structure;
- branches suggest flares, bifurcations, or tree-like behavior.
Coloring the nodes by a statistic of the underlying data often adds another layer of meaning to the graph.
12.4 Example: a noisy figure-eight
using CairoMakie
using TDAplotsX = add_noise(figure_eight(400; r = 1.0), 0.08)
metricspace_plot(X, markersize = 4)We now build two mapper graphs using the same landmarks but different radii:
L = farthest_points_sample_ids(X, 60)
M_small = mapper(
X,
EpsilonBall(X = X, L = L, epsilon = 0.25),
Trivial(),
SimpleNerve(),
)
mapper_plot(M_small, node_values = node_colors(M_small, first.(X)))M_large = mapper(
X,
EpsilonBall(X = X, L = L, epsilon = 0.4),
Trivial(),
SimpleNerve(),
)
mapper_plot(M_large, node_values = node_colors(M_large, first.(X)))With the smaller radius, the two loops of the figure-eight remain visible. With the larger radius, the graph becomes much coarser and starts to collapse the structure.
12.5 Choosing parameters
There is no universal recipe, but a few heuristics help:
- Coverage: every point should belong to at least one cover element.
- Overlap: if cover elements barely overlap, the graph becomes disconnected.
- Resolution: fine covers reveal more detail but also more noise.
- Stability: features that survive across a range of parameters are more trustworthy.
That last point is the mapper version of the persistence philosophy: do not trust a single parameter choice more than the structure that persists across several of them.