Hexbin Plot
Also known as: hexagonal binning, hex plot, hexbin map, hexagonal heatmap
Description
A hexbin plot addresses the overplotting problem that arises when scatterplots contain thousands or millions of points. Instead of rendering each individual point, the 2D plane is divided into a regular grid of hexagonal bins, and the number of data points falling within each hexagon is counted. The count (or an aggregate statistic like mean or median) is then encoded as the hexagon’s fill color, opacity, or even height in a 3D variant.
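Hexagons are preferred over squares for binning for three reasons. First, a hexagon is closer to a circle than a square is, so the distance from the center to the boundary varies less and each point is assigned to a center that is, on average, nearer. Second, each hexagon has six neighbors that all share an edge and sit at the same center-to-center distance, whereas a square has four edge neighbors and four more distant corner neighbors. Third, a hexagonal grid lacks the strong horizontal and vertical lines of a square grid, which reduces the visual banding that rectangular grids produce. Of the regular polygons that tile the plane, the hexagon is also the most compact for a given area, so a hexagonal grid tends to represent density patterns more faithfully than a square grid with the same number of bins.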
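The compactness argument can be checked with a little arithmetic. The following is a minimal sketch, assuming a regular hexagon and a square of equal area (the variable names are illustrative only); it compares how far the farthest and nearest boundary points lie from the bin center.

```javascript
// Compare a regular hexagon and a square of equal area.
const area = 1;

// Farthest boundary point from the center (a corner).
// Hexagon: area = (3 * sqrt(3) / 2) * R^2, where R is the center-to-corner distance.
const hexCorner = Math.sqrt((2 * area) / (3 * Math.sqrt(3)));
// Square: area = s^2, and the center-to-corner distance is s / sqrt(2).
const squareCorner = Math.sqrt(area) / Math.SQRT2;

// Nearest boundary point from the center (an edge midpoint).
const hexEdge = (hexCorner * Math.sqrt(3)) / 2;
const squareEdge = Math.sqrt(area) / 2;

// Ratio of farthest to nearest boundary distance; 1 would be a perfect circle.
console.log((hexCorner / hexEdge).toFixed(3));       // "1.155"
console.log((squareCorner / squareEdge).toFixed(3)); // "1.414"
```

For the same area, the hexagon's corners are closer to its center than the square's, and its boundary distance varies by about 15% rather than about 41%.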
The hexbin plot is essentially a 2D histogram: where a standard histogram bins data along one axis, a hexbin plot bins along two axes simultaneously. The resulting display reveals the joint distribution of two variables, making it easy to identify clusters, correlations, and outliers even in massive datasets. The choice of bin size (hexagon radius) is analogous to choosing histogram bin width: bins that are too small produce noisy, sparse hexagons, while bins that are too large smooth out interesting structure.
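The binning step itself can be sketched in plain JavaScript. This is one possible approach, not the method of any particular library: it assumes pointy-top hexagons anchored at the origin and finds the nearest hexagon center by comparing the closest center on the even rows with the closest center on the odd rows. The names `nearestHexCenter` and `hexbinCounts` are hypothetical.

```javascript
// Sketch of hexagonal binning with pointy-top hexagons.
// nearestHexCenter and hexbinCounts are illustrative names, not a library API.

// Return the grid indices and coordinates of the hexagon center nearest (x, y).
function nearestHexCenter(x, y, radius) {
  const dx = Math.sqrt(3) * radius; // spacing between centers within a row
  const dy = 1.5 * radius;          // spacing between rows

  // Candidate A: nearest center on an even row (columns at i * dx).
  const jA = 2 * Math.round(y / (2 * dy));
  const iA = Math.round(x / dx);
  const a = {i: iA, j: jA, x: iA * dx, y: jA * dy};

  // Candidate B: nearest center on an odd row (columns shifted by dx / 2).
  const jB = 2 * Math.floor(y / (2 * dy)) + 1;
  const iB = Math.floor(x / dx);
  const b = {i: iB, j: jB, x: (iB + 0.5) * dx, y: jB * dy};

  // Keep whichever candidate is closer to the point.
  const distA = (x - a.x) ** 2 + (y - a.y) ** 2;
  const distB = (x - b.x) ** 2 + (y - b.y) ** 2;
  return distA <= distB ? a : b;
}

// Count the points that fall in each hexagon; empty hexagons are never created.
function hexbinCounts(points, radius) {
  const bins = new Map();
  for (const [x, y] of points) {
    const c = nearestHexCenter(x, y, radius);
    const key = `${c.i},${c.j}`;
    const bin = bins.get(key) ?? {x: c.x, y: c.y, count: 0};
    bin.count += 1;
    bins.set(key, bin);
  }
  return bins;
}

const points = [[0, 0], [0.1, 0.1], [10, 10]];
const bins = hexbinCounts(points, 1);
console.log(bins.size); // 2: the first two points share a hexagon
```

Calling `hexbinCounts` on the same points with a larger `radius` yields fewer, fuller bins, which is the bin-size trade-off described above.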
When to Use
- Visualizing the density of very large scatterplots (thousands to millions of points) where overplotting makes individual points invisible
- Showing the joint distribution of two continuous variables as a 2D density map
- Identifying clusters, ridges, and voids in bivariate data
- Replacing scatterplots when point-level identity is not important but distributional shape is
When NOT to Use
- When individual data points carry identity or meaning that must be visible — use a scatterplot with transparency or jitter
- When you have fewer than ~200 points — the scatterplot is already readable; hexbinning would over-aggregate
- When the relationship between variables is more important than density — use a scatterplot with a trend line
- When one or both variables are categorical — use a heatmap or strip plot instead
- When precise x/y values need to be read — hexagonal aggregation obscures individual positions
Anatomy
- Hexagonal bins: Regular hexagons tiling the 2D plane, each covering a small region
- Fill color: Encodes the count (or other aggregate) within each hexagon, using a sequential color scale (light = few, dark = many)
- Bin size (radius): Controls the resolution of the binning; smaller bins reveal finer structure, larger bins show broader patterns
- Color scale legend: A gradient legend mapping colors to count values
- Axes: Standard x and y axes showing the two continuous variables
- Empty space: Hexagons with zero points are typically not drawn, leaving white space to show the data boundary
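To make the anatomy concrete, the sketch below computes the outline of a single bin and a fill for its count. It assumes pointy-top hexagons and a simple linear lightness ramp; `hexagonCorners`, `fillFor`, and the specific hue and lightness values are illustrative choices, not taken from any library.

```javascript
// Corner coordinates of a pointy-top hexagon centered at (cx, cy).
function hexagonCorners(cx, cy, radius) {
  const corners = [];
  for (let k = 0; k < 6; k++) {
    const angle = Math.PI / 6 + (k * Math.PI) / 3; // 30, 90, ..., 330 degrees
    corners.push([cx + radius * Math.cos(angle), cy + radius * Math.sin(angle)]);
  }
  return corners;
}

// Sequential fill: light for low counts, dark for high counts.
function fillFor(count, maxCount) {
  const t = maxCount > 0 ? count / maxCount : 0;
  const lightness = Math.round(90 - 65 * t); // 90% (few) down to 25% (many)
  return `hsl(210, 70%, ${lightness}%)`;
}

console.log(hexagonCorners(0, 0, 1).length); // 6
console.log(fillFor(100, 100));              // "hsl(210, 70%, 25%)"
```

A linear ramp is the simplest choice; heavily skewed counts are often easier to read with a logarithmic or quantile scale instead.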
Variations
- Hexbin with size encoding: Hexagon radius varies with count (in addition to color), creating a bubble-like effect
- Hexbin map: Hexagonal binning applied to geographic coordinates, aggregating point data on a map
- Hexbin with contours: Density contour lines overlaid on the hexbin grid for smoother boundary perception
- Square binning: Uses square bins instead of hexagons — simpler to implement but with more visual artifacts
- Adaptive hexbin: Bin size varies locally based on data density for multi-resolution views
- 3D hexbin: Hexagons are extruded into prisms, with height encoding count — visually striking but harder to read
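For the size-encoded variation, one common design choice (an assumption here, not something the variation requires) is to scale the radius with the square root of the count, so that the drawn area of each hexagon is proportional to its count. A minimal sketch with a hypothetical `sizedRadius` helper:

```javascript
// Radius for a size-encoded hexagon: area grows linearly with count.
// binRadius is the full radius used for the bin with the highest count.
function sizedRadius(count, maxCount, binRadius) {
  if (maxCount <= 0) return 0;
  return binRadius * Math.sqrt(count / maxCount);
}

console.log(sizedRadius(25, 100, 10)); // 5
```

Scaling the radius linearly instead would make dense bins look disproportionately large, since area grows with the square of the radius.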
Code Reference
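// Observable Plot - hexbin via the hexbin transform
// Assumes `data` is an array of objects with numeric fields var1 and var2.
import * as Plot from "@observablehq/plot";

Plot.plot({
  color: {scheme: "YlGnBu", legend: true, label: "Count"},
  marks: [
    // Outline of the underlying hexagonal grid
    Plot.hexgrid(),
    // Plot.hexagon is the hexagon mark; the hexbin transform groups points
    // into bins and maps each bin's count to fill color and radius
    Plot.hexagon(data, Plot.hexbin(
      {fill: "count", r: "count"},
      {x: "var1", y: "var2", binWidth: 20}
    ))
  ]
})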