
DENDROGRAM

Overview

Performs hierarchical (agglomerative) clustering on numeric data and returns a dendrogram plot as a base64-encoded PNG image. This function is designed for use in Excel, where you can pass a 2D list or a single column of numbers. By default, Ward’s method is used for clustering, but you may specify other linkage methods. The result is visualized as a dendrogram.

Ward’s method minimizes the total within-cluster variance. At each step, it merges the pair of clusters whose union produces the smallest increase in total within-cluster variance. The increase $\Delta E$ from merging clusters $A$ and $B$ is:

$$\Delta E = \frac{|A| \cdot |B|}{|A| + |B|} \|\bar{x}_A - \bar{x}_B\|^2$$

where $|A|$ and $|B|$ are the cluster sizes and $\bar{x}_A$, $\bar{x}_B$ are the cluster centroids.
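As a quick numerical check of the merge cost, the sketch below compares the closed-form expression with a direct calculation: the within-cluster sum of squares of the merged cluster minus that of the two parts. The helper names `sse` and `ward_increase` and the two small clusters are illustrative assumptions, not part of the DENDROGRAM function.

```python
import numpy as np


def sse(points: np.ndarray) -> float:
    """Sum of squared distances from each point to the cluster centroid."""
    return float(((points - points.mean(axis=0)) ** 2).sum())


def ward_increase(a: np.ndarray, b: np.ndarray) -> float:
    """Closed-form increase in within-cluster sum of squares when merging a and b."""
    n_a, n_b = len(a), len(b)
    diff = a.mean(axis=0) - b.mean(axis=0)
    return (n_a * n_b) / (n_a + n_b) * float(diff @ diff)


# Two hypothetical one-dimensional clusters (one observation per row).
cluster_a = np.array([[9.6], [9.8], [10.0]])
cluster_b = np.array([[10.4], [10.8]])

direct = sse(np.vstack([cluster_a, cluster_b])) - sse(cluster_a) - sse(cluster_b)
formula = ward_increase(cluster_a, cluster_b)
print(direct, formula)  # both are approximately 0.768
```

The two numbers agree up to floating-point rounding, which is why the merge cost can be computed from cluster sizes and centroids alone.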

See the scipy.cluster.hierarchy documentation for more details on the available methods.

This example function is provided as-is without any representation of accuracy.

Usage

To use the function in Excel:

=DENDROGRAM(data, [method])
  • data (2D list, required): Numeric data for clustering (one or more columns).
  • method (string (enum), optional, default="ward"): Linkage method. Valid options: "ward" (minimizes the increase in within-cluster variance), "single" (minimum distance between points in the two clusters), "complete" (maximum distance between points in the two clusters), "average" (average distance between points in the two clusters), "weighted" (like "average", but the two most recently merged subclusters are weighted equally regardless of size), "centroid" (distance between cluster centroids), or "median" (like "centroid", but the merged centroid is the midpoint of the two merged centroids).
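The accepted method names listed above could be checked before any clustering is attempted. The following is a minimal sketch of such a check; `VALID_METHODS` and `normalize_method` are hypothetical names introduced for illustration and do not appear in the function's code.

```python
# Hypothetical helper: the names below mirror the options listed in this page.
VALID_METHODS = (
    "ward", "single", "complete", "average", "weighted", "centroid", "median",
)


def normalize_method(method: str = "ward") -> str:
    """Return the method name in lowercase, or raise if it is not a listed option."""
    name = str(method).strip().lower()
    if name not in VALID_METHODS:
        raise ValueError(
            f"Unknown linkage method {method!r}; expected one of {', '.join(VALID_METHODS)}."
        )
    return name


print(normalize_method(" Complete "))  # complete
```

Rejecting an unknown name up front is one possible design; it makes a typo such as "wards" visible instead of letting it pass unnoticed.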

The function returns a base64-encoded PNG image of the dendrogram as a string. If the calculation fails, an error message string is returned.
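Because both images and error messages come back as plain strings, a caller may want to tell them apart. The sketch below assumes the `data:image/png;base64,` prefix shown in the examples on this page; `decode_png_data_uri` is a hypothetical helper, and the sample string is built from the PNG file signature alone rather than from a real plot.

```python
import base64

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first eight bytes of every PNG file
PREFIX = "data:image/png;base64,"


def decode_png_data_uri(result: str) -> bytes:
    """Return the raw PNG bytes, or raise ValueError for anything else."""
    if not result.startswith(PREFIX):
        raise ValueError(f"Not a PNG data URI: {result[:40]!r}")
    png_bytes = base64.b64decode(result[len(PREFIX):])
    if not png_bytes.startswith(PNG_SIGNATURE):
        raise ValueError("Decoded payload is not a PNG image.")
    return png_bytes


# Stand-in for a returned value: the signature only, not a complete image.
sample = PREFIX + base64.b64encode(PNG_SIGNATURE).decode("ascii")
print(sample)
print(len(decode_png_data_uri(sample)))  # 8
```

The base64 encoding of the PNG signature begins with `iVBORw0KGgo`, which is why the truncated outputs in the examples start with those characters.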

Examples

Example 1: Cluster a List of Values (Default: Ward)

Sample input data (Excel range A1:A10):

Value
9.6
9.8
10
10.4
10.8
11
11.2
12
13
14

In Excel:

=DENDROGRAM(A1:A10)

Expected output: A base64-encoded PNG string (truncated):

"data:image/png;base64,iVBORw0KGgoAAA..."

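To see what lies behind the image in Example 1, the clustering step can be reproduced directly with SciPy. This is a sketch, not the function's exact code path: it reshapes the ten sample values into a single column, builds the Ward linkage matrix, and asks `dendrogram` for the leaf order without drawing anything.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage

# The ten sample values from Example 1, as one observation per row.
values = [9.6, 9.8, 10, 10.4, 10.8, 11, 11.2, 12, 13, 14]
observations = np.array(values, dtype=float).reshape(-1, 1)

# Each row of Z records one merge: the two cluster ids, the merge height,
# and the number of observations in the new cluster.
Z = linkage(observations, method="ward")
print(Z.shape)  # (9, 4): n - 1 merges for n = 10 observations

# no_plot=True returns the tree layout without needing a plotting backend.
tree = dendrogram(Z, no_plot=True)
print(tree["ivl"])  # leaf labels (row indices as strings), left to right
```

The labels on the x-axis of the dendrogram are zero-based row indices into the input, so "0" refers to the value 9.6.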
Example 2: Cluster with Complete Linkage

=DENDROGRAM(A1:A10, "complete")

Expected output: A base64-encoded PNG string (truncated):

"data:image/png;base64,iVBORw0KGgoAAA..."

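The choice of linkage method changes the merge heights, and therefore the vertical scale of the dendrogram. As an illustration on the same sample values, the sketch below compares the height of the final merge under single and complete linkage; the variable names are illustrative only.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

values = [9.6, 9.8, 10, 10.4, 10.8, 11, 11.2, 12, 13, 14]
observations = np.array(values, dtype=float).reshape(-1, 1)

# Column 2 of the last linkage row holds the height of the final merge.
top_heights = {
    method: float(linkage(observations, method=method)[-1, 2])
    for method in ("single", "complete")
}
print(top_heights)
# single:   about 1.0, the largest gap between neighbouring values
# complete: about 4.4, the full range of the data (14 - 9.6)
```

For one-dimensional data, complete linkage always ends at the full range of the values, while single linkage ends at the widest gap, so the two dendrograms can look quite different for the same input.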
Python Code

import matplotlib
matplotlib.use("Agg")  # Non-interactive backend: render to memory, no display needed
import base64
import io
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import dendrogram as hierarchy_dendrogram
from scipy.cluster.hierarchy import linkage

options = {"insert_only":True}

# The function below performs hierarchical clustering and returns a dendrogram image as a base64 string.
def dendrogram(data: list[list[float]], method: str = "ward") -> str:
    """
    Performs hierarchical (agglomerative) clustering on numeric data and returns a dendrogram
    as a base64-encoded PNG image or an error message.

    Args:
        data: 2D list of float, required. Numeric data for clustering (Excel range or list).
        method: str, optional, default="ward". Linkage method for clustering.
            One of 'single', 'complete', 'average', 'weighted', 'centroid', 'median', 'ward'.

    Returns:
        str: Base64-encoded PNG image of the dendrogram, or error message if calculation fails.

    This example function is provided as-is without any representation of accuracy.
    """
    # Convert input to a numpy array of floats
    try:
        arr = np.array(data, dtype=float)
    except Exception:
        # Conversion failed: keep only the rows whose cells are all numeric
        arr_clean = []
        for row in data:
            try:
                arr_clean.append([float(x) for x in row])
            except Exception:
                continue
        arr = np.array(arr_clean, dtype=float)
    if arr.size == 0:
        return "Error: Not enough data."
    if arr.ndim == 1:
        # Treat a flat list as a single column of observations
        arr = arr.reshape(-1, 1)
    elif arr.ndim != 2:
        return "Error: Invalid input data."
    # Remove rows containing NaN or infinite values
    arr = arr[np.isfinite(arr).all(axis=1)]
    if arr.shape[0] < 2:
        return "Error: Not enough data."
    # Perform hierarchical clustering; fall back to Ward if the requested method fails
    used_method = method
    try:
        linkage_matrix = linkage(arr, method=method)
    except Exception:
        try:
            linkage_matrix = linkage(arr, method="ward")
            used_method = "ward"
        except Exception:
            return "Error: Clustering failed."
    # Plot dendrogram, labelling it with the method that was actually used
    plt.figure(figsize=(8, 4))
    hierarchy_dendrogram(linkage_matrix)
    plt.title(f"Hierarchical Clustering Dendrogram ({used_method})")
    plt.xlabel("Sample Index")
    plt.ylabel("Distance")
    buf = io.BytesIO()
    plt.tight_layout()
    plt.savefig(buf, format="png")
    plt.close()
    img_b64 = base64.b64encode(buf.getvalue()).decode("utf-8")
    return f"data:image/png;base64,{img_b64}"