Statistics

Overview

Statistics is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of data. It provides the mathematical framework for understanding uncertainty, making predictions, and drawing conclusions from incomplete information. Whether in science, economics, or engineering, statistical methods are essential for distinguishing signal from noise and extracting meaningful insights from data.

Background and Importance: Statistics is fundamental to science, business, engineering, and policy-making. It enables researchers and practitioners to quantify uncertainty, test hypotheses rigorously, model complex relationships in data, and make evidence-based decisions. The field has roots in probability theory and has evolved to encompass sophisticated computational methods for analyzing everything from large-scale datasets to small, carefully designed experiments.

Two Core Pillars: The field divides into Descriptive Statistics, which summarizes and organizes data to reveal main features (mean, median, variance, distribution shape), and Inferential Statistics, which uses sample data to generalize about populations, make predictions, and test theories. These complementary approaches work together: descriptive statistics summarize what you observe, while inferential statistics help you understand what it means.
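
The two pillars can be illustrated with a short sketch (the sample values below are invented for illustration): descriptive statistics summarize the sample itself, while a one-sample t-test draws an inference about the population the sample came from.

```python
import numpy as np
from scipy import stats

# Descriptive: summarize what we observe in the sample
sample = np.array([4.8, 5.1, 5.0, 4.9, 5.3, 5.2, 4.7, 5.0])
mean = sample.mean()
median = np.median(sample)
var = sample.var(ddof=1)          # unbiased sample variance

# Inferential: could the population mean plausibly be 5.0?
t_stat, p_value = stats.ttest_1samp(sample, popmean=5.0)
```

A large p-value here means the data are consistent with the hypothesized population mean; it does not prove the mean equals 5.0.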

Implementation Framework: Statistical analysis in Python is powered primarily by SciPy, NumPy, and Statsmodels. SciPy provides the foundational distributions and statistical tests, NumPy enables efficient numerical computations on arrays, and Statsmodels offers higher-level model fitting and diagnostics. These libraries are built on decades of research and are widely used across academia and industry.

Hypothesis Testing and Inference: One of statistics’ most powerful applications is hypothesis testing—a formal process for assessing whether observed differences or patterns could arise by chance. Tests range from simple comparisons of two groups using t-tests to sophisticated approaches for categorical data (chi-square tests), non-parametric alternatives (Mann-Whitney U test), and specialized tests for association and correlation. Choosing the right test depends on data type, sample size, and assumptions about underlying distributions.
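
As a hedged sketch of this choice, the same two simulated groups can be compared with a parametric Welch's t-test and its non-parametric Mann-Whitney U alternative (the data here are synthetic):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
group_a = rng.normal(loc=10.0, scale=2.0, size=30)
group_b = rng.normal(loc=11.5, scale=2.0, size=30)

# Parametric: Welch's t-test (does not assume equal variances)
t_stat, t_p = stats.ttest_ind(group_a, group_b, equal_var=False)

# Non-parametric alternative when normality is doubtful
u_stat, u_p = stats.mannwhitneyu(group_a, group_b)
```

When both tests agree, the conclusion is robust to the normality assumption; when they disagree, the assumptions deserve a closer look.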

Probability Distributions: The foundation of statistical inference rests on probability distributions. Continuous distributions like the normal (Gaussian) distribution, t-distribution, F-distribution, and exponential distribution model continuous measurements. Discrete distributions like binomial, Poisson, and negative binomial handle count data. Multivariate distributions extend these concepts to multiple variables simultaneously, essential for modeling joint behavior in multivariate datasets. Understanding the properties of these distributions—their shape, tail behavior, and moments—is crucial for selecting appropriate models.
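
A minimal illustration using SciPy's distribution objects (the specific parameter values are arbitrary): continuous distributions expose a density, CDF, and quantile function, while discrete distributions expose a probability mass function.

```python
from scipy import stats

# Continuous: standard normal density, CDF, and quantile
pdf_at_0 = stats.norm.pdf(0)          # density at the mean
p_below = stats.norm.cdf(1.96)        # P(Z <= 1.96)
z_crit = stats.norm.ppf(0.975)        # 97.5% quantile, used in two-sided tests

# Discrete: probability of exactly 3 events when the Poisson rate is 2
p_three = stats.poisson.pmf(3, mu=2)
```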

Advanced Statistical Modeling: Beyond basic hypothesis tests, regression models investigate relationships between variables: ordinary least squares (OLS) for linear relationships, quantile regression for conditional quantiles, and robust methods for data with outliers. Generalized Linear Models (GLMs) extend regression to non-normal response variables, handling binary outcomes (logistic regression), counts (Poisson regression), and survival times. Mixed effects models incorporate random variation from multiple sources, essential when data has a hierarchical or clustered structure. Survival analysis provides specialized tools, such as Kaplan-Meier estimators and Cox proportional hazards models, for the time-to-event data common in medicine and engineering.
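
The core OLS computation can be sketched with NumPy alone; Statsmodels adds standard errors, p-values, and diagnostics on top of this. The simulated data below are for illustration only:

```python
import numpy as np

# Simulated data following y = 2 + 3x + noise
rng = np.random.default_rng(42)
x = rng.uniform(0, 10, size=100)
y = 2.0 + 3.0 * x + rng.normal(scale=1.0, size=100)

# OLS: solve for the coefficients minimizing the sum of squared residuals
X = np.column_stack([np.ones_like(x), x])   # design matrix with intercept column
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
intercept, slope = beta
```

With 100 observations and modest noise, the estimates land close to the true intercept of 2 and slope of 3.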

Multivariate and Dimensionality Reduction: When analyzing many variables simultaneously, techniques like Principal Component Analysis (PCA) reduce dimensions while preserving variance, factor analysis uncovers latent structures, canonical correlation analysis (CCA) finds relationships between variable sets, and MANOVA tests differences across multiple outcomes. These methods reveal hidden patterns and simplify interpretation of high-dimensional data.
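
PCA can be sketched directly via the singular value decomposition of centered data. This minimal NumPy example uses synthetic data in which a single latent factor drives three observed variables, so the first component should capture nearly all the variance:

```python
import numpy as np

rng = np.random.default_rng(1)
# 200 observations of 3 correlated variables sharing one latent factor
latent = rng.normal(size=(200, 1))
data = np.hstack([latent + 0.1 * rng.normal(size=(200, 1)) for _ in range(3)])

# PCA: center the data, then take the SVD; rows of Vt are principal axes
centered = data - data.mean(axis=0)
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
explained_var_ratio = S**2 / np.sum(S**2)
scores = centered @ Vt.T   # data projected onto the principal components
```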

Figure 1: Statistical Foundations. (A) The standard normal distribution showing the probability density function and critical regions used in hypothesis testing. The shaded red areas (beyond ±1.96) represent rejection regions at the 0.05 significance level. (B) Distribution comparison showing how different continuous distributions (normal, exponential, chi-squared) with similar means can have drastically different shapes and tail behavior, illustrating the importance of choosing the right distribution for your data.

Bayesian

Conjugate Priors

Tool Description
BB_LOGBETA Compute the log-Beta term used in conjugate posterior calculations.
BB_POST_UPDATE Update Beta-Binomial posterior hyperparameters from observed counts.
BB_QBETA Compute a Beta posterior quantile for Beta-Binomial models.
GAMMA_POST_Q Compute a Gamma posterior quantile from shape-rate parameters.
INVGAMMA_POST_Q Compute an inverse-Gamma posterior quantile.
NIG_POST_UPDATE Update Normal-Inverse-Gamma posterior hyperparameters from sample summaries.
NN_POST_UPDATE Update Normal posterior parameters for unknown mean with known variance.
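
A minimal sketch of the conjugate mathematics these tools presumably implement, for the Beta-Binomial case: an update in the style of BB_POST_UPDATE followed by a posterior quantile in the style of BB_QBETA (the prior hyperparameters and observed counts are invented):

```python
from scipy import stats

# Beta(a, b) prior on a success probability, updated with observed counts
a_prior, b_prior = 1.0, 1.0          # uniform prior
successes, failures = 27, 13

# Conjugate update: the posterior is Beta(a + successes, b + failures)
a_post = a_prior + successes
b_post = b_prior + failures

post_mean = a_post / (a_post + b_post)
post_median = stats.beta.ppf(0.5, a_post, b_post)   # posterior quantile
```

Conjugacy is what makes the update a simple addition of counts to hyperparameters; no numerical integration is needed.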

Credible Intervals

Tool Description
BAYES_MVS_CI Compute Bayesian credible intervals for mean, variance, and standard deviation from sample data.
BETA_CI_BOUNDS Compute an equal-tailed Bayesian credible interval for a proportion using a Beta posterior.
GAMMA_CI_BOUNDS Compute an equal-tailed Bayesian credible interval for a positive rate parameter using Gamma quantiles.
INVGAMMA_CI_BOUNDS Compute an equal-tailed Bayesian credible interval for a positive scale or variance parameter using Inverse-Gamma quantiles.
MVSDIST_CI Compute Bayesian credible intervals from posterior distributions of mean, variance, and standard deviation.
SAMPLE_EQTAIL_CI Compute an equal-tailed credible interval from posterior samples using empirical quantiles.
SAMPLE_HPD_CI Approximate a highest posterior density interval from posterior samples using the narrowest empirical window.
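
Equal-tailed intervals of the kind computed by BETA_CI_BOUNDS and SAMPLE_EQTAIL_CI can be sketched with SciPy and NumPy; the first uses exact posterior quantiles, the second empirical quantiles of posterior draws (the posterior parameters are invented):

```python
import numpy as np
from scipy import stats

# Equal-tailed 95% credible interval for a proportion with a Beta(28, 14) posterior
lo, hi = stats.beta.ppf([0.025, 0.975], 28, 14)

# The same interval estimated from posterior samples via empirical quantiles
rng = np.random.default_rng(0)
samples = rng.beta(28, 14, size=100_000)
s_lo, s_hi = np.quantile(samples, [0.025, 0.975])
```

With enough draws the sample-based bounds converge to the exact quantiles; the sample route is what remains available when the posterior has no closed form.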

Dirichlet Multinomial

Tool Description
DM_CRED_INT Compute category-wise credible intervals from posterior Dirichlet parameters.
DM_DIRICHLET_SUM Compute Dirichlet density and moments for a category-probability vector.
DM_LOGBETA Compute the Dirichlet log-normalization term using log-gamma values.
DM_LOGSUM_NORM Compute a stable log normalizer and normalized probabilities from log-values.
DM_POST_UPDATE Update Dirichlet posterior parameters from prior hyperparameters and observed counts.
DM_PREDICTIVE Compute posterior predictive category probabilities from Dirichlet parameters.
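
A hedged sketch of the Dirichlet-Multinomial update and predictive step, in the style of DM_POST_UPDATE and DM_PREDICTIVE (the prior and counts are invented):

```python
import numpy as np

# Dirichlet(alpha) prior over 3 category probabilities, updated with counts
alpha_prior = np.array([1.0, 1.0, 1.0])
counts = np.array([50, 30, 20])

# Conjugate update: the posterior is Dirichlet(alpha + counts)
alpha_post = alpha_prior + counts

# Posterior predictive probability of each category (the posterior mean)
predictive = alpha_post / alpha_post.sum()
```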

Posterior Summarization

Tool Description
POSTERIOR_BMV Compute Bayesian posterior summaries for mean, variance, and standard deviation.
POSTERIOR_ENTROPY Compute Shannon or relative entropy for posterior probability tables.
POSTERIOR_LOGSUMEXP Compute stable log-sum-exp aggregates for posterior normalization and evidence calculations.
POSTERIOR_MAP Extract the MAP estimate from a tabulated posterior distribution.
POSTERIOR_TAILPROB Compute posterior tail probabilities relative to a decision threshold.
POSTERIOR_WMEANVAR Compute posterior weighted mean and variance summaries from values and weights.
POSTERIOR_XLOGY Compute numerically stable x times log y terms for posterior information calculations.
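
The stable log-sum-exp trick behind tools like POSTERIOR_LOGSUMEXP can be sketched as follows; subtracting the maximum before exponentiating avoids the underflow that would otherwise zero out every term:

```python
import numpy as np

# Unnormalized log-posterior values that would underflow if exponentiated naively
log_post = np.array([-1000.0, -1001.0, -1002.0])

# Stable log-sum-exp: shift by the maximum before exponentiating
m = log_post.max()
log_evidence = m + np.log(np.sum(np.exp(log_post - m)))

# Normalized posterior probabilities, computed entirely in log space
probs = np.exp(log_post - log_evidence)
```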

Frequency Statistics

Tool Description
BINNED_STATISTIC Compute a binned statistic (mean, sum, median, etc.) for the input data.
BINNED_STATISTIC_2D Compute a two-dimensional binned statistic (mean, sum, median, etc.) for the input data.
CUMFREQ Compute the cumulative frequency histogram for the input data.
PERCENTILEOFSCORE Compute the percentile rank of a score relative to the input data.
RELFREQ Compute the relative frequency histogram for the input data.
SCOREATPERCENTILE Compute the score at the given percentile of the input data.
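
The percentile-rank and score-at-percentile operations are inverses of one another; a short SciPy sketch (the sample data are invented):

```python
from scipy import stats

data = [55, 62, 68, 70, 73, 75, 78, 81, 85, 92]

# Percentile rank of a score: what fraction of the data falls at or below it
rank = stats.percentileofscore(data, 75)

# Score at a given percentile: here the 50th percentile, i.e. the median
score = stats.scoreatpercentile(data, 50)
```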

Hypothesis Tests

Tool Description
ANOVA Perform one-way ANOVA on tabular data using Pingouin.
GAMESHOWELL Run Games-Howell pairwise comparisons using Pingouin.
HOMOSCEDASTICITY Test equality of variances across groups using Pingouin.
MIXED_ANOVA Perform mixed ANOVA with within- and between-subject factors using Pingouin.
NORMALITY Test normality by group or overall using Pingouin.
PAIRWISE_TUKEY Run Tukey HSD pairwise comparisons using Pingouin.
RM_ANOVA Perform repeated-measures ANOVA on tabular data using Pingouin.
WELCH_ANOVA Perform Welch ANOVA for unequal variances using Pingouin.
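
The underlying one-way ANOVA computation can be sketched with SciPy's f_oneway; the Pingouin-based ANOVA tool returns a richer results table, but the F statistic and p-value are the same quantity (the three groups below are simulated):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
group1 = rng.normal(10, 2, size=25)
group2 = rng.normal(12, 2, size=25)
group3 = rng.normal(11, 2, size=25)

# One-way ANOVA: do the group means differ more than chance would explain?
f_stat, p_value = stats.f_oneway(group1, group2, group3)
```

If the homoscedasticity assumption fails, the Welch variant is the appropriate fallback, mirroring the WELCH_ANOVA tool above.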

Models

Tool Description
MEDIATION_ANALYSIS Perform causal mediation analysis with bootstrap confidence intervals.

Multivariate Analysis

Tool Description
CANCORR Performs Canonical Correlation Analysis (CCA) between two sets of variables.
FACTOR_ANALYSIS Performs exploratory factor analysis with rotation.
MANOVA_TEST Performs Multivariate Analysis of Variance (MANOVA) for multiple dependent variables.
PCA_ANALYSIS Performs Principal Component Analysis (PCA) for dimensionality reduction.

Probability Distributions

Continuous Distributions

Tool Description
BETA Compute Beta distribution values using scipy.stats.beta, supporting multiple methods.
CAUCHY Compute Cauchy distribution values using scipy.stats.cauchy, supporting multiple methods.
CHISQ Compute chi-squared distribution values using scipy.stats.chi2.
EXPON Compute Exponential distribution values using scipy.stats.expon.
F_DIST Compute F-distribution values, including PDF, CDF, inverse CDF, survival function, and distribution statistics.
LAPLACE Compute Laplace distribution values, supporting multiple methods.
LOGNORM Compute Lognormal distribution statistics and evaluations.
NORM Compute Normal (Gaussian) distribution values, supporting multiple methods.
PARETO Compute Pareto distribution values, supporting multiple methods.
T_DIST Compute Student’s t distribution values using scipy.stats.t.
UNIFORM Compute Uniform distribution values, supporting multiple methods.
WEIBULL_MIN Compute Weibull minimum distribution values using scipy.stats.weibull_min.

Discrete Distributions

Tool Description
BERNOULLI Compute properties of a Bernoulli discrete random variable.
BETABINOM Compute Beta-binomial distribution values from scipy.stats.betabinom.
BETANBINOM Compute Beta-negative-binomial distribution values: PMF, CDF, SF, ICDF, ISF, mean, variance, std, or median.
BINOM Compute Binomial distribution values: PMF, CDF, SF, ICDF, ISF, mean, variance, std, or median.
BOLTZMANN Compute Boltzmann distribution values: PMF, CDF, SF, ICDF, ISF, mean, variance, std, or median.
DLAPLACE Compute Discrete Laplace distribution values: PMF, CDF, SF, ICDF, ISF, mean, variance, std, or median.
GEOM Compute Geometric distribution values using scipy.stats.geom.
HYPERGEOM Compute Hypergeometric distribution values: PMF, CDF, SF, ICDF, ISF, mean, variance, std, or median.
LOGSER Compute Log-Series distribution values: PMF, CDF, SF, ICDF, ISF, mean, variance, std, or median.
NBINOM Compute Negative Binomial distribution values using scipy.stats.nbinom.
NHYPERGEOM Compute Negative Hypergeometric distribution values using scipy.stats.nhypergeom.
PLANCK Compute Planck distribution values using scipy.stats.planck.
POISSON_DIST Compute Poisson distribution values using scipy.stats.poisson.
RANDINT Compute Uniform discrete distribution values: PMF, CDF, SF, ICDF, ISF, mean, variance, std, or median.
SKELLAM Compute Skellam distribution values using scipy.stats.skellam.
VAL_DISCRETE Select a value from a list based on a discrete probability distribution.
YULESIMON Compute Yule-Simon distribution values using scipy.stats.yulesimon.
ZIPF Compute Zipf distribution values: PMF, CDF, SF, ICDF, ISF, mean, variance, std, or median.
ZIPFIAN Compute Zipfian distribution values: PMF, CDF, SF, ICDF, ISF, mean, variance, std, or median.

Multivariate Distributions

Tool Description
DIRICHLET Compute the PDF, log-PDF, mean, variance, covariance, entropy, or draw random samples from a Dirichlet distribution.
DIRICHLET_MULTINOM Compute the probability mass function, log-PMF, mean, variance, or covariance of the Dirichlet-multinomial distribution.
MATRIX_NORMAL Compute the PDF, log-PDF, or draw random samples from a matrix normal distribution.
MULTINOMIAL Compute the probability mass function, log-PMF, entropy, covariance, or draw random samples from a multinomial distribution.
MULTIVARIATE_NORMAL Compute the PDF, CDF, log-PDF, log-CDF, entropy, or draw random samples from a multivariate normal distribution.
MULTIVARIATE_T Compute the PDF, CDF, or draw random samples from a multivariate t-distribution.
MV_HYPERGEOM Compute the probability mass function, log-PMF, mean, variance, covariance, or draw random samples from a multivariate hypergeometric distribution.
ORTHO_GROUP Draw random samples of orthogonal matrices from the O(N) Haar distribution using scipy.stats.ortho_group.
RANDOM_CORRELATION Generate a random correlation matrix with specified eigenvalues.
SPECIAL_ORTHO_GROUP Draw random samples from the special orthogonal group SO(N), returning orthogonal matrices with determinant +1.
UNIFORM_DIRECTION Draw random unit vectors uniformly distributed on the surface of a hypersphere in the specified dimension.
UNITARY_GROUP Generate a random unitary matrix of dimension N from the Haar distribution.
VONMISES_FISHER Compute the PDF, log-PDF, entropy, or draw random samples from a von Mises-Fisher distribution on the unit hypersphere.
WISHART Compute the PDF, log-PDF, or draw random samples from the Wishart distribution using scipy.stats.wishart.

Summary Statistics

Tool Description
CRONBACH_ALPHA Compute Cronbach’s alpha reliability coefficient for a set of items.
DESCRIBE Compute descriptive statistics using scipy.stats.describe.
DISTANCE_CORR Compute distance correlation between two numeric variables.
EXPECTILE Compute the expectile of a dataset using scipy.stats.expectile.
GMEAN Compute the geometric mean of the input data, flattening the input and ignoring non-numeric values.
HMEAN Compute the harmonic mean of the input data, flattening the input and ignoring non-numeric values.
KURTOSIS Compute the kurtosis (Fisher or Pearson) of a dataset.
MODE Return the modal (most common) numeric value in the input data, returning the smallest if there are multiple modes.
MOMENT Compute the nth moment about the mean for a sample.
PARTIAL_CORR Compute partial or semi-partial correlation between two variables.
PMEAN Compute the power mean (generalized mean) of the input data for a given power p.
SKEWNESS Compute the skewness of a dataset.

Time Series

Autocorrelation And Stationarity Tests

Tool Description
ACF Compute autocorrelation values across lags with optional confidence intervals and Ljung-Box statistics.
ACOVF Estimate autocovariance values of a time series across lags.
ADFULLER Run the Augmented Dickey-Fuller unit-root test for stationarity diagnostics.
CCF Compute cross-correlation between two time series across nonnegative lags.
CCOVF Estimate cross-covariance values between two time series across lags.
KPSS Run the KPSS stationarity test under level or trend null hypotheses.
PACF Compute partial autocorrelation values across lags for lag-order diagnostics.
Q_STAT Compute Ljung-Box Q statistics and p-values from autocorrelation coefficients.
RURTEST Run the range unit-root test as an alternative stationarity diagnostic.
ZIVOT_ANDREWS Run the Zivot-Andrews unit-root test allowing one endogenous structural break.
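
The sample autocorrelation these diagnostics build on can be sketched in plain NumPy for a simulated AR(1) series; a strongly persistent series shows large positive autocorrelation at lag 1:

```python
import numpy as np

rng = np.random.default_rng(3)
# AR(1) series x[t] = phi * x[t-1] + noise, with strong persistence
n, phi = 500, 0.8
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + rng.normal()

# Sample autocorrelation at lags 0..5
xc = x - x.mean()
denom = np.sum(xc**2)
acf = np.array([np.sum(xc[: n - k] * xc[k:]) / denom for k in range(6)])
```

For a true AR(1) process the theoretical autocorrelation at lag k is phi**k, so the estimates should decay geometrically.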

Decomposition And Seasonality

Tool Description
DETREND Remove linear or constant trend from input data.
MSTL Perform multi-seasonal STL decomposition on a time series.
PERIODOGRAM Estimate the power spectral density of a time series using a periodogram.
SEASDECOMP Decompose a time series into trend, seasonal, and residual components.
STL Perform STL decomposition of a univariate time series.
WELCH Estimate the power spectral density of a time series using Welch’s method.

Forecasting Models

Tool Description
ARIMA_FORECAST Fit an ARIMA model and return out-of-sample forecasts.
ARMA_ORDER_IC Select ARMA order using an information criterion.
HANNAN_RISSANEN Estimate ARMA parameters using the Hannan-Rissanen procedure.
HOLT_FORECAST Fit Holt trend exponential smoothing and return forecasts.
HW_FORECAST Fit Holt-Winters exponential smoothing and return forecasts.
INNOVATIONS_MLE Estimate SARIMA parameters using innovations maximum likelihood.
SARIMAX_FORECAST Fit a SARIMAX model and return out-of-sample forecasts.
SES_FORECAST Fit simple exponential smoothing and return forecasts.

Moving Averages

Tool Description
EMA_LFILTER Compute an exponential moving average using recursive linear filtering.
EMA_PERIOD Compute an exponential moving average using a period-derived smoothing constant.
SMA_CONV Compute a simple moving average using discrete convolution with a uniform window.
SMA_CUMSUM Compute a simple moving average using cumulative-sum differencing.
WINMA_CONV Compute a weighted moving average by convolving data with a user-defined weight window.
WMA Compute a rolling weighted moving average using user-supplied weights.
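
The SMA_CONV and SMA_CUMSUM approaches can be sketched as follows (the price series is invented). Both produce identical results; the cumulative-sum form avoids the convolution and scales better for long series:

```python
import numpy as np

prices = np.array([10.0, 11.0, 12.0, 11.5, 12.5, 13.0, 12.0, 13.5])
window = 3

# SMA via convolution with a uniform window ('valid' keeps fully covered points)
sma_conv = np.convolve(prices, np.ones(window) / window, mode="valid")

# The same SMA via cumulative-sum differencing
csum = np.cumsum(np.concatenate([[0.0], prices]))
sma_cumsum = (csum[window:] - csum[:-window]) / window
```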