Python in Excel Functions

Excel
Python
Learn about the new BOARDFLARE.RUNPY function and how we’ve aligned our API with Microsoft’s Python in Excel for better code compatibility.
Published

October 22, 2024

Python in Excel Functions

Since we launched Boardflare’s Python for Excel add-in over a month ago, we’ve heard from users that they would like to see more pre-made Python functions that they can use as-is or modify to suit their needs. As we’ve started buiding out a collection of Python functions, we realized that we could align our API with Microsoft’s Python in Excel so that the same code could be used in both with minimal modification.

🆕 BOARDFLARE.RUNPY function

To achieve compatibility with Python in Excel, we’ve introduced a new BOARDFLARE.RUNPY function that handles both scalar and array inputs and outputs, and therefore replaces the original BOARDFLARE.PY and BOARDFLARE.PY.ARRfunctions which are now deprecated. We chose the name RUNPY to reduce potential confusion with Excel’s PY feature and we are not that creative with names.

Like Python in Excel, BOARDFLARE.RUNPY now converts Excel arrays into pandas DataFrames. It also returns the value of the last expression in the Python code, instead of using pyout as the output variable. This approach not only aligns with Python in Excel, but also with Jupyter notebooks.

The net result is that the same function code will work in both, with the only difference being how data is passed as arguments as shown in the following examples:

```python {5} # Using BOARDFLARE.RUNPY def add(a, b): return a + b

add(arg1,arg2)

Assuming the code above is in cell `C2`, the arguments `arg1` and `arg2` are passed to the `BOARDFLARE.RUNPY` function as follows:

```excel
=BOARDFLARE.RUNPY(C2, A2, B2)

With Python in Excel, the same code would be written as follows:

```python {5} # Using Excel’s PY def add(a, b): return a + b

add(xl(“A2”),xl(“B2”))


Here the `xl` function is used to reference the Excel cells `A2` and `B2` inside the Python code.  The upside of Microsoft's approach is everything is all in one cell, so is easier to make changes when developing code interactively.  The downside is that it is not a function, and therefore can't be used in Excel formulas or LAMBDA functions the way `BOARDFLARE.RUNPY` can.

## Functions collection

We're just getting started on building out a collection of ready-to-use Python [functions](/functions) that can also be used as a starting point for building more complex functions.  

For example, we've built a [text_distance](/tools/text/text_distance) function for fuzzy matching that is similar to that used in our [Fuzzy Match](/apps/excel/fuzzy-match) app, but supports many more algorithms and can be readily modified to your specific needs.  Here's the code:  

```python
import textdistance
import pandas as pd

def text_distance(needle, haystack_df, algorithm='jaccard'):
    # Get the algorithm function from textdistance
    algo_func = getattr(textdistance, algorithm)
    # Flatten the DataFrame to a list
    haystack = haystack_df.values.flatten().tolist()
    
    # Check if needle is a DataFrame
    if isinstance(needle, pd.DataFrame):
        needle_list = needle.values.flatten().tolist()
    else:
        needle_list = [needle]
    
    results = [] 
    for needle_item in needle_list:
        # Calculate similarity scores with normalization and round to 2 decimal places
        # Adjust index to be 1-based
        scores = [(index + 1, round(algo_func.normalized_similarity(needle_item, item), 2)) for index, item in enumerate(haystack)]
        # Sort based on scores in descending order
        scores.sort(key=lambda x: x[1], reverse=True)
        # Append the top index and score to results as a list
        results.append(list(scores[0]))

    # results is 2D list, e.g. [[1, 0.75], [2, 0.85]]
    return results

text_distance(arg1, arg2, arg3)

LAMBDA functions using Python

One of the great features of BOARDFLARE.RUNPY is that it can be used to build LAMBDA functions that can be used in Excel formulas. For example, the text_distance function can be used to build a FUZZYMATCH.TD function that is similar to Excel’s XMATCH function, but with the added ability to set a similarity threshold. Here’s the code:

=LAMBDA(lookup_value, lookup_array, [similarity_threshold], [algorithm],
    LET(
        threshold, IF(ISOMITTED(similarity_threshold), 0.7, similarity_threshold),
        algo, IF(ISOMITTED(algorithm), "jaccard", algorithm),
        result, BOARDFLARE.RUNPY("text/fuzzy/text_distance.ipynb", lookup_value, lookup_array, algo),
        index, INDEX(result, 1),
        score, INDEX(result, 2),
        IF(score >= threshold, index, "No match")
    )
)