AI_EXTRACT

Overview

This function uses an AI model to extract specific types of information from text. It is designed for pulling structured data (like dates, contacts, or key points) from unstructured text content such as emails, reports, or meeting notes.

Usage

To use the AI_EXTRACT function in Excel, enter it as a formula in a cell, specifying your text, extract type, and any optional arguments as needed:


=AI_EXTRACT(text, extract_type, [temperature], [model], [max_tokens], [api_key], [api_url])

Replace each parameter with your desired value. The function returns a single-column list of extracted items.

Parameters

Parameter	Type	Required	Description
text	string/range	Yes	The text or cell reference containing the data to analyze.
extract_type	string	Yes	The type of information to extract (e.g., “emails”, “dates”, “key points”).
temperature	float	No	Controls the randomness/creativity of the response (0.0 to 2.0). Lower values are more deterministic.
model	string	No	The specific AI model ID to use (must support JSON mode, e.g., ‘mistral-small-latest’).
max_tokens	int	No	Maximum number of tokens for the generated list content.
api_key	string	No	API key for authentication. Get a free API key from Mistral AI .
api_url	string	No	OpenAI-compatible API endpoint URL (e.g., https://api.mistral.ai/v1/chat/completions ).

Return Value

Return Value	Type	Description
Extracted Data	2D list	A single-column list of extracted items as requested. Returns `[["Error: ..."]]` on failure.

Demo

If either api_key or api_url is not provided, both will default to Boardflare demo values (api_url: https://llm.boardflare.com , api_key: your Microsoft login token if available). This only works for users logged in with a Microsoft account and provides limited free demo usage. You may obtain a free api_key for Mistral AI with your Microsoft account which offers more generous free usage and supports CORS.

Limitations

The quality of the extraction depends on the clarity of the extract_type and the text provided.
Large text inputs may exceed model context limits and result in truncated or incomplete results.
The function requires an internet connection to access the AI model.
Model availability and output may vary depending on the provider or API changes.
Sensitive or confidential data should not be sent to external AI services.
temperature must be a float between 0 and 2 (inclusive). If not, a ValueError is raised.
If you hit the API rate limit for your provider, a message is returned instead of raising an exception.

Benefits

Automates extraction of structured data from unstructured text directly in Excel.
Saves time and improves consistency in reporting and data entry.
Enables dynamic, context-aware extraction using your own data.
More flexible and powerful than manual or native Excel approaches for information extraction.

Examples

1. Extracting Client Names from Meeting Notes


=AI_EXTRACT("During today's annual review, we discussed progress with Acme Corporation, Global Enterprises, and TechSolutions Inc. All three clients reported satisfaction with our services.", "client names")

Sample Output:

Item
Acme Corporation
Global Enterprises
TechSolutions Inc.

2. Extracting Financial Metrics from a Report


=AI_EXTRACT("Q1 results exceeded expectations with revenue of $2.4M, an EBITDA margin of 18.5%, and customer acquisition costs decreasing by 12%. Cash reserves stand at $5.2M and our runway extends to 24 months.", "financial metrics")

Sample Output:

Item
Revenue: $2.4M
EBITDA margin: 18.5%
Customer acquisition costs: -12%
Cash reserves: $5.2M
Runway: 24 months

3. Extracting Action Items from Email


=AI_EXTRACT("Hi team, Following our strategic planning session: 1) Mark needs to finalize the budget by Friday, 2) Sarah will contact vendors for new quotes, 3) Development team must provide timeline estimates by next Wednesday, and 4) Everyone should review the new marketing materials.", "action items")

Sample Output:

Item
Mark needs to finalize the budget by Friday
Sarah will contact vendors for new quotes
Development team must provide timeline estimates by next Wednesday
Everyone should review the new marketing materials

4. Extracting Contact Information from Business Cards


=AI_EXTRACT("John Smith\nSenior Project Manager\nInnovative Solutions Inc.\[email protected]\n+1 (555) 123-4567\n123 Business Avenue, Suite 400\nSan Francisco, CA 94107", "contact information")

Sample Output:

Item
Name: John Smith
Title: Senior Project Manager
Company: Innovative Solutions Inc.
Email: [email protected]
Phone: +1 (555) 123-4567
Address: 123 Business Avenue, Suite 400, San Francisco, CA 94107

5. Extracting Dates and Deadlines


=AI_EXTRACT("The initial design phase will be completed by May 15, 2025. The stakeholder review is scheduled for May 20-22, with development starting June 1. Testing will run through September 15, with final delivery expected by October 3, 2025.", "dates and deadlines")

Sample Output:

Item
Design completion: May 15, 2025
Stakeholder review: May 20-22, 2025
Development start: June 1, 2025
Testing completion: September 15, 2025
Final delivery: October 3, 2025

Source Code


import requests
import json
 
# This function uses AI to extract specific types of information from text.
def ai_extract(text, extract_type, temperature=0.0, model='mistral-small-latest', max_tokens=1000, api_key=None, api_url=None):
    """
    Uses AI to extract specific types of information from text.
 
    Args:
        text (str or list): The text to analyze (string or 2D list with a single cell)
        extract_type (str): Type of information to extract (e.g., 'emails', 'dates', 'action items')
        temperature (float, optional): Controls response creativity (0-2). Default is 0.0
        model (str, optional): ID of the model to use
        max_tokens (int, optional): Maximum tokens for response generation. Default is 1000
        api_key (str, optional): API key for authentication (e.g. for Mistral AI)
        api_url (str, optional): OpenAI compatible URL. (e.g., https://api.mistral.ai/v1/chat/completions)
 
    Returns:
        list: 2D list representing the extracted data as a single column
    """
    # Demo fallback logic (Boardflare)
    if api_key is None or api_url is None:
        if 'idToken' in globals():
            api_key = globals()['idToken']
            api_url = "https://llm.boardflare.com"
        else:
            return [["Login on the Functions tab for limited demo usage, or sign up for a free Mistral AI account at https://console.mistral.ai/ and add your own api_key."]]
 
    # Handle 2D list input (flatten to a single string)
    if isinstance(text, list):
        if len(text) > 0 and len(text[0]) > 0:
            text = str(text[0][0])
        else:
            return [["Error: Empty input text."]]
 
    # Validate temperature
    if not (isinstance(temperature, (int, float)) and 0.0 <= float(temperature) <= 2.0):
        return [["Error: temperature must be a float between 0.0 and 2.0."]]
 
    # Validate max_tokens
    if not (isinstance(max_tokens, int) and max_tokens > 0):
        return [["Error: max_tokens must be a positive integer."]]
 
    # Construct a specific prompt for data extraction
    extract_prompt = f"Extract the following from the text: {extract_type}\n\nText: {text}"
    extract_prompt += "\n\nReturn ONLY a JSON object with a key 'items' whose value is a JSON array of the items you extracted. "
    extract_prompt += "Each item should be a single value representing one extracted piece of information. "
    extract_prompt += "Do not include any explanatory text, just the JSON object. "
    extract_prompt += "For example: {\"items\": [\"item1\", \"item2\", \"item3\"]}"
 
    payload = {
        "messages": [{"role": "user", "content": extract_prompt}],
        "temperature": temperature,
        "model": model,
        "max_tokens": max_tokens,
        "response_format": {"type": "json_object"}
    }
 
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        "Accept": "application/json"
    }
 
    try:
        response = requests.post(api_url, headers=headers, json=payload)
        response.raise_for_status()
        response_data = response.json()
        content = response_data["choices"][0]["message"]["content"]
        try:
            extracted_data = json.loads(content)
            if isinstance(extracted_data, dict) and "items" in extracted_data:
                extracted_data = extracted_data["items"]
            elif isinstance(extracted_data, dict):
                if "extracted" in extracted_data:
                    extracted_data = extracted_data["extracted"]
                elif "results" in extracted_data:
                    extracted_data = extracted_data["results"]
            if isinstance(extracted_data, list):
                return [[item] for item in extracted_data]
            else:
                return [["Error: Unable to parse response. Expected a list."]]
        except (json.JSONDecodeError, ValueError):
            return [["Error: Unable to extract data. The AI response wasn't in the expected format."]]
    except requests.exceptions.RequestException as e:
        return [["Error: API request failed.", str(e)]]