AI_EXTRACT
Overview
This function uses an AI model to extract specific types of information from text. It is designed for pulling structured data (like dates, contacts, or key points) from unstructured text content such as emails, reports, or meeting notes.
Usage
To use the AI_EXTRACT
function in Excel, enter it as a formula in a cell, specifying your text, extract type, and any optional arguments as needed:
=AI_EXTRACT(text, extract_type, [temperature], [model], [max_tokens], [api_key], [api_url])
Replace each parameter with your desired value. The function returns a single-column list of extracted items.
Parameters
Parameter | Type | Required | Description |
---|---|---|---|
text | string/range | Yes | The text or cell reference containing the data to analyze. |
extract_type | string | Yes | The type of information to extract (e.g., “emails”, “dates”, “key points”). |
temperature | float | No | Controls the randomness/creativity of the response (0.0 to 2.0). Lower values are more deterministic. |
model | string | No | The specific AI model ID to use (must support JSON mode, e.g., ‘mistral-small-latest’). |
max_tokens | int | No | Maximum number of tokens for the generated list content. |
api_key | string | No | API key for authentication. Get a free API key from Mistral AI . |
api_url | string | No | OpenAI-compatible API endpoint URL (e.g., https://api.mistral.ai/v1/chat/completions ). |
Return Value
Return Value | Type | Description |
---|---|---|
Extracted Data | 2D list | A single-column list of extracted items as requested. Returns [["Error: ..."]] on failure. |
Demo
If either api_key
or api_url
is not provided, both will default to Boardflare demo values (api_url
: https://llm.boardflare.com , api_key
: your Microsoft login token if available). This only works for users logged in with a Microsoft account and provides limited free demo usage. You may obtain a free api_key for Mistral AI with your Microsoft account which offers more generous free usage and supports CORS.
Limitations
- The quality of the extraction depends on the clarity of the extract_type and the text provided.
- Large text inputs may exceed model context limits and result in truncated or incomplete results.
- The function requires an internet connection to access the AI model.
- Model availability and output may vary depending on the provider or API changes.
- Sensitive or confidential data should not be sent to external AI services.
temperature
must be a float between 0 and 2 (inclusive). If not, a ValueError is raised.- If you hit the API rate limit for your provider, a message is returned instead of raising an exception.
Benefits
- Automates extraction of structured data from unstructured text directly in Excel.
- Saves time and improves consistency in reporting and data entry.
- Enables dynamic, context-aware extraction using your own data.
- More flexible and powerful than manual or native Excel approaches for information extraction.
Examples
1. Extracting Client Names from Meeting Notes
=AI_EXTRACT("During today's annual review, we discussed progress with Acme Corporation, Global Enterprises, and TechSolutions Inc. All three clients reported satisfaction with our services.", "client names")
Sample Output:
Item |
---|
Acme Corporation |
Global Enterprises |
TechSolutions Inc. |
2. Extracting Financial Metrics from a Report
=AI_EXTRACT("Q1 results exceeded expectations with revenue of $2.4M, an EBITDA margin of 18.5%, and customer acquisition costs decreasing by 12%. Cash reserves stand at $5.2M and our runway extends to 24 months.", "financial metrics")
Sample Output:
Item |
---|
Revenue: $2.4M |
EBITDA margin: 18.5% |
Customer acquisition costs: -12% |
Cash reserves: $5.2M |
Runway: 24 months |
3. Extracting Action Items from Email
=AI_EXTRACT("Hi team, Following our strategic planning session: 1) Mark needs to finalize the budget by Friday, 2) Sarah will contact vendors for new quotes, 3) Development team must provide timeline estimates by next Wednesday, and 4) Everyone should review the new marketing materials.", "action items")
Sample Output:
Item |
---|
Mark needs to finalize the budget by Friday |
Sarah will contact vendors for new quotes |
Development team must provide timeline estimates by next Wednesday |
Everyone should review the new marketing materials |
4. Extracting Contact Information from Business Cards
=AI_EXTRACT("John Smith\nSenior Project Manager\nInnovative Solutions Inc.\[email protected]\n+1 (555) 123-4567\n123 Business Avenue, Suite 400\nSan Francisco, CA 94107", "contact information")
Sample Output:
Item |
---|
Name: John Smith |
Title: Senior Project Manager |
Company: Innovative Solutions Inc. |
Email: [email protected] |
Phone: +1 (555) 123-4567 |
Address: 123 Business Avenue, Suite 400, San Francisco, CA 94107 |
5. Extracting Dates and Deadlines
=AI_EXTRACT("The initial design phase will be completed by May 15, 2025. The stakeholder review is scheduled for May 20-22, with development starting June 1. Testing will run through September 15, with final delivery expected by October 3, 2025.", "dates and deadlines")
Sample Output:
Item |
---|
Design completion: May 15, 2025 |
Stakeholder review: May 20-22, 2025 |
Development start: June 1, 2025 |
Testing completion: September 15, 2025 |
Final delivery: October 3, 2025 |
Source Code
import requests
import json
# This function uses AI to extract specific types of information from text.
def ai_extract(text, extract_type, temperature=0.0, model='mistral-small-latest', max_tokens=1000, api_key=None, api_url=None):
"""
Uses AI to extract specific types of information from text.
Args:
text (str or list): The text to analyze (string or 2D list with a single cell)
extract_type (str): Type of information to extract (e.g., 'emails', 'dates', 'action items')
temperature (float, optional): Controls response creativity (0-2). Default is 0.0
model (str, optional): ID of the model to use
max_tokens (int, optional): Maximum tokens for response generation. Default is 1000
api_key (str, optional): API key for authentication (e.g. for Mistral AI)
api_url (str, optional): OpenAI compatible URL. (e.g., https://api.mistral.ai/v1/chat/completions)
Returns:
list: 2D list representing the extracted data as a single column
"""
# Demo fallback logic (Boardflare)
if api_key is None or api_url is None:
if 'idToken' in globals():
api_key = globals()['idToken']
api_url = "https://llm.boardflare.com"
else:
return [["Login on the Functions tab for limited demo usage, or sign up for a free Mistral AI account at https://console.mistral.ai/ and add your own api_key."]]
# Handle 2D list input (flatten to a single string)
if isinstance(text, list):
if len(text) > 0 and len(text[0]) > 0:
text = str(text[0][0])
else:
return [["Error: Empty input text."]]
# Validate temperature
if not (isinstance(temperature, (int, float)) and 0.0 <= float(temperature) <= 2.0):
return [["Error: temperature must be a float between 0.0 and 2.0."]]
# Validate max_tokens
if not (isinstance(max_tokens, int) and max_tokens > 0):
return [["Error: max_tokens must be a positive integer."]]
# Construct a specific prompt for data extraction
extract_prompt = f"Extract the following from the text: {extract_type}\n\nText: {text}"
extract_prompt += "\n\nReturn ONLY a JSON object with a key 'items' whose value is a JSON array of the items you extracted. "
extract_prompt += "Each item should be a single value representing one extracted piece of information. "
extract_prompt += "Do not include any explanatory text, just the JSON object. "
extract_prompt += "For example: {\"items\": [\"item1\", \"item2\", \"item3\"]}"
payload = {
"messages": [{"role": "user", "content": extract_prompt}],
"temperature": temperature,
"model": model,
"max_tokens": max_tokens,
"response_format": {"type": "json_object"}
}
headers = {
"Authorization": f"Bearer {api_key}",
"Content-Type": "application/json",
"Accept": "application/json"
}
try:
response = requests.post(api_url, headers=headers, json=payload)
response.raise_for_status()
response_data = response.json()
content = response_data["choices"][0]["message"]["content"]
try:
extracted_data = json.loads(content)
if isinstance(extracted_data, dict) and "items" in extracted_data:
extracted_data = extracted_data["items"]
elif isinstance(extracted_data, dict):
if "extracted" in extracted_data:
extracted_data = extracted_data["extracted"]
elif "results" in extracted_data:
extracted_data = extracted_data["results"]
if isinstance(extracted_data, list):
return [[item] for item in extracted_data]
else:
return [["Error: Unable to parse response. Expected a list."]]
except (json.JSONDecodeError, ValueError):
return [["Error: Unable to extract data. The AI response wasn't in the expected format."]]
except requests.exceptions.RequestException as e:
return [["Error: API request failed.", str(e)]]