AI_EXTRACT
Overview
The AI_EXTRACT
function uses an AI model to extract specific types of information from unstructured text, such as emails, reports, or meeting notes, and returns structured data (like dates, contacts, or key points). It leverages the Mistral AI API, which provides advanced large language models for text analysis and extraction. For more information, see the Mistral AI GitHub repository and the Mistral AI documentation .
This example function is provided as-is without any representation of accuracy.
Usage
To use the function in Excel, enter it as a formula in a cell:
=AI_EXTRACT(text, extract_type, [temperature], [model], [max_tokens], [api_key], [api_url])
text
(str or 2D list, required): The text or cell reference containing the data to analyze.extract_type
(str, required): The type of information to extract (e.g., “emails”, “dates”, “key points”).temperature
(float, optional, default=0.0): Controls the randomness/creativity of the response (0.0 to 2.0).model
(str, optional, default=“mistral-small-latest”): The specific AI model ID to use (must support JSON mode).max_tokens
(int, optional, default=1000): Maximum number of tokens for the generated list content.api_key
(str, optional): API key for authentication. Get a free API key from Mistral AI .api_url
(str, optional): OpenAI-compatible API endpoint URL (e.g.,https://api.mistral.ai/v1/chat/completions
).
The function returns a single-column 2D list of extracted items, or a string error message if extraction fails or input is invalid.
Examples
Example 1: Extracting Client Names from Meeting Notes
=AI_EXTRACT("During today's annual review, we discussed progress with Acme Corporation, Global Enterprises, and TechSolutions Inc. All three clients reported satisfaction with our services.", "client names")
Expected output:
Item |
---|
Acme Corporation |
Global Enterprises |
TechSolutions Inc. |
Example 2: Extracting Financial Metrics from a Report
=AI_EXTRACT("Q1 results exceeded expectations with revenue of $2.4M, an EBITDA margin of 18.5%, and customer acquisition costs decreasing by 12%. Cash reserves stand at $5.2M and our runway extends to 24 months.", "financial metrics")
Expected output:
Item |
---|
Revenue: $2.4M |
EBITDA margin: 18.5% |
Customer acquisition costs: -12% |
Cash reserves: $5.2M |
Runway: 24 months |
Example 3: Extracting Action Items from Email
=AI_EXTRACT("Hi team, Following our strategic planning session: 1) Mark needs to finalize the budget by Friday, 2) Sarah will contact vendors for new quotes, 3) Development team must provide timeline estimates by next Wednesday, and 4) Everyone should review the new marketing materials.", "action items")
Expected output:
Item |
---|
Mark needs to finalize the budget by Friday |
Sarah will contact vendors for new quotes |
Development team must provide timeline estimates by next Wednesday |
Everyone should review the new marketing materials |
Example 4: Extracting Contact Information from Business Cards
=AI_EXTRACT("John Smith\nSenior Project Manager\nInnovative Solutions Inc.\[email protected]\n+1 (555) 123-4567\n123 Business Avenue, Suite 400\nSan Francisco, CA 94107", "contact information")
Expected output:
Item |
---|
Name: John Smith |
Title: Senior Project Manager |
Company: Innovative Solutions Inc. |
Email: [email protected] |
Phone: +1 (555) 123-4567 |
Address: 123 Business Avenue, Suite 400, San Francisco, CA 94107 |
Example 5: Extracting Dates and Deadlines
=AI_EXTRACT("The initial design phase will be completed by May 15, 2025. The stakeholder review is scheduled for May 20-22, with development starting June 1. Testing will run through September 15, with final delivery expected by October 3, 2025.", "dates and deadlines")
Expected output:
Item |
---|
Design completion: May 15, 2025 |
Stakeholder review: May 20-22, 2025 |
Development start: June 1, 2025 |
Testing completion: September 15, 2025 |
Final delivery: October 3, 2025 |
Python Code
import requests
import json
def ai_extract(text, extract_type, temperature=0.0, model="mistral-small-latest", max_tokens=1000, api_key=None, api_url=None):
"""
Uses an AI model to extract specific types of information from unstructured text.
Args:
text: The text to analyze (str or 2D list with a single cell).
extract_type: The type of information to extract (e.g., 'emails', 'dates', 'action items').
temperature: Controls response creativity (float, optional, default=0.0, range 0.0-2.0).
model: The model ID to use (str, optional, default="mistral-small-latest").
max_tokens: Maximum tokens for response generation (int, optional, default=1000).
api_key: API key for authentication (str, optional).
api_url: OpenAI-compatible API endpoint URL (str, optional).
Returns:
2D list: Single-column list of extracted items, or a string error message if extraction fails or input is invalid.
This example function is provided as-is without any representation of accuracy.
"""
# Demo fallback logic (Boardflare)
if api_key is None or api_url is None:
if "idToken" in globals():
api_key = globals()["idToken"]
api_url = "https://llm.boardflare.com"
else:
return "Login on the Functions tab for limited demo usage, or sign up for a free Mistral AI account at https://console.mistral.ai/ and add your own api_key."
# Handle 2D list input (flatten to a single string)
if isinstance(text, list):
if len(text) > 0 and len(text[0]) > 0:
text = str(text[0][0])
else:
return "Error: Empty input text."
# Validate temperature
if not (isinstance(temperature, (int, float)) and 0.0 <= float(temperature) <= 2.0):
return "Error: temperature must be a float between 0.0 and 2.0."
# Validate max_tokens
if not (isinstance(max_tokens, int) and max_tokens > 0):
return "Error: max_tokens must be a positive integer."
# Construct a specific prompt for data extraction
extract_prompt = f"Extract the following from the text: {extract_type}\n\nText: {text}"
extract_prompt += "\n\nReturn ONLY a JSON object with a key 'items' whose value is a JSON array of the items you extracted. "
extract_prompt += "Each item should be a single value representing one extracted piece of information. "
extract_prompt += "Do not include any explanatory text, just the JSON object. "
extract_prompt += 'For example: {"items": ["item1", "item2", "item3"]}'
payload = {
"messages": [{"role": "user", "content": extract_prompt}],
"temperature": temperature,
"model": model,
"max_tokens": max_tokens,
"response_format": {"type": "json_object"}
}
headers = {
"Authorization": f"Bearer {api_key}",
"Content-Type": "application/json",
"Accept": "application/json"
}
try:
response = requests.post(api_url, headers=headers, json=payload)
response.raise_for_status()
response_data = response.json()
content = response_data["choices"][0]["message"]["content"]
try:
extracted_data = json.loads(content)
if isinstance(extracted_data, dict) and "items" in extracted_data:
extracted_data = extracted_data["items"]
elif isinstance(extracted_data, dict):
if "extracted" in extracted_data:
extracted_data = extracted_data["extracted"]
elif "results" in extracted_data:
extracted_data = extracted_data["results"]
if isinstance(extracted_data, list):
return [[item] for item in extracted_data]
else:
return "Error: Unable to parse response. Expected a list."
except (json.JSONDecodeError, ValueError):
return "Error: Unable to extract data. The AI response wasn't in the expected format."
except requests.exceptions.RequestException as e:
return f"Error: API request failed. {str(e)}"
Live Notebook
Edit this function in a live notebook .