Function Calling and Tool Integration
Models with function calling can execute code, query databases, and interact with external systems. Scope documents test “the LLM” while ignoring that the model now has execution capabilities. The model’s role shifts from text generation to action execution, creating attack surface that doesn’t exist in chat-only implementations.
Function Calling Mechanics
Function calling allows models to invoke predefined functions based on user requests. The application defines available functions with schemas, the model decides when to call them, and the application executes the calls.
Basic implementation:
def get_weather(location: str) -> str:
"""Get current weather for a location"""
return f"Weather in {location}: 72°F, sunny"
def get_user_email(user_id: int) -> str:
"""Get user email address"""
return db.query("SELECT email FROM users WHERE id = ?", user_id)
tools = [
{
"name": "get_weather",
"description": "Get current weather for a location",
"parameters": {"type": "object", "properties": {"location": {"type": "string"}}}
},
{
"name": "get_user_email",
"description": "Get user email address",
"parameters": {"type": "object", "properties": {"user_id": {"type": "integer"}}}
}
]
response = model.chat(messages, tools=tools)
if response.tool_calls:
for call in response.tool_calls:
result = execute_function(call.name, call.arguments)
The model chooses which function to call and with what parameters. The application executes these choices. This delegation of execution decisions creates the security problem.
Unauthorized Function Access
Functions inherit the application’s permissions, not the user’s permissions.
Missing Authorization Checks
def get_user_data(user_id: int) -> dict:
"""Retrieve user account data"""
return db.query("SELECT * FROM users WHERE id = ?", user_id)
def handle_request(authenticated_user_id, user_message):
response = model.chat([{"role": "user", "content": user_message}], tools=tools)
if response.tool_calls:
for call in response.tool_calls:
if call.name == "get_user_data":
# No check that user_id matches authenticated_user_id
result = get_user_data(call.arguments["user_id"])
return result
User 123 is authenticated. They prompt: “Show me user 456’s data.” Model calls get_user_data(456). Application executes without verifying user 123 has permission to access user 456’s data.
The function executes with application privileges, not user privileges. Authorization must happen in the application layer before function execution.
Function Visibility Without Access Control
admin_tools = [
{"name": "delete_user", "description": "Delete a user account"},
{"name": "reset_database", "description": "Reset database to initial state"}
]
regular_tools = [
{"name": "get_weather", "description": "Get weather information"}
]
def handle_request(user_role, message):
if user_role == "admin":
tools = admin_tools + regular_tools
else:
tools = regular_tools
response = model.chat([{"role": "user", "content": message}], tools=tools)
This attempts role-based access control by showing different tools to different users. But it relies on the model not being manipulated into calling admin functions.
Correct approach:
def execute_function(user_role, function_name, arguments):
if function_name in ["delete_user", "reset_database"] and user_role != "admin":
raise PermissionError(f"User lacks permission for {function_name}")
return globals()[function_name](**arguments)
Parameter Injection and Validation
Models generate function parameters. These parameters might be malicious.
SQL Injection Through Function Parameters
def search_users(query: str) -> list:
"""Search users by name"""
sql = f"SELECT * FROM users WHERE name LIKE '%{query}%'"
return db.execute(sql)
User prompt: “Search for users named admin’ OR ‘1’=‘1”
Model calls: search_users("admin' OR '1'='1")
Application executes: SELECT * FROM users WHERE name LIKE '%admin' OR '1'='1%'
SQL injection via model-generated parameter.
Fix requires validation:
def search_users(query: str) -> list:
"""Search users by name"""
return db.execute("SELECT * FROM users WHERE name LIKE ?", f"%{query}%")
Path Traversal in File Operations
def read_file(filename: str) -> str:
"""Read a file from the documents directory"""
path = f"/var/app/documents/{filename}"
with open(path, 'r') as f:
return f.read()
User prompt: “Read file ../../../etc/passwd”
Model calls: read_file("../../../etc/passwd")
Application reads: /var/app/documents/../../../etc/passwd which resolves to /etc/passwd
Fix:
import os
def read_file(filename: str) -> str:
"""Read a file from the documents directory"""
base_dir = "/var/app/documents/"
path = os.path.join(base_dir, filename)
if not os.path.abspath(path).startswith(base_dir):
raise ValueError("Invalid file path")
with open(path, 'r') as f:
return f.read()
Command Injection
def ping_host(hostname: str) -> str:
"""Ping a host to check connectivity"""
import subprocess
result = subprocess.run(f"ping -c 1 {hostname}", shell=True, capture_output=True)
return result.stdout.decode()
User prompt: “Ping localhost; cat /etc/passwd”
Model calls: ping_host("localhost; cat /etc/passwd")
Application executes: ping -c 1 localhost; cat /etc/passwd
Fix:
def ping_host(hostname: str) -> str:
"""Ping a host to check connectivity"""
import subprocess
import re
if not re.match(r'^[a-zA-Z0-9.-]+$', hostname):
raise ValueError("Invalid hostname")
result = subprocess.run(["ping", "-c", "1", hostname], capture_output=True)
return result.stdout.decode()
Chained Function Calls
Models can chain multiple function calls in sequence. This creates multi-step attack opportunities.
Information Gathering to Exploitation
def list_users() -> list:
"""List all user IDs"""
return db.query("SELECT id FROM users")
def get_user_details(user_id: int) -> dict:
"""Get detailed user information"""
return db.query("SELECT * FROM users WHERE id = ?", user_id)
def send_email(to: str, subject: str, body: str) -> bool:
"""Send an email"""
mail.send(to, subject, body)
return True
User prompt: “Email all users their account details”
Model chains:
list_users()- gets [1, 2, 3, 4, 5]get_user_details(1)- gets {email: “user1@example.com”, …}send_email("user1@example.com", "Your Details", "...")- Repeat for each user
Each individual function is reasonable. Chained together, they exfiltrate all user data via email.
Rate Limiting and Resource Exhaustion
Functions consume resources. Unlimited function calling can exhaust resources or incur costs.
API Cost Exhaustion
def web_search(query: str) -> list:
"""Search the web using paid search API"""
return paid_api.search(query) # Costs $0.01 per search
User prompt: “Search for 10,000 different variations of ‘weather’”
Model generates 10,000 search function calls. Application executes all of them. Cost: $100.
Without rate limiting on function calls, adversaries exhaust API quotas or incur significant costs.
Database Load
def complex_analytics(start_date: str, end_date: str, filters: dict) -> dict:
"""Run complex analytics query"""
return db.expensive_query(start_date, end_date, filters) # Takes 30 seconds
User prompt: “Run analytics for every day in 2023 separately”
Model generates 365 function calls. Each takes 30 seconds. Total execution time: 3 hours.
Testing Function Calling Security
Scope needs to specify function calling testing explicitly.
Function Enumeration
Test discovery of available functions:
- Can users enumerate all available functions?
- Can function schemas be extracted?
- Are hidden/undocumented functions callable?
Authorization Testing
For each function, test:
- Can unauthorized users call it?
- Can users call it with parameters outside their scope?
- Can authorization be bypassed through prompt manipulation?
Parameter Validation
For each function, test parameters for:
- SQL injection
- Command injection
- Path traversal
- Integer overflow
- Type confusion
Scope Language for Function Calling
Bad scope:
Test the AI assistant including function calling features.
Better scope:
Test function calling security for the following functions:
1. get_weather(location: str)
- Parameter validation (location parameter for injection)
- Rate limiting (max 100 calls per user per hour)
2. get_user_data(user_id: int)
- Authorization (verify user can only access own data)
- Parameter validation (user_id for SQL injection)
3. send_email(to: str, subject: str, body: str)
- Authorization (verify user owns recipient email)
- Rate limiting (max 10 calls per user per day)
- Parameter validation (email header injection)
4. execute_query(sql: str)
- Authorization (admin only)
- Parameter validation (SQL injection, dangerous commands)
Test scenarios:
- Unauthorized function access
- Parameter injection
- Function call chains for privilege escalation
- Rate limit exhaustion
Conclusion
Function calling transforms models from text generators into action executors. This fundamentally changes the attack surface. Testing “the LLM” isn’t sufficient when the model can query databases, execute code, and interact with external systems.
Scope documents that don’t explicitly address function calling will miss this entire class of vulnerabilities.
[Original Source](No response)