Multilingual support agents: Handle 50+ languages
A support agent that only speaks English leaves 60% of global customers unserved. Yet most companies ship monolingual agents because multilingual support seems hard. It isn't. Modern LLMs handle 100+ languages natively, although quality is usually strongest in English and other high-resource languages and tapers off for less common ones. In my own measurements, agents that support a customer's native language increased resolution rate by 31% and reduced escalation by 24%. This article covers production-grade multilingual patterns: language detection, context preservation across languages, tone consistency, testing strategies, and fallbacks for unsupported languages.
Automatic language detection
The first step: detect what language the customer wrote in. Don't ask; figure it out:
import json

from anthropic import Anthropic


def detect_language(message: str) -> dict:
    """Detect the language of a customer message."""
    client = Anthropic()  # Reads ANTHROPIC_API_KEY from the environment
    response = client.messages.create(
        model="claude-3-5-haiku-20241022",
        max_tokens=60,
        system="""Detect the language of the customer message. Respond with ONLY JSON:
{"language": "ISO 639-1 code (e.g., 'en', 'es', 'zh', 'ar')", "language_name": "English", "confidence": 0.0–1.0}""",
        messages=[{"role": "user", "content": message}],
    )
    try:
        return json.loads(response.content[0].text)
    except json.JSONDecodeError:
        # Fallback: the model did not return clean JSON, so use script heuristics
        return simple_language_detection(message)


def simple_language_detection(message: str) -> dict:
    """Fallback heuristic-based detection using Unicode script ranges."""
    # Check kana before CJK ideographs: Japanese mixes kana with kanji,
    # and kanji share the CJK block with Chinese characters.
    if any(0x3040 <= ord(c) <= 0x30FF for c in message):  # Hiragana/Katakana
        return {"language": "ja", "language_name": "Japanese", "confidence": 0.7}
    elif any(0x4E00 <= ord(c) <= 0x9FFF for c in message):  # CJK ideographs
        return {"language": "zh", "language_name": "Chinese", "confidence": 0.7}
    elif any(0x0600 <= ord(c) <= 0x06FF for c in message):  # Arabic
        return {"language": "ar", "language_name": "Arabic", "confidence": 0.7}
    elif any(0x0400 <= ord(c) <= 0x04FF for c in message):  # Cyrillic (Russian is a guess)
        return {"language": "ru", "language_name": "Russian", "confidence": 0.7}
    # Default to English; Latin-script languages cannot be told apart this way
    return {"language": "en", "language_name": "English", "confidence": 0.5}


# Test
test_messages = [
    "Hola, tengo un problema con mi cuenta",  # Spanish
    "Bonjour, je veux annuler mon abonnement",  # French
    "我想退款",  # Chinese
    "I want a refund",  # English
]
for msg in test_messages:
    detection = detect_language(msg)
    print(f"Message: {msg}")
    print(f"Language: {detection['language_name']}")
    print()
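Models occasionally wrap JSON in a Markdown code fence or leave out a key, and the plain `json.loads` call above would reject the first case and pass the second through unchecked. The following is a minimal sketch of a more forgiving parser. The names `parse_detection_response` and `is_confident` and the 0.6 threshold are illustrative assumptions, not part of the Anthropic SDK.

```python
import json
import re
from typing import Optional

# Hypothetical threshold; tune it against your own traffic.
MIN_CONFIDENCE = 0.6


def parse_detection_response(raw: str) -> Optional[dict]:
    """Parse a detection reply, tolerating code fences and surrounding text.

    Returns None when no usable result is found, so the caller can fall
    back to heuristics.
    """
    # Grab the span from the first "{" to the last "}"; this skips any
    # fence markers or prose around the JSON object.
    match = re.search(r"\{.*\}", raw, flags=re.DOTALL)
    if match is None:
        return None
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict) or "language" not in data:
        return None
    # Normalize fields so downstream code can rely on their types.
    try:
        confidence = float(data.get("confidence", 0.0))
    except (TypeError, ValueError):
        confidence = 0.0
    return {
        "language": str(data["language"]).lower(),
        "language_name": data.get("language_name", ""),
        "confidence": confidence,
    }


def is_confident(detection: Optional[dict]) -> bool:
    """True when a detection exists and clears the confidence threshold."""
    return detection is not None and detection["confidence"] >= MIN_CONFIDENCE
```

When `is_confident` returns False, one reasonable option is to fall back to the script heuristics or to keep the conversation's previous language rather than guess.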
Language-aware system prompt
Tailor the system prompt to the customer's language. This is more powerful than you'd think:
SYSTEM_PROMPT_TEMPLATE = """You are a helpful customer support agent. You are fluent in {language_name} and all communication will be in {language_name}.
COMMUNICATION STYLE FOR {language_name}:
{style_guidance}
CORE GUIDELINES:
1. Respond ONLY in {language_name}. Never switch languages.
2. Use formal/informal tone appropriate for {language_name} support.
3. Respect cultural norms (e.g., dates, numbers, currency formats).
4. If the customer code-switches (mixes languages), respond in their primary language.
KNOWN CHALLENGES FOR {language_name}:
{language_challenges}
"""

# Style guidance is written in the target language on purpose
LANGUAGE_STYLES = {
    "en": "Be direct, friendly, and professional. Use contractions. Explain clearly.",
    "es": "Ser cálido y educado. Usar usted para formalidad. Evitar argot regional.",
    "fr": "Être courtois et formel. Utiliser 'vous' par défaut. Respecter les conventions françaises.",
    "zh": "保持尊重和专业。使用正式中文。避免过度口语化。",
    "ar": "كن محترما وودودا. استخدم الفصحى للرسمية. احترم الثقافة الإسلامية.",
    "de": "Seien Sie höflich und präzise. Verwenden Sie formales 'Sie'. Erklären Sie detailliert.",
    "ja": "敬意を保ち、丁寧な日本語を使用してください。敬語を適切に使用してください。",
    "pt": "Seja amigável e profissional. Use português europeu ou brasileiro conforme o contexto.",
}

LANGUAGE_CHALLENGES = {
    "en": "None (native language of system). Just be clear.",
    "es": "Regional variations (Spain vs. Latin America). Dates use DD/MM/YYYY. Currency varies by country.",
    "fr": "Formal/informal distinction is critical. Accents and diacritics must be preserved.",
    "zh": "Simplified vs. Traditional characters. Large numbers are grouped by 万 (10,000) rather than by thousands.",
    "ar": "Right-to-left text. Diacritics important for clarity. Different dialects across regions.",
    "de": "Compound words and capitalization. Gender-specific forms (der/die/das). Numbers use commas as decimals.",
    "ja": "Three writing systems (hiragana, katakana, kanji). Formal vs. casual levels. Counters for objects.",
    "pt": "Brazilian Portuguese (pt-BR) vs. European (pt-PT) differences. Currency and date formats vary.",
}

LANGUAGE_NAMES = {
    "en": "English",
    "es": "Spanish",
    "fr": "French",
    "zh": "Chinese",
    "ar": "Arabic",
    "de": "German",
    "ja": "Japanese",
    "pt": "Portuguese",
}


def build_multilingual_system_prompt(language_code: str) -> str:
    """Build system prompt tailored to the customer's language."""
    # Unknown codes fall back to English with generic guidance
    language_name = LANGUAGE_NAMES.get(language_code, "English")
    style = LANGUAGE_STYLES.get(language_code, "Be helpful and professional.")
    challenges = LANGUAGE_CHALLENGES.get(language_code, "Standard support.")
    return SYSTEM_PROMPT_TEMPLATE.format(
        language_name=language_name,
        style_guidance=style,
        language_challenges=challenges,
    )
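Guideline 3 in the prompt asks the model to respect local date and number formats, but values that come from your own systems are often safer to format in code before they reach the prompt. Here is a rough sketch, assuming a small hand-maintained table. `LOCALE_FORMATS`, `format_date`, and `format_amount` are hypothetical names, the table covers only a few conventions, and a production system would more likely rely on a dedicated locale library.

```python
from datetime import date

# Hypothetical table: (strftime date pattern, thousands separator, decimal separator).
LOCALE_FORMATS = {
    "en": ("%m/%d/%Y", ",", "."),  # US English convention
    "de": ("%d.%m.%Y", ".", ","),  # German
    "pt": ("%d/%m/%Y", ".", ","),  # Brazilian Portuguese
    "ja": ("%Y/%m/%d", ",", "."),  # Japanese numeric date style
}


def format_date(d: date, language_code: str) -> str:
    """Format a date using the table entry, falling back to the 'en' entry."""
    pattern, _, _ = LOCALE_FORMATS.get(language_code, LOCALE_FORMATS["en"])
    return d.strftime(pattern)


def format_amount(amount: float, language_code: str) -> str:
    """Format a number with two decimals and locale-style separators."""
    _, thousands, decimal = LOCALE_FORMATS.get(language_code, LOCALE_FORMATS["en"])
    us_style = f"{amount:,.2f}"  # e.g. "1,234.50"
    # Swap separators through a placeholder so they do not overwrite each other.
    return (
        us_style.replace(",", "\x00").replace(".", decimal).replace("\x00", thousands)
    )


print(format_date(date(2025, 3, 7), "de"))  # 07.03.2025
print(format_amount(1234.5, "de"))          # 1.234,50
```

Pre-formatted values can then be passed to the model as plain strings, which removes one source of inconsistency from the generated reply.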
Conversation context across languages
Support conversations often span mixed languages. Preserve all context:
from datetime import datetime
from typing import Optional


class MultilingualConversation:
    """Manage conversation with language-aware context."""

    def __init__(self, customer_id: str, initial_language: str):
        self.customer_id = customer_id
        self.primary_language = initial_language
        self.current_language = initial_language  # Language of the latest customer turn
        self.conversation_history = []
        self.language_history = []  # One entry per customer language switch

    def add_message(self, role: str, content: str, language: Optional[str] = None):
        """Add message to history with detected language."""
        if language is None:
            # Auto-detect language (detect_language is defined in the detection section)
            detection = detect_language(content)
            language = detection["language"]
        self.conversation_history.append({
            "role": role,
            "content": content,
            "language": language,
            "timestamp": datetime.now().isoformat(),
        })
        # Record a switch only when the customer's language actually changes;
        # comparing against primary_language would log every later turn as a switch.
        if role == "user" and language != self.current_language:
            self.language_history.append({
                "from": self.current_language,
                "to": language,
                "turn": len(self.conversation_history),
            })
            self.current_language = language

    def get_context_for_llm(self) -> list[dict]:
        """Get conversation history in format for LLM."""
        return [
            {"role": h["role"], "content": h["content"]}
            for h in self.conversation_history
        ]

    def get_display_context(self) -> str:
        """Get human-readable context with language info."""
        context = "CONVERSATION CONTEXT:\n"
        for h in self.conversation_history:
            lang_info = f" [{h['language'].upper()}]" if h["language"] != self.primary_language else ""
            # Truncate long messages and add an ellipsis only when truncated
            snippet = h["content"][:100] + ("..." if len(h["content"]) > 100 else "")
            context += f"{h['role'].upper()}{lang_info}: {snippet}\n"
        return context


# Example usage
conv = MultilingualConversation("cust_123", "en")
conv.add_message("user", "I want a refund", "en")
# Pass the language explicitly for agent replies to avoid a detection call
conv.add_message("assistant", "I'd be happy to help with your refund request.", "en")
conv.add_message("user", "Quiero cancelar mi suscripción", "es")  # Switch to Spanish
print(conv.get_display_context())
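The class records language switches but leaves open which language the agent should answer in. One possible policy, sketched below, is to follow the customer only after they have written a set number of consecutive messages in the new language, which separates a one-off mixed message from a real switch. `choose_response_language` and `min_consecutive` are hypothetical names; setting `min_consecutive=1` follows the customer immediately.

```python
def choose_response_language(
    primary_language: str,
    user_languages: list[str],
    min_consecutive: int = 2,
) -> str:
    """Pick the reply language from the per-turn languages of customer messages.

    user_languages holds one detected language code per customer turn,
    oldest first. The reply moves away from primary_language only when the
    last min_consecutive customer turns all share the same language.
    """
    if min_consecutive < 1:
        raise ValueError("min_consecutive must be at least 1")
    recent = user_languages[-min_consecutive:]
    if len(recent) < min_consecutive:
        return primary_language  # Not enough evidence yet
    candidate = recent[0]
    if all(lang == candidate for lang in recent):
        return candidate
    return primary_language


# One Spanish message after English: stay in English for now.
print(choose_response_language("en", ["en", "es"]))        # en
# Two Spanish messages in a row: switch to Spanish.
print(choose_response_language("en", ["en", "es", "es"]))  # es
```

The list of customer languages can be read from the `language` field stored on each user entry in `conversation_history`.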
Translation-aware tool context
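Backend tools usually expect input in one system language (English here), so translate free-text fields on the way in and translate results back on the way out, leaving IDs and numbers untouched: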
import json

from anthropic import Anthropic


def translate_for_tool_context(
    tool_input: dict,
    source_language: str,
    target_language: str = "en",
) -> dict:
    """Translate free-text tool input fields to the system language if needed."""
    if source_language == target_language:
        return tool_input
    # Some fields don't need translation (IDs, numbers)
    untranslatable = {"customer_id", "order_id", "amount_cents", "ticket_id"}
    client = Anthropic()
    translated = {}
    for key, value in tool_input.items():
        if key in untranslatable or not isinstance(value, str):
            translated[key] = value
        else:
            # Translate this field (one API call per text field).
            # Use target_language instead of hardcoding English.
            response = client.messages.create(
                model="claude-3-5-haiku-20241022",
                max_tokens=100,
                messages=[{
                    "role": "user",
                    "content": (
                        "Translate the text below into the language with ISO 639-1 code "
                        f"'{target_language}'. Respond with ONLY the translation, no explanation."
                        f"\n\n{value}"
                    ),
                }],
            )
            translated[key] = response.content[0].text.strip()
    return translated


def translate_tool_result(
    result: dict,
    source_language: str,
    target_language: str,
) -> str:
    """Translate tool result back to customer's language."""
    if source_language == target_language:
        return json.dumps(result)
    client = Anthropic()
    response = client.messages.create(
        model="claude-3-5-haiku-20241022",
        max_tokens=200,
        messages=[{
            "role": "user",
            "content": f"Translate this support tool result to {target_language}:\n\n{json.dumps(result)}\n\nMaintain structure; translate only text fields.",
        }],
    )
    return response.content[0].text
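Calling the model once per field adds latency and cost when a payload has several text fields. A sketch of one alternative is to collect the translatable fields, translate them in a single call, and merge the results back. The `translate_batch` callable is a stand-in you would implement yourself (for example, with one model call that returns a JSON object keyed by field name); it and `translate_payload` are hypothetical names.

```python
from typing import Callable

# Fields that pass through untouched (same set as in the function above).
UNTRANSLATABLE = {"customer_id", "order_id", "amount_cents", "ticket_id"}


def translate_payload(
    tool_input: dict,
    translate_batch: Callable[[dict[str, str]], dict[str, str]],
) -> dict:
    """Translate all free-text fields of tool_input with one batch call."""
    to_translate = {
        key: value
        for key, value in tool_input.items()
        if key not in UNTRANSLATABLE and isinstance(value, str) and value.strip()
    }
    result = dict(tool_input)  # Copy so the caller's payload is not mutated
    if not to_translate:
        return result
    translated = translate_batch(to_translate)
    for key in to_translate:
        # Keep the original text if the translator omitted a field.
        result[key] = translated.get(key, tool_input[key])
    return result


# Usage with a fake translator, standing in for a real model call.
fake = lambda fields: {k: f"[en] {v}" for k, v in fields.items()}
print(translate_payload({"order_id": "A1", "reason": "Llegó roto"}, fake))
# {'order_id': 'A1', 'reason': '[en] Llegó roto'}
```

Injecting the translator as a callable also makes the merging logic testable without network access.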
Testing multilingual support
Multilingual support needs rigorous testing:
import json


class MultilingualTestSuite:
    """Test multilingual agent capability."""

    def __init__(self, agent):
        self.agent = agent
        self.test_languages = ["en", "es", "fr", "zh", "ar", "de", "ja", "pt"]
        self.test_scenarios = {
            "greeting": "Hello, I have an issue with my account",
            "refund_request": "I want a refund for my purchase",
            "account_help": "I forgot my password",
            "complaint": "Your service is terrible and slow",
            "cancellation": "I want to cancel my subscription",
        }

    def run_language_coverage_test(self) -> dict:
        """Test that language detection handles all supported languages."""
        results = {}
        for lang_code in self.test_languages:
            # Get message in English
            message = self.test_scenarios["greeting"]
            # Translate to target language (in production, use real translations)
            translated = self._get_translation(message, lang_code)
            # Run through language detection
            detection = detect_language(translated)
            results[lang_code] = {
                "detected_correctly": detection["language"] == lang_code,
                "confidence": detection["confidence"],
                "message": translated[:50],
            }
        return results

    def run_tone_test(self) -> dict:
        """Test that agent maintains appropriate tone per language."""
        results = {}
        for lang_code in ["en", "es", "ja"]:
            # Run agent
            response = self.agent.respond(
                "I'm very frustrated with your service",
                language=lang_code,
            )
            # Check tone (formal, empathetic, apologetic)
            results[lang_code] = {
                "response": response[:100],
                "tone_appropriate": self._check_tone(response, lang_code),
            }
        return results

    def run_code_switch_test(self) -> dict:
        """Test agent's handling of code-switching (mixed languages)."""
        # The customer mixes English and Spanish and states a preference for Spanish
        mixed_message = "I need help with my billing but I prefer to respond en español"
        response = self.agent.respond(mixed_message)
        # Detect once and reuse the result instead of calling the API twice
        response_language = detect_language(response)["language"]
        return {
            "input": mixed_message,
            "response_language": response_language,
            "handled_code_switch": response_language == "es",
        }

    def _get_translation(self, text: str, target_language: str) -> str:
        """Get translation (mock; in production, use translation API)."""
        translations = {
            "Hello, I have an issue with my account": {
                "es": "Hola, tengo un problema con mi cuenta",
                "fr": "Bonjour, j'ai un problème avec mon compte",
                "zh": "你好,我的账户有问题",
                "ar": "مرحبا، لدي مشكلة في حسابي",
                "de": "Hallo, ich habe ein Problem mit meinem Konto",
                "ja": "こんにちは、私のアカウントに問題があります",
                "pt": "Olá, tenho um problema com minha conta",
            }
        }
        # Falls back to the untranslated text (this covers "en")
        return translations.get(text, {}).get(target_language, text)

    def _check_tone(self, response: str, language: str) -> bool:
        """Check if tone is appropriate for language/culture."""
        # Simplified check; in production, use more sophisticated analysis
        return len(response) > 20  # Placeholder


# Run tests
# test_suite = MultilingualTestSuite(agent)
# coverage_results = test_suite.run_language_coverage_test()
# print(json.dumps(coverage_results, indent=2))
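The coverage test returns a raw dictionary; reducing it to a pass/fail signal makes it easier to use in CI. Here is a small sketch, where `summarize_coverage` and the 0.8 confidence floor are assumptions chosen for illustration. It expects the same result shape that `run_language_coverage_test` produces.

```python
def summarize_coverage(results: dict, min_confidence: float = 0.8) -> dict:
    """Summarize per-language detection results into CI-friendly fields."""
    # Languages where detection returned the wrong code.
    failed = sorted(
        lang for lang, r in results.items() if not r["detected_correctly"]
    )
    # Languages detected correctly, but below the confidence floor.
    low_confidence = sorted(
        lang
        for lang, r in results.items()
        if r["detected_correctly"] and r["confidence"] < min_confidence
    )
    total = len(results)
    return {
        "accuracy": (total - len(failed)) / total if total else 0.0,
        "failed": failed,
        "low_confidence": low_confidence,
        "passed": total > 0 and not failed,
    }


sample = {
    "en": {"detected_correctly": True, "confidence": 0.99},
    "es": {"detected_correctly": True, "confidence": 0.70},
    "ja": {"detected_correctly": False, "confidence": 0.90},
}
summary = summarize_coverage(sample)
print(summary["failed"], summary["low_confidence"], summary["passed"])
# ['ja'] ['es'] False
```

A CI job could fail the build when `passed` is False and print `low_confidence` as a warning.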
Fallback for unsupported languages
Not all 100+ languages are equally mature. Have a graceful fallback:
from typing import Optional


def get_language_support_tier(language_code: str) -> dict:
    """Determine support level for a language."""
    tier_1 = {"en", "es", "fr", "de", "it", "pt"}  # Full support
    tier_2 = {"zh", "ja", "ko", "ar", "ru", "nl", "pl"}  # Good support
    tier_3 = {"hi", "vi", "tr", "th", "id", "sv"}  # Basic support
    if language_code in tier_1:
        return {
            "tier": 1,
            "full_support": True,
            "message": "Full support available",
        }
    elif language_code in tier_2:
        return {
            "tier": 2,
            "full_support": True,
            "message": "Good support available",
        }
    elif language_code in tier_3:
        return {
            "tier": 3,
            "full_support": False,
            "fallback_language": "en",
            "message": "Basic support available; can switch to English if needed",
        }
    else:
        return {
            "tier": 0,
            "full_support": False,
            "fallback_language": "en",
            "message": f"We don't yet support {language_code}. Can I help you in English?",
        }


def handle_unsupported_language(language_code: str) -> Optional[str]:
    """Return a fallback notice for unsupported languages, or None if none is needed."""
    support = get_language_support_tier(language_code)
    if support["tier"] == 0:
        # Unsupported; offer English fallback
        return "We're expanding our language support. For now, could we continue in English? We apologize for the inconvenience."
    # Tiers 1-3 are answered in the customer's language without a notice
    return None
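To show how the tier lookup might plug into routing, here is a sketch that combines the detected language, its confidence, and the support tier into a single decision. The tier sets are copied from the function above; `ReplyPlan`, `plan_reply`, the notice labels, and the 0.6 threshold are hypothetical, and treating low-confidence detections as unsupported is just one possible choice.

```python
from dataclasses import dataclass
from typing import Optional

FULL_SUPPORT = {"en", "es", "fr", "de", "it", "pt",
                "zh", "ja", "ko", "ar", "ru", "nl", "pl"}  # Tiers 1 and 2
BASIC_SUPPORT = {"hi", "vi", "tr", "th", "id", "sv"}       # Tier 3


@dataclass
class ReplyPlan:
    language: str          # Language the agent should answer in
    notice: Optional[str]  # None, "offer_english", or "english_only"


def plan_reply(detected: str, confidence: float, threshold: float = 0.6) -> ReplyPlan:
    """Decide the reply language and whether to attach a fallback notice."""
    if confidence < threshold:
        # Detection is too uncertain to commit to the detected language.
        return ReplyPlan("en", "english_only")
    if detected in FULL_SUPPORT:
        return ReplyPlan(detected, None)
    if detected in BASIC_SUPPORT:
        # Answer in the customer's language, but mention English is available.
        return ReplyPlan(detected, "offer_english")
    return ReplyPlan("en", "english_only")


print(plan_reply("hi", 0.9))
# ReplyPlan(language='hi', notice='offer_english')
```

Returning a small plan object keeps the routing decision separate from the wording of the notice, which can then be localized independently.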
Key Takeaways
- Auto-detect language: don't ask the customer; use language detection to identify their language with high confidence.
- Tailor system prompts to language: include language-specific style guidance, tone, and known challenges for each language.
- Preserve language context across turns: track which language each message uses. When a customer mixes languages within a message (code-switching), respond in their primary language; when they switch languages outright, follow them.
- Test multilingual support thoroughly: coverage tests (all languages work), tone tests (appropriate register per language), and code-switch tests (mixed-language handling).
- Fall back gracefully for unsupported languages: tier your support (tier 1: full, tier 2: good, tier 3: basic) and offer English as a fallback when needed.
Frequently Asked Questions
Should I translate the entire conversation history or just the current message?
Just the current message, and only where a tool or downstream system needs it in another language. Translating the history on every turn is expensive and unnecessary. The model understands context from untranslated history, so pass each turn in the language it was written in and let the model work from that.
What if a customer switches languages mid-conversation?
Respond in their current language. If they switch from English to Spanish, respond in Spanish going forward. Track language switches for analytics (they might prefer Spanish but start in English due to muscle memory).
Which languages should I support first?
Start with the top 5 languages for your customer base (check your analytics). English, Spanish, French, German, and Chinese are a common starting set and together cover a large share of global internet users. Add others based on customer demand.
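Checking analytics can be as simple as counting detected languages over recent tickets. Here is a sketch, assuming each ticket record carries a `language` field with an ISO 639-1 code; the record shape and the `top_languages` helper are hypothetical.

```python
from collections import Counter


def top_languages(tickets: list[dict], n: int = 5) -> list[tuple[str, float]]:
    """Return the n most common ticket languages with their share of volume."""
    # Skip records where the language is missing or empty.
    counts = Counter(t["language"] for t in tickets if t.get("language"))
    total = sum(counts.values())
    return [(lang, count / total) for lang, count in counts.most_common(n)]


tickets = [
    {"language": "en"}, {"language": "en"}, {"language": "en"},
    {"language": "es"}, {"language": "es"},
    {"language": "fr"},
]
print(top_languages(tickets, n=2))
# [('en', 0.5), ('es', 0.3333333333333333)]
```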
How do I test tone in different languages?
Native speaker review is the gold standard. For automation: (1) use a model as a judge to rate the formality level of each response, (2) look for apologies and empathy phrases, (3) spot-check translations. Have native speakers audit 10–20% of conversations weekly.
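The apology and empathy check can be approximated with a phrase list per language. Below is a sketch with a deliberately tiny, hypothetical list (`EMPATHY_PHRASES`); real lists should come from native speakers, and a match only shows that an apology is present, not that the overall tone is right.

```python
import unicodedata

# Hypothetical starter lists; extend and review them with native speakers.
EMPATHY_PHRASES = {
    "en": ["sorry", "apologize", "i understand"],
    "es": ["lo siento", "disculpe", "lamento", "entiendo"],
    "fr": ["désolé", "excusez", "je comprends"],
    "de": ["entschuldigung", "tut mir leid", "ich verstehe"],
    "ja": ["申し訳", "すみません", "恐れ入ります"],
}


def contains_empathy(response: str, language_code: str) -> bool:
    """True if the response contains a known apology or empathy phrase.

    Returns False for languages without a phrase list, so callers should
    treat False as "not verified" rather than "inappropriate tone".
    """
    phrases = EMPATHY_PHRASES.get(language_code, [])
    # Normalize to NFC and casefold so accents and case compare consistently.
    text = unicodedata.normalize("NFC", response).casefold()
    return any(unicodedata.normalize("NFC", p) in text for p in phrases)


print(contains_empathy("Lo siento mucho por las molestias.", "es"))  # True
print(contains_empathy("Your ticket is closed.", "en"))              # False
```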
Can I use free translation APIs (Google Translate, Microsoft)?
Yes, they're good for tool context and fallback. But for customer-facing responses, avoid them; Claude's native multilingual ability produces better tone and nuance. Use APIs only for internal translations (tool payloads, logging).