Error Feedback Loops: Teaching LLMs Self-Correction
An error feedback loop is a control mechanism in which a validation failure triggers corrective feedback to the LLM, which then retries. Instead of letting the request fail silently, you tell the LLM what went wrong and ask it to fix it. Reported quality improvements are in the range of 30–40% (Wei et al., 2022), though the actual gain depends on the task, the model, and the validator.
In production systems, error feedback loops are among the most cost-effective reliability tools. They require no model retraining and no guardrail libraries, just well-crafted feedback messages. This article shows how to design and implement them.
How Feedback Loops Work
A feedback loop has three phases:
- Generation: LLM generates output.
- Validation: You check the output against your schema or business rules.
- Repair (on failure): You send validation errors back to the LLM with a corrective prompt, asking it to fix the output.
Here's the cycle:
[Generate] -> [Validate] -> [Valid?]
                              |-- Yes -> Return output
                              |-- No  -> [Send feedback] -> [Regenerate] -> back to [Validate]
The key is crafting effective feedback messages that are specific enough for the LLM to understand what went wrong.
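The three phases can be sketched without reference to any particular API. The following is a minimal illustration, not a definitive implementation: `run_feedback_loop`, `fake_generate`, and `fake_validate` are hypothetical names, and the fake generator stands in for a real model call so the control flow can be run and tested offline.

```python
from typing import Callable, Optional


def run_feedback_loop(
    generate: Callable[[Optional[str]], str],
    validate: Callable[[str], Optional[str]],
    max_attempts: int = 3,
) -> Optional[str]:
    """Generate, validate, and repair until the output passes or attempts run out.

    generate(feedback) returns raw model output; feedback is None on the first call.
    validate(output) returns None when the output is valid, else an error message.
    """
    feedback: Optional[str] = None
    for _ in range(max_attempts):
        output = generate(feedback)  # Generation (or regeneration)
        error = validate(output)     # Validation
        if error is None:
            return output
        # Repair: carry the specific error into the next generation call
        feedback = f"Your previous response was rejected: {error}"
    return None


# Stand-in for a model call (assumption): fails once, then succeeds when given feedback.
def fake_generate(feedback: Optional[str]) -> str:
    return "not json" if feedback is None else '{"name": "Ada"}'


def fake_validate(output: str) -> Optional[str]:
    return None if output.startswith("{") else "output must be a JSON object"


result = run_feedback_loop(fake_generate, fake_validate)
```

Passing the generator and validator in as callables keeps the loop itself independent of the model client and the schema library, which also makes it easy to unit test.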
Simple Feedback Loop in Python
Start with a basic loop:
import json
from typing import Optional

import anthropic


def extract_with_retry(text: str, schema: dict, max_retries: int = 3) -> Optional[dict]:
    """Extract structured data with automatic error correction."""
    client = anthropic.Anthropic()
    schema_json = json.dumps(schema, indent=2)
    prompt = f"""Extract customer data from the text. Return ONLY valid JSON matching this schema:
{schema_json}
Text: {text}
JSON output (no markdown, no extra text):"""

    for attempt in range(max_retries):
        response = client.messages.create(
            model="claude-3-5-sonnet-20241022",
            max_tokens=500,
            messages=[{"role": "user", "content": prompt}],
        )
        output = response.content[0].text

        # Attempt to parse and validate
        try:
            parsed = json.loads(output)
            # Validate against schema (simplified: checks required keys only)
            if isinstance(parsed, dict) and all(key in parsed for key in schema.get("required", [])):
                return parsed
        except json.JSONDecodeError:
            pass

        # Validation failed: the next attempt sends corrective feedback
        # instead of repeating the original prompt
        prompt = f"""Your previous response had issues:
- Output was not valid JSON or was missing required fields.
Please try again. Return ONLY valid JSON with NO markdown or extra text.
Schema: {schema_json}
Text: {text}
JSON output:"""

    return None
This basic loop gives the LLM three chances to produce valid output. On each failure, it receives explicit feedback.
Specific Error Feedback
More sophisticated feedback identifies the exact validation error:
def detailed_error_feedback(parsed: dict, schema: dict) -> Optional[str]:
    """Generate specific error feedback for validation failures."""
    if not isinstance(parsed, dict):
        return f"Your output must be a JSON object, got {type(parsed).__name__}."
    required = schema.get("required", [])
    properties = schema.get("properties", {})
    errors = []

    # Check required fields
    for field in required:
        if field not in parsed:
            errors.append(f"Missing required field: {field}")

    # Check field types
    for field, value in parsed.items():
        if field in properties:
            expected_type = properties[field].get("type")
            actual_type = type(value).__name__
            if expected_type and not matches_type(value, expected_type):
                errors.append(f"Field '{field}' must be {expected_type}, got {actual_type}")

    if not errors:
        return None
    return "Your output had validation errors:\n" + "\n".join(f"- {e}" for e in errors)


def matches_type(value, expected_type: str) -> bool:
    """Check if value matches expected JSON type."""
    type_map = {
        "string": str,
        "integer": int,
        "number": (int, float),
        "boolean": bool,
        "array": list,
        "object": dict,
        "null": type(None),
    }
    # bool is a subclass of int in Python, so reject True/False for numeric types
    if expected_type in ("integer", "number") and isinstance(value, bool):
        return False
    return isinstance(value, type_map.get(expected_type, object))
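# Usage (inside the retry loop; `output`, `schema`, and `text` come from the surrounding code)
import jsonschema

try:
    parsed = json.loads(output)
    jsonschema.validate(parsed, schema)
except json.JSONDecodeError as e:
    # Not parseable at all: report where parsing failed
    error_msg = f"Your output was not valid JSON: {e.msg} (line {e.lineno}, column {e.colno})"
except jsonschema.ValidationError as e:
    # Fall back to jsonschema's own message for constraints the helper does not check
    error_msg = detailed_error_feedback(parsed, schema) or f"Your output had validation errors:\n- {e.message}"
else:
    error_msg = None

if error_msg:
    # Send detailed feedback to LLM
    feedback_prompt = f"""{error_msg}
Please fix the errors and return valid JSON again.
Schema: {json.dumps(schema, indent=2)}
Text: {text}
JSON output:"""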
The more specific your error message, the higher the chance the LLM fixes it correctly.
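For nested output, a message is more specific when it names the exact location of the problem. The sketch below extends the idea of the helper above to nested objects and arrays. It is an illustration under stated assumptions: `collect_errors` is a hypothetical name, the `$`-rooted path notation is just one possible convention, and only `type`, `required`, `properties`, and `items` are handled, so a full validator is still needed for real schemas.

```python
from typing import Any, List

# Maps JSON Schema type names to Python types (same idea as matches_type above).
_TYPES = {
    "string": str, "integer": int, "number": (int, float), "boolean": bool,
    "array": list, "object": dict, "null": type(None),
}


def collect_errors(value: Any, schema: dict, path: str = "$") -> List[str]:
    """Return path-qualified error messages for a small subset of JSON Schema."""
    expected = schema.get("type")
    if expected in _TYPES:
        is_bool_as_number = isinstance(value, bool) and expected in ("integer", "number")
        if is_bool_as_number or not isinstance(value, _TYPES[expected]):
            # Stop here: children of a wrongly typed value can't be checked meaningfully
            return [f"{path}: expected {expected}, got {type(value).__name__}"]
    errors: List[str] = []
    if isinstance(value, dict):
        for field in schema.get("required", []):
            if field not in value:
                errors.append(f"{path}.{field}: missing required field")
        for field, sub_schema in schema.get("properties", {}).items():
            if field in value:
                errors.extend(collect_errors(value[field], sub_schema, f"{path}.{field}"))
    elif isinstance(value, list) and "items" in schema:
        for i, item in enumerate(value):
            errors.extend(collect_errors(item, schema["items"], f"{path}[{i}]"))
    return errors


# Hypothetical schema and model output, for illustration only.
order_schema = {
    "type": "object",
    "required": ["customer", "items"],
    "properties": {
        "customer": {"type": "string"},
        "items": {
            "type": "array",
            "items": {
                "type": "object",
                "required": ["sku", "quantity"],
                "properties": {"sku": {"type": "string"}, "quantity": {"type": "integer"}},
            },
        },
    },
}
bad_output = {"items": [{"sku": "A-1", "quantity": "two"}, {"sku": "B-2"}]}
for message in collect_errors(bad_output, order_schema):
    print(message)
```

Each message pairs a location with a single problem, so the feedback prompt can list them one per line and the LLM can address each in turn.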
Multi-Turn Feedback with Conversation History
For complex extractions, maintain conversation history across retries. This context helps the LLM understand the task:
import json
from typing import Optional

import anthropic
import jsonschema


def extract_with_conversation(text: str, schema: dict, max_turns: int = 3) -> Optional[dict]:
    """Extract with multi-turn conversation for error correction."""
    client = anthropic.Anthropic()
    messages = []
    # Initial request
    user_msg = f"""Extract data from: {text}
Return valid JSON matching: {json.dumps(schema)}
JSON only, no markdown."""

    for turn in range(max_turns):
        messages.append({"role": "user", "content": user_msg})
        response = client.messages.create(
            model="claude-3-5-sonnet-20241022",
            max_tokens=500,
            messages=messages,
        )
        assistant_msg = response.content[0].text
        messages.append({"role": "assistant", "content": assistant_msg})

        # Validate
        try:
            parsed = json.loads(assistant_msg)
            jsonschema.validate(parsed, schema)
            return parsed
        except (json.JSONDecodeError, jsonschema.ValidationError) as e:
            # Corrective request for the next turn: the error and the fix instruction
            # go into a single user message so that roles keep alternating
            user_msg = f"""Your previous response had a validation error: {str(e)[:200]}
Please fix it and return valid JSON matching the schema.
JSON only."""

    return None
The conversation history provides context. The LLM can "see" its previous attempt and the error, enabling better corrections.
Comparison: Feedback Strategies
| Strategy | Improvement | Setup Time | Cost | Best For |
|---|---|---|---|---|
| Blind retry (same prompt) | 10–15% | Minimal | High (extra calls) | Testing, low-stakes |
| Generic feedback | 20–25% | Low | Medium | Most production systems |
| Specific error feedback | 30–35% | Medium | Medium | Validation-heavy systems |
| Multi-turn conversation | 35–40% | High | Medium-High | Complex extractions, high value |
Key Takeaways
- Error feedback loops can increase output quality by up to 30–40% with no model retraining.
- Specific, detailed error messages are more effective than generic "try again" prompts.
- Maintain conversation history across retries for context and better corrections.
- Set a retry limit (2–3 retries is typical) to avoid infinite loops and excessive costs.
- Log all validation failures and feedback attempts for analysis.
Frequently Asked Questions
How many retries should I allow?
Most production systems use 2–3 retries. Beyond that, diminishing returns kick in and costs climb. If a task fails 3 times, it likely needs a different approach (schema change, model switch, or human escalation).
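The diminishing returns can be made concrete with a back-of-the-envelope model. The sketch below assumes, unrealistically, that every attempt succeeds independently with the same probability `p`; real loops only approximate this, since feedback changes the odds and stubborn inputs tend to keep failing. `retry_budget` is a hypothetical helper name.

```python
from typing import Tuple


def retry_budget(p: float, max_attempts: int) -> Tuple[float, float]:
    """Return (overall success probability, expected number of LLM calls).

    Assumes every attempt succeeds independently with probability p,
    which real feedback loops only approximate.
    """
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    fail_all = (1 - p) ** max_attempts
    success = 1 - fail_all
    # Attempt k happens only if the previous k-1 attempts failed:
    # E[calls] = sum_{k=1..n} (1-p)^(k-1) = (1 - (1-p)^n) / p
    expected_calls = success / p
    return success, expected_calls


for n in range(1, 6):
    success, calls = retry_budget(0.6, n)
    print(f"{n} attempts: success={success:.3f}, expected calls={calls:.2f}")
```

Under this model with `p = 0.6`, a third attempt lifts the overall success rate from 84% to about 93.6%, while a fourth adds fewer than four percentage points. The worst-case cost and latency per request still grow linearly with the attempt limit.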
How do I know if my feedback was effective?
Track success rates before and after feedback. If feedback raises the fix rate from 60% to 85%, it's working. Log the feedback messages and analyze whether certain feedback patterns are more effective.
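One simple way to compare feedback patterns is to log each repair attempt with a label for the feedback style and whether the retry produced valid output, then compute a fix rate per style. The following is a sketch only: `fix_rates`, the style labels, and the log entries are hypothetical and would be replaced by whatever your own logging records.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def fix_rates(records: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Compute the share of repair attempts that produced valid output, per feedback style.

    Each record is (feedback_style, fixed), logged once per repair attempt.
    """
    totals: Dict[str, int] = defaultdict(int)
    fixed_counts: Dict[str, int] = defaultdict(int)
    for style, fixed in records:
        totals[style] += 1
        fixed_counts[style] += int(fixed)
    return {style: fixed_counts[style] / totals[style] for style in totals}


# Hypothetical log entries; in practice these would come from your own logging.
log = [
    ("generic", True), ("generic", False), ("generic", False), ("generic", True),
    ("specific", True), ("specific", True), ("specific", True), ("specific", False),
]
rates = fix_rates(log)
print(rates)
```

With only a handful of records the rates are noisy, so compare styles over a meaningful sample before changing your prompts.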
Can I use feedback loops with streaming responses?
Yes, but it's complex. You'd buffer the response, then validate and retry. For streaming, consider validating incrementally (once a complete JSON object is available) or using constrained generation.
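Incremental validation can be approximated with the standard library by buffering chunks and attempting to parse after each one. This is a sketch under assumptions: `first_json_object` is a hypothetical helper, the chunks are plain strings rather than a real streaming response, and only top-level objects and arrays are accepted so that a partially streamed number or literal is never mistaken for a complete value.

```python
import json
from typing import Any, Iterable, Optional


def first_json_object(chunks: Iterable[str]) -> Optional[Any]:
    """Buffer streamed text and return the first complete JSON object or array, if any."""
    decoder = json.JSONDecoder()
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        candidate = buffer.lstrip()
        if not candidate.startswith(("{", "[")):
            continue  # Nothing parseable yet (or the model emitted non-JSON text)
        try:
            # raw_decode parses one JSON value from the start and ignores trailing text
            value, _end = decoder.raw_decode(candidate)
        except json.JSONDecodeError:
            continue  # Still incomplete: wait for more chunks
        return value
    return None  # Stream ended without a complete object: treat as a validation failure


# Simulated stream (assumption): the object completes in the third chunk.
stream = ['{"name": "A', 'da", "age"', ": 36}", " trailing text"]
print(first_json_object(stream))
```

Once a value is returned, it can go through the same schema validation and feedback path as a non-streamed response.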
What if the LLM keeps failing the same way?
It's time to change the schema, the prompt, or the model. Repeating the same feedback isn't productive. Escalate to a human or fall back to a default value.
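A loop can detect this situation itself by remembering the errors it has already reported and stopping as soon as one repeats. The sketch below assumes a hypothetical `attempt` callable that wraps one generate-and-validate round; the status strings are illustrative, not a standard.

```python
from typing import Callable, Optional, Set


def repair_until_stuck(
    attempt: Callable[[Optional[str]], Optional[str]],
    max_attempts: int = 3,
) -> str:
    """Run repair attempts, stopping early when the same error comes back.

    attempt(previous_error) returns None on success, else a validation error message.
    Returns "ok", "stuck" (an error repeated), or "exhausted" (attempt limit hit).
    """
    seen: Set[str] = set()
    error: Optional[str] = None
    for _ in range(max_attempts):
        error = attempt(error)
        if error is None:
            return "ok"
        if error in seen:
            return "stuck"  # The same feedback would be sent again: escalate instead
        seen.add(error)
    return "exhausted"


# Stand-in for a model that keeps omitting the same field (assumption).
def always_missing_email(previous_error: Optional[str]) -> Optional[str]:
    return "Missing required field: email"


status = repair_until_stuck(always_missing_email, max_attempts=5)
print(status)
```

A "stuck" result is the signal to escalate to a human, switch models, or fall back to a default, rather than spend the remaining attempts on feedback that has already failed.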