Integrating Image Generation into Applications

Embedding image generation into a product is different from running a standalone generation tool. A product must handle variation in user input, enforce quality and safety constraints, manage latency, and scale cost-effectively. This article covers API design, user prompt sanitization, client-server workflows, and deployment patterns for production applications.

Application Architecture Patterns

Backend-as-a-Service: Asynchronous Generation

Most production apps use asynchronous generation: the user submits a request, receives a job ID immediately, and then polls for the result. This keeps generation latency out of the HTTP request/response cycle:

from datetime import datetime, timezone
from typing import Optional
import uuid

from fastapi import BackgroundTasks, FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI()

# In-memory job store (use a database in production)
jobs = {}


class GenerationRequest(BaseModel):
    """JSON body for POST /generate, matching what the JavaScript client sends."""
    prompt: str
    style: Optional[str] = "photorealistic"


@app.post("/generate")
async def submit_generation(request: GenerationRequest, background_tasks: BackgroundTasks):
    """Submit an image generation request."""
    # Reject empty prompts early; full sanitization happens in build_safe_prompt()
    user_prompt = request.prompt.strip()
    if not user_prompt:
        raise HTTPException(status_code=422, detail="Prompt must not be empty")

    job_id = str(uuid.uuid4())

    job = {
        "id": job_id,
        "status": "queued",
        "user_prompt": user_prompt,
        "style": request.style,
        "created_at": datetime.now(timezone.utc).isoformat(),
        "result_url": None,
        "error": None,
    }

    jobs[job_id] = job

    # Queue generation as a background task; it runs after the response is sent
    background_tasks.add_task(generate_and_store, job_id, user_prompt, request.style)

    return {
        "job_id": job_id,
        "status": "queued",
        "check_url": f"/status/{job_id}",
    }


@app.get("/status/{job_id}")
async def check_status(job_id: str):
    """Check generation status."""
    if job_id not in jobs:
        # FastAPI does not support Flask-style (body, status) tuples
        raise HTTPException(status_code=404, detail="Job not found")

    job = jobs[job_id]
    return {
        "id": job_id,
        "status": job["status"],
        "result_url": job["result_url"],
        "error": job["error"],
    }


def generate_and_store(job_id: str, user_prompt: str, style: str):
    """Background task to generate image.

    Defined with plain `def` so FastAPI runs it in a thread pool and the
    blocking generate_image() call does not stall the event loop.
    """
    try:
        jobs[job_id]["status"] = "generating"

        # Build complete prompt with safety constraints.
        # build_safe_prompt() and generate_image() are application-specific
        # helpers that are not defined in this snippet.
        full_prompt = build_safe_prompt(user_prompt, style)

        # Generate image
        result = generate_image(full_prompt)

        # Store result
        result_url = f"https://cdn.example.com/{job_id}.jpg"
        # Actually upload `result` to a storage service here

        jobs[job_id]["status"] = "completed"
        jobs[job_id]["result_url"] = result_url

    except Exception as e:
        jobs[job_id]["status"] = "failed"
        jobs[job_id]["error"] = str(e)
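The in-memory job store is lost on restart and is not shared between worker processes. As a minimal sketch of a persistent alternative, assuming SQLite is acceptable on a single host, a small store might look like the following. The `JobStore` class, its method names, and the table layout are hypothetical and not part of the service above:

```python
import sqlite3
import threading
import uuid
from datetime import datetime, timezone
from typing import Optional


class JobStore:
    """Hypothetical SQLite-backed replacement for the in-memory jobs dict."""

    def __init__(self, path: str = ":memory:"):
        # One shared connection, guarded by a lock so threads can use it safely
        self._conn = sqlite3.connect(path, check_same_thread=False)
        self._conn.row_factory = sqlite3.Row
        self._lock = threading.Lock()
        with self._lock:
            self._conn.execute(
                """CREATE TABLE IF NOT EXISTS jobs (
                       id TEXT PRIMARY KEY,
                       status TEXT NOT NULL,
                       user_prompt TEXT NOT NULL,
                       style TEXT,
                       created_at TEXT NOT NULL,
                       result_url TEXT,
                       error TEXT
                   )"""
            )
            self._conn.commit()

    def create(self, user_prompt: str, style: str) -> str:
        """Insert a queued job and return its ID."""
        job_id = str(uuid.uuid4())
        created_at = datetime.now(timezone.utc).isoformat()
        with self._lock:
            self._conn.execute(
                "INSERT INTO jobs (id, status, user_prompt, style, created_at) "
                "VALUES (?, 'queued', ?, ?, ?)",
                (job_id, user_prompt, style, created_at),
            )
            self._conn.commit()
        return job_id

    def update(self, job_id: str, status: str,
               result_url: Optional[str] = None, error: Optional[str] = None) -> None:
        """Record a status change along with the result URL or error text."""
        with self._lock:
            self._conn.execute(
                "UPDATE jobs SET status = ?, result_url = ?, error = ? WHERE id = ?",
                (status, result_url, error, job_id),
            )
            self._conn.commit()

    def get(self, job_id: str) -> Optional[dict]:
        """Return the job as a dict, or None if the ID is unknown."""
        with self._lock:
            row = self._conn.execute(
                "SELECT * FROM jobs WHERE id = ?", (job_id,)
            ).fetchone()
        return dict(row) if row is not None else None
```

With something like this in place, the endpoints would call `store.create(...)`, `store.get(...)`, and `store.update(...)` instead of reading and writing `jobs` directly. A multi-host deployment would need a shared database or queue instead.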

Real-Time Generation with WebSocket

For low-latency applications (creative tools, design apps), a WebSocket connection lets the server push status updates and results without polling:

from fastapi import WebSocket, WebSocketDisconnect


@app.websocket("/ws/generate")
async def websocket_generate(websocket: WebSocket):
    """WebSocket endpoint for real-time generation."""
    await websocket.accept()

    try:
        while True:
            # Receive user prompt
            data = await websocket.receive_json()
            user_prompt = data.get("prompt", "")
            if not user_prompt:
                await websocket.send_json({"status": "error", "message": "Missing prompt"})
                continue

            # Send status update
            await websocket.send_json({"status": "generating", "message": "Starting generation..."})

            # Generate image.
            # build_safe_prompt() and generate_image_async() are
            # application-specific helpers not defined in this snippet.
            full_prompt = build_safe_prompt(user_prompt, data.get("style", "photorealistic"))
            result = await generate_image_async(full_prompt)

            # Send result back
            await websocket.send_json({
                "status": "completed",
                "image_url": result["url"],
                "metadata": {
                    "seed": result.get("seed"),
                    "model": result.get("model"),
                    "generation_time": result.get("time_ms"),
                },
            })

    except WebSocketDisconnect:
        # Client closed the connection; there is nothing left to send or close
        return
    except Exception as e:
        await websocket.send_json({"status": "error", "message": str(e)})
        await websocket.close()

User Prompt Engineering and Sanitization

Prompt Template System

Rather than letting users enter arbitrary prompts, guide them with templates:

import re


class PromptTemplate:
    def __init__(self):
        self.templates = {
            "product_photography": {
                "base": "professional product photograph of {object}",
                "style": "studio lighting, white background, sharp focus, 8k",
                "forbidden": ["person", "face", "human"],
                "required_elements": ["object"],
            },
            "portrait": {
                "base": "portrait of {description}",
                "style": "professional photography, soft lighting, sharp focus",
                "forbidden": ["gore", "violence", "explicit"],
                "required_elements": ["description"],
            },
            "landscape": {
                "base": "{environment} landscape",
                "style": "scenic photography, natural lighting, landscape photography",
                "forbidden": ["person", "human"],
                "required_elements": ["environment"],
            },
        }

    @staticmethod
    def _contains_term(text: str, term: str) -> bool:
        """Whole-word, case-insensitive match that also accepts a trailing 's'.

        Plain substring checks would reject harmless words such as
        "personal" because they contain "person".
        """
        return re.search(rf"\b{re.escape(term)}s?\b", text, re.IGNORECASE) is not None

    def build_prompt(self, template_name: str, user_inputs: dict) -> str:
        """Build a safe prompt from template and user inputs."""
        if template_name not in self.templates:
            raise ValueError(f"Unknown template: {template_name}")

        template = self.templates[template_name]

        # Verify required inputs are present
        for required in template["required_elements"]:
            if required not in user_inputs:
                raise ValueError(f"Missing required input: {required}")

        # Build base prompt
        try:
            prompt = template["base"].format(**user_inputs)
        except KeyError as e:
            raise ValueError(f"Invalid input for template: {e}") from e

        # Add style
        prompt += f", {template['style']}"

        # Check for forbidden terms
        for forbidden in template["forbidden"]:
            if self._contains_term(prompt, forbidden):
                raise ValueError(f"Forbidden term in prompt: {forbidden}")

        return prompt


# Usage
template = PromptTemplate()

# Safe product generation
product_prompt = template.build_prompt(
    "product_photography",
    {"object": "wireless headphones in black"},
)
# Result: "professional product photograph of wireless headphones in black, studio lighting..."

# Safe portrait
portrait_prompt = template.build_prompt(
    "portrait",
    {"description": "woman with red hair in casual attire"},
)

Content Safety Filtering

import re


class ContentSafetyFilter:
    def __init__(self):
        self.blocked_terms = [
            "violence", "gore", "explicit", "adult", "nsfw",
            "weapon", "gun", "knife", "hate", "racist",
        ]
        self.warnings = ["injury", "accident", "dangerous"]

    @staticmethod
    def _contains_term(text: str, term: str) -> bool:
        """Whole-word, case-insensitive match that also accepts a trailing 's'.

        Substring checks would flag "begun" for "gun" or "chateau" for "hate".
        A keyword list is only a first-pass check, not a complete policy.
        """
        return re.search(rf"\b{re.escape(term)}s?\b", text, re.IGNORECASE) is not None

    def filter_prompt(self, prompt: str) -> dict:
        """Check prompt for safety issues."""
        # Check blocked terms
        for term in self.blocked_terms:
            if self._contains_term(prompt, term):
                raise ValueError(f"Content policy violation: {term}")

        # Check for warnings (alert but allow)
        warnings = []
        for term in self.warnings:
            if self._contains_term(prompt, term):
                warnings.append(f"Warning: prompt contains '{term}'")

        return {"safe": True, "warnings": warnings}

    def filter_generation_result(self, image_path: str) -> dict:
        """Validate generated image for safety (placeholder)."""
        # Real implementation would use content moderation API
        # (Google Vision API, AWS Rekognition, OpenAI moderation)
        return {"safe": True, "flags": []}


# Usage (named safety_filter to avoid shadowing the built-in filter())
safety_filter = ContentSafetyFilter()
try:
    result = safety_filter.filter_prompt("a beautiful landscape with mountains")
    print("Prompt approved:", result)
except ValueError as e:
    print("Prompt rejected:", e)
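The server code earlier calls `build_safe_prompt(user_prompt, style)` without defining it. One way such a helper could be put together, as a sketch only: normalize the input, cap its length, reject blocked terms, then append a style suffix. The style table (including the `illustration` entry), the 500-character limit, and the shortened blocked-term list are illustrative assumptions, not values from a real policy:

```python
import re

MAX_PROMPT_CHARS = 500  # assumed limit; tune for the target model

# Assumed mapping from style name to prompt suffix
STYLE_SUFFIXES = {
    "photorealistic": "photorealistic, natural lighting, sharp focus",
    "illustration": "digital illustration, clean lines",
}

# Shortened example list; a real deployment would load this from policy config
BLOCKED_TERMS = ["violence", "gore", "explicit", "nsfw"]


def build_safe_prompt(user_prompt: str, style: str = "photorealistic") -> str:
    """Normalize, validate, and decorate a user prompt (hypothetical helper)."""
    # Replace control characters, then collapse runs of whitespace
    cleaned = re.sub(r"[\x00-\x1f\x7f]", " ", user_prompt)
    cleaned = re.sub(r"\s+", " ", cleaned).strip()

    if not cleaned:
        raise ValueError("Prompt is empty")
    if len(cleaned) > MAX_PROMPT_CHARS:
        raise ValueError(f"Prompt exceeds {MAX_PROMPT_CHARS} characters")
    if style not in STYLE_SUFFIXES:
        raise ValueError(f"Unknown style: {style}")

    # Whole-word match so "gore" does not reject unrelated words containing it
    for term in BLOCKED_TERMS:
        if re.search(rf"\b{re.escape(term)}\b", cleaned, re.IGNORECASE):
            raise ValueError(f"Content policy violation: {term}")

    return f"{cleaned}, {STYLE_SUFFIXES[style]}"
```

Restricting `style` to a fixed table keeps the suffix under the application's control, so only the free-text portion of the prompt needs filtering.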

Client-Side Optimization

Progressive Loading and Caching

// JavaScript client for image generation

class ImageGenerationClient {
  constructor(apiBaseUrl) {
    this.apiBaseUrl = apiBaseUrl;
    this.cache = new Map();   // cacheKey -> completed status payload
    this.pending = new Map(); // jobId -> cacheKey
  }

  // Submit generation request
  async generate(prompt, style = "photorealistic") {
    // Check cache first
    const cacheKey = `${prompt}:${style}`;
    if (this.cache.has(cacheKey)) {
      return this.cache.get(cacheKey); // already completed; no job to poll
    }

    // Submit to backend
    const response = await fetch(`${this.apiBaseUrl}/generate`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ prompt, style })
    });
    if (!response.ok) {
      throw new Error(`Submit failed with HTTP ${response.status}`);
    }

    const data = await response.json(); // { job_id, status, check_url }
    this.pending.set(data.job_id, cacheKey);
    return data;
  }

  // Poll for completion (defaults: every 2 s for up to 60 s)
  async waitForCompletion(jobId, maxAttempts = 30, intervalMs = 2000) {
    for (let attempt = 0; attempt < maxAttempts; attempt++) {
      const response = await fetch(`${this.apiBaseUrl}/status/${jobId}`);
      if (!response.ok) {
        throw new Error(`Status check failed with HTTP ${response.status}`);
      }
      const data = await response.json();

      if (data.status === "completed") {
        // Cache under the key recorded at submit time; the status
        // payload does not include the prompt or style
        const cacheKey = this.pending.get(jobId);
        if (cacheKey) {
          this.cache.set(cacheKey, data);
        }
        this.pending.delete(jobId);
        return data;
      }

      if (data.status === "failed") {
        this.pending.delete(jobId);
        throw new Error(`Generation failed: ${data.error}`);
      }

      // Wait before next poll
      await new Promise(resolve => setTimeout(resolve, intervalMs));
    }

    throw new Error("Generation timeout");
  }

  // Progressive image loading with blur-up
  loadImageProgressive(container, imageUrl, placeholderUrl) {
    return new Promise((resolve, reject) => {
      const img = new Image();
      container.appendChild(img);

      // Show the blurred placeholder while the full image downloads
      if (placeholderUrl) {
        img.src = placeholderUrl;
        img.style.filter = "blur(10px)";
      }

      // Preload the full image off-screen, then swap it in
      const full = new Image();
      full.onload = () => {
        img.src = imageUrl;
        img.style.filter = "none";
        resolve(img);
      };
      full.onerror = () => reject(new Error(`Failed to load image: ${imageUrl}`));
      full.src = imageUrl;
    });
  }
}

// Usage
const client = new ImageGenerationClient("https://api.example.com");

async function generateAndDisplay() {
  try {
    // Submit generation (returns a completed result directly on a cache hit)
    const job = await client.generate("a beautiful sunset over mountains");

    // Wait for completion unless the result came from the cache
    let result = job;
    if (job.status !== "completed") {
      console.log("Generation queued:", job.job_id);
      result = await client.waitForCompletion(job.job_id);
    }
    console.log("Generation complete:", result.result_url);

    // Display with progressive loading
    const container = document.getElementById("image-container");
    await client.loadImageProgressive(container, result.result_url);

  } catch (error) {
    console.error("Generation failed:", error);
  }
}

Deployment and Scaling

Docker Container for Consistent Environment

FROM python:3.11-slim

WORKDIR /app

# Install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application
COPY app.py .
COPY templates/ ./templates/

# Environment variables
ENV MODEL=stable-diffusion-3
ENV MAX_RETRIES=3
ENV QUEUE_FILE=/tmp/generation_queue.jsonl

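# Health check (assumes the app serves GET /health). urllib is in the standard
# library, and urlopen raises on HTTP error responses, so the check fails on non-2xx.
HEALTHCHECK --interval=30s --timeout=10s CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8000/health', timeout=5)"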

EXPOSE 8000

CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]

Kubernetes Deployment with Scaling

apiVersion: apps/v1
kind: Deployment
metadata:
  name: image-generation-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: image-generation
  template:
    metadata:
      labels:
        app: image-generation
    spec:
      containers:
        - name: generation-api
          image: image-generation:latest
          ports:
            - containerPort: 8000
          env:
            - name: MODEL
              value: "stable-diffusion-3"
            - name: QUEUE_FILE
              value: "/tmp/generation_queue.jsonl"
          resources:
            requests:
              memory: "4Gi"
              cpu: "2"
            limits:
              memory: "8Gi"
              cpu: "4"
          livenessProbe:
            httpGet:
              path: /health
              port: 8000
            initialDelaySeconds: 30
            periodSeconds: 10
---
apiVersion: v1
kind: Service
metadata:
  name: image-generation-service
spec:
  type: LoadBalancer
  selector:
    app: image-generation
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8000
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: generation-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: image-generation-service
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
Key Takeaways

  • Use asynchronous generation for most applications: submit request, return job ID, poll for results.
  • Sanitize user prompts using templates and content safety filters.
  • Cache generation results to reduce API calls and latency.
  • Implement progressive loading and status feedback on the client.
  • Deploy in containers with horizontal scaling to handle load variation.
  • Monitor costs, latency, and quality metrics to optimize the pipeline.

Frequently Asked Questions

How long should I wait before timing out a generation request?

Generation time varies with model, resolution, and hardware; 10–30 seconds is typical. A 60-second timeout with status polling every 2–3 seconds leaves headroom. After a timeout, tell the user and offer a retry.
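The same timeout policy can be expressed on any client. The sketch below assumes a caller-supplied `fetch_status` function (hypothetical) that returns the JSON from `/status/{job_id}` as a dict. It uses a monotonic deadline rather than an attempt count, so slow status calls do not stretch the overall timeout:

```python
import time
from typing import Callable


def wait_for_completion(
    fetch_status: Callable[[str], dict],
    job_id: str,
    timeout_s: float = 60.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Poll until the job completes, fails, or the deadline passes."""
    deadline = clock() + timeout_s
    while True:
        status = fetch_status(job_id)

        if status["status"] == "completed":
            return status
        if status["status"] == "failed":
            raise RuntimeError(f"Generation failed: {status.get('error')}")

        # Give up if the next poll would land after the deadline
        if clock() + interval_s > deadline:
            raise TimeoutError(f"Job {job_id} did not finish within {timeout_s:.0f}s")
        sleep(interval_s)
```

The `sleep` and `clock` parameters exist only so the loop can be tested without real waiting; production callers would leave them at their defaults.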

Should I store generated images permanently or cache them?

Cache based on hash of (prompt, style, seed). Set a TTL (time-to-live) of 30–90 days for commonly generated images, then delete to save storage. For user-specific generations, store per-user for reuse and history.
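A minimal sketch of that caching scheme follows, assuming an in-process dictionary stands in for the real cache (Redis, a database table, or object-store metadata would be more typical). The `cache_key` and `TTLCache` names are hypothetical, and whitespace normalization of the prompt is an assumption about what should count as "the same" request:

```python
import hashlib
import json
import time
from typing import Callable, Dict, Optional, Tuple


def cache_key(prompt: str, style: str, seed: Optional[int]) -> str:
    """Stable SHA-256 key over (prompt, style, seed)."""
    payload = json.dumps(
        {"prompt": " ".join(prompt.split()), "style": style, "seed": seed},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class TTLCache:
    """Maps cache keys to result URLs and forgets them after ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.time):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}

    def set(self, key: str, url: str) -> None:
        self._entries[key] = (url, self._clock() + self._ttl)

    def get(self, key: str) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        url, expires_at = entry
        if self._clock() >= expires_at:
            # Expired: drop the entry so the image can be deleted or regenerated
            del self._entries[key]
            return None
        return url
```

For example, `TTLCache(ttl_seconds=30 * 24 * 3600)` would correspond to the 30-day end of the range above. Including the seed in the key means unseeded requests (`seed=None`) all share one entry per prompt and style, which may or may not be the behavior a product wants.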

How do I prevent abuse (users requesting harmful content)?

Use content safety filters on input (block known harmful terms) and output (use moderation APIs). Log failed attempts. Implement rate limiting per user/IP address. For high-risk apps, require human review of unusual requests.
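Per-user or per-IP rate limiting can be sketched with a sliding window. The class below is a hypothetical in-process example; the limits are placeholders, and a multi-replica deployment would need shared state (for instance in Redis) rather than a local dictionary:

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowRateLimiter:
    """Allow at most max_requests per client within any window_seconds span."""

    def __init__(
        self,
        max_requests: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_id: str) -> bool:
        """Return True and record the request if the client is under its limit."""
        now = self._clock()
        window = self._hits[client_id]

        # Discard timestamps that have aged out of the window
        while window and now - window[0] >= self.window_seconds:
            window.popleft()

        if len(window) >= self.max_requests:
            return False

        window.append(now)
        return True
```

An endpoint would call `limiter.allow(user_id_or_ip)` before queuing a job and respond with HTTP 429 when it returns `False`.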

Can I reduce generation latency?

Yes: pre-warm models, use faster models, cache and reuse results, and run generation in parallel across multiple servers. For real-time apps, use specialized low-latency models if available.
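Parallel generation is easiest to reason about with a bounded number of in-flight requests. The sketch below assumes a caller-supplied async `generate_one(prompt)` coroutine (hypothetical) that sends a prompt to one of the generation backends, and caps concurrency with a semaphore so a large batch does not overload them:

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence


async def generate_batch(
    prompts: Sequence[str],
    generate_one: Callable[[str], Awaitable[str]],
    max_concurrency: int = 4,
) -> List[str]:
    """Run generate_one over all prompts with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run(prompt: str) -> str:
        # Each task waits here until a concurrency slot is free
        async with semaphore:
            return await generate_one(prompt)

    # gather() returns results in the same order as the input prompts
    return await asyncio.gather(*(run(p) for p in prompts))
```

The right value for `max_concurrency` depends on how many backend workers or GPUs are available; 4 is only a placeholder.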

What's the cost of running a generation service?

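It varies by model and volume. As rough estimates, take SD3 at $0.005–0.01 per image and DALL-E 3 at $0.02 per image; provider prices change and depend on resolution and quality tier, so check current pricing. At 1,000 images/day (about 30,000 images/month), that works out to roughly $150–300/month at the SD3 estimate and about $600/month at $0.02 per image. Track costs and alert when spending exceeds budget.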
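The monthly figure is just volume times unit price, so it is easy to recompute when prices or traffic change. A minimal sketch, using the per-image estimates quoted above, an assumed 30-day month, and hypothetical function names:

```python
def monthly_cost(images_per_day: int, price_per_image: float, days_per_month: int = 30) -> float:
    """Estimated monthly spend in dollars: volume x days x unit price."""
    return images_per_day * days_per_month * price_per_image


def over_budget(images_per_day: int, price_per_image: float, monthly_budget: float) -> bool:
    """True when the projected monthly spend exceeds the budget."""
    return monthly_cost(images_per_day, price_per_image) > monthly_budget


if __name__ == "__main__":
    # 1,000 images/day at the rough per-image estimates from the text
    for label, price in [("low", 0.005), ("mid", 0.01), ("high", 0.02)]:
        print(f"{label}: ${monthly_cost(1000, price):,.0f}/month")
```

A scheduled job could call `over_budget(...)` with the observed daily volume and raise an alert when it returns `True`.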

Further Reading