Inpainting Techniques: Precision Image Editing
Inpainting, as covered in this tutorial, is a diffusion-based editing technique that regenerates selected regions of an image while preserving the surrounding context. Unlike traditional content-aware fill or clone stamp tools, which copy or synthesize from nearby pixels, a diffusion model conditions on the scene's semantic content and on a text prompt, so it can generate plausible replacements that blend with the existing image. This tutorial covers mask design, prompt context for inpainting, and production workflows for batch editing.
How Inpainting Works
Inpainting takes three inputs: (1) an original image, (2) a mask indicating which pixels to edit (white = regenerate, black = preserve; intermediate grays, where the tool supports them, blend the two), and (3) a text prompt describing what should appear in the masked region. The diffusion model then denoises only the masked area while conditioning on both the prompt and the surrounding pixels, which is what lets the new content blend with its surroundings.
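Conceptually, inpainting keeps the unmasked regions fixed in latent space and lets the diffusion process change only the masked region. A common implementation re-imposes the known region at every denoising step by replacing the unmasked latents with a correspondingly noised copy of the original; dedicated inpainting checkpoints may instead take the mask and the masked image as extra model inputs. Either way, details outside the mask (lighting, shadows, perspective) are largely preserved while the masked content can be regenerated completely. The key challenge is designing masks that indicate precisely what should be edited without introducing hard boundaries.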
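The per-step blend described above can be sketched with plain NumPy arrays. This is a toy illustration, not any particular library's implementation: `blend_known_region` is a hypothetical helper, and the 4x4 arrays stand in for real latents.

```python
import numpy as np


def blend_known_region(denoised, original_noised, mask):
    """Keep generated values where mask == 1 and original values where mask == 0.

    All arrays share one shape; mask holds floats in [0, 1], so soft
    (feathered) masks produce a weighted mix at the boundary.
    """
    return mask * denoised + (1.0 - mask) * original_noised


# Toy 4x4 "latent": zeros stand in for the original, ones for the model output
original = np.zeros((4, 4))
generated = np.ones((4, 4))
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1.0  # edit only the central 2x2 block

blended = blend_known_region(generated, original, mask)
```

Only the central block takes the generated values; every other position keeps the original.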
Mask Design Strategies
Hard Masks vs. Soft Masks
A hard mask uses sharp 0/255 boundaries, indicating clear edit regions:
Hard mask example (binary):
- White pixels (255): regenerate this area
- Black pixels (0): preserve this area
- Sharp transition between regions
Soft masks use feathering or gradients to create smooth transitions:
Soft mask benefits:
- Feathered edges blend seamlessly with surrounding pixels
- Reduces visible boundaries and artifacts at mask edges
- Useful when editing regions that should smoothly fade (e.g., removing objects)
For most production work, soft masks outperform hard masks. Feather edges by 10–30 pixels depending on resolution and the size of the edit region. In Python using PIL:
from PIL import Image, ImageFilter
import numpy as np

def create_feathered_mask(width, height, object_bounds, feather_radius=20):
    """Create a soft-edged mask with feathering."""
    # Create binary mask
    mask = Image.new("L", (width, height), 0)
    mask.paste(255, object_bounds)  # object_bounds = (left, top, right, bottom)
    # Apply Gaussian blur for feathering
    mask = mask.filter(ImageFilter.GaussianBlur(radius=feather_radius))
    return mask

# Usage
original_image = Image.open("product.jpg")
width, height = original_image.size
mask = create_feathered_mask(width, height, object_bounds=(100, 100, 400, 500), feather_radius=25)
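One caveat with blurring a binary mask: a Gaussian blur feathers in both directions, so pixels at the object's edge end up only about half masked and traces of the object can survive. A possible workaround, sketched below, is to grow the white region before blurring. The function name `create_expanded_feathered_mask` and the choice to grow by exactly the feather radius are assumptions for illustration.

```python
from PIL import Image, ImageFilter


def create_expanded_feathered_mask(width, height, object_bounds, feather_radius=20):
    """Grow the white region, then feather it, so the object stays well covered."""
    mask = Image.new("L", (width, height), 0)
    mask.paste(255, object_bounds)  # object_bounds = (left, top, right, bottom)
    # MaxFilter requires an odd kernel size; this grows the white area by
    # feather_radius pixels on every side before the blur softens the edge.
    mask = mask.filter(ImageFilter.MaxFilter(2 * feather_radius + 1))
    return mask.filter(ImageFilter.GaussianBlur(radius=feather_radius))


expanded = create_expanded_feathered_mask(100, 100, (30, 30, 70, 70), feather_radius=5)
```

Because the dilation works on any mask shape, the same two filters can be applied to hand-drawn or segmentation masks, not just rectangles.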
Bounding Box Masks
For object replacement, a bounding box mask is fast and effective:
def create_bounding_box_mask(width, height, bbox, feather=20):
    """Create a mask from a bounding box (xmin, ymin, xmax, ymax)."""
    mask = Image.new("L", (width, height), 0)
    xmin, ymin, xmax, ymax = bbox
    mask.paste(255, (xmin, ymin, xmax, ymax))
    mask = mask.filter(ImageFilter.GaussianBlur(radius=feather))
    return mask

# Remove a person from a photo by masking their bounding box
bbox = (150, 50, 400, 600)  # x_min, y_min, x_max, y_max
mask = create_bounding_box_mask(width, height, bbox, feather=20)
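Bounding boxes from a detector or a quick manual selection often fit the object tightly and miss its shadow or reflection. A small helper that pads the box and clamps it to the image can help. This is a minimal sketch: `pad_bbox`, its default padding, and the 800x620 image size are illustrative assumptions.

```python
def pad_bbox(bbox, width, height, padding=16):
    """Expand (xmin, ymin, xmax, ymax) by `padding` pixels, clamped to the image."""
    xmin, ymin, xmax, ymax = bbox
    return (
        max(0, xmin - padding),
        max(0, ymin - padding),
        min(width, xmax + padding),
        min(height, ymax + padding),
    )


# The bottom edge is clamped to the assumed image height of 620
padded = pad_bbox((150, 50, 400, 600), width=800, height=620, padding=30)
```

The padded tuple can be passed wherever a `(xmin, ymin, xmax, ymax)` box is expected.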
Hand-Drawn and Semantic Masks
For more precise edits, use hand-drawn masks or semantic segmentation:
# Semantic mask example (using a segmentation model like SAM)
import numpy as np
from PIL import Image
from segment_anything import sam_model_registry, SamPredictor

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)
image = np.array(Image.open("scene.jpg").convert("RGB"))  # SAM expects an RGB uint8 array
predictor.set_image(image)

# Get masks for a specific object by clicking its center
x, y = 320, 240  # placeholder click coordinates in pixels; replace with your own
masks, scores, logits = predictor.predict(point_coords=np.array([[x, y]]), point_labels=np.array([1]))
mask = masks[np.argmax(scores)]  # Take the highest-scoring of the candidate masks
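Segmentation models such as SAM return the mask as a boolean array, while inpainting tools generally expect a grayscale image where white marks the region to regenerate. The conversion can be sketched as follows; `array_to_mask_image` is a hypothetical helper, and the demo array stands in for a real segmentation result.

```python
import numpy as np
from PIL import Image, ImageFilter


def array_to_mask_image(bool_mask, feather_radius=0):
    """Convert an (H, W) boolean array into a PIL 'L' mask (255 = regenerate)."""
    mask = Image.fromarray(bool_mask.astype(np.uint8) * 255)
    if feather_radius > 0:
        # Optional feathering, same idea as for the rectangular masks
        mask = mask.filter(ImageFilter.GaussianBlur(radius=feather_radius))
    return mask


# Stand-in for a segmentation result: a 64-row by 96-column boolean array
demo = np.zeros((64, 96), dtype=bool)
demo[16:48, 24:72] = True
mask_image = array_to_mask_image(demo)
```

Note that NumPy arrays are indexed as (row, column), so the resulting image is 96 pixels wide and 64 pixels tall.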
Prompt Engineering for Inpainting
Inpainting prompts differ subtly from text-to-image prompts. The prompt should describe what should appear in the masked region while being aware of the surrounding context. Include contextual details that help the model blend seamlessly:
Inpainting prompt example 1 (object replacement):
Original image: photo of a red car parked on a street
Mask region: the car
Inpainting prompt:
"a blue car parked on the same street, afternoon sunlight, realistic shadows,
perspective matching, same road surface and environment"
Good: includes color change (red to blue), context (same street, afternoon light),
and blending cues (matching perspective and road)
Inpainting prompt example 2 (person removal):
Original image: family photo with a person on the left
Mask region: the person
Inpainting prompt:
"a blurred garden background, green foliage, depth of field, soft focus,
matching the bokeh of the original image"
Good: describes what should replace the person, matches the original bokeh
and depth effects
Include lighting and environmental cues that match the original image:
def build_inpainting_prompt(original_context, edit_description, include_context=True):
    """Build a context-aware inpainting prompt."""
    if include_context:
        prompt = f"{edit_description}, seamlessly blended, same lighting, "
        prompt += f"matching perspective and shadows from {original_context}"
    else:
        prompt = edit_description
    return prompt

# Usage
original = "outdoor scene with warm afternoon sunlight"
edit = "a wooden bench"
prompt = build_inpainting_prompt(original, edit, include_context=True)
# Result: "a wooden bench, seamlessly blended, same lighting, matching perspective
# and shadows from outdoor scene with warm afternoon sunlight"
Comparison: Inpainting Approaches
| Approach | Mask Type | Strength | Limitation | Best For |
|---|---|---|---|---|
| Bounding box | Hard rectangular | Fast, simple | Visible edges if not feathered | Quick product replacement |
| Feathered box | Soft rectangular | Good blending | May over-smooth | Most production edits |
| Hand-drawn | Custom polygon | Precise control | Time-consuming | Fine detail work |
| Semantic segmentation | Model-generated | Masks that follow object contours | Computational cost | Complex scenes, multiple objects |
Production Inpainting Workflow
import torch
from diffusers import AutoPipelineForInpainting
from PIL import Image


class InpaintingPipeline:
    # Sketch built on Hugging Face diffusers. The default checkpoint is an
    # assumption; substitute any inpainting-capable model you have access to.
    def __init__(self, model="stabilityai/stable-diffusion-2-inpainting"):
        device = "cuda" if torch.cuda.is_available() else "cpu"
        dtype = torch.float16 if device == "cuda" else torch.float32
        self.pipe = AutoPipelineForInpainting.from_pretrained(model, torch_dtype=dtype).to(device)
        self.model = model

    def inpaint(self, image_path, mask_path, prompt, strength=0.9):
        """Execute an inpainting operation and return the edited PIL image."""
        # Load image and mask
        image = Image.open(image_path).convert("RGB")
        mask = Image.open(mask_path).convert("L")

        # Verify dimensions match
        if image.size != mask.size:
            raise ValueError("Image and mask dimensions must match")

        # White mask pixels are regenerated; black pixels are preserved.
        # Depending on the pipeline, the output may be resized to the
        # model's native resolution.
        result = self.pipe(prompt=prompt, image=image, mask_image=mask, strength=strength)
        return result.images[0]

    def batch_inpaint(self, edits_list):
        """Process multiple inpainting operations."""
        results = []
        for edit in edits_list:
            output = self.inpaint(
                image_path=edit["image"],
                mask_path=edit["mask"],
                prompt=edit["prompt"],
                strength=edit.get("strength", 0.9),
            )
            results.append({
                "image": edit["image"],
                "prompt": edit["prompt"],
                "status": "completed",
                "output": output,
            })
        return results


# Usage
# Masks are stored as PNG here; JPEG compression can smear mask edges.
pipeline = InpaintingPipeline()
edits = [
    {"image": "car_red.jpg", "mask": "car_mask.png", "prompt": "a blue car, same angle, sunlight"},
    {"image": "portrait.jpg", "mask": "background_mask.png", "prompt": "a bokeh garden background"},
]
results = pipeline.batch_inpaint(edits)
Advanced Techniques: Strength and Blending
The strength parameter (0–1) controls how much the model modifies the masked region. Low strength (0.3–0.5) makes subtle changes; high strength (0.8–1.0) regenerates completely. For seamless edits, use strength 0.7–0.9:
# Strength tuning for different edit types
strength_recommendations = {
    "object_replacement": 0.85,  # High: complete change
    "color_adjustment": 0.6,     # Medium: subtle modification
    "background_change": 0.9,    # High: isolated background
    "texture_refinement": 0.4,   # Low: gentle enhancement
}
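To build intuition for these numbers, it can help to see how strength is often turned into work for the sampler. In img2img-style pipelines (Hugging Face diffusers is one example), strength typically scales how many of the scheduled denoising steps actually run. The helper below is a simplified sketch of that relationship, not a drop-in copy of any library's code; `effective_denoising_steps` is a hypothetical name.

```python
def effective_denoising_steps(num_inference_steps, strength):
    """Approximate how many denoising steps run for a given strength."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    # Lower strength starts from a less noisy version of the original,
    # so fewer steps are needed and less of the content changes.
    return min(int(num_inference_steps * strength), num_inference_steps)


half = effective_denoising_steps(50, 0.5)
full = effective_denoising_steps(50, 1.0)
```

Under this scheme, a strength of 1.0 runs the full schedule and ignores the original content inside the mask, while very low values leave it almost untouched.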
Key Takeaways
- Inpainting regenerates masked regions while preserving surrounding pixels and context.
- Design soft-edged masks with feathering (10–30 pixel radius) for seamless blending.
- Inpainting prompts should include contextual and lighting cues to match the original image.
- Use feathered bounding boxes for most production work; use semantic segmentation for complex scenes.
- Set strength to 0.7–0.9 for natural blending; lower for subtle adjustments, higher for complete replacement.
- Log mask and prompt parameters for audit trails and reproducibility.
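The logging takeaway can be sketched as one JSON line per edit. The field names, the `build_edit_record` and `append_record` helpers, and the optional `seed` field are assumptions for illustration; adapt them to whatever your pipeline actually records.

```python
import hashlib
import json
from datetime import datetime, timezone


def build_edit_record(image_path, mask_bytes, prompt, strength, seed=None):
    """Collect the parameters needed to audit or reproduce one edit."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "image": image_path,
        # Hashing the mask ties the record to the exact mask that was used
        "mask_sha256": hashlib.sha256(mask_bytes).hexdigest(),
        "prompt": prompt,
        "strength": strength,
        "seed": seed,
    }


def append_record(log_path, record):
    """Append the record to a JSON Lines file (one JSON object per line)."""
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")


record = build_edit_record("car_red.jpg", b"example mask bytes", "a blue car", 0.85, seed=42)
```

In practice, `mask_bytes` would be the contents of the mask file read in binary mode.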
Frequently Asked Questions
Why does my inpainting produce visible seams at the mask boundary?
Hard mask edges cause seams. Always feather your masks with a 15–25 pixel Gaussian blur radius. If seams persist, increase feathering or reduce the strength parameter to 0.7–0.8. Also ensure your inpainting prompt includes contextual details (lighting, shadows) that help the model blend naturally.
Can I inpaint multiple regions in one image?
Technically yes, but results degrade with many simultaneous edits. For best quality, inpaint one region at a time, save the result, then use that as input for the next inpaint operation. This ensures each edit respects previous changes and maintains consistency.
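The one-region-at-a-time approach can be sketched as a simple loop that feeds each result into the next call. Here `inpaint_fn` is a placeholder for whatever backend you use, and `fake_inpaint` is a stand-in that only records the order of edits so the control flow is visible.

```python
def inpaint_sequentially(image, edits, inpaint_fn):
    """Apply edits one at a time, feeding each result into the next call.

    `edits` is a list of (mask, prompt) pairs; `inpaint_fn(image, mask, prompt)`
    must return the edited image.
    """
    current = image
    for mask, prompt in edits:
        current = inpaint_fn(current, mask, prompt)
    return current


# Stand-in backend: the "image" is a list that accumulates applied prompts
def fake_inpaint(image, mask, prompt):
    return image + [prompt]


final = inpaint_sequentially(
    [],
    [("mask_a", "a wooden bench"), ("mask_b", "a street lamp")],
    fake_inpaint,
)
```

Saving each intermediate result to disk between iterations makes it easier to roll back a single bad edit.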
What's the difference between inpainting and outpainting?
Inpainting modifies existing regions within the image bounds. Outpainting extends the image beyond its original boundaries (covered in the next article). They use similar techniques but serve different purposes.
How do I prevent the inpainting model from changing unmasked areas?
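Use a capable inpainting model (for example, Stable Diffusion 3) and make sure your mask is precise, with preserved areas fully black. Even then, the model can "drift": latent-diffusion pipelines encode and decode the whole image, so unmasked pixels may shift slightly. The most reliable fix is to composite the generated result back over the original using the mask, which restores every unmasked pixel exactly. Lowering the strength parameter and comparing a few runs can also reduce visible drift.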
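Compositing the result over the original is a direct way to guarantee that unmasked pixels stay untouched. A minimal sketch with Pillow's `Image.composite` follows; `restore_unmasked_pixels` is a hypothetical helper, the solid-color images are stand-ins for a real photo and model output, and the mask is assumed to match the original's size.

```python
from PIL import Image


def restore_unmasked_pixels(original, edited, mask):
    """Return edited pixels where the mask is white, original pixels where it is black."""
    if edited.size != original.size:
        # Some pipelines return a resized image; bring it back to the original size
        edited = edited.resize(original.size)
    return Image.composite(edited.convert("RGB"), original.convert("RGB"), mask.convert("L"))


original = Image.new("RGB", (64, 64), (200, 0, 0))  # stand-in for the source photo
edited = Image.new("RGB", (64, 64), (0, 0, 200))    # stand-in for the model output
mask = Image.new("L", (64, 64), 0)
mask.paste(255, (16, 16, 48, 48))

result = restore_unmasked_pixels(original, edited, mask)
```

With a feathered mask, the same call blends the two images smoothly across the gray band at the mask edge.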
Can I use inpainting to remove watermarks?
Yes, but results vary. Create a mask around the watermark, then use a prompt like "clean background, no text, matching original". The quality depends on background complexity: simple solid backgrounds work well, while complex textures may require touch-up.