CLI reference (scripts/image_gen.py)

This file is for the fallback CLI mode only. Read it only after the user explicitly asks to use scripts/image_gen.py instead of the built-in image_gen tool.

generate-batch is a CLI subcommand in this fallback path. It is not a top-level mode of the skill.

What this CLI does

  • generate: generate a new image from a prompt
  • edit: edit one or more existing images
  • generate-batch: run many generation jobs from a JSONL file

Real API calls require network access and OPENAI_API_KEY. --dry-run requires neither.
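A small preflight check can choose between a real call and a dry run. The sketch below assumes a hypothetical helper named imagegen_mode_flag, which is not part of the CLI. It relies only on the two facts stated above (real calls need OPENAI_API_KEY, --dry-run does not) and does not test network access.

```shell
# Hypothetical helper (not part of scripts/image_gen.py).
# Prints "--dry-run" when OPENAI_API_KEY is unset or empty, and nothing otherwise.
imagegen_mode_flag() {
  if [ -z "${OPENAI_API_KEY:-}" ]; then
    printf '%s\n' "--dry-run"
  fi
}
```

The output can be appended unquoted to a command line, for example: python "$IMAGE_GEN" generate --prompt "Test" --out output/imagegen/test.png $(imagegen_mode_flag)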

Quick start (works from any repo)

Set a stable path to the skill CLI (default CODEX_HOME is ~/.codex):

export CODEX_HOME="${CODEX_HOME:-$HOME/.codex}"
export IMAGE_GEN="$CODEX_HOME/skills/imagegen/scripts/image_gen.py"

For real API calls, install the script's dependencies (such as the openai package) into the active Python environment with that environment's package manager. In uv-managed environments, uv pip install ... remains the preferred path.
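Before running any command, it can help to confirm that the path actually resolves. This is a minimal sketch; imagegen_cli_present is a hypothetical helper name, not something the skill ships.

```shell
# Same exports as in the quick start; CODEX_HOME falls back to ~/.codex.
export CODEX_HOME="${CODEX_HOME:-$HOME/.codex}"
export IMAGE_GEN="$CODEX_HOME/skills/imagegen/scripts/image_gen.py"

# Hypothetical preflight check (not part of the CLI):
# succeeds if $IMAGE_GEN points at an existing file, otherwise reports the path.
imagegen_cli_present() {
  if [ -f "$IMAGE_GEN" ]; then
    return 0
  fi
  echo "image_gen.py not found at: $IMAGE_GEN" >&2
  return 1
}
```

If the check fails, the guardrails in this file still apply: ask the user rather than recreating or patching the script.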

Quick start commands

Dry-run (no API call; no network required; does not require the openai package):

python "$IMAGE_GEN" generate \
  --prompt "Test" \
  --out output/imagegen/test.png \
  --dry-run

Notes:

  • One-off dry-runs print the API payload and the computed output path(s).
  • Repo-local finals should live under output/imagegen/.

Generate (requires OPENAI_API_KEY + network):

python "$IMAGE_GEN" generate \
  --prompt "A cozy alpine cabin at dawn" \
  --size 1024x1024 \
  --out output/imagegen/alpine-cabin.png

Edit:

python "$IMAGE_GEN" edit \
  --image input.png \
  --prompt "Replace only the background with a warm sunset" \
  --out output/imagegen/sunset-edit.png

Guardrails

  • Use the bundled CLI directly (python "$IMAGE_GEN" ...) after activating the correct environment.
  • Do not create one-off runners (for example gen_images.py) unless the user explicitly asks for a custom wrapper.
  • Never modify scripts/image_gen.py. If something is missing, ask the user before doing anything else.

Defaults

  • Model: gpt-image-1.5
  • Supported model family for this CLI: GPT Image models (gpt-image-*)
  • Size: 1024x1024
  • Quality: auto
  • Output format: png
  • Default one-off output path: output/imagegen/output.png
  • Background: unspecified unless --background is set

Quality, input fidelity, and masks (CLI fallback only)

These are explicit CLI controls. They are not built-in image_gen tool arguments.

  • --quality works for generate, edit, and generate-batch: low|medium|high|auto
  • --input-fidelity is edit-only and validated as low|high
  • --mask is edit-only

Example:

python "$IMAGE_GEN" edit \
  --image input.png \
  --prompt "Change only the background" \
  --quality high \
  --input-fidelity high \
  --out output/imagegen/background-edit.png
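Because --input-fidelity and --mask are edit-only, a quick check on the argument list can catch a misplaced flag before the command runs. The helper below is a hypothetical sketch, not part of the CLI. It only looks at flag names, so an argument value that happens to start with one of these flag names would also be flagged.

```shell
# Hypothetical helper (not part of the CLI).
# Usage: check_edit_only_flags <subcommand> [args...]
# Fails if an edit-only flag appears with a subcommand other than "edit".
check_edit_only_flags() {
  subcommand=$1
  shift
  if [ "$subcommand" = "edit" ]; then
    return 0
  fi
  for arg in "$@"; do
    case $arg in
      --input-fidelity*|--mask*)
        echo "$arg is edit-only; got subcommand: $subcommand" >&2
        return 1
        ;;
    esac
  done
  return 0
}
```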

Mask notes:

  • For multi-image edits, pass repeated --image flags. Their order is meaningful, so describe each image by index and role in the prompt.
  • The CLI accepts a single --mask.
  • Use a PNG mask when possible. The script treats mask handling as best-effort: it runs file checks and emits warnings, but it does not fully validate the mask before the call.
  • In the edit prompt, repeat invariants (change only the background; keep the subject unchanged) to reduce drift.
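Since the script does not fully validate masks, a caller may want a cheap check that the mask file is really a PNG. The sketch below compares the first 8 bytes against the standard PNG signature. is_png is a hypothetical helper name, and the check says nothing about dimensions or alpha content.

```shell
# Hypothetical helper (not part of the CLI): succeed only if the file starts
# with the 8-byte PNG signature 89 50 4E 47 0D 0A 1A 0A.
is_png() {
  sig=$(head -c 8 "$1" 2>/dev/null | od -An -tx1 | tr -d ' \n')
  [ "$sig" = "89504e470d0a1a0a" ]
}
```

A possible use before an edit call: is_png mask.png || echo "mask.png does not look like a PNG" >&2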

Output handling

  • Use tmp/imagegen/ for temporary JSONL inputs or scratch files.
  • Use output/imagegen/ for final outputs.
  • Reruns fail if a target file already exists unless you pass --force.
  • --out-dir changes one-off naming to image_1.<ext>, image_2.<ext>, and so on.
  • Downscaled copies use the default suffix -web unless you override it.
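Because a rerun fails when the target already exists, it can be useful to list colliding paths first and then decide between a new name and --force. A minimal sketch, assuming a hypothetical helper named existing_targets:

```shell
# Hypothetical helper (not part of the CLI): print each given path that already exists.
existing_targets() {
  for path in "$@"; do
    if [ -e "$path" ]; then
      printf '%s\n' "$path"
    fi
  done
}
```

For example, existing_targets output/imagegen/alpine-cabin.png prints the path only when that file is already present, which is the case in which the CLI would refuse to overwrite without --force.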

Common recipes

Generate with augmentation fields:

python "$IMAGE_GEN" generate \
  --prompt "A minimal hero image of a ceramic coffee mug" \
  --use-case "product-mockup" \
  --style "clean product photography" \
  --composition "wide product shot with usable negative space for page copy" \
  --constraints "no logos, no text" \
  --out output/imagegen/mug-hero.png

Generate and also write a downscaled copy for fast web loading:

python "$IMAGE_GEN" generate \
  --prompt "A cozy alpine cabin at dawn" \
  --size 1024x1024 \
  --downscale-max-dim 512 \
  --out output/imagegen/alpine-cabin.png

Generate multiple prompts concurrently (async batch):

mkdir -p tmp/imagegen output/imagegen/batch
cat > tmp/imagegen/prompts.jsonl << 'EOF'
{"prompt":"Cavernous hangar interior with a compact shuttle parked near the center","use_case":"stylized-concept","composition":"wide-angle, low-angle","lighting":"volumetric light rays through drifting fog","constraints":"no logos or trademarks; no watermark","size":"1536x1024"}
{"prompt":"Gray wolf in profile in a snowy forest","use_case":"photorealistic-natural","composition":"eye-level","constraints":"no logos or trademarks; no watermark","size":"1024x1024"}
EOF

python "$IMAGE_GEN" generate-batch \
  --input tmp/imagegen/prompts.jsonl \
  --out-dir output/imagegen/batch \
  --concurrency 5

rm -f tmp/imagegen/prompts.jsonl

Notes:

  • generate-batch requires --out-dir.
  • Use --concurrency to control parallelism (default 5).
  • Per-job overrides are supported in JSONL (for example size, quality, background, output_format, output_compression, moderation, n, model, out, and prompt-augmentation fields).
  • --n generates multiple variants for a single prompt; generate-batch is for many different prompts.
  • In batch mode, per-job out is treated as a filename under --out-dir.
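A malformed JSONL line is easier to catch before the batch starts. The sketch below is a crude shape check, not a JSON parser: it assumes one job per line, that each job is a JSON object, and that each job carries a prompt key as in the example above. check_jsonl is a hypothetical helper name.

```shell
# Hypothetical helper (not part of the CLI).
# Crude per-line check: every non-empty line must look like {...} and mention "prompt".
# It does not parse JSON, so it can miss syntax errors inside a line.
check_jsonl() {
  file=$1
  lineno=0
  bad=0
  while IFS= read -r line || [ -n "$line" ]; do
    lineno=$((lineno + 1))
    if [ -z "$line" ]; then
      continue
    fi
    case $line in
      "{"*'"prompt"'*"}") ;;
      *)
        echo "line $lineno: not a JSON object with a prompt key" >&2
        bad=1
        ;;
    esac
  done < "$file"
  return "$bad"
}
```

A possible use: check_jsonl tmp/imagegen/prompts.jsonl, followed by the generate-batch call only if the check succeeds.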

CLI notes

  • Supported sizes: 1024x1024, 1536x1024, 1024x1536, or auto.
  • Transparent backgrounds require output_format to be png or webp.
  • --prompt-file, --output-compression, --moderation, --max-attempts, --fail-fast, --force, and --no-augment are supported.
  • This CLI is intended for GPT Image models. Do not assume older non-GPT image-model behavior applies here.
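The size and transparency rules above can be checked locally before a call. This is a sketch under assumptions: valid_size and background_format_ok are hypothetical helper names, and the literal background value transparent is assumed rather than taken from this file.

```shell
# Hypothetical helpers (not part of the CLI).

# Succeeds for the sizes listed in the CLI notes.
valid_size() {
  case $1 in
    1024x1024|1536x1024|1024x1536|auto) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage: background_format_ok <background> <output_format>
# Assumes the background value is spelled "transparent".
# Transparent backgrounds need png or webp; other backgrounds are not restricted here.
background_format_ok() {
  background=$1
  format=$2
  if [ "$background" != "transparent" ]; then
    return 0
  fi
  case $format in
    png|webp) return 0 ;;
    *) return 1 ;;
  esac
}
```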

See also

  • API parameter quick reference for fallback CLI mode: references/image-api.md
  • Prompt examples shared across both top-level modes: references/sample-prompts.md
  • Network/sandbox notes for fallback CLI mode: references/codex-network.md