Ivan Malison 78157e7782 codex: add imagegen and plugin-creator skills

2026-04-18 19:05:37 -07:00

CLI reference (`scripts/image_gen.py`)

This file is for the fallback CLI mode only. Read it only after the user explicitly asks to use scripts/image_gen.py instead of the built-in image_gen tool.

generate-batch is a CLI subcommand in this fallback path. It is not a top-level mode of the skill.

What this CLI does

generate: generate a new image from a prompt
edit: edit one or more existing images
generate-batch: run many generation jobs from a JSONL file

Real API calls require network access and OPENAI_API_KEY; --dry-run requires neither.
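The requirement above can be checked before invoking the CLI. The following is a minimal sketch, not part of scripts/image_gen.py; the helper name missing_requirements is hypothetical, and it only checks the key because network reachability cannot be confirmed cheaply.

```python
import os


def missing_requirements(dry_run, env=None):
    """Return a list of problems that would block a run (hypothetical helper)."""
    env = os.environ if env is None else env
    if dry_run:
        # Per the reference, --dry-run needs neither network nor a key.
        return []
    problems = []
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    return problems


if __name__ == "__main__":
    print(missing_requirements(dry_run=False))
```

An empty list means nothing known is missing; a real call can still fail if the network is unavailable.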

Quick start (works from any repo)

Set a stable path to the skill CLI (default CODEX_HOME is ~/.codex):

export CODEX_HOME="${CODEX_HOME:-$HOME/.codex}"
export IMAGE_GEN="$CODEX_HOME/skills/imagegen/scripts/image_gen.py"
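If the path has to be resolved from Python rather than the shell, the same fallback logic can be sketched as below. The function name resolve_image_gen is hypothetical; the path segments are taken from the exports above. An empty CODEX_HOME is treated like an unset one, which matches the shell's ${VAR:-default} form.

```python
import os
from pathlib import Path


def resolve_image_gen(env=None):
    """Mirror the two shell exports: default CODEX_HOME to ~/.codex."""
    env = os.environ if env is None else env
    # `or` falls back when the variable is unset OR empty, like ${VAR:-default}.
    codex_home = env.get("CODEX_HOME") or str(Path.home() / ".codex")
    return Path(codex_home) / "skills" / "imagegen" / "scripts" / "image_gen.py"


if __name__ == "__main__":
    print(resolve_image_gen())
```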

Install the CLI's dependencies into the Python environment you will run it from, using that environment's package manager. In uv-managed environments, uv pip install ... remains the preferred path.

Quick start

Dry-run (no API call; no network required; does not require the openai package):

python "$IMAGE_GEN" generate \
  --prompt "Test" \
  --out output/imagegen/test.png \
  --dry-run

Notes:

One-off dry-runs print the API payload and the computed output path(s).
Repo-local finals should live under output/imagegen/.

Generate (requires OPENAI_API_KEY + network):

python "$IMAGE_GEN" generate \
  --prompt "A cozy alpine cabin at dawn" \
  --size 1024x1024 \
  --out output/imagegen/alpine-cabin.png

Edit:

python "$IMAGE_GEN" edit \
  --image input.png \
  --prompt "Replace only the background with a warm sunset" \
  --out output/imagegen/sunset-edit.png

Guardrails

Use the bundled CLI directly (python "$IMAGE_GEN" ...) after activating the correct environment.
Do not create one-off runners (for example gen_images.py) unless the user explicitly asks for a custom wrapper.
Never modify scripts/image_gen.py. If something is missing, ask the user before doing anything else.

Defaults

Model: gpt-image-1.5
Supported model family for this CLI: GPT Image models (gpt-image-*)
Size: 1024x1024
Quality: auto
Output format: png
Default one-off output path: output/imagegen/output.png
Background: unspecified unless --background is set
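For callers that want these defaults available in code, they can be mirrored as a small record. This is an illustrative sketch only: the class ImageGenDefaults and the helper is_supported_model are hypothetical names, and nothing here claims to reflect how scripts/image_gen.py stores its defaults internally.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ImageGenDefaults:
    """Hypothetical mirror of the documented CLI defaults."""
    model: str = "gpt-image-1.5"
    size: str = "1024x1024"
    quality: str = "auto"
    output_format: str = "png"
    out: str = "output/imagegen/output.png"
    background: Optional[str] = None  # unspecified unless --background is set


def is_supported_model(name):
    """True for names in the documented gpt-image-* family."""
    return name.startswith("gpt-image-")


if __name__ == "__main__":
    print(ImageGenDefaults())
```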

Quality, input fidelity, and masks (CLI fallback only)

These are explicit CLI controls. They are not built-in image_gen tool arguments.

--quality works for generate, edit, and generate-batch: low|medium|high|auto
--input-fidelity is edit-only and validated as low|high
--mask is edit-only

Example:

python "$IMAGE_GEN" edit \
  --image input.png \
  --prompt "Change only the background" \
  --quality high \
  --input-fidelity high \
  --out output/imagegen/background-edit.png
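The rules listed above (which flags are edit-only, and which values each accepts) can be checked before a command is run. This is a minimal sketch under the assumption that flags are collected in a dict; check_flags is a hypothetical helper, and the script's own validation may behave or word its errors differently.

```python
QUALITY_CHOICES = ("low", "medium", "high", "auto")
INPUT_FIDELITY_CHOICES = ("low", "high")
EDIT_ONLY_FLAGS = ("--input-fidelity", "--mask")
SUBCOMMANDS = ("generate", "edit", "generate-batch")


def check_flags(subcommand, flags):
    """Return error strings for a {flag: value} dict (hypothetical helper)."""
    errors = []
    if subcommand not in SUBCOMMANDS:
        errors.append(f"unknown subcommand: {subcommand}")
    # --input-fidelity and --mask are documented as edit-only.
    for flag in EDIT_ONLY_FLAGS:
        if flag in flags and subcommand != "edit":
            errors.append(f"{flag} is edit-only")
    quality = flags.get("--quality")
    if quality is not None and quality not in QUALITY_CHOICES:
        errors.append(f"--quality must be one of {'|'.join(QUALITY_CHOICES)}")
    fidelity = flags.get("--input-fidelity")
    if fidelity is not None and fidelity not in INPUT_FIDELITY_CHOICES:
        errors.append(
            f"--input-fidelity must be one of {'|'.join(INPUT_FIDELITY_CHOICES)}"
        )
    return errors


if __name__ == "__main__":
    print(check_flags("generate", {"--mask": "mask.png"}))
```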

Mask notes:

For multi-image edits, pass repeated --image flags. Their order is meaningful, so describe each image by index and role in the prompt.
The CLI accepts a single --mask.
Use a PNG mask when possible; the script treats mask handling as best-effort and does not perform full preflight validation beyond file checks/warnings.
In the edit prompt, repeat invariants (change only the background; keep the subject unchanged) to reduce drift.
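The notes above on repeated --image flags and the single --mask can be illustrated by assembling the argument list in order. This is a sketch only: build_edit_args is a hypothetical helper, the file names are placeholders, and it uses only flags that appear in this reference.

```python
def build_edit_args(images, prompt, out, mask=None):
    """Build the argument list for the edit subcommand (hypothetical helper)."""
    if not images:
        raise ValueError("edit needs at least one --image")
    args = ["edit"]
    # One --image flag per input; order is preserved because it is meaningful.
    for image in images:
        args += ["--image", image]
    # The CLI accepts a single --mask, so it is added at most once.
    if mask is not None:
        args += ["--mask", mask]
    args += ["--prompt", prompt, "--out", out]
    return args


if __name__ == "__main__":
    print(build_edit_args(
        ["subject.png", "backdrop.png"],
        "Image 1 is the subject, image 2 is the backdrop. "
        "Change only the background; keep the subject unchanged.",
        "output/imagegen/composite.png",
    ))
```

The result can be appended to the interpreter and script path and passed to subprocess.run, which avoids shell quoting problems in long prompts.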

Output handling

Use tmp/imagegen/ for temporary JSONL inputs or scratch files.
Use output/imagegen/ for final outputs.
Reruns fail if a target file already exists unless you pass --force.
--out-dir changes one-off naming to image_1.<ext>, image_2.<ext>, and so on.
Downscaled copies use the default suffix -web unless you override it.
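The naming and overwrite rules above can be sketched as follows. The helper names are hypothetical, and one detail is an assumption not stated in this reference: that the -web suffix is placed between the file stem and its extension.

```python
from pathlib import Path


def out_dir_names(out_dir, count, ext="png"):
    """Names used with --out-dir: image_1.<ext>, image_2.<ext>, and so on."""
    return [Path(out_dir) / f"image_{i}.{ext}" for i in range(1, count + 1)]


def downscaled_name(path, suffix="-web"):
    """ASSUMPTION: the suffix goes before the extension (photo-web.png)."""
    p = Path(path)
    return p.with_name(f"{p.stem}{suffix}{p.suffix}")


def can_write(path, force=False):
    """A rerun fails on an existing target unless --force is passed."""
    return force or not Path(path).exists()


if __name__ == "__main__":
    print(out_dir_names("output/imagegen", 2))
    print(downscaled_name("output/imagegen/alpine-cabin.png"))
```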

Common recipes

Generate with augmentation fields:

python "$IMAGE_GEN" generate \
  --prompt "A minimal hero image of a ceramic coffee mug" \
  --use-case "product-mockup" \
  --style "clean product photography" \
  --composition "wide product shot with usable negative space for page copy" \
  --constraints "no logos, no text" \
  --out output/imagegen/mug-hero.png

Generate and also write a downscaled copy for fast web loading:

python "$IMAGE_GEN" generate \
  --prompt "A cozy alpine cabin at dawn" \
  --size 1024x1024 \
  --downscale-max-dim 1024 \
  --out output/imagegen/alpine-cabin.png

Generate multiple prompts concurrently (async batch):

mkdir -p tmp/imagegen output/imagegen/batch
cat > tmp/imagegen/prompts.jsonl << 'EOF'
{"prompt":"Cavernous hangar interior with a compact shuttle parked near the center","use_case":"stylized-concept","composition":"wide-angle, low-angle","lighting":"volumetric light rays through drifting fog","constraints":"no logos or trademarks; no watermark","size":"1536x1024"}
{"prompt":"Gray wolf in profile in a snowy forest","use_case":"photorealistic-natural","composition":"eye-level","constraints":"no logos or trademarks; no watermark","size":"1024x1024"}
EOF

python "$IMAGE_GEN" generate-batch \
  --input tmp/imagegen/prompts.jsonl \
  --out-dir output/imagegen/batch \
  --concurrency 5

rm -f tmp/imagegen/prompts.jsonl
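When the job list is built programmatically, writing the JSONL with a JSON serializer avoids the quoting mistakes that hand-edited heredocs invite. This is a minimal sketch: to_jsonl and write_jobs are hypothetical helpers, the job keys are the ones used in the example above, and requiring a non-empty prompt per job is an assumption about what the batch subcommand expects.

```python
import json
from pathlib import Path


def to_jsonl(jobs):
    """Serialize jobs as one compact JSON object per line."""
    lines = []
    for index, job in enumerate(jobs, start=1):
        # ASSUMPTION: every generation job needs a non-empty prompt.
        if not job.get("prompt"):
            raise ValueError(f"job {index} has no prompt")
        lines.append(json.dumps(job, ensure_ascii=False))
    return "\n".join(lines) + "\n"


def write_jobs(path, jobs):
    """Write the JSONL file, creating parent directories as needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(to_jsonl(jobs), encoding="utf-8")
    return path


JOBS = [
    {"prompt": "Gray wolf in profile in a snowy forest",
     "use_case": "photorealistic-natural",
     "composition": "eye-level",
     "constraints": "no logos or trademarks; no watermark",
     "size": "1024x1024"},
]

if __name__ == "__main__":
    print(to_jsonl(JOBS), end="")
```

Calling write_jobs("tmp/imagegen/prompts.jsonl", JOBS) produces a file that can be passed to generate-batch via --input.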