modify template

김경종
2026-06-10 17:12:23 +09:00
parent 2d59191df2
commit df3cc3e890
186 changed files with 24935 additions and 2 deletions
@@ -0,0 +1 @@
f13290a7889cc9e1
@@ -0,0 +1,201 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf of
any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
@@ -0,0 +1,356 @@
---
name: "imagegen"
description: "Generate or edit raster images when the task benefits from AI-created bitmap visuals such as photos, illustrations, textures, sprites, mockups, or transparent-background cutouts. Use when Codex should create a brand-new image, transform an existing image, or derive visual variants from references, and the output should be a bitmap asset rather than repo-native code or vector. Do not use when the task is better handled by editing existing SVG/vector/code-native assets, extending an established icon or logo system, or building the visual directly in HTML/CSS/canvas."
---
# Image Generation Skill
Generates or edits images for the current project (for example website assets, game assets, UI mockups, product mockups, wireframes, logo design, photorealistic images, or infographics).
## Top-level modes and rules
This skill has exactly two top-level modes:
- **Default built-in tool mode (preferred):** built-in `image_gen` tool for normal image generation, editing, and simple transparent-image requests. Does not require `OPENAI_API_KEY`.
- **Fallback CLI mode:** `scripts/image_gen.py` CLI. Use when the user explicitly asks for the CLI/API/model path, or after the user explicitly confirms a true model-native transparency fallback with `gpt-image-1.5`. Requires `OPENAI_API_KEY`.
Within CLI fallback, the CLI exposes three subcommands:
- `generate`
- `edit`
- `generate-batch`
Rules:
- Use the built-in `image_gen` tool by default for normal image generation and editing requests.
- Do not switch to CLI fallback for ordinary quality, size, or file-path control.
- If the user explicitly asks for a transparent image/background, stay on built-in `image_gen` first: prompt for a flat removable chroma-key background, then remove it locally with the installed helper at `$CODEX_HOME/skills/.system/imagegen/scripts/remove_chroma_key.py`.
- Never silently switch from built-in `image_gen` or CLI `gpt-image-2` to CLI `gpt-image-1.5`. Treat this as a model/path downgrade and ask the user before doing it, unless the user has already explicitly requested `gpt-image-1.5`, `scripts/image_gen.py`, or CLI fallback.
- If a transparent request appears too complex for clean chroma-key removal, the user asks for true/native transparency, or local removal fails validation, explain that true transparency requires CLI `gpt-image-1.5 --background transparent --output-format png` because `gpt-image-2` does not support `background=transparent`, then ask whether to proceed. Run the CLI fallback only after the user confirms.
- The word `batch` by itself does not mean CLI fallback. If the user asks for many assets or says to batch-generate assets without explicitly asking for CLI/API/model controls, stay on the built-in path and issue one built-in call per requested asset or variant.
- If the built-in tool fails or is unavailable, tell the user the CLI fallback exists and that it requires `OPENAI_API_KEY`. Proceed only if the user explicitly asks for that fallback.
- If the user explicitly asks for CLI mode, use the bundled `scripts/image_gen.py` workflow. Do not create one-off SDK runners.
- Never modify `scripts/image_gen.py`. If something is missing, ask the user before doing anything else.
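The mode rules above can be sketched as a small decision function. This is only an illustration: the `Request` fields and the `choose_mode` name are hypothetical and are not part of the skill or of `scripts/image_gen.py`.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical summary of what the user has said so far."""
    asked_for_cli: bool = False                    # explicit CLI/API/model request
    confirmed_transparency_fallback: bool = False  # user confirmed gpt-image-1.5
    builtin_failed: bool = False                   # built-in tool errored or is unavailable


def choose_mode(req: Request) -> str:
    """Return 'cli', 'inform_user', or 'builtin' following the rules above."""
    if req.asked_for_cli or req.confirmed_transparency_fallback:
        return "cli"
    if req.builtin_failed:
        # Never switch silently: mention that the CLI fallback exists and
        # needs OPENAI_API_KEY, then wait for an explicit request.
        return "inform_user"
    return "builtin"
```

Note that a request for many assets, or the word `batch` alone, sets none of these flags, so the sketch stays on the built-in path.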
Built-in save-path policy:
- In built-in tool mode, Codex saves generated images under `$CODEX_HOME/*` by default.
- Do not describe or rely on OS temp as the default built-in destination.
- Do not describe or rely on a destination-path argument (if any) on the built-in `image_gen` tool. If a specific location is needed, generate first and then move or copy the selected output from `$CODEX_HOME/generated_images/...`.
- Save-path precedence in built-in mode:
1. If the user names a destination, move or copy the selected output there.
2. If the image is meant for the current project, move or copy the final selected image into the workspace before finishing.
3. If the image is only for preview or brainstorming, render it inline; the underlying file can remain at the default `$CODEX_HOME/*` path.
- Never leave a project-referenced asset only at the default `$CODEX_HOME/*` path.
- Do not overwrite an existing asset unless the user explicitly asked for replacement; otherwise create a sibling versioned filename such as `hero-v2.png` or `item-icon-edited.png`.
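One possible way to honor the no-overwrite rule when copying a selected output into the workspace is sketched below. The helper names `versioned_destination` and `place_asset` are hypothetical, and the `-vN` suffix is just one of the naming styles mentioned above.

```python
import shutil
from pathlib import Path


def versioned_destination(dest: Path, replace: bool = False) -> Path:
    """Return dest, or a sibling such as hero-v2.png when dest already exists."""
    if replace or not dest.exists():
        return dest
    version = 2
    while True:
        candidate = dest.with_name(f"{dest.stem}-v{version}{dest.suffix}")
        if not candidate.exists():
            return candidate
        version += 1


def place_asset(src: Path, dest: Path, replace: bool = False) -> Path:
    """Copy src into the workspace without clobbering, and return the final path."""
    final = versioned_destination(dest, replace=replace)
    final.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, final)
    return final
```

Pass `replace=True` only when the user explicitly asked for replacement.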
Shared prompt guidance for both modes lives in `references/prompting.md` and `references/sample-prompts.md`.
Fallback-only docs/resources for CLI mode:
- `references/cli.md`
- `references/image-api.md`
- `references/codex-network.md`
- `scripts/image_gen.py`
Local post-processing helper:
- `$CODEX_HOME/skills/.system/imagegen/scripts/remove_chroma_key.py`: removes a flat chroma-key background from a generated image and writes a PNG/WebP with alpha. Prefer auto-key sampling, soft matte, and despill for antialiased edges.
## When to use
- Generate a new image (concept art, product shot, cover, website hero)
- Generate a new image using one or more reference images for style, composition, or mood
- Edit an existing image (inpainting, lighting or weather transformations, background replacement, object removal, compositing, transparent background)
- Produce many assets or variants for one task
## When not to use
- Extending or matching an existing SVG/vector icon set, logo system, or illustration library inside the repo
- Creating simple shapes, diagrams, wireframes, or icons that are better produced directly in SVG, HTML/CSS, or canvas
- Making a small project-local asset edit when the source file already exists in an editable native format
- Any task where the user clearly wants deterministic code-native output instead of a generated bitmap
## Decision tree
Think about two separate questions:
1. **Intent:** is this a new image or an edit of an existing image?
2. **Execution strategy:** is this one asset or many assets/variants?
Intent:
- If the user wants to modify an existing image while preserving parts of it, treat the request as **edit**.
- If the user provides images only as references for style, composition, mood, or subject guidance, treat the request as **generate**.
- If the user provides no images, treat the request as **generate**.
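As a minimal sketch of the intent rules above, assume every input image has already been labeled with one role string. The constants and the `classify_intent` function are illustrative names, not part of the skill.

```python
from typing import Sequence

# Role labels used for illustration only.
REFERENCE = "reference image"
EDIT_TARGET = "edit target"
SUPPORTING = "supporting insert/style/compositing input"


def classify_intent(image_roles: Sequence[str]) -> str:
    """Return 'edit' when any input image is an edit target, else 'generate'."""
    return "edit" if EDIT_TARGET in image_roles else "generate"
```

With no images, or with images used only as references, the sketch returns `generate`, which matches the default assumption that the user wants a new image.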
Built-in edit semantics:
- Built-in edit mode is for images already visible in the conversation context, such as attached images or images generated earlier in the thread.
- If the user wants to edit a local image file with the built-in tool, first load it with the built-in `view_image` tool so the image is visible in the conversation context, then proceed with the built-in edit flow.
- Do not promise arbitrary filesystem-path editing through the built-in tool.
- If a local file still needs direct file-path control, masks, or other explicit CLI-only parameters, use the explicit CLI fallback only when the user asks for it.
- For edits, preserve invariants aggressively and save non-destructively by default.
Execution strategy:
- In the built-in default path, produce many assets or variants by issuing one `image_gen` call per requested asset or variant.
- In the CLI fallback path, use the CLI `generate-batch` subcommand only when the user explicitly chose CLI mode and needs many prompts/assets.
- For many distinct assets, do not use `n` as a substitute for separate prompts. `n` is for variants of one prompt; distinct assets need distinct built-in calls or distinct CLI `generate-batch` jobs.
Assume the user wants a new image unless they clearly ask to change an existing one.
## Workflow
1. Decide the top-level mode: built-in by default, including simple transparent-output requests; fallback CLI only if explicitly requested or after the user explicitly confirms a transparent-output fallback.
2. Decide the intent: `generate` or `edit`.
3. Decide whether the output is preview-only or meant to be consumed by the current project.
4. Decide the execution strategy: single asset vs repeated built-in calls vs CLI `generate-batch`.
5. Collect inputs up front: prompt(s), exact text (verbatim), constraints/avoid list, and any input images.
6. For every input image, label its role explicitly:
- reference image
- edit target
- supporting insert/style/compositing input
7. If the edit target is only on the local filesystem and you are staying on the built-in path, inspect it with `view_image` first so the image is available in conversation context.
8. If the user asked for a photo, illustration, sprite, product image, banner, or other explicitly raster-style asset, use `image_gen` rather than substituting SVG/HTML/CSS placeholders. If the request is for an icon, logo, or UI graphic that should match existing repo-native SVG/vector/code assets, prefer editing those directly instead.
9. Augment the prompt based on specificity:
- If the user's prompt is already specific and detailed, normalize it into a clear spec without adding creative requirements.
- If the user's prompt is generic, add tasteful augmentation only when it materially improves output quality.
10. Use the built-in `image_gen` tool by default.
11. For transparent-output requests, follow the transparent image guidance below: generate with built-in `image_gen` on a flat chroma-key background, copy the selected output into the workspace or `tmp/imagegen/`, run the installed `$CODEX_HOME/skills/.system/imagegen/scripts/remove_chroma_key.py` helper, and validate the alpha result before using it. If this path looks unsuitable or fails, ask before switching to CLI `gpt-image-1.5`.
12. Inspect outputs and validate: subject, style, composition, text accuracy, and invariants/avoid items.
13. Iterate with a single targeted change, then re-check.
14. For preview-only work, render the image inline; the underlying file may remain at the default `$CODEX_HOME/generated_images/...` path.
15. For project-bound work, move or copy the selected artifact into the workspace and update any consuming code or references. Never leave a project-referenced asset only at the default `$CODEX_HOME/generated_images/...` path.
16. For batches or multi-asset requests, persist the final version of every requested deliverable in the workspace unless the user explicitly asked to keep outputs preview-only. Discarded variants do not need to be kept unless requested.
17. If the user explicitly chooses or confirms the CLI fallback, then use the fallback-only docs for model, quality, size, `input_fidelity`, masks, output format, output paths, and network setup.
18. Always report the final saved path(s) for any workspace-bound asset(s), plus the final prompt or prompt set and whether the built-in tool or fallback CLI mode was used.
## Transparent image requests
Transparent-image requests still use built-in `image_gen` first. Because the built-in tool does not expose a true transparent-background control, create a removable chroma-key source image and then convert the key color to alpha locally.
Default sequence:
1. Use built-in `image_gen` to generate the requested subject on a perfectly flat solid chroma-key background.
2. Choose a key color that is unlikely to appear in the subject: default `#00ff00`, use `#ff00ff` for green subjects, and avoid `#0000ff` for blue subjects.
3. After generation, move or copy the selected source image from `$CODEX_HOME/generated_images/...` into the workspace or `tmp/imagegen/`.
4. Run the installed helper path, not a project-relative script path:
```bash
python "${CODEX_HOME:-$HOME/.codex}/skills/.system/imagegen/scripts/remove_chroma_key.py" \
--input <source> \
--out <final.png> \
--auto-key border \
--soft-matte \
--transparent-threshold 12 \
--opaque-threshold 220 \
--despill
```
5. Validate that the output has an alpha channel, transparent corners, plausible subject coverage, and no obvious key-color fringe. If a thin fringe remains, retry once with `--edge-contract 1`; use `--edge-feather 0.25` only when the edge is visibly stair-stepped and the subject is not shiny or reflective.
6. Save the final alpha PNG/WebP in the project if the asset is project-bound. Never leave a project-referenced transparent asset only under `$CODEX_HOME/*`.
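The key-color choice in step 2 can be approximated in code when the subject's main colors are known up front. The sketch below is an assumption-laden illustration: `choose_key_color`, the candidate order, and the `min_distance` threshold of 150 are all hypothetical and are not taken from `remove_chroma_key.py`.

```python
import math
from typing import Sequence, Tuple


def hex_to_rgb(value: str) -> Tuple[int, int, int]:
    """Convert '#rrggbb' to an (r, g, b) tuple."""
    value = value.lstrip("#")
    return (int(value[0:2], 16), int(value[2:4], 16), int(value[4:6], 16))


def choose_key_color(
    subject_colors: Sequence[str],
    candidates: Sequence[str] = ("#00ff00", "#ff00ff", "#0000ff"),
    min_distance: float = 150.0,  # illustrative threshold in RGB space
) -> str:
    """Pick the first candidate far enough from every subject color.

    Falls back to the candidate farthest from the subject when none clears
    the threshold.
    """
    subject_rgb = [hex_to_rgb(c) for c in subject_colors]
    best, best_distance = candidates[0], -1.0
    for candidate in candidates:
        rgb = hex_to_rgb(candidate)
        distance = min(
            (math.dist(rgb, s) for s in subject_rgb), default=float("inf")
        )
        if distance >= min_distance:
            return candidate
        if distance > best_distance:
            best, best_distance = candidate, distance
    return best
```

Plain RGB distance is a crude proxy for visual similarity. If no candidate clears the threshold, treat that as a hint that the subject conflicts with the practical key colors and that the user should be asked about the CLI fallback.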
Prompt transparent requests like this:
```text
Create the requested subject on a perfectly flat solid #00ff00 chroma-key background for background removal.
The background must be one uniform color with no shadows, gradients, texture, reflections, floor plane, or lighting variation.
Keep the subject fully separated from the background with crisp edges and generous padding.
Do not use #00ff00 anywhere in the subject.
No cast shadow, no contact shadow, no reflection, no watermark, and no text unless explicitly requested.
```
Do not automatically use CLI `gpt-image-1.5 --background transparent --output-format png` instead of chroma keying. Ask the user first when the user asks for true/native transparency, when local removal fails validation, or when the requested image is complex: hair, fur, feathers, smoke, glass, liquids, translucent materials, reflective objects, soft shadows, realistic product grounding, or subject colors that conflict with all practical key colors.
Use a concise confirmation like:
```text
This likely needs true native transparency. The default built-in path uses a chroma-key background plus local removal, but true transparency requires the CLI fallback with gpt-image-1.5 because gpt-image-2 does not support background=transparent. It also requires OPENAI_API_KEY. Should I proceed with that CLI fallback?
```
## Prompt augmentation
Reformat user prompts into a structured, production-oriented spec. Make the user's goal clearer and more actionable, but do not blindly add detail.
Treat this as prompt-shaping guidance, not a closed schema. Use only the lines that help, and add a short extra labeled line when it materially improves clarity.
### Specificity policy
Use the user's prompt specificity to decide how much augmentation is appropriate:
- If the prompt is already specific and detailed, preserve that specificity and only normalize/structure it.
- If the prompt is generic, you may add tasteful augmentation when it will materially improve the result.
Allowed augmentations:
- composition or framing hints
- polish level or intended-use hints
- practical layout guidance
- reasonable scene concreteness that supports the stated request
Not allowed augmentations:
- extra characters or objects that are not implied by the request
- brand names, slogans, palettes, or narrative beats that are not implied
- arbitrary side-specific placement unless the surrounding layout supports it
## Use-case taxonomy (exact slugs)
Classify each request into one of these buckets and keep the slug consistent across prompts and references.
Generate:
- photorealistic-natural — candid/editorial lifestyle scenes with real texture and natural lighting.
- product-mockup — product/packaging shots, catalog imagery, merch concepts.
- ui-mockup — app/web interface mockups and wireframes; specify the desired fidelity.
- infographic-diagram — diagrams/infographics with structured layout and text.
- scientific-educational — classroom explainers, scientific diagrams, and learning visuals with required labels and accuracy constraints.
- ads-marketing — campaign concepts and ad creatives with audience, brand position, scene, and exact tagline/copy.
- productivity-visual — slide, chart, workflow, and data-heavy business visuals.
- logo-brand — logo/mark exploration, vector-friendly.
- illustration-story — comics, children's book art, narrative scenes.
- stylized-concept — style-driven concept art, 3D/stylized renders.
- historical-scene — period-accurate/world-knowledge scenes.
Edit:
- text-localization — translate/replace in-image text, preserve layout.
- identity-preserve — try-on, person-in-scene; lock face/body/pose.
- precise-object-edit — remove/replace a specific element (including interior swaps).
- lighting-weather — time-of-day/season/atmosphere changes only.
- background-extraction — transparent background / clean cutout. Use built-in `image_gen` with chroma-key removal first for simple opaque subjects; ask before using CLI true transparency for complex subjects.
- style-transfer — apply reference style while changing subject/scene.
- compositing — multi-image insert/merge with matched lighting/perspective.
- sketch-to-render — drawing/line art to photoreal render.
## Shared prompt schema
Use the following labeled spec as shared prompt scaffolding for both top-level modes:
```text
Use case: <taxonomy slug>
Asset type: <where the asset will be used>
Primary request: <user's main prompt>
Input images: <Image 1: role; Image 2: role> (optional)
Scene/backdrop: <environment>
Subject: <main subject>
Style/medium: <photo/illustration/3D/etc>
Composition/framing: <wide/close/top-down; placement>
Lighting/mood: <lighting + mood>
Color palette: <palette notes>
Materials/textures: <surface details>
Text (verbatim): "<exact text>"
Constraints: <must keep/must avoid>
Avoid: <negative constraints>
```
Notes:
- `Asset type` and `Input images` are prompt scaffolding, not dedicated CLI flags.
- `Scene/backdrop` refers to the visual setting. It is not the same as the fallback CLI `background` parameter, which controls output transparency behavior.
- Fallback-only execution notes such as `Quality:`, `Input fidelity:`, masks, output format, and output paths belong in the CLI path only. Do not treat them as built-in `image_gen` tool arguments.
Augmentation rules:
- Keep it short.
- Add only the details needed to improve the prompt materially.
- For edits, explicitly list invariants (`change only X; keep Y unchanged`).
- If any critical detail is missing and blocks success, ask a question; otherwise proceed.
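A minimal sketch of assembling the labeled spec from a mapping follows, assuming the caller has already decided which lines help. `build_prompt_spec` and `SCHEMA_LABELS` are hypothetical names. Because the schema is not closed, extra labels are kept and appended after the known ones.

```python
from typing import Mapping

# Label order taken from the shared prompt schema above.
SCHEMA_LABELS = (
    "Use case", "Asset type", "Primary request", "Input images",
    "Scene/backdrop", "Subject", "Style/medium", "Composition/framing",
    "Lighting/mood", "Color palette", "Materials/textures",
    "Text (verbatim)", "Constraints", "Avoid",
)


def build_prompt_spec(fields: Mapping[str, str]) -> str:
    """Render non-empty fields as 'Label: value' lines in schema order."""
    ordered = [label for label in SCHEMA_LABELS if label in fields]
    extras = [label for label in fields if label not in SCHEMA_LABELS]
    lines = []
    for label in ordered + extras:
        value = fields[label].strip()
        if not value:
            continue  # use only the lines that help
        if label == "Text (verbatim)":
            value = f'"{value}"'  # quote exact text so it is rendered verbatim
        lines.append(f"{label}: {value}")
    return "\n".join(lines)
```

For example, passing `{"Constraints": "no text", "Use case": "ui-mockup"}` yields the `Use case` line first and the `Constraints` line second, whatever the insertion order.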
## Examples
### Generation example (hero image)
```text
Use case: product-mockup
Asset type: landing page hero
Primary request: a minimal hero image of a ceramic coffee mug
Style/medium: clean product photography
Composition/framing: wide composition with usable negative space for page copy if needed
Lighting/mood: soft studio lighting
Constraints: no logos, no text, no watermark
```
### Edit example (invariants)
```text
Use case: precise-object-edit
Asset type: product photo background replacement
Primary request: replace only the background with a warm sunset gradient
Constraints: change only the background; keep the product and its edges unchanged; no text; no watermark
```
## Prompting best practices
- Structure the prompt as scene/backdrop -> subject -> details -> constraints.
- Include intended use (ad, UI mock, infographic) to set the mode and polish level.
- Use camera/composition language for photorealism.
- Only use SVG/vector stand-ins when the user explicitly asked for vector output or a non-image placeholder.
- Quote exact text and specify typography + placement.
- For tricky words, spell them letter-by-letter and require verbatim rendering.
- For multi-image inputs, reference images by index and describe how they should be used.
- For edits, repeat invariants every iteration to reduce drift.
- Iterate with single-change follow-ups.
- If the prompt is generic, add only the extra detail that will materially help.
- If the prompt is already detailed, normalize it instead of expanding it.
- For CLI fallback only, see `references/cli.md` and `references/image-api.md` for model, `quality`, `input_fidelity`, masks, output format, and output-path guidance.
- For transparent images, use the built-in-first chroma-key workflow unless the request is complex enough to need true CLI transparency; ask before switching to CLI `gpt-image-1.5`.
More principles shared by both modes: `references/prompting.md`.
Copy/paste specs shared by both modes: `references/sample-prompts.md`.
## Guidance by asset type
Asset-type templates (website assets, game assets, wireframes, logo) are consolidated in `references/sample-prompts.md`.
## gpt-image-2 guidance for CLI fallback
The fallback CLI defaults to `gpt-image-2`.
- Use `gpt-image-2` for new CLI/API workflows unless the request needs true model-native transparent output.
- If a transparent request may need CLI fallback, ask before using `gpt-image-1.5` unless the user already explicitly requested `gpt-image-1.5`, `scripts/image_gen.py`, or CLI fallback. Explain that the built-in chroma-key path is the default, but true transparency requires `gpt-image-1.5` because `gpt-image-2` does not support `background=transparent`.
- `gpt-image-2` always uses high fidelity for image inputs; do not set `input_fidelity` with this model.
- `gpt-image-2` supports `quality` values `low`, `medium`, `high`, and `auto`.
- Use `quality low` for fast drafts, thumbnails, and quick iterations. Use `medium`, `high`, or `auto` for final assets, dense text, diagrams, identity-sensitive edits, or high-resolution outputs.
- Square images are typically fastest to generate. Use `1024x1024` for fast square drafts.
- If the user asks for 4K-style output, use `3840x2160` for landscape or `2160x3840` for portrait.
- `gpt-image-2` size may be `auto` or `WIDTHxHEIGHT` if all constraints hold: max edge `<= 3840px`, both edges multiples of `16px`, long-to-short ratio `<= 3:1`, total pixels between `655,360` and `8,294,400`.
Popular `gpt-image-2` sizes:
- `1024x1024` square
- `1536x1024` landscape
- `1024x1536` portrait
- `2048x2048` 2K square
- `2048x1152` 2K landscape
- `3840x2160` 4K landscape
- `2160x3840` 4K portrait
- `auto`
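The size constraints above can be checked before calling the CLI. This sketch assumes the pixel bounds are inclusive, which is consistent with `3840x2160` (exactly 8,294,400 pixels) appearing in the list. `is_valid_gpt_image_2_size` is a hypothetical helper, not a function in `scripts/image_gen.py`.

```python
def is_valid_gpt_image_2_size(size: str) -> bool:
    """Check a size string against the gpt-image-2 constraints listed above."""
    if size == "auto":
        return True
    try:
        width_text, height_text = size.lower().split("x")
        width, height = int(width_text), int(height_text)
    except ValueError:
        return False  # not of the form WIDTHxHEIGHT
    if width <= 0 or height <= 0:
        return False
    long_edge, short_edge = max(width, height), min(width, height)
    return (
        long_edge <= 3840                             # max edge
        and width % 16 == 0 and height % 16 == 0      # multiples of 16px
        and long_edge <= 3 * short_edge               # ratio at most 3:1
        and 655_360 <= width * height <= 8_294_400    # total pixel bounds
    )
```

All of the popular sizes listed above pass this check, while sizes such as `4096x2160` (edge too long) or `512x512` (too few pixels) do not.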
## Fallback CLI mode only
### Temp and output conventions
These conventions apply only to the CLI fallback. They do not describe built-in `image_gen` output behavior.
- Use `tmp/imagegen/` for intermediate files (for example JSONL batches); delete them when done.
- Write final artifacts under `output/imagegen/`.
- Use `--out` or `--out-dir` to control output paths; keep filenames stable and descriptive.
### Dependencies
Prefer `uv` for dependency management in this repo.
Required Python package:
```bash
uv pip install openai
```
Required for local chroma-key removal and optional downscaling:
```bash
uv pip install pillow
```
Portability note:
- If you are using the installed skill outside this repo, install dependencies into that environment with its package manager.
- In uv-managed environments, `uv pip install ...` remains the preferred path.
### Environment
- `OPENAI_API_KEY` must be set for live API calls.
- Do not ask the user for `OPENAI_API_KEY` when using the built-in `image_gen` tool.
- Never ask the user to paste the full key in chat. Ask them to set it locally and confirm when ready.
If the key is missing, give the user these steps:
1. Create an API key in the OpenAI platform UI: https://platform.openai.com/api-keys
2. Set `OPENAI_API_KEY` as an environment variable in their system.
3. Offer to guide them through setting the environment variable for their OS/shell if needed.
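A minimal sketch of the missing-key check, assuming a Python helper is acceptable in the active environment. The function name `api_key_status` and its message wording are hypothetical; the point is that the check reports only whether the key is present and never echoes its value.

```python
import os


def api_key_status(env=None):
    """Report whether OPENAI_API_KEY is set, without ever returning its value."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if key:
        return "OPENAI_API_KEY is set; live API calls can proceed."
    # Mirror the steps above: create a key, export it locally, then confirm.
    return (
        "OPENAI_API_KEY is missing. Create a key at "
        "https://platform.openai.com/api-keys, set it as an environment "
        "variable, then confirm when ready."
    )
```

Calling `api_key_status()` with no arguments reads the real process environment; passing a dict makes the check easy to exercise without touching real credentials.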
If installation is not possible in this environment, tell the user which dependency is missing and how to install it into their active environment.
### Script-mode notes
- CLI commands + examples: `references/cli.md`
- API parameter quick reference: `references/image-api.md`
- Network approvals / sandbox settings for CLI mode: `references/codex-network.md`
## Reference map
- `references/prompting.md`: shared prompting principles for both modes.
- `references/sample-prompts.md`: shared copy/paste prompt recipes for both modes.
- `references/cli.md`: fallback-only CLI usage via `scripts/image_gen.py`.
- `references/image-api.md`: fallback-only API/CLI parameter reference.
- `references/codex-network.md`: fallback-only network/sandbox troubleshooting for CLI mode.
- `scripts/image_gen.py`: fallback-only CLI implementation. Do not load or use it unless the user explicitly chooses CLI mode or explicitly confirms a transparent request's true CLI transparency fallback.
- `$CODEX_HOME/skills/.system/imagegen/scripts/remove_chroma_key.py`: local post-processing helper for built-in transparent-image requests.
@@ -0,0 +1,6 @@
interface:
display_name: "Image Gen"
short_description: "Generate or edit images for websites, games, and more"
icon_small: "./assets/imagegen-small.svg"
icon_large: "./assets/imagegen.png"
default_prompt: "Use $imagegen to make or edit an image for this project."
@@ -0,0 +1,5 @@
<svg xmlns="http://www.w3.org/2000/svg" width="16" height="16" fill="currentColor" viewBox="0 0 16 16">
<path fill="currentColor" d="M7.51 6.827a1 1 0 1 1 .278 1.982 1 1 0 0 1-.278-1.982Z"/>
<path fill="currentColor" fill-rule="evenodd" d="M8.31 4.47c.368-.016.699.008 1.016.124l.186.075c.423.194.786.5 1.047.888l.067.107c.148.253.235.533.3.848.073.354.126.797.193 1.343l.277 2.25.088.745c.024.224.041.425.049.605.013.322-.004.615-.085.896l-.04.12a2.53 2.53 0 0 1-.802 1.115l-.16.118c-.281.189-.596.292-.956.366a9.46 9.46 0 0 1-.6.1l-.743.094-2.25.277c-.547.067-.99.121-1.35.136a2.765 2.765 0 0 1-.896-.085l-.12-.039a2.533 2.533 0 0 1-1.115-.802l-.118-.161c-.189-.28-.292-.596-.366-.956a9.42 9.42 0 0 1-.1-.599l-.094-.744-.276-2.25a17.884 17.884 0 0 1-.137-1.35c-.015-.367.009-.698.124-1.015l.076-.185c.193-.423.5-.787.887-1.048l.107-.067c.253-.148.534-.234.849-.3.354-.073.796-.126 1.343-.193l2.25-.277.744-.088c.224-.024.425-.041.606-.049Zm-2.905 5.978a1.47 1.47 0 0 0-.875.074c-.127.052-.267.146-.475.344-.212.204-.462.484-.822.889l-.314.351c.018.115.036.219.055.313.061.295.127.458.206.575l.07.094c.167.211.39.372.645.465l.109.032c.119.027.273.038.499.029.308-.013.7-.06 1.264-.13l2.25-.275.727-.093.198-.03-2.05-1.64a16.848 16.848 0 0 0-.96-.738c-.18-.121-.31-.19-.421-.23l-.106-.03Zm2.95-4.915c-.154.006-.33.021-.536.043l-.729.086-2.25.276c-.564.07-.956.118-1.257.18a1.937 1.937 0 0 0-.478.15l-.097.057a1.47 1.47 0 0 0-.515.608l-.044.107c-.048.133-.073.307-.06.608.012.307.06.7.129 1.264l.22 1.8.178-.197c.145-.159.278-.298.403-.418.255-.243.507-.437.809-.56l.181-.067a2.526 2.526 0 0 1 1.328-.06l.118.029c.27.079.517.215.772.387.287.194.619.46 1.03.789l2.52 2.016c.146-.148.26-.326.332-.524l.031-.109c.027-.119.039-.273.03-.499a8.311 8.311 0 0 0-.044-.536l-.086-.728-.276-2.25c-.07-.564-.118-.956-.18-1.258a1.935 1.935 0 0 0-.15-.477l-.057-.098a1.468 1.468 0 0 0-.608-.515l-.107-.043c-.133-.049-.306-.074-.607-.061Z" clip-rule="evenodd"/>
<path fill="currentColor" d="M7.783 1.272c.36.014.803.07 1.35.136l2.25.277.743.095c.224.03.423.062.6.099.36.074.675.177.955.366l.161.118c.364.29.642.675.802 1.115l.04.12c.081.28.098.574.085.896a9.42 9.42 0 0 1-.05.605l-.087.745-.277 2.25c-.067.547-.12.989-.193 1.343a2.765 2.765 0 0 1-.3.848l-.067.107a2.534 2.534 0 0 1-.415.474l-.086.064a.532.532 0 0 1-.622-.858l.13-.13c.04-.046.077-.094.111-.145l.057-.098c.055-.109.104-.256.15-.477.062-.302.11-.694.18-1.258l.276-2.25.086-.728c.022-.207.037-.382.043-.536.01-.226-.002-.38-.029-.5l-.032-.108a1.469 1.469 0 0 0-.464-.646l-.094-.069c-.118-.08-.28-.145-.575-.206a8.285 8.285 0 0 0-.53-.088l-.728-.092-2.25-.276c-.565-.07-.956-.117-1.264-.13a1.94 1.94 0 0 0-.5.029l-.108.032a1.469 1.469 0 0 0-.647.465l-.068.094c-.054.08-.102.18-.146.33l-.04.1a.533.533 0 0 1-.98-.403l.055-.166c.059-.162.133-.314.23-.457l.117-.16c.29-.365.675-.643 1.115-.803l.12-.04c.28-.08.574-.097.896-.084Z"/>
</svg>


Binary file not shown.


@@ -0,0 +1,242 @@
# CLI reference (`scripts/image_gen.py`)
This file is for the fallback CLI mode only. Read it when the user explicitly asks to use `scripts/image_gen.py` / CLI / API / model controls, or after the user explicitly confirms that a transparent-output request should use the `gpt-image-1.5` true-transparency fallback path.
`generate-batch` is a CLI subcommand in this fallback path. It is not a top-level mode of the skill.
The word `batch` in a user request is not CLI opt-in by itself.
## What this CLI does
- `generate`: generate a new image from a prompt
- `edit`: edit one or more existing images
- `generate-batch`: run many generation jobs from a JSONL file after the user explicitly chooses CLI/API/model controls
Real API calls require **network access** + `OPENAI_API_KEY`. `--dry-run` does not.
## Quick start (works from any repo)
Set a stable path to the skill CLI (default `CODEX_HOME` is `~/.codex`):
```bash
export CODEX_HOME="${CODEX_HOME:-$HOME/.codex}"
export IMAGE_GEN="$CODEX_HOME/skills/.system/imagegen/scripts/image_gen.py"
```
Install dependencies into that environment with its package manager. In uv-managed environments, `uv pip install ...` remains the preferred path.
## Quick start
Dry-run (no API call; no network required; does not require the `openai` package):
```bash
python "$IMAGE_GEN" generate \
--prompt "Test" \
--out output/imagegen/test.png \
--dry-run
```
Notes:
- One-off dry-runs print the API payload and the computed output path(s).
- Repo-local finals should live under `output/imagegen/`.
Generate (requires `OPENAI_API_KEY` + network):
```bash
python "$IMAGE_GEN" generate \
--prompt "A cozy alpine cabin at dawn" \
--size 1024x1024 \
--out output/imagegen/alpine-cabin.png
```
Edit:
```bash
python "$IMAGE_GEN" edit \
--image input.png \
--prompt "Replace only the background with a warm sunset" \
--out output/imagegen/sunset-edit.png
```
## Guardrails
- Use the bundled CLI directly (`python "$IMAGE_GEN" ...`) after activating the correct environment.
- Do **not** create one-off runners (for example `gen_images.py`) unless the user explicitly asks for a custom wrapper.
- **Never modify** `scripts/image_gen.py`. If something is missing, ask the user before doing anything else.
- Do not silently downgrade from CLI `gpt-image-2` or built-in `image_gen` to CLI `gpt-image-1.5`; ask first unless the user already explicitly requested `gpt-image-1.5`, `scripts/image_gen.py`, or CLI fallback.
## Defaults
- Model: `gpt-image-2`
- Supported model family for this CLI: GPT Image models (`gpt-image-*`)
- Size: `auto`
- Quality: `medium`
- Output format: `png`
- Default one-off output path: `output/imagegen/output.png`
- Background: unspecified unless `--background` is set
## gpt-image-2 size and model guidance
`gpt-image-2` is the default model for new CLI fallback work.
- Use `--quality low` for fast drafts, thumbnails, and quick iterations.
- Use `--quality medium`, `--quality high`, or `--quality auto` for final assets, dense text, diagrams, identity-sensitive edits, and high-resolution outputs.
- Square images are typically fastest. Use `--size 1024x1024` for quick square drafts.
- If the user asks for 4K-style output, use `--size 3840x2160` for landscape or `--size 2160x3840` for portrait.
- Do not pass `--input-fidelity` with `gpt-image-2`; this model always uses high fidelity for image inputs.
- Do not use `--background transparent` with `gpt-image-2`; the default transparent-image workflow uses built-in `image_gen` on a flat chroma-key background plus local removal. Use `gpt-image-1.5` only after the user explicitly confirms the true-transparent CLI fallback, unless they already requested `gpt-image-1.5`, `scripts/image_gen.py`, or CLI fallback.
Popular `gpt-image-2` sizes:
- `1024x1024`
- `1536x1024`
- `1024x1536`
- `2048x2048`
- `2048x1152`
- `3840x2160`
- `2160x3840`
- `auto`
`gpt-image-2` size constraints:
- max edge `<= 3840px`
- both edges multiples of `16px`
- long edge to short edge ratio `<= 3:1`
- total pixels between `655,360` and `8,294,400`
- outputs with more total pixels than `2560x1440` (3,686,400) are experimental
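The constraints above can be checked before a request is sent. The following is a sketch only: the function names are hypothetical and are not part of `scripts/image_gen.py`, whose own validation may differ.

```python
MAX_EDGE = 3840
MIN_PIXELS = 655_360
MAX_PIXELS = 8_294_400
EXPERIMENTAL_ABOVE = 2560 * 1440  # 3,686,400 total pixels


def check_gpt_image_2_size(size):
    """Return a list of violated constraints; an empty list means the size is valid."""
    if size == "auto":
        return []
    try:
        width_text, height_text = size.lower().split("x")
        width, height = int(width_text), int(height_text)
    except ValueError:
        return [f"{size!r} is not 'auto' or WIDTHxHEIGHT"]
    if width <= 0 or height <= 0:
        return ["both edges must be positive"]
    long_edge, short_edge = max(width, height), min(width, height)
    problems = []
    if long_edge > MAX_EDGE:
        problems.append("max edge exceeds 3840px")
    if width % 16 or height % 16:
        problems.append("both edges must be multiples of 16px")
    if long_edge > 3 * short_edge:
        problems.append("long-to-short ratio exceeds 3:1")
    if not MIN_PIXELS <= width * height <= MAX_PIXELS:
        problems.append("total pixels outside 655,360..8,294,400")
    return problems


def is_experimental(size):
    """True when a WIDTHxHEIGHT size has more total pixels than 2560x1440."""
    if size == "auto":
        return False
    width, height = (int(part) for part in size.lower().split("x"))
    return width * height > EXPERIMENTAL_ABOVE
```

Every size in the popular list passes this check; `3840x2160` sits exactly on the upper pixel limit (8,294,400).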
Fast draft:
```bash
python "$IMAGE_GEN" generate \
--prompt "A product thumbnail of a matte ceramic mug on a stone surface" \
--quality low \
--size 1024x1024 \
--out output/imagegen/mug-draft.png
```
Final 2K landscape:
```bash
python "$IMAGE_GEN" generate \
--prompt "A polished landing-page hero image of a matte ceramic mug on a stone surface" \
--quality high \
--size 2048x1152 \
--out output/imagegen/mug-hero.png
```
4K landscape:
```bash
python "$IMAGE_GEN" generate \
--prompt "A detailed architectural visualization at golden hour" \
--size 3840x2160 \
--quality high \
--out output/imagegen/architecture-4k.png
```
True transparent fallback request:
Ask for confirmation before using this command unless the user already explicitly requested `gpt-image-1.5`, `scripts/image_gen.py`, or CLI fallback.
```bash
python "$IMAGE_GEN" generate \
--model gpt-image-1.5 \
--prompt "A clean product cutout on a transparent background" \
--background transparent \
--output-format png \
--out output/imagegen/product-cutout.png
```
When using this path, explain briefly that built-in `image_gen` plus chroma-key removal is the default transparent-image path, but this request needs true model-native transparency. `gpt-image-2` does not support `background=transparent`, so `gpt-image-1.5` is required for this confirmed fallback.
## Quality, input fidelity, and masks (CLI fallback only)
These are explicit CLI controls. They are not built-in `image_gen` tool arguments.
- `--quality` works for `generate`, `edit`, and `generate-batch`: `low|medium|high|auto`
- `--input-fidelity` is **edit-only** and validated as `low|high`; it is not supported for `gpt-image-2`
- `--mask` is **edit-only**
Example:
```bash
python "$IMAGE_GEN" edit \
--model gpt-image-1.5 \
--image input.png \
--prompt "Change only the background" \
--quality high \
--input-fidelity high \
--out output/imagegen/background-edit.png
```
Mask notes:
- For multi-image edits, pass repeated `--image` flags. Their order is meaningful, so describe each image by index and role in the prompt.
- The CLI accepts a single `--mask`.
- Image and mask must be the same size and format and each under 50MB.
- Masks must include an alpha channel.
- If multiple input images are provided, the mask applies to the first image.
- Masking is prompt-guided; do not promise exact pixel-perfect mask boundaries.
- Use a PNG mask when possible; the script treats mask handling as best-effort and does not perform full preflight validation beyond file checks/warnings.
- In the edit prompt, repeat invariants (`change only the background; keep the subject unchanged`) to reduce drift.
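Because the script does not fully validate masks, a small preflight can catch the most common mistakes first. The sketch below is a hypothetical helper, not part of the bundled CLI. It assumes both files are PNGs, treats "50MB" as 50 * 1024 * 1024 bytes, and only recognizes alpha from the PNG color type (palette images that carry transparency in a `tRNS` chunk are not detected).

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
ALPHA_COLOR_TYPES = {4, 6}  # 4 = grayscale + alpha, 6 = RGBA
SIZE_LIMIT = 50 * 1024 * 1024  # assumed interpretation of "50MB"


def read_png_header(data):
    """Return (width, height, color_type) from the IHDR chunk of PNG bytes."""
    if len(data) < 26 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG with a leading IHDR chunk")
    width, height, _bit_depth, color_type = struct.unpack(">IIBB", data[16:26])
    return width, height, color_type


def mask_problems(image_bytes, mask_bytes):
    """Return human-readable problems; an empty list means the pair looks usable."""
    problems = []
    for label, data in (("image", image_bytes), ("mask", mask_bytes)):
        if len(data) >= SIZE_LIMIT:
            problems.append(f"{label} is not under 50MB")
    image_w, image_h, _ = read_png_header(image_bytes)
    mask_w, mask_h, mask_color_type = read_png_header(mask_bytes)
    if (image_w, image_h) != (mask_w, mask_h):
        problems.append("image and mask dimensions differ")
    if mask_color_type not in ALPHA_COLOR_TYPES:
        problems.append("mask has no alpha channel")
    return problems
```

A typical use would be `mask_problems(Path("input.png").read_bytes(), Path("mask.png").read_bytes())` before calling `edit` with `--mask`.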
## Output handling
- Use `tmp/imagegen/` for temporary JSONL inputs or scratch files.
- Use `output/imagegen/` for final outputs.
- Reruns fail if a target file already exists unless you pass `--force`.
- `--out-dir` changes one-off naming to `image_1.<ext>`, `image_2.<ext>`, and so on.
- Downscaled copies use the default suffix `-web` unless you override it.
## Common recipes
Generate with augmentation fields:
```bash
python "$IMAGE_GEN" generate \
--prompt "A minimal hero image of a ceramic coffee mug" \
--use-case "product-mockup" \
--style "clean product photography" \
--composition "wide product shot with usable negative space for page copy" \
--constraints "no logos, no text" \
--out output/imagegen/mug-hero.png
```
Generate + also write a downscaled copy for fast web loading:
```bash
python "$IMAGE_GEN" generate \
--prompt "A cozy alpine cabin at dawn" \
--size 1024x1024 \
--downscale-max-dim 1024 \
--out output/imagegen/alpine-cabin.png
```
Generate multiple prompts concurrently (async batch):
```bash
mkdir -p tmp/imagegen output/imagegen/batch
cat > tmp/imagegen/prompts.jsonl << 'EOF'
{"prompt":"Cavernous hangar interior with a compact shuttle parked near the center","use_case":"stylized-concept","composition":"wide-angle, low-angle","lighting":"volumetric light rays through drifting fog","constraints":"no logos or trademarks; no watermark","size":"1536x1024"}
{"prompt":"Gray wolf in profile in a snowy forest","use_case":"photorealistic-natural","composition":"eye-level","constraints":"no logos or trademarks; no watermark","size":"1024x1024"}
EOF
python "$IMAGE_GEN" generate-batch \
--input tmp/imagegen/prompts.jsonl \
--out-dir output/imagegen/batch \
--concurrency 5
rm -f tmp/imagegen/prompts.jsonl
```
Notes:
- `generate-batch` requires `--out-dir`.
- Use `--concurrency` to control parallelism (default `5`).
- Per-job overrides are supported in JSONL (for example `size`, `quality`, `background`, `output_format`, `output_compression`, `moderation`, `n`, `model`, `out`, and prompt-augmentation fields).
- `--n` generates multiple variants for a single prompt; `generate-batch` is for many different prompts.
- In batch mode, per-job `out` is treated as a filename under `--out-dir`.
- For many requested deliverable assets, provide one prompt/job per distinct asset and use semantic filenames when possible.
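As an alternative to the heredoc above, the JSONL input can be built programmatically, which avoids quoting mistakes when there are many jobs. This is a sketch that only prepares the input file; it is not a replacement runner for the CLI. The helper name `to_jsonl` and the non-empty-prompt check are assumptions, and the field names (`prompt`, `size`, `out`) are taken from the per-job overrides listed above.

```python
import json


def to_jsonl(jobs):
    """Serialize one job per line; reject jobs that have no prompt text."""
    lines = []
    for index, job in enumerate(jobs, start=1):
        if not str(job.get("prompt", "")).strip():
            raise ValueError(f"job {index} has no prompt")
        lines.append(json.dumps(job, ensure_ascii=False))
    return "\n".join(lines) + "\n"


# One job per distinct asset, with semantic filenames resolved under --out-dir.
jobs = [
    {"prompt": "Cavernous hangar interior with a compact shuttle",
     "size": "1536x1024", "out": "hangar.png"},
    {"prompt": "Gray wolf in profile in a snowy forest",
     "size": "1024x1024", "out": "wolf.png"},
]
jsonl_text = to_jsonl(jobs)
```

Write `jsonl_text` to `tmp/imagegen/prompts.jsonl`, pass that path to `generate-batch --input`, and delete the file when done.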
## CLI notes
- Supported sizes depend on the model. `gpt-image-2` supports flexible constrained sizes; older GPT Image models support `1024x1024`, `1536x1024`, `1024x1536`, or `auto`.
- True transparent CLI outputs require `output_format` to be `png` or `webp` and are not supported by `gpt-image-2`.
- `--prompt-file`, `--output-compression`, `--moderation`, `--max-attempts`, `--fail-fast`, `--force`, and `--no-augment` are supported.
- This CLI is intended for GPT Image models. Do not assume older non-GPT image-model behavior applies here.
## See also
- API parameter quick reference for fallback CLI mode: `references/image-api.md`
- Prompt examples shared across both top-level modes: `references/sample-prompts.md`
- Network/sandbox notes for fallback CLI mode: `references/codex-network.md`
- Built-in-first transparent image workflow: `SKILL.md` and `$CODEX_HOME/skills/.system/imagegen/scripts/remove_chroma_key.py`
@@ -0,0 +1,33 @@
# Codex network approvals / sandbox notes
This file is for the fallback CLI mode only. Read it when the user explicitly asks to use `scripts/image_gen.py` / CLI / API / model controls, or after the user explicitly confirms that a transparent-output request should use the `gpt-image-1.5` true-transparency fallback path.
This guidance is intentionally isolated from `SKILL.md` because it can vary by environment and may become stale. Prefer the defaults in your environment when in doubt.
## Why am I asked to approve image generation calls?
The fallback CLI uses the OpenAI Image API, so it needs outbound network access. In many Codex setups, network access is disabled by default and/or the approval policy requires confirmation before networked commands run.
## Important note about approvals vs network
- `--ask-for-approval never` suppresses approval prompts.
- It does **not** by itself enable network access.
- In `workspace-write`, network access still depends on your Codex configuration (for example `[sandbox_workspace_write] network_access = true`).
## How do I reduce repeated approval prompts?
If you trust the repo and want fewer prompts, use a configuration or profile that both:
- enables network for the sandbox mode you plan to use
- sets an approval policy that matches your risk tolerance
Example `~/.codex/config.toml` pattern:
```toml
approval_policy = "on-request"
sandbox_mode = "workspace-write"
[sandbox_workspace_write]
network_access = true
```
If you want quieter automation after network is enabled, you can choose a more permissive approval policy (one that prompts less often), but do that intentionally and with care.
## Safety note
Enabling network and reducing approvals lowers friction, but increases risk if you run untrusted code or work in an untrusted repository.
@@ -0,0 +1,90 @@
# Image API quick reference
This file is for the fallback CLI mode only. Use it when the user explicitly asks to use `scripts/image_gen.py` / CLI / API / model controls, or after the user explicitly confirms that a transparent-output request should use the `gpt-image-1.5` true-transparency fallback path.
These parameters describe the Image API and bundled CLI fallback surface. Do not assume they are normal arguments on the built-in `image_gen` tool.
## Scope
- This fallback CLI is intended for GPT Image models (`gpt-image-2`, `gpt-image-1.5`, `gpt-image-1`, and `gpt-image-1-mini`).
- The built-in `image_gen` tool and the fallback CLI do not expose the same controls.
## Model summary
| Model | Quality | Input fidelity | Resolutions | Recommended use |
| --- | --- | --- | --- | --- |
| `gpt-image-2` | `low`, `medium`, `high`, `auto` | Always high fidelity for image inputs; do not set `input_fidelity` | `auto` or flexible sizes that satisfy the constraints below | Default for new CLI/API workflows: high-quality generation and editing, text-heavy images, photorealism, compositing, identity-sensitive edits, and workflows where fewer retries matter |
| `gpt-image-1.5` | `low`, `medium`, `high`, `auto` | `low`, `high` | `1024x1024`, `1024x1536`, `1536x1024`, `auto` | True transparent-background fallback and backward-compatible workflows |
| `gpt-image-1` | `low`, `medium`, `high`, `auto` | `low`, `high` | `1024x1024`, `1024x1536`, `1536x1024`, `auto` | Legacy compatibility |
| `gpt-image-1-mini` | `low`, `medium`, `high`, `auto` | `low`, `high` | `1024x1024`, `1024x1536`, `1536x1024`, `auto` | Cost-sensitive draft batches and lower-stakes previews |
## gpt-image-2 sizes
`gpt-image-2` accepts `auto` or any `WIDTHxHEIGHT` size that satisfies all constraints:
- Maximum edge length must be less than or equal to `3840px`.
- Both edges must be multiples of `16px`.
- Long edge to short edge ratio must not exceed `3:1`.
- Total pixels must be at least `655,360` and no more than `8,294,400`.
Popular sizes:
| Label | Size | Notes |
| --- | --- | --- |
| Square | `1024x1024` | Typical fast default |
| Landscape | `1536x1024` | Standard landscape |
| Portrait | `1024x1536` | Standard portrait |
| 2K square | `2048x2048` | Larger square output |
| 2K landscape | `2048x1152` | Widescreen output |
| 4K landscape | `3840x2160` | Widescreen 4K output |
| 4K portrait | `2160x3840` | Vertical 4K output |
| Auto | `auto` | Default size |
Square images are typically fastest to generate. For 4K-style output, use `3840x2160` or `2160x3840`.
## Endpoints
- Generate: `POST /v1/images/generations` (`client.images.generate(...)`)
- Edit: `POST /v1/images/edits` (`client.images.edit(...)`)
## Core parameters for GPT Image models
- `prompt`: text prompt
- `model`: image model
- `n`: number of images (1-10)
- `size`: `auto` by default for `gpt-image-2`; flexible `WIDTHxHEIGHT` sizes are allowed only for `gpt-image-2`; older GPT Image models use `1024x1024`, `1536x1024`, `1024x1536`, or `auto`
- `quality`: `low`, `medium`, `high`, or `auto`
- `background`: output transparency behavior (`transparent`, `opaque`, or `auto`) for generated output; this is not the same thing as the prompt's visual scene/backdrop
- `output_format`: `png` (default), `jpeg`, `webp`
- `output_compression`: 0-100 (jpeg/webp only)
- `moderation`: `auto` (default) or `low`
## Edit-specific parameters
- `image`: one or more input images. For GPT Image models, you can provide up to 16 images.
- `mask`: optional mask image
- `input_fidelity`: `low` or `high` only for models that support it; do not set this for `gpt-image-2`
Model-specific note for `input_fidelity`:
- `gpt-image-2` always uses high fidelity for image inputs and does not support setting `input_fidelity`.
- `gpt-image-1` and `gpt-image-1-mini` preserve all input images, but the first image gets richer textures and finer details.
- `gpt-image-1.5` preserves the first 5 input images with higher fidelity.
## Transparent backgrounds
`gpt-image-2` does not currently support the Image API `background=transparent` parameter. The skill's default transparent-image path is built-in `image_gen` with a flat chroma-key background, followed by local alpha extraction with `python "${CODEX_HOME:-$HOME/.codex}/skills/.system/imagegen/scripts/remove_chroma_key.py"`.
Use CLI `gpt-image-1.5` with `background=transparent` and a transparent-capable output format such as `png` or `webp` only after the user explicitly confirms that fallback, unless they already requested `gpt-image-1.5`, `scripts/image_gen.py`, or CLI fallback. If the user asks for true/native transparency, the subject is too complex for clean chroma-key removal, or local background removal fails validation, explain the tradeoff and ask before switching.
## Output
- `data[]` list with `b64_json` per image
- The bundled `scripts/image_gen.py` CLI decodes `b64_json` and writes output files for you.
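For illustration only, decoding a `data[]` list by hand looks roughly like the sketch below. It assumes the response has already been converted to plain dicts shaped like the JSON body (the SDK itself returns objects, not dicts), and `save_images` is a hypothetical name. In normal use the bundled CLI does this for you.

```python
import base64
from pathlib import Path


def save_images(response_data, out_dir, ext="png"):
    """Decode each item's b64_json and write image_1.<ext>, image_2.<ext>, ..."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, item in enumerate(response_data, start=1):
        path = out_dir / f"image_{index}.{ext}"
        # b64_json holds the image bytes as base64 text.
        path.write_bytes(base64.b64decode(item["b64_json"]))
        paths.append(path)
    return paths
```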
## Limits and notes
- Input images and masks must be under 50MB.
- Use the edits endpoint when the user requests changes to an existing image.
- Masking is prompt-guided; exact shapes are not guaranteed.
- Large sizes and high quality increase latency and cost.
- Use `quality=low` for fast drafts, thumbnails, and quick iterations. Use `medium` or `high` for final assets, dense text, diagrams, identity-sensitive edits, or high-resolution outputs.
- High `input_fidelity` can materially increase input token usage on models that support it.
- If a request fails because a specific option is unsupported by the selected GPT Image model, retry manually without that option only when the option is not required by the user. If true transparent CLI output is required, ask before switching to `gpt-image-1.5` instead of dropping `background=transparent`, unless the user already explicitly chose that fallback.
## Important boundary
- `quality`, `input_fidelity`, explicit masks, `background`, `output_format`, and related parameters are fallback-only execution controls.
- Do not assume they are built-in `image_gen` tool arguments.
@@ -0,0 +1,118 @@
# Prompting best practices
These prompting principles are shared by both top-level modes of the skill:
- built-in `image_gen` tool (default)
- explicit `scripts/image_gen.py` CLI fallback
This file is about prompt structure, specificity, and iteration. Fallback-only execution controls such as `quality`, `input_fidelity`, masks, output format, and output paths live in the fallback docs.
## Contents
- [Structure](#structure)
- [Specificity policy](#specificity-policy)
- [Allowed and disallowed augmentation](#allowed-and-disallowed-augmentation)
- [Composition and layout](#composition-and-layout)
- [Constraints and invariants](#constraints-and-invariants)
- [Text in images](#text-in-images)
- [Input images and references](#input-images-and-references)
- [Iterate deliberately](#iterate-deliberately)
- [Transparent images](#transparent-images)
- [Fallback-only execution controls](#fallback-only-execution-controls)
- [Use-case tips](#use-case-tips)
- [Where to find copy/paste recipes](#where-to-find-copypaste-recipes)
## Structure
- Use a consistent order: scene/backdrop -> subject -> key details -> constraints -> output intent.
- Include intended use (ad, UI mock, infographic) to set the level of polish.
- For complex requests, use short labeled lines instead of one long paragraph.
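One way to keep that order consistent is to assemble the prompt from labeled fields. The sketch below is illustrative: the label names and the `build_prompt` helper are hypothetical, not a schema the skill or the CLI defines.

```python
PROMPT_ORDER = (
    "Scene/backdrop",
    "Subject",
    "Key details",
    "Constraints",
    "Output intent",
)


def build_prompt(fields):
    """Emit short labeled lines in a fixed order, skipping empty fields."""
    ordered = [label for label in PROMPT_ORDER if label in fields]
    # Labels outside the fixed order are kept, after the known ones.
    extras = [label for label in fields if label not in PROMPT_ORDER]
    lines = []
    for label in ordered + extras:
        value = str(fields[label]).strip()
        if value:
            lines.append(f"{label}: {value}")
    return "\n".join(lines)
```

For example, `build_prompt({"Subject": "matte ceramic mug", "Scene/backdrop": "stone surface"})` puts the scene line first regardless of the dict's key order.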
## Specificity policy
- If the user prompt is already specific and detailed, normalize it into a clean spec without adding creative requirements.
- If the prompt is generic, you may add tasteful detail when it materially improves the output.
- Treat examples in `sample-prompts.md` as fully-authored recipes, not as the default amount of augmentation to add to every request.
- For photorealism, include `photorealistic` directly when that is the goal, plus concrete real-world texture such as pores, wrinkles, fabric wear, material grain, or imperfect everyday detail.
## Allowed and disallowed augmentation
Allowed augmentation for generic prompts:
- composition and framing cues
- intended-use or polish-level hints
- practical layout guidance
- reasonable scene concreteness that supports the request
Do not add:
- extra characters, props, or objects that are not implied
- brand palettes, slogans, or story beats that are not implied
- arbitrary side-specific placement unless the surrounding layout supports it
## Composition and layout
- Specify framing and viewpoint (close-up, wide, top-down) and placement only when it materially helps.
- Call out negative space if the asset clearly needs room for UI or copy.
- Avoid making left/right layout decisions unless the user or surrounding layout supports them.
- For people, describe body framing, scale, gaze, and object interactions when they matter (`full body visible`, `looking down at the book`, `hands naturally gripping the handlebars`).
## Constraints and invariants
- State what must not change (`keep background unchanged`).
- For edits, say `change only X; keep Y unchanged` and repeat invariants on every iteration to reduce drift.
## Text in images
- Put literal text in quotes or ALL CAPS and specify typography (font style, size, color, placement).
- Spell uncommon words letter-by-letter if accuracy matters.
- For in-image copy, require verbatim rendering and no extra characters.
- In CLI fallback mode, use `medium` or `high` quality for small text, dense infographics, data-heavy slides, multi-font layouts, legends, axes, and footnotes.
## Input images and references
- Do not assume that every provided image is an edit target.
- Label each image by index and role (`Image 1: edit target`, `Image 2: style reference`).
- If the user provides images for style, composition, or mood guidance and does not ask to modify them, treat the request as generation with references.
- If the user asks to preserve an existing image while changing specific parts, treat the request as an edit.
- For compositing, describe how the images interact (`place the subject from Image 2 into Image 1`).
## Iterate deliberately
- Start with a clean base prompt, then make small single-change edits.
- Re-specify critical constraints when you iterate.
- Prefer one targeted follow-up at a time over rewriting the whole prompt.
## Transparent images
- Use built-in `image_gen` first for transparent-image requests. If the subject is clearly too complex for chroma-key removal, explain the fallback and ask before switching to CLI.
- Prompt for a perfectly flat solid chroma-key background, usually `#00ff00`; use `#ff00ff` when the subject is green, and avoid key colors that appear in the subject.
- Explicitly prohibit shadows, gradients, floor planes, reflections, texture, and lighting variation in the background.
- Ask for crisp edges, generous padding, and no use of the key color inside the subject.
- After generation, remove the background locally with `python "${CODEX_HOME:-$HOME/.codex}/skills/.system/imagegen/scripts/remove_chroma_key.py" --input <source> --out <final.png> --auto-key border --soft-matte --transparent-threshold 12 --opaque-threshold 220 --despill` and validate the alpha result before shipping it.
- Use soft matte and despill for antialiased edges; hard tolerance-only removal is mainly for flat pixel-art or exact-color fixtures.
- Use CLI `gpt-image-1.5 --background transparent --output-format png` only after the user explicitly confirms the fallback, or when the user already explicitly requested `gpt-image-1.5`, `scripts/image_gen.py`, or CLI fallback. Ask first for true/native transparency requests, failed chroma-key validation, or complex transparent subjects such as hair, fur, glass, smoke, liquids, translucent materials, reflective objects, or soft shadows.
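To make the difference between a soft matte and hard tolerance-only removal concrete, here is a conceptual sketch of a per-pixel alpha ramp. It is not the algorithm used by `remove_chroma_key.py`: the distance metric and the way the two thresholds are interpreted are assumptions chosen for illustration, and all names are hypothetical.

```python
def key_distance(pixel, key):
    """Largest per-channel difference between an RGB pixel and the key color."""
    return max(abs(channel - target) for channel, target in zip(pixel, key))


def soft_alpha(pixel, key, transparent_threshold=12, opaque_threshold=220):
    """Map distance from the key color to an alpha value in 0..255.

    Assumed semantics: at or below the transparent threshold the pixel is
    fully transparent, at or above the opaque threshold it is fully opaque,
    and in between alpha ramps linearly (this is what softens edges).
    """
    distance = key_distance(pixel[:3], key)
    if distance <= transparent_threshold:
        return 0
    if distance >= opaque_threshold:
        return 255
    span = opaque_threshold - transparent_threshold
    return round(255 * (distance - transparent_threshold) / span)


def hard_alpha(pixel, key, tolerance=12):
    """Tolerance-only removal: every pixel is either fully clear or fully opaque."""
    return 0 if key_distance(pixel[:3], key) <= tolerance else 255
```

With a `#00ff00` key, an antialiased edge pixel that blends subject and background gets a partial alpha from `soft_alpha` but is forced to fully opaque by `hard_alpha`, which is why hard removal suits flat pixel-art better than soft-edged subjects.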
## Fallback-only execution controls
- `quality`, `input_fidelity`, explicit masks, output format, and output paths are fallback-only execution controls.
- Do not assume they are built-in `image_gen` tool arguments.
- If the user explicitly chooses CLI fallback, see `references/cli.md` and `references/image-api.md` for those controls.
- In CLI fallback mode, `gpt-image-2` is the default. It supports `quality=low|medium|high|auto`; use `low` for fast drafts and thumbnails, and move to `medium`, `high`, or `auto` for final assets.
- `gpt-image-2` always uses high fidelity for image inputs, so do not set `input_fidelity` with that model.
- If a transparent request needs true CLI transparency, ask before using `gpt-image-1.5` unless the user already explicitly chose it. Explain that built-in chroma-key removal is the default path, but `gpt-image-2` does not support `background=transparent`.
- If the user asks for 4K-style output with `gpt-image-2`, use `3840x2160` for landscape or `2160x3840` for portrait.
## Use-case tips
Generate:
- photorealistic-natural: Prompt as if a real photo is captured in the moment; use photography language (lens, lighting, framing); call for real texture; avoid over-stylized polish unless requested.
- product-mockup: Describe the product/packaging and materials; ensure clean silhouette and label clarity; if in-image text is needed, require verbatim rendering and specify typography.
- ui-mockup: Describe the target fidelity first (shippable mockup or low-fi wireframe), then focus on layout, hierarchy, and practical UI elements; avoid concept-art language.
- infographic-diagram: Define the audience and layout flow; label parts explicitly; require verbatim text; prefer higher quality in CLI mode for dense labels.
- logo-brand: Keep it simple and scalable; ask for a strong silhouette and balanced negative space; avoid decorative flourishes unless requested.
- ads-marketing: Write like a creative brief; include brand positioning, audience, desired vibe, scene, and exact tagline if text must appear.
- productivity-visual: Name the exact artifact (slide, chart, workflow diagram), define the canvas and hierarchy, provide real labels/data, and ask for readable typography and polished spacing.
- scientific-educational: Define audience, lesson objective, required labels, scientific constraints, arrows, and scan-friendly whitespace.
- illustration-story: Define panels or scene beats; keep each action concrete.
- stylized-concept: Specify style cues, material finish, and rendering approach (3D, painterly, clay) without inventing new story elements.
- historical-scene: State the location/date and required period accuracy; constrain clothing, props, and environment to match the era.
Edit:
- text-localization: Change only the text; preserve layout, typography, spacing, and hierarchy; no extra words or reflow unless needed.
- identity-preserve: Lock identity (face, body, pose, hair, expression); change only the specified elements; match lighting and shadows.
- precise-object-edit: Specify exactly what to remove/replace; preserve surrounding texture and lighting; keep everything else unchanged.
- lighting-weather: Change only environmental conditions (light, shadows, atmosphere, precipitation); keep geometry, framing, and subject identity.
- background-extraction: For simple opaque subjects, request a clean cutout on a perfectly flat chroma-key background; crisp silhouette; generous padding; no shadows; no halos; preserve label text exactly; no restyling. Ask before using true CLI transparency for complex subjects.
- style-transfer: Specify style cues to preserve (palette, texture, brushwork) and what must change; add `no extra elements` to prevent drift.
- compositing: Reference inputs by index; specify what moves where; match lighting, perspective, and scale; keep the base framing unchanged.
- sketch-to-render: Preserve layout, proportions, and perspective; choose materials and lighting that support the supplied sketch without adding new elements.
## Where to find copy/paste recipes
For copy/paste prompt specs (examples only), see `references/sample-prompts.md`. This file focuses on principles, specificity, and iteration patterns.
@@ -0,0 +1,433 @@
# Sample prompts (copy/paste)
These prompt recipes are shared across both top-level modes of the skill:
- built-in `image_gen` tool (default)
- `scripts/image_gen.py` CLI fallback for explicit CLI/API/model requests or user-confirmed true-transparent-output fallback requests
Use these as starting points. They are intentionally complete prompt recipes, not the default amount of augmentation to add to every user request.
When adapting a user's prompt:
- keep user-provided requirements
- only add detail according to the specificity policy in `SKILL.md`
- do not treat every example below as permission to invent extra story elements
The labeled lines are prompt scaffolding, not a closed schema. `Asset type` and `Input images` are prompt-only scaffolding; the CLI does not expose them as dedicated flags.
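
As a rough illustration of how the labeled lines fit together, the sketch below assembles a prompt from a dictionary of labels. The helper `build_prompt` and the `LABEL_ORDER` list are hypothetical; the ordering is an assumption loosely based on the recipes in this file, and unknown labels are passed through because the scaffolding is not a closed schema.

```python
from typing import Dict, List

# Assumed ordering, loosely following the recipes in this file (not a fixed schema).
LABEL_ORDER: List[str] = [
    "Use case",
    "Asset type",
    "Input images",
    "Primary request",
    "Scene/backdrop",
    "Subject",
    "Style/medium",
    "Composition/framing",
    "Lighting/mood",
    "Color palette",
    "Materials/textures",
    "Text (verbatim)",
    "Constraints",
    "Avoid",
]


def build_prompt(fields: Dict[str, str]) -> str:
    """Join non-empty labeled fields into one prompt, known labels first."""
    lines = [f"{label}: {fields[label]}" for label in LABEL_ORDER if fields.get(label)]
    # Labels outside the known list are kept, in the order the caller supplied them.
    lines += [f"{key}: {value}" for key, value in fields.items() if key not in LABEL_ORDER and value]
    return "\n".join(lines)


print(
    build_prompt(
        {
            "Primary request": "original logo for a local bakery",
            "Use case": "logo-brand",
            "Constraints": "no trademarks; no watermark",
        }
    )
)
```

Empty fields are skipped, so a short user request stays short instead of being padded with invented detail.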
Execution details such as explicit CLI flags, `quality`, `input_fidelity`, masks, output formats, and local output paths depend on mode. Use the built-in tool by default, including simple transparent-image requests. For transparent images, prompt for a flat chroma-key background and remove it locally with `python "${CODEX_HOME:-$HOME/.codex}/skills/.system/imagegen/scripts/remove_chroma_key.py"`; only apply CLI-specific controls when the user explicitly opts into fallback mode or explicitly confirms that the transparent request should use true CLI transparency.
CLI model notes:
- `gpt-image-2` is the fallback CLI default for new workflows.
- `gpt-image-2` supports `quality` values `low`, `medium`, `high`, and `auto`.
- For 4K-style `gpt-image-2` output, use `3840x2160` or `2160x3840`.
- If transparent output needs true CLI fallback, ask before using `gpt-image-1.5` unless the user already explicitly requested `gpt-image-1.5`, `scripts/image_gen.py`, or CLI fallback. Explain that built-in chroma-key removal is the default path, but `gpt-image-2` does not support `background=transparent`.
- Do not set `input_fidelity` with `gpt-image-2`; image inputs already use high fidelity.
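
The model notes above can be expressed as a small pre-flight check. This is a sketch only: `option_problems` is a hypothetical helper, and it encodes just the two `gpt-image-2` restrictions stated in this list (no `background=transparent`, no `input_fidelity`).

```python
from typing import List, Optional


def option_problems(
    model: str,
    background: Optional[str] = None,
    input_fidelity: Optional[str] = None,
) -> List[str]:
    """Return the reasons a CLI option set conflicts with the model notes (empty if none)."""
    problems: List[str] = []
    if model != "gpt-image-2":
        # The notes above only restrict gpt-image-2; other models are not checked here.
        return problems
    if background == "transparent":
        problems.append(
            "gpt-image-2 does not support background=transparent; "
            "ask before switching to gpt-image-1.5"
        )
    if input_fidelity is not None:
        problems.append("do not set input_fidelity with gpt-image-2")
    return problems


print(option_problems("gpt-image-2", background="transparent"))
print(option_problems("gpt-image-1.5", background="transparent"))
```

Returning a list rather than raising keeps the decision with the caller, who still has to ask the user before falling back to `gpt-image-1.5`.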
For prompting principles (structure, specificity, invariants, iteration), see `references/prompting.md`.
## Generate
### photorealistic-natural
```
Use case: photorealistic-natural
Primary request: candid photo of an elderly sailor on a small fishing boat adjusting a net
Scene/backdrop: coastal water with soft haze
Subject: weathered skin with wrinkles and sun texture
Style/medium: photorealistic candid photo
Composition/framing: medium close-up, eye-level
Lighting/mood: soft coastal daylight, shallow depth of field, subtle film grain
Materials/textures: real skin texture, worn fabric, salt-worn wood
Constraints: natural color balance; no heavy retouching; no glamorization; no watermark
Avoid: studio polish; staged look
```
### product-mockup
```
Use case: product-mockup
Primary request: premium product photo of a matte black shampoo bottle with a minimal label
Scene/backdrop: clean studio gradient from light gray to white
Subject: single bottle centered with subtle reflection
Style/medium: premium product photography
Composition/framing: centered, slight three-quarter angle, generous padding
Lighting/mood: softbox lighting, clean highlights, controlled shadows
Materials/textures: matte plastic, crisp label printing
Constraints: no logos or trademarks; no watermark
```
### ui-mockup
```
Use case: ui-mockup
Primary request: mobile app home screen for a local farmers market with vendors and daily specials
Asset type: mobile app screen
Style/medium: realistic product UI, not concept art
Composition/framing: clean vertical mobile layout with clear hierarchy
Constraints: practical layout, clear typography, no logos or trademarks, no watermark
```
### infographic-diagram
```
Use case: infographic-diagram
Primary request: detailed infographic of an automatic coffee machine flow
Scene/backdrop: clean, light neutral background
Subject: bean hopper -> grinder -> brew group -> boiler -> water tank -> drip tray
Style/medium: clean vector-like infographic with clear callouts and arrows
Composition/framing: vertical poster layout, top-to-bottom flow
Text (verbatim): "Bean Hopper", "Grinder", "Brew Group", "Boiler", "Water Tank", "Drip Tray"
Constraints: clear labels, strong contrast, no logos or trademarks, no watermark
```
### scientific-educational
```
Use case: scientific-educational
Primary request: biology diagram titled "Cellular Respiration at a Glance" for high school students
Scene/backdrop: clean white classroom handout background
Subject: glucose turns into energy inside a cell; include glycolysis, Krebs cycle, and electron transport chain
Style/medium: flat scientific diagram with consistent icons, arrows, and readable labels
Composition/framing: landscape slide-style layout with clear hierarchy and generous whitespace
Text (verbatim): "Cellular Respiration at a Glance", "Glucose", "Pyruvate", "ATP", "NADH", "FADH2", "CO2", "O2", "H2O"
Constraints: scientifically plausible; avoid tiny text; no extra decoration; no watermark
```
### logo-brand
```
Use case: logo-brand
Primary request: original logo for "Field & Flour", a local bakery
Style/medium: vector logo mark; flat colors; minimal
Composition/framing: single centered logo on a plain background with generous padding
Constraints: strong silhouette, balanced negative space; original design only; no gradients unless essential; no trademarks; no watermark
```
### illustration-story
```
Use case: illustration-story
Primary request: 4-panel comic about a pet left alone at home
Scene/backdrop: cozy living room across panels
Subject: pet reacting to the owner leaving, then relaxing, then returning to a composed pose
Style/medium: comic illustration with clear panels
Composition/framing: 4 equal-sized vertical panels, readable actions per panel
Constraints: no text; no logos or trademarks; no watermark
```
### stylized-concept
```
Use case: stylized-concept
Primary request: cavernous hangar interior with tall support beams and drifting fog
Scene/backdrop: industrial hangar interior, deep scale, light haze
Subject: compact shuttle parked near the center
Style/medium: cinematic concept art, industrial realism
Composition/framing: wide-angle, low-angle
Lighting/mood: volumetric light rays cutting through fog
Constraints: no logos or trademarks; no watermark
```
### ads-marketing
```
Use case: ads-marketing
Primary request: campaign image for a streetwear brand called Thread
Subject: group of friends hanging out together in a stylish urban setting
Style/medium: polished youth streetwear campaign photography
Composition/framing: vertical ad layout with natural poses and integrated headline space
Lighting/mood: contemporary, energetic, tasteful
Text (verbatim): "Yours to Create."
Constraints: render the tagline exactly once; clean legible typography; no extra text; no watermarks; no unrelated logos
```
### productivity-visual
```
Use case: productivity-visual
Primary request: one pitch-deck slide titled "Market Opportunity"
Asset type: fundraising slide image
Style/medium: clean modern deck slide, white background, crisp sans-serif typography
Subject: TAM/SAM/SOM concentric-circle diagram plus a small growth bar chart from 2021 to 2026
Composition/framing: 16:9 landscape slide, clear data hierarchy, polished spacing
Text (verbatim): "Market Opportunity", "TAM: $42B", "SAM: $8.7B", "SOM: $340M", "AGI Research, 2024", "Internal analysis"
Constraints: readable labels, no clip art, no stock photography, no decorative clutter, no watermark
```
### historical-scene
```
Use case: historical-scene
Primary request: outdoor crowd scene in Bethel, New York on August 16, 1969
Scene/backdrop: open field with period-appropriate staging
Subject: crowd in period-accurate clothing, authentic environment
Style/medium: photorealistic photo
Composition/framing: wide shot, eye-level
Constraints: period-accurate details; no modern objects; no logos or trademarks; no watermark
```
## Asset type templates (taxonomy-aligned)
### Website assets template
```
Use case: <photorealistic-natural|stylized-concept|product-mockup|infographic-diagram|ui-mockup>
Asset type: <hero image / section illustration / blog header>
Primary request: <short description>
Scene/backdrop: <environment or abstract backdrop>
Subject: <main subject>
Style/medium: <photo/illustration/3D>
Composition/framing: <wide/centered; note usable negative space only if needed>
Lighting/mood: <soft/bright/neutral>
Color palette: <brand colors or neutral>
Constraints: <no text; no logos; no watermark; leave room for UI if needed>
```
### Website assets example: minimal hero background
```
Use case: stylized-concept
Asset type: landing page hero background
Primary request: minimal abstract background with a soft gradient and subtle texture
Style/medium: matte illustration / soft-rendered abstract background
Composition/framing: wide composition with usable negative space for page copy
Lighting/mood: gentle studio glow
Color palette: restrained neutral palette
Constraints: no text; no logos; no watermark
```
### Website assets example: feature section illustration
```
Use case: stylized-concept
Asset type: feature section illustration
Primary request: simple abstract shapes suggesting connection and flow
Scene/backdrop: subtle light-gray backdrop with faint texture
Style/medium: flat illustration; soft shadows; restrained contrast
Composition/framing: centered cluster; open margins for UI
Color palette: muted neutral palette
Constraints: no text; no logos; no watermark
```
### Website assets example: blog header image
```
Use case: photorealistic-natural
Asset type: blog header image
Primary request: overhead desk scene with notebook, pen, and coffee cup
Scene/backdrop: warm wooden tabletop
Style/medium: photorealistic photo
Composition/framing: wide crop with clean room for page copy
Lighting/mood: soft morning light
Constraints: no text; no logos; no watermark
```
### Game assets template
```
Use case: stylized-concept
Asset type: <game environment concept art / game character concept / game UI icon / tileable game texture>
Primary request: <biome/scene/character/icon/material>
Scene/backdrop: <location + set dressing> (if applicable)
Subject: <main focal element(s)>
Style/medium: <realistic/stylized>; <concept art / character render / UI icon / texture>
Composition/framing: <wide/establishing/top-down>; <camera angle>; <focal point placement>
Lighting/mood: <time of day>; <mood>; <volumetric/fog/etc>
Constraints: no logos or trademarks; no watermark
```
### Game assets example: environment concept art
```
Use case: stylized-concept
Asset type: game environment concept art
Primary request: cavernous hangar interior with tall support beams and drifting fog
Scene/backdrop: industrial hangar interior, deep scale, light haze
Subject: compact shuttle parked near the center
Style/medium: cinematic concept art, industrial realism
Composition/framing: wide-angle, low-angle
Lighting/mood: volumetric light rays cutting through fog
Constraints: no logos or trademarks; no watermark
```
### Game assets example: character concept
```
Use case: stylized-concept
Asset type: game character concept
Primary request: desert scout character with layered travel gear
Subject: long coat, satchel, practical travel clothing
Style/medium: character render; stylized realism
Composition/framing: neutral hero pose on a simple backdrop
Constraints: no logos or trademarks; no watermark
```
### Game assets example: UI icon
```
Use case: stylized-concept
Asset type: game UI icon
Primary request: round shield icon with a subtle rune pattern
Style/medium: painted game UI icon
Composition/framing: centered icon; generous padding; clear silhouette
Constraints: no text; no background scene elements; no logos or trademarks; no watermark
```
### Game assets example: tileable texture
```
Use case: stylized-concept
Asset type: tileable game texture
Primary request: worn sandstone blocks
Style/medium: seamless tileable texture; PBR-ish look
Scene/backdrop: neutral lighting reference only
Constraints: seamless edges; no obvious focal elements; no text; no logos or trademarks; no watermark
```
### Wireframe template
```
Use case: ui-mockup
Asset type: website wireframe
Primary request: <page or flow to sketch>
Style/medium: low-fi grayscale wireframe
Composition/framing: <landscape or portrait to match expected device>
Subject: <sections in order; grid/columns; key labels>
Constraints: no color; no logos; no real photos; no watermark
```
### Wireframe example: homepage (desktop)
```
Use case: ui-mockup
Asset type: website wireframe
Primary request: SaaS homepage layout with clear hierarchy
Style/medium: low-fi grayscale wireframe
Subject: top nav; hero with headline and CTA; three feature cards; testimonial strip; pricing preview; footer
Composition/framing: landscape desktop layout
Constraints: label major blocks; no color; no logos; no real photos; no watermark
```
### Wireframe example: pricing page
```
Use case: ui-mockup
Asset type: website wireframe
Primary request: pricing page layout with comparison table
Style/medium: low-fi grayscale wireframe
Subject: header; plan toggle; 3 pricing cards; comparison table; FAQ accordion; footer
Composition/framing: desktop or tablet layout
Constraints: label key areas; no color; no logos; no real photos; no watermark
```
### Wireframe example: mobile onboarding flow
```
Use case: ui-mockup
Asset type: mobile onboarding wireframe
Primary request: three-screen mobile onboarding flow
Style/medium: low-fi grayscale wireframe
Subject: screen 1 headline and CTA; screen 2 feature bullets; screen 3 form fields and CTA
Composition/framing: portrait mobile layout
Constraints: label screens and blocks; no color; no logos; no real photos; no watermark
```
### Logo template
```
Use case: logo-brand
Asset type: logo concept
Primary request: <brand idea or symbol concept>
Style/medium: vector logo mark; flat colors; minimal
Composition/framing: centered mark; clear silhouette; generous margin
Color palette: <1-2 colors; high contrast>
Text (verbatim): "<exact name>" (only if needed)
Constraints: no gradients; no mockups; no 3D; no watermark
```
### Logo example: abstract symbol mark
```
Use case: logo-brand
Asset type: logo concept
Primary request: geometric leaf symbol suggesting sustainability and growth
Style/medium: vector logo mark; flat colors; minimal
Composition/framing: centered mark; clear silhouette
Color palette: deep green and off-white
Constraints: no text unless requested; no gradients; no mockups; no 3D; no watermark
```
### Logo example: monogram mark
```
Use case: logo-brand
Asset type: logo concept
Primary request: interlocking monogram of the letters "AV"
Style/medium: vector logo mark; flat colors; minimal
Composition/framing: centered mark; balanced spacing
Color palette: black on white
Constraints: no gradients; no mockups; no 3D; no watermark
```
### Logo example: wordmark
```
Use case: logo-brand
Asset type: logo concept
Primary request: clean wordmark for a modern studio
Style/medium: vector wordmark; flat colors; minimal
Text (verbatim): "Studio North"
Composition/framing: centered text; even letter spacing
Constraints: no gradients; no mockups; no 3D; no watermark
```
## Edit
### text-localization
```
Use case: text-localization
Input images: Image 1: original infographic
Primary request: replace "Bean Hopper", "Grinder", "Brew Group", "Boiler", "Water Tank", and "Drip Tray" with "Tolva", "Molino", "Grupo de infusión", "Caldera", "Depósito de agua", and "Bandeja de goteo"
Constraints: change only the text; preserve layout, typography, spacing, and hierarchy; no extra words; do not alter logos or imagery
```
### identity-preserve
```
Use case: identity-preserve
Input images: Image 1: person photo; Image 2..N: clothing references
Primary request: replace only the clothing with the provided garments
Constraints: preserve face, body shape, pose, hair, expression, and identity; match lighting and shadows; keep the background unchanged; no accessories or text
```
### precise-object-edit
```
Use case: precise-object-edit
Input images: Image 1: room photo
Primary request: replace only the white chairs with wooden chairs
Constraints: preserve camera angle, room lighting, floor shadows, and surrounding objects; keep all other aspects unchanged
```
### lighting-weather
```
Use case: lighting-weather
Input images: Image 1: original photo
Primary request: make it look like a winter evening with gentle snowfall
Constraints: preserve subject identity, geometry, camera angle, and composition; change only lighting, atmosphere, and weather
```
### background-extraction
```
Use case: background-extraction
Input images: Image 1: product photo
Primary request: isolate the product on a clean transparent background
Scene/backdrop: perfectly flat solid #00ff00 chroma-key background for local background removal
Constraints: background must be one uniform color with no shadows, gradients, texture, reflections, floor plane, or lighting variation; crisp silhouette; generous padding; no halos or fringing; preserve label text exactly; no restyling; do not use #00ff00 anywhere in the subject
```
Post-process note: after built-in generation, run `python "${CODEX_HOME:-$HOME/.codex}/skills/.system/imagegen/scripts/remove_chroma_key.py" --input <source> --out <final.png> --auto-key border --soft-matte --transparent-threshold 12 --opaque-threshold 220 --despill`. Ask before using CLI `gpt-image-1.5 --background transparent --output-format png` for true/native transparency, failed chroma-key validation, or complex subjects such as hair, fur, glass, smoke, liquids, translucent materials, reflections, or soft shadows, unless the user already explicitly requested `gpt-image-1.5`, `scripts/image_gen.py`, or CLI fallback.
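
If the post-process step is driven from Python rather than typed by hand, the command in the note above could be assembled as follows. This is a sketch under assumptions: `chroma_key_command` is a hypothetical helper, the flags are copied from the note, and the `CODEX_HOME` fallback mirrors the shell expression `${CODEX_HOME:-$HOME/.codex}`.

```python
import os
from pathlib import Path
from typing import List


def chroma_key_command(source: str, out: str) -> List[str]:
    """Build the argv list for the chroma-key removal script described above."""
    # Like ${CODEX_HOME:-$HOME/.codex}: fall back when the variable is unset or empty.
    codex_home = os.environ.get("CODEX_HOME") or str(Path.home() / ".codex")
    script = Path(codex_home) / "skills" / ".system" / "imagegen" / "scripts" / "remove_chroma_key.py"
    return [
        "python",
        str(script),
        "--input", source,
        "--out", out,
        "--auto-key", "border",
        "--soft-matte",
        "--transparent-threshold", "12",
        "--opaque-threshold", "220",
        "--despill",
    ]


print(chroma_key_command("generated.png", "final.png"))
```

The list can be passed to `subprocess.run(cmd, check=True)`; building argv as a list avoids shell quoting problems with paths that contain spaces.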
### style-transfer
```
Use case: style-transfer
Input images: Image 1: style reference
Primary request: apply Image 1's visual style to a man riding a motorcycle on a plain white backdrop
Constraints: preserve palette, texture, and brushwork; no extra elements
```
### compositing
```
Use case: compositing
Input images: Image 1: base scene; Image 2: subject to insert
Primary request: place the subject from Image 2 next to the person in Image 1
Constraints: match lighting, perspective, and scale; keep the base framing unchanged; no extra elements
```
### character consistency workflow
```
Use case: identity-preserve
Input images: Image 1: previous character anchor illustration
Primary request: continue the story with the same character in a new scene and action
Scene/backdrop: snowy forest after a winter storm
Subject: same young forest hero gently helping a frightened squirrel out of a fallen tree
Style/medium: same children's book watercolor illustration style as Image 1
Constraints: do not redesign the character; preserve facial features, proportions, outfit, color palette, and personality; no text; no watermark
```
### sketch-to-render
```
Use case: sketch-to-render
Input images: Image 1: drawing
Primary request: turn the drawing into a photorealistic image
Constraints: preserve layout, proportions, and perspective; choose realistic materials and lighting; do not add new elements or text
```
@@ -0,0 +1,995 @@
#!/usr/bin/env python3
"""Fallback CLI for explicit image generation or editing with GPT Image models.
Used only when the user explicitly opts into CLI fallback mode, or when explicit
transparent output requires the `gpt-image-1.5` fallback path.
Defaults to gpt-image-2 and a structured prompt augmentation workflow.
"""
from __future__ import annotations
import argparse
import asyncio
import base64
import json
import os
from pathlib import Path
import re
import sys
import time
from typing import Any, Dict, Iterable, List, Optional, Tuple
from io import BytesIO
DEFAULT_MODEL = "gpt-image-2"
DEFAULT_SIZE = "auto"
DEFAULT_QUALITY = "medium"
DEFAULT_OUTPUT_FORMAT = "png"
DEFAULT_CONCURRENCY = 5
DEFAULT_DOWNSCALE_SUFFIX = "-web"
DEFAULT_OUTPUT_PATH = "output/imagegen/output.png"
GPT_IMAGE_MODEL_PREFIX = "gpt-image-"
ALLOWED_LEGACY_SIZES = {"1024x1024", "1536x1024", "1024x1536", "auto"}
ALLOWED_QUALITIES = {"low", "medium", "high", "auto"}
ALLOWED_BACKGROUNDS = {"transparent", "opaque", "auto", None}
ALLOWED_INPUT_FIDELITIES = {"low", "high", None}
GPT_IMAGE_2_MODEL = "gpt-image-2"
GPT_IMAGE_2_MIN_PIXELS = 655_360
GPT_IMAGE_2_MAX_PIXELS = 8_294_400
GPT_IMAGE_2_MAX_EDGE = 3840
GPT_IMAGE_2_MAX_RATIO = 3.0
MAX_IMAGE_BYTES = 50 * 1024 * 1024
MAX_BATCH_JOBS = 500
def _die(message: str, code: int = 1) -> None:
    print(f"Error: {message}", file=sys.stderr)
    raise SystemExit(code)


def _warn(message: str) -> None:
    print(f"Warning: {message}", file=sys.stderr)


def _dependency_hint(package: str, *, upgrade: bool = False) -> str:
    command = f"uv pip install {'-U ' if upgrade else ''}{package}"
    return (
        "Activate the repo-selected environment first, then install it with "
        f"`{command}`. If this repo uses a local virtualenv, start with "
        "`source .venv/bin/activate`; otherwise use this repo's configured shared fallback "
        "environment. If your project declares dependencies, prefer that project's normal "
        "`uv sync` flow."
    )


def _ensure_api_key(dry_run: bool) -> None:
    if os.getenv("OPENAI_API_KEY"):
        print("OPENAI_API_KEY is set.", file=sys.stderr)
        return
    if dry_run:
        _warn("OPENAI_API_KEY is not set; dry-run only.")
        return
    _die("OPENAI_API_KEY is not set. Export it before running.")
def _read_prompt(prompt: Optional[str], prompt_file: Optional[str]) -> str:
    if prompt and prompt_file:
        _die("Use --prompt or --prompt-file, not both.")
    if prompt_file:
        path = Path(prompt_file)
        if not path.exists():
            _die(f"Prompt file not found: {path}")
        return path.read_text(encoding="utf-8").strip()
    if prompt:
        return prompt.strip()
    _die("Missing prompt. Use --prompt or --prompt-file.")
    return ""  # unreachable


def _check_image_paths(paths: Iterable[str]) -> List[Path]:
    resolved: List[Path] = []
    for raw in paths:
        path = Path(raw)
        if not path.exists():
            _die(f"Image file not found: {path}")
        if path.stat().st_size > MAX_IMAGE_BYTES:
            _warn(f"Image exceeds 50MB limit: {path}")
        resolved.append(path)
    return resolved


def _normalize_output_format(fmt: Optional[str]) -> str:
    if not fmt:
        return DEFAULT_OUTPUT_FORMAT
    fmt = fmt.lower()
    if fmt not in {"png", "jpeg", "jpg", "webp"}:
        _die("output-format must be png, jpeg, jpg, or webp.")
    return "jpeg" if fmt == "jpg" else fmt


def _parse_size(size: str) -> Optional[Tuple[int, int]]:
    match = re.fullmatch(r"([1-9][0-9]*)x([1-9][0-9]*)", size)
    if not match:
        return None
    return int(match.group(1)), int(match.group(2))
def _validate_gpt_image_2_size(size: str) -> None:
    if size == "auto":
        return
    parsed = _parse_size(size)
    if parsed is None:
        _die("size must be auto or WIDTHxHEIGHT, for example 1024x1024.")
    width, height = parsed
    max_edge = max(width, height)
    min_edge = min(width, height)
    total_pixels = width * height
    if max_edge > GPT_IMAGE_2_MAX_EDGE:
        _die("gpt-image-2 size maximum edge length must be less than or equal to 3840px.")
    if width % 16 != 0 or height % 16 != 0:
        _die("gpt-image-2 size width and height must be multiples of 16px.")
    if max_edge / min_edge > GPT_IMAGE_2_MAX_RATIO:
        _die("gpt-image-2 size long edge to short edge ratio must not exceed 3:1.")
    if total_pixels < GPT_IMAGE_2_MIN_PIXELS or total_pixels > GPT_IMAGE_2_MAX_PIXELS:
        _die(
            "gpt-image-2 size total pixels must be at least 655,360 and no more than 8,294,400."
        )


def _validate_size(size: str, model: str) -> None:
    if model == GPT_IMAGE_2_MODEL:
        _validate_gpt_image_2_size(size)
        return
    if size not in ALLOWED_LEGACY_SIZES:
        _die(
            "size must be one of 1024x1024, 1536x1024, 1024x1536, or auto for this GPT Image model."
        )


def _validate_quality(quality: str) -> None:
    if quality not in ALLOWED_QUALITIES:
        _die("quality must be one of low, medium, high, or auto.")


def _validate_background(background: Optional[str]) -> None:
    if background not in ALLOWED_BACKGROUNDS:
        _die("background must be one of transparent, opaque, or auto.")


def _validate_input_fidelity(input_fidelity: Optional[str]) -> None:
    if input_fidelity not in ALLOWED_INPUT_FIDELITIES:
        _die("input-fidelity must be one of low or high.")


def _validate_model(model: str) -> None:
    if not model.startswith(GPT_IMAGE_MODEL_PREFIX):
        _die(
            "model must be a GPT Image model (for example gpt-image-1.5, gpt-image-1, or gpt-image-1-mini)."
        )


def _validate_transparency(background: Optional[str], output_format: str) -> None:
    if background == "transparent" and output_format not in {"png", "webp"}:
        _die("transparent background requires output-format png or webp.")


def _validate_model_specific_options(
    *,
    model: str,
    background: Optional[str],
    input_fidelity: Optional[str] = None,
) -> None:
    if model != GPT_IMAGE_2_MODEL:
        return
    if background == "transparent":
        _die(
            "transparent backgrounds are not supported in gpt-image-2, the latest model. "
            "Use --model gpt-image-1.5 --background transparent --output-format png instead."
        )
    if input_fidelity is not None:
        _die(
            "input_fidelity is not supported in gpt-image-2 because image inputs always use high fidelity for this model."
        )
def _validate_generate_payload(payload: Dict[str, Any]) -> None:
    model = str(payload.get("model", DEFAULT_MODEL))
    _validate_model(model)
    n = int(payload.get("n", 1))
    if n < 1 or n > 10:
        _die("n must be between 1 and 10")
    size = str(payload.get("size", DEFAULT_SIZE))
    quality = str(payload.get("quality", DEFAULT_QUALITY))
    background = payload.get("background")
    _validate_size(size, model)
    _validate_quality(quality)
    _validate_background(background)
    _validate_model_specific_options(model=model, background=background)
    oc = payload.get("output_compression")
    if oc is not None and not (0 <= int(oc) <= 100):
        _die("output_compression must be between 0 and 100")


def _build_output_paths(
    out: str,
    output_format: str,
    count: int,
    out_dir: Optional[str],
) -> List[Path]:
    ext = "." + output_format
    if out_dir:
        out_base = Path(out_dir)
        out_base.mkdir(parents=True, exist_ok=True)
        return [out_base / f"image_{i}{ext}" for i in range(1, count + 1)]
    out_path = Path(out)
    if out_path.exists() and out_path.is_dir():
        out_path.mkdir(parents=True, exist_ok=True)
        return [out_path / f"image_{i}{ext}" for i in range(1, count + 1)]
    if out_path.suffix == "":
        out_path = out_path.with_suffix(ext)
    elif output_format and out_path.suffix.lstrip(".").lower() != output_format:
        _warn(
            f"Output extension {out_path.suffix} does not match output-format {output_format}."
        )
    if count == 1:
        return [out_path]
    return [
        out_path.with_name(f"{out_path.stem}-{i}{out_path.suffix}")
        for i in range(1, count + 1)
    ]
def _augment_prompt(args: argparse.Namespace, prompt: str) -> str:
    fields = _fields_from_args(args)
    return _augment_prompt_fields(args.augment, prompt, fields)


def _augment_prompt_fields(augment: bool, prompt: str, fields: Dict[str, Optional[str]]) -> str:
    if not augment:
        return prompt
    sections: List[str] = []
    if fields.get("use_case"):
        sections.append(f"Use case: {fields['use_case']}")
    sections.append(f"Primary request: {prompt}")
    if fields.get("scene"):
        sections.append(f"Scene/background: {fields['scene']}")
    if fields.get("subject"):
        sections.append(f"Subject: {fields['subject']}")
    if fields.get("style"):
        sections.append(f"Style/medium: {fields['style']}")
    if fields.get("composition"):
        sections.append(f"Composition/framing: {fields['composition']}")
    if fields.get("lighting"):
        sections.append(f"Lighting/mood: {fields['lighting']}")
    if fields.get("palette"):
        sections.append(f"Color palette: {fields['palette']}")
    if fields.get("materials"):
        sections.append(f"Materials/textures: {fields['materials']}")
    if fields.get("text"):
        sections.append(f"Text (verbatim): \"{fields['text']}\"")
    if fields.get("constraints"):
        sections.append(f"Constraints: {fields['constraints']}")
    if fields.get("negative"):
        sections.append(f"Avoid: {fields['negative']}")
    return "\n".join(sections)


def _fields_from_args(args: argparse.Namespace) -> Dict[str, Optional[str]]:
    return {
        "use_case": getattr(args, "use_case", None),
        "scene": getattr(args, "scene", None),
        "subject": getattr(args, "subject", None),
        "style": getattr(args, "style", None),
        "composition": getattr(args, "composition", None),
        "lighting": getattr(args, "lighting", None),
        "palette": getattr(args, "palette", None),
        "materials": getattr(args, "materials", None),
        "text": getattr(args, "text", None),
        "constraints": getattr(args, "constraints", None),
        "negative": getattr(args, "negative", None),
    }


def _print_request(payload: dict) -> None:
    print(json.dumps(payload, indent=2, sort_keys=True))
def _decode_and_write(images: List[str], outputs: List[Path], force: bool) -> None:
    for idx, image_b64 in enumerate(images):
        if idx >= len(outputs):
            break
        out_path = outputs[idx]
        if out_path.exists() and not force:
            _die(f"Output already exists: {out_path} (use --force to overwrite)")
        out_path.parent.mkdir(parents=True, exist_ok=True)
        out_path.write_bytes(base64.b64decode(image_b64))
        print(f"Wrote {out_path}")


def _derive_downscale_path(path: Path, suffix: str) -> Path:
    if suffix and not suffix.startswith("-") and not suffix.startswith("_"):
        suffix = "-" + suffix
    return path.with_name(f"{path.stem}{suffix}{path.suffix}")


def _downscale_image_bytes(image_bytes: bytes, *, max_dim: int, output_format: str) -> bytes:
    try:
        from PIL import Image
    except Exception:
        _die(f"Downscaling requires Pillow. {_dependency_hint('pillow')}")
    if max_dim < 1:
        _die("--downscale-max-dim must be >= 1")
    with Image.open(BytesIO(image_bytes)) as img:
        img.load()
        w, h = img.size
        scale = min(1.0, float(max_dim) / float(max(w, h)))
        target = (max(1, int(round(w * scale))), max(1, int(round(h * scale))))
        resized = img if target == (w, h) else img.resize(target, Image.Resampling.LANCZOS)
        fmt = output_format.lower()
        if fmt == "jpg":
            fmt = "jpeg"
        if fmt == "jpeg":
            if resized.mode in ("RGBA", "LA") or ("transparency" in getattr(resized, "info", {})):
                bg = Image.new("RGB", resized.size, (255, 255, 255))
                bg.paste(resized.convert("RGBA"), mask=resized.convert("RGBA").split()[-1])
                resized = bg
            else:
                resized = resized.convert("RGB")
        out = BytesIO()
        resized.save(out, format=fmt.upper())
        return out.getvalue()


def _decode_write_and_downscale(
    images: List[str],
    outputs: List[Path],
    *,
    force: bool,
    downscale_max_dim: Optional[int],
    downscale_suffix: str,
    output_format: str,
) -> None:
    for idx, image_b64 in enumerate(images):
        if idx >= len(outputs):
            break
        out_path = outputs[idx]
        if out_path.exists() and not force:
            _die(f"Output already exists: {out_path} (use --force to overwrite)")
        out_path.parent.mkdir(parents=True, exist_ok=True)
        raw = base64.b64decode(image_b64)
        out_path.write_bytes(raw)
        print(f"Wrote {out_path}")
        if downscale_max_dim is None:
            continue
        derived = _derive_downscale_path(out_path, downscale_suffix)
        if derived.exists() and not force:
            _die(f"Output already exists: {derived} (use --force to overwrite)")
        derived.parent.mkdir(parents=True, exist_ok=True)
        resized = _downscale_image_bytes(raw, max_dim=downscale_max_dim, output_format=output_format)
        derived.write_bytes(resized)
        print(f"Wrote {derived}")
def _create_client():
try:
from openai import OpenAI
except ImportError:
_die(f"openai SDK not installed in the active environment. {_dependency_hint('openai')}")
return OpenAI()
def _create_async_client():
try:
from openai import AsyncOpenAI
except ImportError:
try:
import openai as _openai # noqa: F401
except ImportError:
_die(
f"openai SDK not installed in the active environment. {_dependency_hint('openai')}"
)
_die(
"AsyncOpenAI not available in this openai SDK version. "
f"{_dependency_hint('openai', upgrade=True)}"
)
return AsyncOpenAI()
def _slugify(value: str) -> str:
value = value.strip().lower()
value = re.sub(r"[^a-z0-9]+", "-", value)
value = re.sub(r"-{2,}", "-", value).strip("-")
return value[:60] if value else "job"
def _normalize_job(job: Any, idx: int) -> Dict[str, Any]:
if isinstance(job, str):
prompt = job.strip()
if not prompt:
_die(f"Empty prompt at job {idx}")
return {"prompt": prompt}
if isinstance(job, dict):
if "prompt" not in job or not str(job["prompt"]).strip():
_die(f"Missing prompt for job {idx}")
return job
_die(f"Invalid job at index {idx}: expected string or object.")
return {} # unreachable
def _read_jobs_jsonl(path: str) -> List[Dict[str, Any]]:
p = Path(path)
if not p.exists():
_die(f"Input file not found: {p}")
jobs: List[Dict[str, Any]] = []
for line_no, raw in enumerate(p.read_text(encoding="utf-8").splitlines(), start=1):
line = raw.strip()
if not line or line.startswith("#"):
continue
try:
item: Any
if line.startswith("{"):
item = json.loads(line)
else:
item = line
jobs.append(_normalize_job(item, idx=line_no))
except json.JSONDecodeError as exc:
_die(f"Invalid JSON on line {line_no}: {exc}")
if not jobs:
_die("No jobs found in input file.")
if len(jobs) > MAX_BATCH_JOBS:
_die(f"Too many jobs ({len(jobs)}). Max is {MAX_BATCH_JOBS}.")
return jobs
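# Illustrative sketch, not part of the committed script: what a generate-batch
# input file may look like, based on the parsing rules in _read_jobs_jsonl.
# Blank lines and lines starting with "#" are skipped, a line starting with "{"
# is parsed as a JSON object, and any other line is treated as a bare prompt.
# The names SAMPLE_JOBS_JSONL and sketch_parse_jobs are hypothetical, and the
# sample prompts are made up for the example.

```python
import json

SAMPLE_JOBS_JSONL = """\
# comment lines and blank lines are skipped

a watercolor fox in a misty forest
{"prompt": "isometric office icon", "size": "1024x1024", "out": "icon.png"}
"""


def sketch_parse_jobs(text: str) -> list:
    jobs = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # JSON objects carry per-job overrides; plain lines are prompts only.
        item = json.loads(line) if line.startswith("{") else {"prompt": line}
        jobs.append(item)
    return jobs


sketch_jobs = sketch_parse_jobs(SAMPLE_JOBS_JSONL)
```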
def _merge_non_null(dst: Dict[str, Any], src: Dict[str, Any]) -> Dict[str, Any]:
merged = dict(dst)
for k, v in src.items():
if v is not None:
merged[k] = v
return merged
def _job_output_paths(
*,
out_dir: Path,
output_format: str,
idx: int,
prompt: str,
n: int,
explicit_out: Optional[str],
) -> List[Path]:
out_dir.mkdir(parents=True, exist_ok=True)
ext = "." + output_format
if explicit_out:
base = Path(explicit_out)
if base.suffix == "":
base = base.with_suffix(ext)
elif base.suffix.lstrip(".").lower() != output_format:
_warn(
f"Job {idx}: output extension {base.suffix} does not match output-format {output_format}."
)
base = out_dir / base.name
else:
slug = _slugify(prompt[:80])
base = out_dir / f"{idx:03d}-{slug}{ext}"
if n == 1:
return [base]
return [
base.with_name(f"{base.stem}-{i}{base.suffix}")
for i in range(1, n + 1)
]
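# Illustrative sketch, not part of the committed script: how default batch
# output names are derived when a job has no explicit "out" value. It mirrors
# the slug and numbering rules of _slugify and _job_output_paths but omits the
# output directory. The names sketch_slug and sketch_output_names are
# hypothetical.

```python
import re
from pathlib import Path


def sketch_slug(value: str) -> str:
    # Lowercase, collapse every run of non-alphanumerics into one hyphen.
    value = re.sub(r"[^a-z0-9]+", "-", value.strip().lower())
    value = re.sub(r"-{2,}", "-", value).strip("-")
    return value[:60] if value else "job"


def sketch_output_names(idx: int, prompt: str, n: int, ext: str = ".png") -> list:
    base = Path(f"{idx:03d}-{sketch_slug(prompt[:80])}{ext}")
    if n == 1:
        return [base.name]
    # With several images per job, a 1-based counter is appended to the stem.
    return [f"{base.stem}-{i}{base.suffix}" for i in range(1, n + 1)]


single = sketch_output_names(3, "A Red Fox!", 1)
multiple = sketch_output_names(3, "A Red Fox!", 2)
fallback = sketch_slug("!!!")
```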
def _extract_retry_after_seconds(exc: Exception) -> Optional[float]:
# Best-effort: openai SDK errors vary by version. Prefer a conservative fallback.
for attr in ("retry_after", "retry_after_seconds"):
val = getattr(exc, attr, None)
if isinstance(val, (int, float)) and val >= 0:
return float(val)
msg = str(exc)
m = re.search(r"retry[- ]after[:= ]+([0-9]+(?:\.[0-9]+)?)", msg, re.IGNORECASE)
if m:
try:
return float(m.group(1))
except Exception:
return None
return None
def _is_rate_limit_error(exc: Exception) -> bool:
name = exc.__class__.__name__.lower()
if "ratelimit" in name or "rate_limit" in name:
return True
msg = str(exc).lower()
return "429" in msg or "rate limit" in msg or "too many requests" in msg
def _is_transient_error(exc: Exception) -> bool:
if _is_rate_limit_error(exc):
return True
name = exc.__class__.__name__.lower()
if "timeout" in name or "timedout" in name or "tempor" in name:
return True
msg = str(exc).lower()
return "timeout" in msg or "timed out" in msg or "connection reset" in msg
async def _generate_one_with_retries(
client: Any,
payload: Dict[str, Any],
*,
attempts: int,
job_label: str,
) -> Any:
last_exc: Optional[Exception] = None
for attempt in range(1, attempts + 1):
try:
return await client.images.generate(**payload)
except Exception as exc:
last_exc = exc
if not _is_transient_error(exc):
raise
if attempt == attempts:
raise
sleep_s = _extract_retry_after_seconds(exc)
if sleep_s is None:
sleep_s = min(60.0, 2.0**attempt)
print(
f"{job_label} attempt {attempt}/{attempts} failed ({exc.__class__.__name__}); retrying in {sleep_s:.1f}s",
file=sys.stderr,
)
await asyncio.sleep(sleep_s)
raise last_exc or RuntimeError("unknown error")
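# Illustrative sketch, not part of the committed script: the fallback backoff
# schedule used by _generate_one_with_retries when the error carries no
# retry-after hint. The delay before the next try is min(60, 2**attempt), and
# there is no sleep after the final attempt because the error is re-raised.
# The name sketch_backoff_delays is hypothetical.

```python
def sketch_backoff_delays(attempts: int, cap: float = 60.0) -> list:
    # One delay per failed attempt except the last, which raises instead.
    return [min(cap, 2.0 ** attempt) for attempt in range(1, attempts)]


default_delays = sketch_backoff_delays(3)    # matches --max-attempts default
longest_delays = sketch_backoff_delays(10)   # matches --max-attempts upper bound
```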
async def _run_generate_batch(args: argparse.Namespace) -> int:
jobs = _read_jobs_jsonl(args.input)
out_dir = Path(args.out_dir)
base_fields = _fields_from_args(args)
base_payload = {
"model": args.model,
"n": args.n,
"size": args.size,
"quality": args.quality,
"background": args.background,
"output_format": args.output_format,
"output_compression": args.output_compression,
"moderation": args.moderation,
}
if args.dry_run:
for i, job in enumerate(jobs, start=1):
prompt = str(job["prompt"]).strip()
fields = _merge_non_null(base_fields, job.get("fields", {}))
# Allow flat job keys as well (use_case, scene, etc.)
fields = _merge_non_null(fields, {k: job.get(k) for k in base_fields.keys()})
augmented = _augment_prompt_fields(args.augment, prompt, fields)
job_payload = dict(base_payload)
job_payload["prompt"] = augmented
job_payload = _merge_non_null(job_payload, {k: job.get(k) for k in base_payload.keys()})
job_payload = {k: v for k, v in job_payload.items() if v is not None}
_validate_generate_payload(job_payload)
effective_output_format = _normalize_output_format(job_payload.get("output_format"))
_validate_transparency(job_payload.get("background"), effective_output_format)
job_payload["output_format"] = effective_output_format
n = int(job_payload.get("n", 1))
outputs = _job_output_paths(
out_dir=out_dir,
output_format=effective_output_format,
idx=i,
prompt=prompt,
n=n,
explicit_out=job.get("out"),
)
downscaled = None
if args.downscale_max_dim is not None:
downscaled = [
str(_derive_downscale_path(p, args.downscale_suffix)) for p in outputs
]
_print_request(
{
"endpoint": "/v1/images/generations",
"job": i,
"outputs": [str(p) for p in outputs],
"outputs_downscaled": downscaled,
**job_payload,
}
)
return 0
client = _create_async_client()
sem = asyncio.Semaphore(args.concurrency)
any_failed = False
async def run_job(i: int, job: Dict[str, Any]) -> Tuple[int, Optional[str]]:
nonlocal any_failed
prompt = str(job["prompt"]).strip()
job_label = f"[job {i}/{len(jobs)}]"
fields = _merge_non_null(base_fields, job.get("fields", {}))
fields = _merge_non_null(fields, {k: job.get(k) for k in base_fields.keys()})
augmented = _augment_prompt_fields(args.augment, prompt, fields)
payload = dict(base_payload)
payload["prompt"] = augmented
payload = _merge_non_null(payload, {k: job.get(k) for k in base_payload.keys()})
payload = {k: v for k, v in payload.items() if v is not None}
n = int(payload.get("n", 1))
_validate_generate_payload(payload)
effective_output_format = _normalize_output_format(payload.get("output_format"))
_validate_transparency(payload.get("background"), effective_output_format)
payload["output_format"] = effective_output_format
outputs = _job_output_paths(
out_dir=out_dir,
output_format=effective_output_format,
idx=i,
prompt=prompt,
n=n,
explicit_out=job.get("out"),
)
try:
async with sem:
print(f"{job_label} starting", file=sys.stderr)
started = time.time()
result = await _generate_one_with_retries(
client,
payload,
attempts=args.max_attempts,
job_label=job_label,
)
elapsed = time.time() - started
print(f"{job_label} completed in {elapsed:.1f}s", file=sys.stderr)
images = [item.b64_json for item in result.data]
_decode_write_and_downscale(
images,
outputs,
force=args.force,
downscale_max_dim=args.downscale_max_dim,
downscale_suffix=args.downscale_suffix,
output_format=effective_output_format,
)
return i, None
except Exception as exc:
any_failed = True
print(f"{job_label} failed: {exc}", file=sys.stderr)
if args.fail_fast:
raise
return i, str(exc)
tasks = [asyncio.create_task(run_job(i, job)) for i, job in enumerate(jobs, start=1)]
try:
await asyncio.gather(*tasks)
except Exception:
for t in tasks:
if not t.done():
t.cancel()
raise
return 1 if any_failed else 0
def _generate_batch(args: argparse.Namespace) -> None:
exit_code = asyncio.run(_run_generate_batch(args))
if exit_code:
raise SystemExit(exit_code)
def _generate(args: argparse.Namespace) -> None:
prompt = _read_prompt(args.prompt, args.prompt_file)
prompt = _augment_prompt(args, prompt)
payload = {
"model": args.model,
"prompt": prompt,
"n": args.n,
"size": args.size,
"quality": args.quality,
"background": args.background,
"output_format": args.output_format,
"output_compression": args.output_compression,
"moderation": args.moderation,
}
payload = {k: v for k, v in payload.items() if v is not None}
output_format = _normalize_output_format(args.output_format)
_validate_transparency(args.background, output_format)
payload["output_format"] = output_format
output_paths = _build_output_paths(args.out, output_format, args.n, args.out_dir)
downscaled = None
if args.downscale_max_dim is not None:
downscaled = [str(_derive_downscale_path(p, args.downscale_suffix)) for p in output_paths]
if args.dry_run:
_print_request(
{
"endpoint": "/v1/images/generations",
"outputs": [str(p) for p in output_paths],
"outputs_downscaled": downscaled,
**payload,
}
)
return
print(
"Calling Image API (generation). This can take up to a couple of minutes.",
file=sys.stderr,
)
started = time.time()
client = _create_client()
result = client.images.generate(**payload)
elapsed = time.time() - started
print(f"Generation completed in {elapsed:.1f}s.", file=sys.stderr)
images = [item.b64_json for item in result.data]
_decode_write_and_downscale(
images,
output_paths,
force=args.force,
downscale_max_dim=args.downscale_max_dim,
downscale_suffix=args.downscale_suffix,
output_format=output_format,
)
def _edit(args: argparse.Namespace) -> None:
prompt = _read_prompt(args.prompt, args.prompt_file)
prompt = _augment_prompt(args, prompt)
image_paths = _check_image_paths(args.image)
mask_path = Path(args.mask) if args.mask else None
if mask_path:
if not mask_path.exists():
_die(f"Mask file not found: {mask_path}")
if mask_path.suffix.lower() != ".png":
_warn(f"Mask should be a PNG with an alpha channel: {mask_path}")
if mask_path.stat().st_size > MAX_IMAGE_BYTES:
_warn(f"Mask exceeds 50MB limit: {mask_path}")
payload = {
"model": args.model,
"prompt": prompt,
"n": args.n,
"size": args.size,
"quality": args.quality,
"background": args.background,
"output_format": args.output_format,
"output_compression": args.output_compression,
"input_fidelity": args.input_fidelity,
"moderation": args.moderation,
}
payload = {k: v for k, v in payload.items() if v is not None}
output_format = _normalize_output_format(args.output_format)
_validate_transparency(args.background, output_format)
payload["output_format"] = output_format
_validate_input_fidelity(args.input_fidelity)
output_paths = _build_output_paths(args.out, output_format, args.n, args.out_dir)
downscaled = None
if args.downscale_max_dim is not None:
downscaled = [str(_derive_downscale_path(p, args.downscale_suffix)) for p in output_paths]
if args.dry_run:
payload_preview = dict(payload)
payload_preview["image"] = [str(p) for p in image_paths]
if mask_path:
payload_preview["mask"] = str(mask_path)
_print_request(
{
"endpoint": "/v1/images/edits",
"outputs": [str(p) for p in output_paths],
"outputs_downscaled": downscaled,
**payload_preview,
}
)
return
print(
f"Calling Image API (edit) with {len(image_paths)} image(s).",
file=sys.stderr,
)
started = time.time()
client = _create_client()
with _open_files(image_paths) as image_files, _open_mask(mask_path) as mask_file:
request = dict(payload)
request["image"] = image_files if len(image_files) > 1 else image_files[0]
if mask_file is not None:
request["mask"] = mask_file
result = client.images.edit(**request)
elapsed = time.time() - started
print(f"Edit completed in {elapsed:.1f}s.", file=sys.stderr)
images = [item.b64_json for item in result.data]
_decode_write_and_downscale(
images,
output_paths,
force=args.force,
downscale_max_dim=args.downscale_max_dim,
downscale_suffix=args.downscale_suffix,
output_format=output_format,
)
def _open_files(paths: List[Path]):
return _FileBundle(paths)
def _open_mask(mask_path: Optional[Path]):
if mask_path is None:
return _NullContext()
return _SingleFile(mask_path)
class _NullContext:
def __enter__(self):
return None
def __exit__(self, exc_type, exc, tb):
return False
class _SingleFile:
def __init__(self, path: Path):
self._path = path
self._handle = None
def __enter__(self):
self._handle = self._path.open("rb")
return self._handle
def __exit__(self, exc_type, exc, tb):
if self._handle:
try:
self._handle.close()
except Exception:
pass
return False
class _FileBundle:
def __init__(self, paths: List[Path]):
self._paths = paths
self._handles: List[object] = []
def __enter__(self):
self._handles = [p.open("rb") for p in self._paths]
return self._handles
def __exit__(self, exc_type, exc, tb):
for handle in self._handles:
try:
handle.close()
except Exception:
pass
return False
def _add_shared_args(parser: argparse.ArgumentParser) -> None:
parser.add_argument("--model", default=DEFAULT_MODEL)
parser.add_argument("--prompt")
parser.add_argument("--prompt-file")
parser.add_argument("--n", type=int, default=1)
parser.add_argument("--size", default=DEFAULT_SIZE)
parser.add_argument("--quality", default=DEFAULT_QUALITY)
parser.add_argument("--background")
parser.add_argument("--output-format")
parser.add_argument("--output-compression", type=int)
parser.add_argument("--moderation")
parser.add_argument("--out", default=DEFAULT_OUTPUT_PATH)
parser.add_argument("--out-dir")
parser.add_argument("--force", action="store_true")
parser.add_argument("--dry-run", action="store_true")
parser.add_argument("--augment", dest="augment", action="store_true")
parser.add_argument("--no-augment", dest="augment", action="store_false")
parser.set_defaults(augment=True)
# Prompt augmentation hints
parser.add_argument("--use-case")
parser.add_argument("--scene")
parser.add_argument("--subject")
parser.add_argument("--style")
parser.add_argument("--composition")
parser.add_argument("--lighting")
parser.add_argument("--palette")
parser.add_argument("--materials")
parser.add_argument("--text")
parser.add_argument("--constraints")
parser.add_argument("--negative")
# Post-processing (optional): generate an additional downscaled copy for fast web loading.
parser.add_argument("--downscale-max-dim", type=int)
parser.add_argument("--downscale-suffix", default=DEFAULT_DOWNSCALE_SUFFIX)
def main() -> int:
parser = argparse.ArgumentParser(
description="Fallback CLI for explicit image generation or editing via GPT Image models"
)
subparsers = parser.add_subparsers(dest="command", required=True)
gen_parser = subparsers.add_parser("generate", help="Create a new image")
_add_shared_args(gen_parser)
gen_parser.set_defaults(func=_generate)
batch_parser = subparsers.add_parser(
"generate-batch",
help="Generate multiple prompts concurrently (JSONL input)",
)
_add_shared_args(batch_parser)
batch_parser.add_argument("--input", required=True, help="Path to JSONL file (one job per line)")
batch_parser.add_argument("--concurrency", type=int, default=DEFAULT_CONCURRENCY)
batch_parser.add_argument("--max-attempts", type=int, default=3)
batch_parser.add_argument("--fail-fast", action="store_true")
batch_parser.set_defaults(func=_generate_batch)
edit_parser = subparsers.add_parser("edit", help="Edit an existing image")
_add_shared_args(edit_parser)
edit_parser.add_argument("--image", action="append", required=True)
edit_parser.add_argument("--mask")
edit_parser.add_argument("--input-fidelity")
edit_parser.set_defaults(func=_edit)
args = parser.parse_args()
if args.n < 1 or args.n > 10:
_die("--n must be between 1 and 10")
if getattr(args, "concurrency", 1) < 1 or getattr(args, "concurrency", 1) > 25:
_die("--concurrency must be between 1 and 25")
if getattr(args, "max_attempts", 3) < 1 or getattr(args, "max_attempts", 3) > 10:
_die("--max-attempts must be between 1 and 10")
if args.output_compression is not None and not (0 <= args.output_compression <= 100):
_die("--output-compression must be between 0 and 100")
if args.command == "generate-batch" and not args.out_dir:
_die("generate-batch requires --out-dir")
if getattr(args, "downscale_max_dim", None) is not None and args.downscale_max_dim < 1:
_die("--downscale-max-dim must be >= 1")
_validate_model(args.model)
_validate_size(args.size, args.model)
_validate_quality(args.quality)
_validate_background(args.background)
_validate_model_specific_options(
model=args.model,
background=args.background,
input_fidelity=getattr(args, "input_fidelity", None),
)
_ensure_api_key(args.dry_run)
args.func(args)
return 0
if __name__ == "__main__":
raise SystemExit(main())
@@ -0,0 +1,440 @@
#!/usr/bin/env python3
"""Remove a solid chroma-key background from an image.
This helper supports the imagegen skill's built-in-first transparent workflow:
generate an image on a flat key color, then convert that key color to alpha.
"""
from __future__ import annotations
import argparse
from io import BytesIO
from pathlib import Path
import re
from statistics import median
import sys
from typing import Tuple
Color = Tuple[int, int, int]
KEY_DOMINANCE_THRESHOLD = 16.0
ALPHA_NOISE_FLOOR = 8
def _die(message: str, code: int = 1) -> None:
print(f"Error: {message}", file=sys.stderr)
raise SystemExit(code)
def _dependency_hint(package: str) -> str:
return (
"Activate the repo-selected environment first, then install it with "
f"`uv pip install {package}`. If this repo uses a local virtualenv, start with "
"`source .venv/bin/activate`; otherwise use this repo's configured shared fallback "
"environment."
)
def _load_pillow():
try:
from PIL import Image, ImageFilter
except ImportError:
_die(f"Pillow is required for chroma-key removal. {_dependency_hint('pillow')}")
return Image, ImageFilter
def _parse_key_color(raw: str) -> Color:
value = raw.strip()
match = re.fullmatch(r"#?([0-9a-fA-F]{6})", value)
if not match:
_die("key color must be a hex RGB value like #00ff00.")
hex_value = match.group(1)
return (
int(hex_value[0:2], 16),
int(hex_value[2:4], 16),
int(hex_value[4:6], 16),
)
def _validate_args(args: argparse.Namespace) -> None:
if args.tolerance < 0 or args.tolerance > 255:
_die("--tolerance must be between 0 and 255.")
if args.transparent_threshold < 0 or args.transparent_threshold > 255:
_die("--transparent-threshold must be between 0 and 255.")
if args.opaque_threshold < 0 or args.opaque_threshold > 255:
_die("--opaque-threshold must be between 0 and 255.")
if args.soft_matte and args.transparent_threshold >= args.opaque_threshold:
_die("--transparent-threshold must be lower than --opaque-threshold.")
if args.edge_feather < 0 or args.edge_feather > 64:
_die("--edge-feather must be between 0 and 64.")
if args.edge_contract < 0 or args.edge_contract > 16:
_die("--edge-contract must be between 0 and 16.")
src = Path(args.input)
if not src.exists():
_die(f"Input image not found: {src}")
out = Path(args.out)
if out.exists() and not args.force:
_die(f"Output already exists: {out} (use --force to overwrite)")
if out.suffix.lower() not in {".png", ".webp"}:
_die("--out must end in .png or .webp so the alpha channel is preserved.")
def _channel_distance(a: Color, b: Color) -> int:
return max(abs(a[0] - b[0]), abs(a[1] - b[1]), abs(a[2] - b[2]))
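# Illustrative sketch, not part of the committed helper: the hard-key rule that
# applies when --soft-matte is off. A pixel becomes fully transparent when its
# largest per-channel difference from the key color is within the tolerance,
# otherwise it stays fully opaque. The names below are hypothetical and the
# sample pixel values are made up.

```python
def sketch_channel_distance(a: tuple, b: tuple) -> int:
    # Largest absolute difference across the R, G and B channels.
    return max(abs(x - y) for x, y in zip(a, b))


def sketch_hard_key_alpha(rgb: tuple, key: tuple, tolerance: int = 12) -> int:
    return 0 if sketch_channel_distance(rgb, key) <= tolerance else 255


green_key = (0, 255, 0)
near_key = sketch_hard_key_alpha((0, 250, 5), green_key)      # distance 5
foreground = sketch_hard_key_alpha((10, 200, 40), green_key)  # distance 55
```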
def _clamp_channel(value: float) -> int:
return max(0, min(255, int(round(value))))
def _smoothstep(value: float) -> float:
value = max(0.0, min(1.0, value))
return value * value * (3.0 - 2.0 * value)
def _soft_alpha(distance: int, transparent_threshold: float, opaque_threshold: float) -> int:
if distance <= transparent_threshold:
return 0
if distance >= opaque_threshold:
return 255
ratio = (float(distance) - transparent_threshold) / (
opaque_threshold - transparent_threshold
)
return _clamp_channel(255.0 * _smoothstep(ratio))
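# Illustrative sketch, not part of the committed helper: the soft-matte alpha
# ramp on its own. Distances at or below the transparent threshold map to 0,
# distances at or above the opaque threshold map to 255, and values in between
# follow a smoothstep curve. The defaults 12.0 and 96.0 are taken from the CLI
# defaults in this file; the function names are hypothetical.

```python
def sketch_smoothstep(t: float) -> float:
    t = max(0.0, min(1.0, t))
    return t * t * (3.0 - 2.0 * t)


def sketch_soft_alpha(distance: float, lo: float = 12.0, hi: float = 96.0) -> int:
    if distance <= lo:
        return 0
    if distance >= hi:
        return 255
    ratio = (distance - lo) / (hi - lo)
    return max(0, min(255, int(round(255.0 * sketch_smoothstep(ratio)))))


ramp = [sketch_soft_alpha(d) for d in (0, 12, 54, 96, 200)]
```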
def _dominance_alpha(rgb: Color, key: Color) -> int:
spill_channels = _spill_channels(key)
if not spill_channels:
return 255
channels = [float(value) for value in rgb]
non_spill = [idx for idx in range(3) if idx not in spill_channels]
key_strength = (
min(channels[idx] for idx in spill_channels)
if len(spill_channels) > 1
else channels[spill_channels[0]]
)
non_key_strength = max((channels[idx] for idx in non_spill), default=0.0)
dominance = key_strength - non_key_strength
if dominance <= 0:
return 255
denominator = max(1.0, float(max(key)) - non_key_strength)
alpha = 1.0 - min(1.0, dominance / denominator)
return _clamp_channel(alpha * 255.0)
def _spill_channels(key: Color) -> list[int]:
key_max = max(key)
if key_max < 128:
return []
return [idx for idx, value in enumerate(key) if value >= key_max - 16 and value >= 128]
def _key_channel_dominance(rgb: Color, key: Color) -> float:
spill_channels = _spill_channels(key)
if not spill_channels:
return 0.0
channels = [float(value) for value in rgb]
non_spill = [idx for idx in range(3) if idx not in spill_channels]
key_strength = (
min(channels[idx] for idx in spill_channels)
if len(spill_channels) > 1
else channels[spill_channels[0]]
)
non_key_strength = max((channels[idx] for idx in non_spill), default=0.0)
return key_strength - non_key_strength
def _looks_key_colored(rgb: Color, key: Color, distance: int) -> bool:
if distance <= 32:
return True
spill_channels = _spill_channels(key)
if not spill_channels:
return True
return _key_channel_dominance(rgb, key) >= KEY_DOMINANCE_THRESHOLD
def _cleanup_spill(rgb: Color, key: Color, alpha: int = 255) -> Color:
if alpha >= 252:
return rgb
spill_channels = _spill_channels(key)
if not spill_channels:
return rgb
channels = [float(value) for value in rgb]
non_spill = [idx for idx in range(3) if idx not in spill_channels]
if non_spill:
anchor = max(channels[idx] for idx in non_spill)
cap = max(0.0, anchor - 1.0)
for idx in spill_channels:
if channels[idx] > cap:
channels[idx] = cap
return (
_clamp_channel(channels[0]),
_clamp_channel(channels[1]),
_clamp_channel(channels[2]),
)
def _apply_alpha_to_image(
image,
*,
key: Color,
tolerance: int,
spill_cleanup: bool,
soft_matte: bool,
transparent_threshold: float,
opaque_threshold: float,
) -> int:
pixels = image.load()
width, height = image.size
transparent = 0
for y in range(height):
for x in range(width):
red, green, blue, alpha = pixels[x, y]
rgb = (red, green, blue)
distance = _channel_distance(rgb, key)
key_like = _looks_key_colored(rgb, key, distance)
output_alpha = (
min(
_soft_alpha(distance, transparent_threshold, opaque_threshold),
_dominance_alpha(rgb, key),
)
if soft_matte and key_like
else (0 if distance <= tolerance else 255)
)
output_alpha = int(round(output_alpha * (alpha / 255.0)))
if 0 < output_alpha <= ALPHA_NOISE_FLOOR:
output_alpha = 0
if output_alpha == 0:
pixels[x, y] = (0, 0, 0, 0)
transparent += 1
continue
if spill_cleanup and key_like:
red, green, blue = _cleanup_spill(rgb, key, output_alpha)
pixels[x, y] = (red, green, blue, output_alpha)
return transparent
def _contract_alpha(image, pixels: int):
if pixels == 0:
return image
_, ImageFilter = _load_pillow()
alpha = image.getchannel("A")
for _ in range(pixels):
alpha = alpha.filter(ImageFilter.MinFilter(3))
image.putalpha(alpha)
return image
def _apply_edge_feather(image, radius: float):
if radius == 0:
return image
_, ImageFilter = _load_pillow()
alpha = image.getchannel("A")
alpha = alpha.filter(ImageFilter.GaussianBlur(radius=radius))
image.putalpha(alpha)
return image
def _encode_image(image, output_format: str) -> bytes:
out = BytesIO()
image.save(out, format=output_format.upper())
return out.getvalue()
def _alpha_counts(image) -> tuple[int, int, int]:
pixels = image.load()
width, height = image.size
total = 0
transparent = 0
partial = 0
for y in range(height):
for x in range(width):
alpha = pixels[x, y][3]
total += 1
if alpha == 0:
transparent += 1
elif alpha < 255:
partial += 1
return total, transparent, partial
def _sample_border_key(image, mode: str) -> Color:
width, height = image.size
pixels = image.load()
samples: list[Color] = []
if mode == "corners":
patch = max(1, min(width, height, 12))
boxes = [
(0, 0, patch, patch),
(width - patch, 0, width, patch),
(0, height - patch, patch, height),
(width - patch, height - patch, width, height),
]
for left, top, right, bottom in boxes:
for y in range(top, bottom):
for x in range(left, right):
red, green, blue = pixels[x, y][:3]
samples.append((red, green, blue))
else:
band = max(1, min(width, height, 6))
step = max(1, min(width, height) // 256)
for x in range(0, width, step):
for y in range(band):
red, green, blue = pixels[x, y][:3]
samples.append((red, green, blue))
red, green, blue = pixels[x, height - 1 - y][:3]
samples.append((red, green, blue))
for y in range(0, height, step):
for x in range(band):
red, green, blue = pixels[x, y][:3]
samples.append((red, green, blue))
red, green, blue = pixels[width - 1 - x, y][:3]
samples.append((red, green, blue))
if not samples:
_die("Could not sample background key color from image border.")
return (
int(round(median(sample[0] for sample in samples))),
int(round(median(sample[1] for sample in samples))),
int(round(median(sample[2] for sample in samples))),
)
def _remove_chroma_key(args: argparse.Namespace) -> None:
Image, _ = _load_pillow()
src = Path(args.input)
out = Path(args.out)
with Image.open(src) as image:
rgba = image.convert("RGBA")
key = (
_sample_border_key(rgba, args.auto_key)
if args.auto_key != "none"
else _parse_key_color(args.key_color)
)
transparent = _apply_alpha_to_image(
rgba,
key=key,
tolerance=args.tolerance,
spill_cleanup=args.spill_cleanup,
soft_matte=args.soft_matte,
transparent_threshold=args.transparent_threshold,
opaque_threshold=args.opaque_threshold,
)
rgba = _contract_alpha(rgba, args.edge_contract)
rgba = _apply_edge_feather(rgba, args.edge_feather)
total, transparent_after, partial_after = _alpha_counts(rgba)
out.parent.mkdir(parents=True, exist_ok=True)
output_format = "PNG" if out.suffix.lower() == ".png" else "WEBP"
out.write_bytes(_encode_image(rgba, output_format))
print(f"Wrote {out}")
print(f"Key color: #{key[0]:02x}{key[1]:02x}{key[2]:02x}")
print(f"Transparent pixels: {transparent_after}/{total}")
print(f"Partially transparent pixels: {partial_after}/{total}")
if transparent == 0:
print("Warning: no pixels matched the key color before feathering.", file=sys.stderr)
def _build_parser() -> argparse.ArgumentParser:
parser = argparse.ArgumentParser(
description="Remove a solid chroma-key background and write an image with alpha."
)
parser.add_argument("--input", required=True, help="Input image path.")
parser.add_argument("--out", required=True, help="Output .png or .webp path.")
parser.add_argument(
"--key-color",
default="#00ff00",
help="Hex RGB key color to remove, for example #00ff00.",
)
parser.add_argument(
"--tolerance",
type=int,
default=12,
help="Hard-key per-channel tolerance for matching the key color, 0-255.",
)
parser.add_argument(
"--auto-key",
choices=["none", "corners", "border"],
default="none",
help="Sample the key color from image corners or border instead of --key-color.",
)
parser.add_argument(
"--soft-matte",
action="store_true",
help="Use a smooth alpha ramp between transparent and opaque thresholds.",
)
parser.add_argument(
"--transparent-threshold",
type=float,
default=12.0,
help="Soft-matte distance at or below which pixels become fully transparent.",
)
parser.add_argument(
"--opaque-threshold",
type=float,
default=96.0,
help="Soft-matte distance at or above which pixels become fully opaque.",
)
parser.add_argument(
"--edge-feather",
type=float,
default=0.0,
help="Optional alpha blur radius for softened edges, 0-64.",
)
parser.add_argument(
"--edge-contract",
type=int,
default=0,
help="Shrink the visible alpha matte by this many pixels before feathering, 0-16.",
)
parser.add_argument(
"--spill-cleanup",
dest="spill_cleanup",
action="store_true",
help="Reduce key-color spill on key-tinted pixels that are not fully opaque.",
)
parser.add_argument(
"--despill",
dest="spill_cleanup",
action="store_true",
help="Alias for --spill-cleanup; decontaminate key-color edge spill.",
)
parser.add_argument("--force", action="store_true", help="Overwrite an existing output file.")
return parser
def main() -> None:
parser = _build_parser()
args = parser.parse_args()
_validate_args(args)
_remove_chroma_key(args)
if __name__ == "__main__":
main()
@@ -0,0 +1,201 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf of
any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
@@ -0,0 +1,167 @@
---
name: "openai-docs"
description: "Use when the user asks how to build with OpenAI products or APIs, asks about Codex itself or choosing Codex surfaces, needs up-to-date official documentation with citations, help choosing the latest model for a use case, or model upgrade and prompt-upgrade guidance; use OpenAI docs MCP tools for non-Codex docs questions, use the Codex manual helper first for broad Codex self-knowledge, and restrict fallback browsing to official OpenAI domains."
---
# OpenAI Docs
Provide authoritative, current guidance from OpenAI developer docs using the developers.openai.com MCP server. "Docs MCP" means `mcp__openaiDeveloperDocs__search_openai_docs` and `mcp__openaiDeveloperDocs__fetch_openai_doc`; for API reference, schema, parameter, or required-field questions, also use `mcp__openaiDeveloperDocs__get_openapi_spec` when available. Official-domain web search is fallback after those tools are unavailable or unhelpful. Broad Codex questions use the manual helper before Docs MCP. This skill also owns model selection, API model migration, and prompt-upgrade guidance.
## API Key Setup
For requests to build, run, configure, debug, or implement an API-backed app, script, CLI, generator, or tool, use `openai-platform-api-key` first when available. After that credential gate is resolved, return here for current docs as needed.
Use this skill directly for docs-only questions, citations, model/API guidance, conceptual explanations, and examples that do not require building or running an API-backed artifact.
## Workflow Configuration
### Source Priority
- For Codex self-knowledge, use the Codex source route below; it owns when to use the manual helper, Docs MCP, or bounded uncertainty.
- For non-Codex OpenAI docs questions, use `mcp__openaiDeveloperDocs__search_openai_docs` to find the most relevant doc pages.
- For non-Codex OpenAI docs questions, fetch the relevant page with `mcp__openaiDeveloperDocs__fetch_openai_doc` before answering. If search is noisy, run a narrower Docs MCP search; when any plausible official OpenAI docs URL is known or found, try fetching that URL through Docs MCP before relying on web-search content.
- For API reference, schema, parameter, or required-field questions, use `mcp__openaiDeveloperDocs__get_openapi_spec` when available to verify the API shape alongside the relevant guide or reference page.
- Use `mcp__openaiDeveloperDocs__list_openai_docs` only when you need to browse or discover non-Codex pages without a clear query.
- For model-selection, "latest model", or default-model questions, fetch `https://developers.openai.com/api/docs/guides/latest-model.md` first. If that is unavailable, load `references/latest-model.md`.
- For model upgrades or prompt upgrades, run `node scripts/resolve-latest-model-info.js` only when the target is latest/current/default or otherwise unspecified; otherwise preserve the explicitly requested target.
- Preserve explicit target requests: if the user names a target model like "migrate to GPT-5.4", keep that requested target even if `latest-model.md` names a newer model. Mention newer guidance only as optional.
- If current remote guidance is needed, fetch both the returned migration and prompting guide URLs directly. If direct fetch fails, use MCP/search fallback; if that also fails, use bundled fallback references and disclose the fallback.
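
The target-preservation rule above can be sketched as a small function. This is a minimal illustration, not the logic of `scripts/resolve-latest-model-info.js`: the function name, the `DYNAMIC_TARGETS` set, and the way the latest model ID is passed in are all assumptions made for the example.

```python
from typing import Optional

# Hypothetical marker words: the rule above only says "latest/current/default
# or otherwise unspecified", so this exact set is an assumption.
DYNAMIC_TARGETS = {"latest", "current", "default"}


def resolve_target_model(requested: Optional[str], latest_model_id: str) -> str:
    """Return the model ID an upgrade should target.

    An explicitly named model is preserved as-is; only an unspecified or
    dynamic request falls through to the latest model ID.
    """
    if requested is None or not requested.strip():
        return latest_model_id
    cleaned = requested.strip()
    if cleaned.lower() in DYNAMIC_TARGETS:
        return latest_model_id
    return cleaned
```

With this sketch, a request for `gpt-5.4` stays `gpt-5.4` even when the latest-model page names a newer model, while `latest` or a missing target resolves to whatever the latest-model lookup returned.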
## OpenAI product snapshots
1. Apps SDK: Build ChatGPT apps by providing a web component UI and an MCP server that exposes your app's tools to ChatGPT.
2. Responses API: A unified endpoint designed for stateful, multimodal, tool-using interactions in agentic workflows.
3. Chat Completions API: Generate a model response from a list of messages comprising a conversation.
4. Codex: OpenAI's coding agent for software development that can write, understand, review, and debug code.
5. gpt-oss: Open-weight OpenAI reasoning models (gpt-oss-120b and gpt-oss-20b) released under the Apache 2.0 license.
6. Realtime API: Build low-latency, multimodal experiences including natural speech-to-speech conversations.
7. Agents SDK: A toolkit for building agentic apps where a model can use tools and context, hand off to other agents, stream partial results, and keep a full trace.
## Codex self-knowledge
Use this path for questions about Codex itself: configuring, extending, operating, troubleshooting, local state, product surfaces, or where Codex behavior should live. A codebase merely mentioning a plugin, skill, hook, MCP server, browser, or automation is not enough. For generic software tasks, answer the software task directly; if asked whether Codex self-knowledge applies, answer that meta question briefly and continue the requested artifact.
### Source Route
The Codex manual is the first source for broad Codex synthesis. Treat the manual and Docs MCP as different lanes, not interchangeable official-doc sources. For published-user Codex product answers, the source route is complete: the manual, Docs MCP when this route calls for it, official OpenAI web fallback, and callable capabilities surfaced in the current session when the question is about that capability. Knowledge bases outside developers.openai.com are outside this route for public product answers.
For broad Codex behavior, setup, customization, skills, plugins, MCP, hooks, `AGENTS.md`, automations, surfaces, local state, or system-map questions:
1. Reuse a same-thread manual and outline path when it is still fresh.
2. Otherwise run the skill-local helper first in normal writable sessions. Skip it without trying only when the session is explicitly read-only, shell execution is unavailable, or visible policy shows no allowed temp cache.
3. By default, the helper chooses the first usable temp cache dir in this order: `$TMPDIR/openai-docs-cache`, `%TEMP%\openai-docs-cache`, `%TMP%\openai-docs-cache`, `/private/tmp/openai-docs-cache`, then `/tmp/openai-docs-cache`. Workspace-only write access is not enough for this temp cache.
4. Run the helper directly unless you need to override the cache dir. The helper falls back to `curl` when native `fetch` is unavailable or when proxy env vars are present, so no shell-specific proxy prefix is required. Resolve `<skill-dir>` to this skill's actual directory; in copied local eval workdirs this is usually `.codex/skills/openai-docs`:
```bash
node <skill-dir>/scripts/fetch-codex-manual.mjs
```
If you need to override the cache dir, pass `--cache-dir <cache-dir>`. On Windows, the helper checks `%TEMP%` and `%TMP%` automatically; in PowerShell, `$env:TEMP\openai-docs-cache` is a typical explicit override.
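
The default cache-dir order described in step 3 can be sketched as follows. This is an illustration only, not the helper's actual implementation: the function names are hypothetical, and the meaning of "usable" (the directory can be created and is writable) is an assumption.

```python
import os
from typing import Callable, List, Mapping, Optional

CACHE_NAME = "openai-docs-cache"


def candidate_cache_dirs(env: Mapping[str, str]) -> List[str]:
    """List candidate cache dirs in the documented order."""
    candidates = []
    # Environment-based locations first: $TMPDIR, then %TEMP%, then %TMP%.
    for var in ("TMPDIR", "TEMP", "TMP"):
        base = env.get(var)
        if base:
            candidates.append(os.path.join(base, CACHE_NAME))
    # Fixed POSIX fallbacks last.
    candidates.append("/private/tmp/" + CACHE_NAME)
    candidates.append("/tmp/" + CACHE_NAME)
    return candidates


def is_usable(path: str) -> bool:
    """Assumption: 'usable' means the dir can be created and written to."""
    try:
        os.makedirs(path, exist_ok=True)
    except OSError:
        return False
    return os.access(path, os.W_OK)


def choose_cache_dir(
    env: Mapping[str, str],
    usable: Callable[[str], bool] = is_usable,
) -> Optional[str]:
    """Return the first usable candidate, or None if none qualifies."""
    for path in candidate_cache_dirs(env):
        if usable(path):
            return path
    return None
```

Passing `usable` as a parameter keeps the ordering logic testable without touching the filesystem; `choose_cache_dir(os.environ)` would apply the real check.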
Treat helper availability as established by explicit read-only/no-shell policy or an actual command result. A guessed sandbox or guessed helper failure is not enough to switch to Docs MCP or web lookup; after an actual helper command failure, continue to the narrowest official next source below.
The helper verifies freshness, writes `codex-manual.md`, and emits `codex-manual.outline.md`. The outline maps source pages and headings to line ranges; use it to choose the relevant manual section, then read or search targeted manual sections for Codex product facts. Use the skill directory to locate and run the helper; after the helper succeeds, use the returned manual and outline paths as the search scope for Codex product facts and term coverage checks.
Reuse the same-thread manual and outline paths for follow-up Codex questions. Refresh first when the manual was fetched more than about a day ago, the path is unusable, the path came from another thread or uncertain provenance, or likely-current information is missing and staleness is plausible.
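
The mechanical part of the refresh rule can be sketched like this. It is a simplified illustration under stated assumptions: the exact one-day threshold stands in for "about a day", the function and parameter names are hypothetical, and the judgment call about missing likely-current information is deliberately left out because it cannot be reduced to a timestamp check.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# "About a day" in the text; the exact cutoff here is an assumption.
MAX_AGE = timedelta(days=1)


def needs_refresh(
    fetched_at: Optional[datetime],
    path_usable: bool,
    same_thread: bool,
    now: Optional[datetime] = None,
) -> bool:
    """Return True when the cached manual should be refetched before reuse."""
    now = now or datetime.now(timezone.utc)
    # Unknown fetch time, unusable path, or a path from another thread
    # all count as reasons to refresh first.
    if fetched_at is None or not path_usable or not same_thread:
        return True
    return now - fetched_at > MAX_AGE
```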
For questions about whether the manual is current enough to rely on now, run the helper when temp caching is allowed and base the answer on its returned status, manual path, and outline path.
If the manual resolves a Codex claim, answer from it and stop expanding sources for that claim; continue the user's broader task if the docs lookup was only one dependency. Manual source pages and known anchors are enough citation support for manual-covered material.
If the helper is skipped because the session is read-only, has no shell execution, or has no allowed temp cache, the next source is Docs MCP: call `mcp__openaiDeveloperDocs__search_openai_docs`, then `mcp__openaiDeveloperDocs__fetch_openai_doc` for a relevant hit before any web fallback.
If a user names a Codex term or mode that a fresh manual does not use, search the manual for obvious adjacent concepts, then answer that the exact term is not documented and use the closest documented terminology. If the prompt asks how that term maps to Codex behavior, resolve the mapping from adjacent manual sections. If the exact term remains material or likely current after that manual pass, use one narrow Docs MCP search/fetch before bounded uncertainty; otherwise, the source lookup for that terminology or mapping claim is complete.
Use the narrowest official next source only when the manual is unavailable, the helper fails, temp caching is not allowed, another material claim is missing or likely stale, or the user explicitly needs a page-specific citation. Prefer one specific Docs MCP search and, if it returns a clearly relevant page, one fetch; for unresolved Codex capability names, acronyms, scheduling terms, or exact error text, this Docs MCP step is the next source before web search. After the manual plus any permitted Docs MCP gap-fill, resolve remaining gaps as bounded uncertainty. Use official-domain web fallback only after that Docs MCP path is unavailable or unhelpful. If the claim is still not established, stop with bounded uncertainty. If official docs/manual conflict with a callable capability already surfaced in the current session, state the conflict and prefer verified current-session behavior for that environment.
For undocumented or private-looking model slugs, product mode labels, entitlement labels, account access paths, or rollout names, answer from current public docs and bounded uncertainty. Those labels are not a reason to leave the public source route.
For support-style diagnostics, prefer a layer-by-layer answer from the manual over provider-specific web lookups: installed/enabled plugin, bundled app or connector authorization, MCP setup, workspace/admin policy, restart or new-thread expectations, then support or feedback if still unresolved.
If the source route still does not establish a claim, return bounded uncertainty or route to support, an admin, or product feedback instead of widening the investigation.
For unresolved product terminology, answer from the manual plus the allowed official next source. If those sources do not establish the term, answer with bounded uncertainty from those sources.
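
The overall shape of the source route (manual first, then Docs MCP, then official web fallback, then bounded uncertainty) can be sketched as an ordered lookup. This is a deliberately simplified illustration: the names are hypothetical, each lookup is a stand-in callable rather than a real tool call, and the conditions that gate each step (helper availability, read-only sessions, page-specific citation needs) are omitted.

```python
from typing import Callable, Optional, Sequence, Tuple

# A source is a (name, lookup) pair; lookup returns None when the
# source does not establish the claim.
Source = Tuple[str, Callable[[str], Optional[str]]]

BOUNDED_UNCERTAINTY = "bounded uncertainty"


def resolve_claim(claim: str, sources: Sequence[Source]) -> Tuple[str, Optional[str]]:
    """Try each source in order and stop at the first one that resolves the claim."""
    for name, lookup in sources:
        answer = lookup(claim)
        if answer is not None:
            return name, answer
    # No source established the claim: stop instead of widening the search.
    return BOUNDED_UNCERTAINTY, None
```

A caller would pass the sources in route order, for example `[("manual", ...), ("docs_mcp", ...), ("official_web", ...)]`, so that a claim resolved by the manual never triggers the later lookups.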
### Surface Map
When Codex nouns or durable-instruction surfaces overlap, recommend the smallest surface that matches the scope:
- Prompt or thread context -> one-off task constraints.
- `AGENTS.md` -> durable repo conventions, commands, verification steps, and review expectations; closer nested files apply under their subtree.
- Project `.codex/config.toml` -> trusted-repo Codex settings such as sandbox, MCP, hooks, model, or reasoning defaults.
- Global config or global guidance -> personal defaults across repos.
- Skill -> reusable task workflow with references or scripts.
- Plugin -> installable bundle with skills plus commands, tools, MCP config, hooks, assets, apps, or marketplace metadata.
- MCP server or app connector -> live external data/actions or authorized private app/workspace data. Use connectors for private Google Docs, Calendar, Slack, GitHub, Notion, and similar data instead of web search or model memory.
- Automation -> scheduled checks, reminders, monitors, or follow-up work; use a thread heartbeat when continuity in an existing thread matters.
- Hook -> lifecycle enforcement around tool calls, commands, or file edits.
Split mixed-scope requests instead of forcing one answer. Example: "always do X, but only for this PR" defaults to prompt/thread context for the current run; use `AGENTS.md` or project config only if it should persist, hooks only for mechanical enforcement, and automations only for scheduled or follow-up work.
Use this quick product map when needed: CLI is terminal-first local repo work; IDE extension is editor-attached coding; Codex app is desktop planning, review, and interactive work; cloud/web is hosted parallel/offloaded work; Browser Use/in-app browser is Codex-controlled web testing; Chrome extension uses the user's Chrome profile; Computer Use controls desktop apps and OS UI. Keep `config.toml` defaults, `requirements.toml` constraints, and managed/admin policy separate.
### Boundaries And Output
- API key auth does not imply ChatGPT, cloud task, or connector access. For plugin/app/auth failures, check bundle availability, plugin installed/enabled state, connector/app authorization, MCP setup, restart/refresh expectations, workspace policy, and per-surface availability before answering.
- Sandbox or network denials need scoped escalation with a clear justification. Destructive commands, writes outside the workspace, or broad access changes require explicit approval.
- Memory can provide user preference or context, but explicit prompt instructions win and memory is not a source for current external facts.
- For affirmative surface-selection answers, use this shape: recommendation, why, what to avoid, and the manual/source evidence used.
- When page-specific Codex citations are actually needed, these anchors often fit: `concepts/customization#agents-guidance` for `AGENTS.md`, `concepts/customization#skills` for skills, `plugins/build#plugin-structure` for plugins, `concepts/customization#mcp` for MCP, `config-advanced#hooks` for hooks, `app/automations#thread-automations` for thread automations, and `config-reference#configtoml` for config.
## If MCP server is missing
If MCP tools fail or no OpenAI docs resources are available:
1. Run the install command yourself: `codex mcp add openaiDeveloperDocs --url https://developers.openai.com/mcp`
2. If it fails due to permissions/sandboxing, immediately retry the same command with escalated permissions and include a 1-sentence justification for approval.
3. Ask the user to run the install command only if the escalated attempt fails.
4. Ask the user to restart Codex.
5. Re-run the doc search/fetch after restart.
## Workflow
1. Clarify whether the request is general docs lookup, model selection, a model-string upgrade, prompt-upgrade guidance, or broader API/provider migration.
2. For Codex self-knowledge requests, follow the Codex self-knowledge source procedure above.
3. For model-selection or upgrade requests, prefer current remote docs over bundled references when the user asks for latest/current/default guidance.
- Fetch `https://developers.openai.com/api/docs/guides/latest-model.md`.
- Find the latest model ID and explicit migration or prompt-guidance links.
- Prefer explicit links from the latest-model page over derived URLs.
- For explicit named-model requests, preserve the requested model target. Mention newer remote guidance only as optional.
- For dynamic latest/current/default upgrades, run `node scripts/resolve-latest-model-info.js`, then fetch both returned guide URLs directly when possible.
- If direct guide fetch fails, use the developer-docs MCP tools or official OpenAI-domain search to find the same guide content.
- If remote docs are unavailable, use bundled fallback references and say that fallback guidance was used.
4. For model upgrades, keep changes narrow: update active OpenAI API model defaults and directly related prompts only when safe.
5. Leave historical docs, examples, eval baselines, fixtures, provider comparisons, provider registries, pricing tables, alias defaults, low-cost fallback paths, and ambiguous older model usage unchanged unless the user explicitly asks to upgrade them.
6. Keep SDK, tooling, IDE, plugin, shell, auth, and provider-environment migrations out of a model-and-prompt upgrade unless the user explicitly asks for them.
7. If an upgrade needs API-surface changes, schema rewiring, tool-handler changes, or implementation work beyond a literal model-string replacement and prompt edits, report it as blocked or confirmation-needed.
8. For general docs lookup, search docs with a precise query, fetch the best page and exact section needed, and answer with concise citations.
## Reference map
Read only what you need:
- `https://developers.openai.com/api/docs/guides/latest-model.md` -> current model-selection and "best/latest/current model" questions.
- `scripts/fetch-codex-manual.mjs` -> current Codex manual fetch, verification, local temp cache, and outline generation.
- `https://developers.openai.com/codex/codex-manual.md` -> current Codex self-knowledge synthesis, including setup, customization, skills, plugins, MCP, hooks, `AGENTS.md`, automations, and surface behavior; normally access it through the helper path and targeted file reads when temp caching is available.
- `references/latest-model.md` -> bundled fallback for model-selection and "best/latest/current model" questions.
- `references/upgrade-guide.md` -> bundled fallback for model upgrade and upgrade-planning requests.
- `references/prompting-guide.md` -> bundled fallback for prompt rewrites and prompt-behavior upgrades.
## Quality rules
- Treat OpenAI docs as the source of truth; avoid speculation.
- For Codex self-knowledge, follow the source route above instead of relying on remembered behavior.
- Keep migration changes narrow and behavior-preserving.
- Prefer prompt-only upgrades when possible.
- Avoid inventing pricing, availability, parameters, API changes, or breaking changes.
- Keep quotes short and within policy limits; prefer paraphrase with citations.
- If multiple pages differ, call out the difference and cite both.
- If official docs and verified callable current-session behavior disagree, state the conflict before making broad claims or edits.
- If docs do not cover the user's need, say so and offer next steps.
## Tooling notes
- Use MCP doc tools before web search for OpenAI-related markdown docs. The Codex manual flow is the exception: follow the Codex self-knowledge source procedure for broad Codex synthesis.
- If the MCP server is installed but returns no meaningful results, then use web search as a fallback.
- When falling back to web search, restrict to official OpenAI domains (developers.openai.com, platform.openai.com) and cite sources.
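
The official-domain restriction for web fallback can be sketched as a URL filter. This is a minimal illustration, not part of the skill's tooling: the function name is hypothetical, and treating the two domains named above as an exact allowlist (no other subdomains) is an assumption.

```python
from urllib.parse import urlparse

# The two domains named in the tooling note above.
OFFICIAL_DOMAINS = ("developers.openai.com", "platform.openai.com")


def is_official_openai_url(url: str) -> bool:
    """Return True only for http(s) URLs whose host is exactly an allowed domain."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    host = (parsed.hostname or "").lower()
    # Exact host match, so look-alike hosts such as
    # developers.openai.com.example.org are rejected.
    return host in OFFICIAL_DOMAINS
```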
@@ -0,0 +1,14 @@
interface:
display_name: "OpenAI Docs"
short_description: "Reference OpenAI docs, Codex self-knowledge, and model migration guidance"
icon_small: "./assets/openai-small.svg"
icon_large: "./assets/openai.png"
default_prompt: "Use OpenAI Docs for official docs lookup, questions about Codex itself or Codex surfaces, model selection, model migration, and prompt-upgrade work."
dependencies:
tools:
- type: "mcp"
value: "openaiDeveloperDocs"
description: "OpenAI Developer Docs MCP server"
transport: "streamable_http"
url: "https://developers.openai.com/mcp"
@@ -0,0 +1,3 @@
<svg xmlns="http://www.w3.org/2000/svg" width="14" height="14" fill="currentColor" viewBox="0 0 14 14">
<path d="M10.931 3.34a.112.112 0 0 0-.069-.104l-.038-.007c-1.537.05-2.45.318-3.714 1.002v6.683c.48-.248.936-.44 1.414-.58.695-.203 1.417-.292 2.303-.305l.038-.008a.113.113 0 0 0 .066-.104V3.341ZM2.363 9.919c0 .064.051.11.105.111l.33.008c1.162.046 2.042.243 2.975.662-.403-.585-1.008-1.075-1.654-1.292a.991.991 0 0 1-.674-.941v-5.14a6.36 6.36 0 0 0-.59-.076l-.37-.02a.115.115 0 0 0-.122.111v6.577Zm9.455-.001a.998.998 0 0 1-.877.992l-.101.007c-.832.012-1.47.095-2.066.27-.599.174-1.176.448-1.883.863a.444.444 0 0 1-.449 0c-1.299-.763-2.229-1.07-3.689-1.125l-.299-.008a.997.997 0 0 1-.977-.998V3.342c0-.573.478-1.017 1.038-.999l.417.023c.188.015.35.037.513.062v-.754c0-.708.749-1.244 1.429-.903.984.492 1.836 1.449 2.15 2.505 1.216-.617 2.222-.884 3.771-.934l.105.003a.998.998 0 0 1 .918.996v6.576ZM4.332 8.466c0 .049.03.087.07.1l.24.091a4.319 4.319 0 0 1 1.581 1.176V3.721c-.164-.803-.799-1.617-1.584-2.07l-.162-.088c-.025-.012-.054-.013-.088.009a.12.12 0 0 0-.057.102v6.792Z"/>
</svg>


Binary file not shown.


@@ -0,0 +1,37 @@
# Latest model guide
This file is a curated helper. Every recommendation here must be verified against current OpenAI docs before it is repeated to a user.
## Current model map
| Model ID | Use for |
| --- | --- |
| `gpt-5.5` | Latest/default text and reasoning model for most new apps, including coding and tool-heavy workflows |
| `gpt-5.5-pro` | Maximum reasoning or quality when latency and cost matter less |
| `gpt-5.4` | Previous default text and reasoning model; use for existing GPT-5.4 integrations |
| `gpt-5.4-mini` | Lower-cost testing and lighter production workflows |
| `gpt-5.4-nano` | High-throughput simple tasks and classification |
| `gpt-5.5` | Explicit no-reasoning text path via `reasoning.effort: none` |
| `gpt-4.1-mini` | Cheaper no-reasoning text |
| `gpt-4.1-nano` | Fastest and cheapest no-reasoning text |
| `gpt-5.3-codex` | Agentic coding, code editing, and tool-heavy coding workflows |
| `gpt-5.1-codex-mini` | Cheaper coding workflows |
| `gpt-image-2` | Best image generation and edit quality |
| `gpt-image-1.5` | Less expensive image generation and editing |
| `gpt-image-1-mini` | Cost-optimized image generation |
| `gpt-4o-mini-tts` | Text-to-speech |
| `gpt-4o-mini-transcribe` | Speech-to-text, fast and cost-efficient |
| `gpt-realtime-1.5` | Realtime voice and multimodal sessions |
| `gpt-realtime-mini` | Cheaper realtime sessions |
| `gpt-audio` | Chat Completions audio input and output |
| `gpt-audio-mini` | Cheaper Chat Completions audio workflows |
| `sora-2` | Faster iteration and draft video generation |
| `sora-2-pro` | Higher-quality production video |
| `omni-moderation-latest` | Text and image moderation |
| `text-embedding-3-large` | Higher-quality retrieval embeddings; default in this skill because no best-specific row exists |
| `text-embedding-3-small` | Lower-cost embeddings |
## Maintenance notes
- This file will drift unless it is periodically re-verified against current OpenAI docs.
- If this file conflicts with current docs, the docs win.
@@ -0,0 +1,244 @@
GPT-5.5 works best when prompts define the outcome and leave room for the model to choose an efficient solution path. Compared with earlier models, you can often use shorter, more outcome-oriented prompts: describe what good looks like, what constraints matter, what evidence is available, and what the final answer should contain.
Avoid carrying over every instruction from an older prompt stack. Legacy prompts often over-specify the process because earlier models needed more help staying on track. With GPT-5.5, that can add noise, narrow the model's search space, or lead to overly mechanical answers.
For more detail on GPT-5.5 behavior changes, start with the [Using GPT-5.5 guide](/api/docs/guides/latest-model). This guide focuses on prompt changes that follow from those behavior changes.
The patterns here are starting points. Adapt them to your product surface, tools, evals, and user experience goals.
## Personality and behavior
GPT-5.5's default style is efficient, direct, and task-oriented. This is useful for production systems: responses stay focused, behavior is easier to steer, and the model avoids unnecessary conversational padding.
For customer-facing assistants, support workflows, coaching experiences, and other conversational products, define both personality and collaboration style.
- **Personality** controls how the assistant sounds: tone, warmth, directness, formality, humor, empathy, and level of polish.
- **Collaboration style** controls how the assistant works: when it asks questions, when it makes assumptions, how proactive it should be, how much context it gives, when it checks work, and how it handles uncertainty or risk.
Keep both short. Personality instructions should shape the user experience. Collaboration instructions should shape task behavior. Neither should replace clear goals, success criteria, tool rules, or stopping conditions.
Example personality block for a steady task-focused assistant:
```text
# Personality
You are a capable collaborator: approachable, steady, and direct. Assume the user is competent and acting in good faith, and respond with patience, respect, and practical helpfulness.
Prefer making progress over stopping for clarification when the request is already clear enough to attempt. Use context and reasonable assumptions to move forward. Ask for clarification only when the missing information would materially change the answer or create meaningful risk, and keep any question narrow.
Stay concise without becoming curt. Give enough context for the user to understand and trust the answer, then stop. Use examples, comparisons, or simple analogies when they make the point easier to grasp. When correcting the user or disagreeing, be candid but constructive. When an error is pointed out, acknowledge it plainly and focus on fixing it.
Match the user's tone within professional bounds. Avoid emojis and profanity by default, unless the user explicitly asks for that style or has clearly established it as appropriate for the conversation.
```
Example personality block for an expressive collaborative assistant:
```text
# Personality
Adopt a vivid conversational presence: intelligent, curious, playful when appropriate, and attentive to the user's thinking. Ask good questions when the problem is blurry, then become decisive once there is enough context.
Be warm, collaborative, and polished. Conversation should feel easy and alive, but not chatty for its own sake. Offer a real point of view rather than merely mirroring the user, while staying responsive to their goals and constraints.
Be thoughtful and grounded when the task calls for synthesis or advice. State a clear recommendation when you have enough context, explain important tradeoffs, and name uncertainty without becoming evasive.
```
For more expressive products, add warmth, curiosity, humor, or point of view explicitly, but keep the block short. Use personality to shape the experience, not to compensate for unclear goals or missing task instructions.
## Improve time to first visible token with a preamble
In streaming applications, users notice how long it takes before the first visible response appears. GPT-5.5 may spend time reasoning, planning, or preparing tool calls before emitting visible text.
For longer or tool-heavy tasks, prompt the model to start with a short preamble: a brief visible update that acknowledges the request and states the first step. This can improve perceived responsiveness without changing the underlying task.
Use this pattern when the task may take more than one step, require tool calls, or involve a long-running agent workflow.
```text
Before any tool calls for a multi-step task, send a short user-visible update that acknowledges the request and states the first step. Keep it to one or two sentences.
```
For coding agents that expose separate message phases, you can be more explicit:
```text
You must always start with an intermediary update before any content in the analysis channel if the task will require calling tools. The user update should acknowledge the request and explain your first step.
```
## Outcome-first prompts and stopping conditions
GPT-5.5 is strongest when the prompt defines the target outcome, success criteria, constraints, and available context, then lets the model choose the path.
For many tasks, describe the destination rather than every step. This gives the model room to choose the right search, tool, or reasoning strategy for the task.
Prefer this:
```text
Resolve the customer's issue end to end.
Success means:
- the eligibility decision is made from the available policy and account data
- any allowed action is completed before responding
- the final answer includes completed_actions, customer_message, and blockers
- if evidence is missing, ask for the smallest missing field
```
**Avoid unnecessary absolute rules.** Older prompts often use strict instructions like `ALWAYS`, `NEVER`, `must`, and `only` to control model behavior. Use those words for true invariants, such as safety rules, required output fields, or actions that should never happen. For judgment calls, such as when to search, ask for clarification, use a tool, or keep iterating, prefer decision rules instead.
Avoid this style of instruction unless every step is truly required:
```text
First inspect A, then inspect B, then compare every field, then think through
all possible exceptions, then decide which tool to call, then call the tool,
then explain the entire process to the user.
```
Add explicit stopping conditions:
```text
Resolve the user query in the fewest useful tool loops, but do not let loop minimization outrank correctness, accessible fallback evidence, calculations, or required citation tags for factual claims.
After each result, ask: "Can I answer the user's core request now with useful evidence and citations for the factual claims?" If yes, answer.
```
Define when evidence is sufficient:
```text
Use the minimum evidence sufficient to answer correctly, cite it precisely, then stop.
```
## Formatting
GPT-5.5 is highly steerable on output format and structure. Use that control when it improves comprehension or product fit.
Set `text.verbosity`, describe the expected output shape, and reserve heavier structure for cases where it improves comprehension or your product UI needs a stable artifact. The API default for `text.verbosity` is `medium`; use `low` when you prefer shorter, more concise responses.
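As a rough illustration of where `text.verbosity` sits in a request, the sketch below builds a request body as a plain object. The helper name `buildRequestBody` is hypothetical, and the `high` level is an assumption not stated in this guide; confirm the accepted values in the current API reference before relying on them.

```javascript
// Hypothetical helper: builds a request body with an explicit verbosity setting.
// "low" and "medium" come from this guide; "high" is an assumption.
const VERBOSITY_LEVELS = ["low", "medium", "high"];

function buildRequestBody(input, verbosity = "medium") {
  if (!VERBOSITY_LEVELS.includes(verbosity)) {
    throw new RangeError(`Unsupported verbosity: ${verbosity}`);
  }
  // `text.verbosity` is nested under `text`, alongside the model and input.
  return { model: "gpt-5.5", input, text: { verbosity } };
}

const body = buildRequestBody("Summarize the incident report.", "low");
console.log(JSON.stringify(body));
```

Keeping the setting in one helper makes it easier to compare `low` and `medium` on the same inputs before committing to either.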
Plain conversational formatting:
```text
Let formatting serve comprehension. Use plain paragraphs as the default format for normal conversation, explanations, reports, documentation, and technical writeups. Keep the presentation clean and readable without making the structure feel heavier than the content.
Use headers, bold text, bullets, and numbered lists sparingly. Reach for them when the user requests them, when the answer needs clear comparison or ranking, or when the information would be harder to scan as prose. Otherwise, favor short paragraphs and natural transitions.
Respect formatting preferences from the user. If they ask for a terse answer, minimal formatting, no bullets, no headers, or a specific structure, follow that preference unless there is a strong reason not to.
```
Add explicit audience and length guidance:
```text
Write for a senior business audience. Keep the answer under 400 words. Use short paragraphs and only include bullets when they improve scannability. Prioritize the conclusion first, then the reasoning, then caveats.
```
For editing, rewriting, summaries, or customer-facing messages, tell the model what to preserve before asking it to improve style. This pattern is useful when you want polish without expansion.
```text
Preserve the requested artifact, length, structure, and genre first. Quietly improve clarity, flow, and correctness. Do not add new claims, extra sections, or a more promotional tone unless explicitly requested.
```
## Grounding, citations, and retrieval budgets
For grounded answers, citation behavior should be part of the prompt. Define what needs support, what counts as enough evidence, and how the model should behave when evidence is missing. Absence of evidence shouldn't automatically become a factual "no." For more details and examples, see the [citation formatting guide](/api/docs/guides/citation-formatting).
### Add an explicit retrieval budget
Retrieval budgets are stopping rules for search. They tell the model when enough evidence is enough.
```text
For ordinary Q&A, start with one broad search using short, discriminative keywords. If the top results contain enough citable support for the core request, answer from those results instead of searching again.
Make another retrieval call only when:
- The top results do not answer the core question.
- A required fact, parameter, owner, date, ID, or source is missing.
- The user asked for exhaustive coverage, a comparison, or a comprehensive list.
- A specific document, URL, email, meeting, record, or code artifact must be read.
- The answer would otherwise contain an important unsupported factual claim.
Do not search again to improve phrasing, add examples, cite nonessential details, or support wording that can safely be made more generic.
```
## Creative drafting guardrails
For drafting tasks, tell the model which claims must come from sources and which parts may be creatively written. This is especially important for slides, launch copy, customer summaries, talk tracks, leadership blurbs, and narrative framing.
```text
For creative or generative requests such as slides, leadership blurbs, outbound copy, summaries for sharing, talk tracks, or narrative framing, distinguish source-backed facts from creative wording.
- Use retrieved or provided facts for concrete product, customer, metric, roadmap, date, capability, and competitive claims, and cite those claims.
- Do not invent specific names, first-party data claims, metrics, roadmap status, customer outcomes, or product capabilities to make the draft sound stronger.
- If there is little or no citable support, write a useful generic draft with placeholders or clearly labeled assumptions rather than unsupported specifics.
```
## Frontend engineering and visual taste
For frontend work, refer to the [example instructions](/api/docs/guides/frontend-prompt) for practical ways to steer UI quality. They cover product and user context, design-system alignment, first-screen usability, familiar controls, expected states, responsive behavior, and common generated-UI defaults to avoid, such as generic heroes, nested cards, decorative gradients, visible instructional text, and broken layouts.
## Prompt the model to check its work
Give GPT-5.5 access to tools that let it check outputs when validation is possible.
For coding agents, ask for concrete validation commands:
```text
After making changes, run the most relevant validation available:
- targeted unit tests for changed behavior
- type checks or lint checks when applicable
- build checks for affected packages
- a minimal smoke test when full validation is too expensive
If validation cannot be run, explain why and describe the next best check.
```
For visual artifacts, ask for inspection after rendering:
```text
Render the artifact before finalizing. Inspect the rendered output for layout, clipping, spacing, missing content, and visual consistency. Revise until the rendered output matches the requirements.
```
For engineering and planning tasks, make implementation plans traceable:
```text
For implementation plans, include:
- requirements and where each is addressed
- named resources, files, APIs, or systems involved
- state transitions or data flow where relevant
- validation commands or checks
- failure behavior
- privacy and security considerations
- open questions that materially affect implementation
```
## Phase parameter
Starting with GPT-5.4, long-running or tool-heavy Responses workflows can use assistant-item `phase` values to distinguish intermediate updates from final answers. GPT-5.5 uses the same pattern.
If you use `previous_response_id`, the API preserves prior assistant state automatically. If your application manually replays assistant output items into the next request, preserve each original `phase` value and pass it back unchanged. This matters most when a response includes preambles, repeated tool calls, or a final answer after intermediate assistant updates.
```text
If manually replaying assistant items:
- Preserve assistant `phase` values exactly.
- Use `phase: "commentary"` for intermediate user-visible updates.
- Use `phase: "final_answer"` for the completed answer.
- Do not add `phase` to user messages.
```
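The manual-replay rule can be sketched as a small pure function. This is a minimal sketch, assuming assistant output items are plain objects with `type`, `role`, `content`, and an optional `phase` field; the item shape and the helper name `replayItems` are illustrative assumptions, so check the real shape against the current Responses API reference.

```javascript
// Hypothetical helper: rebuild the input for the next request from prior
// assistant output items plus a new user message.
function replayItems(priorOutputItems, nextUserText) {
  // Shallow-copy each prior item so every field, including `phase` when
  // present, is passed back exactly as it was received.
  const replayed = priorOutputItems.map((item) => ({ ...item }));
  // The new user message deliberately carries no `phase` field.
  return [...replayed, { type: "message", role: "user", content: nextUserText }];
}

const prior = [
  { type: "message", role: "assistant", phase: "commentary", content: "Checking the logs first." },
  { type: "message", role: "assistant", phase: "final_answer", content: "The job failed on step 3." },
];
console.log(replayItems(prior, "Thanks. What should I retry?").map((item) => item.phase));
```

The function never rewrites or infers `phase`; it only copies what the API returned, which is the behavior the rule above asks for.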
## Suggested prompt structure
Use this structure as a starting point for complex prompts. Keep each section short. Add detail only where it changes behavior.
```text
Role: [1-2 sentences defining the model's function, context, and job]
# Personality
[tone, demeanor, and collaboration style]
# Goal
[user-visible outcome]
# Success criteria
[what must be true before the final answer]
# Constraints
[policy, safety, business, evidence, and side-effect limits]
# Output
[sections, length, and tone]
# Stop rules
[when to retry, fallback, abstain, ask, or stop]
```
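If prompts are assembled in code, the structure above can be generated from a small data object. This is a minimal sketch; the helper name `buildPrompt` and its argument shape are hypothetical. Empty sections are skipped, in line with adding detail only where it changes behavior.

```javascript
// Section names follow the suggested structure; the order is fixed here.
const SECTION_ORDER = [
  "Personality",
  "Goal",
  "Success criteria",
  "Constraints",
  "Output",
  "Stop rules",
];

// Hypothetical helper: assemble a prompt string from a role line and sections.
function buildPrompt({ role, sections = {} }) {
  const parts = [`Role: ${role.trim()}`];
  for (const name of SECTION_ORDER) {
    const body = (sections[name] ?? "").trim();
    // Skip sections that have no content instead of emitting empty headers.
    if (body) parts.push(`# ${name}\n${body}`);
  }
  return parts.join("\n\n");
}

console.log(
  buildPrompt({
    role: "You are a support assistant for billing questions.",
    sections: {
      Goal: "Resolve the customer's billing issue end to end.",
      "Stop rules": "Ask for the smallest missing field if evidence is missing.",
    },
  })
);
```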
@@ -0,0 +1,181 @@
# Upgrading to GPT-5.5
Use this guide when the user explicitly asks to upgrade an existing integration to GPT-5.5. Pair it with current OpenAI docs lookups. The default target string is `gpt-5.5`.
## Freshness check
Before applying this bundled guide for a latest/current/default model upgrade, run `node scripts/resolve-latest-model-info.js` from the OpenAI Docs skill directory.
- If the command returns `modelSlug: "gpt-5p5"`, continue with this bundled guide and use `references/prompting-guide.md` when prompt updates are needed.
- If the command returns a different `modelSlug`, fetch both the returned `migrationGuideUrl` and `promptingGuideUrl` and use them as the current source of truth instead of the bundled references.
- If the command fails, metadata is missing, or either remote guide cannot be fetched, continue with bundled fallback references and say the remote freshness check was unavailable.
- If the user explicitly named a target model, preserve that target and use current docs only to check compatibility or caveats.
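The first three branches above can be sketched as a decision function over the script's parsed output. This is a minimal sketch, assuming the output has been parsed into an object with `modelSlug`, `migrationGuideUrl`, and `promptingGuideUrl`; the helper name `chooseGuideSource` and its return shape are hypothetical. It does not cover the explicit-target case or a later failure to fetch the remote guides, which the caller would still need to handle by falling back to the bundled references.

```javascript
// Slug that this bundled guide was written for.
const BUNDLED_MODEL_SLUG = "gpt-5p5";

// Hypothetical helper: `info` is the parsed script output, or null if the
// command failed. The return shape is illustrative, not the script's contract.
function chooseGuideSource(info) {
  const fallback = { guides: "bundled", freshnessCheck: "unavailable" };
  // Command failure or missing metadata: continue with bundled references.
  if (!info || typeof info.modelSlug !== "string") return fallback;
  // Bundled guide is still current.
  if (info.modelSlug === BUNDLED_MODEL_SLUG) {
    return { guides: "bundled", freshnessCheck: "current" };
  }
  // A different slug requires both remote guide URLs to be present.
  if (!info.migrationGuideUrl || !info.promptingGuideUrl) return fallback;
  return {
    guides: "remote",
    freshnessCheck: "newer-model",
    migrationGuideUrl: info.migrationGuideUrl,
    promptingGuideUrl: info.promptingGuideUrl,
  };
}

console.log(chooseGuideSource({ modelSlug: "gpt-5p5" }));
```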
## Upgrade posture
Upgrade with the narrowest safe change set:
- replace the model string first
- update only the prompts that are directly tied to that model usage
- do not automatically upgrade older or ambiguous model usages that may be intentionally pinned, such as historical docs, examples, tests, eval baselines, comparison code, or low-cost fallback/routing paths. Unless the user explicitly asks to upgrade all model usage, leave those sites unchanged and list them as confirmation-needed
- prefer prompt-only upgrades when possible
- if the upgrade would require API-surface changes, parameter rewrites, tool rewiring, provider migration, or broader code edits, mark it as blocked instead of stretching the scope
## Upgrade workflow
1. Inventory current model usage.
- Search for model strings, client calls, and prompt-bearing files.
- Include inline prompts, prompt templates, YAML or JSON configs, Markdown docs, and saved prompts when they are clearly tied to a model usage site.
2. Pair each model usage with its prompt surface.
- Prefer the closest prompt surface first: inline system or developer text, then adjacent prompt files, then shared templates.
- If you cannot confidently tie a prompt to the model usage, say so instead of guessing.
3. Classify the source model family.
- Common buckets: GPT-5.4; GPT-5.3-Codex or GPT-5.2-Codex; earlier GPT-5.x; GPT-4o or GPT-4.1; reasoning models such as o1, o3, or o4-mini; third-party models; mixed or unclear.
4. Decide the upgrade class.
- `model string only`
- `model string + light prompt rewrite`
- `blocked without code changes`
5. Run the compatibility gate.
- Check whether the current integration can accept `gpt-5.5` without API-surface changes or implementation changes.
- Check whether structured outputs, tool schemas, function names, and downstream parsers can remain unchanged.
- For long-running Responses or tool-heavy agents, check whether `phase` is already preserved or round-tripped when the host replays assistant items or uses preambles.
- If compatibility depends on code changes, return `blocked`.
- If compatibility is unclear, return `unknown` rather than improvising.
6. Apply the upgrade when it is in scope.
- Default replacement string: `gpt-5.5`.
- Keep the intervention small and behavior-preserving.
- Start from the current reasoning effort when it is visible unless there is a measured reason to change it.
- For in-scope changes, update the model string and directly related prompts.
- For blocked or unknown changes, do not edit; report the blocker or uncertainty.
7. Summarize the result.
- `Current model usage`
- `Model-string updates`
- `Reasoning-effort handling`
- `Prompt updates`
- `Structured output and formatting assessment`
- `Tool-use assessment` when the flow uses tools, retrieval, or terminal actions
- `Phase assessment` when the flow is long-running, replayed, or tool-heavy
- `Compatibility check`
- `Validation performed`
Output rule:
- For each usage site, state the starting reasoning-effort recommendation.
- If the repo exposes the current reasoning setting, recommend preserving it first unless current OpenAI docs say otherwise.
- If the repo does not expose the current setting, recommend not adding one unless current OpenAI docs require it.
## Upgrade outcomes
### `model string only`
Choose this when:
- the source model is GPT-5.4
- the existing prompts are already short, explicit, and task-bounded
- the workflow does not rely on strict output formats, tool-call behavior, batch completeness, or long-horizon execution that should be validated after the upgrade
- there are no obvious compatibility blockers
Default action:
- replace the model string with `gpt-5.5`
- preserve the current reasoning effort
- keep prompts unchanged
- validate behavior with existing tests, realistic spot checks, or an existing eval suite when one is already available
### `model string + light prompt rewrite`
Choose this when:
- the task needs stronger completeness, citation discipline, verification, or dependency handling
- the upgraded model becomes too verbose, too dense, or hard to scan unless formatting is constrained
- the workflow has strict output shape requirements and lacks an explicit format contract, schema, or parser validation
- the workflow is research-heavy and needs stronger handling of sparse or empty retrieval results
- the workflow is coding-oriented, terminal-based, tool-heavy, or multi-agent, but the existing API surface and tool definitions can remain unchanged
Default action:
- replace the model string with `gpt-5.5`
- preserve the current reasoning effort for the first pass
- make only the smallest prompt edits needed for the observed workflow risk
- read the [GPT-5.5 prompting guide](/api/docs/guides/prompt-guidance?model=gpt-5.5) to choose the smallest prompt changes that recover or improve behavior
- avoid broad prompt cleanup unrelated to the upgrade
- for research workflows, add citation rules, retrieval budgets, missing-evidence behavior, and validation guidance from the prompting guide
- for dependency-aware or tool-heavy workflows, add prerequisite checks, missing-context handling, explicit tool budgets, stop conditions, and validation guidance
- for coding or terminal workflows, add repo-specific constraints, acceptance criteria, and concrete validation commands
- for multi-agent support or triage workflows, add task ownership, handoff, completeness, and stopping criteria
- for long-running Responses agents with preambles or multiple assistant messages, explicitly review whether `phase` is already handled; if adding or preserving `phase` would require code edits, mark the path as `blocked`
- do not classify a coding or tool-using Responses workflow as `blocked` just because the visible snippet is minimal; prefer `model string + light prompt rewrite` unless the repo clearly shows that a safe GPT-5.5 path would require host-side code changes
### `blocked`
Choose this when:
- the upgrade appears to require API-surface changes
- the upgrade appears to require parameter rewrites or reasoning-setting changes that are not exposed outside implementation code
- the upgrade would require changing tool definitions, tool handler wiring, or schema contracts
- the user is asking for a tooling, IDE, plugin, shell, or environment migration rather than a model and prompt migration
- the integration depends on provider-specific APIs that do not map to the current OpenAI API surface without implementation work
- you cannot confidently identify the prompt surface tied to the model usage
Default action:
- do not improvise a broader upgrade
- report the blocker and explain that the fix is out of scope for this guide
- if useful, describe the smallest follow-up implementation task that would unblock the migration
## Compatibility checklist
Before applying or recommending a model-and-prompt-only upgrade, check:
1. Can the current host accept the `gpt-5.5` model string without changing client code or API surface?
2. Are the related prompts identifiable and editable?
3. Does the host depend on behavior that likely needs API-surface changes, parameter rewrites, provider migration, or tool rewiring?
4. Would the likely fix be prompt-only, or would it need implementation changes?
5. Is the prompt surface close enough to the model usage that you can make a targeted change instead of a broad cleanup?
6. Do strict structured outputs, schemas, or downstream parsers still have an explicit contract?
7. For long-running Responses or tool-heavy agents, is `phase` already preserved if the host relies on preambles, replayed assistant items, or multiple assistant messages?
8. Are latency, token, or price assumptions validated by tests, realistic spot checks, or an existing eval suite rather than inferred from general model positioning?
If item 1 is no, item 3 or item 4 points to implementation work, or item 7 is no and the fix needs code changes, return `blocked`.
If item 2 is no, return `unknown` unless the user can point to the prompt location.
Important:
- Existing use of tools, agents, or multiple usage sites is not by itself a blocker.
- If the current host can keep the same API surface and the same tool definitions, prefer `model string + light prompt rewrite` over `blocked`.
- Reserve `blocked` for cases that truly require implementation changes, not cases that only need stronger prompt steering.
- Do not claim token savings without task-level validation.
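The two decision rules that follow the checklist can be sketched as a function over recorded answers. This is a minimal sketch; the field names on the `answers` object, the helper name `compatibilityGate`, and the `in_scope` label are hypothetical. Only `blocked` and `unknown` come from this guide.

```javascript
// Hypothetical helper: map checklist answers to a gate result.
function compatibilityGate(answers) {
  const blocked =
    !answers.hostAcceptsModelString ||        // item 1 is "no"
    answers.needsApiOrToolChanges ||          // item 3 points to implementation work
    answers.fixNeedsImplementationChanges ||  // item 4 points to implementation work
    (answers.phasePreserved === false && answers.phaseFixNeedsCode); // item 7
  if (blocked) return "blocked";
  // Item 2: prompts must be identifiable unless the user can point to them.
  if (!answers.promptsIdentifiable && !answers.userCanLocatePrompt) return "unknown";
  return "in_scope";
}

console.log(
  compatibilityGate({ hostAcceptsModelString: true, promptsIdentifiable: true })
);
```

Tool use, agents, or multiple usage sites are intentionally not inputs here, since none of them is a blocker on its own.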
## Scope boundaries
This guide may:
- update or recommend updated model strings
- update or recommend updated prompts
- inspect code and prompt files to understand where those changes belong
- inspect whether existing Responses flows already preserve `phase`
- flag compatibility blockers
- propose validation with existing tests, realistic spot checks, or existing eval suites
This guide may not:
- move Chat Completions code to Responses
- move Responses code to another API surface
- migrate SDKs, APIs, IDE configuration, shell hooks, plugins, or provider-specific tooling
- rewrite parameter shapes
- change tool definitions or tool-call handling
- change structured-output wiring
- add or retrofit `phase` handling in implementation code
- edit business logic, orchestration logic, SDK usage, IDE configuration, shell hooks, or plugin integration behavior except for model-string replacements and directly related prompt edits
If a safe GPT-5.5 upgrade requires any of those changes, mark the path as blocked and out of scope.
## Validation plan
- Validate each upgraded usage site with existing tests, realistic spot checks, or an existing eval suite when one is already available.
- Compare against the current GPT-5.4 baseline when available.
- Check task success, retry count, tool-call count, total tokens, latency, output shape, and user-visible quality.
- For specialized workflows, validate the contract that matters most instead of judging only general output quality.
- If prompt edits were added, confirm each block is doing real work instead of adding noise.
- If the workflow has downstream impact, add a lightweight verification pass before finalization.
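A baseline comparison can be as simple as computing per-metric deltas between two runs. This is a minimal sketch, assuming each run has already been summarized into numeric fields; the field names and the helper name `compareRuns` are hypothetical, and the sketch applies no pass or fail thresholds.

```javascript
// Hypothetical metric names mirroring the checks listed above.
const METRICS = [
  "taskSuccessRate",
  "retryCount",
  "toolCallCount",
  "totalTokens",
  "latencyMs",
];

// Hypothetical helper: report before, after, and delta for each shared metric.
function compareRuns(baseline, candidate) {
  const report = {};
  for (const metric of METRICS) {
    const before = baseline[metric];
    const after = candidate[metric];
    // Skip metrics that were not measured in both runs.
    if (typeof before !== "number" || typeof after !== "number") continue;
    report[metric] = { before, after, delta: after - before };
  }
  return report;
}

console.log(
  compareRuns(
    { toolCallCount: 6, totalTokens: 4200, latencyMs: 9000 },
    { toolCallCount: 4, totalTokens: 3900, latencyMs: 9500 }
  )
);
```

Output shape and user-visible quality still need task-specific checks; numeric deltas only cover the countable part of the plan.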
@@ -0,0 +1,598 @@
#!/usr/bin/env node
import {
access,
mkdir,
readFile,
rename,
rm,
stat,
writeFile,
} from "node:fs/promises";
import { constants as fsConstants } from "node:fs";
import { execFile } from "node:child_process";
import { createHash } from "node:crypto";
import path from "node:path";
import process from "node:process";
import { pathToFileURL } from "node:url";
import { inspect, promisify } from "node:util";
const DEFAULT_MANUAL_URL = "https://developers.openai.com/codex/codex-manual.md";
const DEFAULT_CACHE_DIR_NAME = "openai-docs-cache";
const CACHE_FILE_NAME = "codex-manual.md";
const OUTLINE_FILE_NAME = "codex-manual.outline.md";
const HASH_HEADER = "x-content-sha256";
const USER_AGENT = "codex-openai-docs";
const execFileAsync = promisify(execFile);
class ManualFetchError extends Error {
constructor(message, options) {
super(message, options);
this.name = "ManualFetchError";
}
}
const sha256 = (value) => createHash("sha256").update(value).digest("hex");
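// Runs promiseFactory with an AbortSignal that fires after timeoutMs; the timer is always cleared.
const withTimeout = async (promiseFactory, timeoutMs) => {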
const controller = new AbortController();
const timeout = setTimeout(() => controller.abort(), timeoutMs);
try {
return await promiseFactory(controller.signal);
} finally {
clearTimeout(timeout);
}
};
const proxyConfigured = () =>
process.env.HTTP_PROXY ||
process.env.HTTPS_PROXY ||
process.env.http_proxy ||
process.env.https_proxy;
const responseHeaders = (headers) => ({
get(name) {
return headers.get(name.toLowerCase()) ?? null;
},
});
const makeResponse = ({ body, headers, status }) => ({
headers: responseHeaders(headers),
ok: status >= 200 && status < 300,
status,
async text() {
return body;
},
});
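// Parses curl's --dump-header output. With --location there can be one header
// block per redirect hop, so the last block that starts with "HTTP/" is used.
const parseCurlHeaders = (rawHeaders) => {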
const normalized = rawHeaders.replace(/\r\n/g, "\n").trim();
const blocks = normalized.split(/\n\n+/).filter(Boolean);
const headerBlock = [...blocks]
.reverse()
.find((block) => block.startsWith("HTTP/"));
if (!headerBlock) {
throw new ManualFetchError("curl did not return HTTP response headers.");
}
const [statusLine, ...lines] = headerBlock.split("\n");
const statusMatch = /^HTTP\/\S+\s+(\d{3})/.exec(statusLine);
if (!statusMatch) {
throw new ManualFetchError(
`Could not parse HTTP status from curl response: ${statusLine}`
);
}
const headers = new Map();
lines.forEach((line) => {
const separator = line.indexOf(":");
if (separator === -1) return;
const name = line.slice(0, separator).trim().toLowerCase();
const value = line.slice(separator + 1).trim();
headers.set(name, value);
});
return {
headers,
status: Number(statusMatch[1]),
};
};
const tempFilePath = (cacheDir, suffix) =>
path.join(
cacheDir,
`.fetch-codex-manual-${process.pid}-${Date.now()}-${Math.random()
.toString(16)
.slice(2)}${suffix}`
);
const requestManualWithCurl = async (url, { cacheDir, method, timeoutMs }) => {
const headerPath = tempFilePath(cacheDir, ".headers");
const bodyPath = tempFilePath(cacheDir, ".body");
const curlNames =
process.platform === "win32" ? ["curl.exe", "curl"] : ["curl"];
const args = [
"--silent",
"--show-error",
"--location",
"--dump-header",
headerPath,
"--output",
bodyPath,
"--user-agent",
USER_AGENT,
"--max-time",
String(Math.max(1, Math.ceil(timeoutMs / 1000))),
];
if (method === "HEAD") {
args.push("--head");
} else {
args.push("--request", method);
}
args.push(url);
let lastError;
for (const curlName of curlNames) {
try {
await execFileAsync(curlName, args, { windowsHide: true });
const [rawHeaders, body] = await Promise.all([
readFile(headerPath, "utf8"),
readFile(bodyPath, "utf8"),
]);
const { headers, status } = parseCurlHeaders(rawHeaders);
return makeResponse({ body, headers, status });
} catch (error) {
lastError = error;
if (error?.code !== "ENOENT") break;
} finally {
await Promise.all([
rm(headerPath, { force: true }),
rm(bodyPath, { force: true }),
]);
}
}
if (lastError?.code === "ENOENT") {
throw new ManualFetchError("curl is unavailable in this environment.", {
cause: lastError,
});
}
throw new ManualFetchError(`${method} ${url} could not be fetched.`, {
cause: lastError,
});
};
const requestManualWithFetch = async (url, { method, timeoutMs }) => {
if (typeof fetch !== "function") {
throw new ManualFetchError(
"Native fetch is unavailable in this Node runtime."
);
}
return withTimeout(
(signal) =>
fetch(url, {
method,
headers: { "User-Agent": USER_AGENT },
signal,
}),
timeoutMs
);
};
const requestManual = async (url, { cacheDir, method, timeoutMs }) => {
const preferCurl = Boolean(proxyConfigured()) || typeof fetch !== "function";
const transports = preferCurl
? [
() => requestManualWithCurl(url, { cacheDir, method, timeoutMs }),
() => requestManualWithFetch(url, { method, timeoutMs }),
]
: [
() => requestManualWithFetch(url, { method, timeoutMs }),
() => requestManualWithCurl(url, { cacheDir, method, timeoutMs }),
];
let lastError;
for (const transport of transports) {
try {
const response = await transport();
if (!response.ok) {
throw new ManualFetchError(
`${method} ${url} failed with HTTP ${response.status}.`
);
}
return response;
} catch (error) {
lastError = error;
}
}
throw new ManualFetchError(`${method} ${url} could not be fetched.`, {
cause: lastError,
});
};
const readHeaderSha = (response) => {
const value = response.headers.get(HASH_HEADER);
if (!value || !/^[a-f0-9]{64}$/i.test(value)) {
throw new ManualFetchError(`Manual response is missing ${HASH_HEADER}.`);
}
return value.toLowerCase();
};
const nearestExistingParent = async (target) => {
let current = target;
while (true) {
try {
const info = await stat(current);
return info.isDirectory() ? current : null;
} catch (error) {
if (error?.code !== "ENOENT") return null;
}
const parent = path.dirname(current);
if (parent === current) return null;
current = parent;
}
};
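// Returns the resolved path if it is an existing directory or could be created
// under a writable existing parent directory; otherwise returns null.
const usableCacheDir = async (cacheDir) => {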
if (!cacheDir) return null;
const resolved = path.resolve(cacheDir);
try {
const info = await stat(resolved);
if (!info.isDirectory()) return null;
} catch (error) {
if (error?.code !== "ENOENT") return null;
}
const parent = await nearestExistingParent(resolved);
if (!parent) return null;
try {
await access(parent, fsConstants.W_OK | fsConstants.X_OK);
} catch {
return null;
}
return resolved;
};
const defaultCacheDirCandidates = () => {
const candidates = [];
const seen = new Set();
const pushCandidate = (candidate) => {
if (!candidate || seen.has(candidate)) return;
seen.add(candidate);
candidates.push(candidate);
};
[process.env.TMPDIR, process.env.TEMP, process.env.TMP].forEach((baseDir) => {
if (baseDir) {
pushCandidate(path.join(baseDir, DEFAULT_CACHE_DIR_NAME));
}
});
if (process.platform !== "win32") {
pushCandidate(`/private/tmp/${DEFAULT_CACHE_DIR_NAME}`);
pushCandidate(`/tmp/${DEFAULT_CACHE_DIR_NAME}`);
}
return candidates;
};
const resolveCacheDir = async (cacheDir) => {
if (cacheDir) {
return usableCacheDir(cacheDir);
}
for (const candidate of defaultCacheDirCandidates()) {
const usable = await usableCacheDir(candidate);
if (usable) return usable;
}
return null;
};
const cacheFilePath = (cacheDir) => path.join(cacheDir, CACHE_FILE_NAME);
const outlineFilePath = (cacheDir) => path.join(cacheDir, OUTLINE_FILE_NAME);
const manualLines = (manual) => {
const lines = manual.replace(/\r\n/g, "\n").split("\n");
if (lines[lines.length - 1] === "") lines.pop();
return lines;
};
const sectionTitle = (rawTitle) =>
rawTitle.replace(/\s+#+\s*$/, "").replace(/\s+/g, " ").trim();
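// Builds an outline of level-2 and level-3 headings with 1-based line ranges,
// ignoring heading-like lines inside fenced code blocks.
const buildOutline = (manual) => {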
const lines = manualLines(manual);
const headings = [];
let inFence = false;
lines.forEach((line, index) => {
if (/^\s*(```|~~~)/.test(line)) {
inFence = !inFence;
return;
}
if (inFence) return;
const match = /^(#{1,6})\s+(.+?)\s*$/.exec(line);
if (!match) return;
const level = match[1].length;
if (level < 2 || level > 3) return;
headings.push({
level,
title: sectionTitle(match[2]),
startLine: index + 1,
endLine: lines.length,
});
});
for (let index = 0; index < headings.length; index += 1) {
const heading = headings[index];
const nextPeer = headings
.slice(index + 1)
.find((candidate) => candidate.level <= heading.level);
if (nextPeer) {
heading.endLine = nextPeer.startLine - 1;
}
}
if (headings.length === 0) {
return {
headingCount: 0,
lineCount: lines.length,
text: "No markdown headings found.",
};
}
const minLevel = Math.min(...headings.map((heading) => heading.level));
return {
headingCount: headings.length,
lineCount: lines.length,
text: headings
.map((heading) => {
const indent = " ".repeat(heading.level - minLevel);
return `${indent}- ${heading.title} (lines ${heading.startLine}-${heading.endLine})`;
})
.join("\n"),
};
};
const outlineMarkdown = (outline) => `# Codex Manual Outline\n\n${outline.text}\n`;
const manualStatusLine = (status) =>
status.cacheStatus === "hit"
? "Manual status: local manual was already current."
: "Manual status: local manual was updated.";
const formatResult = ({ status, outlineText }) =>
[
`Manual path: ${status.manualPath}`,
`Outline path: ${status.outlinePath}`,
manualStatusLine(status),
"",
outlineText,
].join("\n");
const readCachedManual = async (cacheDir, expectedSha256) => {
try {
const manual = await readFile(cacheFilePath(cacheDir), "utf8");
return sha256(manual) === expectedSha256 ? manual : null;
} catch {
return null;
}
};
const writeCachedManual = async (cacheDir, manual) => {
await mkdir(cacheDir, { recursive: true });
const tmpPath = tempFilePath(cacheDir, `.${CACHE_FILE_NAME}.tmp`);
await writeFile(tmpPath, manual, "utf8");
await rename(tmpPath, cacheFilePath(cacheDir));
};
const writeOutline = async (cacheDir, outlineText) => {
await mkdir(cacheDir, { recursive: true });
const tmpPath = tempFilePath(cacheDir, `.${OUTLINE_FILE_NAME}.tmp`);
await writeFile(tmpPath, outlineText, "utf8");
await rename(tmpPath, outlineFilePath(cacheDir));
};
const fetchCodexManual = async ({
manualUrl = DEFAULT_MANUAL_URL,
cacheDir,
timeoutMs = 30000,
} = {}) => {
const resolvedCacheDir = await resolveCacheDir(cacheDir);
if (!resolvedCacheDir) {
throw new ManualFetchError(
"Manual cache directory is unavailable; pass --cache-dir to override or use OpenAI Docs MCP fallback."
);
}
await mkdir(resolvedCacheDir, { recursive: true });
const headResponse = await requestManual(manualUrl, {
cacheDir: resolvedCacheDir,
method: "HEAD",
timeoutMs,
});
const expectedSha256 = readHeaderSha(headResponse);
const manualPath = cacheFilePath(resolvedCacheDir);
const outlinePath = outlineFilePath(resolvedCacheDir);
const checkedAt = new Date().toISOString();
const cachedManual = await readCachedManual(resolvedCacheDir, expectedSha256);
if (cachedManual !== null) {
const outline = buildOutline(cachedManual);
const outlineText = outlineMarkdown(outline);
await writeOutline(resolvedCacheDir, outlineText);
return {
outlineText,
status: {
manualUrl,
headerSha256: expectedSha256,
fetchedManualSha256: expectedSha256,
manualHashMatches: true,
cacheStatus: "hit",
cacheDir: resolvedCacheDir,
manualPath,
outlinePath,
checkedAt,
lineCount: outline.lineCount,
headingCount: outline.headingCount,
},
};
}
const getResponse = await requestManual(manualUrl, {
cacheDir: resolvedCacheDir,
method: "GET",
timeoutMs,
});
const getHeaderSha256 = readHeaderSha(getResponse);
if (getHeaderSha256 !== expectedSha256) {
throw new ManualFetchError(
`${HASH_HEADER} changed between HEAD and GET for ${manualUrl}.`
);
}
const manualText = await getResponse.text();
const actualSha256 = sha256(manualText);
const manualHashMatches = actualSha256 === expectedSha256;
if (!manualHashMatches) {
throw new ManualFetchError(
`${HASH_HEADER} did not match the fetched manual body for ${manualUrl}.`
);
}
await writeCachedManual(resolvedCacheDir, manualText);
const outline = buildOutline(manualText);
const outlineText = outlineMarkdown(outline);
await writeOutline(resolvedCacheDir, outlineText);
return {
outlineText,
status: {
manualUrl,
headerSha256: expectedSha256,
fetchedManualSha256: actualSha256,
manualHashMatches,
cacheStatus: "updated",
cacheDir: resolvedCacheDir,
manualPath,
outlinePath,
checkedAt,
lineCount: outline.lineCount,
headingCount: outline.headingCount,
},
};
};
const parseArgs = (argv) => {
const args = {
manualUrl: DEFAULT_MANUAL_URL,
cacheDir: undefined,
timeoutMs: 30000,
statusJson: false,
};
for (let index = 0; index < argv.length; index += 1) {
const arg = argv[index];
if (arg === "--manual-url") {
args.manualUrl = argv[++index];
} else if (arg === "--cache-dir") {
args.cacheDir = argv[++index];
} else if (arg === "--timeout-ms") {
args.timeoutMs = Number(argv[++index]);
} else if (arg === "--status-json") {
args.statusJson = true;
} else {
throw new ManualFetchError(`Unknown argument: ${arg}`);
}
}
if (!args.manualUrl) {
throw new ManualFetchError("--manual-url cannot be empty.");
}
if (!Number.isFinite(args.timeoutMs) || args.timeoutMs <= 0) {
throw new ManualFetchError("--timeout-ms must be a positive number.");
}
return args;
};
const main = async () => {
const args = parseArgs(process.argv.slice(2));
const { outlineText, status } = await fetchCodexManual(args);
process.stdout.write(formatResult({ status, outlineText }));
if (args.statusJson) {
console.error(JSON.stringify(status));
}
};
const envProxyHint = () => {
if (proxyConfigured()) {
return "Hint: proxy env vars are present. This helper prefers `curl` in proxied sessions; if requests still fail, verify `curl` is installed and the proxy configuration is valid.";
}
if (typeof fetch !== "function") {
return "Hint: native fetch is unavailable in this Node runtime. Install `curl` or use a newer Node version to fetch the manual.";
}
if (process.platform === "win32") {
return "Hint: on Windows, pass a cache dir under `%TEMP%` or `%TMP%`.";
}
return null;
};
const formatErrorDetails = (error) => {
const details = inspect(error, {
breakLength: 120,
colors: false,
compact: false,
depth: 8,
});
if (!error?.cause) {
return details;
}
return `${details}\n\nCause:\n${inspect(error.cause, {
breakLength: 120,
colors: false,
compact: false,
depth: 8,
})}`;
};
const isCliEntrypoint = () => {
const entrypoint = process.argv[1];
if (!entrypoint) {
return false;
}
return pathToFileURL(entrypoint).href === import.meta.url;
};
if (isCliEntrypoint()) {
main().catch((error) => {
console.error(`Error: ${error.message}`);
const hint = envProxyHint();
if (hint) {
console.error(hint);
}
console.error("");
console.error("Details:");
console.error(formatErrorDetails(error));
process.exitCode = 1;
});
}
export { DEFAULT_MANUAL_URL, fetchCodexManual };
@@ -0,0 +1,147 @@
#!/usr/bin/env node
const fs = require("node:fs/promises");
const path = require("node:path");
const DEFAULT_URL =
"https://developers.openai.com/api/docs/guides/latest-model.md";
const DEFAULT_BASE_URL = "https://developers.openai.com";
function parseArgs(argv) {
const args = {
source: process.env.LATEST_MODEL_URL || DEFAULT_URL,
baseUrl: process.env.LATEST_MODEL_BASE_URL || DEFAULT_BASE_URL,
};
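for (let i = 2; i < argv.length; i += 1) {
const arg = argv[i];
if (arg === "--source" || arg === "--url") {
args.source = argv[i + 1];
i += 1;
} else if (arg === "--base-url") {
args.baseUrl = argv[i + 1];
i += 1;
}
}
// A flag passed as the last argument leaves its value undefined; fail early
// instead of crashing later in readSource() or new URL().
if (!args.source || !args.baseUrl) {
throw new Error("--source/--url and --base-url each require a value");
}
return args;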
}
async function readSource(source) {
if (source.startsWith("file://")) {
return fs.readFile(new URL(source), "utf8");
}
if (!/^https?:\/\//.test(source)) {
return fs.readFile(path.resolve(source), "utf8");
}
const response = await fetch(source, {
headers: { accept: "text/markdown,text/plain,*/*" },
});
if (!response.ok) {
throw new Error(`failed to fetch ${source}: ${response.status}`);
}
return response.text();
}
function parseIndentedInfo(lines, startIndex) {
const info = {};
for (let i = startIndex + 1; i < lines.length; i += 1) {
const line = lines[i];
if (!line.trim()) {
continue;
}
const match = line.match(/^ {2}([A-Za-z][A-Za-z0-9_-]*):\s*(.+?)\s*$/);
if (!match) {
break;
}
info[match[1]] = match[2].replace(/^["']|["']$/g, "");
}
return info;
}
function parseFlatInfo(block) {
const info = {};
for (const line of block.split(/\r?\n/)) {
const match = line.match(/^\s*([A-Za-z][A-Za-z0-9_-]*):\s*(.+?)\s*$/);
if (match) {
info[match[1]] = match[2].replace(/^["']|["']$/g, "");
}
}
return info;
}
function extractLatestModelInfo(markdown) {
const lines = markdown.split(/\r?\n/);
const latestModelInfoIndex = lines.findIndex((line) =>
/^latestModelInfo:\s*$/.test(line)
);
if (latestModelInfoIndex >= 0) {
return parseIndentedInfo(lines, latestModelInfoIndex);
}
const commentMatch = markdown.match(
/<!--\s*latestModelInfo\s*\n([\s\S]*?)\n\s*-->/m
);
if (commentMatch) {
return parseFlatInfo(commentMatch[1]);
}
return undefined;
}
function modelToSkillSlug(model) {
return model.trim().replace(/\./g, "p");
}
function absoluteUrl(baseUrl, value) {
return new URL(value, baseUrl).toString();
}
function normalizeInfo(info, baseUrl) {
const model = info?.model?.trim();
const migrationGuide = info?.migrationGuide?.trim();
const promptingGuide = info?.promptingGuide?.trim();
if (!model || !migrationGuide || !promptingGuide) {
throw new Error(
"latestModelInfo must include model, migrationGuide, and promptingGuide"
);
}
return {
model,
modelSlug: modelToSkillSlug(model),
migrationGuideUrl: absoluteUrl(baseUrl, migrationGuide),
promptingGuideUrl: absoluteUrl(baseUrl, promptingGuide),
};
}
async function main() {
const { source, baseUrl } = parseArgs(process.argv);
const markdown = await readSource(source);
const info = extractLatestModelInfo(markdown);
if (!info) {
throw new Error(`latestModelInfo block not found in ${source}`);
}
process.stdout.write(
`${JSON.stringify(normalizeInfo(info, baseUrl), null, 2)}\n`
);
}
main().catch((error) => {
console.error(error.message);
process.exit(1);
});
@@ -0,0 +1,243 @@
---
name: plugin-creator
description: Create and scaffold plugin directories for Codex with a required `.codex-plugin/plugin.json`, optional plugin folders/files, valid manifest defaults, and personal-marketplace entries by default. Use when Codex needs to create a new personal plugin, add optional plugin structure, generate or update marketplace entries for plugin ordering and availability metadata, or update an existing local plugin during development with the CLI-driven cachebuster and reinstall flow.
---
# Plugin Creator
## Quick Start
1. Run the scaffold script:
```bash
# Plugin names are normalized to lower-case hyphen-case and must be <= 64 chars.
# The generated folder and plugin.json name are always the same.
# Run from the skill root (the directory containing this `SKILL.md`).
# By default creates in `~/plugins/<plugin-name>`.
python3 scripts/create_basic_plugin.py <plugin-name>
```
2. Edit `<plugin-path>/.codex-plugin/plugin.json` when the request gives specific metadata.
The scaffold starts with valid defaults and must not contain `[TODO: ...]` placeholders.
3. Generate or update the personal marketplace entry when the plugin should appear in Codex UI ordering:
```bash
# Personal marketplace entries default to `~/.agents/plugins/marketplace.json`.
python3 scripts/create_basic_plugin.py my-plugin --with-marketplace
```
Only specify `--marketplace-name <name>` when the default `personal` marketplace name is already
taken or installed and you need to seed a different new marketplace file:
```bash
python3 scripts/create_basic_plugin.py my-plugin \
--with-marketplace \
--marketplace-name team-local
```
Only use a repo/team marketplace when the user specifically asks for that destination:
```bash
python3 scripts/create_basic_plugin.py my-plugin \
--path <repo-root>/plugins \
--marketplace-path <repo-root>/.agents/plugins/marketplace.json \
--with-marketplace
```
When the user specifies a marketplace path, make sure that marketplace is actually installed before
telling the user to reinstall from it. The default personal marketplace file at
`~/.agents/plugins/marketplace.json` is discovered implicitly, but other marketplace paths are not.
On Windows, use the equivalent path under the user profile.
4. Generate/adjust optional companion folders as needed:
```bash
python3 scripts/create_basic_plugin.py my-plugin \
--path <parent-plugin-directory> \
--marketplace-path <marketplace-json-path> \
--with-skills --with-hooks --with-scripts --with-assets --with-mcp --with-apps --with-marketplace
```
`<parent-plugin-directory>` is the directory where the plugin folder `<plugin-name>` will be
created (for example `~/plugins`).
5. Before handing back a generated plugin, run:
```bash
python3 scripts/validate_plugin.py <plugin-path>
```
For updates to an existing local plugin during development, keep the scaffold flow as-is and follow
the update reference, starting with the cachebuster helper, instead of hand-editing marketplace files:
```bash
python3 scripts/update_plugin_cachebuster.py <plugin-path>
```
Prefer the helper default cachebuster unless the user explicitly asks for a specific override.
See `references/installing-and-updating.md` for the expected cachebuster and reinstall flow while iterating on an existing local plugin.
## What this skill creates
- Default marketplace-backed scaffolds use the personal marketplace file at
`~/.agents/plugins/marketplace.json`, with plugins generally being stored in
`~/plugins/<plugin-name>/`.
- Creates plugin root at `/<parent-plugin-directory>/<plugin-name>/`.
- Always creates `/<parent-plugin-directory>/<plugin-name>/.codex-plugin/plugin.json`.
- Fills the manifest with the validated schema shape that the ingestion path accepts.
- Creates or updates `~/.agents/plugins/marketplace.json` when `--with-marketplace` is set.
- If the marketplace file does not exist yet, seed a personal marketplace root before adding the first plugin entry.
- `<plugin-name>` is normalized using skill-creator naming rules:
- `My Plugin` -> `my-plugin`
- `My--Plugin` -> `my-plugin`
- underscores, spaces, and punctuation are converted to `-`
- result is lower-case hyphen-delimited with consecutive hyphens collapsed
- Supports optional creation of:
- `skills/`
- `hooks/`
- `scripts/`
- `assets/`
- `.mcp.json`
- `.app.json`
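
The naming rules above can be sketched in a few lines of Python. This is a minimal illustration, not the code in `scripts/create_basic_plugin.py`: the function name `normalize_plugin_name` is hypothetical, and raising an error for names longer than 64 characters (rather than truncating) is an assumption.

```python
import re

MAX_PLUGIN_NAME_LENGTH = 64  # limit stated in the Quick Start comments


def normalize_plugin_name(raw: str) -> str:
    """Hypothetical helper: return a lower-case, hyphen-delimited plugin name."""
    # Replace every run of characters outside [a-z0-9] with a single hyphen.
    # This covers spaces, underscores, punctuation, and repeated hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", raw.strip().lower())
    # Drop hyphens left at either end by leading or trailing punctuation.
    slug = slug.strip("-")
    if not slug:
        raise ValueError("plugin name has no usable characters")
    # Assumption: over-long names are rejected rather than truncated.
    if len(slug) > MAX_PLUGIN_NAME_LENGTH:
        raise ValueError(f"plugin name exceeds {MAX_PLUGIN_NAME_LENGTH} characters")
    return slug
```

Because the folder name and the `plugin.json` `name` must match, a helper like this would be applied once and its result reused for both.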
## Marketplace workflow
- Personal-marketplace creation defaults to `~/.agents/plugins/marketplace.json`. Here,
"personal marketplace" means the marketplace whose file is at that path.
- Repo/team marketplace creation is opt-in through both `--path` and `--marketplace-path`, only
when the user specifically requests it.
- `--marketplace-name` is an exception path. Use it only when the default `personal` marketplace
name is already taken and you need to seed a different new marketplace file.
- Do not use `--marketplace-name` to rename an existing marketplace file in place. If the file
already exists, its top-level `name` must already match.
- If the user specifies a different marketplace path, treat that marketplace as needing explicit installation via `codex plugin marketplace add`.
- Prefer `scripts/read_marketplace_name.py` when you need the marketplace name from any
`marketplace.json` file. With no argument it reads the default personal marketplace; with an
explicit path it works for repo/team marketplaces too.
- In either location, the generated source path remains `./plugins/<plugin-name>`.
- Marketplace root metadata supports top-level `name` plus optional `interface.displayName`.
- Treat plugin order in `plugins[]` as render order in Codex. Append new entries unless a user explicitly asks to reorder the list.
- `displayName` belongs inside the marketplace `interface` object, not individual `plugins[]` entries.
- Each generated marketplace entry must include all of:
- `policy.installation`
- `policy.authentication`
- `category`
- Default new entries to:
- `policy.installation: "AVAILABLE"`
- `policy.authentication: "ON_INSTALL"`
- Override defaults only when the user explicitly specifies another allowed value.
- Allowed `policy.installation` values:
- `NOT_AVAILABLE`
- `AVAILABLE`
- `INSTALLED_BY_DEFAULT`
- Allowed `policy.authentication` values:
- `ON_INSTALL`
- `ON_USE`
- Treat `policy.products` as an override. Omit it unless the user explicitly requests product gating.
- The generated plugin entry shape is:
```json
{
"name": "plugin-name",
"source": {
"source": "local",
"path": "./plugins/plugin-name"
},
"policy": {
"installation": "AVAILABLE",
"authentication": "ON_INSTALL"
},
"category": "Productivity"
}
```
- Use `--force` only when intentionally replacing an existing marketplace entry for the same plugin name.
- If the target marketplace file does not exist yet, create it with top-level `"name"`, an `"interface"` object containing `"displayName"`, and a `plugins` array, then add the new entry.
- For a brand-new marketplace file, the root object should look like:
```json
{
"name": "personal",
"interface": {
"displayName": "Personal"
},
"plugins": [
{
"name": "plugin-name",
"source": {
"source": "local",
"path": "./plugins/plugin-name"
},
"policy": {
"installation": "AVAILABLE",
"authentication": "ON_INSTALL"
},
"category": "Productivity"
}
]
}
```
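
As a rough sketch of the entry rules above (required `policy` and `category` fields, append-by-default ordering, and `--force` for replacement), the following Python builds an entry and adds it to an in-memory marketplace object. The function names `build_entry` and `upsert_entry` are hypothetical, and using `"Productivity"` as the default category is an assumption taken from the sample JSON rather than from the scaffold script.

```python
import copy

ALLOWED_INSTALLATION = {"NOT_AVAILABLE", "AVAILABLE", "INSTALLED_BY_DEFAULT"}
ALLOWED_AUTHENTICATION = {"ON_INSTALL", "ON_USE"}


def build_entry(name, category="Productivity",
                installation="AVAILABLE", authentication="ON_INSTALL"):
    """Hypothetical helper: build one plugins[] entry with all required fields."""
    if installation not in ALLOWED_INSTALLATION:
        raise ValueError(f"unsupported policy.installation: {installation}")
    if authentication not in ALLOWED_AUTHENTICATION:
        raise ValueError(f"unsupported policy.authentication: {authentication}")
    return {
        "name": name,
        "source": {"source": "local", "path": f"./plugins/{name}"},
        "policy": {"installation": installation, "authentication": authentication},
        "category": category,
    }


def upsert_entry(marketplace, entry, force=False):
    """Return a copy of the marketplace with the entry appended or replaced."""
    updated = copy.deepcopy(marketplace)
    plugins = updated.setdefault("plugins", [])
    for index, existing in enumerate(plugins):
        if existing.get("name") == entry["name"]:
            if not force:
                raise ValueError(f"entry for {entry['name']} already exists")
            # Replace in place so the render order of the list is preserved.
            plugins[index] = entry
            return updated
    # New entries go last because list order is render order.
    plugins.append(entry)
    return updated
```

Working on a copy leaves the loaded object untouched if validation fails partway, and any existing `interface.displayName` is carried over unchanged.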
## Required behavior
- Outer folder name and `plugin.json` `"name"` are always the same normalized plugin name.
- Do not remove required structure; keep `.codex-plugin/plugin.json` present.
- Do not leave `[TODO: ...]` placeholders in plugin manifests.
- Keep `apps` and `mcpServers` out of `plugin.json` unless their companion files are actually created.
- Omit unsupported plugin manifest fields that validation rejects, including `hooks`.
- If creating files inside an existing plugin path, use `--force` only when overwrite is intentional.
- Preserve any existing marketplace `interface.displayName`.
- When generating marketplace entries, always write `policy.installation`, `policy.authentication`, and `category` even if their values are defaults.
- Add `policy.products` only when the user explicitly asks for that override.
- Keep marketplace `source.path` relative to the selected marketplace root as `./plugins/<plugin-name>`.
- Only use `--marketplace-name` when creating a new marketplace file whose name should not be
`personal` because that name is already taken or installed elsewhere.
- If Codex would need approval to write the marketplace file, ask for that approval before
proceeding. If the user prefers to run the write themselves, provide the exact scaffold command
and then continue from validation or subsequent plugin edits instead of leaving the workflow
vague.
- For updates to an existing local plugin during development, do not hand-edit marketplace config
or `marketplace.json`. Use the update flow documented in
`references/installing-and-updating.md` and `scripts/update_plugin_cachebuster.py`.
- Do not tell the user to run `codex plugin marketplace add` for the default personal-marketplace
flow. That command is for explicit non-default marketplace configuration, not for the standard
`~/.agents/plugins/marketplace.json` path.
- If the user provided a non-default `--marketplace-path`, make sure that marketplace is installed
before giving reinstall instructions. Use `codex plugin marketplace add <path-to-marketplace-root>`
when that explicit marketplace has not been configured yet.
- When the workflow created or updated a marketplace-backed plugin, end the final user-facing
response with a short Codex app handoff. Say `To view this in the Codex app:` and write
`View <normalized plugin name>` and `Share <normalized plugin name>` as Markdown links, not raw
URLs or code spans.
- The View deeplink uses `codex://plugins/<normalized plugin name>?marketplacePath=<absolute marketplace.json path>`.
The Share deeplink uses the same URL with `&mode=share`.
- Replace the placeholders with the real normalized plugin name and absolute `marketplace.json`
path from the scaffolded plugin. URL-encode the path segment and query value when needed.
- Do not add `pluginName` or `hostId` query parameters to these deeplinks. Codex derives both after
the user clicks the link.
- Do not emit the `View <normalized plugin name>` or `Share <normalized plugin name>` links when no marketplace entry was
created or updated.
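
The deeplink rules above could be assembled along these lines. This is a sketch under stated assumptions: the helper names are hypothetical, percent-encoding every reserved character (including `/`) is a conservative reading of "URL-encode ... when needed", and rendering the two links as a bullet list is only one possible layout.

```python
from urllib.parse import quote


def plugin_deeplinks(plugin_name: str, marketplace_path: str) -> dict:
    """Hypothetical helper: build the View and Share deeplinks for a plugin."""
    # Assumption: encode all reserved characters in both the path segment
    # and the query value, so spaces and slashes cannot break the URL.
    view = (
        f"codex://plugins/{quote(plugin_name, safe='')}"
        f"?marketplacePath={quote(marketplace_path, safe='')}"
    )
    # The Share link is the same URL with the mode parameter appended.
    return {"view": view, "share": f"{view}&mode=share"}


def handoff_markdown(plugin_name: str, marketplace_path: str) -> str:
    """Render the closing handoff with Markdown links instead of raw URLs."""
    links = plugin_deeplinks(plugin_name, marketplace_path)
    return "\n".join([
        "To view this in the Codex app:",
        f"- [View {plugin_name}]({links['view']})",
        f"- [Share {plugin_name}]({links['share']})",
    ])
```

Note that no `pluginName` or `hostId` parameter is added, matching the rule that Codex derives both after the click.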
## Reference to exact spec sample
For the exact canonical sample JSON for both plugin manifests and marketplace entries, use:
- `references/plugin-json-spec.md`
- `references/installing-and-updating.md` for update/reinstall guidance while
iterating on an existing local plugin, plus the new-thread pickup behavior after reinstall
## Validation
After editing `SKILL.md`, run:
```bash
python3 ../skill-creator/scripts/quick_validate.py .
```
Before handing back a generated plugin, run:
```bash
python3 scripts/validate_plugin.py <plugin-path>
```
@@ -0,0 +1,6 @@
interface:
display_name: "Plugin Creator"
short_description: "Scaffold plugins and marketplace entries"
default_prompt: "Use $plugin-creator to scaffold a valid plugin in the personal marketplace, then validate it before handing it back."
icon_small: "./assets/plugin-creator-small.svg"
icon_large: "./assets/plugin-creator.png"
@@ -0,0 +1,3 @@
<svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" fill="currentColor" viewBox="0 0 20 20">
<path fill="#0D0D0D" d="M12.03 4.113a3.612 3.612 0 0 1 5.108 5.108l-6.292 6.29c-.324.324-.56.561-.791.752l-.235.176c-.205.14-.422.261-.65.36l-.229.093a4.136 4.136 0 0 1-.586.16l-.764.134-2.394.4c-.142.024-.294.05-.423.06-.098.007-.232.01-.378-.026l-.149-.05a1.081 1.081 0 0 1-.521-.474l-.046-.093a1.104 1.104 0 0 1-.075-.527c.01-.129.035-.28.06-.422l.398-2.394c.1-.602.162-.987.295-1.35l.093-.23c.1-.228.22-.445.36-.65l.176-.235c.19-.232.428-.467.751-.79l6.292-6.292Zm-5.35 7.232c-.35.35-.534.535-.66.688l-.11.147a2.67 2.67 0 0 0-.24.433l-.062.154c-.08.22-.124.462-.232 1.112l-.398 2.394-.001.001h.003l2.393-.399.717-.126a2.63 2.63 0 0 0 .394-.105l.154-.063a2.65 2.65 0 0 0 .433-.24l.147-.11c.153-.126.339-.31.688-.66l4.988-4.988-3.227-3.226-4.987 4.988Zm9.517-6.291a2.281 2.281 0 0 0-3.225 0l-.364.362 3.226 3.227.363-.364c.89-.89.89-2.334 0-3.225ZM4.583 1.783a.3.3 0 0 1 .294.241c.117.585.347 1.092.707 1.48.357.385.859.668 1.549.783a.3.3 0 0 1 0 .592c-.69.115-1.192.398-1.549.783-.315.34-.53.77-.657 1.265l-.05.215a.3.3 0 0 1-.588 0c-.117-.585-.347-1.092-.707-1.48-.357-.384-.859-.668-1.549-.783a.3.3 0 0 1 0-.592c.69-.115 1.192-.398 1.549-.783.36-.388.59-.895.707-1.48l.015-.05a.3.3 0 0 1 .279-.19Z"/>
</svg>


Binary file not shown.


@@ -0,0 +1,143 @@
# Updating Existing Local Plugins
Use this reference when a plugin already exists and the request is about updating the plugin during
local development.
All scripts here are specified relative to the skill root. Update the path for running the scripts
depending on your current working directory.
## When To Use This Flow
Use this flow when all of the following are true:
- the plugin already exists locally
- the marketplace entry already points at the plugin source you are editing
- the user wants Codex to see the updated plugin without manually editing marketplace files
If the user still needs the initial plugin entry or marketplace structure created, use the scaffold
flow first and only then switch to this reinstall flow.
## Update Loop
1. Update the plugin manifest to a single Codex cachebuster suffix:
```bash
python3 scripts/update_plugin_cachebuster.py \
<plugin-path>
```
Prefer the default helper behavior here. If you omit `--cachebuster`, the helper uses a UTC
timestamp down to seconds, which is the recommended path for routine local iteration.
Only use a manual cachebuster override when the user explicitly asks for one or when a workflow
outside Codex depends on a specific token:
```bash
python3 scripts/update_plugin_cachebuster.py \
<plugin-path> \
--cachebuster local-20260519-184516
```
2. For the default scaffolded flow, read the marketplace name from the personal marketplace file:
```bash
python3 scripts/read_marketplace_name.py
```
Here, "personal marketplace" means the marketplace whose file is at
`~/.agents/plugins/marketplace.json`. On Windows, use the equivalent path under the user profile.
The helper uses Python's home-directory resolution and prints the marketplace name to use when
constructing the install command.
To read the name from a different marketplace file, pass the path directly:
```bash
python3 scripts/read_marketplace_name.py --marketplace-path <path-to-marketplace.json>
```
3. Reinstall from that marketplace name:
```bash
codex plugin add <plugin-name>@<marketplace-name-from-marketplace-json>
```
The default personal marketplace is discovered implicitly from
`~/.agents/plugins/marketplace.json`. You do not need `codex plugin marketplace add` for that
path, and `codex plugin marketplace list` is not the right check for whether that default
marketplace exists.
4. If the plugin is not using the personal marketplace file, check which configured local
marketplace is actually surfacing that plugin:
```bash
codex plugin list
```
If the plugin is not in the personal marketplace file, confirm which marketplace entry points at
the plugin source you are editing and make sure that marketplace is still local. If it is a
different local marketplace, reinstall from that marketplace name instead of forcing the personal
marketplace flow. If it is not local, stop and help the user resolve the mismatch before
continuing.
5. If the plugin lives in a different confirmed local marketplace, substitute that marketplace
name:
```bash
codex plugin add <plugin-name>@<local-marketplace>
```
6. Prompt the user to use a new thread to try the updated plugin, so that Codex picks up new skills
and tools.
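
Steps 2 and 3 of the loop above amount to reading the top-level `name` from `marketplace.json` and joining it with the plugin name. The Python below is an illustrative sketch only: it is not the source of `scripts/read_marketplace_name.py`, and the function names are hypothetical. It builds the command string without running it.

```python
import json
from pathlib import Path

# Default personal marketplace location described in this reference.
DEFAULT_MARKETPLACE_PATH = Path.home() / ".agents" / "plugins" / "marketplace.json"


def marketplace_name_from_text(text: str) -> str:
    """Hypothetical helper: extract the top-level marketplace name from JSON text."""
    data = json.loads(text)
    name = data.get("name") if isinstance(data, dict) else None
    if not isinstance(name, str) or not name.strip():
        raise ValueError("marketplace.json has no usable top-level 'name'")
    return name.strip()


def read_marketplace_name(marketplace_path=DEFAULT_MARKETPLACE_PATH) -> str:
    """Read the marketplace name from a file; defaults to the personal marketplace."""
    return marketplace_name_from_text(Path(marketplace_path).read_text(encoding="utf-8"))


def reinstall_command(plugin_name: str, marketplace_name: str) -> str:
    """Compose the reinstall command shown in step 3."""
    return f"codex plugin add {plugin_name}@{marketplace_name}"
```

Reading the name from the file, rather than assuming `personal`, keeps the command correct when a marketplace was seeded under a different name.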
## Cachebuster Policy
- Preserve the existing version prefix and replace only the suffix.
- Treat the preserved prefix as everything before `+`.
- Use the format:
```text
<base-version>+codex.<cachebuster>
```
Examples:
- `0.1.0` -> `0.1.0+codex.local-20260519-184516`
- `0.1.0+codex.old-token` -> `0.1.0+codex.local-20260519-184516`
- `1.2.3-beta.1+codex.prev` -> `1.2.3-beta.1+codex.local-20260519-184516`
- `dev-build+other-tag` -> `dev-build+codex.local-20260519-184516`
Replace the existing Codex cachebuster instead of appending another one. Do not keep incrementing
numeric version components just to trigger reinstall behavior.
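
The policy above reduces to "keep everything before the first `+`, then append `+codex.<cachebuster>`". A minimal Python sketch follows; it is not the code in `scripts/update_plugin_cachebuster.py`. The function names are hypothetical, and the `YYYYMMDD-HHMMSS` timestamp layout is an assumption, since this reference only says the default is a UTC timestamp down to seconds.

```python
from datetime import datetime, timezone


def default_cachebuster(now=None) -> str:
    """Hypothetical default token: UTC timestamp with second resolution."""
    moment = now or datetime.now(timezone.utc)
    # Assumed layout; the real helper may format the timestamp differently.
    return moment.astimezone(timezone.utc).strftime("%Y%m%d-%H%M%S")


def apply_cachebuster(version: str, cachebuster: str) -> str:
    """Replace any existing build suffix with a single Codex cachebuster."""
    # Everything before the first "+" is the preserved version prefix.
    base_version = version.split("+", 1)[0]
    return f"{base_version}+codex.{cachebuster}"
```

Splitting on the first `+` is what makes the operation idempotent: applying it again replaces the previous token instead of stacking a second suffix.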
## Marketplace Rules
- Marketplace manipulation should happen through commands, not by hand-editing `marketplace.json`
or `config.toml` during this update/reinstall flow.
- Prefer the personal marketplace file for the default scaffolded flow.
- Read the personal marketplace name with
`python3 scripts/read_marketplace_name.py` and use the printed value when constructing
`codex plugin add <plugin-name>@<marketplace-name>`.
- For non-default marketplace files, use
`python3 scripts/read_marketplace_name.py --marketplace-path <path-to-marketplace.json>` to read
the name before constructing reinstall commands.
- Do not tell the user to run `codex plugin marketplace add` for the default personal-marketplace
flow. That marketplace is discovered implicitly by Codex.
- If the user specified a different marketplace path, make sure that marketplace is installed
before giving install or reinstall instructions. Non-default marketplace paths are not
discovered implicitly.
- Use `codex plugin list` when the plugin lives in a different configured marketplace and you need
to confirm which marketplace is surfacing that plugin.
- If a non-default local marketplace has not been configured yet, install it with
`codex plugin marketplace add <path-to-marketplace-root>` before telling the user to run
`codex plugin add <plugin-name>@<marketplace-name>`.
- If the plugin is not in the personal marketplace file, confirm that the selected marketplace is
local before telling the user to reinstall from it.
- If the selected marketplace is not local, stop and help the user resolve that mismatch rather
than pretending the normal local reinstall flow applies.
- If the plugin source is not already the source referenced by the chosen marketplace entry, stop
and fix that first. This update flow does not rewrite marketplace entries.
## After Reinstall
After reinstalling, prompt the user to start a new thread for testing. That is the safe boundary for
picking up the updated plugin and its MCP tools.
@@ -0,0 +1,194 @@
# Plugin JSON sample spec
```json
{
"name": "plugin-name",
"version": "1.2.0",
"description": "Brief plugin description",
"author": {
"name": "Author Name",
"email": "author@example.com",
"url": "https://github.com/author"
},
"homepage": "https://docs.example.com/plugin",
"repository": "https://github.com/author/plugin",
"license": "MIT",
"keywords": ["keyword1", "keyword2"],
"skills": "./skills/",
"hooks": "./hooks.json",
"mcpServers": "./.mcp.json",
"apps": "./.app.json",
"interface": {
"displayName": "Plugin Display Name",
"shortDescription": "Short description for subtitle",
"longDescription": "Long description for details page",
"developerName": "OpenAI",
"category": "Productivity",
"capabilities": ["Interactive", "Write"],
"websiteURL": "https://openai.com/",
"privacyPolicyURL": "https://openai.com/policies/row-privacy-policy/",
"termsOfServiceURL": "https://openai.com/policies/row-terms-of-use/",
"defaultPrompt": [
"Summarize my inbox and draft replies for me.",
"Find open bugs and turn them into Linear tickets.",
"Review today's meetings and flag scheduling gaps."
],
"brandColor": "#3B82F6",
"composerIcon": "./assets/icon.png",
"logo": "./assets/logo.png",
"screenshots": [
"./assets/screenshot1.png",
"./assets/screenshot2.png",
"./assets/screenshot3.png"
]
}
}
```
## Field guide
### Top-level fields
- `name` (`string`): Plugin identifier (kebab-case, no spaces). Required whenever `plugin.json` is provided; it is used as the manifest name and the component namespace.
- `version` (`string`): Plugin semantic version.
- `description` (`string`): Short purpose summary.
- `author` (`object`): Publisher identity.
- `name` (`string`): Author or team name.
- `email` (`string`): Contact email.
- `url` (`string`): Author/team homepage or profile URL.
- `homepage` (`string`): Documentation URL for plugin usage.
- `repository` (`string`): Source code URL.
- `license` (`string`): License identifier (for example `MIT`, `Apache-2.0`).
- `keywords` (`array` of `string`): Search/discovery tags.
- `skills` (`string`): Relative path to skill directories/files.
- `hooks` (`string`): Hook config path.
- `mcpServers` (`string`): MCP config path.
- `apps` (`string`): App manifest path for plugin integrations.
- `interface` (`object`): Interface/UX metadata block for plugin presentation.
### `interface` fields
- `displayName` (`string`): User-facing title shown for the plugin.
- `shortDescription` (`string`): Brief subtitle used in compact views.
- `longDescription` (`string`): Longer description used on details screens.
- `developerName` (`string`): Human-readable publisher name.
- `category` (`string`): Plugin category bucket.
- `capabilities` (`array` of `string`): Capability list from implementation.
- `websiteURL` (`string`): Public website for the plugin.
- `privacyPolicyURL` (`string`): Privacy policy URL.
- `termsOfServiceURL` (`string`): Terms of service URL.
- `defaultPrompt` (`array` of `string`): Starter prompts shown in composer/UX context.
- Include at most 3 strings. Entries after the first 3 are ignored.
- Each string is capped at 128 characters. Longer entries are truncated.
- Prefer short starter prompts around 50 characters so they scan well in the UI.
- `brandColor` (`string`): Theme color for the plugin card.
- `composerIcon` (`string`): Path to icon asset.
- `logo` (`string`): Path to logo asset.
- `screenshots` (`array` of `string`): List of screenshot asset paths.
- Screenshot entries must be PNG filenames and stored under `./assets/`.
- Keep file paths relative to plugin root.
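
To preview how the `defaultPrompt` limits above would treat a list of prompts, a small Python sketch can apply the same caps locally. The helper names are hypothetical, and this mirrors only the limits stated in this guide (3 entries, 128 characters each, about 50 characters preferred); it is not the ingestion code.

```python
MAX_DEFAULT_PROMPTS = 3      # entries after the first 3 are ignored
MAX_PROMPT_LENGTH = 128      # longer entries are truncated
PREFERRED_PROMPT_LENGTH = 50  # guideline for prompts that scan well in the UI


def clamp_default_prompts(prompts):
    """Hypothetical helper: apply the documented count and length caps."""
    return [prompt[:MAX_PROMPT_LENGTH] for prompt in prompts[:MAX_DEFAULT_PROMPTS]]


def prompts_over_preferred_length(prompts):
    """Return the kept prompts that exceed the ~50 character guideline."""
    return [p for p in clamp_default_prompts(prompts) if len(p) > PREFERRED_PROMPT_LENGTH]
```

Running a check like this before writing the manifest makes it visible when a prompt would be silently dropped or cut.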
### Path conventions and defaults
- Path values should be relative and begin with `./`.
- `skills`, `hooks`, and `mcpServers` are supplemented on top of default component discovery; they do not replace defaults.
- Custom path values must follow the plugin root convention and naming/namespacing rules.
- This repo's scaffold writes `.codex-plugin/plugin.json`; treat that as the manifest location this skill generates.
# Marketplace JSON sample spec
`marketplace.json` depends on where the plugin should live. New plugin creation defaults to the
personal marketplace unless the caller explicitly requests a repo-local destination:
- Personal plugin: `~/.agents/plugins/marketplace.json`
- Repo/team plugin: `<repo-root>/.agents/plugins/marketplace.json`
```json
{
"name": "openai-curated",
"interface": {
"displayName": "ChatGPT Official"
},
"plugins": [
{
"name": "linear",
"source": {
"source": "local",
"path": "./plugins/linear"
},
"policy": {
"installation": "AVAILABLE",
"authentication": "ON_INSTALL"
},
"category": "Productivity"
}
]
}
```
## Marketplace field guide
### Top-level fields
- `name` (`string`): Marketplace identifier or catalog name.
- `interface` (`object`, optional): Marketplace presentation metadata.
- `plugins` (`array`): Ordered plugin entries. This order determines how Codex renders plugins.
### `interface` fields
- `displayName` (`string`, optional): User-facing marketplace title.
### Plugin entry fields
- `name` (`string`): Plugin identifier. Match the plugin folder name and `plugin.json` `name`.
- `source` (`object`): Plugin source descriptor.
  - `source` (`string`): Use `local` for this repo workflow.
  - `path` (`string`): Relative plugin path based on the marketplace root.
    - Personal plugin in `~/.agents/plugins/marketplace.json`: `./plugins/<plugin-name>`
    - Repo/team plugin: `./plugins/<plugin-name>`
    - The same relative path convention is used for both personal and repo/team marketplaces.
    - Example: with `~/.agents/plugins/marketplace.json`, `./plugins/<plugin-name>` resolves to
      `~/plugins/<plugin-name>`.
- `policy` (`object`): Marketplace policy block. Always include it.
  - `installation` (`string`): Availability policy.
    - Allowed values: `NOT_AVAILABLE`, `AVAILABLE`, `INSTALLED_BY_DEFAULT`
    - Default for new entries: `AVAILABLE`
  - `authentication` (`string`): Authentication timing policy.
    - Allowed values: `ON_INSTALL`, `ON_USE`
    - Default for new entries: `ON_INSTALL`
  - `products` (`array` of `string`, optional): Product override for this plugin entry. Omit it unless product gating is explicitly requested.
- `category` (`string`): Display category bucket. Always include it.
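
The path rule above can be illustrated with a minimal sketch. It assumes the marketplace root is the directory that contains `.agents/`, which is what the `~/plugins/<plugin-name>` example implies; `marketplace_root` and `resolve_plugin_path` are hypothetical helper names and are not part of the scaffold scripts.

```python
from pathlib import Path, PurePosixPath


def marketplace_root(marketplace_path: Path) -> Path:
    """Return the directory that holds `.agents/plugins/marketplace.json` (assumed layout)."""
    expected_tail = (".agents", "plugins", "marketplace.json")
    if marketplace_path.parts[-3:] != expected_tail:
        raise ValueError(f"{marketplace_path} is not a .agents/plugins/marketplace.json file")
    # parents[0] is .agents/plugins, parents[1] is .agents, parents[2] is the root.
    return marketplace_path.parents[2]


def resolve_plugin_path(marketplace_path: Path, source_path: str) -> Path:
    """Resolve a `source.path` value such as `./plugins/linear` against the marketplace root."""
    relative = PurePosixPath(source_path)
    if relative.is_absolute() or ".." in relative.parts:
        raise ValueError(f"{source_path!r} must be a relative path inside the marketplace root")
    # PurePosixPath drops the leading "./", so only the real segments are joined.
    return marketplace_root(marketplace_path).joinpath(*relative.parts)


if __name__ == "__main__":
    personal = Path.home() / ".agents" / "plugins" / "marketplace.json"
    print(resolve_plugin_path(personal, "./plugins/linear"))
```

With the personal marketplace this prints a path under `<home>/plugins/`, and the same call with a repo-rooted marketplace path would land under `<repo-root>/plugins/`.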
### Marketplace generation rules
- `displayName` belongs under the top-level `interface` object, not individual plugin entries.
- When creating a new marketplace file from scratch, seed `interface.displayName` alongside top-level `name`.
- Always include `policy.installation`, `policy.authentication`, and `category` on every generated or updated plugin entry.
- Treat `policy.products` as an override and omit it unless explicitly requested.
- Append new entries unless the user explicitly requests reordering.
- Replace an existing entry for the same plugin only when overwrite is intentional.
- Default new plugin creation to the personal marketplace.
- Use a repo/team marketplace only when the user specifically requests that destination.
- Only override the marketplace `name` when the default `personal` name is already taken or
installed and you need to seed a different new marketplace file.
- Choose marketplace location to match the selected destination:
  - Personal plugin: `~/.agents/plugins/marketplace.json`
  - Repo/team plugin: `<repo-root>/.agents/plugins/marketplace.json`
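
As a rough illustration of the "always include" rules, the sketch below checks one plugin entry for `policy.installation`, `policy.authentication`, and `category`. The allowed values come from the field guide; `check_marketplace_entry` is a hypothetical helper, not something the scaffold ships, and the error wording is made up for the example.

```python
from __future__ import annotations

from typing import Any

VALID_INSTALL_POLICIES = {"NOT_AVAILABLE", "AVAILABLE", "INSTALLED_BY_DEFAULT"}
VALID_AUTH_POLICIES = {"ON_INSTALL", "ON_USE"}


def check_marketplace_entry(entry: dict[str, Any]) -> list[str]:
    """Return the problems found in one marketplace plugin entry (empty list if none)."""
    problems: list[str] = []

    policy = entry.get("policy")
    if not isinstance(policy, dict):
        problems.append("`policy` must be an object")
    else:
        installation = policy.get("installation")
        if not isinstance(installation, str) or installation not in VALID_INSTALL_POLICIES:
            problems.append("`policy.installation` must be one of the allowed values")
        authentication = policy.get("authentication")
        if not isinstance(authentication, str) or authentication not in VALID_AUTH_POLICIES:
            problems.append("`policy.authentication` must be one of the allowed values")

    category = entry.get("category")
    if not isinstance(category, str) or not category.strip():
        problems.append("`category` must be a non-empty string")

    return problems


if __name__ == "__main__":
    sample = {
        "name": "linear",
        "source": {"source": "local", "path": "./plugins/linear"},
        "policy": {"installation": "AVAILABLE", "authentication": "ON_INSTALL"},
        "category": "Productivity",
    }
    print(check_marketplace_entry(sample))
```

Returning a list of messages rather than raising on the first problem lets a caller report every missing field in one pass.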
### Plugin validation notes
- The validator mirrors the workspace plugin ingestion schema so generated plugins follow the same
manifest contract from the start.
- Plugin manifests must include real values for `name`, `version`, `description`,
`author.name`, and the required `interface` fields.
- `version` must use strict semver.
- `websiteURL`, `privacyPolicyURL`, and `termsOfServiceURL` must be absolute `https://` URLs when
present.
- `composerIcon`, `logo`, and `screenshots` must point to real files inside the plugin archive when
present.
- `apps` and `mcpServers` should appear in `plugin.json` only when `.app.json` and `.mcp.json`
actually exist.
- Validation rejects unsupported manifest fields such as `hooks`, so the scaffold keeps them out of
generated manifests.
- Run `scripts/validate_plugin.py <plugin-path>` before handing back a generated plugin. It adds one
intentional preflight check that rejects leftover `[TODO: ...]` placeholders.
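
To make "strict semver" concrete, here is a small sketch built around the same pattern that `scripts/validate_plugin.py` in this change defines as `SEMVER_RE`. The wrapper `is_strict_semver` is a hypothetical name added only for this example.

```python
import re

# Same pattern as SEMVER_RE in scripts/validate_plugin.py:
# MAJOR.MINOR.PATCH, optional -prerelease, optional +build metadata.
SEMVER_RE = re.compile(
    r"^(0|[1-9]\d*)\."
    r"(0|[1-9]\d*)\."
    r"(0|[1-9]\d*)"
    r"(?:-(?:0|[1-9]\d*|\d*[A-Za-z-][0-9A-Za-z-]*)(?:\."
    r"(?:0|[1-9]\d*|\d*[A-Za-z-][0-9A-Za-z-]*))*)?"
    r"(?:\+[0-9A-Za-z-]+(?:\.[0-9A-Za-z-]+)*)?$"
)


def is_strict_semver(version: str) -> bool:
    """Return True when the whole string matches the strict semver pattern."""
    return SEMVER_RE.fullmatch(version) is not None


if __name__ == "__main__":
    for candidate in ("0.1.0", "1.2.3-beta.1", "0.1.0+codex.20260610171223", "1.0", "v1.2.3", "01.2.3"):
        print(candidate, is_strict_semver(candidate))
```

The first three candidates match, including a version that carries a `+codex.<cachebuster>` build suffix. The last three do not: `1.0` lacks a patch number, `v1.2.3` has a prefix, and `01.2.3` has a leading zero.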
@@ -0,0 +1,324 @@
#!/usr/bin/env python3
"""Scaffold a plugin directory and optionally update marketplace.json."""
from __future__ import annotations
import argparse
import json
import re
from pathlib import Path
from typing import Any
MAX_PLUGIN_NAME_LENGTH = 64
DEFAULT_INSTALL_POLICY = "AVAILABLE"
DEFAULT_AUTH_POLICY = "ON_INSTALL"
DEFAULT_CATEGORY = "Productivity"
DEFAULT_MARKETPLACE_NAME = "personal"
VALID_INSTALL_POLICIES = {"NOT_AVAILABLE", "AVAILABLE", "INSTALLED_BY_DEFAULT"}
VALID_AUTH_POLICIES = {"ON_INSTALL", "ON_USE"}
DEFAULT_PLUGIN_PARENT = Path.home() / "plugins"
DEFAULT_MARKETPLACE_PATH = Path.home() / ".agents" / "plugins" / "marketplace.json"
def normalize_plugin_name(plugin_name: str) -> str:
"""Normalize a plugin name to lowercase hyphen-case."""
normalized = plugin_name.strip().lower()
normalized = re.sub(r"[^a-z0-9]+", "-", normalized)
normalized = normalized.strip("-")
normalized = re.sub(r"-{2,}", "-", normalized)
return normalized
def validate_plugin_name(plugin_name: str) -> None:
if not plugin_name:
raise ValueError("Plugin name must include at least one letter or digit.")
if len(plugin_name) > MAX_PLUGIN_NAME_LENGTH:
raise ValueError(
f"Plugin name '{plugin_name}' is too long ({len(plugin_name)} characters). "
f"Maximum is {MAX_PLUGIN_NAME_LENGTH} characters."
)
def validate_marketplace_name(marketplace_name: str) -> None:
if not marketplace_name:
raise ValueError("Marketplace name must include at least one letter or digit.")
if re.fullmatch(r"[A-Za-z0-9_-]+", marketplace_name) is None:
raise ValueError(
"Marketplace name may only contain ASCII letters, digits, `_`, and `-`."
)
def display_name_from_plugin_name(plugin_name: str) -> str:
return " ".join(part.capitalize() for part in re.split(r"[-_]+", plugin_name))
def build_plugin_json(plugin_name: str, *, with_mcp: bool, with_apps: bool) -> dict[str, Any]:
display_name = display_name_from_plugin_name(plugin_name)
payload: dict[str, Any] = {
"name": plugin_name,
"version": "0.1.0",
"description": f"{display_name} plugin",
"author": {
"name": "Local developer",
},
"skills": "./skills/",
"interface": {
"displayName": display_name,
"shortDescription": f"Use {display_name} in Codex.",
"longDescription": f"{display_name} adds a local Codex plugin scaffold.",
"developerName": "Local developer",
"category": DEFAULT_CATEGORY,
"capabilities": [],
"defaultPrompt": f"Help me use {display_name}.",
},
}
if with_mcp:
payload["mcpServers"] = "./.mcp.json"
if with_apps:
payload["apps"] = "./.app.json"
return payload
def build_marketplace_entry(
plugin_name: str,
install_policy: str,
auth_policy: str,
category: str,
) -> dict[str, Any]:
return {
"name": plugin_name,
"source": {
"source": "local",
"path": f"./plugins/{plugin_name}",
},
"policy": {
"installation": install_policy,
"authentication": auth_policy,
},
"category": category,
}
def load_json(path: Path) -> dict[str, Any]:
with path.open() as handle:
return json.load(handle)
def build_default_marketplace(marketplace_name: str) -> dict[str, Any]:
return {
"name": marketplace_name,
"interface": {
"displayName": display_name_from_plugin_name(marketplace_name),
},
"plugins": [],
}
def validate_marketplace_interface(payload: dict[str, Any]) -> None:
interface = payload.get("interface")
if interface is not None and not isinstance(interface, dict):
raise ValueError("marketplace.json field 'interface' must be an object.")
def update_marketplace_json(
marketplace_path: Path,
marketplace_name: str | None,
plugin_name: str,
install_policy: str,
auth_policy: str,
category: str,
force: bool,
) -> None:
if marketplace_path.exists():
payload = load_json(marketplace_path)
else:
payload = build_default_marketplace(marketplace_name or DEFAULT_MARKETPLACE_NAME)
if not isinstance(payload, dict):
raise ValueError(f"{marketplace_path} must contain a JSON object.")
validate_marketplace_interface(payload)
existing_marketplace_name = payload.get("name")
if marketplace_name is not None:
if not isinstance(existing_marketplace_name, str) or not existing_marketplace_name.strip():
raise ValueError(f"{marketplace_path} must contain a non-empty string 'name'.")
if existing_marketplace_name != marketplace_name:
raise ValueError(
f"{marketplace_path} already uses marketplace name "
f"'{existing_marketplace_name}'. Create a new marketplace file to use "
f"'{marketplace_name}' instead."
)
plugins = payload.setdefault("plugins", [])
if not isinstance(plugins, list):
raise ValueError(f"{marketplace_path} field 'plugins' must be an array.")
new_entry = build_marketplace_entry(plugin_name, install_policy, auth_policy, category)
for index, entry in enumerate(plugins):
if isinstance(entry, dict) and entry.get("name") == plugin_name:
if not force:
raise FileExistsError(
f"Marketplace entry '{plugin_name}' already exists in {marketplace_path}. "
"Use --force to overwrite that entry."
)
plugins[index] = new_entry
break
else:
plugins.append(new_entry)
write_json(marketplace_path, payload, force=True)
def write_json(path: Path, data: dict, force: bool) -> None:
if path.exists() and not force:
raise FileExistsError(f"{path} already exists. Use --force to overwrite.")
path.parent.mkdir(parents=True, exist_ok=True)
with path.open("w") as handle:
json.dump(data, handle, indent=2)
handle.write("\n")
def create_stub_file(path: Path, payload: dict, force: bool) -> None:
if path.exists() and not force:
return
path.parent.mkdir(parents=True, exist_ok=True)
with path.open("w") as handle:
json.dump(payload, handle, indent=2)
handle.write("\n")
def parse_args() -> argparse.Namespace:
parser = argparse.ArgumentParser(
description="Create a plugin skeleton with a validation-ready plugin.json."
)
parser.add_argument("plugin_name")
parser.add_argument(
"--path",
default=str(DEFAULT_PLUGIN_PARENT),
help=(
"Parent directory for plugin creation (defaults to <home>/plugins). "
"Pass an explicit repo path only when a repo/team plugin is intended."
),
)
parser.add_argument("--with-skills", action="store_true", help="Create skills/ directory")
parser.add_argument("--with-hooks", action="store_true", help="Create hooks/ directory")
parser.add_argument("--with-scripts", action="store_true", help="Create scripts/ directory")
parser.add_argument("--with-assets", action="store_true", help="Create assets/ directory")
parser.add_argument("--with-mcp", action="store_true", help="Create .mcp.json placeholder")
parser.add_argument("--with-apps", action="store_true", help="Create .app.json placeholder")
parser.add_argument(
"--with-marketplace",
action="store_true",
help=(
"Create or update <home>/.agents/plugins/marketplace.json by default. "
"Marketplace entries always point to ./plugins/<plugin-name> relative to the "
"marketplace root."
),
)
parser.add_argument(
"--marketplace-path",
default=str(DEFAULT_MARKETPLACE_PATH),
help=(
"Path to marketplace.json (defaults to <home>/.agents/plugins/marketplace.json). "
"Pass a repo-rooted marketplace path only when a repo/team plugin is intended."
),
)
parser.add_argument(
"--marketplace-name",
help=(
"Marketplace name to seed into a new marketplace.json. Use this only when the default "
"'personal' marketplace name is already taken and you need a different new marketplace."
),
)
parser.add_argument(
"--install-policy",
default=DEFAULT_INSTALL_POLICY,
choices=sorted(VALID_INSTALL_POLICIES),
help="Marketplace policy.installation value",
)
parser.add_argument(
"--auth-policy",
default=DEFAULT_AUTH_POLICY,
choices=sorted(VALID_AUTH_POLICIES),
help="Marketplace policy.authentication value",
)
parser.add_argument(
"--category",
default=DEFAULT_CATEGORY,
help="Marketplace category value",
)
parser.add_argument("--force", action="store_true", help="Overwrite existing files")
return parser.parse_args()
def main() -> None:
args = parse_args()
raw_plugin_name = args.plugin_name
plugin_name = normalize_plugin_name(raw_plugin_name)
if plugin_name != raw_plugin_name:
print(f"Note: Normalized plugin name from '{raw_plugin_name}' to '{plugin_name}'.")
validate_plugin_name(plugin_name)
marketplace_name = None
if args.marketplace_name is not None:
marketplace_name = args.marketplace_name.strip()
validate_marketplace_name(marketplace_name)
plugin_root = (Path(args.path).expanduser().resolve() / plugin_name)
plugin_root.mkdir(parents=True, exist_ok=True)
plugin_json_path = plugin_root / ".codex-plugin" / "plugin.json"
write_json(
plugin_json_path,
build_plugin_json(plugin_name, with_mcp=args.with_mcp, with_apps=args.with_apps),
args.force,
)
optional_directories = {
"skills": args.with_skills,
"hooks": args.with_hooks,
"scripts": args.with_scripts,
"assets": args.with_assets,
}
for folder, enabled in optional_directories.items():
if enabled:
(plugin_root / folder).mkdir(parents=True, exist_ok=True)
if args.with_mcp:
create_stub_file(
plugin_root / ".mcp.json",
{"mcpServers": {}},
args.force,
)
if args.with_apps:
create_stub_file(
plugin_root / ".app.json",
{
"apps": {},
},
args.force,
)
if args.with_marketplace:
marketplace_path = Path(args.marketplace_path).expanduser().resolve()
update_marketplace_json(
marketplace_path,
marketplace_name,
plugin_name,
args.install_policy,
args.auth_policy,
args.category,
args.force,
)
print(f"Created plugin scaffold: {plugin_root}")
print(f"plugin manifest: {plugin_json_path}")
if args.with_marketplace:
print(f"marketplace manifest: {marketplace_path}")
if __name__ == "__main__":
main()
@@ -0,0 +1,48 @@
#!/usr/bin/env python3
"""Print the top-level marketplace name from any marketplace.json file."""

from __future__ import annotations

import argparse
import json
import sys
from pathlib import Path


def default_marketplace_path() -> Path:
    return Path.home() / ".agents" / "plugins" / "marketplace.json"


def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser(
        description=(
            "Print the top-level marketplace name from marketplace.json. Defaults to the personal "
            "marketplace path under the current home directory."
        )
    )
    parser.add_argument(
        "--marketplace-path",
        default=str(default_marketplace_path()),
        help="Path to marketplace.json",
    )
    return parser.parse_args()


def main() -> None:
    args = parse_args()
    marketplace_path = Path(args.marketplace_path).expanduser().resolve()
    payload = json.loads(marketplace_path.read_text(encoding="utf-8"))
    if not isinstance(payload, dict):
        raise ValueError(f"{marketplace_path} must contain a JSON object.")
    name = payload.get("name")
    if not isinstance(name, str) or not name.strip():
        raise ValueError(f"{marketplace_path} must contain a non-empty string 'name'.")
    print(name.strip())


if __name__ == "__main__":
    try:
        main()
    except Exception as err:  # noqa: BLE001 - CLI should surface a single clear message.
        print(str(err), file=sys.stderr)
        raise SystemExit(1) from err
@@ -0,0 +1,78 @@
#!/usr/bin/env python3
"""Rewrite a local plugin version to a single Codex cachebuster suffix."""

from __future__ import annotations

import argparse
import json
import re
import sys
from datetime import datetime, timezone
from pathlib import Path

CACHEBUSTER_PREFIX = "codex"


def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser(
        description=(
            "Rewrite a local plugin's version so it preserves everything before '+' and uses "
            "a single +codex.<cachebuster> suffix."
        )
    )
    parser.add_argument("plugin_path", help="Path to the plugin root directory")
    parser.add_argument(
        "--cachebuster",
        help="Optional cachebuster token to embed in the plugin version",
    )
    return parser.parse_args()


def main() -> None:
    args = parse_args()
    plugin_root = Path(args.plugin_path).expanduser().resolve()
    manifest_path = plugin_root / ".codex-plugin" / "plugin.json"
    manifest = load_manifest(manifest_path)
    version = manifest.get("version")
    if not isinstance(version, str) or not version.strip():
        raise ValueError(f"{manifest_path} must contain a non-empty string 'version'.")
    cachebuster = sanitize_cachebuster(args.cachebuster or default_cachebuster())
    next_version = with_cachebuster(version, cachebuster)
    manifest["version"] = next_version
    manifest_path.write_text(json.dumps(manifest, indent=2) + "\n", encoding="utf-8")
    print(f"Updated plugin version: {version} -> {next_version}")


def load_manifest(manifest_path: Path) -> dict[str, object]:
    if not manifest_path.is_file():
        raise FileNotFoundError(f"missing manifest: {manifest_path}")
    payload = json.loads(manifest_path.read_text(encoding="utf-8"))
    if not isinstance(payload, dict):
        raise ValueError(f"{manifest_path} must contain a JSON object.")
    return payload


def sanitize_cachebuster(value: str) -> str:
    sanitized = re.sub(r"[^a-z0-9-]+", "-", value.strip().lower())
    sanitized = re.sub(r"-{2,}", "-", sanitized).strip("-")
    if not sanitized:
        raise ValueError("Cachebuster must contain at least one letter or digit.")
    return sanitized


def default_cachebuster() -> str:
    return datetime.now(timezone.utc).strftime("%Y%m%d%H%M%S")


def with_cachebuster(version: str, cachebuster: str) -> str:
    version_prefix = version.split("+", 1)[0]
    return f"{version_prefix}+{CACHEBUSTER_PREFIX}.{cachebuster}"


if __name__ == "__main__":
    try:
        main()
    except Exception as err:  # noqa: BLE001 - CLI should surface a single clear message.
        print(str(err), file=sys.stderr)
        raise SystemExit(1) from err
@@ -0,0 +1,586 @@
#!/usr/bin/env python3
"""Validate a generated plugin against the plugin ingestion contract."""
from __future__ import annotations
import argparse
import json
import re
from pathlib import Path, PurePosixPath
from typing import Any
from urllib.parse import urlparse
import yaml
TODO_MARKER = "[TODO:"
SEMVER_RE = re.compile(
r"^(0|[1-9]\d*)\."
r"(0|[1-9]\d*)\."
r"(0|[1-9]\d*)"
r"(?:-(?:0|[1-9]\d*|\d*[A-Za-z-][0-9A-Za-z-]*)(?:\."
r"(?:0|[1-9]\d*|\d*[A-Za-z-][0-9A-Za-z-]*))*)?"
r"(?:\+[0-9A-Za-z-]+(?:\.[0-9A-Za-z-]+)*)?$"
)
HEX_COLOR_RE = re.compile(r"^#[0-9A-F]{6}$", re.IGNORECASE)
def parse_args() -> argparse.Namespace:
parser = argparse.ArgumentParser(description="Validate a local Codex plugin.")
parser.add_argument("plugin_path", help="Path to the plugin root directory")
return parser.parse_args()
def main() -> None:
args = parse_args()
plugin_root = Path(args.plugin_path).expanduser().resolve()
errors = validate_plugin(plugin_root)
if errors:
print("Plugin validation failed:")
for error in errors:
print(f"- {error}")
raise SystemExit(1)
print(f"Plugin validation passed: {plugin_root}")
def validate_plugin(plugin_root: Path) -> list[str]:
errors: list[str] = []
manifest_path = plugin_root / ".codex-plugin" / "plugin.json"
manifest = load_json_object(manifest_path, errors)
if manifest is None:
return errors
reject_todo_markers(manifest, "$", errors)
validate_manifest_shape(plugin_root, manifest, errors)
return errors
def load_json_object(path: Path, errors: list[str]) -> dict[str, Any] | None:
if not path.is_file():
errors.append("missing `.codex-plugin/plugin.json`")
return None
try:
payload = json.loads(path.read_text(encoding="utf-8"))
except OSError:
errors.append("unable to read `.codex-plugin/plugin.json`")
return None
except json.JSONDecodeError:
errors.append("`.codex-plugin/plugin.json` must be valid JSON")
return None
if not isinstance(payload, dict):
errors.append("`.codex-plugin/plugin.json` must contain a JSON object")
return None
return payload
def reject_todo_markers(value: Any, path: str, errors: list[str]) -> None:
if isinstance(value, str):
if TODO_MARKER in value:
errors.append(f"{path} still contains a `[TODO: ...]` placeholder")
return
if isinstance(value, list):
for index, item in enumerate(value):
reject_todo_markers(item, f"{path}[{index}]", errors)
return
if isinstance(value, dict):
for key, item in value.items():
reject_todo_markers(item, f"{path}.{key}", errors)
def validate_manifest_shape(
plugin_root: Path,
manifest: dict[str, Any],
errors: list[str],
) -> None:
allowed_keys = {
"id",
"name",
"version",
"description",
"skills",
"apps",
"mcpServers",
"interface",
"author",
"homepage",
"repository",
"license",
"keywords",
}
for key in sorted(set(manifest) - allowed_keys):
errors.append(f"plugin.json field `{key}` is not accepted by plugin validation")
validate_optional_non_empty_string(manifest, "id", errors)
require_non_empty_string(manifest, "name", errors)
version = require_non_empty_string(manifest, "version", errors)
if version is not None and SEMVER_RE.fullmatch(version) is None:
errors.append("plugin.json field `version` must be strict semver")
require_non_empty_string(manifest, "description", errors)
author = require_object(manifest, "author", errors)
if author is not None:
reject_unknown_fields(author, {"name", "email", "url"}, "author", errors)
require_non_empty_string(author, "name", errors, prefix="author")
validate_optional_non_empty_string(author, "email", errors, prefix="author")
validate_optional_https_url(author, "url", errors, prefix="author")
validate_optional_contract_path(manifest, "skills", "skills", errors)
validate_optional_contract_path(manifest, "apps", ".app.json", errors)
validate_optional_contract_path(manifest, "mcpServers", ".mcp.json", errors)
if manifest.get("apps") is not None:
validate_app_manifest(
plugin_root / ".app.json",
errors,
)
if manifest.get("mcpServers") is not None:
validate_mcp_manifest(
plugin_root / ".mcp.json",
errors,
)
validate_skill_manifests(plugin_root, errors)
interface = require_object(manifest, "interface", errors)
if interface is None:
return
reject_unknown_fields(
interface,
{
"displayName",
"shortDescription",
"longDescription",
"developerName",
"category",
"capabilities",
"websiteURL",
"privacyPolicyURL",
"termsOfServiceURL",
"brandColor",
"composerIcon",
"logo",
"screenshots",
"defaultPrompt",
"default_prompt",
},
"interface",
errors,
)
for field in (
"displayName",
"shortDescription",
"longDescription",
"developerName",
"category",
):
require_non_empty_string(interface, field, errors, prefix="interface")
if "defaultPrompt" not in interface and "default_prompt" not in interface:
errors.append(
"plugin.json field `interface.defaultPrompt` or `interface.default_prompt` is required"
)
capabilities = interface.get("capabilities")
if not isinstance(capabilities, list) or not all(
isinstance(value, str) and value.strip() for value in capabilities
):
errors.append("plugin.json field `interface.capabilities` must be an array of strings")
for field in ("websiteURL", "privacyPolicyURL", "termsOfServiceURL"):
validate_optional_https_url(interface, field, errors, prefix="interface")
brand_color = interface.get("brandColor")
if brand_color is not None and (
not isinstance(brand_color, str) or HEX_COLOR_RE.fullmatch(brand_color) is None
):
errors.append("plugin.json field `interface.brandColor` must use `#RRGGBB`")
for field in ("composerIcon", "logo"):
validate_optional_asset_path(plugin_root, plugin_root, interface, field, errors)
screenshots = interface.get("screenshots", [])
if not isinstance(screenshots, list):
errors.append("plugin.json field `interface.screenshots` must be an array")
else:
for index, raw_path in enumerate(screenshots):
validate_asset_path(
plugin_root,
plugin_root,
raw_path,
f"interface.screenshots[{index}]",
errors,
)
def require_object(
payload: dict[str, Any],
key: str,
errors: list[str],
) -> dict[str, Any] | None:
value = payload.get(key)
if not isinstance(value, dict):
errors.append(f"plugin.json field `{key}` must be an object")
return None
return value
def require_non_empty_string(
payload: dict[str, Any],
key: str,
errors: list[str],
*,
prefix: str | None = None,
) -> str | None:
value = payload.get(key)
field = f"{prefix}.{key}" if prefix is not None else key
if not isinstance(value, str) or not value.strip():
errors.append(f"plugin.json field `{field}` must be a non-empty string")
return None
return value
def validate_optional_non_empty_string(
payload: dict[str, Any],
key: str,
errors: list[str],
*,
prefix: str | None = None,
) -> None:
value = payload.get(key)
if value is None:
return
field = f"{prefix}.{key}" if prefix is not None else key
if not isinstance(value, str) or not value.strip():
errors.append(f"plugin.json field `{field}` must be a non-empty string")
def reject_unknown_fields(
payload: dict[str, Any],
allowed_keys: set[str],
prefix: str,
errors: list[str],
) -> None:
for key in sorted(set(payload) - allowed_keys):
errors.append(f"plugin.json field `{prefix}.{key}` is not accepted by plugin validation")
def validate_optional_https_url(
payload: dict[str, Any],
key: str,
errors: list[str],
*,
prefix: str,
) -> None:
value = payload.get(key)
if value is None:
return
parsed = urlparse(value) if isinstance(value, str) else None
if parsed is None or parsed.scheme != "https" or not parsed.netloc:
errors.append(f"plugin.json field `{prefix}.{key}` must be an absolute `https://` URL")
def validate_optional_contract_path(
payload: dict[str, Any],
key: str,
expected: str,
errors: list[str],
) -> None:
value = payload.get(key)
if value is None:
return
normalized = normalize_contract_path(value) if isinstance(value, str) else None
if normalized != expected:
errors.append(f"plugin.json field `{key}` must resolve to `{expected}`")
def normalize_contract_path(raw_path: str) -> str | None:
path = Path(raw_path)
if path.is_absolute():
return None
normalized = path.as_posix().rstrip("/")
return normalized or None
def validate_app_manifest(path: Path, errors: list[str]) -> None:
payload = load_companion_json_object(path, "`.app.json`", errors)
if payload is None:
return
reject_companion_unknown_fields(payload, {"apps"}, "`.app.json`", errors)
apps = payload.get("apps")
if not isinstance(apps, dict):
errors.append("`.app.json` field `apps` must be an object")
return
for key, value in apps.items():
if not isinstance(value, dict):
errors.append(f"`.app.json` app `{key}` must be an object")
continue
reject_companion_unknown_fields(value, {"id"}, f"`.app.json` app `{key}`", errors)
app_id = value.get("id")
if not isinstance(app_id, str) or not app_id.strip():
errors.append(f"`.app.json` app `{key}` field `id` must be a non-empty string")
def validate_mcp_manifest(path: Path, errors: list[str]) -> None:
payload = load_companion_json_object(path, "`.mcp.json`", errors)
if payload is None:
return
reject_companion_unknown_fields(payload, {"mcpServers"}, "`.mcp.json`", errors)
servers = payload.get("mcpServers")
if not isinstance(servers, dict):
errors.append("`.mcp.json` field `mcpServers` must be an object")
return
for key, value in servers.items():
if not isinstance(key, str) or not key.strip():
errors.append("`.mcp.json` server names must be non-empty strings")
if not isinstance(value, dict):
errors.append(f"`.mcp.json` server `{key}` must be an object")
def load_companion_json_object(
path: Path,
label: str,
errors: list[str],
) -> dict[str, Any] | None:
if not path.is_file():
errors.append(f"{label} is required when its plugin.json field is present")
return None
try:
payload = json.loads(path.read_text(encoding="utf-8"))
except (OSError, json.JSONDecodeError):
errors.append(f"{label} must contain valid JSON")
return None
if not isinstance(payload, dict):
errors.append(f"{label} must contain a JSON object")
return None
return payload
def reject_companion_unknown_fields(
payload: dict[str, Any],
allowed_keys: set[str],
prefix: str,
errors: list[str],
) -> None:
for key in sorted(set(payload) - allowed_keys):
errors.append(f"{prefix} field `{key}` is not accepted by plugin validation")
def validate_skill_manifests(plugin_root: Path, errors: list[str]) -> None:
skills_root = plugin_root / "skills"
if not skills_root.is_dir():
return
for skill_root in sorted(skills_root.iterdir(), key=lambda path: path.name):
if skill_root.name.startswith(".") or not skill_root.is_dir():
continue
validate_skill_manifest(skill_root, errors)
def validate_skill_manifest(skill_root: Path, errors: list[str]) -> None:
skill_md_path = skill_root / "SKILL.md"
if not skill_md_path.is_file():
errors.append(f"skill `{skill_root.name}` is missing `SKILL.md`")
return
try:
contents = skill_md_path.read_text(encoding="utf-8")
except OSError:
errors.append(f"unable to read skill `{skill_root.name}`")
return
if not contents.startswith("---\n"):
errors.append(f"skill `{skill_root.name}` must start with YAML frontmatter")
return
frontmatter_end = contents.find("\n---", 4)
if frontmatter_end == -1:
errors.append(f"skill `{skill_root.name}` frontmatter is not closed")
return
try:
frontmatter = yaml.safe_load(contents[4:frontmatter_end])
except yaml.YAMLError:
errors.append(f"skill `{skill_root.name}` frontmatter must be valid YAML")
return
if not isinstance(frontmatter, dict):
errors.append(f"skill `{skill_root.name}` frontmatter must be an object")
return
skill_name = frontmatter.get("name")
if not isinstance(skill_name, str) or not skill_name.strip():
errors.append(f"skill `{skill_root.name}` frontmatter field `name` must be non-empty")
description = frontmatter.get("description")
if not isinstance(description, str) or not description.strip():
errors.append(
f"skill `{skill_root.name}` frontmatter field `description` must be non-empty"
)
disable_model_invocation = frontmatter.get("disable-model-invocation")
if disable_model_invocation is None:
disable_model_invocation = frontmatter.get("disable_model_invocation")
if disable_model_invocation not in (None, False):
errors.append(
f"skill `{skill_root.name}` frontmatter field `disable-model-invocation` must be false"
)
agent_yaml_path = skill_root / "agents" / "openai.yaml"
if agent_yaml_path.is_file():
validate_skill_agent_manifest(
plugin_root=skill_root.parent.parent,
skill_root=skill_root,
agent_yaml_path=agent_yaml_path,
errors=errors,
)
def validate_skill_agent_manifest(
*,
plugin_root: Path,
skill_root: Path,
agent_yaml_path: Path,
errors: list[str],
) -> None:
try:
payload = yaml.safe_load(agent_yaml_path.read_text(encoding="utf-8"))
except OSError:
errors.append(f"unable to read skill `{skill_root.name}` agent YAML")
return
except yaml.YAMLError:
errors.append(f"skill `{skill_root.name}` agent YAML must be valid YAML")
return
if not isinstance(payload, dict):
errors.append(f"skill `{skill_root.name}` agent YAML must be an object")
return
reject_skill_agent_unknown_fields(
payload,
{"interface", "policy", "dependencies"},
skill_root,
errors,
)
interface = payload.get("interface")
if not isinstance(interface, dict):
errors.append(f"skill `{skill_root.name}` agent field `interface` must be an object")
return
reject_skill_agent_unknown_fields(
interface,
{
"display_name",
"short_description",
"icon_small",
"icon_large",
"brand_color",
"default_prompt",
},
skill_root,
errors,
prefix="interface",
)
for field in ("display_name", "short_description"):
value = interface.get(field)
if not isinstance(value, str) or not value.strip():
errors.append(
f"skill `{skill_root.name}` agent field `interface.{field}` must be non-empty"
)
for field in ("icon_small", "icon_large"):
validate_optional_asset_path(
skill_root,
plugin_root,
interface,
field,
errors,
prefix=f"skill `{skill_root.name}` agent field `interface",
)
brand_color = interface.get("brand_color")
if brand_color is not None and (
not isinstance(brand_color, str) or HEX_COLOR_RE.fullmatch(brand_color) is None
):
errors.append(
f"skill `{skill_root.name}` agent field `interface.brand_color` must use `#RRGGBB`"
)
default_prompt = interface.get("default_prompt")
if default_prompt is not None and (
not isinstance(default_prompt, str) or not default_prompt.strip()
):
errors.append(
f"skill `{skill_root.name}` agent field `interface.default_prompt` must be non-empty"
)
policy = payload.get("policy")
if policy is not None:
if not isinstance(policy, dict):
errors.append(f"skill `{skill_root.name}` agent field `policy` must be an object")
else:
reject_skill_agent_unknown_fields(
policy,
{"allow_implicit_invocation"},
skill_root,
errors,
prefix="policy",
)
allow_implicit_invocation = policy.get("allow_implicit_invocation")
if allow_implicit_invocation is not None and not isinstance(
allow_implicit_invocation,
bool,
):
errors.append(
f"skill `{skill_root.name}` agent field "
"`policy.allow_implicit_invocation` must be a boolean"
)
dependencies = payload.get("dependencies")
if dependencies is not None:
if not isinstance(dependencies, dict):
errors.append(
f"skill `{skill_root.name}` agent field `dependencies` must be an object"
)
else:
reject_skill_agent_unknown_fields(
dependencies,
{"tools"},
skill_root,
errors,
prefix="dependencies",
)
def reject_skill_agent_unknown_fields(
    payload: dict[str, Any],
    allowed_keys: set[str],
    skill_root: Path,
    errors: list[str],
    *,
    prefix: str | None = None,
) -> None:
    for key in sorted(set(payload) - allowed_keys):
        field = f"{prefix}.{key}" if prefix is not None else key
        errors.append(
            f"skill `{skill_root.name}` agent field `{field}` is not accepted by plugin validation"
        )


def validate_optional_asset_path(
    base_dir: Path,
    allowed_root: Path,
    payload: dict[str, Any],
    key: str,
    errors: list[str],
    *,
    prefix: str = "interface",
) -> None:
    raw_path = payload.get(key)
    if raw_path is None:
        return
    validate_asset_path(base_dir, allowed_root, raw_path, f"{prefix}.{key}", errors)


def validate_asset_path(
    base_dir: Path,
    allowed_root: Path,
    raw_path: Any,
    field: str,
    errors: list[str],
) -> None:
    label = field if field.startswith("skill `") else f"plugin.json field `{field}`"
    if not isinstance(raw_path, str) or not raw_path.strip():
        errors.append(f"{label} must be a non-empty relative path")
        return
    candidate = PurePosixPath(raw_path.replace("\\", "/"))
    if candidate.is_absolute() or any(part in {"", ".", ".."} for part in candidate.parts):
        errors.append(f"{label} must stay inside the plugin archive")
        return
    resolved_path = (base_dir / candidate.as_posix()).resolve()
    if not resolved_path.is_relative_to(allowed_root.resolve()):
        errors.append(f"{label} must stay inside the plugin archive")
        return
    if not resolved_path.is_file():
        errors.append(f"{label} points to a missing file")


if __name__ == "__main__":
    main()
+416
View File
@@ -0,0 +1,416 @@
---
name: skill-creator
description: Guide for creating effective skills. This skill should be used when users want to create a new skill (or update an existing skill) that extends Codex's capabilities with specialized knowledge, workflows, or tool integrations.
metadata:
short-description: Create or update a skill
---
# Skill Creator
This skill provides guidance for creating effective skills.
## About Skills
Skills are modular, self-contained folders that extend Codex's capabilities by providing
specialized knowledge, workflows, and tools. Think of them as "onboarding guides" for specific
domains or tasks—they transform Codex from a general-purpose agent into a specialized agent
equipped with procedural knowledge that no model can fully possess.
### What Skills Provide
1. Specialized workflows - Multi-step procedures for specific domains
2. Tool integrations - Instructions for working with specific file formats or APIs
3. Domain expertise - Company-specific knowledge, schemas, business logic
4. Bundled resources - Scripts, references, and assets for complex and repetitive tasks
## Core Principles
### Concise is Key
The context window is a public good. Skills share the context window with everything else Codex needs: system prompt, conversation history, other Skills' metadata, and the actual user request.
**Default assumption: Codex is already very smart.** Only add context Codex doesn't already have. Challenge each piece of information: "Does Codex really need this explanation?" and "Does this paragraph justify its token cost?"
Prefer concise examples over verbose explanations.
### Set Appropriate Degrees of Freedom
Match the level of specificity to the task's fragility and variability:
**High freedom (text-based instructions)**: Use when multiple approaches are valid, decisions depend on context, or heuristics guide the approach.
**Medium freedom (pseudocode or scripts with parameters)**: Use when a preferred pattern exists, some variation is acceptable, or configuration affects behavior.
**Low freedom (specific scripts, few parameters)**: Use when operations are fragile and error-prone, consistency is critical, or a specific sequence must be followed.
Think of Codex as exploring a path: a narrow bridge with cliffs needs specific guardrails (low freedom), while an open field allows many routes (high freedom).
### Protect Validation Integrity
You may use subagents during iteration to validate whether a skill works on realistic tasks or whether a suspected problem is real. This is most useful when you want an independent pass on the skill's behavior, outputs, or failure modes after a revision. Only do this when it is possible to start new subagents.
When using subagents for validation, treat that as an evaluation surface. The goal is to learn whether the skill generalizes, not whether another agent can reconstruct the answer from leaked context.
Prefer raw artifacts such as example prompts, outputs, diffs, logs, or traces. Give the minimum task-local context needed to perform the validation. Avoid passing the intended answer, suspected bug, intended fix, or your prior conclusions unless the validation explicitly requires them.
### Anatomy of a Skill
Every skill consists of a required SKILL.md file and optional bundled resources:
```
skill-name/
├── SKILL.md (required)
│   ├── YAML frontmatter metadata (required)
│   │   ├── name: (required)
│   │   └── description: (required)
│   └── Markdown instructions (required)
├── agents/ (recommended)
│   └── openai.yaml - UI metadata for skill lists and chips
└── Bundled Resources (optional)
    ├── scripts/ - Executable code (Python/Bash/etc.)
    ├── references/ - Documentation intended to be loaded into context as needed
    └── assets/ - Files used in output (templates, icons, fonts, etc.)
```
#### SKILL.md (required)
Every SKILL.md consists of:
- **Frontmatter** (YAML): Contains `name` and `description` fields. These are the only fields that Codex reads to determine when the skill gets used, so describe clearly and comprehensively what the skill is and when it should be used.
- **Body** (Markdown): Instructions and guidance for using the skill. Only loaded AFTER the skill triggers (if at all).
#### Agents metadata (recommended)
- UI-facing metadata for skill lists and chips
- Read references/openai_yaml.md before generating values and follow its descriptions and constraints
- Create: human-facing `display_name`, `short_description`, and `default_prompt` by reading the skill
- Generate deterministically by passing the values as `--interface key=value` to `scripts/generate_openai_yaml.py` or `scripts/init_skill.py`
- On updates: validate `agents/openai.yaml` still matches SKILL.md; regenerate if stale
- Only include other optional interface fields (icons, brand color) if explicitly provided
- See references/openai_yaml.md for field definitions and examples
#### Bundled Resources (optional)
##### Scripts (`scripts/`)
Executable code (Python/Bash/etc.) for tasks that require deterministic reliability or are repeatedly rewritten.
- **When to include**: When the same code is being rewritten repeatedly or deterministic reliability is needed
- **Example**: `scripts/rotate_pdf.py` for PDF rotation tasks
- **Benefits**: Token efficient, deterministic, may be executed without loading into context
- **Note**: Scripts may still need to be read by Codex for patching or environment-specific adjustments
##### References (`references/`)
Documentation and reference material intended to be loaded as needed into context to inform Codex's process and thinking.
- **When to include**: For documentation that Codex should reference while working
- **Examples**: `references/finance.md` for financial schemas, `references/mnda.md` for company NDA template, `references/policies.md` for company policies, `references/api_docs.md` for API specifications
- **Use cases**: Database schemas, API documentation, domain knowledge, company policies, detailed workflow guides
- **Benefits**: Keeps SKILL.md lean, loaded only when Codex determines it's needed
- **Best practice**: If files are large (>10k words), include grep search patterns in SKILL.md
- **Avoid duplication**: Information should live in either SKILL.md or references files, not both. Prefer references files for detailed information unless it's truly core to the skill—this keeps SKILL.md lean while making information discoverable without hogging the context window. Keep only essential procedural instructions and workflow guidance in SKILL.md; move detailed reference material, schemas, and examples to references files.
##### Assets (`assets/`)
Files not intended to be loaded into context, but rather used within the output Codex produces.
- **When to include**: When the skill needs files that will be used in the final output
- **Examples**: `assets/logo.png` for brand assets, `assets/slides.pptx` for PowerPoint templates, `assets/frontend-template/` for HTML/React boilerplate, `assets/font.ttf` for typography
- **Use cases**: Templates, images, icons, boilerplate code, fonts, sample documents that get copied or modified
- **Benefits**: Separates output resources from documentation, enables Codex to use files without loading them into context
#### What Not to Include in a Skill
A skill should only contain essential files that directly support its functionality. Do NOT create extraneous documentation or auxiliary files, including:
- README.md
- INSTALLATION_GUIDE.md
- QUICK_REFERENCE.md
- CHANGELOG.md
- etc.
The skill should only contain the information needed for an AI agent to do the job at hand. It should not contain auxiliary context about the process that went into creating it, setup and testing procedures, user-facing documentation, etc. Creating additional documentation files just adds clutter and confusion.
### Progressive Disclosure Design Principle
Skills use a three-level loading system to manage context efficiently:
1. **Metadata (name + description)** - Always in context (~100 words)
2. **SKILL.md body** - When skill triggers (<5k words)
3. **Bundled resources** - As needed by Codex (Unlimited because scripts can be executed without reading into context window)
#### Progressive Disclosure Patterns
Keep SKILL.md body to the essentials and under 500 lines to minimize context bloat. Split content into separate files when approaching this limit. When splitting out content into other files, it is very important to reference them from SKILL.md and describe clearly when to read them, to ensure the reader of the skill knows they exist and when to use them.
**Key principle:** When a skill supports multiple variations, frameworks, or options, keep only the core workflow and selection guidance in SKILL.md. Move variant-specific details (patterns, examples, configuration) into separate reference files.
**Pattern 1: High-level guide with references**
```markdown
# PDF Processing
## Quick start
Extract text with pdfplumber:
[code example]
## Advanced features
- **Form filling**: See [FORMS.md](FORMS.md) for complete guide
- **API reference**: See [REFERENCE.md](REFERENCE.md) for all methods
- **Examples**: See [EXAMPLES.md](EXAMPLES.md) for common patterns
```
Codex loads FORMS.md, REFERENCE.md, or EXAMPLES.md only when needed.
**Pattern 2: Domain-specific organization**
For Skills with multiple domains, organize content by domain to avoid loading irrelevant context:
```
bigquery-skill/
├── SKILL.md (overview and navigation)
└── references/
    ├── finance.md (revenue, billing metrics)
    ├── sales.md (opportunities, pipeline)
    ├── product.md (API usage, features)
    └── marketing.md (campaigns, attribution)
```
When a user asks about sales metrics, Codex only reads sales.md.
Similarly, for skills supporting multiple frameworks or variants, organize by variant:
```
cloud-deploy/
├── SKILL.md (workflow + provider selection)
└── references/
    ├── aws.md (AWS deployment patterns)
    ├── gcp.md (GCP deployment patterns)
    └── azure.md (Azure deployment patterns)
```
When the user chooses AWS, Codex only reads aws.md.
**Pattern 3: Conditional details**
Show basic content, link to advanced content:
```markdown
# DOCX Processing
## Creating documents
Use docx-js for new documents. See [DOCX-JS.md](DOCX-JS.md).
## Editing documents
For simple edits, modify the XML directly.
**For tracked changes**: See [REDLINING.md](REDLINING.md)
**For OOXML details**: See [OOXML.md](OOXML.md)
```
Codex reads REDLINING.md or OOXML.md only when the user needs those features.
**Important guidelines:**
- **Avoid deeply nested references** - Keep references one level deep from SKILL.md. All reference files should link directly from SKILL.md.
- **Structure longer reference files** - For files longer than 100 lines, include a table of contents at the top so Codex can see the full scope when previewing.
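The size guidance above (SKILL.md under 500 lines, a table of contents for reference files longer than 100 lines) can be checked mechanically. The following is a minimal sketch, not part of the bundled scripts: `check_skill_sizes` and both constants are hypothetical names, and the check counts every line of SKILL.md, frontmatter included, as an approximation of the body length.

```python
from pathlib import Path

# Thresholds taken from the guidance above; the names are illustrative assumptions.
SKILL_MD_MAX_LINES = 500
REFERENCE_TOC_THRESHOLD = 100


def count_lines(path: Path) -> int:
    """Return the number of lines in a UTF-8 text file."""
    return len(path.read_text(encoding="utf-8").splitlines())


def check_skill_sizes(skill_dir: Path) -> list[str]:
    """Return human-readable warnings for oversized skill files."""
    warnings: list[str] = []

    skill_md = skill_dir / "SKILL.md"
    if skill_md.is_file():
        lines = count_lines(skill_md)
        # "Under 500 lines" means 500 or more is already over the limit.
        if lines >= SKILL_MD_MAX_LINES:
            warnings.append(
                f"SKILL.md has {lines} lines; consider moving detail into references/"
            )

    references = skill_dir / "references"
    if references.is_dir():
        for path in sorted(references.rglob("*.md")):
            lines = count_lines(path)
            if lines > REFERENCE_TOC_THRESHOLD:
                relative = path.relative_to(skill_dir).as_posix()
                warnings.append(
                    f"{relative} has {lines} lines; consider a table of contents at the top"
                )

    return warnings
```

The function only reports. Whether a long file really needs splitting remains a judgment call.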
## Skill Creation Process
Skill creation involves these steps:
1. Understand the skill with concrete examples
2. Plan reusable skill contents (scripts, references, assets)
3. Initialize the skill (run init_skill.py)
4. Edit the skill (implement resources and write SKILL.md)
5. Validate the skill (run quick_validate.py)
6. Iterate based on real usage and forward-test complex skills.
Follow these steps in order, skipping only if there is a clear reason why they are not applicable.
### Skill Naming
- Use lowercase letters, digits, and hyphens only; normalize user-provided titles to hyphen-case (e.g., "Plan Mode" -> `plan-mode`).
- Keep generated names under 64 characters (letters, digits, hyphens).
- Prefer short, verb-led phrases that describe the action.
- Namespace by tool when it improves clarity or triggering (e.g., `gh-address-comments`, `linear-address-issue`).
- Name the skill folder exactly after the skill name.
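The naming rules above can be sketched as a small normalizer. This is an illustration only: `normalize_skill_name` is a hypothetical helper, not a function from the bundled scripts, and it assumes that any character outside `a-z` and `0-9` (including accented letters) should collapse into a hyphen.

```python
import re

# "Under 64 characters" means at most 63.
MAX_NAME_LENGTH = 63


def normalize_skill_name(title: str) -> str:
    """Convert a free-form title to a hyphen-case skill name."""
    # Lowercase first, then replace each run of other characters with one hyphen.
    name = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    if not name:
        raise ValueError("title contains no ASCII letters or digits")
    if len(name) > MAX_NAME_LENGTH:
        raise ValueError(
            f"name is {len(name)} characters; the limit is {MAX_NAME_LENGTH}"
        )
    return name


print(normalize_skill_name("Plan Mode"))  # plan-mode
```

Raising on an over-long name, instead of truncating it, leaves the choice of a shorter name to whoever is creating the skill.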
### Step 1: Understanding the Skill with Concrete Examples
Skip this step only when the skill's usage patterns are already clearly understood. It remains valuable even when working with an existing skill.
To create an effective skill, clearly understand concrete examples of how the skill will be used. This understanding can come from either direct user examples or generated examples that are validated with user feedback.
For example, when building an image-editor skill, relevant questions include:
- "What functionality should the image-editor skill support? Editing, rotating, anything else?"
- "Can you give some examples of how this skill would be used?"
- "I can imagine users asking for things like 'Remove the red-eye from this image' or 'Rotate this image'. Are there other ways you imagine this skill being used?"
- "What would a user say that should trigger this skill?"
- "Where should I create this skill? If you do not have a preference, I will place it in `$CODEX_HOME/skills` (or `~/.codex/skills` when `CODEX_HOME` is unset) so Codex can discover it automatically."
To avoid overwhelming users, do not ask too many questions in a single message. Start with the most important questions and follow up as needed.
Conclude this step when there is a clear sense of the functionality the skill should support.
### Step 2: Planning the Reusable Skill Contents
To turn concrete examples into an effective skill, analyze each example by:
1. Considering how to execute on the example from scratch
2. Identifying what scripts, references, and assets would be helpful when executing these workflows repeatedly
Example: When building a `pdf-editor` skill to handle queries like "Help me rotate this PDF," the analysis shows:
1. Rotating a PDF requires re-writing the same code each time
2. A `scripts/rotate_pdf.py` script would be helpful to store in the skill
Example: When designing a `frontend-webapp-builder` skill for queries like "Build me a todo app" or "Build me a dashboard to track my steps," the analysis shows:
1. Writing a frontend webapp requires the same boilerplate HTML/React each time
2. An `assets/hello-world/` template containing the boilerplate HTML/React project files would be helpful to store in the skill
Example: When building a `big-query` skill to handle queries like "How many users have logged in today?" the analysis shows:
1. Querying BigQuery requires re-discovering the table schemas and relationships each time
2. A `references/schema.md` file documenting the table schemas would be helpful to store in the skill
To establish the skill's contents, analyze each concrete example to create a list of the reusable resources to include: scripts, references, and assets.
### Step 3: Initializing the Skill
At this point, it is time to actually create the skill.
Skip this step only if the skill being developed already exists. In this case, continue to the next step.
Before running `init_skill.py`, ask where the user wants the skill created. If they do not specify a location, default to `$CODEX_HOME/skills`; when `CODEX_HOME` is unset, fall back to `~/.codex/skills` so the skill is auto-discovered.
When creating a new skill from scratch, always run the `init_skill.py` script. The script conveniently generates a new template skill directory that automatically includes everything a skill requires, making the skill creation process much more efficient and reliable.
Usage:
```bash
scripts/init_skill.py <skill-name> --path <output-directory> [--resources scripts,references,assets] [--examples]
```
Examples:
```bash
scripts/init_skill.py my-skill --path "${CODEX_HOME:-$HOME/.codex}/skills"
scripts/init_skill.py my-skill --path "${CODEX_HOME:-$HOME/.codex}/skills" --resources scripts,references
scripts/init_skill.py my-skill --path ~/work/skills --resources scripts --examples
```
The script:
- Creates the skill directory at the specified path
- Generates a SKILL.md template with proper frontmatter and TODO placeholders
- Creates `agents/openai.yaml` using agent-generated `display_name`, `short_description`, and `default_prompt` passed via `--interface key=value`
- Optionally creates resource directories based on `--resources`
- Optionally adds example files when `--examples` is set
After initialization, customize the SKILL.md and add resources as needed. If you used `--examples`, replace or delete placeholder files.
Generate `display_name`, `short_description`, and `default_prompt` by reading the skill, then pass them as `--interface key=value` to `init_skill.py` or regenerate with:
```bash
scripts/generate_openai_yaml.py <path/to/skill-folder> --interface key=value
```
Only include other optional interface fields when the user explicitly provides them. For full field descriptions and examples, see references/openai_yaml.md.
### Step 4: Edit the Skill
When editing the (newly-generated or existing) skill, remember that the skill is being created for another instance of Codex to use. Include information that would be beneficial and non-obvious to Codex. Consider what procedural knowledge, domain-specific details, or reusable assets would help another Codex instance execute these tasks more effectively.
After substantial revisions, or if the skill is particularly tricky, you should use subagents to forward-test the skill on realistic tasks or artifacts. When doing so, pass the artifact under validation rather than your diagnosis of what is wrong, and keep the prompt generic enough that success depends on transferable reasoning rather than hidden ground truth.
#### Start with Reusable Skill Contents
To begin implementation, start with the reusable resources identified above: `scripts/`, `references/`, and `assets/` files. Note that this step may require user input. For example, when implementing a `brand-guidelines` skill, the user may need to provide brand assets or templates to store in `assets/`, or documentation to store in `references/`.
Added scripts must be tested by actually running them to ensure there are no bugs and that the output matches what is expected. If there are many similar scripts, only a representative sample needs to be tested to ensure confidence that they all work while balancing time to completion.
If you used `--examples`, delete any placeholder files that are not needed for the skill. Only create resource directories that are actually required.
#### Update SKILL.md
**Writing Guidelines:** Always use imperative/infinitive form.
##### Frontmatter
Write the YAML frontmatter with `name` and `description`:
- `name`: The skill name
- `description`: This is the primary triggering mechanism for your skill, and helps Codex understand when to use the skill.
  - Include both what the Skill does and specific triggers/contexts for when to use it.
  - Include all "when to use" information here - Not in the body. The body is only loaded after triggering, so "When to Use This Skill" sections in the body are not helpful to Codex.
  - Example description for a `docx` skill: "Comprehensive document creation, editing, and analysis with support for tracked changes, comments, formatting preservation, and text extraction. Use when Codex needs to work with professional documents (.docx files) for: (1) Creating new documents, (2) Modifying or editing content, (3) Working with tracked changes, (4) Adding comments, or any other document tasks"
Do not include any other fields in YAML frontmatter.
##### Body
Write instructions for using the skill and its bundled resources.
### Step 5: Validate the Skill
Once development of the skill is complete, validate the skill folder to catch basic issues early:
```bash
scripts/quick_validate.py <path/to/skill-folder>
```
The validation script checks YAML frontmatter format, required fields, and naming rules. If validation fails, fix the reported issues and run the command again.
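To make the kinds of checks named above concrete, here is a rough standard-library sketch. It is not the implementation of `quick_validate.py`: `check_frontmatter` and `NAME_PATTERN` are hypothetical, and the hand-rolled parsing only understands top-level `key: value` lines, so a real validator would more likely use a YAML parser.

```python
import re

# Lowercase letters, digits, and single hyphens between segments (assumed rule).
NAME_PATTERN = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


def check_frontmatter(skill_md_text: str) -> list[str]:
    """Return a list of problems found in a SKILL.md frontmatter block."""
    lines = skill_md_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return ["SKILL.md must start with a `---` frontmatter delimiter"]
    try:
        end = next(
            i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---"
        )
    except StopIteration:
        return ["frontmatter is missing its closing `---`"]

    fields: dict[str, str] = {}
    for line in lines[1:end]:
        # Skip blank lines and indented (nested) lines; only top-level keys matter here.
        if line[:1].isspace() or ":" not in line:
            continue
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip().strip("\"'")

    errors: list[str] = []
    for required in ("name", "description"):
        if not fields.get(required):
            errors.append(f"frontmatter field `{required}` is required")

    name = fields.get("name", "")
    if name and (len(name) >= 64 or not NAME_PATTERN.match(name)):
        errors.append(
            "`name` must be under 64 characters of lowercase letters, digits, and hyphens"
        )
    return errors
```

Returning every problem at once, instead of stopping at the first, suits the fix-and-rerun loop described above.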
### Step 6: Iterate
After testing the skill, you may find that it is complex enough to require forward-testing, or users may request improvements.
Improvement requests often come right after using the skill, with fresh context of how the skill performed.
**Forward-testing and iteration workflow:**
1. Use the skill on real tasks
2. Notice struggles or inefficiencies
3. Identify how SKILL.md or bundled resources should be updated
4. Implement changes and test again
5. Forward-test if it is reasonable and appropriate
## Forward-testing
To forward-test, launch subagents as a way to stress test the skill with minimal context.
Subagents should *not* know that they are being asked to test the skill. They should be treated as
an agent asked to perform a task by the user. Prompts to subagents should look like:
`Use $skill-x at /path/to/skill-x to solve problem y`
Not:
`Review the skill at /path/to/skill-x; pretend a user asks you to...`
Decision rule for forward-testing:
- Err on the side of forward-testing
- Ask for approval if you think there's a risk that forward-testing would:
  * take a long time,
  * require additional approvals from the user, or
  * modify live production systems
  In these cases, show the user your proposed prompt and request (1) a yes/no decision, and
  (2) any suggested modifications.
Considerations when forward-testing:
- use fresh threads for independent passes
- pass the skill and a request phrased the way a user would phrase it
- pass raw artifacts, not your conclusions
- avoid showing expected answers or intended fixes
- rebuild context from source artifacts after each iteration
- review the subagent's output, reasoning, and emitted artifacts
- avoid leaving artifacts the agent can find on disk between iterations;
  clean up subagents' artifacts to avoid additional contamination.
If forward-testing only succeeds when subagents see leaked context, tighten the skill or the
forward-testing setup before trusting the result.
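Two of the mechanics above, the user-style prompt and the cleanup between iterations, can be sketched in Python. Both helpers are hypothetical illustrations. How a subagent is actually launched depends on the environment and is deliberately left out.

```python
import shutil
import tempfile
from collections.abc import Iterator
from contextlib import contextmanager
from pathlib import Path


def build_forward_test_prompt(skill_name: str, skill_path: str, task: str) -> str:
    """Phrase the request the way a user would, with no hint that this is a test."""
    return f"Use ${skill_name} at {skill_path} to {task}"


@contextmanager
def fresh_workspace(artifacts: list[Path]) -> Iterator[Path]:
    """Copy raw artifacts into a new temp directory and delete it afterwards."""
    workspace = Path(tempfile.mkdtemp(prefix="forward-test-"))
    try:
        for artifact in artifacts:
            shutil.copy2(artifact, workspace / artifact.name)
        yield workspace
    finally:
        # Remove everything so a later pass cannot find leftovers on disk.
        shutil.rmtree(workspace, ignore_errors=True)


print(build_forward_test_prompt("skill-x", "/path/to/skill-x", "solve problem y"))
# Use $skill-x at /path/to/skill-x to solve problem y
```

Each pass runs inside its own `with fresh_workspace([...]) as workspace:` block, so nothing a subagent writes there survives into the next iteration.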
@@ -0,0 +1,5 @@
interface:
  display_name: "Skill Creator"
  short_description: "Create or update a skill"
  icon_small: "./assets/skill-creator-small.svg"
  icon_large: "./assets/skill-creator.png"
@@ -0,0 +1,3 @@
<svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" fill="currentColor" viewBox="0 0 20 20">
<path fill="#0D0D0D" d="M12.03 4.113a3.612 3.612 0 0 1 5.108 5.108l-6.292 6.29c-.324.324-.56.561-.791.752l-.235.176c-.205.14-.422.261-.65.36l-.229.093a4.136 4.136 0 0 1-.586.16l-.764.134-2.394.4c-.142.024-.294.05-.423.06-.098.007-.232.01-.378-.026l-.149-.05a1.081 1.081 0 0 1-.521-.474l-.046-.093a1.104 1.104 0 0 1-.075-.527c.01-.129.035-.28.06-.422l.398-2.394c.1-.602.162-.987.295-1.35l.093-.23c.1-.228.22-.445.36-.65l.176-.235c.19-.232.428-.467.751-.79l6.292-6.292Zm-5.35 7.232c-.35.35-.534.535-.66.688l-.11.147a2.67 2.67 0 0 0-.24.433l-.062.154c-.08.22-.124.462-.232 1.112l-.398 2.394-.001.001h.003l2.393-.399.717-.126a2.63 2.63 0 0 0 .394-.105l.154-.063a2.65 2.65 0 0 0 .433-.24l.147-.11c.153-.126.339-.31.688-.66l4.988-4.988-3.227-3.226-4.987 4.988Zm9.517-6.291a2.281 2.281 0 0 0-3.225 0l-.364.362 3.226 3.227.363-.364c.89-.89.89-2.334 0-3.225ZM4.583 1.783a.3.3 0 0 1 .294.241c.117.585.347 1.092.707 1.48.357.385.859.668 1.549.783a.3.3 0 0 1 0 .592c-.69.115-1.192.398-1.549.783-.315.34-.53.77-.657 1.265l-.05.215a.3.3 0 0 1-.588 0c-.117-.585-.347-1.092-.707-1.48-.357-.384-.859-.668-1.549-.783a.3.3 0 0 1 0-.592c.69-.115 1.192-.398 1.549-.783.36-.388.59-.895.707-1.48l.015-.05a.3.3 0 0 1 .279-.19Z"/>
</svg>


Binary file not shown.


@@ -0,0 +1,202 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
@@ -0,0 +1,49 @@
# openai.yaml fields (full example + descriptions)
`agents/openai.yaml` is an extended, product-specific config intended for the machine/harness to read, not the agent. Other product-specific config can also live in the `agents/` folder.
## Full example
```yaml
interface:
  display_name: "Optional user-facing name"
  short_description: "Optional user-facing description"
  icon_small: "./assets/small-400px.png"
  icon_large: "./assets/large-logo.svg"
  brand_color: "#3B82F6"
  default_prompt: "Optional surrounding prompt to use the skill with"

dependencies:
  tools:
    - type: "mcp"
      value: "github"
      description: "GitHub MCP server"
      transport: "streamable_http"
      url: "https://api.githubcopilot.com/mcp/"

policy:
  allow_implicit_invocation: true
```
## Field descriptions and constraints
Top-level constraints:
- Quote all string values.
- Keep keys unquoted.
- For `interface.default_prompt`: generate a helpful, short (typically 1 sentence) example starting prompt based on the skill. It must explicitly mention the skill as `$skill-name` (e.g., "Use $skill-name-here to draft a concise weekly status update.").
- `interface.display_name`: Human-facing title shown in UI skill lists and chips.
- `interface.short_description`: Human-facing short UI blurb (25-64 chars) for quick scanning.
- `interface.icon_small`: Path to a small icon asset (relative to skill dir). Default to `./assets/` and place icons in the skill's `assets/` folder.
- `interface.icon_large`: Path to a larger logo asset (relative to skill dir). Default to `./assets/` and place icons in the skill's `assets/` folder.
- `interface.brand_color`: Hex color used for UI accents (e.g., badges).
- `interface.default_prompt`: Default prompt snippet inserted when invoking the skill.
- `dependencies.tools[].type`: Dependency category. Only `mcp` is supported for now.
- `dependencies.tools[].value`: Identifier of the tool or dependency.
- `dependencies.tools[].description`: Human-readable explanation of the dependency.
- `dependencies.tools[].transport`: Connection type when `type` is `mcp`.
- `dependencies.tools[].url`: MCP server URL when `type` is `mcp`.
- `policy.allow_implicit_invocation`: When false, the skill is not injected into the model context by default, but can still be invoked explicitly via `$skill`. Defaults to true.
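
The constraints above can be checked mechanically. The following is a minimal sketch, not part of the shipped scripts: `check_interface` and `HEX_COLOR` are hypothetical names, the function takes an already-parsed `interface` mapping, and the six-digit hex pattern is an assumption based on the `#3B82F6` example rather than a documented rule.

```python
import re

# Assumption: brand colors are six-digit hex values such as "#3B82F6".
HEX_COLOR = re.compile(r"^#[0-9A-Fa-f]{6}$")


def check_interface(interface, skill_name):
    """Return a list of problems found in a parsed `interface` mapping."""
    problems = []

    # short_description is documented as a 25-64 character blurb.
    description = interface.get("short_description", "")
    if not 25 <= len(description) <= 64:
        problems.append(
            f"short_description must be 25-64 characters (got {len(description)})"
        )

    # default_prompt must mention the skill explicitly as $skill-name.
    prompt = interface.get("default_prompt")
    if prompt is not None and f"${skill_name}" not in prompt:
        problems.append(f"default_prompt must mention ${skill_name}")

    # brand_color is optional; when present it should look like a hex color.
    color = interface.get("brand_color")
    if color is not None and not HEX_COLOR.match(color):
        problems.append(f"brand_color is not a hex color: {color!r}")

    return problems
```

An empty list means the mapping passed all three checks. Any other entry describes one violated constraint, so a caller can print them all at once instead of stopping at the first.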
@@ -0,0 +1,226 @@
#!/usr/bin/env python3
"""
OpenAI YAML Generator - Creates agents/openai.yaml for a skill folder.
Usage:
generate_openai_yaml.py <skill_dir> [--name <skill_name>] [--interface key=value]
"""
import argparse
import re
import sys
from pathlib import Path
ACRONYMS = {
"GH",
"MCP",
"API",
"CI",
"CLI",
"LLM",
"PDF",
"PR",
"UI",
"URL",
"SQL",
}
BRANDS = {
"openai": "OpenAI",
"openapi": "OpenAPI",
"github": "GitHub",
"pagerduty": "PagerDuty",
"datadog": "DataDog",
"sqlite": "SQLite",
"fastapi": "FastAPI",
}
SMALL_WORDS = {"and", "or", "to", "up", "with"}
ALLOWED_INTERFACE_KEYS = {
"display_name",
"short_description",
"icon_small",
"icon_large",
"brand_color",
"default_prompt",
}
def yaml_quote(value):
    escaped = value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


def format_display_name(skill_name):
    words = [word for word in skill_name.split("-") if word]
    formatted = []
    for index, word in enumerate(words):
        lower = word.lower()
        upper = word.upper()
        if upper in ACRONYMS:
            formatted.append(upper)
            continue
        if lower in BRANDS:
            formatted.append(BRANDS[lower])
            continue
        if index > 0 and lower in SMALL_WORDS:
            formatted.append(lower)
            continue
        formatted.append(word.capitalize())
    return " ".join(formatted)


def generate_short_description(display_name):
    description = f"Help with {display_name} tasks"

    if len(description) < 25:
        description = f"Help with {display_name} tasks and workflows"
    if len(description) < 25:
        description = f"Help with {display_name} tasks with guidance"

    if len(description) > 64:
        description = f"Help with {display_name}"
    if len(description) > 64:
        description = f"{display_name} helper"
    if len(description) > 64:
        description = f"{display_name} tools"
    if len(description) > 64:
        suffix = " helper"
        max_name_length = 64 - len(suffix)
        trimmed = display_name[:max_name_length].rstrip()
        description = f"{trimmed}{suffix}"
    if len(description) > 64:
        description = description[:64].rstrip()

    if len(description) < 25:
        description = f"{description} workflows"
        if len(description) > 64:
            description = description[:64].rstrip()

    return description
def read_frontmatter_name(skill_dir):
    skill_md = Path(skill_dir) / "SKILL.md"
    if not skill_md.exists():
        print(f"[ERROR] SKILL.md not found in {skill_dir}")
        return None
    content = skill_md.read_text()
    match = re.match(r"^---\n(.*?)\n---", content, re.DOTALL)
    if not match:
        print("[ERROR] Invalid SKILL.md frontmatter format.")
        return None
    frontmatter_text = match.group(1)

    import yaml

    try:
        frontmatter = yaml.safe_load(frontmatter_text)
    except yaml.YAMLError as exc:
        print(f"[ERROR] Invalid YAML frontmatter: {exc}")
        return None
    if not isinstance(frontmatter, dict):
        print("[ERROR] Frontmatter must be a YAML dictionary.")
        return None
    name = frontmatter.get("name", "")
    if not isinstance(name, str) or not name.strip():
        print("[ERROR] Frontmatter 'name' is missing or invalid.")
        return None
    return name.strip()


def parse_interface_overrides(raw_overrides):
    overrides = {}
    optional_order = []
    for item in raw_overrides:
        if "=" not in item:
            print(f"[ERROR] Invalid interface override '{item}'. Use key=value.")
            return None, None
        key, value = item.split("=", 1)
        key = key.strip()
        value = value.strip()
        if not key:
            print(f"[ERROR] Invalid interface override '{item}'. Key is empty.")
            return None, None
        if key not in ALLOWED_INTERFACE_KEYS:
            allowed = ", ".join(sorted(ALLOWED_INTERFACE_KEYS))
            print(f"[ERROR] Unknown interface field '{key}'. Allowed: {allowed}")
            return None, None
        overrides[key] = value
        if key not in ("display_name", "short_description") and key not in optional_order:
            optional_order.append(key)
    return overrides, optional_order
def write_openai_yaml(skill_dir, skill_name, raw_overrides):
    overrides, optional_order = parse_interface_overrides(raw_overrides)
    if overrides is None:
        return None

    display_name = overrides.get("display_name") or format_display_name(skill_name)
    short_description = overrides.get("short_description") or generate_short_description(display_name)

    if not (25 <= len(short_description) <= 64):
        print(
            "[ERROR] short_description must be 25-64 characters "
            f"(got {len(short_description)})."
        )
        return None

    interface_lines = [
        "interface:",
        f" display_name: {yaml_quote(display_name)}",
        f" short_description: {yaml_quote(short_description)}",
    ]

    for key in optional_order:
        value = overrides.get(key)
        if value is not None:
            interface_lines.append(f" {key}: {yaml_quote(value)}")

    agents_dir = Path(skill_dir) / "agents"
    agents_dir.mkdir(parents=True, exist_ok=True)
    output_path = agents_dir / "openai.yaml"
    output_path.write_text("\n".join(interface_lines) + "\n")
    print("[OK] Created agents/openai.yaml")
    return output_path


def main():
    parser = argparse.ArgumentParser(
        description="Create agents/openai.yaml for a skill directory.",
    )
    parser.add_argument("skill_dir", help="Path to the skill directory")
    parser.add_argument(
        "--name",
        help="Skill name override (defaults to SKILL.md frontmatter)",
    )
    parser.add_argument(
        "--interface",
        action="append",
        default=[],
        help="Interface override in key=value format (repeatable)",
    )
    args = parser.parse_args()

    skill_dir = Path(args.skill_dir).resolve()
    if not skill_dir.exists():
        print(f"[ERROR] Skill directory not found: {skill_dir}")
        sys.exit(1)
    if not skill_dir.is_dir():
        print(f"[ERROR] Path is not a directory: {skill_dir}")
        sys.exit(1)

    skill_name = args.name or read_frontmatter_name(skill_dir)
    if not skill_name:
        sys.exit(1)

    result = write_openai_yaml(skill_dir, skill_name, args.interface)
    if result:
        sys.exit(0)
    sys.exit(1)


if __name__ == "__main__":
    main()
@@ -0,0 +1,400 @@
#!/usr/bin/env python3
"""
Skill Initializer - Creates a new skill from template
Usage:
init_skill.py <skill-name> --path <path> [--resources scripts,references,assets] [--examples] [--interface key=value]
Examples:
init_skill.py my-new-skill --path skills/public
init_skill.py my-new-skill --path skills/public --resources scripts,references
init_skill.py my-api-helper --path skills/private --resources scripts --examples
init_skill.py custom-skill --path /custom/location
init_skill.py my-skill --path skills/public --interface short_description="Short UI label"
"""
import argparse
import re
import sys
from pathlib import Path
from generate_openai_yaml import write_openai_yaml
MAX_SKILL_NAME_LENGTH = 64
ALLOWED_RESOURCES = {"scripts", "references", "assets"}
SKILL_TEMPLATE = """---
name: {skill_name}
description: [TODO: Complete and informative explanation of what the skill does and when to use it. Include WHEN to use this skill - specific scenarios, file types, or tasks that trigger it.]
---
# {skill_title}
## Overview
[TODO: 1-2 sentences explaining what this skill enables]
## Structuring This Skill
[TODO: Choose the structure that best fits this skill's purpose. Common patterns:
**1. Workflow-Based** (best for sequential processes)
- Works well when there are clear step-by-step procedures
- Example: DOCX skill with "Workflow Decision Tree" -> "Reading" -> "Creating" -> "Editing"
- Structure: ## Overview -> ## Workflow Decision Tree -> ## Step 1 -> ## Step 2...
**2. Task-Based** (best for tool collections)
- Works well when the skill offers different operations/capabilities
- Example: PDF skill with "Quick Start" -> "Merge PDFs" -> "Split PDFs" -> "Extract Text"
- Structure: ## Overview -> ## Quick Start -> ## Task Category 1 -> ## Task Category 2...
**3. Reference/Guidelines** (best for standards or specifications)
- Works well for brand guidelines, coding standards, or requirements
- Example: Brand styling with "Brand Guidelines" -> "Colors" -> "Typography" -> "Features"
- Structure: ## Overview -> ## Guidelines -> ## Specifications -> ## Usage...
**4. Capabilities-Based** (best for integrated systems)
- Works well when the skill provides multiple interrelated features
- Example: Product Management with "Core Capabilities" -> numbered capability list
- Structure: ## Overview -> ## Core Capabilities -> ### 1. Feature -> ### 2. Feature...
Patterns can be mixed and matched as needed. Most skills combine patterns (e.g., start with task-based, add workflow for complex operations).
Delete this entire "Structuring This Skill" section when done - it's just guidance.]
## [TODO: Replace with the first main section based on chosen structure]
[TODO: Add content here. See examples in existing skills:
- Code samples for technical skills
- Decision trees for complex workflows
- Concrete examples with realistic user requests
- References to scripts/templates/references as needed]
## Resources (optional)
Create only the resource directories this skill actually needs. Delete this section if no resources are required.
### scripts/
Executable code (Python/Bash/etc.) that can be run directly to perform specific operations.
**Examples from other skills:**
- PDF skill: `fill_fillable_fields.py`, `extract_form_field_info.py` - utilities for PDF manipulation
- DOCX skill: `document.py`, `utilities.py` - Python modules for document processing
**Appropriate for:** Python scripts, shell scripts, or any executable code that performs automation, data processing, or specific operations.
**Note:** Scripts may be executed without loading into context, but can still be read by Codex for patching or environment adjustments.
### references/
Documentation and reference material intended to be loaded into context to inform Codex's process and thinking.
**Examples from other skills:**
- Product management: `communication.md`, `context_building.md` - detailed workflow guides
- BigQuery: API reference documentation and query examples
- Finance: Schema documentation, company policies
**Appropriate for:** In-depth documentation, API references, database schemas, comprehensive guides, or any detailed information that Codex should reference while working.
### assets/
Files not intended to be loaded into context, but rather used within the output Codex produces.
**Examples from other skills:**
- Brand styling: PowerPoint template files (.pptx), logo files
- Frontend builder: HTML/React boilerplate project directories
- Typography: Font files (.ttf, .woff2)
**Appropriate for:** Templates, boilerplate code, document templates, images, icons, fonts, or any files meant to be copied or used in the final output.
---
**Not every skill requires all three types of resources.**
"""
EXAMPLE_SCRIPT = '''#!/usr/bin/env python3
"""
Example helper script for {skill_name}
This is a placeholder script that can be executed directly.
Replace with actual implementation or delete if not needed.
Example real scripts from other skills:
- pdf/scripts/fill_fillable_fields.py - Fills PDF form fields
- pdf/scripts/convert_pdf_to_images.py - Converts PDF pages to images
"""
def main():
    print("This is an example script for {skill_name}")
    # TODO: Add actual script logic here
    # This could be data processing, file conversion, API calls, etc.

if __name__ == "__main__":
    main()
'''
EXAMPLE_REFERENCE = """# Reference Documentation for {skill_title}
This is a placeholder for detailed reference documentation.
Replace with actual reference content or delete if not needed.
Example real reference docs from other skills:
- product-management/references/communication.md - Comprehensive guide for status updates
- product-management/references/context_building.md - Deep-dive on gathering context
- bigquery/references/ - API references and query examples
## When Reference Docs Are Useful
Reference docs are ideal for:
- Comprehensive API documentation
- Detailed workflow guides
- Complex multi-step processes
- Information too lengthy for main SKILL.md
- Content that's only needed for specific use cases
## Structure Suggestions
### API Reference Example
- Overview
- Authentication
- Endpoints with examples
- Error codes
- Rate limits
### Workflow Guide Example
- Prerequisites
- Step-by-step instructions
- Common patterns
- Troubleshooting
- Best practices
"""
EXAMPLE_ASSET = """# Example Asset File
This placeholder represents where asset files would be stored.
Replace with actual asset files (templates, images, fonts, etc.) or delete if not needed.
Asset files are NOT intended to be loaded into context, but rather used within
the output Codex produces.
Example asset files from other skills:
- Brand guidelines: logo.png, slides_template.pptx
- Frontend builder: hello-world/ directory with HTML/React boilerplate
- Typography: custom-font.ttf, font-family.woff2
- Data: sample_data.csv, test_dataset.json
## Common Asset Types
- Templates: .pptx, .docx, boilerplate directories
- Images: .png, .jpg, .svg, .gif
- Fonts: .ttf, .otf, .woff, .woff2
- Boilerplate code: Project directories, starter files
- Icons: .ico, .svg
- Data files: .csv, .json, .xml, .yaml
Note: This is a text placeholder. Actual assets can be any file type.
"""
def normalize_skill_name(skill_name):
    """Normalize a skill name to lowercase hyphen-case."""
    normalized = skill_name.strip().lower()
    normalized = re.sub(r"[^a-z0-9]+", "-", normalized)
    normalized = normalized.strip("-")
    normalized = re.sub(r"-{2,}", "-", normalized)
    return normalized


def title_case_skill_name(skill_name):
    """Convert hyphenated skill name to Title Case for display."""
    return " ".join(word.capitalize() for word in skill_name.split("-"))


def parse_resources(raw_resources):
    if not raw_resources:
        return []
    resources = [item.strip() for item in raw_resources.split(",") if item.strip()]
    invalid = sorted({item for item in resources if item not in ALLOWED_RESOURCES})
    if invalid:
        allowed = ", ".join(sorted(ALLOWED_RESOURCES))
        print(f"[ERROR] Unknown resource type(s): {', '.join(invalid)}")
        print(f" Allowed: {allowed}")
        sys.exit(1)
    deduped = []
    seen = set()
    for resource in resources:
        if resource not in seen:
            deduped.append(resource)
            seen.add(resource)
    return deduped
def create_resource_dirs(skill_dir, skill_name, skill_title, resources, include_examples):
    for resource in resources:
        resource_dir = skill_dir / resource
        resource_dir.mkdir(exist_ok=True)
        if resource == "scripts":
            if include_examples:
                example_script = resource_dir / "example.py"
                example_script.write_text(EXAMPLE_SCRIPT.format(skill_name=skill_name))
                example_script.chmod(0o755)
                print("[OK] Created scripts/example.py")
            else:
                print("[OK] Created scripts/")
        elif resource == "references":
            if include_examples:
                example_reference = resource_dir / "api_reference.md"
                example_reference.write_text(EXAMPLE_REFERENCE.format(skill_title=skill_title))
                print("[OK] Created references/api_reference.md")
            else:
                print("[OK] Created references/")
        elif resource == "assets":
            if include_examples:
                example_asset = resource_dir / "example_asset.txt"
                example_asset.write_text(EXAMPLE_ASSET)
                print("[OK] Created assets/example_asset.txt")
            else:
                print("[OK] Created assets/")
def init_skill(skill_name, path, resources, include_examples, interface_overrides):
    """
    Initialize a new skill directory with template SKILL.md.

    Args:
        skill_name: Name of the skill
        path: Path where the skill directory should be created
        resources: Resource directories to create
        include_examples: Whether to create example files in resource directories
        interface_overrides: Interface overrides as key=value strings, passed to write_openai_yaml

    Returns:
        Path to created skill directory, or None if error
    """
    # Determine skill directory path
    skill_dir = Path(path).resolve() / skill_name

    # Check if directory already exists
    if skill_dir.exists():
        print(f"[ERROR] Skill directory already exists: {skill_dir}")
        return None

    # Create skill directory
    try:
        skill_dir.mkdir(parents=True, exist_ok=False)
        print(f"[OK] Created skill directory: {skill_dir}")
    except Exception as e:
        print(f"[ERROR] Error creating directory: {e}")
        return None

    # Create SKILL.md from template
    skill_title = title_case_skill_name(skill_name)
    skill_content = SKILL_TEMPLATE.format(skill_name=skill_name, skill_title=skill_title)

    skill_md_path = skill_dir / "SKILL.md"
    try:
        skill_md_path.write_text(skill_content)
        print("[OK] Created SKILL.md")
    except Exception as e:
        print(f"[ERROR] Error creating SKILL.md: {e}")
        return None

    # Create agents/openai.yaml
    try:
        result = write_openai_yaml(skill_dir, skill_name, interface_overrides)
        if not result:
            return None
    except Exception as e:
        print(f"[ERROR] Error creating agents/openai.yaml: {e}")
        return None

    # Create resource directories if requested
    if resources:
        try:
            create_resource_dirs(skill_dir, skill_name, skill_title, resources, include_examples)
        except Exception as e:
            print(f"[ERROR] Error creating resource directories: {e}")
            return None

    # Print next steps
    print(f"\n[OK] Skill '{skill_name}' initialized successfully at {skill_dir}")
    print("\nNext steps:")
    print("1. Edit SKILL.md to complete the TODO items and update the description")
    if resources:
        if include_examples:
            print("2. Customize or delete the example files in scripts/, references/, and assets/")
        else:
            print("2. Add resources to scripts/, references/, and assets/ as needed")
    else:
        print("2. Create resource directories only if needed (scripts/, references/, assets/)")
    print("3. Update agents/openai.yaml if the UI metadata should differ")
    print("4. Run the validator when ready to check the skill structure")
    print(
        "5. Forward-test complex skills with realistic user requests to ensure they work as intended"
    )

    return skill_dir
def main():
    parser = argparse.ArgumentParser(
        description="Create a new skill directory with a SKILL.md template.",
    )
    parser.add_argument("skill_name", help="Skill name (normalized to hyphen-case)")
    parser.add_argument("--path", required=True, help="Output directory for the skill")
    parser.add_argument(
        "--resources",
        default="",
        help="Comma-separated list: scripts,references,assets",
    )
    parser.add_argument(
        "--examples",
        action="store_true",
        help="Create example files inside the selected resource directories",
    )
    parser.add_argument(
        "--interface",
        action="append",
        default=[],
        help="Interface override in key=value format (repeatable)",
    )
    args = parser.parse_args()

    raw_skill_name = args.skill_name
    skill_name = normalize_skill_name(raw_skill_name)
    if not skill_name:
        print("[ERROR] Skill name must include at least one letter or digit.")
        sys.exit(1)
    if len(skill_name) > MAX_SKILL_NAME_LENGTH:
        print(
            f"[ERROR] Skill name '{skill_name}' is too long ({len(skill_name)} characters). "
            f"Maximum is {MAX_SKILL_NAME_LENGTH} characters."
        )
        sys.exit(1)
    if skill_name != raw_skill_name:
        print(f"Note: Normalized skill name from '{raw_skill_name}' to '{skill_name}'.")

    resources = parse_resources(args.resources)
    if args.examples and not resources:
        print("[ERROR] --examples requires --resources to be set.")
        sys.exit(1)

    path = args.path

    print(f"Initializing skill: {skill_name}")
    print(f" Location: {path}")
    if resources:
        print(f" Resources: {', '.join(resources)}")
        if args.examples:
            print(" Examples: enabled")
    else:
        print(" Resources: none (create as needed)")
    print()

    result = init_skill(skill_name, path, resources, args.examples, args.interface)

    if result:
        sys.exit(0)
    else:
        sys.exit(1)


if __name__ == "__main__":
    main()
@@ -0,0 +1,101 @@
#!/usr/bin/env python3
"""
Quick validation script for skills - minimal version
"""
import re
import sys
from pathlib import Path
import yaml
MAX_SKILL_NAME_LENGTH = 64
def validate_skill(skill_path):
    """Basic validation of a skill"""
    skill_path = Path(skill_path)

    skill_md = skill_path / "SKILL.md"
    if not skill_md.exists():
        return False, "SKILL.md not found"

    content = skill_md.read_text()
    if not content.startswith("---"):
        return False, "No YAML frontmatter found"

    match = re.match(r"^---\n(.*?)\n---", content, re.DOTALL)
    if not match:
        return False, "Invalid frontmatter format"

    frontmatter_text = match.group(1)

    try:
        frontmatter = yaml.safe_load(frontmatter_text)
        if not isinstance(frontmatter, dict):
            return False, "Frontmatter must be a YAML dictionary"
    except yaml.YAMLError as e:
        return False, f"Invalid YAML in frontmatter: {e}"

    allowed_properties = {"name", "description", "license", "allowed-tools", "metadata"}

    unexpected_keys = set(frontmatter.keys()) - allowed_properties
    if unexpected_keys:
        allowed = ", ".join(sorted(allowed_properties))
        unexpected = ", ".join(sorted(unexpected_keys))
        return (
            False,
            f"Unexpected key(s) in SKILL.md frontmatter: {unexpected}. Allowed properties are: {allowed}",
        )

    if "name" not in frontmatter:
        return False, "Missing 'name' in frontmatter"
    if "description" not in frontmatter:
        return False, "Missing 'description' in frontmatter"

    name = frontmatter.get("name", "")
    if not isinstance(name, str):
        return False, f"Name must be a string, got {type(name).__name__}"
    name = name.strip()
    if name:
        if not re.match(r"^[a-z0-9-]+$", name):
            return (
                False,
                f"Name '{name}' should be hyphen-case (lowercase letters, digits, and hyphens only)",
            )
        if name.startswith("-") or name.endswith("-") or "--" in name:
            return (
                False,
                f"Name '{name}' cannot start/end with hyphen or contain consecutive hyphens",
            )
        if len(name) > MAX_SKILL_NAME_LENGTH:
            return (
                False,
                f"Name is too long ({len(name)} characters). "
                f"Maximum is {MAX_SKILL_NAME_LENGTH} characters.",
            )

    description = frontmatter.get("description", "")
    if not isinstance(description, str):
        return False, f"Description must be a string, got {type(description).__name__}"
    description = description.strip()
    if description:
        if "<" in description or ">" in description:
            return False, "Description cannot contain angle brackets (< or >)"
        if len(description) > 1024:
            return (
                False,
                f"Description is too long ({len(description)} characters). Maximum is 1024 characters.",
            )

    return True, "Skill is valid!"


if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("Usage: python quick_validate.py <skill_directory>")
        sys.exit(1)

    valid, message = validate_skill(sys.argv[1])
    print(message)
    sys.exit(0 if valid else 1)
@@ -0,0 +1,202 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
@@ -0,0 +1,58 @@
---
name: skill-installer
description: Install Codex skills into $CODEX_HOME/skills from a curated list or a GitHub repo path. Use when a user asks to list installable skills, install a curated skill, or install a skill from another repo (including private repos).
metadata:
  short-description: Install curated skills from openai/skills or other repos
---
# Skill Installer
Helps install skills. By default these are from https://github.com/openai/skills/tree/main/skills/.curated, but users can also provide other locations. Experimental skills live in https://github.com/openai/skills/tree/main/skills/.experimental and can be installed the same way.
Use the helper scripts based on the task:
- List skills when the user asks what is available, or if the user uses this skill without specifying what to do. Default listing is `.curated`, but you can pass `--path skills/.experimental` when they ask about experimental skills.
- Install from the curated list when the user provides a skill name.
- Install from another repo when the user provides a GitHub repo/path (including private repos).
Install skills with the helper scripts.
## Communication
When listing skills, output approximately as follows, depending on the context of the user's request. If they ask about experimental skills, list from `.experimental` instead of `.curated` and label the source accordingly:
"""
Skills from {repo}:
1. skill-1
2. skill-2 (already installed)
3. ...
Which ones would you like installed?
"""
After installing a skill, tell the user: "Restart Codex to pick up new skills."
## Scripts
All of these scripts use network, so when running in the sandbox, request escalation when running them.
- `scripts/list-skills.py` (prints skills list with installed annotations)
- `scripts/list-skills.py --format json`
- Example (experimental list): `scripts/list-skills.py --path skills/.experimental`
- `scripts/install-skill-from-github.py --repo <owner>/<repo> --path <path/to/skill> [<path/to/skill> ...]`
- `scripts/install-skill-from-github.py --url https://github.com/<owner>/<repo>/tree/<ref>/<path>`
- Example (experimental skill): `scripts/install-skill-from-github.py --repo openai/skills --path skills/.experimental/<skill-name>`
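As a rough sketch, the `--format json` output could be turned into the listing described under Communication along these lines. The helper name `render_listing` and the sample payload are hypothetical; only the `name` and `installed` keys are taken from what `list-skills.py` prints.

```python
import json


def render_listing(json_text: str, repo: str) -> str:
    """Format list-skills JSON output as a numbered listing (hypothetical helper)."""
    skills = json.loads(json_text)
    lines = [f"Skills from {repo}:"]
    for idx, skill in enumerate(skills, start=1):
        # Mirror the annotation used by the text output of list-skills.py.
        suffix = " (already installed)" if skill["installed"] else ""
        lines.append(f"{idx}. {skill['name']}{suffix}")
    lines.append("Which ones would you like installed?")
    return "\n".join(lines)


# Made-up sample in the shape printed by `scripts/list-skills.py --format json`.
sample = json.dumps(
    [
        {"name": "skill-1", "installed": False},
        {"name": "skill-2", "installed": True},
    ]
)
print(render_listing(sample, "openai/skills"))
```

In practice the JSON text would come from running the script rather than from a literal, and the wording can be adapted to the user's request.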
## Behavior and Options
- Defaults to direct download for public GitHub repos.
- In `auto` mode, if the download fails with HTTP 401, 403, or 404 (typically auth/permission errors), falls back to git sparse checkout.
- Aborts if the destination skill directory already exists.
- Installs into `$CODEX_HOME/skills/<skill-name>` (`$CODEX_HOME` defaults to `~/.codex`).
- Multiple `--path` values install multiple skills in one run, each named from its path basename. `--name` overrides the name only when a single path is given.
- Options: `--ref <ref>` (default `main`), `--dest <path>`, `--method auto|download|git`.
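The naming and destination rules above can be sketched roughly as follows. This is an illustration, not the installer itself: `plan_destinations` is a hypothetical helper, and it only assumes that a skill name is the basename of its repo path and that `--name` is honored for a single path.

```python
from __future__ import annotations

import os


def plan_destinations(
    paths: list[str], dest_root: str, name: str | None = None
) -> list[tuple[str, str]]:
    """Return (skill_name, dest_dir) pairs for the given repo paths."""
    plan = []
    for path in paths:
        # An explicit name applies only when exactly one path is installed.
        if name and len(paths) == 1:
            skill_name = name
        else:
            skill_name = os.path.basename(path.rstrip("/"))
        plan.append((skill_name, os.path.join(dest_root, skill_name)))
    return plan


def conflicts(plan: list[tuple[str, str]]) -> list[str]:
    """List destinations that already exist and would abort the install."""
    return [dest for _, dest in plan if os.path.exists(dest)]
```

Checking `conflicts(...)` before installing makes it possible to tell the user which skills are already present instead of failing partway through a multi-skill run.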
## Notes
- Curated listing is fetched from `https://github.com/openai/skills/tree/main/skills/.curated` via the GitHub API. If it is unavailable, explain the error and exit.
- Private GitHub repos can be accessed via existing git credentials or optional `GITHUB_TOKEN`/`GH_TOKEN` for download.
- Git fallback tries HTTPS first, then SSH.
- The skills at https://github.com/openai/skills/tree/main/skills/.system are preinstalled, so no need to help users install those. If they ask, just explain this. If they insist, you can download and overwrite.
- Installed annotations come from `$CODEX_HOME/skills`.
@@ -0,0 +1,5 @@
interface:
  display_name: "Skill Installer"
  short_description: "Install curated skills from openai/skills or other repos"
  icon_small: "./assets/skill-installer-small.svg"
  icon_large: "./assets/skill-installer.png"
@@ -0,0 +1,3 @@
<svg xmlns="http://www.w3.org/2000/svg" width="16" height="16" fill="currentColor" viewBox="0 0 16 16">
<path fill="#0D0D0D" d="M2.145 3.959a2.033 2.033 0 0 1 2.022-1.824h5.966c.551 0 .997 0 1.357.029.367.03.692.093.993.246l.174.098c.397.243.72.593.932 1.01l.053.114c.116.269.168.557.194.878.03.36.03.805.03 1.357v4.3a2.365 2.365 0 0 1-2.366 2.365h-1.312a2.198 2.198 0 0 1-4.377 0H4.167A2.032 2.032 0 0 1 2.135 10.5V9.333l.004-.088A.865.865 0 0 1 3 8.468l.116-.006A1.135 1.135 0 0 0 3 6.199a.865.865 0 0 1-.865-.864V4.167l.01-.208Zm1.054 1.186a2.198 2.198 0 0 1 0 4.376v.98c0 .534.433.967.968.967H6l.089.004a.866.866 0 0 1 .776.861 1.135 1.135 0 0 0 2.27 0c0-.478.387-.865.865-.865h1.5c.719 0 1.301-.583 1.301-1.301v-4.3c0-.57 0-.964-.025-1.27a1.933 1.933 0 0 0-.09-.493L12.642 4a1.47 1.47 0 0 0-.541-.585l-.102-.056c-.126-.065-.295-.11-.596-.135a17.31 17.31 0 0 0-1.27-.025H4.167a.968.968 0 0 0-.968.968v.978Z"/>
</svg>


Binary file not shown.


@@ -0,0 +1,21 @@
#!/usr/bin/env python3
"""Shared GitHub helpers for skill install scripts."""

from __future__ import annotations

import os
import urllib.request


def github_request(url: str, user_agent: str) -> bytes:
    headers = {"User-Agent": user_agent}
    token = os.environ.get("GITHUB_TOKEN") or os.environ.get("GH_TOKEN")
    if token:
        headers["Authorization"] = f"token {token}"
    req = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(req) as resp:
        return resp.read()


def github_api_contents_url(repo: str, path: str, ref: str) -> str:
    return f"https://api.github.com/repos/{repo}/contents/{path}?ref={ref}"
@@ -0,0 +1,308 @@
#!/usr/bin/env python3
"""Install a skill from a GitHub repo path into $CODEX_HOME/skills."""

from __future__ import annotations

import argparse
from dataclasses import dataclass
import os
import shutil
import subprocess
import sys
import tempfile
import urllib.error
import urllib.parse
import zipfile

from github_utils import github_request

DEFAULT_REF = "main"


@dataclass
class Args:
    url: str | None = None
    repo: str | None = None
    path: list[str] | None = None
    ref: str = DEFAULT_REF
    dest: str | None = None
    name: str | None = None
    method: str = "auto"


@dataclass
class Source:
    owner: str
    repo: str
    ref: str
    paths: list[str]
    repo_url: str | None = None


class InstallError(Exception):
    pass


def _codex_home() -> str:
    return os.environ.get("CODEX_HOME", os.path.expanduser("~/.codex"))


def _tmp_root() -> str:
    base = os.path.join(tempfile.gettempdir(), "codex")
    os.makedirs(base, exist_ok=True)
    return base


def _request(url: str) -> bytes:
    return github_request(url, "codex-skill-install")


def _parse_github_url(url: str, default_ref: str) -> tuple[str, str, str, str | None]:
    parsed = urllib.parse.urlparse(url)
    if parsed.netloc != "github.com":
        raise InstallError("Only GitHub URLs are supported for download mode.")
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < 2:
        raise InstallError("Invalid GitHub URL.")
    owner, repo = parts[0], parts[1]
    ref = default_ref
    subpath = ""
    if len(parts) > 2:
        if parts[2] in ("tree", "blob"):
            if len(parts) < 4:
                raise InstallError("GitHub URL missing ref or path.")
            ref = parts[3]
            subpath = "/".join(parts[4:])
        else:
            subpath = "/".join(parts[2:])
    return owner, repo, ref, subpath or None


def _download_repo_zip(owner: str, repo: str, ref: str, dest_dir: str) -> str:
    zip_url = f"https://codeload.github.com/{owner}/{repo}/zip/{ref}"
    zip_path = os.path.join(dest_dir, "repo.zip")
    try:
        payload = _request(zip_url)
    except urllib.error.HTTPError as exc:
        raise InstallError(f"Download failed: HTTP {exc.code}") from exc
    with open(zip_path, "wb") as file_handle:
        file_handle.write(payload)
    with zipfile.ZipFile(zip_path, "r") as zip_file:
        _safe_extract_zip(zip_file, dest_dir)
        top_levels = {name.split("/")[0] for name in zip_file.namelist() if name}
    if not top_levels:
        raise InstallError("Downloaded archive was empty.")
    if len(top_levels) != 1:
        raise InstallError("Unexpected archive layout.")
    return os.path.join(dest_dir, next(iter(top_levels)))


def _run_git(args: list[str]) -> None:
    result = subprocess.run(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
    if result.returncode != 0:
        raise InstallError(result.stderr.strip() or "Git command failed.")


def _safe_extract_zip(zip_file: zipfile.ZipFile, dest_dir: str) -> None:
    dest_root = os.path.realpath(dest_dir)
    for info in zip_file.infolist():
        extracted_path = os.path.realpath(os.path.join(dest_dir, info.filename))
        if extracted_path == dest_root or extracted_path.startswith(dest_root + os.sep):
            continue
        raise InstallError("Archive contains files outside the destination.")
    zip_file.extractall(dest_dir)


def _validate_relative_path(path: str) -> None:
    if os.path.isabs(path) or os.path.normpath(path).startswith(".."):
        raise InstallError("Skill path must be a relative path inside the repo.")


def _validate_skill_name(name: str) -> None:
    altsep = os.path.altsep
    if not name or os.path.sep in name or (altsep and altsep in name):
        raise InstallError("Skill name must be a single path segment.")
    if name in (".", ".."):
        raise InstallError("Invalid skill name.")


def _git_sparse_checkout(repo_url: str, ref: str, paths: list[str], dest_dir: str) -> str:
    repo_dir = os.path.join(dest_dir, "repo")
    clone_cmd = [
        "git",
        "clone",
        "--filter=blob:none",
        "--depth",
        "1",
        "--sparse",
        "--single-branch",
        "--branch",
        ref,
        repo_url,
        repo_dir,
    ]
    try:
        _run_git(clone_cmd)
    except InstallError:
        _run_git(
            [
                "git",
                "clone",
                "--filter=blob:none",
                "--depth",
                "1",
                "--sparse",
                "--single-branch",
                repo_url,
                repo_dir,
            ]
        )
    _run_git(["git", "-C", repo_dir, "sparse-checkout", "set", *paths])
    _run_git(["git", "-C", repo_dir, "checkout", ref])
    return repo_dir


def _validate_skill(path: str) -> None:
    if not os.path.isdir(path):
        raise InstallError(f"Skill path not found: {path}")
    skill_md = os.path.join(path, "SKILL.md")
    if not os.path.isfile(skill_md):
        raise InstallError("SKILL.md not found in selected skill directory.")


def _copy_skill(src: str, dest_dir: str) -> None:
    os.makedirs(os.path.dirname(dest_dir), exist_ok=True)
    if os.path.exists(dest_dir):
        raise InstallError(f"Destination already exists: {dest_dir}")
    shutil.copytree(src, dest_dir)


def _build_repo_url(owner: str, repo: str) -> str:
    return f"https://github.com/{owner}/{repo}.git"


def _build_repo_ssh(owner: str, repo: str) -> str:
    return f"git@github.com:{owner}/{repo}.git"


def _prepare_repo(source: Source, method: str, tmp_dir: str) -> str:
    if method in ("download", "auto"):
        try:
            return _download_repo_zip(source.owner, source.repo, source.ref, tmp_dir)
        except InstallError as exc:
            if method == "download":
                raise
            err_msg = str(exc)
            if "HTTP 401" in err_msg or "HTTP 403" in err_msg or "HTTP 404" in err_msg:
                pass
            else:
                raise
    if method in ("git", "auto"):
        repo_url = source.repo_url or _build_repo_url(source.owner, source.repo)
        try:
            return _git_sparse_checkout(repo_url, source.ref, source.paths, tmp_dir)
        except InstallError:
            repo_url = _build_repo_ssh(source.owner, source.repo)
            return _git_sparse_checkout(repo_url, source.ref, source.paths, tmp_dir)
    raise InstallError("Unsupported method.")


def _resolve_source(args: Args) -> Source:
    if args.url:
        owner, repo, ref, url_path = _parse_github_url(args.url, args.ref)
        if args.path is not None:
            paths = list(args.path)
        elif url_path:
            paths = [url_path]
        else:
            paths = []
        if not paths:
            raise InstallError("Missing --path for GitHub URL.")
        return Source(owner=owner, repo=repo, ref=ref, paths=paths)

    if not args.repo:
        raise InstallError("Provide --repo or --url.")
    if "://" in args.repo:
        return _resolve_source(
            Args(url=args.repo, repo=None, path=args.path, ref=args.ref)
        )

    repo_parts = [p for p in args.repo.split("/") if p]
    if len(repo_parts) != 2:
        raise InstallError("--repo must be in owner/repo format.")
    if not args.path:
        raise InstallError("Missing --path for --repo.")
    paths = list(args.path)
    return Source(
        owner=repo_parts[0],
        repo=repo_parts[1],
        ref=args.ref,
        paths=paths,
    )


def _default_dest() -> str:
    return os.path.join(_codex_home(), "skills")


def _parse_args(argv: list[str]) -> Args:
    parser = argparse.ArgumentParser(description="Install a skill from GitHub.")
    parser.add_argument("--repo", help="owner/repo")
    parser.add_argument("--url", help="https://github.com/owner/repo[/tree/ref/path]")
    parser.add_argument(
        "--path",
        nargs="+",
        help="Path(s) to skill(s) inside repo",
    )
    parser.add_argument("--ref", default=DEFAULT_REF)
    parser.add_argument("--dest", help="Destination skills directory")
    parser.add_argument(
        "--name", help="Destination skill name (defaults to basename of path)"
    )
    parser.add_argument(
        "--method",
        choices=["auto", "download", "git"],
        default="auto",
    )
    return parser.parse_args(argv, namespace=Args())


def main(argv: list[str]) -> int:
    args = _parse_args(argv)
    try:
        source = _resolve_source(args)
        source.ref = source.ref or args.ref
        if not source.paths:
            raise InstallError("No skill paths provided.")
        for path in source.paths:
            _validate_relative_path(path)
        dest_root = args.dest or _default_dest()
        tmp_dir = tempfile.mkdtemp(prefix="skill-install-", dir=_tmp_root())
        try:
            repo_root = _prepare_repo(source, args.method, tmp_dir)
            installed = []
            for path in source.paths:
                skill_name = args.name if len(source.paths) == 1 else None
                skill_name = skill_name or os.path.basename(path.rstrip("/"))
                _validate_skill_name(skill_name)
                if not skill_name:
                    raise InstallError("Unable to derive skill name.")
                dest_dir = os.path.join(dest_root, skill_name)
                if os.path.exists(dest_dir):
                    raise InstallError(f"Destination already exists: {dest_dir}")
                skill_src = os.path.join(repo_root, path)
                _validate_skill(skill_src)
                _copy_skill(skill_src, dest_dir)
                installed.append((skill_name, dest_dir))
        finally:
            if os.path.isdir(tmp_dir):
                shutil.rmtree(tmp_dir, ignore_errors=True)
        for skill_name, dest_dir in installed:
            print(f"Installed {skill_name} to {dest_dir}")
        return 0
    except InstallError as exc:
        print(f"Error: {exc}", file=sys.stderr)
        return 1


if __name__ == "__main__":
    raise SystemExit(main(sys.argv[1:]))
@@ -0,0 +1,107 @@
#!/usr/bin/env python3
"""List skills from a GitHub repo path."""

from __future__ import annotations

import argparse
import json
import os
import sys
import urllib.error

from github_utils import github_api_contents_url, github_request

DEFAULT_REPO = "openai/skills"
DEFAULT_PATH = "skills/.curated"
DEFAULT_REF = "main"


class ListError(Exception):
    pass


class Args(argparse.Namespace):
    repo: str
    path: str
    ref: str
    format: str


def _request(url: str) -> bytes:
    return github_request(url, "codex-skill-list")


def _codex_home() -> str:
    return os.environ.get("CODEX_HOME", os.path.expanduser("~/.codex"))


def _installed_skills() -> set[str]:
    root = os.path.join(_codex_home(), "skills")
    if not os.path.isdir(root):
        return set()
    entries = set()
    for name in os.listdir(root):
        path = os.path.join(root, name)
        if os.path.isdir(path):
            entries.add(name)
    return entries


def _list_skills(repo: str, path: str, ref: str) -> list[str]:
    api_url = github_api_contents_url(repo, path, ref)
    try:
        payload = _request(api_url)
    except urllib.error.HTTPError as exc:
        if exc.code == 404:
            raise ListError(
                "Skills path not found: "
                f"https://github.com/{repo}/tree/{ref}/{path}"
            ) from exc
        raise ListError(f"Failed to fetch skills: HTTP {exc.code}") from exc
    data = json.loads(payload.decode("utf-8"))
    if not isinstance(data, list):
        raise ListError("Unexpected skills listing response.")
    skills = [item["name"] for item in data if item.get("type") == "dir"]
    return sorted(skills)


def _parse_args(argv: list[str]) -> Args:
    parser = argparse.ArgumentParser(description="List skills.")
    parser.add_argument("--repo", default=DEFAULT_REPO)
    parser.add_argument(
        "--path",
        default=DEFAULT_PATH,
        help="Repo path to list (default: skills/.curated)",
    )
    parser.add_argument("--ref", default=DEFAULT_REF)
    parser.add_argument(
        "--format",
        choices=["text", "json"],
        default="text",
        help="Output format",
    )
    return parser.parse_args(argv, namespace=Args())


def main(argv: list[str]) -> int:
    args = _parse_args(argv)
    try:
        skills = _list_skills(args.repo, args.path, args.ref)
        installed = _installed_skills()
        if args.format == "json":
            payload = [
                {"name": name, "installed": name in installed} for name in skills
            ]
            print(json.dumps(payload))
        else:
            for idx, name in enumerate(skills, start=1):
                suffix = " (already installed)" if name in installed else ""
                print(f"{idx}. {name}{suffix}")
        return 0
    except ListError as exc:
        print(f"Error: {exc}", file=sys.stderr)
        return 1


if __name__ == "__main__":
    raise SystemExit(main(sys.argv[1:]))
+201
View File
@@ -0,0 +1,201 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf of
any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
+145
View File
@@ -0,0 +1,145 @@
---
name: "spreadsheet"
description: "Use when tasks involve creating, editing, analyzing, or formatting spreadsheets (`.xlsx`, `.csv`, `.tsv`) with formula-aware workflows, cached recalculation, and visual review."
---
# Spreadsheet Skill
## When to use
- Create new workbooks with formulas, formatting, and structured layouts.
- Read or analyze tabular data (filter, aggregate, pivot, compute metrics).
- Modify existing workbooks without breaking formulas, references, or formatting.
- Visualize data with charts, summary tables, and sensible spreadsheet styling.
- Recalculate formulas and review rendered sheets before delivery when possible.
IMPORTANT: System and user instructions always take precedence.
## Workflow
1. Confirm the file type and goal: create, edit, analyze, or visualize.
2. Prefer `openpyxl` for `.xlsx` editing and formatting. Use `pandas` for analysis and CSV/TSV workflows.
3. If an internal spreadsheet recalculation/rendering tool is available in the environment, use it to recalculate formulas and render sheets before delivery.
4. Use formulas for derived values instead of hardcoding results.
5. If layout matters, render for visual review and inspect the output.
6. Save outputs, keep filenames stable, and clean up intermediate files.
## Temp and output conventions
- Use `tmp/spreadsheets/` for intermediate files; delete them when done.
- Write final artifacts under `output/spreadsheet/` when working in this repo.
- Keep filenames stable and descriptive.
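A minimal sketch of these conventions, assuming the paths are resolved relative to a repo root passed in by the caller. The helper names `prepare_dirs` and `cleanup_tmp` and the file names in the demo are hypothetical.

```python
from __future__ import annotations

import shutil
import tempfile
from pathlib import Path

TMP_SUBDIR = Path("tmp") / "spreadsheets"
OUT_SUBDIR = Path("output") / "spreadsheet"


def prepare_dirs(root: Path) -> tuple[Path, Path]:
    """Create the intermediate and final directories under `root`."""
    tmp_dir = root / TMP_SUBDIR
    out_dir = root / OUT_SUBDIR
    tmp_dir.mkdir(parents=True, exist_ok=True)
    out_dir.mkdir(parents=True, exist_ok=True)
    return tmp_dir, out_dir


def cleanup_tmp(root: Path) -> None:
    """Delete intermediate files once the final artifact is written."""
    shutil.rmtree(root / TMP_SUBDIR, ignore_errors=True)


# Demo in a throwaway directory standing in for the repo root.
with tempfile.TemporaryDirectory() as repo_root:
    tmp_dir, out_dir = prepare_dirs(Path(repo_root))
    scratch = tmp_dir / "sales-draft.csv"
    scratch.write_text("region,total\nnorth,10\n")
    final = out_dir / "sales-summary.csv"
    shutil.copyfile(scratch, final)
    cleanup_tmp(Path(repo_root))
    print(final.exists(), tmp_dir.exists())
```

Keeping the final filename fixed across reruns means a rerun overwrites the previous artifact rather than accumulating near-duplicates.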
## Primary tooling
- Use `openpyxl` for creating/editing `.xlsx` files and preserving formatting.
- Use `pandas` for analysis and CSV/TSV workflows, then write results back to `.xlsx` or `.csv`.
- Use `openpyxl.chart` for native Excel charts when needed.
- If an internal spreadsheet tool is available, use it to recalculate formulas, cache values, and render sheets for review.
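A minimal sketch of a native chart built with `openpyxl.chart` follows. The sheet name, sample data, and anchor cell are made up for illustration.

```python
from openpyxl import Workbook
from openpyxl.chart import BarChart, Reference

wb = Workbook()
ws = wb.active
ws.title = "Sales"

# Hypothetical sample data: a header row plus three data rows.
ws.append(["Month", "Units"])
for row in [("Jan", 30), ("Feb", 45), ("Mar", 38)]:
    ws.append(row)

chart = BarChart()
chart.title = "Units by month"

# The data reference includes the header so it can become the series title.
data = Reference(ws, min_col=2, min_row=1, max_row=4)
categories = Reference(ws, min_col=1, min_row=2, max_row=4)
chart.add_data(data, titles_from_data=True)
chart.set_categories(categories)

# Anchor the chart's top-left corner at D2.
ws.add_chart(chart, "D2")
```

The chart is stored in the workbook as a native Excel object, so it stays linked to the cells rather than being a static image.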
## Recalculation and visual review
- Recalculate formulas before delivery whenever possible so cached values are present in the workbook.
- Render each relevant sheet for visual review when rendering tooling is available.
- `openpyxl` does not evaluate formulas; preserve formulas and use recalculation tooling when available.
- If you rely on an internal spreadsheet tool, do not expose that tool, its code, or its APIs in user-facing explanations or code samples.
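The following sketch shows why recalculation matters: a workbook written by `openpyxl` stores the formula text but no cached result, so loading it with `data_only=True` yields `None` until some spreadsheet application or recalculation tool has recalculated and saved it. The workbook is kept in memory here purely for convenience.

```python
import io

from openpyxl import Workbook, load_workbook

wb = Workbook()
ws = wb.active
ws["A1"] = 2
ws["A2"] = 3
ws["A3"] = "=SUM(A1:A2)"

# Save to an in-memory buffer instead of a file on disk.
buffer = io.BytesIO()
wb.save(buffer)
raw = buffer.getvalue()

# Default load returns the formula text.
formula_text = load_workbook(io.BytesIO(raw)).active["A3"].value

# data_only=True returns the cached value, which openpyxl never wrote.
cached_value = load_workbook(io.BytesIO(raw), data_only=True).active["A3"].value

print(formula_text, cached_value)
```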
## Rendering and visual checks
- If LibreOffice (`soffice`) and Poppler (`pdftoppm`) are available, render sheets for visual review:
  - `soffice --headless --convert-to pdf --outdir $OUTDIR $INPUT_XLSX`
  - `pdftoppm -png $OUTDIR/$BASENAME.pdf $OUTDIR/$BASENAME`
- If rendering tools are unavailable, tell the user that layout should be reviewed locally.
- Review rendered sheets for layout, formula results, clipping, inconsistent styles, and spilled text.
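The two commands above could be wrapped in Python roughly as follows. The function names `build_render_commands` and `render` are hypothetical; the sketch only runs the tools when both are found on `PATH` and otherwise returns `False` so the caller can ask the user to review layout locally.

```python
from __future__ import annotations

import shutil
import subprocess
from pathlib import Path


def build_render_commands(xlsx: Path, outdir: Path) -> list[list[str]]:
    """Return the soffice and pdftoppm argument lists for one workbook."""
    pdf = outdir / f"{xlsx.stem}.pdf"
    return [
        ["soffice", "--headless", "--convert-to", "pdf",
         "--outdir", str(outdir), str(xlsx)],
        ["pdftoppm", "-png", str(pdf), str(outdir / xlsx.stem)],
    ]


def render(xlsx: Path, outdir: Path) -> bool:
    """Render the workbook to PNG pages; return False if tools are missing."""
    if shutil.which("soffice") is None or shutil.which("pdftoppm") is None:
        return False
    outdir.mkdir(parents=True, exist_ok=True)
    for cmd in build_render_commands(xlsx, outdir):
        subprocess.run(cmd, check=True)
    return True


for cmd in build_render_commands(Path("report.xlsx"), Path("rendered")):
    print(" ".join(cmd))
```

Passing argument lists to `subprocess.run` avoids shell quoting problems when paths contain spaces.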
## Dependencies (install if missing)
Prefer `uv` for dependency management.
Python packages:
```
uv pip install openpyxl pandas
```
If `uv` is unavailable:
```
python3 -m pip install openpyxl pandas
```
Optional:
```
uv pip install matplotlib
```
If `uv` is unavailable:
```
python3 -m pip install matplotlib
```
System tools (for rendering):
```
# macOS (Homebrew)
brew install libreoffice poppler
# Ubuntu/Debian
sudo apt-get install -y libreoffice poppler-utils
```
If installation is not possible in this environment, tell the user which dependency is missing and how to install it locally.
## Environment
No required environment variables.
## Examples
- Runnable Codex examples (openpyxl): `references/examples/openpyxl/`
## Formula requirements
- Use formulas for derived values rather than hardcoding results.
- Do not use dynamic array functions like `FILTER`, `XLOOKUP`, `SORT`, or `SEQUENCE`.
- Keep formulas simple and legible; use helper cells for complex logic.
- Avoid volatile functions like `INDIRECT` and `OFFSET` unless required.
- Prefer cell references over magic numbers (for example, `=H6*(1+$B$3)` instead of `=H6*1.04`).
- Use absolute (`$B$4`) or relative (`B4`) references carefully so copied formulas behave correctly.
- If you need literal text that starts with `=`, prefix it with a single quote.
- Guard against `#REF!`, `#DIV/0!`, `#VALUE!`, `#N/A`, and `#NAME?` errors.
- Check for off-by-one mistakes, circular references, and incorrect ranges.
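A small sketch of several of these rules in `openpyxl` is shown below: the growth rate lives in an input cell and is referenced absolutely, and a division is guarded against `#DIV/0!`. The cell addresses and sample values are illustrative assumptions.

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active

# Input cell for the assumption instead of a magic number in the formula.
ws["A3"] = "Growth rate"
ws["B3"] = 0.04

ws["G6"] = "Revenue"
ws["H6"] = 1000
# Absolute reference to the input so the formula can be copied across.
ws["I6"] = "=H6*(1+$B$3)"

ws["G7"] = "Units"
ws["H7"] = 0
# Guard against #DIV/0! when the denominator is zero or blank.
ws["G8"] = "Revenue per unit"
ws["H8"] = '=IF(H7=0,"",H6/H7)'

print(ws["I6"].value, ws["I6"].data_type)
```

`openpyxl` treats any string that begins with `=` as a formula, which is why the cells above report data type `f`.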
## Citation requirements
- Cite sources inside the spreadsheet using plain-text URLs.
- For financial models, cite model inputs in cell comments.
- For tabular data sourced externally, add a source column when each row represents a separate item.
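One way to attach citations with `openpyxl` is sketched here, using a cell comment for a model input and a source column for row-level data. The URLs are `example.com` placeholders and the author name is arbitrary.

```python
from openpyxl import Workbook
from openpyxl.comments import Comment

wb = Workbook()
ws = wb.active

# Model input with its source in a cell comment (placeholder URL).
ws["A1"] = "Discount rate"
ws["B1"] = 0.08
ws["B1"].comment = Comment("Source: https://example.com/rates", "analyst")

# Row-level data with a plain-text source column.
ws.append([])
ws.append(["Company", "Revenue ($mm)", "Source"])
ws.append(["Acme", 120, "https://example.com/acme-10k"])
ws.append(["Globex", 95, "https://example.com/globex-10k"])

print(ws["B1"].comment.text)
```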
## Formatting requirements (existing formatted spreadsheets)
- Render and inspect a provided spreadsheet before modifying it when possible.
- Preserve existing formatting and style exactly.
- Match styles for any newly filled cells that were previously blank.
- Never overwrite established formatting unless the user explicitly asks for a redesign.
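When filling a previously blank cell, its style can be matched to a neighbouring cell by copying the style objects, roughly as below. The helper name `copy_style` is hypothetical, and the workbook is built inline only to keep the sketch self-contained.

```python
from copy import copy

from openpyxl import Workbook
from openpyxl.styles import Font


def copy_style(src, dst) -> None:
    """Copy the visible styling from one cell to another."""
    dst.font = copy(src.font)
    dst.fill = copy(src.fill)
    dst.border = copy(src.border)
    dst.alignment = copy(src.alignment)
    dst.protection = copy(src.protection)
    dst.number_format = src.number_format


wb = Workbook()
ws = wb.active

# Stand-in for an existing styled cell in a provided workbook.
ws["B2"] = 0.25
ws["B2"].font = Font(bold=True, color="0000FF")
ws["B2"].number_format = "0.0%"

# Newly filled cell that should look like the one above it.
ws["B3"] = 0.4
copy_style(ws["B2"], ws["B3"])

print(ws["B3"].number_format, ws["B3"].font.bold)
```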
## Formatting requirements (new or unstyled spreadsheets)
- Use appropriate number and date formats.
- Dates should render as dates, not plain numbers.
- Percentages should usually default to one decimal place unless the data calls for something else.
- Currencies should use the appropriate currency format.
- Headers should be visually distinct from raw inputs and derived cells.
- Use fill colors, borders, spacing, and merged cells sparingly and intentionally.
- Set row heights and column widths so content is readable without excessive whitespace.
- Do not apply borders around every filled cell.
- Group related calculations and make totals simple sums of the cells above them.
- Add whitespace to separate sections.
- Ensure text does not spill into adjacent cells.
- Avoid unsupported spreadsheet data-table features such as `=TABLE`.
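The number-format and header points above might look like this in `openpyxl`. The specific format codes, widths, and sample values are one reasonable choice, not a requirement of this skill.

```python
import datetime as dt

from openpyxl import Workbook
from openpyxl.styles import Font

wb = Workbook()
ws = wb.active

ws.append(["Date", "Margin", "Price"])
ws.append([dt.date(2024, 1, 31), 0.1234, 1250.5])

# Headers visually distinct from data.
for cell in ws[1]:
    cell.font = Font(bold=True)

# Dates render as dates, percentages with one decimal, currency with a symbol.
ws["A2"].number_format = "yyyy-mm-dd"
ws["B2"].number_format = "0.0%"
ws["C2"].number_format = '"$"#,##0.00'

# Widths wide enough that values do not clip or spill.
for col, width in {"A": 14, "B": 12, "C": 14}.items():
    ws.column_dimensions[col].width = width

print(ws["A2"].number_format, ws["B2"].number_format, ws["C2"].number_format)
```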
## Color conventions (if no style guidance)
- Blue: user input
- Black: formulas and derived values
- Green: linked or imported values
- Gray: static constants
- Orange: review or caution
- Light red: error or flag
- Purple: control or logic
- Teal: visualization anchors and KPI highlights
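A sketch of applying part of this palette as font colors follows. The hex values are assumptions chosen for illustration, since the list above names colors but not exact codes, and the `style_role` helper is hypothetical.

```python
from openpyxl import Workbook
from openpyxl.styles import Font

# Assumed hex codes for a few of the roles listed above.
ROLE_COLORS = {
    "input": "0000FF",     # blue
    "formula": "000000",   # black
    "linked": "008000",    # green
    "constant": "808080",  # gray
}


def style_role(cell, role: str) -> None:
    """Color a cell's text according to its role."""
    cell.font = Font(color=ROLE_COLORS[role])


wb = Workbook()
ws = wb.active

ws["A1"] = "Units"
ws["B1"] = 120
style_role(ws["B1"], "input")

ws["A2"] = "Units x 2"
ws["B2"] = "=B1*2"
style_role(ws["B2"], "formula")

print(ws["B1"].font.color.rgb, ws["B2"].font.color.rgb)
```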
## Finance-specific requirements
- Format zeros as `-`.
- Negative numbers should be red and in parentheses.
- Format multiples as `5.2x`.
- Always specify units in headers (for example, `Revenue ($mm)`).
- Cite sources for all raw inputs in cell comments.
- For new financial models with no user-specified style, use blue text for hardcoded inputs, black for formulas, green for internal workbook links, red for external links, and yellow fill for key assumptions that need attention.
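The zero, negative, and multiple formats can be expressed as Excel custom number formats. The codes below are one common way to write them and should be treated as a sketch; the sample figures are invented.

```python
from openpyxl import Workbook
from openpyxl.styles import Font

# positive; negative in red parentheses; zero shown as a dash
FINANCE_NUMBER = '#,##0_);[Red](#,##0);"-"'
# multiples such as 5.2x
MULTIPLE = '0.0"x"'

wb = Workbook()
ws = wb.active

# Units stated in the header.
ws.append(["Line item", "Revenue ($mm)", "EV/EBITDA"])
ws.append(["Segment A", 1200, 5.2])
ws.append(["Segment B", -350, 7.8])
ws.append(["Segment C", 0, 6.1])

for row in ws.iter_rows(min_row=2, max_row=4):
    row[1].number_format = FINANCE_NUMBER
    row[1].font = Font(color="0000FF")  # hardcoded input in blue
    row[2].number_format = MULTIPLE

print(ws["B3"].number_format, ws["C2"].number_format)
```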
## Investment banking layouts
If the spreadsheet is an IB-style model (LBO, DCF, 3-statement, valuation):
- Totals should sum the range directly above.
- Hide gridlines and use horizontal borders above totals across relevant columns.
- Section headers should be merged cells with dark fill and white text.
- Column labels for numeric data should be right-aligned; row labels should be left-aligned.
- Indent submetrics under their parent line items.
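The layout rules above could be applied with `openpyxl` along these lines. The sheet contents, fill color, and cell positions are illustrative assumptions.

```python
from openpyxl import Workbook
from openpyxl.styles import Alignment, Border, Font, PatternFill, Side

wb = Workbook()
ws = wb.active
ws.title = "Model"

# Hide gridlines for the sheet.
ws.sheet_view.showGridLines = False

# Section header: merged cells with dark fill and white text.
ws.merge_cells("A1:C1")
ws["A1"] = "Revenue build"
ws["A1"].fill = PatternFill("solid", fgColor="1F3864")
ws["A1"].font = Font(bold=True, color="FFFFFF")

# Numeric column labels are right-aligned.
for ref, label in (("B2", "FY1"), ("C2", "FY2")):
    ws[ref] = label
    ws[ref].alignment = Alignment(horizontal="right")

# Submetrics are indented under their parent line item.
for row_idx, (label, fy1, fy2) in enumerate(
    [("Product", 80, 90), ("Services", 20, 25)], start=3
):
    ws.cell(row=row_idx, column=1, value=label).alignment = Alignment(indent=1)
    ws.cell(row=row_idx, column=2, value=fy1)
    ws.cell(row=row_idx, column=3, value=fy2)

# Totals sum the range directly above, with a horizontal border on top.
ws["A5"] = "Total revenue"
top_border = Border(top=Side(style="thin"))
for col in ("A", "B", "C"):
    ws[f"{col}5"].border = top_border
ws["B5"] = "=SUM(B3:B4)"
ws["C5"] = "=SUM(C3:C4)"
```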
@@ -0,0 +1,6 @@
interface:
  display_name: "Spreadsheet Skill"
  short_description: "Create, edit, and analyze spreadsheets"
  icon_small: "./assets/spreadsheet-small.svg"
  icon_large: "./assets/spreadsheet.png"
  default_prompt: "Use $spreadsheet to create or update a spreadsheet for this task with the right formulas, structure, and formatting."
@@ -0,0 +1,3 @@
<svg xmlns="http://www.w3.org/2000/svg" width="16" height="16" fill="currentColor" viewBox="0 0 16 16">
<path fill="#0D0D0D" fill-rule="evenodd" d="M10.467 2.468c.551 0 .997 0 1.357.029.366.03.691.093.992.247l.175.097c.396.244.72.593.932 1.01l.054.114c.115.269.166.558.192.878.03.36.03.806.03 1.357v3.6c0 .551 0 .997-.03 1.357a2.76 2.76 0 0 1-.192.878l-.054.114a2.534 2.534 0 0 1-.932 1.01l-.175.097c-.3.154-.626.217-.992.247-.36.03-.806.029-1.357.029H5.534c-.552 0-.997 0-1.357-.029a2.764 2.764 0 0 1-.879-.194l-.114-.053a2.534 2.534 0 0 1-1.009-.932l-.098-.175c-.153-.301-.217-.626-.247-.992-.029-.36-.028-.806-.028-1.357V6.2c0-.551-.001-.997.028-1.357.03-.366.094-.69.247-.992a2.53 2.53 0 0 1 1.107-1.107c.302-.154.626-.217.993-.247.36-.03.805-.029 1.357-.029h4.933Zm-3.935 4.73v5.27h3.935c.569 0 .964 0 1.27-.026.3-.024.47-.07.597-.134l.1-.056c.23-.142.418-.344.541-.586l.045-.104a2 2 0 0 0 .09-.492 18 18 0 0 0 .025-1.27V7.198H6.532ZM2.866 9.8c0 .569 0 .963.025 1.27.025.3.07.47.135.596l.056.101c.141.23.343.418.585.54l.104.046c.115.041.267.07.492.09.295.023.671.024 1.205.024V7.198H2.866V9.8Zm3.666-3.666h6.603c0-.533-.002-.91-.026-1.204a1.933 1.933 0 0 0-.09-.493l-.044-.103a1.468 1.468 0 0 0-.54-.586l-.101-.056c-.127-.064-.296-.11-.596-.134a17.303 17.303 0 0 0-1.27-.026H6.531v2.602ZM5.468 3.532c-.534 0-.91.002-1.205.026-.3.024-.47.07-.596.134-.276.14-.5.365-.641.642-.065.126-.11.295-.135.596-.024.295-.025.67-.025 1.204h2.602V3.532Z" clip-rule="evenodd"/>
</svg>


Binary file not shown.


@@ -0,0 +1,51 @@
"""Create a basic spreadsheet with two sheets and a simple formula.

Usage:
  python3 create_basic_spreadsheet.py --output /tmp/basic_spreadsheet.xlsx
"""

from __future__ import annotations

import argparse
from pathlib import Path

from openpyxl import Workbook
from openpyxl.utils import get_column_letter


def main() -> None:
    parser = argparse.ArgumentParser(description="Create a basic spreadsheet with example data.")
    parser.add_argument(
        "--output",
        type=Path,
        default=Path("basic_spreadsheet.xlsx"),
        help="Output .xlsx path (default: basic_spreadsheet.xlsx)",
    )
    args = parser.parse_args()

    wb = Workbook()
    overview = wb.active
    overview.title = "Overview"
    employees = wb.create_sheet("Employees")

    overview["A1"] = "Description"
    overview["A2"] = "Awesome Company Report"

    employees.append(["Title", "Name", "Address", "Score"])
    employees.append(["Engineer", "Vicky", "90 50th Street", 98])
    employees.append(["Manager", "Alex", "500 Market Street", 92])
    employees.append(["Designer", "Jordan", "200 Pine Street", 88])

    employees["A6"] = "Total Score"
    employees["D6"] = "=SUM(D2:D4)"

    for col in range(1, 5):
        employees.column_dimensions[get_column_letter(col)].width = 20

    args.output.parent.mkdir(parents=True, exist_ok=True)
    wb.save(args.output)
    print(f"Saved workbook to {args.output}")


if __name__ == "__main__":
    main()
@@ -0,0 +1,96 @@
"""Generate a styled games scoreboard workbook using openpyxl.

Usage:
  python3 create_spreadsheet_with_styling.py --output /tmp/GamesSimpleStyling.xlsx
"""

from __future__ import annotations

import argparse
from pathlib import Path

from openpyxl import Workbook
from openpyxl.formatting.rule import FormulaRule
from openpyxl.styles import Alignment, Font, PatternFill
from openpyxl.utils import get_column_letter

HEADER_FILL_HEX = "B7E1CD"
HIGHLIGHT_FILL_HEX = "FFF2CC"


def apply_header_style(cell, fill_hex: str) -> None:
    cell.fill = PatternFill("solid", fgColor=fill_hex)
    cell.font = Font(bold=True)
    cell.alignment = Alignment(horizontal="center", vertical="center")


def apply_highlight_style(cell, fill_hex: str) -> None:
    cell.fill = PatternFill("solid", fgColor=fill_hex)
    cell.font = Font(bold=True)
    cell.alignment = Alignment(horizontal="center", vertical="center")


def populate_game_sheet(ws) -> None:
    ws.title = "GameX"
    ws.row_dimensions[2].height = 24

    widths = {"B": 18, "C": 14, "D": 14, "E": 14, "F": 40}
    for col, width in widths.items():
        ws.column_dimensions[col].width = width

    headers = ["", "Name", "Game 1 Score", "Game 2 Score", "Total Score", "Notes", ""]
    for idx, value in enumerate(headers, start=1):
        cell = ws.cell(row=2, column=idx, value=value)
        if value:
            apply_header_style(cell, HEADER_FILL_HEX)

    players = [
        ("Vicky", 12, 30, "Dominated the minigames."),
        ("Yash", 20, 10, "Emily main with strong defense."),
        ("Bobby", 1000, 1030, "Numbers look suspiciously high."),
    ]
    for row_idx, (name, g1, g2, note) in enumerate(players, start=3):
        ws.cell(row=row_idx, column=2, value=name)
        ws.cell(row=row_idx, column=3, value=g1)
        ws.cell(row=row_idx, column=4, value=g2)
        ws.cell(row=row_idx, column=5, value=f"=SUM(C{row_idx}:D{row_idx})")
        ws.cell(row=row_idx, column=6, value=note)

    ws.cell(row=7, column=2, value="Winner")
    ws.cell(row=7, column=3, value="=INDEX(B3:B5, MATCH(MAX(E3:E5), E3:E5, 0))")
    ws.cell(row=7, column=5, value="Congrats!")

    ws.merge_cells("C7:D7")
    for col in range(2, 6):
        apply_highlight_style(ws.cell(row=7, column=col), HIGHLIGHT_FILL_HEX)

    rule = FormulaRule(formula=["LEN(A2)>0"], fill=PatternFill("solid", fgColor=HEADER_FILL_HEX))
    ws.conditional_formatting.add("A2:G2", rule)


def main() -> None:
    parser = argparse.ArgumentParser(description="Create a styled games scoreboard workbook.")
    parser.add_argument(
        "--output",
        type=Path,
        default=Path("GamesSimpleStyling.xlsx"),
        help="Output .xlsx path (default: GamesSimpleStyling.xlsx)",
    )
    args = parser.parse_args()

    wb = Workbook()
    ws = wb.active
    populate_game_sheet(ws)

    for col in range(1, 8):
        col_letter = get_column_letter(col)
        if col_letter not in ws.column_dimensions:
            ws.column_dimensions[col_letter].width = 12

    args.output.parent.mkdir(parents=True, exist_ok=True)
    wb.save(args.output)
    print(f"Saved workbook to {args.output}")


if __name__ == "__main__":
    main()
@@ -0,0 +1,59 @@
"""Read an existing .xlsx and print a small summary.

If --input is not provided, this script creates a tiny sample workbook in the
system temp directory and reads that instead.
"""

from __future__ import annotations

import argparse
import tempfile
from pathlib import Path

from openpyxl import Workbook, load_workbook


def create_sample(path: Path) -> Path:
    wb = Workbook()
    ws = wb.active
    ws.title = "Sample"
    ws.append(["Item", "Qty", "Price"])
    ws.append(["Apples", 3, 1.25])
    ws.append(["Oranges", 2, 0.95])
    ws.append(["Bananas", 5, 0.75])

    ws["D1"] = "Total"
    ws["D2"] = "=B2*C2"
    ws["D3"] = "=B3*C3"
    ws["D4"] = "=B4*C4"

    wb.save(path)
    return path


def main() -> None:
    parser = argparse.ArgumentParser(description="Read an existing spreadsheet.")
    parser.add_argument("--input", type=Path, help="Path to an .xlsx file")
    args = parser.parse_args()

    if args.input:
        input_path = args.input
    else:
        tmp_dir = Path(tempfile.gettempdir())
        input_path = tmp_dir / "sample_read_existing.xlsx"
        create_sample(input_path)

    # data_only=False keeps formula text instead of cached values.
    wb = load_workbook(input_path, data_only=False)
    print(f"Loaded: {input_path}")
    print("Sheet names:", wb.sheetnames)

    for name in wb.sheetnames:
        ws = wb[name]
        max_row = ws.max_row or 0
        max_col = ws.max_column or 0
        print(f"\n== {name} (rows: {max_row}, cols: {max_col})")
        # Print at most the first five rows and columns of each sheet.
        for row in ws.iter_rows(min_row=1, max_row=min(max_row, 5), max_col=min(max_col, 5)):
            values = [cell.value for cell in row]
            print(values)


if __name__ == "__main__":
    main()
@@ -0,0 +1,79 @@
"""Create a styled spreadsheet with headers, borders, and a total row.

Usage:
  python3 styling_spreadsheet.py --output /tmp/styling_spreadsheet.xlsx
"""

from __future__ import annotations

import argparse
from pathlib import Path

from openpyxl import Workbook
from openpyxl.styles import Alignment, Border, Font, PatternFill, Side


def main() -> None:
    parser = argparse.ArgumentParser(description="Create a styled spreadsheet example.")
    parser.add_argument(
        "--output",
        type=Path,
        default=Path("styling_spreadsheet.xlsx"),
        help="Output .xlsx path (default: styling_spreadsheet.xlsx)",
    )
    args = parser.parse_args()

    wb = Workbook()
    ws = wb.active
    ws.title = "FirstGame"

    ws.merge_cells("B2:E2")
    ws["B2"] = "Name | Game 1 Score | Game 2 Score | Total Score"

    header_fill = PatternFill("solid", fgColor="B7E1CD")
    header_font = Font(bold=True)
    header_alignment = Alignment(horizontal="center", vertical="center")
    ws["B2"].fill = header_fill
    ws["B2"].font = header_font
    ws["B2"].alignment = header_alignment

    ws["B3"] = "Vicky"
    ws["C3"] = 50
    ws["D3"] = 60
    ws["E3"] = "=C3+D3"

    ws["B4"] = "John"
    ws["C4"] = 40
    ws["D4"] = 50
    ws["E4"] = "=C4+D4"

    ws["B5"] = "Jane"
    ws["C5"] = 30
    ws["D5"] = 40
    ws["E5"] = "=C5+D5"

    ws["B6"] = "Jim"
    ws["C6"] = 20
    ws["D6"] = 30
    ws["E6"] = "=C6+D6"

    ws.merge_cells("B9:E9")
    ws["B9"] = "=SUM(E3:E6)"

    thin = Side(style="thin")
    border = Border(top=thin, bottom=thin, left=thin, right=thin)
    ws["B9"].border = border
    ws["B9"].alignment = Alignment(horizontal="center")
    ws["B9"].font = Font(bold=True)

    for col in ("B", "C", "D", "E"):
        ws.column_dimensions[col].width = 18
    ws.row_dimensions[2].height = 24

    args.output.parent.mkdir(parents=True, exist_ok=True)
    wb.save(args.output)
    print(f"Saved workbook to {args.output}")


if __name__ == "__main__":
    main()