深度用好 Codex:工具、线程与自动化(英中对照)

OpenAI Codex 官方指南解析:如何通过持久线程、语音输入、浏览器工具、自动化任务与共享记忆,将 Codex 从代码助手升级为全能工作系统。英中对照,适合学习阅读。

*Getting the Most Out of Codex*

来源:OpenAI Codex 官方指南 | 英中对照版


Most developers first use coding agents for code: inspect a repository, make a diff, run tests, and open a pull request. That's still the center of gravity for Codex. But much of the work on a computer is already mediated by code: executing shell commands, browsing web pages, calling APIs, exporting documents, responding to events, and triggering automations. As those surfaces become available to Codex, it starts to feel less like a coding assistant in the narrow sense and more like a system for getting computer work done.

大多数开发者最初使用编程 AI 智能体是为了写代码:检查代码库、生成差异对比(Diff)、运行测试、发起 Pull Request。这仍然是 Codex 的核心用途。但计算机上的许多工作早已以代码为媒介:执行 Shell 命令、浏览网页、调用 API、导出文档、响应事件和触发自动化流程。随着这些能力被逐渐开放给 Codex,它开始越来越不像狭义上的编程助手,而更像一个能把各类计算机工作搞定的通用系统。

The Codex app makes that shift concrete. A thread can keep context, use tools, surface artifacts, and continue across prompts instead of resetting after each exchange.

Codex 应用将这种转变变得具体可感。一个对话线程可以保持上下文、使用工具、生成制品,并跨越多个提示词持续推进,而不是在每次交换后重置。

Getting more out of Codex means using these capabilities together:

深度用好 Codex,意味着将以下能力协同运用:

  • durable threads that preserve context
  • voice, steering, and queuing while the user is still in the loop
  • browser, computer-use, MCP servers, and connectors that let Codex act beyond a repo
  • thread automations and Goals that continue the work while the user is away
  • the side panel, where users can review code, documents, decks, and other artifacts

  • 保留上下文的持久线程
  • 用户在场时的语音输入、方向修正和任务排队
  • 让 Codex 超越代码库发挥作用的浏览器、电脑操控、MCP 服务器和连接器
  • 用户离开时继续推进工作的线程自动化和目标任务(Goals)
  • 供用户审阅代码、文档、幻灯片及其他制品的侧边栏

一、持久线程 / Durable Threads

Durable threads: Long-running Codex threads that preserve working context across repeated sessions.
持久线程: 跨多次会话保留工作上下文的长期 Codex 对话线程。

Pinned threads are one way to keep durable threads close at hand. They're useful for recurring work streams such as:

置顶线程是将持久线程随时保持可用的一种方式。它们非常适合以下类型的持续性工作流:

  • a Chief of Staff thread
  • a release thread
  • a documentation review thread
  • a thread dedicated to external monitoring

  • 参谋/助手线程
  • 版本发布线程
  • 文档审查线程
  • 专用于外部监控的线程

These are persistent workspaces, not short chats. Codex can revisit them over time, preserving prior decisions, preferences, and working context that would otherwise need to be rebuilt from scratch.

这些是持久的工作空间,而非短暂的对话。Codex 可以随时重返这些线程,保留之前的决策、偏好和工作上下文,无需每次从头重建。

Pinned-thread shortcuts make this practical. Command-1 through Command-9 jump directly into saved threads.

置顶线程的快捷键让这一切更加实用。Command-1 到 Command-9 可直接跳入对应的已保存线程。

二、语音输入 / Voice Input

Voice input is valuable because it captures the rough version of a thought before it's compressed into polished prose.

语音输入的价值在于,它能捕捉想法最原始的形态,而不是经过精炼润色后的文字。

Codex has built-in voice input. It works especially well for vague starting points that are natural to say but awkward to type:

Codex 内置了语音输入功能。它特别适合那些自然说出来却难以打出来的模糊起点,例如:

I think someone named Ben mentioned this in Slack.
I do not remember the details.
Please go look.
"我记得有个叫 Ben 的人在 Slack 里提到过这个。"
"我记不清细节了。"
"帮我去查一下。"

For an agent that can search, gather context, and report back, that's often enough.

对于一个能够搜索信息、收集上下文并汇报结果的 AI 智能体来说,这通常就已经足够了。

It also works well for a two- or three-minute thought dump before the task is fully formed.

在任务尚未成形之前,来一段两三分钟的思维倾泻,语音同样得心应手。

Transcripts work the same way. A raw meeting transcript or dictated planning note often provides better source material than a short summary because it preserves uncertainty, emphasis, and unfinished lines of thought.

文字转录的效果如出一辙。一份原始的会议记录或口述的规划笔记,往往比简短的摘要提供更好的原始素材,因为它保留了不确定性、重点强调和未竟的思路。

三、方向修正与任务排队 / Steering and Queuing

Voice becomes even more useful when paired with explicit control over an active task.

当语音与对进行中任务的明确控制相结合时,它的价值进一步放大。

Steering: Interrupting an in-flight Codex task with new direction before the current step finishes.
方向修正(Steering): 在当前步骤完成之前,用新方向打断正在进行的 Codex 任务。

Steering is useful when the agent is heading the wrong way and needs a correction before it finishes. During a website review, for example, the user can interrupt the work while annotating the surface in the side panel:

当智能体走偏了方向、需要在完成前纠正时,方向修正大有用武之地。例如,在审查网站时,用户可以一边在侧边栏对页面进行标注,一边随时打断任务:

make this smaller
the spacing between these two elements feels off
this copy is wrong
"把这个做小一点"
"这两个元素之间的间距感觉不对"
"这段文案写错了"

Queuing: Adding work for Codex to do after the current step completes.
任务排队(Queuing): 在当前步骤完成后,为 Codex 追加待办工作。

Queuing is different. It doesn't interrupt the task in progress. It adds the next task to the line. A user might say:

任务排队与方向修正不同——它不打断进行中的任务,而是把下一个任务加入队列。用户可以这样说:

Once the work is done, send the preview link to the reviewer in Slack.
"工作完成后,把预览链接发到 Slack 给审核人。"

Steering changes what Codex is doing now. Queuing changes what should happen next. Both keep the user close to the work while it's unfolding.

方向修正改变 Codex 当下正在做什么,任务排队改变接下来应该发生什么。两者都让用户在工作展开时始终与其保持紧密联系。
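
The difference between the two controls can be sketched as a toy data model. This is only an illustration of the semantics described above, not how Codex is implemented; `ThreadModel` and its methods are hypothetical names.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ThreadModel:
    """Toy model of a thread: one active task plus a queue of follow-ups."""

    current: Optional[str] = None
    queue: deque = field(default_factory=deque)

    def steer(self, new_direction: str) -> None:
        # Steering changes the direction of the task that is running now.
        self.current = new_direction

    def enqueue(self, task: str) -> None:
        # Queuing leaves the running task alone and appends to the line.
        self.queue.append(task)

    def finish_current(self) -> None:
        # When the active task completes, the next queued task takes over.
        self.current = self.queue.popleft() if self.queue else None


if __name__ == "__main__":
    thread = ThreadModel(current="review the landing page")
    thread.enqueue("send the preview link to the reviewer in Slack")
    thread.steer("review the landing page and make the header smaller")
    thread.finish_current()
    print(thread.current)
```

In this model, steering never touches the queue and queuing never touches the active task, which is the separation the two controls are meant to give the user.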

四、工具能力与触达范围 / Tools and Reach

Once a thread has continuity, the next question is what it can act on. Codex can move outward in layers:

一旦线程具备了连续性,下一个问题便是它能对什么发挥作用。Codex 可以向外逐层扩展能力范围:

  • $browser for the in-app browser in the side panel, where Codex can inspect and annotate web surfaces
  • @chrome for signed-in browser state and Chrome-based workflows
  • @computer for work that only exists through a desktop GUI

  • $browser:侧边栏内置浏览器,Codex 可借此检查和标注网页内容
  • @chrome:依赖用户 Chrome 登录状态和 Chrome 工作流的任务
  • @computer:只能通过桌面图形界面完成的任务

$browser fits side-panel browser review. @chrome fits signed-in browser work that depends on the user's Chrome context. @computer fits tasks that only exist through a desktop GUI.

$browser 适合侧边栏浏览器审查;@chrome 适合依赖用户 Chrome 上下文的登录态操作;@computer 适合只能通过桌面 GUI 完成的任务。

MCP servers and connectors extend the same idea into the rest of a workflow. Slack, Gmail, and Calendar matter because many important tasks first appear as messages, inbox items, or scheduling problems before they ever become code.

MCP 服务器和连接器将同样的理念延伸到工作流程的其他环节。Slack、Gmail 和日历之所以重要,是因为许多关键任务在成为代码之前,往往先以消息、收件箱条目或日程安排问题的形式出现。

Skills make repeated workflows reusable. Once a workflow proves useful, package it as a skill so Codex can run it again without relearning the routine from scratch.

技能(Skills)让重复性工作流可以复用。一旦某个工作流被证明有用,就将其封装为技能,让 Codex 下次无需从头重学就能直接运行。

五、随时随地工作 / Work from Anywhere

The Codex mobile app changes when the user has to be at the desk. A task can start on a Mac where the files, permissions, and local setup already live, then continue while the user checks in from a phone.

Codex 移动应用改变了用户必须坐在电脑前的局面。一项任务可以在 Mac 上启动(文件、权限和本地环境都在那里),然后在用户用手机查看时继续推进。

That matters in small moments. Someone can leave the desk while Codex runs a longer task, answer a question from outside, approve the next step, or redirect the thread before they get back. The local environment stays in place; the user doesn't have to.

这在零碎的时刻尤其有用。用户可以在 Codex 执行较长任务时离开办公桌,在外面回答一个问题、批准下一个步骤,或在回来之前调整线程的方向。本地环境留在原处,用户却不必守在那里。

六、自动化任务 / Automations

Automations run Codex work on a schedule. Use a scheduled automation when the recurring job should start fresh from a workspace, such as a daily report or a regular repository check. Use a thread automation when the schedule should return to an active conversation with its running context.

自动化任务让 Codex 按计划工作。当定期任务需要从工作空间全新启动时(如每日报告或定期代码库检查),使用计划自动化;当计划应回到一个保有运行上下文的活跃对话时,使用线程自动化。

Thread automations: Heartbeat-style recurring wake-up calls that return to the same Codex thread on a schedule.
线程自动化: 按计划定期唤醒并返回同一 Codex 线程的心跳式循环机制。

Pinned threads are useful, but they still wait for the user to return. A thread automation can check on something every few minutes or every few hours, continue until it meets a condition, and adjust the cadence over time.

置顶线程很实用,但仍需等待用户主动回来。线程自动化可以每隔几分钟或几小时检查一次,持续执行直至满足某个条件,并随时间调整频率节奏。
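
The heartbeat pattern described above (wake up, check, stop on a condition, adjust the cadence) might look roughly like the loop below. It is a minimal sketch under stated assumptions: `run_heartbeat`, `check`, and the backoff policy are hypothetical and say nothing about how Codex schedules thread automations internally.

```python
import time
from typing import Callable, Optional


def run_heartbeat(
    check: Callable[[], bool],
    interval_s: float,
    max_interval_s: float,
    backoff: float = 2.0,
    max_wakeups: int = 10,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[int]:
    """Wake up on a schedule until check() returns True.

    Returns the number of wake-ups used, or None if the condition was
    never met within max_wakeups. After each unsuccessful check the
    interval grows by `backoff`, capped at `max_interval_s`.
    """
    for wakeup in range(1, max_wakeups + 1):
        if check():
            return wakeup  # Condition met: stop waking up.
        sleep(interval_s)
        # Adjust the cadence over time: check less often as time passes.
        interval_s = min(interval_s * backoff, max_interval_s)
    return None
```

The `sleep` parameter is injectable so the loop can be exercised without actually waiting; a real schedule would more likely be driven by the app than by a blocking loop.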

A Chief of Staff thread might run every 30 minutes:

一个「参谋助手」线程可能每 30 分钟运行一次:

Every 30 minutes, check Slack and Gmail for unanswered messages that need my attention.
Help me prioritize what matters most.
If someone asks me a question, research the answer as deeply as you can and draft a reply for me, but do not send it.
"每隔 30 分钟,检查 Slack 和 Gmail,找出需要我关注的未回复消息。"
"帮我梳理最重要的优先事项。"
"如果有人向我提问,请尽可能深入研究答案并为我起草回复,但不要发送。"

When the user returns, the expensive part of gathering context is often done. The human still decides what gets sent.

当用户回来时,耗时的上下文收集工作往往已经完成。最终发送什么,仍由人来决定。

Thread automations also fit feedback loops. A thread automation can watch pull request comments, Google Docs comments, or Slack replies and keep the surrounding work moving while the user is away.

线程自动化同样适合反馈循环。它可以监视 Pull Request 评论、Google Docs 评论或 Slack 回复,在用户不在时推动相关工作持续进展。

Consider an animation workflow where a reviewer shares a video in Slack. A thread automation can check the thread on a schedule, render an updated version when comments arrive, and reply in the same thread tagging the reviewer. If one integration can't complete the final upload, desktop automation can finish the step through the GUI.

设想一个动画工作流:审阅者在 Slack 中分享了一段视频。线程自动化可以按计划检查该线程,当评论出现时渲染更新版本,并在同一线程中 @ 审阅者回复。如果某个集成无法完成最终上传,桌面自动化可以通过 GUI 完成这一步骤。

The loop spans Slack for feedback, the codebase for rendering, and desktop automation for the final upload.

这个循环跨越了 Slack(反馈)、代码库(渲染)和桌面自动化(最终上传)三个环节。

七、目标任务 / Goals

Goals: Longer-running Codex tasks with a finish line the agent can keep working toward over time.
目标任务: 有明确终点线、智能体可以持续向其推进的长周期 Codex 任务。

Goals are most powerful when the task has a real finish line that the agent can keep pushing toward. A weak goal is:

当任务有一个智能体可以持续向其推进的真实终点线时,目标任务(Goals)的威力最为强大。一个弱目标如下所示:

Implement the plan in this Markdown file.
"实现这个 Markdown 文件中的计划。"

A stronger goal has a measurable success criterion.

一个更强的目标有可量化的成功标准。

For example, an engineer might migrate an internal tool from Python to Rust by setting up the new directory, defining the goal, and making the finish line explicit: the new implementation isn't done until the unit tests pass.

例如,一名工程师可以通过创建新目录、定义目标并明确终点线,将一个内部工具从 Python 迁移到 Rust:新实现在单元测试通过之前不算完成。

A goal combines ongoing execution with a verifier. The user defines the outcome, the stopping condition, and the signal that says whether Codex is getting closer.

一个目标任务将持续执行与验证器相结合。用户定义期望结果、停止条件以及判断 Codex 是否在接近目标的信号。

Useful verifiers include:

有用的验证器包括:

  • a test suite
  • a benchmark
  • a bug reproduction
  • a validation matrix
  • an end-to-end workflow that must keep passing

  • 测试套件
  • 性能基准
  • Bug 复现
  • 验证矩阵
  • 必须持续通过的端到端工作流
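
Pairing ongoing execution with a verifier can be sketched as a simple loop. This is an illustration only: `pursue_goal`, `work_step`, and `verifier` are hypothetical names, and the step budget is an assumption added so the loop always terminates.

```python
from typing import Callable


def pursue_goal(
    work_step: Callable[[int], None],
    verifier: Callable[[], bool],
    max_steps: int = 20,
) -> bool:
    """Repeat work_step until verifier() passes or the step budget runs out.

    Returns True if the verifier passed, False otherwise.
    """
    for step in range(max_steps):
        if verifier():
            return True  # Finish line reached: stop working.
        work_step(step)
    # Budget exhausted: report whether the last step got us there.
    return verifier()


if __name__ == "__main__":
    # Stand-in for a failing test suite that shrinks as work is done.
    failing = {"test_parse", "test_render", "test_export"}
    done = pursue_goal(lambda step: failing.pop(), lambda: not failing)
    print(done, len(failing))
```

For a migration like the one above, the verifier could wrap a real test run, for example checking that `subprocess.run(["cargo", "test"]).returncode == 0`; the point is that the stopping condition is checked by something other than the agent's own judgment.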

Ambition matters, but without verification it's just a wish.

雄心固然重要,但没有验证机制,它不过是一厢情愿。

八、侧边栏 / The Side Panel

The side panel keeps the work beside the conversation that produced it. Instead of exporting an artifact and switching contexts, the user can review it in place. The output might be code, but it might also be a deck, a PDF, a browser page, a table, or another artifact created along the way.

侧边栏让工作与产生它的对话并排存在。用户无需导出制品再切换上下文,可以就地审阅。输出可以是代码,也可以是幻灯片、PDF、浏览器页面、数据表格或其他过程中产生的制品。

It supports four jobs especially well:

它特别擅长以下四项工作:

  • Inspect artifacts
  • Annotate what needs to change
  • Operate web surfaces
  • Review changes

  • 检查制品
  • 标注需要修改的内容
  • 操控网页界面
  • 审查变更

The side panel lets users review Markdown, spreadsheets, data tables, documents, and slides in place. They can inspect, mark up, and revise artifacts without breaking the loop.

侧边栏让用户可以就地审阅 Markdown、电子表格、数据表、文档和幻灯片。他们可以检查、标注并修改制品,而不必打断工作流程。

The deck or PDF can stay open beside the thread that produced it, ready for direct review and repair.

幻灯片或 PDF 可以在产生它的线程旁边保持打开状态,随时供直接审阅和修改。

The in-app browser lets Codex inspect a rendered page, control it, and respond to annotations directly on the surface under review. Comments on a page or artifact stay inside the working loop instead of becoming a separate handoff.

内置浏览器让 Codex 可以检查渲染后的页面、对其进行控制,并直接在被审阅的界面上响应标注。对页面或制品的评论保留在工作循环内部,而不会变成独立的交接任务。

The web becomes both output and control surface. Codex can build an artifact, open it in the side panel, inspect it, debug it, and keep refining the same object in place.

网页既是输出,也是控制界面。Codex 可以构建一个制品,在侧边栏中打开它、检查它、调试它,并就地持续打磨同一对象。

These surfaces work especially well:

以下几类界面表现尤为出色:

  • index.html for lightweight static artifacts
  • Storybook for UI review
  • Remotion Studio for programmatic animation
  • browser-based slide decks for presentations
  • data apps for analysis workflows

  • index.html:轻量级静态制品
  • Storybook:UI 组件审查
  • Remotion Studio:程序化动画
  • 基于浏览器的幻灯片:演示文稿
  • 数据应用:分析工作流

A single index.html file can become a durable interactive artifact with no server required. Thread automations can also refresh static artifacts over time so a thread has something new waiting when the user returns.

一个 index.html 文件无需服务器即可成为持久的交互式制品。线程自动化还可以随时间刷新静态制品,让线程在用户回来时始终有新内容等待审阅。
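
As a rough illustration of a serverless artifact, the sketch below renders a status table into one self-contained `index.html` with inline CSS and no external assets. The function name, the table layout, and the sample rows are assumptions for the example; rerunning it on a schedule is one way a static artifact could be refreshed.

```python
from html import escape
from pathlib import Path


def render_index(title: str, rows: list[tuple[str, str]]) -> str:
    """Return a self-contained HTML page: inline CSS, no external assets."""
    body = "\n".join(
        f"      <tr><td>{escape(name)}</td><td>{escape(status)}</td></tr>"
        for name, status in rows
    )
    return f"""<!doctype html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>{escape(title)}</title>
    <style>td {{ padding: 4px 12px; border-bottom: 1px solid #ccc; }}</style>
  </head>
  <body>
    <h1>{escape(title)}</h1>
    <table>
{body}
    </table>
  </body>
</html>
"""


if __name__ == "__main__":
    page = render_index("Release status", [("docs", "in review"), ("build", "passing")])
    # The file opens directly in a browser; no server is involved.
    Path("index.html").write_text(page, encoding="utf-8")
```

All user-supplied text goes through `html.escape`, so names or statuses containing `<` or `&` do not break the page.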

九、共享记忆 / Shared Memory

Long-running threads become more useful when they share memory outside any one conversation.

当长期线程能够在单次对话之外共享记忆时,它们的价值会进一步提升。

Shared memory: Durable context stored outside a single thread so future work can resume from something explicit and reviewable.
共享记忆: 存储于单一线程之外的持久上下文,使未来的工作能够从可见、可审阅的内容处继续。

One durable pattern is to anchor persistent threads in an Obsidian vault. In practice, that means a folder of plain files that stays straightforward to inspect, edit, move, and keep for a long time. Teams can store that folder in cloud storage, Git, Dropbox, Google Drive, or another sync layer that fits their workflow.

一种持久可靠的模式是:将持久线程锚定在 Obsidian 知识库中。实际上,这意味着一个由纯文本文件组成的文件夹,便于长期检查、编辑、迁移和保存。团队可以将这个文件夹存储在云存储、Git、Dropbox、Google Drive 或其他适合其工作流的同步层中。

A vault might look like this:

一个知识库的目录结构可能如下:

vault/
├── TODO.md
├── people/
├── projects/
├── agent/
└── notes/

At the top level, AGENTS.md can define how Codex should update that workspace as it learns more about people, projects, decisions, and open loops.

在顶层,AGENTS.md 可以定义 Codex 在了解更多关于人员、项目、决策和未完成事项时应如何更新该工作空间。

Don't copy one exact vault structure. Teach the agent where durable context should live, what context to preserve, and when not to create churn.

不要照搬某个固定的知识库结构。要教会智能体:持久上下文应该存放在哪里、什么上下文值得保留,以及何时不要无谓地制造文件碎片。

A practical AGENTS.md might say:

一份实用的 AGENTS.md 可以这样写:

```
- Treat ~/vault as durable work memory.
- Prefer canonical notes over note sprawl.
- Route TODOs, people, projects, daily summaries, and scratch notes explicitly.
- Preserve decisions, blockers, owners, dates, and useful links.
- If nothing meaningful changed, do not churn the vault.
```

Repositories hold code. The vault holds rolling context: the people involved, what changed, what's blocked, what needs follow-up, and what would otherwise disappear between sessions.

代码库存放代码;知识库存放滚动更新的上下文:相关人员、发生了什么变化、什么被卡住了、什么需要跟进,以及那些不记录下来就会在会话之间丢失的信息。

Important context shouldn't live only inside a conversation transcript. Write it down somewhere the next thread can pick back up.

重要上下文不应只存在于对话记录中。把它写下来,放在下一个线程可以接手的地方。
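
Writing context down without creating churn can be sketched as a write-only-if-changed helper. This is a minimal sketch, assuming the vault is a plain folder of text files; `update_note` and the example path are hypothetical, and an agent would more likely follow the AGENTS.md rules directly than call a function like this.

```python
from pathlib import Path


def update_note(vault: Path, relative_path: str, content: str) -> bool:
    """Write content to a note in the vault only if it actually changed.

    Returns True when the file was written, False when it was left alone.
    """
    note = vault / relative_path
    if note.exists() and note.read_text(encoding="utf-8") == content:
        return False  # Nothing meaningful changed: do not churn the vault.
    note.parent.mkdir(parents=True, exist_ok=True)
    note.write_text(content, encoding="utf-8")
    return True


if __name__ == "__main__":
    vault = Path.home() / "vault"
    changed = update_note(vault, "projects/release.md", "Blocker: none\n")
    print("written" if changed else "unchanged")
```

Skipping identical writes keeps file timestamps and sync or Git history quiet, so a change in the vault signals that something actually changed.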

Codex also has first-party memory features in Settings > Personalization > Memories. They provide a local recall layer for preferences, recurring workflows, and known pitfalls. They complement explicit written context rather than replacing it. Chronicle pushes in the same direction by helping Codex build memory from recent screen context.

Codex 在「设置 → 个性化 → 记忆」中也提供了原生记忆功能。它为偏好、重复性工作流和已知的坑提供了一个本地回忆层,是对显式书面上下文的补充,而非替代。Chronicle 也朝着同一方向发力,通过近期屏幕上下文帮助 Codex 构建记忆。

十、从代码出发,走向更广 / From Code Outward

Codex still starts from code. But more of the work around code is now reachable through the same system: MCP servers, browser surfaces, desktop controls, thread automations, and reviewable artifacts.

Codex 的起点仍然是代码。但围绕代码的更多工作,现在都可以通过同一个系统触达:MCP 服务器、浏览器界面、桌面控制、线程自动化,以及可供审阅的制品。

That changes the control model. Steering interrupts the work in progress. Queuing lines up the next task. Thread automations keep a thread active when the user steps away. Goals add a concrete finish line that Codex can keep working toward.

这改变了控制模式。方向修正打断进行中的工作;任务排队安排下一个待办;线程自动化在用户离开时保持线程活跃;目标任务添加了一个 Codex 可以持续向其推进的具体终点线。

Codex can now carry a workflow from instruction to execution to artifact review, even when the work leaves the repo.

Codex 现在可以将一个工作流从指令推进到执行,再到制品审阅——即使工作已经超出了代码库的范畴。


*本文来源于 OpenAI Codex 官方指南,英中对照版由 Lamjin 整理翻译。*
