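Lesson 06

Lesson 6: Context Compact (context compression that lets the Agent keep talking)

In one sentence: conversation history keeps growing and will eventually overflow the context window. A three-layer compression strategy, made up of micro compaction (clear old tool results every round), auto compaction (summarize once a threshold is crossed), and manual compaction (triggered by the Agent itself), lets the Agent keep a conversation going indefinitely.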

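What you will learn

Why conversation history "explodes" and what problems that causes
How each of the three compression layers works and when it triggers
How to estimate token usage
How to have the LLM summarize the conversation itself, a kind of "memory condensation"
How to save the full conversation record (transcript) so that compression does not lose data
Most important: an Agent that can keep a conversation going indefinitely without crashing, however long it runs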

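Core concept: the conversation-history "bloat" problem

The problem: the conversation grows until it overflows

Picture a very long notebook. Every time you say something, you write a line in it. When the AI replies, you write that down too. When the AI calls a tool and the tool returns a pile of output (say it read a 500-line file), all of that goes in as well.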

text
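Round 1: user sends 50 tokens, AI replies with 200 tokens            running total = 250
Round 2: AI reads a file, 5000 tokens; replies with 300 tokens       running total = 5,550
Round 3: AI runs a command, 2000 tokens of output; replies with 200  running total = 7,750
...
Round 20: about 80,000 tokens accumulated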

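Many current models have a context window of 128K or 200K tokens. That looks like plenty, but tool results (file contents and command output in particular) take up a lot of room, and 20 to 30 rounds can be enough to approach the limit.

There is also a hidden problem: even before the window overflows, more tokens mean a higher cost per round (the entire history is sent as input every time), and the AI's attention gets diluted by large amounts of stale, irrelevant information.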
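To make the cost point concrete, here is a minimal sketch with made-up per-round sizes (the numbers and the function name `total_input_tokens` are illustrative, not measurements). Because every request resends the whole history, the input tokens sent over a session grow much faster than the history itself.

```python
def total_input_tokens(round_sizes):
    """Sum the input tokens sent across all rounds.

    round_sizes[i] is the number of tokens that round i adds to the
    history. Each request resends everything accumulated so far.
    """
    history = 0      # tokens currently in the history
    billed = 0       # input tokens sent over the whole session
    for size in round_sizes:
        history += size
        billed += history   # the full history goes out again this round
    return history, billed


# Hypothetical session: 20 rounds that each add 4,000 tokens
history, billed = total_input_tokens([4000] * 20)
print(history)  # 80000
print(billed)   # 840000
```

In this toy session the history ends at 80,000 tokens, but more than ten times that amount was sent as input along the way, which is the motivation for trimming early rather than only at the limit.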

The approach: three layers of compression

Like tidying a room, we use three intensities of "cleaning":

text
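Daily tidying (micro)          Deep clean (auto)              Moving-day clear-out (manual)
Runs every round               Triggers above the threshold   Called by the Agent itself

Replaces old tool output       Has the AI summarize the       Same as auto compaction, but
with a one-line placeholder    whole conversation, keeping    the Agent decides when
                               only the key information

Saves 30-50% of space          Saves 80-90% of space          Saves 80-90% of space
Keeps the dialogue; old        Loses detail                   Loses detail
tool output is dropped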

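Layer 1: micro compaction (Micro Compact)

Runs every round. The idea is simple: old tool results are usually no longer important.

For example, in round 3 the AI ran a command that printed a long configuration dump; by round 10 that output is no longer relevant. We can replace it with a placeholder:

text
Before: {"type": "tool_result", "content": "server:\n  host: 0.0.0.0\n  port: 8080\n  ... (500 lines of config)"}
After:  {"type": "tool_result", "content": "[Previous: used bash]"}

Key design: the 3 most recent tool results stay uncompressed (the AI may still be using them); only earlier ones are replaced. The code in this lesson also exempts read_file results, since file contents tend to be referenced for a long time.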

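Layer 2: auto compaction (Auto Compact)

Triggers when the estimated token count exceeds the threshold. A separate LLM call summarizes the conversation:

text
Original conversation (80,000 tokens)  -->  Summary (2,000 tokens)
"The user had me create a
 Flask project, install the
 dependencies, write 3 routes,
 and fix a database
 connection bug..."

Before summarizing, the full conversation is written to the .transcripts/ directory, so the data is not actually lost even after compression.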
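Reading a saved transcript back takes only a few lines. The sketch below writes two sample messages in a one-JSON-object-per-line layout and loads them again. `load_transcript` is a hypothetical helper that is not part of the lesson's code, and the demo uses a temporary directory rather than .transcripts/.

```python
import json
import tempfile
from pathlib import Path


def save_transcript(messages, path):
    """Write one JSON object per line (JSONL)."""
    with open(path, "w", encoding="utf-8") as f:
        for msg in messages:
            f.write(json.dumps(msg, default=str) + "\n")


def load_transcript(path):
    """Hypothetical helper: read a JSONL transcript back into a list."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


sample = [
    {"role": "user", "content": "create a Flask project"},
    {"role": "assistant", "content": "Done. I added three routes."},
]
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "transcript_demo.jsonl"
    save_transcript(sample, path)
    restored = load_transcript(path)

print(restored == sample)  # True
```

One caveat: anything that is not JSON-serializable is written through `default=str`, so it comes back as a plain string rather than as the original object. The transcript is suited to inspection and search, not to restoring a live session byte for byte.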

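Layer 3: manual compaction (Manual Compact)

Sometimes the Agent itself knows "I have finished a big task and can clean up now". We give it a compact tool so that it can trigger compaction on its own initiative.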

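Compared with the previous lesson: what actually changed?

Previous lesson's structure:         This lesson's structure:

+-- Initial configuration            +-- Initial configuration
                                     +-- Constants such as THRESHOLD, KEEP_RECENT (new)
                                     +-- estimate_tokens (new: token estimation)
                                     +-- micro_compact (new: micro compaction)
                                     +-- auto_compact (new: auto compaction)
+-- System Prompt                    +-- System Prompt (unchanged)
+-- Tool functions                   +-- Tool functions (unchanged)
+-- TOOL_HANDLERS                    +-- TOOL_HANDLERS (adds compact)
+-- TOOLS                            +-- TOOLS (adds the compact tool definition)
+-- agent_loop                       +-- agent_loop (changed: compaction every round)
+-- main                             +-- main (unchanged)

Key changes:

Three new functions: estimate_tokens(), micro_compact(), and auto_compact()
Compaction logic in agent_loop: micro compaction runs first at the start of each round, followed by a check for whether auto compaction is needed
A new compact tool that lets the Agent trigger compaction manually
Note: this is the first time agent_loop itself is modified, because compaction is a loop-level feature, not just "one more tool"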

ASCII flowchart

User input
   |
   v
+---------------------------+
| micro_compact(messages)   |  <-- Layer 1: replace old tool results with placeholders
+---------------------------+
   |
   v
+---------------------------+    yes    +--------------------------+
| tokens > THRESHOLD ?      |---------->| auto_compact()           |
+---------------------------+           | 1. save full transcript  |
   | no                                 | 2. AI summarizes         |
   |                                    | 3. replace all messages  |
   |                                    +--------------------------+
   |                                                 |
   |<------------------------------------------------+
   v
+---------------------------+
| Send messages to the AI   |
+---------------------------+
   |
   v
+---------------------------+
| AI returns a response     |
+---------------------------+
   |
   v
+---------------------------+    yes
| stop_reason == tool_use ? |---------> run tools
+---------------------------+           |
   | no                                 v
   v                            +---------------------------+
  print final answer            | was compact called ?      |
                                +---------------------------+
                                   | yes          | no
                                   v              v
                              manual compact   back to the top
                              (auto_compact),  of the loop
                              then return

Full code

python
#!/usr/bin/env python3
"""s06_context_compact.py - Compact
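Context-compressing Agent: manages conversation history length with a
three-layer strategy so the Agent can keep talking without overflowing
the context window.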
"""

import json
import os
import subprocess
import time
from pathlib import Path

from anthropic import Anthropic
from dotenv import load_dotenv

# ── Environment setup ───────────────────────────────────────
load_dotenv(override=True)

# Support a custom API endpoint
if os.getenv("ANTHROPIC_BASE_URL"):
    os.environ.pop("ANTHROPIC_AUTH_TOKEN", None)

WORKDIR = Path.cwd()                                    # working directory
client = Anthropic(base_url=os.getenv("ANTHROPIC_BASE_URL"))
MODEL = os.environ["MODEL_ID"]                           # model ID

SYSTEM = f"You are a coding agent at {WORKDIR}. Use tools to solve tasks."

# ── Compression constants ───────────────────────────────────
THRESHOLD = 50000          # estimated-token threshold; auto compaction triggers above it
TRANSCRIPT_DIR = WORKDIR / ".transcripts"  # where transcripts are stored
KEEP_RECENT = 3            # micro compaction leaves the N most recent tool results alone
PRESERVE_RESULT_TOOLS = {"read_file"}  # results from these tools are never compacted (the content may matter)


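# ── Layer 0: token estimation ───────────────────────────────
def estimate_tokens(messages: list) -> int:
    """
    Roughly estimate the token count of the conversation history.
    Crude approach: stringify every message and divide the character count by 4.
    Imprecise but sufficient: we only need a ballpark to decide whether to compress.
    """
    return len(str(messages)) // 4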


# ── Layer 1: micro compaction ───────────────────────────────
def micro_compact(messages: list) -> list:
    """
    Micro compaction: replace old tool results with short placeholders.

    Core idea:
    1. Find every tool result (content parts of type tool_result)
    2. Leave the most recent KEEP_RECENT untouched
    3. For earlier results longer than 100 characters, replace the content with "[Previous: used xxx]"
    4. Results from special tools (such as read_file) are not replaced, since the content may still be useful
    """
    # Step 1: collect the position of, and a reference to, every tool result
    tool_results = []
    for msg_idx, msg in enumerate(messages):
        if msg["role"] == "user" and isinstance(msg.get("content"), list):
            for part_idx, part in enumerate(msg["content"]):
                if isinstance(part, dict) and part.get("type") == "tool_result":
                    tool_results.append((msg_idx, part_idx, part))

    # Nothing to compact if there are no more than KEEP_RECENT tool results
    if len(tool_results) <= KEEP_RECENT:
        return messages

    # Step 2: map each tool-use ID to its tool name
    # A tool_result only carries tool_use_id, so the name has to be looked up
    tool_name_map = {}
    for msg in messages:
        if msg["role"] == "assistant":
            content = msg.get("content", [])
            if isinstance(content, list):
                for block in content:
                    # tool_use blocks in assistant content carry both id and name
                    if hasattr(block, "type") and block.type == "tool_use":
                        tool_name_map[block.id] = block.name

    # Step 3: compact the old tool results (the most recent KEEP_RECENT stay as-is)
    to_clear = tool_results[:-KEEP_RECENT]     # the part to compact
    for _, _, result in to_clear:
        # Skip results that are already short (not worth compacting)
        if not isinstance(result.get("content"), str) or len(result["content"]) <= 100:
            continue
        # Look up the tool name
        tool_id = result.get("tool_use_id", "")
        tool_name = tool_name_map.get(tool_id, "unknown")
        # Skip results from tools whose output must be preserved (such as read_file)
        if tool_name in PRESERVE_RESULT_TOOLS:
            continue
        # Replace with a short placeholder
        result["content"] = f"[Previous: used {tool_name}]"

    return messages


# ── Layer 2: auto compaction ────────────────────────────────
def auto_compact(messages: list) -> list:
    """
    Auto compaction: once tokens exceed the threshold, have the AI summarize the conversation.

    Steps:
    1. Save the full conversation to .transcripts/ (compress without losing data)
    2. Call the AI to produce a summary
    3. Replace the entire history with a single new message
    """
    # Step 1: save the full conversation to a file
    # Named by Unix timestamp in seconds, so files only collide if two
    # compactions land in the same second
    TRANSCRIPT_DIR.mkdir(exist_ok=True)
    transcript_path = TRANSCRIPT_DIR / f"transcript_{int(time.time())}.jsonl"
    with open(transcript_path, "w") as f:
        for msg in messages:
            # One JSON object per line, convenient for later analysis
            f.write(json.dumps(msg, default=str) + "\n")
    print(f"[transcript saved: {transcript_path}]")

    # Step 2: have the AI summarize the conversation
    # Only the last 80000 characters are sent, to keep the request itself small
    conversation_text = json.dumps(messages, default=str)[-80000:]
    response = client.messages.create(
        model=MODEL,
        messages=[{"role": "user", "content":
            "Summarize this conversation for continuity. Include: "
            "1) What was accomplished, 2) Current state, 3) Key decisions made. "
            "Be concise but preserve critical details.\n\n" + conversation_text}],
        max_tokens=2000,       # cap the length of the summary
    )
    summary = response.content[0].text

    # Step 3: replace the whole history with one summary message
    # The transcript path is included so the original can be traced later
    return [
        {"role": "user", "content": f"[Conversation compressed. Transcript: {transcript_path}]\n\n{summary}"},
    ]


# ── Path safety ─────────────────────────────────────────────
def safe_path(p: str) -> Path:
    """Make sure the path cannot escape the working directory"""
    path = (WORKDIR / p).resolve()
    if not path.is_relative_to(WORKDIR):
        raise ValueError(f"Path escapes workspace: {p}")
    return path


# ── Tool implementations (same as before) ───────────────────
def run_bash(command: str) -> str:
    """Run a shell command, with a dangerous-command filter and a timeout"""
    dangerous = ["rm -rf /", "sudo", "shutdown", "reboot", "> /dev/"]
    if any(d in command for d in dangerous):
        return "Error: Dangerous command blocked"
    try:
        r = subprocess.run(command, shell=True, cwd=WORKDIR,
                           capture_output=True, text=True, timeout=120)
        out = (r.stdout + r.stderr).strip()
        return out[:50000] if out else "(no output)"
    except subprocess.TimeoutExpired:
        return "Error: Timeout (120s)"

def run_read(path: str, limit: int = None) -> str:
    """Read a file's contents, optionally limited to a number of lines"""
    try:
        lines = safe_path(path).read_text().splitlines()
        if limit and limit < len(lines):
            lines = lines[:limit] + [f"... ({len(lines) - limit} more)"]
        return "\n".join(lines)[:50000]
    except Exception as e:
        return f"Error: {e}"

def run_write(path: str, content: str) -> str:
    """Write content to a file, creating parent directories as needed"""
    try:
        fp = safe_path(path)
        fp.parent.mkdir(parents=True, exist_ok=True)
        fp.write_text(content)
        return f"Wrote {len(content)} bytes"
    except Exception as e:
        return f"Error: {e}"

def run_edit(path: str, old_text: str, new_text: str) -> str:
    """Replace the first exact occurrence of the given text in a file"""
    try:
        fp = safe_path(path)
        content = fp.read_text()
        if old_text not in content:
            return f"Error: Text not found in {path}"
        fp.write_text(content.replace(old_text, new_text, 1))
        return f"Edited {path}"
    except Exception as e:
        return f"Error: {e}"


# ── Tool dispatch table ─────────────────────────────────────
TOOL_HANDLERS = {
    "bash":       lambda **kw: run_bash(kw["command"]),
    "read_file":  lambda **kw: run_read(kw["path"], kw.get("limit")),
    "write_file": lambda **kw: run_write(kw["path"], kw["content"]),
    "edit_file":  lambda **kw: run_edit(kw["path"], kw["old_text"], kw["new_text"]),
    "compact":    lambda **kw: "Manual compression requested.",   # new: manual compaction
}

# ── Tool definitions ────────────────────────────────────────
TOOLS = [
    {"name": "bash", "description": "Run a shell command.",
     "input_schema": {"type": "object", "properties": {"command": {"type": "string"}}, "required": ["command"]}},
    {"name": "read_file", "description": "Read file contents.",
     "input_schema": {"type": "object", "properties": {"path": {"type": "string"}, "limit": {"type": "integer"}}, "required": ["path"]}},
    {"name": "write_file", "description": "Write content to file.",
     "input_schema": {"type": "object", "properties": {"path": {"type": "string"}, "content": {"type": "string"}}, "required": ["path", "content"]}},
    {"name": "edit_file", "description": "Replace exact text in file.",
     "input_schema": {"type": "object", "properties": {"path": {"type": "string"}, "old_text": {"type": "string"}, "new_text": {"type": "string"}}, "required": ["path", "old_text", "new_text"]}},
    # New compact tool: lets the Agent trigger compaction on its own
    {"name": "compact", "description": "Trigger manual conversation compression.",
     "input_schema": {"type": "object", "properties": {"focus": {"type": "string", "description": "What to preserve in the summary"}}}},
]


# ── Agent loop (the main change in this lesson) ─────────────
def agent_loop(messages: list):
    """
    Core Agent loop, now with compaction logic.

    Differences from before:
    1. Micro compaction runs at the start of every round
    2. The token estimate is checked, and auto compaction runs if it is over the threshold
    3. If the Agent calls the compact tool, manual compaction runs
    """
    while True:
        # ====== Compaction entry point (new in this lesson) ======
        # Layer 1: micro compaction every round, clearing old tool results
        micro_compact(messages)

        # Layer 2: auto compaction if the token estimate exceeds the threshold
        if estimate_tokens(messages) > THRESHOLD:
            print("[auto_compact triggered]")
            messages[:] = auto_compact(messages)
            # messages[:] = ... replaces the list contents in place,
            # so the caller's history variable is updated as well
        # ====== End of compaction ======

        # Same as before from here: send messages, receive a reply, run tools
        response = client.messages.create(
            model=MODEL, system=SYSTEM, messages=messages,
            tools=TOOLS, max_tokens=8000,
        )
        messages.append({"role": "assistant", "content": response.content})

        if response.stop_reason != "tool_use":
            return

        results = []
        manual_compact = False     # whether manual compaction was requested
        for block in response.content:
            if block.type == "tool_use":
                if block.name == "compact":
                    # The Agent asked for compaction: set the flag now and
                    # compact after all tool results have been handled
                    manual_compact = True
                    output = "Compressing..."
                else:
                    handler = TOOL_HANDLERS.get(block.name)
                    try:
                        output = handler(**block.input) if handler else f"Unknown tool: {block.name}"
                    except Exception as e:
                        output = f"Error: {e}"
                print(f"> {block.name}:")
                print(str(output)[:200])
                results.append({"type": "tool_result", "tool_use_id": block.id, "content": str(output)})

        messages.append({"role": "user", "content": results})

        # Layer 3: manual compaction if the Agent called the compact tool
        if manual_compact:
            print("[manual compact]")
            messages[:] = auto_compact(messages)
            return    # leave the loop after compacting and wait for the next user message

# ── Main ────────────────────────────────────────────────────
if __name__ == "__main__":
    history = []
    while True:
        try:
            query = input("\033[36ms06 >> \033[0m")        # cyan prompt
        except (EOFError, KeyboardInterrupt):
            break
        if query.strip().lower() in ("q", "exit", ""):
            break
        history.append({"role": "user", "content": query})
        agent_loop(history)
        # Print the AI's text reply
        response_content = history[-1]["content"]
        if isinstance(response_content, list):
            for block in response_content:
                if hasattr(block, "text"):
                    print(block.text)
        print()

Code walkthrough

Module 1: compression constants

python
THRESHOLD = 50000
TRANSCRIPT_DIR = WORKDIR / ".transcripts"
KEEP_RECENT = 3
PRESERVE_RESULT_TOOLS = {"read_file"}

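Constant	Meaning	Why this value
`THRESHOLD`	Auto compaction triggers at an estimated 50,000 tokens	Leaves ample headroom; many models have a 128K-200K context window
`TRANSCRIPT_DIR`	Directory where transcripts are stored	The original conversation survives compression; it lives in a hidden directory whose name starts with `.`
`KEEP_RECENT`	Keep the 3 most recent tool results	Too few and the AI loses context it is still using; too many and little space is saved
`PRESERVE_RESULT_TOOLS`	`read_file` results are never compacted	File contents may be referenced for a long time; compacting them would make the AI "forget" the file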

Module 2: estimate_tokens (token estimation)

python
def estimate_tokens(messages: list) -> int:
    return len(str(messages)) // 4

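Why divide by 4? English text averages roughly 4 characters per token, so len // 4 is a reasonable ballpark for English. Chinese is denser, at roughly 2 characters per token, so for Chinese-heavy conversations dividing by 4 undercounts by about half. The estimate is therefore optimistic rather than conservative for Chinese text; setting THRESHOLD well below the context window absorbs that error.

Why not count exactly? Running a real tokenizer costs time and resources of its own. All we need is a rough number to decide whether to compress, and being off by a few thousand tokens does not matter.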
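If a conversation mixes Chinese and English, a slightly more careful heuristic can count CJK characters separately. The sketch below is an illustration only: `estimate_tokens_mixed` is a hypothetical name, and the two ratios are the rules of thumb quoted above, not values taken from any tokenizer.

```python
def estimate_tokens_mixed(text: str, chars_per_token: int = 4,
                          cjk_chars_per_token: int = 2) -> int:
    """Heuristic token estimate that treats CJK characters separately.

    Both ratios are rough rules of thumb (assumptions), not tokenizer facts.
    """
    # Count characters in the CJK Unified Ideographs block
    cjk = sum(1 for ch in text if "\u4e00" <= ch <= "\u9fff")
    other = len(text) - cjk
    return other // chars_per_token + cjk // cjk_chars_per_token


english = "a" * 400
chinese = "压" * 400
print(estimate_tokens_mixed(english))  # 100
print(estimate_tokens_mixed(chinese))  # 200
print(len(chinese) // 4)               # 100, the plain len // 4 estimate
```

Under these assumed ratios the plain estimate reports half as many tokens for the Chinese string, which is the gap the threshold headroom has to cover.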

Module 3: micro_compact (micro compaction)

This is the most intricate part, and it works in three steps:

Step 1: collect every tool result

python
tool_results = []
for msg_idx, msg in enumerate(messages):
    if msg["role"] == "user" and isinstance(msg.get("content"), list):
        for part_idx, part in enumerate(msg["content"]):
            if isinstance(part, dict) and part.get("type") == "tool_result":
                tool_results.append((msg_idx, part_idx, part))

Where tool results live in the message history: they are messages with role: "user" whose content is a list containing dicts with type: "tool_result".

Step 2: build the tool-name map

python
tool_name_map = {}
for msg in messages:
    if msg["role"] == "assistant":
        content = msg.get("content", [])
        if isinstance(content, list):
            for block in content:
                if hasattr(block, "type") and block.type == "tool_use":
                    tool_name_map[block.id] = block.name

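Why is this needed? A tool_result carries only a tool_use_id, not the tool name. We have to look the name up by ID before we can decide whether the result should be preserved.

Step 3: replace the old results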
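One caveat with the hasattr check: it only recognizes tool_use blocks stored as objects with attributes, which is what this lesson's loop appends. If a history ever held plain dicts instead (an assumption here, for example messages rebuilt from JSON), those blocks would be skipped and their tool names would fall back to "unknown". A hypothetical accessor that handles both shapes might look like this; `block_field` and `build_tool_name_map` are illustrative names, and `SimpleNamespace` merely stands in for an SDK block object.

```python
from types import SimpleNamespace


def block_field(block, name, default=None):
    """Read a field from a content block that is either a dict or an object."""
    if isinstance(block, dict):
        return block.get(name, default)
    return getattr(block, name, default)


def build_tool_name_map(messages):
    """Map tool_use ids to tool names for both block shapes."""
    names = {}
    for msg in messages:
        content = msg.get("content")
        if msg.get("role") != "assistant" or not isinstance(content, list):
            continue
        for block in content:
            if block_field(block, "type") == "tool_use":
                names[block_field(block, "id")] = block_field(block, "name")
    return names


# One object-shaped block and one dict-shaped block
messages = [
    {"role": "assistant", "content": [SimpleNamespace(type="tool_use", id="t1", name="bash")]},
    {"role": "assistant", "content": [{"type": "tool_use", "id": "t2", "name": "read_file"}]},
]
print(build_tool_name_map(messages))  # {'t1': 'bash', 't2': 'read_file'}
```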

python
to_clear = tool_results[:-KEEP_RECENT]
for _, _, result in to_clear:
    if not isinstance(result.get("content"), str) or len(result["content"]) <= 100:
        continue
    tool_id = result.get("tool_use_id", "")
    tool_name = tool_name_map.get(tool_id, "unknown")
    if tool_name in PRESERVE_RESULT_TOOLS:
        continue
    result["content"] = f"[Previous: used {tool_name}]"

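Three conditions cause a result to be skipped:

The content is not a string, or is 100 characters or shorter: too short to be worth compacting
The result comes from read_file or another preserved tool: the file content may still be referenced
The result is one of the 3 most recent: the AI may still be using it

A useful detail: micro_compact modifies the dict objects inside the messages list directly and does not build a new list. The change takes effect immediately, with no extra assignment needed.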
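To see the in-place mutation and the slice in isolation, here is a stripped-down sketch that works on plain dicts only. It is not the lesson's function: `compact_old_results` is a hypothetical name, the tool name is stored on each item instead of being looked up, and the preserve list is left out.

```python
KEEP_RECENT = 3


def compact_old_results(results, keep_recent=KEEP_RECENT, min_len=100):
    """Replace the content of all but the newest `keep_recent` results.

    `results` is a list of dicts with "tool" and "content" keys.
    The dicts are mutated in place; nothing is copied.
    """
    if keep_recent <= 0:
        old = results              # results[:-0] would be empty, so handle 0 explicitly
    else:
        old = results[:-keep_recent]
    for item in old:
        if len(item["content"]) > min_len:
            item["content"] = f"[Previous: used {item['tool']}]"


results = [{"tool": "bash", "content": "x" * 500} for _ in range(5)]
first = results[0]                 # a second reference to the same dict
compact_old_results(results)

print(first["content"])                      # [Previous: used bash]
print([len(r["content"]) for r in results])  # [21, 21, 500, 500, 500]
```

The explicit check for `keep_recent <= 0` matters because `results[:-0]` is the same as `results[:0]`, an empty list. With KEEP_RECENT = 3 the slice in micro_compact behaves as intended, but setting the constant to 0 there would select nothing to compact.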

Module 4: auto_compact (auto compaction)

python
def auto_compact(messages: list) -> list:
    # 1. Save the full transcript
    TRANSCRIPT_DIR.mkdir(exist_ok=True)
    transcript_path = TRANSCRIPT_DIR / f"transcript_{int(time.time())}.jsonl"
    with open(transcript_path, "w") as f:
        for msg in messages:
            f.write(json.dumps(msg, default=str) + "\n")

    # 2. Have the AI summarize
    conversation_text = json.dumps(messages, default=str)[-80000:]
    response = client.messages.create(
        model=MODEL,
        messages=[{"role": "user", "content":
            "Summarize this conversation for continuity. Include: "
            "1) What was accomplished, 2) Current state, 3) Key decisions made. "
            "Be concise but preserve critical details.\n\n" + conversation_text}],
        max_tokens=2000,
    )
    summary = response.content[0].text

    # 3. Return the compacted message list
    return [
        {"role": "user", "content": f"[Conversation compressed. Transcript: {transcript_path}]\n\n{summary}"},
    ]

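A few design points:

Save first, then compress: the full conversation is stored as JSONL, one message per line, which is convenient for later analysis
Only the last 80000 characters are summarized: this keeps the summary request itself from exceeding the limit. The trade-off is that anything before that tail never reaches the summarizer (it is still in the transcript), and the cut can land in the middle of a message
The summary prompt asks the AI to preserve three kinds of information: what was accomplished, the current state, and key decisions
A new list is returned: the caller applies it with messages[:] = auto_compact(messages), an in-place replacement that the outer variable also sees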
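The difference between slice assignment and plain assignment is easy to check in isolation. In the sketch below, `compact_rebind` and `compact_in_place` are hypothetical stand-ins that swap in a fixed one-message list instead of calling the API.

```python
def compact_rebind(messages):
    # Rebinding only changes the local name; the caller's list is untouched
    messages = [{"role": "user", "content": "summary"}]


def compact_in_place(messages):
    # Slice assignment replaces the contents of the caller's list object
    messages[:] = [{"role": "user", "content": "summary"}]


history = [{"role": "user", "content": f"message {i}"} for i in range(5)]

compact_rebind(history)
print(len(history))   # 5

compact_in_place(history)
print(len(history))   # 1
```

Because agent_loop receives the same list object that main holds as history, the slice form is what lets a compaction inside the loop shrink the history that main keeps appending to.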

Module 5: changes to agent_loop

This is the first time agent_loop has been modified since Lesson 1:

python
def agent_loop(messages: list):
    while True:
        # New: micro compaction plus the auto compaction check
        micro_compact(messages)
        if estimate_tokens(messages) > THRESHOLD:
            print("[auto_compact triggered]")
            messages[:] = auto_compact(messages)

        # Same as before from here...
        response = client.messages.create(...)
        ...

        # New: detect manual compaction
        manual_compact = False
        for block in response.content:
            if block.type == "tool_use":
                if block.name == "compact":
                    manual_compact = True
                    output = "Compressing..."
                ...

        if manual_compact:
            print("[manual compact]")
            messages[:] = auto_compact(messages)
            return

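Why return after manual compaction? Once the history has been replaced, the AI has not yet "seen" the compacted state. Returning hands control back to the main loop; when the next user message arrives, the AI sees the summary and continues from there.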
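One detail worth noticing: the compact tool's schema declares an optional focus string, but auto_compact takes only messages, so the value is never used. One way it could be threaded through is to build the summary prompt from it. The sketch below is an assumption about how that might look, not the lesson's code; `build_summary_prompt` is a hypothetical helper and the added sentence is invented wording.

```python
BASE_PROMPT = (
    "Summarize this conversation for continuity. Include: "
    "1) What was accomplished, 2) Current state, 3) Key decisions made. "
    "Be concise but preserve critical details."
)


def build_summary_prompt(conversation_text, focus=None):
    """Build the summarization prompt, optionally steering what to keep."""
    parts = [BASE_PROMPT]
    if focus and focus.strip():
        # Hypothetical extra instruction derived from the tool's focus argument
        parts.append(f"Pay special attention to preserving: {focus.strip()}")
    return "\n".join(parts) + "\n\n" + conversation_text


prompt = build_summary_prompt("<conversation text>", focus="database schema changes")
print(prompt.splitlines()[1])
# Pay special attention to preserving: database schema changes
```

With no focus the helper produces the same prompt text the lesson uses, so wiring it in would only change behavior when the Agent actually supplies the argument.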

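Sample runs

Scenario 1: micro compaction (happens silently)

text
s06 >> Read the config.yaml file
> read_file:
server:
  host: 0.0.0.0
  port: 8080
  ... (assume a long configuration follows)

OK, this configuration file contains the server settings...

s06 >> Now run the ls command
> bash:
README.md  config.yaml  src/  tests/

The current directory contains these files...

s06 >> Also check git status for me
> bash:
On branch main
nothing to commit, working tree clean

You are on the main branch with no uncommitted changes.

Nothing is compacted during these three rounds: there are only 3 tool results, and KEEP_RECENT = 3 keeps all of them. Compaction starts once a fourth tool result exists, and even then the read_file result from round 1 stays intact because read_file is in PRESERVE_RESULT_TOOLS. Had round 1 been a bash command with more than 100 characters of output, that output would have been replaced with [Previous: used bash].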

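Scenario 2: auto compaction (tokens over the threshold)

text
(Assume many rounds have passed and the estimate is above 50000 tokens)

s06 >> Help me fix that bug

[auto_compact triggered]
[transcript saved: .transcripts/transcript_1711929600.jsonl]

OK, based on the earlier context, we are working on the database
connection problem in the Flask project...

Once auto compaction triggers, the entire conversation history is replaced with a summary. The AI continues from that summary instead of "losing its memory".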

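Scenario 3: manual compaction (triggered by the Agent)

text
s06 >> We just finished the first feature; let's start on the second one

> compact:
Compressing...
[manual compact]
[transcript saved: .transcripts/transcript_1711929700.jsonl]

The Agent decides that a milestone has been reached and calls the compact tool to clear the context. Because agent_loop returns right after compacting, no further reply is printed in this turn; the Agent picks up from the summary when the next user message arrives.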

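Key takeaways

The three layers divide the work: micro compaction is "daily tidying" (every round, low cost), auto compaction is the "deep clean" (threshold-triggered, large effect), and manual compaction is the Agent's own decision
Compress without losing data: the full conversation is written to .transcripts/ before every summary, so it can always be traced back
Micro compaction's safeguards: keeping recent results, protecting important tools, and skipping short content together guard against compacting the wrong thing
messages[:] = ... is the key: it replaces the list contents in place, so the outer variable stays in sync without passing a return value around
Token estimation does not need to be exact: a rough estimate is enough to make the compaction decision, and exact counting adds cost for little benefit
This is the first modification to agent_loop: compaction is a loop-level feature, not simply "one more tool"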

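Next lesson preview

Lesson 7: multi-file editing, letting the Agent work like a real programmer

So far our Agent can run commands, read and write files, load knowledge on demand, and manage its context. In the next lesson we will look at how the Agent can handle complex edits that span several files, doing code refactoring and multi-file work the way a real programmer would.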
