Add ppmatAgent with HEA and KnowMat integration #301

Open
JinZongxiao wants to merge 2 commits into PaddlePaddle:develop from JinZongxiao:integrate-ppmat-agent-knowmat

@JinZongxiao

Collaborator

No description provided.

Copilot AI review requested due to automatic review settings June 29, 2026 02:56
@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


jinzongxiao does not appear to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

Copilot AI left a comment

Copilot wasn't able to review this pull request because it exceeds the maximum number of lines (20,000). Try reducing the number of changed lines and requesting a review from Copilot again.

@JinZongxiao

Collaborator Author

Fix ppmatAgent migration smoke paths

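  • Move KnowMat's LLM key validation from import time to just before LLM extraction actually runs, so that basic imports, CI, and documentation builds no longer fail when LLM_API_KEY is not set.
  • Remove the unnecessary PyMuPDF dependency from the cloud OCR backends (PaddleOCR API, MinerU, OpenAI OCR); PyMuPDF is now required only for local OCR or when a page range is specified.
  • Fix the CLI having no tasks to run when an existing .md/.txt input is combined with --force-rerun.
  • Fix the HEA example's tdb_files, which used paths relative to the current working directory; it now uses the package's default TDB path.
  • Add lightweight ppmatAgent migration tests covering basic imports, resource loading, dotenv validation, the in-package TDB registry, and the surrogate registry.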
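
For readers who have not opened the diff, the first two fixes follow a common pattern: validate credentials at the point of use, and import heavy optional dependencies lazily. The sketch below is only an illustration of that pattern, not the code in this PR. The names `require_llm_key`, `needs_pymupdf`, `load_pymupdf`, `MissingLLMKeyError`, and the backend identifiers in `CLOUD_BACKENDS` are all hypothetical; only the `LLM_API_KEY` variable and the rule "local OCR or a page range needs PyMuPDF" come from the description above.

```python
import importlib
import os


class MissingLLMKeyError(RuntimeError):
    """Hypothetical error raised when an LLM call is attempted without a key."""


def require_llm_key(env=None):
    """Return LLM_API_KEY or raise.

    Intended to be called right before LLM extraction rather than at
    import time, so importing the package never needs the key.
    """
    env = os.environ if env is None else env
    key = env.get("LLM_API_KEY", "").strip()
    if not key:
        raise MissingLLMKeyError("LLM_API_KEY is required for LLM extraction")
    return key


# Hypothetical identifiers for the cloud OCR backends named in the PR comment.
CLOUD_BACKENDS = {"paddleocr_api", "mineru", "openai"}


def needs_pymupdf(backend, page_range=None):
    """PyMuPDF is needed only for local OCR or when a page range is given."""
    return backend not in CLOUD_BACKENDS or page_range is not None


def load_pymupdf():
    """Import PyMuPDF lazily so cloud-only runs work without it installed."""
    try:
        # "fitz" is PyMuPDF's long-standing import name.
        return importlib.import_module("fitz")
    except ImportError as exc:
        raise RuntimeError(
            "PyMuPDF is required for local OCR or page-range selection"
        ) from exc
```

With this shape, a cloud OCR run would call `needs_pymupdf(backend, page_range)` first and reach `load_pymupdf()` only when it returns `True`, while `require_llm_key()` would be invoked at the start of the extraction step and nowhere else.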

Verified locally:

  • python -m compileall -q ppmatAgent test
  • pytest -q test/test_ppmat_agent_migration.py
  • python -m ppmatAgent.knowmat --help
  • python -m ppmatAgent.hea_crewai_agent.run_crewai --help
  • PaddleOCR API OCR-only smoke test
  • DeepSeek LLM extraction smoke test

4 participants