AGENTS.md
This repo is a persistent LLM-maintained research wiki for machine learning, autonomous driving, robotics, vision-language-action systems, end-to-end architectures, perception, prediction, and planning.
The agent is not acting as a chatbot. It is acting as the maintainer of a markdown knowledge base.
Core responsibilities
- Maintain `wiki/` as the durable knowledge layer.
- Treat `raw/` as immutable source material.
- Keep `index.md` current enough to navigate the vault.
- Append concise, parseable entries to `log.md`.
- Prefer updating existing pages over creating duplicates.
- Record uncertainty, disagreement, and superseded claims explicitly.
Directory structure
- `raw/` - Immutable source inputs.
  - `raw/inbox/` is the drop zone for new source files.
  - `raw/papers/` is the long-term local paper store.
  - `raw/assets/` stores downloaded images and figures.
- `wiki/`
  - `concepts/` canonical topic pages.
  - `sources/` source summaries, reading queues, and acquisition plans.
  - `comparisons/` direct comparisons between methods, paradigms, or benchmarks.
  - `syntheses/` higher-level theses and evolving research narratives.
  - `queries/` answers worth preserving.
  - `taxonomies/` maps, ontologies, and evaluation frameworks.
- `templates/` - Reusable markdown skeletons.
Page conventions
Frontmatter
Use YAML frontmatter when useful. Preferred keys:
- `title`
- `type`
- `status`
- `tags`
- `updated`
- `source_count`
- `confidence`
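As an illustration of the preferred keys, here is a minimal Python sketch that renders a frontmatter block. The function name `render_frontmatter` and every example value (page title, type, status, tags, date, counts) are hypothetical and not taken from this repo; only the key names come from the list above.

```python
from datetime import date

# Key order follows the preferred-keys list above.
PREFERRED_KEYS = ["title", "type", "status", "tags", "updated", "source_count", "confidence"]


def render_frontmatter(meta: dict) -> str:
    """Render a YAML frontmatter block, emitting only the keys that are set."""
    lines = ["---"]
    for key in PREFERRED_KEYS:
        if key not in meta:
            continue
        value = meta[key]
        if isinstance(value, list):
            # Flow-style YAML list, e.g. tags: [planning, evaluation]
            value = "[" + ", ".join(str(v) for v in value) + "]"
        # Note: values containing ':' or '#' would need YAML quoting (not handled here).
        lines.append(f"{key}: {value}")
    lines.append("---")
    return "\n".join(lines)


# All values below are hypothetical placeholders.
example = render_frontmatter({
    "title": "Closed-Loop Evaluation",
    "type": "concept",
    "status": "draft",
    "tags": ["planning", "evaluation"],
    "updated": date(2025, 1, 15).isoformat(),
    "source_count": 3,
    "confidence": "medium",
})
print(example)
```

Keys that are not set are simply omitted, which matches the "when useful" wording: a page does not need every key to be valid.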
Linking
- Prefer Obsidian-style wikilinks: `[[Page Name]]`.
- Link important mentions the first time they appear in a page.
- Add backlinks by updating both sides when the relation is materially important.
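To check links or backlinks mechanically, the wikilink targets on a page can be extracted with a regular expression. The sketch below assumes standard Obsidian syntax (`[[Target]]`, `[[Target|alias]]`, `[[Target#Heading]]`); the helper name `extract_wikilinks` is hypothetical.

```python
import re

# Matches [[Target]], [[Target|alias]], and [[Target#Heading]]; captures the target only.
WIKILINK_RE = re.compile(r"\[\[([^\]\|#]+)(?:#[^\]\|]*)?(?:\|[^\]]*)?\]\]")


def extract_wikilinks(markdown: str) -> list[str]:
    """Return link targets in order of first appearance, without duplicates."""
    seen: dict[str, None] = {}
    for match in WIKILINK_RE.finditer(markdown):
        seen.setdefault(match.group(1).strip(), None)
    return list(seen)


text = "See [[World Model]] and [[planner|the planner]]; [[World Model#History]] again."
targets = extract_wikilinks(text)
print(targets)
```

A backlink check then reduces to asking whether page A appears in the targets of page B and vice versa.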
Naming
- Use lowercase kebab-case filenames.
- Keep filenames semantic, not conversational.
- Prefer one canonical page per concept.
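One possible way to derive a filename from a page title is sketched below. The function `to_kebab_case` is a hypothetical helper, and the choice to drop non-ASCII characters is an assumption rather than a rule stated here.

```python
import re
import unicodedata


def to_kebab_case(title: str) -> str:
    """Convert a page title to a lowercase kebab-case filename stem."""
    # Fold accented characters to ASCII where possible (assumption: ASCII-only filenames).
    ascii_title = unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode("ascii")
    # Collapse every run of non-alphanumeric characters into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_title.lower())
    return slug.strip("-")


print(to_kebab_case("Vision-Language-Action Models") + ".md")
```

Because the mapping is deterministic, two titles that differ only in case or punctuation collapse to the same filename, which helps surface accidental duplicates of a canonical page.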
Source handling rules
- Never modify files under `raw/`.
- When a source is ingested, create or update a corresponding page under `wiki/sources/`.
- Distinguish clearly between:
  - direct source claims,
  - the wiki's synthesis,
  - open questions,
  - contradictions with prior sources.
- If citation counts matter, verify them from a current source at ingest time rather than hard-coding them from memory.
- If an external skill such as AlphaXiv is unavailable, fall back to arXiv abstracts, project pages, OpenReview, Semantic Scholar, and the local raw source itself.
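The rule that `raw/` is never modified can be enforced with a small guard before any write. This is a sketch only: `is_protected` is a hypothetical helper, and it assumes paths are given relative to the repo root with forward slashes.

```python
import posixpath


def is_protected(repo_relative_path: str) -> bool:
    """Return True if the path resolves to raw/ or anything beneath it."""
    # normpath collapses "./" and "wiki/../" segments so they cannot bypass the check.
    normalized = posixpath.normpath(repo_relative_path)
    return normalized == "raw" or normalized.startswith("raw/")


for candidate in ["raw/papers/example.pdf", "wiki/../raw/inbox/x.md", "wiki/sources/example.md"]:
    print(candidate, "->", "read-only" if is_protected(candidate) else "writable")
```

The check is purely lexical; symlinks or absolute paths would need additional handling.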
Ingest workflow
When the user asks to ingest a source:
- Read `index.md` and the most relevant existing wiki pages first.
- Read the new source from `raw/`.
- Write or update the source summary.
- Update related concept pages.
- Update comparisons or syntheses if the source changes a broader conclusion.
- Update `index.md`.
- Append a log entry using this format:
## [YYYY-MM-DD] ingest | Short Source Title
Then include 2-5 bullets summarizing what changed.
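The log format above is regular enough to generate and parse mechanically. The following is a minimal sketch; the helper names `format_log_entry` and `parse_log_header` are hypothetical, and the bullet texts in the example are placeholders.

```python
import re
from datetime import date
from typing import Optional

# Header shape: ## [YYYY-MM-DD] <kind> | <title>
LOG_HEADER_RE = re.compile(r"^## \[(\d{4}-\d{2}-\d{2})\] (\w+) \| (.+)$")


def format_log_entry(title: str, changes: list[str], kind: str = "ingest",
                     when: Optional[date] = None) -> str:
    """Build one log entry: a header line followed by 2-5 bullets."""
    if not 2 <= len(changes) <= 5:
        raise ValueError("a log entry should have 2-5 bullets")
    when = when or date.today()
    header = f"## [{when.isoformat()}] {kind} | {title}"
    bullets = "\n".join(f"- {change}" for change in changes)
    return f"{header}\n{bullets}\n"


def parse_log_header(line: str) -> Optional[tuple[str, str, str]]:
    """Return (date, kind, title) for a header line, or None if it does not match."""
    match = LOG_HEADER_RE.match(line.strip())
    return match.groups() if match else None


entry = format_log_entry(
    "Short Source Title",
    ["Added source summary.", "Updated index."],  # placeholder bullets
    when=date(2025, 1, 15),
)
print(entry)
```

Keeping the header on a single line with fixed delimiters is what makes the log "parseable": a line-by-line scan with `parse_log_header` recovers every entry without a markdown parser.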
Query workflow
When the user asks a research question:
- Start from `index.md`.
- Read the smallest set of relevant pages that can answer the question well.
- Answer with citations to wiki pages and sources where possible.
- If the answer produces durable value, store it in `wiki/queries/` or `wiki/comparisons/`.
- Update `index.md` and `log.md` if a new page is created.
Lint workflow
When asked to lint the wiki, check for:
- duplicate concept pages,
- orphan pages,
- dead-end source summaries that never influenced a concept page,
- stale benchmark claims,
- ambiguous terminology around e2e, modular, hybrid, VLM, VLA, world model, and planner,
- missing canonical pages for recurring ideas.
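Two of these checks, orphan pages and dead-end source summaries, reduce to simple queries over the link graph. The sketch below assumes the graph has already been built as a dict mapping each page path (relative to `wiki/`, without extension) to the set of pages it links to. It also interprets "never influenced a concept page" as "not linked from any page under `concepts/`", which is an assumption. The function names and example page names are hypothetical.

```python
def find_orphans(links: dict[str, set[str]],
                 roots: frozenset[str] = frozenset({"index"})) -> list[str]:
    """Pages that no other page links to, ignoring navigation roots."""
    linked: set[str] = set()
    for page, targets in links.items():
        linked |= {t for t in targets if t != page}  # self-links do not count
    return sorted(p for p in links if p not in linked and p not in roots)


def find_dead_end_sources(links: dict[str, set[str]],
                          source_prefix: str = "sources/",
                          concept_prefix: str = "concepts/") -> list[str]:
    """Source summaries that no concept page links to."""
    cited: set[str] = set()
    for page, targets in links.items():
        if page.startswith(concept_prefix):
            cited |= targets
    return sorted(p for p in links if p.startswith(source_prefix) and p not in cited)


# Hypothetical link graph for illustration.
graph = {
    "index": {"concepts/world-model", "sources/paper-a"},
    "concepts/world-model": {"sources/paper-a"},
    "sources/paper-a": set(),
    "sources/paper-b": set(),
    "queries/old-answer": {"concepts/world-model"},
}
print(find_orphans(graph))
print(find_dead_end_sources(graph))
```

The remaining checks (stale benchmark claims, ambiguous terminology) need judgment about content and are not captured by graph structure alone.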
Domain-specific rules
Autonomous driving
- Always separate open-loop and closed-loop claims.
- Always note required inputs: cameras, lidar, radar, maps, routes, language, privileged state.
- Record whether a method is modular, hybrid, or e2e.
- Record whether results depend on simulation, offline logs, intervention data, or real-world deployment.
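One way to keep these four rules from being skipped is to treat them as required fields of a structured record. The sketch below is illustrative only: the class and field names are hypothetical, the allowed values are copied from the bullets above, and the example method and its claims are placeholders rather than real results.

```python
from dataclasses import dataclass, field
from enum import Enum


class Paradigm(Enum):
    MODULAR = "modular"
    HYBRID = "hybrid"
    E2E = "e2e"


class EvalMode(Enum):
    OPEN_LOOP = "open-loop"
    CLOSED_LOOP = "closed-loop"


ALLOWED_INPUTS = {"cameras", "lidar", "radar", "maps", "routes", "language", "privileged state"}
ALLOWED_EVIDENCE = {"simulation", "offline logs", "intervention data", "real-world deployment"}


@dataclass
class DrivingMethodRecord:
    name: str
    paradigm: Paradigm
    inputs: set[str]
    evidence: set[str]
    # Claims are stored per evaluation mode so open-loop and closed-loop never mix.
    claims: dict[EvalMode, list[str]] = field(default_factory=dict)

    def __post_init__(self) -> None:
        unknown = (self.inputs - ALLOWED_INPUTS) | (self.evidence - ALLOWED_EVIDENCE)
        if unknown:
            raise ValueError(f"unrecognized inputs or evidence: {sorted(unknown)}")


# Placeholder record; not a real method or result.
record = DrivingMethodRecord(
    name="example-planner",
    paradigm=Paradigm.E2E,
    inputs={"cameras", "routes"},
    evidence={"simulation"},
    claims={EvalMode.CLOSED_LOOP: ["placeholder closed-loop claim"]},
)
print(record.paradigm.value, sorted(record.inputs))
```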
Perception / prediction / planning
- Keep task definitions explicit.
- Distinguish benchmark performance from operational usefulness.
- Prefer comparisons that surface assumptions and failure modes, not just metrics.
VLM / VLA / robotics transfer
- Note what transfers cleanly from robotics to driving and what breaks because of speed, safety, partial observability, or scale.
- Track embodiment assumptions separately from general multimodal reasoning claims.
LLMs and foundation models
- Record whether a paper contributes architecture, training recipe, scaling law, alignment method, tool use, multimodal capability, or downstream systems insight.
- Link LLM pages into driving pages only when the transfer mechanism is concrete.
Frontend rules
- The Flask app is a viewer, not the source of truth.
- The wiki lives in markdown; the app should stay thin and stateless.
- New UI features should improve reading, navigation, search, and synthesis browsing before adding product complexity.