This is very good - I have found that well-designed progressive disclosure systems like this tend to be far more effective and performant than cookie-cutter semantic / encoding-based RAG systems. Agents seem to love tables of contents.
It definitely feels more correct from a first-principles perspective. It always seemed crazy that we often just throw a bunch of data at an LLM and tell it to go figure it out. That data is filled with irrelevant junk. That would slow us down, so why wouldn't it slow down AI?
Though I will say it has been a challenge to get agents to reliably use tools like `treedocs explore`. They tend to default to their built-in tooling or to commands like grep first. I'm still working on reliable prompts and skills that tell the agent when to use which approach.
Maaan, this seems so great except it's macOS only.
I only have a Mac to test on, but I am open to releasing on Linux or Windows if someone would help.
Hi HN,
I built treedocs, a CLI that shows a tree of your file structure, side-by-side with docs.
The idea is simple: your filesystem already tells you what exists, but not what each path is for. treedocs mirrors the file tree into a treedocs.yaml file and lets you define short descriptions, references, and links to files and folders. It can then render that back as a documented tree, detect drift when files move or disappear, and fail checks when descriptions are missing.
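To make the drift idea concrete, here is a rough Python sketch of that kind of check. It is not treedocs' actual implementation or file format: the `find_drift` function and the flat path-to-description dict are assumptions made up for illustration, standing in for whatever treedocs.yaml really contains.

```python
import tempfile
from pathlib import Path


def find_drift(root, documented):
    """Compare a {relative_path: description} map with the files on disk.

    `documented` is a hypothetical stand-in for a parsed docs file; the real
    treedocs.yaml layout may look quite different.
    """
    root = Path(root)
    on_disk = {p.relative_to(root).as_posix() for p in root.rglob("*")}
    return {
        # Documented paths that no longer exist (moved or deleted).
        "stale": sorted(p for p in documented if p not in on_disk),
        # Paths on disk that nobody has documented yet.
        "undocumented": sorted(on_disk - set(documented)),
        # Paths that exist and are listed, but have an empty description.
        "missing_description": sorted(
            p for p, desc in documented.items()
            if p in on_disk and not desc.strip()
        ),
    }


def demo():
    """Build a throwaway tree and run the check against it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "src").mkdir()
        (root / "src" / "main.py").write_text("print('hi')\n")
        (root / "README.md").write_text("# demo\n")
        documented = {
            "src": "Application source code",
            "src/main.py": "",               # present but not described
            "docs/old.md": "Removed guide",  # no longer on disk
        }
        return find_drift(root, documented)


if __name__ == "__main__":
    print(demo())
```

A check command could exit non-zero whenever any of the three lists is non-empty, which is roughly what "fail checks" means above.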
I originally wanted this for two related problems:
1. Helping humans acclimate to unfamiliar repositories faster.
2. Giving coding agents concise project context without forcing them to rediscover the repo structure over and over.
Some things it does:
- `treedocs init` creates a treedocs.yaml
- `treedocs sync` reconciles it with the filesystem
- `treedocs check` catches stale entries and missing descriptions
- `treedocs explore` implements progressive disclosure for efficient token usage
- Nested treedocs.yaml files act as documentation boundaries for delegated subtrees
- The YAML format is backed by a public JSON Schema
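For anyone unfamiliar with the term, progressive disclosure here means showing one level of the documented tree at a time and letting the reader (or agent) drill down only where needed. The sketch below is a minimal Python illustration of that idea, not how `treedocs explore` is implemented: the nested `REPO` dict, the `explore` function, and its `depth` parameter are all hypothetical.

```python
# Hypothetical documented tree: every node has a description and,
# for directories, a dict of children.
REPO = {
    "description": "Example repository",
    "children": {
        "src": {
            "description": "Application source",
            "children": {
                "cli.py": {"description": "Command-line entry point"},
                "core": {
                    "description": "Domain logic",
                    "children": {"model.py": {"description": "Data model"}},
                },
            },
        },
        "README.md": {"description": "Project overview"},
    },
}


def explore(tree, path="", depth=1):
    """List entries under `path`, descending at most `depth` levels.

    Directories at the depth limit are collapsed to a one-line summary,
    so the caller only pays for the detail it asks for.
    """
    node = tree
    for part in filter(None, path.split("/")):
        node = node["children"][part]

    lines = []

    def walk(current, level):
        for name, child in sorted(current.get("children", {}).items()):
            grandchildren = child.get("children", {})
            collapsed = grandchildren and level == depth
            hint = f"  [{len(grandchildren)} entries hidden]" if collapsed else ""
            indent = "  " * (level - 1)
            lines.append(f"{indent}{name}: {child['description']}{hint}")
            if grandchildren and level < depth:
                walk(child, level + 1)

    walk(node, 1)
    return lines


if __name__ == "__main__":
    print("\n".join(explore(REPO)))          # top level only
    print("\n".join(explore(REPO, "src")))   # drill into one subtree
```

The first call returns two short lines for the whole repo; the agent then asks for `src` only if the description suggests it is relevant.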
I'd be interested in feedback on the file format, CLI ergonomics, and whether this kind of repo-level map is useful for your workflows.
The question is who maintains it. I think an LLM is the only practical option. In my agent rules, I tell it to maintain an INDEX.md on each level. That takes one line of instruction and an example file to start from. Then it self-maintains.
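If you want a cheap safety net for the per-level INDEX.md approach, something like this could run in CI. It is only a sketch under my own assumptions: `dirs_missing_index` is a made-up helper, and it does not skip hidden directories such as .git, which you would probably want to filter out in a real repo.

```python
import tempfile
from pathlib import Path


def dirs_missing_index(root, index_name="INDEX.md"):
    """Return directories under `root` (including root, shown as ".")
    that do not contain an index file."""
    root = Path(root)
    candidates = [root] + [p for p in root.rglob("*") if p.is_dir()]
    return sorted(
        d.relative_to(root).as_posix()
        for d in candidates
        if not (d / index_name).is_file()
    )


def demo():
    """Build a throwaway tree where only `scripts` lacks an index."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "INDEX.md").write_text("# root\n")
        (root / "api").mkdir()
        (root / "api" / "INDEX.md").write_text("# api\n")
        (root / "scripts").mkdir()
        return dirs_missing_index(root)


if __name__ == "__main__":
    print(demo())
```

This only checks that the files exist; whether their contents are still accurate is the part the LLM (or a human) has to keep up with.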
This is lovely! I'd done something similar in my own setup.
I'm curious what your experience has been with fork/merge - why a monolithic file rather than something more diffuse?
Why is @DandyLyons' comment flagged and dead?
Like ... what's wrong with what he posted?
I've seen thousands of Show HNs here, for like a decade. Very similar to this.
What rule did he break?
(In fairness to this poster, DandyLyons' post was in fact dead when we got here - I used vouch on it.) Probably just one of the weird things that happen when making your first post on HN if it's heuristically not quite right / a Show HN / wordy?
Probably none, it just got caught in HN's spam filter. When you have enough points you can vouch for it, and if enough people do, it gets reinstated, which is exactly what happened.
Being here with zero activity other than self-promotion.