Very cool project. Providing credentials to agents and standardizing that whole process seems like valuable work. Question though on the OSS/paid boundary... is the OSS CLI the client for the paid service? What is the custody model? Does this service store all my credentials?
From another comment:
> Kontext holds secrets server-side and mints short-lived tokens per session.
That probably makes this thing DOA for most people (certainly for me and everyone I know).
Thanks. Yes, I would have to put myself in that category. Typical play here is to offer the self-hosted option. Not sure if that is in the pipeline for the creators of this. Then you are into that trust/operational overhead tradeoff conversation.
Fair pushback and point taken. The reason this isn't trivial is that the CLI is the easy and lightweight part. The heavy part is the infra behind it: auth, vaulting, token refresh/exchange, revocation, provider edge cases, etc.
We'll think about how best to accommodate full self-hosting in the future!
What do you anticipate to be the hardest part of supporting a self-hosted solution? I've worked a fair bit on converting SaaS -> self-hosted and I'm always interested to hear others' pain points.
I imagine a lot of the organizations that would find this most valuable, and would be willing to pay a lot, would be the same ones that would require something like this.
I think the hardest part is the environment/support matrix, not getting it to run in the first place.
It's usually pretty doable to make a system self-hostable on a happy path. The hard part is supporting it across lots of customer environments without being in the loop every time: custom IdPs, private networking, KMS/HSM/BYOK requirements, upgrade/migration paths, backup/restore, observability, and all the weird edge cases that only show up once other people operate it.
And yes, I think your last point is right: the customers who care most about this category are often exactly the ones who will require self-hosted.
What's your take? Curious what you found effective vs. what you deem hardest from your experience.
> for static API keys, the backend injects the credential directly into the agent's runtime environment.
What prevents the agent from persisting or leaking the API key - or reading it from the environment?
Yes, at the moment there's nothing that keeps the agent from reading the key from the environment. If a static API key is injected into the agent's env, the agent can in principle read it. The value in our threat model is better custody, short-lived creds where possible, and auditability, not "the process can't see its own env." You can make the hooks a lot stricter and check that the agent basically never does anything with the credential, but the agent is still inside the trust boundary in this case.
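To make the "inside the trust boundary" point concrete: any child process spawned with the parent's environment inherits it and can read an injected key like any other variable. A minimal stdlib illustration (the variable name and key value are made up):

```python
import os
import subprocess
import sys

# Simulate the backend injecting a static key into the agent's environment.
env = dict(os.environ, SERVICE_API_KEY="sk-hypothetical-123")

# The "agent" here is just a child process; it inherits the env and can
# read (or exfiltrate) the key, no matter how strict the tool hooks are.
result = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['SERVICE_API_KEY'])"],
    env=env,
    capture_output=True,
    text=True,
)
print(result.stdout.strip())  # the child sees the raw key
```

This is why the static-key tier can only improve custody and auditability, not hide the secret from the agent itself.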
This is how keychains should be designed. Never return the secret, but mint a new token, or sign a request.
We need this also for normal usage like development environments. Or when invoking a command on a remote server.
Are you going to add support for services that don't support OIDC, or is this going to be a known limitation?
Yes, that's the ideal model. For services with OAuth/OIDC/token exchange support, we want to mint short-lived delegated creds instead of returning the underlying secret. For services that don't support that, we don't want them to be unsupported entirely. But they're a weaker security tier: you can still improve custody/rotation/auditability, just not get the full "agent never sees the real secret" property without a proxy/broker/signing layer.
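For the static-key tier, the signing-layer idea mentioned above can approximate the "agent never sees the real secret" property: the agent submits the request it wants to make and a broker attaches an HMAC signature without ever returning the key. A toy stdlib sketch, not Kontext's actual design (the broker function and secret are hypothetical):

```python
import hashlib
import hmac

# Held by the broker only; never handed to the agent process.
_REAL_SECRET = b"hypothetical-static-api-key"

def sign_request(method: str, path: str, body: bytes) -> str:
    """Return a signature for the given request.

    The agent calls this (e.g. over a local socket) instead of holding
    the key itself; only the signature crosses the trust boundary.
    """
    message = b"\n".join([method.encode(), path.encode(), body])
    return hmac.new(_REAL_SECRET, message, hashlib.sha256).hexdigest()

# Agent side: it knows the request it wants to make, not the key.
signature = sign_request("POST", "/v1/items", b'{"name": "demo"}')
print(len(signature))  # 64 hex chars for HMAC-SHA256
```

Whether a given provider accepts request signatures instead of a bearer key obviously varies, which is part of why this tier stays weaker.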
What if kontext runs under the same user as Claude? Could it in principle inspect the kontext process and extract the key from memory?
If they use the system keyring, it depends on the OS and other details - macOS, Linux, and Windows all have different implementation tradeoffs.
Congrats on the launch! What are the key advantages of this compared to OneCLI[1]?
[1]: https://github.com/onecli/onecli
Great question. Two main differences:
Workflow: OneCLI runs as a self-hosted Docker gateway; you route agent traffic through localhost:10255. Kontext doesn't change how you use Claude Code at all, just kontext start --agent claude.
Visibility layer: OneCLI intercepts outbound HTTP requests. Kontext hooks into Claude's PreToolUse/PostToolUse events, so you see bash commands, file ops, and API calls and not just network traffic.
Trust model tradeoff worth naming: OneCLI is fully self-hosted. Kontext holds secrets server-side and mints short-lived tokens per session. We do this via token exchange (RFC 8693), building natively on OAuth so that only short-lived tokens are ever handed over - you don't need to capture refresh tokens for external tool calls at all.
Does this work with any tool calls that make an HTTP request? e.g. calling `curl` directly vs writing a script to make the request, then calling it
Yes, with one important distinction: our visibility is at the agent tool boundary, not the raw network layer.
So if Claude Code invokes Bash and runs curl ..., we see that tool invocation. If it invokes Bash and runs python script.py, and that script makes HTTP requests internally, we still see the Bash invocation - but not the individual HTTP requests the script makes.
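For anyone curious what hooking that tool boundary looks like: a PreToolUse hook receives the pending tool call as JSON and can block it before it runs. A stripped-down sketch of the decision logic (the printenv rule is a made-up example policy, not Kontext's actual one):

```python
def decide(event: dict) -> int:
    """Return the exit code a PreToolUse hook would use for this event.

    Claude Code passes the pending tool call as JSON on the hook's
    stdin; exiting 2 blocks the call, 0 lets it through. The rule
    below is a hypothetical example policy.
    """
    tool = event.get("tool_name", "")
    command = event.get("tool_input", {}).get("command", "")
    if tool == "Bash" and "printenv" in command:
        return 2  # block; a real hook would also write the reason to stderr
    return 0

# Sample event shaped like a Bash PreToolUse payload.
sample = {"tool_name": "Bash", "tool_input": {"command": "printenv"}}
print(decide(sample))  # 2: the bulk env dump would be blocked
```

The point of the tool boundary is exactly this: the hook sees the whole command, not just its network side effects.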
Sounds awfully similar to Tailscale Aperture[1]
[1] https://tailscale.com/blog/aperture-self-serve
Thanks for flagging - wasn't aware of Aperture! It's a little different to what the Kontext CLI does though.
Aperture solves "make multiple coding agents talk to the right LLM backend through an Aperture proxy." We solve "launch a governed agent session with identity, short-lived third-party credentials, and tool-level auditability." They overlap at the launcher layer, but the security goals are different.
Finally a solution that focuses on contextual authorization - evaluating the agent's reasoning trace when it requests a credential and only issuing it if the intent matches what the user authorized. Developer-focused and self-serve. Happy launch day!!
Really cool and much needed!
I was actually just about to get started writing this but in Rust....
Nice! I'd love to hear what you think about our approach, and what features you'd like to see first.
Yup I needed this bad for my NanoClaw
Nice work
Thank you! OpenClaw/NanoClaw is one of the obvious next integration targets. I don't think the right model there is just "wrap the outer process" the way we do with Claude Code today. OpenClaw has a real plugin surface, so we'd more likely ship a plugin that hooks before_tool_call / after_tool_call and talks to a local Kontext sidecar for policy + telemetry. The nice part is OpenClaw can actually block there, so you can do real allow/deny/approval at the tool boundary.
It should be possible to do this w/ eBPF. Monitor network i/o & rewrite the request on the fly to include the proper tokens & signatures. The agent can just be given placeholder tokens. That way all the usual libraries work as expected & the secrets/signatures are handled w/o worrying about another abstraction layer. Here is some prior art: https://riptides.io/blog/when-ebpf-isnt-enough-why-we-went-w...
Very interesting article! Yes, I think that's plausible, but probably not as "just eBPF" in practice. Once you need request rewriting, signing, and TLS-aware handling, you usually end up in eBPF + userspace or kernel-module territory. I believe the post is basically making that exact argument. Our current CLI is intentionally at the tool/session layer. A transport-layer mode is interesting, especially for containerized/SDK-driven agents, but it's a different and more OS-specific implementation path.
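The placeholder-token idea from the parent comment works the same way in a plain userspace proxy, without eBPF: hand the agent a sentinel value and swap it for the real credential on the way out. A toy sketch of just the substitution step (all names and token values are illustrative; the hard part eBPF runs into, rewriting inside TLS, is exactly what this skips):

```python
# The agent only ever holds PLACEHOLDER; the proxy swaps in the real
# value before the request leaves the host.
PLACEHOLDER = "PLACEHOLDER_TOKEN"
_REAL_TOKEN = "hypothetical-real-token"  # held by the proxy, not the agent

def rewrite_headers(headers: dict) -> dict:
    """Replace the placeholder bearer token with the real one."""
    out = dict(headers)
    auth = out.get("Authorization", "")
    if PLACEHOLDER in auth:
        out["Authorization"] = auth.replace(PLACEHOLDER, _REAL_TOKEN)
    return out

agent_headers = {"Authorization": f"Bearer {PLACEHOLDER}"}
print(rewrite_headers(agent_headers)["Authorization"])  # Bearer hypothetical-real-token
```

Standard HTTP libraries keep working unmodified, which is the appeal; the cost is that the proxy must terminate or man-in-the-middle TLS to see the header at all.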
Can I integrate this with my coding agents?