feat: add status watcher to MCP server #18320
Conversation
Force-pushed from fd1b20f to f8d7628.
Idea for getting rid of the race condition where a message from the screen watcher and a message from the LLM could arrive out of order if the timing is just right:
- maintain a state object that keeps track of:
  - an array of pending updates with timestamps
  - the latest status reported to the server, with a timestamp
- instead of submitting status updates to the queue, push them onto the array of pending updates
- in a separate loop that ticks every second, process pending updates in batch: act on a "screen stable" update only if it is at least 1 second old and no "screen changing" update has arrived after it; if it is not old enough, leave it in the pending array
- keep the existing deduplication logic that takes the conversation length and the latest status into account

Up to you if you want to implement it (a rough sketch follows below). I'm fine with merging the current version once you address the feedback.
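A minimal sketch of that idea in Go. All names here are made up for illustration, and the existing dedup on conversation length and latest status is assumed to live inside the `report` callback:

```go
package statuswatcher

import (
	"sync"
	"time"
)

// updateKind distinguishes the two events the screen watcher produces.
type updateKind int

const (
	screenStable updateKind = iota
	screenChanging
)

type pendingUpdate struct {
	kind updateKind
	at   time.Time
}

// statusState holds the pending updates plus the latest status actually
// reported to the server.
type statusState struct {
	mu           sync.Mutex
	pending      []pendingUpdate
	lastReported pendingUpdate
}

// push records an update instead of submitting it to the queue directly.
func (s *statusState) push(kind updateKind) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.pending = append(s.pending, pendingUpdate{kind: kind, at: time.Now()})
}

// tick processes pending updates in batch. A "screen stable" update is acted
// on only if it is at least one second old and no "screen changing" update
// arrived after it; otherwise it either stays pending (too fresh) or is
// dropped (superseded).
func (s *statusState) tick(report func(updateKind)) {
	s.mu.Lock()
	defer s.mu.Unlock()

	var keep []pendingUpdate
	for i, u := range s.pending {
		if u.kind == screenStable {
			superseded := false
			for _, later := range s.pending[i+1:] {
				if later.kind == screenChanging {
					superseded = true
					break
				}
			}
			if superseded {
				continue // a newer "screen changing" update wins
			}
			if time.Since(u.at) < time.Second {
				keep = append(keep, u) // not old enough yet; revisit next tick
				continue
			}
		}
		report(u.kind)
		s.lastReported = u
	}
	s.pending = keep
}

// run drives the one-second batch loop; report is whatever actually sends
// the status to coderd.
func (s *statusState) run(report func(updateKind)) {
	ticker := time.NewTicker(time.Second)
	defer ticker.Stop()
	for range ticker.C {
		s.tick(report)
	}
}
```

The one-second tick gives out-of-order messages a window to land before a "screen stable" report is committed, which is what removes the race.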
// tryCreateAgentClient returns a new client from the command context. It works
// just like tryCreateAgentClient, but does not error.
func (r *RootCmd) tryCreateAgentClient() (*agentsdk.Client, error) {
I'd expect the names to be swapped: tryCreateAgentClient should return an error, createAgentClient should not. "try" implies the possibility of failure.
Very fair. For this though I just mirrored TryInitClient. I think the logic is that try here is more like attempt, in the sense that failing is OK, or maybe in the way that try in try/catch suppresses errors. Happy to rename it, but maybe we should rename both if we do.
Renaming both would be perfect.
Force-pushed from 43350ec to 5512b1a.
For now, the old function has been renamed to tryCreateAgentClient, because it is not clear to me whether it is safe to error in these cases or whether it was intentional to create a client even when there was no valid URL or token.

Before, we were separately reading the environment, but we can reuse the global flags and the function for creating an agent client. Also, since the tests no longer need to set environment variables, they can now be run in parallel. This is not fully backwards compatible, in that the old method would fall back to the non-agent client URL if the agent URL was not set, but that seems like undesired behavior since we are not doing it anywhere else.

Since we can now get status updates from two places, they are placed in a queue so we can handle them one at a time (a rough sketch follows below).
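Roughly, the queue could look like this (a sketch with made-up names, not the PR's actual code): both the screen watcher and the report tool push onto one buffered channel, and a single goroutine drains it so only one report to coderd is in flight at a time.

```go
// statusUpdate is a hypothetical stand-in for whatever the screen watcher and
// the MCP report tool both produce.
type statusUpdate struct {
	state   string // e.g. "working", "complete"
	summary string
}

// startQueue starts the single consumer. Both sources send into the returned
// channel, and the goroutine forwards updates to coderd one at a time, so a
// slow report blocks the next pop rather than the producers.
func startQueue(reportToCoderd func(statusUpdate)) chan<- statusUpdate {
	queue := make(chan statusUpdate, 512)
	go func() {
		for update := range queue {
			reportToCoderd(update)
		}
	}()
	return queue
}
```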
Force-pushed from 5512b1a to 9f28bac.
Force-pushed from 9f28bac to ffc599d.
The update predicates now run in the push phase instead of the pop phase. This ensures we do not queue up updates that would just end up being discarded once they are popped (which could take some time due to latency to coderd). It also has the side effect of preserving summaries even when the queue gets too big, because we now capture them as part of pushing; before, they could get lost by being dropped while we wait on coderd.
Force-pushed from ffc599d to 866b721.
I force-pushed to break out the UI fix, but the remaining commits are unchanged.

Then I moved the update predicates to the push phase instead of the pop phase, so if we are waiting on coderd we will not queue up a bunch of duplicate updates that would just end up getting discarded later anyway; now they get discarded immediately. This could still use channels, I would just need a mutex over …

I am also doing the part where we keep the last summary from the agent during the push phase instead, which is nice because now, even if an agent update is dropped while waiting on coderd because the queue got too big, we can still log it with the next agentapi status update. Granted, with a buffer of 512 this is probably not going to actually happen.
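As an illustration of that push-phase filtering (hypothetical names again, building on the statusUpdate sketch earlier; the conversation-length check is omitted), the dedup predicates and the summary capture both run before the channel send, so a full buffer can drop an update without losing its summary:

```go
// statusQueue wraps the buffered channel with the state the push-phase
// predicates need; the mutex guards it because pushes come from two sources.
type statusQueue struct {
	mu          sync.Mutex
	ch          chan statusUpdate
	lastState   string
	lastSummary string
}

// push applies the dedup predicates before enqueueing, so duplicate updates
// are discarded immediately rather than after waiting behind a slow coderd
// call.
func (q *statusQueue) push(update statusUpdate) {
	q.mu.Lock()
	defer q.mu.Unlock()

	// Skip updates that would not change what coderd already knows.
	if update.state == q.lastState &&
		(update.summary == "" || update.summary == q.lastSummary) {
		return
	}

	// Capture the latest agent summary now, so the next agentapi update can
	// still report it even if this particular update never reaches coderd.
	if update.summary != "" {
		q.lastSummary = update.summary
	}

	select {
	case q.ch <- update:
		q.lastState = update.state
	default:
		// Buffer full (unlikely at 512): drop the update, but keep the
		// summary captured above for the next one.
	}
}
```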
That's awesome. I set it up in a local deployment and it works! Both for reporting status if the ai agent doesn't, and for completing statuses the ai agent didn't.
Minor nits on naming: I believe it's more accurate to refer to claude code and others as "AI Agents" rather than LLMs. Also, whenever referring to the url of a coder/agentapi process, I think we should say "AI agentapi url" rather than "llm agent url" for accuracy.
Also TODO: we need to update the claude code and goose terraform modules to use this.
Also, any references to the API have been renamed to "AgentAPI" rather than just "agent".
Force-pushed from d68e1dc to a79ee72.
This is meant to complement the existing task reporter since the LLM does not call it reliably.
I also did some refactoring to use the common agent flags/env vars; those changes are in separate commits.
For #18163, but will need to update the modules to make use of it.