code-server heartbeat logic #1274

Closed
rohinwork opened this issue Jan 13, 2020 · 20 comments

Comments

@rohinwork

Description

I am running code-server in a multi-tenant setup and would like to know when to kill users' containers after a period of inactivity. For this I am using the heartbeat file that code-server touches every minute.

I would like to know on what basis code-server decides to touch the heartbeat file, i.e. which conditions it treats as activity and which as inactivity.

@sr229
Contributor

sr229 commented Jan 13, 2020

A little digging through #1115 will show you how it works.

@rohinwork
Author

@sr229 I had a look at this. However, I am not able to work out under what conditions the heartbeat function is called.
Example scenarios:

  • no websocket activity
  • code-server view not in focus (code-server tab in the background)

@sr229
Contributor

sr229 commented Jan 13, 2020

We don't have the ability to send that server-side. Even then, this is more of a concern for @microsoft than for us.

@code-asher
Member

code-asher commented Jan 13, 2020 via email

@antofthy

From my experience, the heartbeat file keeps updating while the browser has the code-server window open and the application is running (some browsers will suspend the application).

In other words, while the user has it open. We terminate the Docker environment code-server is running in when neither a user control operation (we have such operations touch the heartbeat file to simplify matters) nor code-server itself has updated the heartbeat in more than 2 hours.
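
A minimal sketch of the staleness test described above, assuming a Linux host where `stat -c %Y` prints a file's modification time in epoch seconds. The function name `is_stale` and its optional third argument (an explicit "now", handy for testing) are hypothetical and not part of code-server. A missing file is reported separately, because a container that has only just started may not have a heartbeat file yet.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of code-server.
# Exit 0: FILE was last modified more than MAX_AGE seconds before NOW (stale).
# Exit 1: FILE is fresh.
# Exit 2: FILE is missing or unreadable, so no decision is made here.
is_stale() {
  local file="$1" max_age="$2" now="${3:-$(date +%s)}"
  local mtime
  [ -e "$file" ] || return 2
  # Assumption: GNU or busybox stat, where -c %Y is the mtime in epoch seconds.
  mtime=$(stat -c %Y "$file") || return 2
  [ $(( now - mtime )) -gt "$max_age" ]
}
```

With the 2-hour limit mentioned above, a caller could run `is_stale "$heartbeat_file" 7200` and terminate the container only on exit status 0; `$heartbeat_file` stands for wherever the user's heartbeat file is mounted.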

@rohinwork
Author

@code-asher Awesome! Thanks for your response. I am happy to contribute a PR if you could let me know where to get started.

Thanks :)

@nhooyr
Contributor

nhooyr commented Jan 27, 2020

We don't need to send anything to the server, just disconnect the WebSocket if the user has been idle for an excessive amount of time. Opening a new issue.

@rohinrohin

Hi @code-asher - are you still looking for a feature for the "are you still there?" message? (We use the heartbeat today to clean up resources, i.e. pods in Kubernetes.) This would be really useful to us, and if the community also wants it I would be happy to work on it.

@code-asher
Member

Yeah, I'd be open to merging a feature like that. I think we initially decided not to because users will usually disconnect naturally (close the tab, close the browser, computer goes to sleep, I think even in some cases if the browser tab isn't focused it'll close the websocket), but I'm not sure how reliable that all is so a dedicated feature might be helpful.

@antofthy

On startup I usually just try a curl request and see whether I get a 500 or a login redirection, looping until I get a response before reporting to the user that they can proceed.
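
The startup check above could look roughly like the following. This is a sketch under assumptions: the function names `http_status` and `wait_until_ready`, the retry count, and the delay are all hypothetical, and reading a 2xx or 3xx status (such as a login redirect) as "ready" and anything else (such as a 500) as "still starting" is one interpretation of the description. curl does not follow redirects by default, so a login redirect shows up as a 3xx code.

```shell
#!/usr/bin/env bash
# Print only the HTTP status code for URL ("000" when no response arrives).
http_status() {
  curl -s -o /dev/null --max-time 5 -w '%{http_code}' "$1"
}

# Poll URL until it answers with a 2xx or 3xx status.
# Arguments: URL [ATTEMPTS] [DELAY_SECONDS]; the defaults are arbitrary.
# Returns 0 once the service responds, 1 if every attempt failed.
wait_until_ready() {
  local url="$1" attempts="${2:-30}" delay="${3:-2}" code i
  for (( i = 0; i < attempts; i++ )); do
    # curl exits non-zero on connection errors; keep looping in that case.
    code=$(http_status "$url") || true
    case "$code" in
      2??|3??) return 0 ;;
    esac
    sleep "$delay"
  done
  return 1
}
```

A caller would run something like `wait_until_ready "$user_url" && echo "ready"`, where `$user_url` is whatever address the user's container is published on.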

@rohinrohin

rohinrohin commented Jul 23, 2020

@code-asher any suggestions on the approach? Here's what I am thinking:

  1. If we already have a function that's triggered on some user activity in the IDE, I can hook into that with a timeout of, say, 'x' minutes and raise a popup similar to the "reconnecting..." one ("Are you still there?" - Yes/No). On "No" or a response timeout, possibly touch a file that indicates the user doesn't seem to be active. The popup stays active and the user can click "Yes" at any point, in which case the file is deleted.
  2. If we do not have the above, then raise this popup every 'x' minutes for the same functionality as above.

Any other suggestions or approaches that you can share?

@code-asher
Member

I don't think we have anything specifically for user activity so we'd need to add something. My basic idea is:

  1. Hook up event handlers to the body, probably keydown, mousemove, mousedown, scroll, touchstart.
  2. Whenever those events fire update the last active time.
  3. Add an interval that shows the popup if the last active time is more than x minutes ago.
  4. If the user does not answer the popup within a set timeframe disable automatic reconnection and disconnect the web sockets.
  5. If the user comes back and answers the popup reconnect the web sockets and re-enable automatic reconnection.

The first three steps can all be done in lib/vscode/src/vs/server/browser/client.ts. The last two will involve modifying the existing reconnection logic in VS Code since right now it just always automatically reconnects and we need to be able to toggle that on/off.

@futurist

@code-asher For user idle detection, there's an Idle Detection API in Chrome:

https://web.dev/idle-detection/

Maybe you could look at some polyfills, etc.

@code-asher
Member

code-asher commented Jun 30, 2021 via email

@shivanik6z

> If there have been any HTTP requests in the past minute or if there are any active web sockets then the user is considered active. An active web socket is one that is connected even if there is no activity on it. So this includes idle users that just have the tab open (even if it's in the background). If the user closes the tab or their network connection dies then the heartbeat will stop until they connect again. There's currently no way to detect if the user is idle. I was thinking about capturing idle data on the client-side (probably a timer based on mouse and keyboard activity) then sending that through the socket to the server but it's not implemented yet. Maybe we could show an "are you still there?" message as part of that.

Is there any chance that you expose the API which updates the heartbeat file? I want to read the timestamp of the heartbeat file for my use case. Do you have any API for this?
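
The activity rule stated at the top of this comment (an HTTP request within the past minute, or at least one connected web socket) can be modelled in a few lines. This is only an illustration of the rule as worded there, not code-server's implementation; the function name `is_active` and its arguments are hypothetical.

```shell
#!/usr/bin/env bash
# Model of the stated rule, not code-server's actual code.
# Arguments: LAST_REQUEST (epoch seconds), OPEN_SOCKETS (count), [NOW].
# Exit 0 (active) when a web socket is connected, even an idle one,
# or when the last HTTP request is at most 60 seconds old.
is_active() {
  local last_request="$1" open_sockets="$2" now="${3:-$(date +%s)}"
  [ "$open_sockets" -gt 0 ] || [ $(( now - last_request )) -le 60 ]
}
```

Note that under this rule a background tab with an open socket counts as active indefinitely, which is why the heartbeat alone cannot distinguish an idle user from a working one.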

@antofthy

antofthy commented Nov 30, 2021

Basically I have a shell script which loops over each of the users currently running a Docker instance with code-server.
This script collects the timestamp (how old the file is) of the code-server heartbeat, as stored in the user's Docker-mounted home directory. It also collects another timestamp of when the user started their instance, so as to know whether the Docker instance is only just starting up, in which case code-server may not be running yet or the user may not have connected to it yet.

Basically I get a list of users and how long they have been 'idle', as well as other information, such as which 'node' their Docker instance is running on and which prepared Docker image they are running (for different software development environments).

From this I do the following.

  • Unless 'quiet' (running from cron), output a table of running users and their idleness
  • Kill off any user that has been 'idle' too long (> 1 hour)
  • Log counts of total user instances, how many are 'idle' (> 10 minutes), counts of images, and how many instances are on each Docker node

These stats are logged to (remote) system logs so that I can generate usage graphs (Splunk), or run other scripts to report on the number of unique users using the system per day.

So while the script's actions are fairly simple (loop over logged-in users and report 'idleness' from heartbeat files), it does quite a lot with that information.
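
The kill and count steps from the list above might be sketched like this. The input format (one `user age_in_seconds` pair per line), the function name `classify_idle`, and the output wording are hypothetical; the two thresholds are the 10-minute and 1-hour values mentioned above. The sketch only prints what it would do and leaves the actual termination to the caller.

```shell
#!/usr/bin/env bash
IDLE_AFTER=600    # seconds: counted as 'idle' beyond 10 minutes
KILL_AFTER=3600   # seconds: flagged for termination beyond 1 hour

# Read "user age_in_seconds" lines from stdin.
# Prints "kill USER" for each user past KILL_AFTER, then one summary line.
classify_idle() {
  local user age total=0 idle=0
  while read -r user age; do
    [ -n "$user" ] || continue          # skip blank lines
    total=$(( total + 1 ))
    if [ "$age" -gt "$IDLE_AFTER" ]; then idle=$(( idle + 1 )); fi
    if [ "$age" -gt "$KILL_AFTER" ]; then echo "kill $user"; fi
  done
  echo "total=$total idle=$idle"
}
```

For example, `printf 'alice 30\nbob 900\ncarol 4000\n' | classify_idle` prints `kill carol` followed by `total=3 idle=2`.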

Getting a time from a heartbeat file (the Perl snippet prints the file's age in seconds, measured from the moment perl started, followed by its modification time in epoch seconds)...

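# Constant: age (in seconds) reported when the file does not exist yet
INFINITE=99999999

# Print "<age in seconds> <mtime in epoch seconds>" for a file.
file_timestamp() {
  local log="$1"
  if [ -e "$log" ]; then
    # Pass the path as an argument instead of splicing it into the Perl
    # source, so quotes or spaces in the path cannot break the script.
    # $^T is the time perl started; element 9 of stat() is the mtime.
    perl -e '
        $last = (stat($ARGV[0]))[9];
        print $^T - $last, " ", $last, "\n";
      ' "$log"
  else
    echo "$INFINITE 0"
  fi
}

while read -r user image; do
   #...
   read -r delta time <<<"$(file_timestamp "path_to_users_heartbeatfile")"
   #...
done < <( docker service ls --format "{{.Name}} {{.Image}}" )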

So I call file_timestamp() for each running user's heartbeat file and for a file touched when they logged in, and take the smaller of the two ages. This works even when the heartbeat file does not exist yet, as is the case for first-time users.
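
Taking the smaller of the two ages could be sketched as follows. It is written with `stat -c %Y` (assumed GNU or busybox) rather than Perl so that it stands alone; the function names `age_of` and `idle_seconds` and the file arguments are hypothetical. A missing file counts as infinitely old, so a first-time user with no heartbeat yet is judged by the login file alone.

```shell
#!/usr/bin/env bash
INFINITE=99999999   # age reported for a file that does not exist

# Print the age of FILE in seconds, or INFINITE when it is missing.
age_of() {
  local file="$1" now="${2:-$(date +%s)}"
  if [ -e "$file" ]; then
    echo $(( now - $(stat -c %Y "$file") ))
  else
    echo "$INFINITE"
  fi
}

# Print the smaller of the heartbeat age and the login-file age.
idle_seconds() {
  local hb login
  hb=$(age_of "$1")
  login=$(age_of "$2")
  if [ "$hb" -lt "$login" ]; then echo "$hb"; else echo "$login"; fi
}
```

The result is INFINITE only when both files are missing, which a caller may want to treat as an error rather than as a very idle user.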

I hope this helps...

@antofthy

Update: I added code to my idle monitor/timeout script to pull web requests from the Docker 'ingress' server for the user container. This lets me capture user interaction with the Docker container for things like the Apache web server or any Node.js servers the user is developing. More importantly, it works for Jupyter Notebook, whose idle reporting API does not seem to capture file editing, only notebook/kernel activity.

However, web monitoring does not work completely for idle-monitoring code-server. It looks as if code-server holds a web connection open, at least for things like a CLI terminal, so no web requests complete while the user is using the code-server terminal.

As such, checking the code-server heartbeat is a MUST when testing for idle timeout of the user's Docker service.

@yiliang114
Contributor

I want to ask about the following situation: I start a very long task (such as AI-related training) through code-server's terminal, and the computer may go to sleep. If I make no HTTP requests within a minute and the websocket also disconnects, does code-server stop?

@yiliang114
Contributor

A follow-up question: are the heartbeat and automatic stop functions only suitable for serverless deployments?

I understand that the purpose of the heartbeat is to automatically stop the Node service when code-server has not been used for a long time, and to tell the container service in some way that the current pod can be destroyed, so as to save costs.

However, if I deploy code-server on a server that never stops, do I actually have no use for the heartbeat function, assuming I don't have to worry about memory consumption for the time being? Whether code-server is stopped or not, my server is always running and being charged for.

@code-asher
Member

code-asher commented Jan 13, 2025

code-server itself will never stop on its own. Some folks have implemented external services that monitor the heartbeat and stop code-server, but code-server has no automatic stop of its own.

However, code-server will kill disconnected terminals after a time, so your long-running process will be killed if left disconnected too long. You may want to run it in something like tmux or screen.

If you are not worried about resource consumption, I would say there is no reason to stop code-server.
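
As an illustration of the tmux suggestion above, a long-running job can be started in a detached session so that it does not depend on the code-server terminal staying connected. This is a sketch assuming tmux is installed; the wrapper name `run_detached`, the session name, and the command shown in the usage note are placeholders.

```shell
#!/usr/bin/env bash
# Start COMMAND in a new detached tmux session called SESSION.
# Refuses to reuse an existing session name.
run_detached() {
  local session="$1" command="$2"
  if tmux has-session -t "$session" 2>/dev/null; then
    echo "session $session already exists" >&2
    return 1
  fi
  # -d: do not attach; -s: session name; the command runs via the shell.
  tmux new-session -d -s "$session" "$command"
}
```

For example, `run_detached train 'python train.py'` starts the job, and `tmux attach -t train` reattaches to it later from any terminal.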
