-
Notifications
You must be signed in to change notification settings - Fork 5.9k
Prefer matching editor sessions when opening files. #6191
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
b61e5d6
to
8680610
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the contribution and nice work!
The test failures seem to be unrelated to your changes. Not sure what is going on there.
@code-asher Here's a product question: If two tabs for the same workspace are opened, and if the later-opened tab is closed, should we open files in the first tab? Or are we okay with the first tab being orphaned? To rephrase in pseudocode:
|
Another product question: We should support editor resolution across multiple daemonized code-server instances, right? Or is it sufficient to assume a single a daemonized code-server instance? |
Yes, you're right, the PR right now has the issue of only deleting when opening files via CLI. If we don't want orphan tabs (#6191 (comment)), we cannot overwrite an entry based on some key (i.e. workspace id) as a mechanism to limit the growth of entries, as each session needs to be stored independently. AFAIK this means we need to perform some sort of explicit "garbage collection" procedure to limit the growth of entries. Here's my thought process into how to do this, which builds up to what I'm ultimately proposing (3). 1) Garbage collect on startupSuppose we "garbage collect" when a daemonized code-server instance starts up: iterate through all sockets and delete dead ones. This still leaves the possibility for entries to grow unbounded if only a single code-server instance runs and never stops. 2) Periodic garbage collectionSame as (1), but we garbage collect periodically as to limit entries from growing unbounded given a single code-server instance. Note that if n daemonized code-server instances are running, we have to live with n times more garbage collections. 3) Periodic garbage collection by a single instanceIf we're not okay with n-times more garbage collections, we can allow at most a single instance to own responsibility of garbage collection (a "leader"): Same as (2), but upon garbage collection, we write a All instances will periodically check the
@code-asher WDYT? |
There is also another possibility where we use an HTTP server instead of a file to manage editor sessions. This could be served by a daemonized code-server instance, and run under a dedicated UDS socket.
This way, because a single server handles registration, we never have to deal with races. Also, we can just use JSON instead of new-line delimited JSON. Also, we have an API :) To account for multiple daemonized code-server instances, we can serve under a single instance, using the mechanism detailed in #6191 (comment) to select the serving instance (and garbage-collecting instance). |
Interesting scenario! All else being equal opening in the first tab does seem like the better option but opening the same workspace multiple times at once seems like an unusual workflow to me (maybe there is a use case I am missing though) so I personally would not be opposed to having it be orphaned if that makes things easier for us.
We have generally been operating under the assumption that there is a single code-server daemon per user. I know some people run one instance per project/workspace but that is not how it was meant to be used. That does make me realize we might want to use These are very well thought-out options! If we end up deciding a tab can be orphaned if a second tab with that same workspace is opened then we could do this file-based (named by workspace ID or something) that way we would not need any garbage collection. Actually that would eliminate a type of theoretical race the garbage collection system has: if a tab is opened as garbage collection is running after the collector reads but before it writes. code-server is meant to be long-lived so I would avoid only garbage collecting on startup and since we expect one instance per user I think it will not be worth the effort to do any coordination between multiple instances. I think the API idea is brilliant. Come to think of it, we actually already use this pattern where VS Code checks for updates against an We could probably skip the GET handler if we wanted, since it would be the same instance we could just call something directly. But maybe a GET handler would be nice if there are external tools that want to hook into this API or something. The only other ideas I have are to do some kind of file locking or store the socket paths via VS Code's storage provider which presumably does its own file locking. The advantage to one of those is that the change would be more likely to be accepted upstream one day, but since this is already a feature specific to code-server going the API route seems reasonable to me. |
So I was trying to see how native VS Code does this and I think they do it by creating predictable socket names (see Then when opening a file like Presumably VS Code cleans up their own sockets so we would not have to worry about that. Or even if they do not clean up their own sockets at least it would be an already existing problem rather than a regression. |
Great, not having to worry about multiple instances should dramatically simplify things :) If we combine this assumption with going the server route, we can simplify things even more by keeping the session registry entirely in memory (instead of in a file). Once the daemon dies, all the sessions end up dying too, so we'll never have sessions from previous startups.
If I reflect on my own usage of code-server, for most part I converge to one editor session per workspace. But then sometimes I accumulate enough Chrome windows and tabs to the point that spinning up a new editor session to a workspace I might already have opened is less work than finding my original tab. Or suppose I accidentally open the workspace again in some tab. In either case, I would not expect opening and closing the later opened tab to affect the behavior of some previously opened tab. Here's another thing to consider: once #1964 is resolved, you could imagine use cases popping up that involve hotlinking / bookmarking files, which probably only makes sense to open in new editor sessions (without something like a Chrome extension to foreground existing tabs, but I digress). For example, I currently use code-server to jot down notes in markdown in a single workspace, and once #1964 is resolved I'd want to be able to make quick ephemeral edits to certain files by just clicking into my bookmarks bar. FWIW VSCode will open files in an earlier-opened tab if a later-opened tab is closed. |
I traced through where the IPC path comes from in extHostExtensionService and it seems like it actually uses
That's an interesting idea, to use the filesystem itself as some sort of registry, where the paths are esssentially registry keys. And not have to maintain a separate source of state. We still have an orphan tab problem that any sort of registry with overwrites would cause. But even if we ignore that, we run into issues once we consider multi-root workspaces. For example, a workspace with two roots would have to be registered under two keys, aka two paths. And a socket can only live under one path. (...unless we symlinked the socket under different path... hmm, yeah I'm not sure if this is worth it all just to avoid having a separate source of state...) What I'm thinking: I think if we go the http server route, the registry can just live entirely in memory, which should avoid a lot of cleanup since the editor session registry is always effectively cleared on restart.
|
Yup exactly, by native VS Code I mean the Electron version. My thinking was that if all of these cases work as expected in Electron-based VS Code and we want parity then we could just copy what they are doing into the web version. Buuuuut I completely misread what they do; I thought they were creating a static IPC handle per workspace and using that to open in the right workspace but I think they actually just always have the single socket and connect every instance to that, then I imagine they find the right instance in memory. Since the web always spins up a new socket on every connection (unfortunately) there is no way we can copy that without changing the architecture to match. 😢
No orphan tabs I think because we would append with a suffix if the socket is in use by another tab (so we never overwrite active sockets). We might have inactive sockets if VS Code is not cleaning them up but no orphaned tabs.
Ahh excellent point! The symlink idea is quite clever but yeah I agree with abandoning the file system idea; I was wrong about Electron-based VS Code doing the same thing which was the only reason I even brought it up. 🤦 All things considered HTTP seems like the best path forward to me.
From reading your use cases I feel convinced we should avoid orphaning tabs. One alternative idea to GC is to remove the socket when the extension host terminates, maybe through a DELETE endpoint. We could put the API calls in https://github.com/microsoft/vscode/blob/9ff83d41a3268ec578c032ae54102285b447e947/src/vs/workbench/api/node/extensionHostProcess.ts in the |
Hmm one thing to consider is that the extension host will need to know where to send the HTTP request which I suppose means we need to send the port or socket that code-server is listening on down to the extension host. But if we have to send a message down anyway, maybe we could send a path for it to register its socket on (rather than have it generate one randomly) then we would not need the extension host to make any HTTP calls back since we already know the socket it will use. When the host process exits we can remove the socket. Or alternatively we could post messages back up to code-server through |
Ah wait no we could not send down the socket path because we need to know all the workspaces attached to it as well. Although we could figure this out by checking the And I think sending the port/socket would not work either because it might be authenticated and the extension host will not have credentials. We could probably come up with something for that but maybe it would be easier to send the IPC handle and workspaces up through A separate non-authenticated socket just for this could work although I suppose an extra socket is somewhat unfortunate. We could add a new command to VS Code's socket that replies with its workspaces and get them that way but that seems like a lot to me. |
Hmm, why don't we just do something like
Oooh, that's not a bad idea. I like the simplicity of it, though it does require yet another patch, for arguably a pretty non-critical piece of functionality.
Let me take a first pass with the HTTP server and I'll try your suggestion with the DELETE endpoint. |
Yeah we can add another socket although we may want to avoid |
Come to think of it, I guess we actually have to add another socket with a predictable file path so the external CLI can interact with it otherwise we would have to go back to storing a list of sockets in a file which is the whole thing we are trying to avoid hahaha |
f678836
to
a3bee4d
Compare
2554c19
to
15156a4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Finally got around to finishing this!
Now a daemonized instance will spin up an additional http server at $TMPDIR/code-server.sock
.
- The extension host connects to it to register sessions on creation and deregister on termination. No garbage collection needed :)
- The CLI connects to it to get the best matching active socket
Yeah we can add another socket although we may want to avoid
$tmpdir
and do something user-based to avoid users being able to interfere with other users' instances.
Both the extension host's vscode-ipc-XXX.sock
files and the old vscode-ipc
file use $TMPDIR, so perhaps this could be done a in a separate PR?
@@ -17,7 +17,7 @@ describe("createApp", () => { | |||
beforeAll(async () => { | |||
mockLogger() | |||
|
|||
const testName = "unlink-socket" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if you're aware of this but there are maximum unix socket path lengths of ~100 chars, so I had to shorten the test name to get this working on MacOS.
Without shortening it, there's some really wonky behavior where it thinks the socket already exists, which seems to be caused by the fact we're creating two sockets now.
Spent 2/3 hours debugging before I realized the socket path lengths were the issue. I'd suggest that we error out explicitly if the socket path lengths exceed the OS maximum, or at least shorten the test tmpdir paths, lest some poor developer come across this silent failure mode again...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh wow that is wild. Good to know! Erroring out seems very wise to me. Sounds like the perfect nightmare of a bug.
Absolutely stellar! I will review the changes first thing tomorrow. Thank you sticking through all the back and forth!
That makes perfect sense! 👍 |
One thing we will have to do is update VS Code to 1.78.2 and make sure the patches still apply since mainline was updated to 1.78.2 somewhat recently. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just rebased it to the latest version on main (with 1.78.2), works so far for me!
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## main #6191 +/- ##
==========================================
+ Coverage 72.61% 73.54% +0.93%
==========================================
Files 30 31 +1
Lines 1749 1860 +111
Branches 387 399 +12
==========================================
+ Hits 1270 1368 +98
- Misses 402 415 +13
Partials 77 77
Continue to review full report in Codecov by Sentry.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like the linter wants something formatted then we are good to go!
For some reason when I run yarn watch
I still get no folders but I downloaded the build from CI and it works perfectly fine for me so I figure it is something weird on my machine.
Signed-off-by: Sean Lee <[email protected]>
Cool, fixed the formatting issue! |
Thanks again! |
Signed-off-by: Sean Lee <[email protected]>
Fixes #5709
readSocketPath
is now replaced byVscodeSocketResolver
which will prefer editor sessions that match the files being opened.How it works:
vscode-ipc
.vscode-ipc
vscode-ipc
vscode-ipc
with those sockets deleted. This also serves as a mechansim to prevent the file from growing unbounded.Notes:
Future improvements:
/foo/bar/
beats/foo/
given/foo/bar/baz.txt
). This was left out, but should not be too hard to add.