
[content-service] cannot restart stopped workspace #11183


Closed
kylos101 opened this issue Jul 6, 2022 · 50 comments · Fixed by #15216

Comments

@kylos101
Contributor

kylos101 commented Jul 6, 2022

Bug description

The full error on start is:

cannot initialize workspace: cannot restore backup: tar /dst: tar /dst: exit status 2;
tar: .docker-root/overlay2/dedcf720e850fe485571223ecd9959f159eb8eef5250e7ee42ac96f010a3e369/diff/root/.npm/_cacache/content-v2/sha512/2c/49/1e6cfe1a91b8ccb4030ecac14cfe4d1d4e557aaf369a254fd344a9a7e864c7f4cab9edb19e21e7313357cb8eaacc48e19d7fc2b1d2ed294af55846548846: Cannot open: Permission denied
tar: .docker-root/overlay2/dedcf720e850fe485571223ecd9959f159eb8eef5250e7ee42ac96f010a3e369/diff/root/.npm/_cacache/content-v2/sha512/74/de/010104b1dac999964663a5780e77912dd860c4bf1089dabe1f9b8175af2aaed3b5e5cba0f24fe31bfc38f19bcc40ffd609a6d2ab6ea02561055f96ba53a6: Cannot open: Permission denied
tar: Exiting with failure status due to previous errors

Logs:
https://cloudlogging.app.goo.gl/82CpwrNzW9oJWX1Q8

Log entry for the error:
https://console.cloud.google.com/logs/query;cursorTimestamp=2022-07-06T11:31:39Z;query=resource.labels.cluster_name%20:%20%2528%22eu51%22%2529%0A%22eb595f2b-7de4-4248-800d-7dfe0280f802%22%0Atimestamp%3D%222022-07-06T11:31:39Z%22%0AinsertId%3D%227hkbcwop4k5qd0t4%22;summaryFields=:false:32:beginning:false;timeRange=P1D?project=workspace-clusters

Trend over last 7 days:
https://cloudlogging.app.goo.gl/qdGdJ7CrD1zmLyrQ8

image

Steps to reproduce

  1. Open this repository on Gitpod: https://gitlab.com/6uliver/gitpod-workspace-restart-error-repro
  2. Wait for the workspace and SuperTokens to start. You should see this message in the console: "Started SuperTokens on 0.0.0.0:3567"
  3. Stop the workspace and wait for it to be stopped
  4. Open the workspace again to start it

Expected result

The workspace can be started successfully and Supertokens is running.

Actual result

You can see a Gitpod error page with the text "Oh, no! Something went wrong!" and the following long error message:

initializeWorkspaceContent failed: cannot initialize workspace: cannot restore backup: tar /dst: tar /dst: exit status 2;
tar: .docker-root/overlay2/a061703a1735ce483b1f0d56341fe6943e2adf755b31b20a3f8ed82ee8f38e9e/diff/usr/lib/supertokens/jre/legal/java.base/ADDITIONAL_LICENSE_INFO: Cannot open: Permission denied 
tar: .docker-root/overlay2/a061703a1735ce483b1f0d56341fe6943e2adf755b31b20a3f8ed82ee8f38e9e/diff/usr/lib/supertokens/jre/legal/java.base/ASSEMBLY_EXCEPTION: Cannot open: Permission denied 
...
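
To see at a glance which files tar could not open, the failing paths can be pulled out of an error message like the one above. This is a minimal sketch in POSIX shell and awk; `failed_paths` is a hypothetical helper name, and it assumes the message keeps the `tar: <path>: Cannot open: Permission denied` shape shown in this issue.

```shell
# failed_paths: read an error message on stdin and print one path per line
# for every "Cannot open: Permission denied" entry reported by tar.
# Hypothetical helper for triage; not part of Gitpod or tar.
failed_paths() {
  awk '{
    # Entries may share one line or sit on separate lines; split on "tar: ".
    n = split($0, parts, "tar: ")
    for (i = 1; i <= n; i++) {
      p = parts[i]
      # Keep only entries that end with the permission error, minus the suffix.
      if (sub(/: Cannot open: Permission denied[ ;]*$/, "", p)) print p
    }
  }'
}
```

Piping the copied error text into `failed_paths` lists the affected paths, which makes it easier to check whether they all sit under `.docker-root`.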

Workspace affected

https://prlct-shipapp-5830i5ovwcb.ws-eu51.gitpod.io/

Expected behavior

The ability to restart a stopped workspace

Example repository

No response

Anything else?

How to test? From here.

How to test

  1. Run these tests 👇
  • Open a workspace. In the terminal execute:
touch basic
sudo touch root-file
touch executable-file
chmod +x executable-file
touch suid-file
chmod 2555 suid-file
mkdir suid-directory
chmod g+s suid-directory
  • Check the output of ls -lat, should be similar to
drwxr-x--- 8 gitpod gitpod 4096 Apr 28 17:31 .
-r-xr-sr-x 1 gitpod gitpod    0 Apr 28 17:31 suid-file
drwxr-sr-x 2 gitpod gitpod    6 Apr 28 17:31 suid-directory
-rwxr-xr-x 1 gitpod gitpod    0 Apr 28 17:30 executable-file
-rw-r--r-- 1 root   root      0 Apr 28 17:30 root-file
-rw-r--r-- 1 gitpod gitpod    0 Apr 28 17:30 basic
  • Stop and open the workspace
  • Open a terminal and run ls -lat
  • Compare the outputs
  2. Assert the steps to recreate here do not fail

  3. Follow the steps to recreate in this issue, but before stopping your workspace, stop the supertokens container first. Then stop and restart your workspace; it should avoid this error.
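
The comparison in the first test can be scripted rather than done by eye. The sketch below is an assumption about how one might do that, not part of the Gitpod test suite: `make_fixtures` and `snapshot` are hypothetical helper names, and the root-owned file is only created when `sudo` works without a prompt.

```shell
# make_fixtures DIR: create the files from the test plan inside DIR.
make_fixtures() {
  mkdir -p "$1" || return 1
  (
    cd "$1" || exit 1
    touch basic executable-file suid-file
    chmod +x executable-file
    chmod 2555 suid-file          # setgid bit plus r-xr-xr-x
    mkdir -p suid-directory
    chmod g+s suid-directory
    # Only attempt the root-owned file when sudo is usable non-interactively.
    if sudo -n true 2>/dev/null; then sudo touch root-file; fi
  )
}

# snapshot DIR: print "mode owner group name" for each entry in DIR,
# in glob (name) order, so two snapshots can be compared with diff.
snapshot() {
  (
    cd "$1" || exit 1
    for f in *; do
      [ -e "$f" ] || continue
      ls -ld -- "$f" | awk -v name="$f" '{ print substr($1, 1, 10), $3, $4, name }'
    done
  )
}
```

Before stopping the workspace, run `make_fixtures /workspace/perm-test` and `snapshot /workspace/perm-test > /workspace/perm-before.txt`. After the restart, `snapshot /workspace/perm-test | diff /workspace/perm-before.txt -` should print nothing. The `/workspace/...` paths are illustrative, chosen on the assumption that files under /workspace survive a stop and start.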


@kylos101 kylos101 added the type: bug Something isn't working label Jul 6, 2022
@kylos101 kylos101 moved this to Scheduled in 🌌 Workspace Team Jul 6, 2022
@kylos101
Contributor Author

kylos101 commented Jul 6, 2022

@utam0k is this something you could look at next, once you are free? 🙏 It appears to be impacting many users.

@sagor999
Contributor

sagor999 commented Jul 7, 2022

This is probably related to other similar issues of accessing .docker-root folder:
#10569
#10108

@utam0k utam0k self-assigned this Jul 8, 2022
@utam0k utam0k moved this from Scheduled to In Progress in 🌌 Workspace Team Jul 8, 2022
@utam0k
Contributor

utam0k commented Jul 8, 2022

There really seem to be a few more users with the same error.
https://cloudlogging.app.goo.gl/owK8qukCGMcpkKQk7
image

@utam0k utam0k moved this from In Progress to Scheduled in 🌌 Workspace Team Jul 8, 2022
@utam0k utam0k removed their assignment Jul 8, 2022
@utam0k
Contributor

utam0k commented Jul 8, 2022

I removed myself as assignee because I'll be on vacation next week.

@utam0k
Contributor

utam0k commented Jul 8, 2022

I have tried everything and could not reproduce it with only this information in its current state. Someone else encountered the same error in our repository below, but I was able to reproduce it.
https://github.com/gitpod-io/template-docker-compose

Maybe some manipulation is needed in the container.

@filipjnc

I encounter the same issue. I cannot start/restart a workspace, so I need to create a new one every time. Work not pushed is work lost.

@filipjnc

Decided to try it out again today. It worked fine for an hour until the container stopped while I was working. Afterwards, I could not start the workspace. A whole hour's worth of work is gone - so annoying. Come on, Gitpod, get it together and fix it.


cannot initialize workspace: cannot restore backup: tar /dst: tar /dst: exit status 2;
tar: .docker-root/overlay2/0ad5fb59a0da61c61bef0492772e450b29c3afb2c36d058fb30501d3fff86ac0/diff/usr/lib/supertokens/jre/legal/java.base/ADDITIONAL_LICENSE_INFO: Cannot open: Permission denied
tar: .docker-root/overlay2/0ad5fb59a0da61c61bef0492772e450b29c3afb2c36d058fb30501d3fff86ac0/diff/usr/lib/supertokens/jre/legal/java.base/ASSEMBLY_EXCEPTION:
…

@utam0k
Contributor

utam0k commented Jul 31, 2022

Hi, @filipjnc. Thanks for your report. This issue is already scheduled but has not yet been worked on due to priorities.
Can I ask you to share the repository or how to reproduce it?

@filipjnc

Hi @utam0k,
Workspace ID: filipjnc-dtlearnhunt-l2swncfwtr7
Cluster: ws-eu54.gitpod.io

I can reproduce it as follows:

  • Start new workspace from GitHub repo
  • Shut down workspace or wait for timeout
  • Can't restart it again, the error above keeps coming up. As if the container has been permanently corrupted.

@utam0k
Contributor

utam0k commented Jul 31, 2022

@filipjnc Thanks for your information. It helps us to resolve it. Is the reproduction rate 100%?

@filipjnc

> @filipjnc Thanks for your information. It helps us to resolve it. Is the reproduction rate 100%?

Yes. Can reproduce it every time on my end. All old (damaged) workspaces could never be started again.

@utam0k
Contributor

utam0k commented Aug 1, 2022

Sorry for the trouble. Thanks for your help.

@kylos101 kylos101 moved this from Scheduled to Backlog in 🌌 Workspace Team Aug 1, 2022
@semiautomatix

image

Unfortunately this is about the fourth workspace I've lost. I'm becoming obsessive about git commit before I switch off as trust in Gitpod is minimal at the moment.

@semiautomatix

> Hi, @filipjnc. Thanks for your report. This issue is already scheduled but has not yet been worked on due to priorities. Can I ask you to share the repository or how to reproduce it?

As a way of possibly testing, I'm using a skeleton of this project: https://github.com/sprintcube/docker-compose-lamp

@semiautomatix

And again...

image

I forgot to push one of my commits; fortunately it's a small change. But it's annoying because now I have to recreate the whole damn docker image.

@semiautomatix

Attempted docker-compose down to bring everything down and no dice.

image

@jenting
Contributor

jenting commented Aug 9, 2022

>> Hi, @filipjnc. Thanks for your report. This issue is already scheduled but has not yet been worked on due to priorities. Can I ask you to share the repository or how to reproduce it?
>
> As a way of possibly testing, I'm using a skeleton of this project: https://github.com/sprintcube/docker-compose-lamp

@semiautomatix Sorry for the late reply 🙏

I tried to use the repo https://github.com/sprintcube/docker-compose-lamp you provided

  • Open workspace
  • Run docker-compose run
  • Write some files under the path /workspace/docker-compose-lamp
  • Manually stop the workspace

However, I can't reproduce it. Would you please provide more detailed steps? Thank you.

@jenting
Contributor

jenting commented Aug 9, 2022

> Hi @utam0k, Workspace ID: filipjnc-dtlearnhunt-l2swncfwtr7 Cluster: ws-eu54.gitpod.io
>
> I can reproduce it as follows:
>
>   • Start new workspace from GitHub repo
>   • Shut down workspace or wait for timeout
>   • Can't restart it again, the error above keeps coming up. As if the container has been permanently corrupted.

@filipjnc Thanks for providing the information. I tried to access your repo to reproduce the problem; however, since it's private, I can't do any further testing.

@semiautomatix

>>> Hi, @filipjnc. Thanks for your report. This issue is already scheduled but has not yet been worked on due to priorities. Can I ask you to share the repository or how to reproduce it?
>>
>> As a way of possibly testing, I'm using a skeleton of this project: https://github.com/sprintcube/docker-compose-lamp
>
> @semiautomatix Sorry for the late reply 🙏
>
> I tried to use the repo https://github.com/sprintcube/docker-compose-lamp you provided
>
>   • Open workspace
>   • Run docker-compose run
>   • Write some files under the path /workspace/docker-compose-lamp
>   • Manually stop the workspace
>
> However, I can't reproduce it. Would you please provide more detailed steps? Thank you.

Thanks for the update. I was hoping a clean pull would recreate the error.

Additionally, I've cloned the project, added code to the www folder, started docker-compose, and imported an SQL file into the database.

I'll attempt to recreate the error, and provide access to the repo.

@atduarte atduarte moved this from Breakdown to Scheduled in 🌌 Workspace Team Aug 16, 2022
@axonasif
Member

@6uliver sorry for the trouble, could you please email us at [email protected] with your workspace ID? Our support team could try getting a backup for you.

@6uliver

6uliver commented Nov 18, 2022

> @6uliver sorry for the trouble, could you please email us at [email protected] with your workspace ID? Our support team could try getting a backup for you.

Sure, thank you for your help, maybe I will write. But asking support to restore my workspace every time this problem happens is only a temporary solution :)

@atduarte
Contributor

atduarte commented Nov 24, 2022

I'm sorry we haven't solved this yet, @6uliver and @YoungElPaso. I just tested the repository linked above (thanks, @6uliver!) and the error continues to happen.

@kylos101 given the huge impact of this error, I will add this to breakdown. Btw was it marked as blocked because of PVC?

@kylos101
Contributor Author

@atduarte I removed the block, as we've stopped the PVC work. In other words, we blocked this issue to focus effort (time) on PVC. Now that that's stopped, we should resume resolving this issue. As a 🛹 , perhaps on stop, we can issue docker-compose stop ourselves, and stop running containers, before actually starting the backup
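
The stopgap suggested above (stop running containers before the backup starts) might look roughly like the following. This is only an illustrative sketch, not Gitpod's implementation and not necessarily what the eventual fix did; `stop_all_containers` is a hypothetical name, and it relies only on the standard `docker ps -q` and `docker stop` commands.

```shell
# stop_all_containers: stop every running container so that nothing keeps
# writing under .docker-root while the workspace backup is taken.
# Hypothetical helper; the name and its use as a pre-backup hook are assumptions.
stop_all_containers() {
  # "docker ps -q" prints only the IDs of running containers, one per line.
  ids=$(docker ps -q) || return 1
  if [ -z "$ids" ]; then
    return 0                      # nothing is running, nothing to stop
  fi
  # Word splitting is intended here: each ID becomes its own argument.
  # shellcheck disable=SC2086
  docker stop $ids
}
```

Calling `stop_all_containers` by hand in a workspace terminal before stopping the workspace is equivalent to the manual workaround described in the "How to test" section of this issue.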

@atduarte
Contributor

@kylos101 👍 if that doesn't work for some reason, another (not great, but pragmatic and temporary) option might be allowing users to define shutdown tasks in the .gitpod.yml. I believe @svenefftinge looked into that in the past.

Hoping we can fix the core issue though 🙏

@kylos101
Contributor Author

kylos101 commented Nov 28, 2022

> Hoping we can fix the core issue though 🙏

💯 🎯 yes, I believe @Furisto may even have a branch that could help with this, but, it's in a draft state. @Furisto is that right?

@sagor999
Contributor

sagor999 commented Nov 29, 2022

@kylos101 @svenefftinge was working on a more generic approach to this problem, allowing users to specify shutdown tasks. Here is that PR: #11287
But it seems like work has stopped on it. :(

@kylos101
Contributor Author

Added steps to recreate and how to test (by inspecting prior PRs), moving to Scheduled

@kylos101 kylos101 moved this from Breakdown to Scheduled in 🌌 Workspace Team Nov 29, 2022
@Furisto Furisto self-assigned this Dec 2, 2022
@Furisto Furisto moved this from Scheduled to In Progress in 🌌 Workspace Team Dec 2, 2022
@Furisto Furisto moved this from In Progress to Scheduled in 🌌 Workspace Team Dec 5, 2022
Repository owner moved this from Scheduled to Awaiting Deployment in 🌌 Workspace Team Dec 7, 2022
@kylos101
Contributor Author

@utam0k can you move this issue to In Validation? Its PR is deployed.

@kylos101
Contributor Author

👋 @6uliver @YoungElPaso @filipjnc @semiautomatix @nisan1337 @Nishchit14 , we wanted to reach out and let you know that this issue is in fact resolved as of gen79 (us79 or eu79). I just double checked logs, and there are no traces of this error as of gen79 (which contains the fix). Let us know if you continue to have any trouble restarting stopped workspaces?

@utam0k utam0k moved this from Awaiting Deployment to In Validation in 🌌 Workspace Team Dec 15, 2022
@6uliver

6uliver commented Jan 4, 2023

> 👋 @6uliver @YoungElPaso @filipjnc @semiautomatix @nisan1337 @Nishchit14 , we wanted to reach out and let you know that this issue is in fact resolved as of gen79 (us79 or eu79). I just double checked logs, and there are no traces of this error as of gen79 (which contains the fix). Let us know if you continue to have any trouble restarting stopped workspaces?

Great news! We tried it out for our project and it's working fine! Thank you very much, it has a great impact for us!

@Furisto Furisto moved this from In Validation to Done in 🌌 Workspace Team Jan 4, 2023
@filipjnc

filipjnc commented Jan 4, 2023

I'm able to restart the pods without issues now too. Loving gitpod again :)

@srgwsrgwetgethg

Having this issue now with a workspace. See also #16660. Opening a new workspace with the same prebuild does not help. Rebuilding the prebuild also does not help. This is serious.

@kylos101
Contributor Author

kylos101 commented Jul 11, 2023

👋 @srgwsrgwetgethg this was an incident https://www.gitpodstatus.com/incidents/rs838czq8clg, and resolved by #18236
