Skip to content

Project: remote repo sync button stops updating page #10068

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
agjohnson opened this issue Feb 24, 2023 · 8 comments · Fixed by #10107
Closed

Project: remote repo sync button stops updating page #10068

agjohnson opened this issue Feb 24, 2023 · 8 comments · Fixed by #10107
Assignees
Labels
Bug A bug Needed: replication Bug replication is required

Comments

@agjohnson
Copy link
Contributor

agjohnson commented Feb 24, 2023

It looks like the Celery task v2 API might be broken. This affects anywhere we're using the readthedocs.core.task layer to inspect long running tasks in the background -- project creation is the main one. The experience for new users might be okayish, if the tasks to resync repositories fires off and completes before the user loads the project creation page.

I noticed the following on the project creation page:

https://readthedocs.org/dashboard/import/

When I hit resync on the project creation page, to resync my remote repositories, the task does start (and does appear to finish), but I get JS failures and there is an associated server failure too:

web_1          |   File "/usr/src/app/checkouts/readthedocs.org/readthedocs/core/utils/tasks/retrieve.py", line 28, in get_task_data
web_1          |     state, info = result.state, result.info                                      
web_1          |   File "/usr/local/lib/python3.10/dist-packages/celery/result.py", line 478, in state
web_1          |     return self._get_task_meta()['status']                                       
web_1          |   File "/usr/local/lib/python3.10/dist-packages/celery/result.py", line 417, in _get_task_meta
web_1          |     return self._maybe_set_cache(self.backend.get_task_meta(self.id))           
web_1          |   File "/usr/local/lib/python3.10/dist-packages/celery/backends/base.py", line 609, in get_task_meta 
web_1          |     meta = self._get_task_meta_for(task_id)                                     
web_1          | AttributeError: 'DisabledBackend' object has no attribute '_get_task_meta_for' 

The JS error is a not-very-helpful Error polling task.

@agjohnson
Copy link
Contributor Author

I'm guessing the call in retrieve.py should be AsyncResult(..., backend=some_backend), but I'd need to poke around to figure out what that value should be.

@agjohnson
Copy link
Contributor Author

I've had no luck with this one, it sounds like #10002 or something related changed the backend of the Celery application. I had tried readthedocs.worker.app.AsyncResult(...), but this is where the DisabledBackend comes from in the first place. It's not exactly clear how the Django cache client would be used here, at least not cleanly.

@humitos are you aware of any workaround here that doesn't involve switching back to the normal results backend?

@humitos
Copy link
Member

humitos commented Feb 27, 2023

@humitos are you aware of any workaround here that doesn't involve switching back to the normal results backend?

As far as I know, there is no workaround for this.

However, the correct solution is to remove this button completely from the UI. We are re-syncing the objects multiple times in different situations automatically now. So, if the repository the user is trying to import does not appear in the list, the problem is a different thing and it won't be solved by clicking the arrows (see, for example, #10009)

@agjohnson
Copy link
Contributor Author

agjohnson commented Feb 27, 2023

Are you sure this is the case? I just reconnected my account and I get an empty repository list to start (I even gave this a few refreshes, assuming perhaps there was a timing issue with task execution):

image

That is, I disconnected/reconnected my account, then went to the project create page -- I did not manually resync the remote repos before

If I resync and wait for the UI failure from the Celery task, then reload the page, I do get results:

image

@agjohnson
Copy link
Contributor Author

Long term, I would agree however. This button is extra steps for the user and very not obvious UX. Plus starting off with an empty repo list is just confusing in general.

But for now, it seems like important functionality that we should fix -- at least until we're more confident that the remote repo list is consistently up to date

@agjohnson
Copy link
Contributor Author

Also also, if we want to "fix" this bug, we could always just continually poll the remote repo API instead of the task watcher API. I'd agree this feels like the wrong direction ultimately though, but it might be easier than fixing the Celery AsyncResult call though.

We are using the AsyncResult pattern somewhere else too though, I'm blanking on where that is though.

@humitos
Copy link
Member

humitos commented Feb 28, 2023

@agjohnson

Are you sure this is the case? I just reconnected my account and I get an empty repository list to start (I even gave this a few refreshes, assuming perhaps there was a timing issue with task execution):

Maybe this event is missing, but we can easily add it.

@agjohnson
Copy link
Contributor Author

agjohnson commented Feb 28, 2023

Talked on chat a bit here.

We should definitely add more events long term -- we should resync on login, signup, and social account connection. There is still the question about how to update repositories for users that do none of the above though -- that is, a user that is logged in and connected already, and then wants to connect a repository that was just created.

This seems like a really good longer term goal.

However, right now this UI/UX is a broken in production, and the bug is already a release or two old at this point. I'm still leaning towards hotfixing this bug in the easiest way possible for now. The easiest fix would be to replace the call to the task inspection API with a simple wait. This is a hack, but at least the UI doesn't break while we're sorting out resyncing bugs.

Alternatively, reverting to the old results backend would resolve this bug as well. It wasn't clear what brought this change in though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug A bug Needed: replication Bug replication is required
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

2 participants