Skip to content

[$1200] Creating comments and discussions seems to be taking more time than it used to be #381

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
sdgun opened this issue Feb 9, 2021 · 37 comments

Comments

@sdgun
Copy link
Collaborator

sdgun commented Feb 9, 2021

Steps
Create a comment/discussion in challenge forum or roundtables and monitor the time it takes

Noticed these actions are taking more time as shown in the below screen shots:

Discussion Prod:

Discussion Prod

Discussion Dev:

Discussion Dev

Comment Prod:

Comment Prod

Comment Dev:

Comment Dev

@sdgun sdgun added bug Something isn't working P2 labels Feb 9, 2021
@jmgasper jmgasper changed the title Creating comments and discussions seems to be taking mote time than it used to be [$40] Creating comments and discussions seems to be taking mote time than it used to be Feb 9, 2021
@jmgasper
Copy link
Collaborator

jmgasper commented Feb 9, 2021

Challenge https://www.topcoder.com/challenges/2d61ee7a-b52f-4fb2-94af-c917614755d4 has been created for this ticket.

This is an automated message for ghostar via Topcoder X

@jmgasper
Copy link
Collaborator

jmgasper commented Feb 9, 2021

Challenge https://www.topcoder.com/challenges/2d61ee7a-b52f-4fb2-94af-c917614755d4 has been assigned to obog.

This is an automated message for ghostar via Topcoder X

@jmgasper jmgasper changed the title [$40] Creating comments and discussions seems to be taking mote time than it used to be [$100] Creating comments and discussions seems to be taking mote time than it used to be Feb 11, 2021
@jmgasper jmgasper added P1 and removed P2 labels Feb 11, 2021
@jmgasper
Copy link
Collaborator

@atelomycterus - This is becoming more of a concern. Is there anything I can do in prod to help you debug this?

@atelomycterus
Copy link
Collaborator

@jmgasper I'm going to work on it after #376, #377. The screens show that there are delays on both envs. I'll add php-xdebug and enable a profiler in a dockerfile. It helps to investigate performance issues.

Is there anything I can do in prod to help you debug this?

You may need to copy dump files. I'll analyze them.

Also, the flood settings can increase the posting time. The flood control settings - https://vanilla.topcoder-dev.com/vanilla/settings/floodcontrol.

@jmgasper
Copy link
Collaborator

@atelomycterus - Thanks. I've bumped up the flood control settings in prod so I'll keep an eye on it to see if that makes a difference.

@atelomycterus
Copy link
Collaborator

@jmgasper I am working on it.

Adding comment and updating comment meta-data

Adding a comment takes about 400-600 ms. The issue with the second ajax request which inserts/updates meta-data about the comment (user activity, count of comments, who added/edited the comment,... ) + sends notifications. The second ajax requests works too long. 36 Notifications were not sent from 12 Feb in DEV. This request may take a long time due to PhpmailerException.

TODO: Check server logs in DEV + Need to profile the code.

image

    /** Activity status: The activity sent just fine. */
    const SENT_OK = 2;
    /** Activity status: The activity is waiting to be sent. */
    const SENT_PENDING = 3;
    /** Activity status: The activity could not be sent. */
    const SENT_FAIL = 4;
    /** Activity status: There was an error sending the activity, but it can be retried. */
    const SENT_ERROR = 5;

Reproduced in PROD/DEV. It returns 502 Bad Gateway in PROD. It returns 504 in DEV.
image
image

Adding discussion

I clicked on 'Post Discussion' in PRO/DEV but there were 2 requests (saving draft and posting discussion). Not reproduced it locally.
TODO: Need to debug it on client side.
image

@atelomycterus
Copy link
Collaborator

atelomycterus commented Feb 14, 2021

@jmgasper We've optimized viewing pages, but this issue exists with posting a discussion/comment.

I ran locally with 50+ category, no issues, then I ran with the old Vanilla DB (500+ categories). It works slower. Both issues are reproduced periodically.

Could you provide DB from DEV (there are 1700+ categories, 6400 comments, 2400 discussion)?

Adding discussion and the second request

I clicked on 'Post Discussion' in PRO/DEV but there were 2 requests (saving draft and posting discussion). Not reproduced it locally.

This issue is reproduced periodically (500+ categories) with enabled profiler. If a discussion is added at the nested category (level 5).
If User clicks on 'Post Discussion' and the response has not yet been received because it is slow to execute.
In 60 secs the second request (saving drafts) happens because autosaving runs after 60 secs after typing text.
If posting a discussion works acceptable time, no issue.

Posting a discussion/comment

I ran it with the profiler locally with old Vanilla DB (~500+categories) . It generates 900+MB files per a request for discussion and 400+MB for comment.

  • Insert a discussion/comment. This part works quickly.
  • Update meta data and rebuild cache:
    --Adjust the aggregate post counts for a category. It means get all ancestors for a category
    and iterate through the category and its ancestors, adjusting aggregate counts, update LastXXX fields.
    (e.g. 'Challenges Forum › Data Science Forums › DS F2f 11 23 2020 › Code Questions')
    image

-- Update Category Cache by CategoryID:
--- remove this category from memcached
--- build cache for a category from a scratch.
--- build category tree for the current user based user's permissions.

From Profiling Viewer:
image

image

@jmgasper jmgasper changed the title [$100] Creating comments and discussions seems to be taking mote time than it used to be [$600] Creating comments and discussions seems to be taking mote time than it used to be Feb 15, 2021
@sdgun sdgun changed the title [$600] Creating comments and discussions seems to be taking mote time than it used to be [$600] Creating comments and discussions seems to be taking more time than it used to be Feb 15, 2021
@sdgun sdgun added the Dev env label Feb 15, 2021
@atelomycterus
Copy link
Collaborator

atelomycterus commented Feb 15, 2021

Adding a comment/discussion re-builds cache only for a category + all ancestors. Adding a comment/discussion in 'Challenge discussions' category rebuilds cache for 4 categories. The main performance issue is calculating user permissions for a category in setCache. setCache() is called 24 times for 4 categories. As a result, checkPermissions is called more than 80K+.
TODO: When adding a discussion or comment, the user's permissions to the category will not change. In this case, there is no need to recalculate everything from scratch. The only difference is LastXXX, CountXXX columns.

image

With adding a new comment, it works slower and slower. Adding 9 comments:
image

For the first comment: avg 0.15 for each category: CategoryModel.setCache:return4(), {"execution time (seconds)":0.14945220947265625,"CategoryID":1175}"
...
CategoryModel.setCache:return4(), {"execution time (seconds)":0.5750300884246826,"CategoryID":2335}"
...
CategoryModel.setCache:return4(), {"execution time (seconds)":1.0097100734710693,"CategoryID":1176}
...
For the 8 comment: avg 1.3-1.55, CategoryModel.setCache:return4(), {"execution time (seconds)":1.5211551189422607,"CategoryID":2335}"

@atelomycterus
Copy link
Collaborator

@jmgasper ☝️ This also leads to a huge amount of requests to Memcached.
Could Topcoder give us access to monitoring memcached metrics? If we had access, we could quickly notice it.

@atelomycterus
Copy link
Collaborator

@jmgasper Only for DEV env.

@atelomycterus
Copy link
Collaborator

@jmgasper I ran adding comments using a selenium script. I commented our custom code and ran without checking Group permissions. I wanted to check the execution time of the original code.

When a comment is added:

  • the first request - insert a record in DB (/?discussions=XXX)
  • the second request - adjust the aggregate comment counts for a category + ancestors, update LastXXX fields in DB and rebuild the category cache (remove a category from memcached, build cache for a category from a scratch) and build category tree for the current user based user's permissions. In the screen/video, MAX category depth is 4 (adding a comment to a challenge forums's discussion). So it clears and rebuild cache for 4 categories only.

image

See the video https://youtu.be/JndU9Y2ufnY.

Count Of Categories in DB:
image

Adding a comment, with delay 3 secs, after several seconds usage CPU - 100% .
image

@jmgasper
Copy link
Collaborator

@atelomycterus - Ok, thanks for that. So, if I understand this right, this is a problem with Vanilla? Have they released any new versions recently that we could test to see if the problem is resolved?

@atelomycterus
Copy link
Collaborator

@jmgasper

So, if I understand this right, this is a problem with Vanilla?

yes. Vanilla Team has tried to improve caching functionality vanilla/vanilla#8062. I've seen some issues with memory exhaustion (vanilla/vanilla#7843 and others). Some issues have been fixed, other issues are still open/hold.

The same issue https://open.vanillaforums.com/discussion/38113/vanilla-forum-take-high-cpu-usage, but no replies.

Have they released any new versions recently that we could test to see if the problem is resolved?

New release - https://open.vanillaforums.com/discussion/38573/vanilla-2021-003-rc2-is-now-available but it is not intended for use in production environments.

Same as RC1, there are some warnings and considerations I'll repeat here:
This is an RC (release candidate). It is not intended for use in production environments. It is pre-release software. Install it in some kind of sandbox and go nuts, but avoid overwriting your existing Vanilla installs. Once we're confident in Vanilla 2021.003 on open-source community sites, we'll drop a production-ready release. Future updates probably won't need an RC cycle, but given the length of time since the last release, it seems prudent.

@jmgasper
Copy link
Collaborator

Ok, thanks.

@atelomycterus
Copy link
Collaborator

@jmgasper I am going to work on it today.

  • why the CPU usage is high
  • test our custom Vanilla and profile code using different server configurations (opc, memcached and so on), the main focus on adding comments / posts.
  • do the same with Vanilla 3.3 (no customizations)

Keep you updated.

@atelomycterus
Copy link
Collaborator

@jmgasper Please apply PRs:
#420
topcoder-platform/forums-groups-plugin#58
topcoder-platform/forums-plugins#61

Thanks!

Performance before and after changes

I deployed Vanilla without PRs to compare performance before and after changes.

Discussions

Categories: 1879.

Testing with a selenium script:
image

Call Graph:
image

Comments

Testing with a selenium script, delay 3 secs.

Before: Post-proccesing after adding a comment:
image

After: Post-proccesing after adding a comment:
image

Call Graph
image

@atelomycterus
Copy link
Collaborator

@sdgun @jmgasper Tideways supports Server-Timing API. So I've added it in Vanilla header. After deploying and configuring Tideways, click on/hover a green bar:
image

You should see cache/http/php/sql:
image

Or the same in the header response:
image

@atelomycterus
Copy link
Collaborator

@jmgasper I see more performance bottlenecks in my local env. I think the same issues in DEV/PROD. Let's deploy Vanilla and monitor performance in DEV.

  • verify a token, it takes up to 900ms, avg 680-700 ms
    Get requests:
    image

image

@jmgasper
Copy link
Collaborator

@atelomycterus - This looks good, thanks. Deployed in dev.

I need to deploy to prod today, so if we need to back out anything for Tideways, just let me know.

@jmgasper
Copy link
Collaborator

Payment task has been updated: https://www.topcoder.com/challenges/2d61ee7a-b52f-4fb2-94af-c917614755d4
Payments Complete
Winner: obog
Copilot: ghostar
Challenge 2d61ee7a-b52f-4fb2-94af-c917614755d4 has been paid and closed.

This is an automated message for ghostar via Topcoder X

@jmgasper jmgasper added this to the V1.2 Roundtables milestone Feb 22, 2021
@sdgun
Copy link
Collaborator Author

sdgun commented Feb 22, 2021

Verified in Dev, discussions/comments are created within 3-5 seconds now.

@sdgun sdgun added the QA Pass label Feb 22, 2021
@sdgun
Copy link
Collaborator Author

sdgun commented Feb 22, 2021

Verified in production, the discussions are taking around 8 seconds.

image

@atelomycterus
Copy link
Collaborator

@jmgasper Please apply PRs:
#432
topcoder-platform/forums-groups-plugin#60
Thanks.

Important: Flush cache after deploying to avoid weird behavior (#422 (comment)).

  1. Please add TIDEWAYS_APIKEY env param ([$1200] Creating comments and discussions seems to be taking more time than it used to be #381 (comment)).
  2. build,sh has been updated. Tideways-demon will run only in DEV.
if [ "$ENV" = "dev" ]; then
  ENV=$ENV CI_DEPLOY_TOKEN=$CI_DEPLOY_TOKEN docker-compose -f docker-compose.yml -f docker-compose.dev.yml build $APP_NAME
else
  ENV=$ENV CI_DEPLOY_TOKEN=$CI_DEPLOY_TOKEN docker-compose -f docker-compose.yml build $APP_NAME
fi

@jmgasper
Copy link
Collaborator

@sdgun / @atelomycterus - Deployed in dev. I'm waiting on an update for the API env param

@atelomycterus
Copy link
Collaborator

@sdgun After updating env param I'll check that Tideways is displaying the data. Let's test this issue when Tideways is configured.

@jmgasper
Copy link
Collaborator

@atelomycterus the Param should be set now. I'm redeploying in dev which usually takes less than 10 minutes

@atelomycterus
Copy link
Collaborator

@jmgasper Do you have permissions to create a service in Tideways (#381 (comment))?

@sdgun
Copy link
Collaborator Author

sdgun commented Mar 16, 2021

Verified in production, didn't observe a slowness.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants