Skip to content

test: advanced test use cases #3737

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 18 commits into from
Mar 19, 2025
Merged

test: advanced test use cases #3737

merged 18 commits into from
Mar 19, 2025

Conversation

dreamorosi
Copy link
Contributor

@dreamorosi dreamorosi commented Mar 17, 2025

Summary

Changes

Please provide a summary of what's being changed

This PR updates the e2e tests for all the utilities in the following ways:

  • change the order of packages in the matrix and reduce max parallel jobs
  • replace custom JS script that extracts PR info with a bash one
  • centralize timeout configuration
  • add tests for advanced logger use cases

The work on this item started with an assumption that our e2e tests were often timing out due to CloudFormation APIs becoming slow under pressure. Based on this assumption the initial hypothesis was that if we distributed the test suite deployments across multiple regions within the same test run we would parallelize the tests further without incurring into timeouts or slowdowns.

This turned out being a fluke, and making the change did not make the timeouts go away. After talking with @leandrodamascena, he suggested that the issue could've been due to network/load of the GitHub workers. I made several tests and while that helped, it still did not remove the issue completely.

On top of not entirely removing the issue, since it's a runner shared in the GH Org, it introduced additional questions around what will happen when we make releases and tests at the same time. For this reason I decided to not adopt the runner for now and instead continue using the standard ones.

To partially mitigate the issue I instead increased the timeout from 7 to 10 minutes and centralized the config in the Vitest config files rather than having it in every single test case (hence the large diff).

I also reduced the max concurrency from 30 jobs to 25 and tried balancing and mixing packages whose tests are known to more consistently pass and be overall faster to run (i.e. metrics and layers) with others that have more frequent failures or longer runs (i.e. idempotency and logger). By making these changes some of the longer running tests are started immediately together with longer ones, so that we do something like this (simplified):

Screenshot 2025-03-18 at 15 30 11

The PR also adds a new test suite for advanced use cases in logger, that includes log buffering, correlation id, and that prepares things for provisioned concurrency which will be added in #3724.

This is not likely a final solution to the e2e tests being finicky, but after several hours of work I decided to put a pin on it knowing that things are now slightly better (or at least they feel like it). In the future we should probably rework some of the tests entirely to: 1/ deploy less resources, 2/ incorporate retries, 3/ maintain the same assertions while refactoring out problematic/finicky edge cases.

Please add the issue number below, if no issue is present the PR might get blocked and not be reviewed

Issue number: closes #3730


By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Disclaimer: We value your time and bandwidth. As such, any pull requests created on non-triaged issues might not be successful.

@dreamorosi dreamorosi self-assigned this Mar 17, 2025
@boring-cyborg boring-cyborg bot added the automation This item relates to automation label Mar 17, 2025
@pull-request-size pull-request-size bot added the size/S PR between 10-29 LOC label Mar 17, 2025
@pull-request-size pull-request-size bot added size/M PR between 30-99 LOC and removed size/S PR between 10-29 LOC labels Mar 17, 2025
@boring-cyborg boring-cyborg bot added the tests PRs that add or change tests label Mar 18, 2025
@pull-request-size pull-request-size bot added size/L PRs between 100-499 LOC and removed size/M PR between 30-99 LOC labels Mar 18, 2025
@boring-cyborg boring-cyborg bot added the dependencies Changes that touch dependencies, e.g. Dependabot, etc. label Mar 18, 2025
@pull-request-size pull-request-size bot added size/XXL PRs with 1K+ LOC, largely documentation related and removed size/L PRs between 100-499 LOC labels Mar 18, 2025
@dreamorosi dreamorosi changed the title chore: make tests run in multiple regions tests: advanced test use cases Mar 18, 2025
@dreamorosi dreamorosi changed the title tests: advanced test use cases test: advanced test use cases Mar 18, 2025
@dreamorosi dreamorosi requested a review from am29d March 18, 2025 14:52
@dreamorosi dreamorosi marked this pull request as ready for review March 18, 2025 14:52
Copy link
Contributor

@leandrodamascena leandrodamascena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @dreamorosi! I didn't focus on the tests themselves while reviewing because you have more context on that! But the timeout configuration and pipeline look good to me.

@dreamorosi dreamorosi merged commit 61725f4 into main Mar 19, 2025
118 checks passed
@dreamorosi dreamorosi deleted the tests/logger_advanced_e2e branch March 19, 2025 09:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
automation This item relates to automation dependencies Changes that touch dependencies, e.g. Dependabot, etc. size/XXL PRs with 1K+ LOC, largely documentation related tests PRs that add or change tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Maintenance: run e2e tests in on bigger runner
2 participants