Skip to content

Feature/test health check #128

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
May 21, 2019
Merged

Feature/test health check #128

merged 9 commits into from
May 21, 2019

Conversation

vikasrohit
Copy link

Multiple improvements in the process of consuming kafka events to avoid broken consumers without any notification. In nutshell, it was issue because of unhandled rejection of promise and after node 7.0.0, it started to cause app to crash instead of just logging a warning. This app is using node 8.2.1 so node was crashing the event loop execution immediately after the detection of unhandled rejection of a promise while some other apps (e.g. tc-email-service) are on version 6.x which just logs warning but continues to execute the event loop or they just don't have the cases of unhandled rejection yet.

Here are the things that we have improved to fix the situation and prevent it from occurring in future.

  1. Added .catch handler for the API call promise to have better control over what to be done when it fails to load the mentioned user data. It logs a message (which can be captured and alerted by log aggregator) whenever there is error in fetching details of mentioned user and skips notification for such user so that the process can continue sending notifications to other intended users.
  2. Although, we don't need any thing after 1 to handle the recent incident, we have improved the health check method to detect such state where Kafka consumer is not consuming any message further because of any unhandled rejection errors. It now uses the pause field on the subscription objects of the consumer to detect such incidents. So, now it would cause health check to fail which would in turn result in restarting the process by the container as per configuration.
  3. Also added handler for unhandledRejection event so that we can abort the process as soon as we get any unhandled rejection of promise for a single event loop of node js process which in turn should cause the container to restart the process.
  4. Avoided using explicit constructor anti pattern for the promises here to avoid breaking the promise chain which the reason why the root error or the promise rejection did't bubble up to the parent promises' catch handlers. If this promise chain was not broken at the time of incident, it would not have stuck in unresponsive state and would have continued consuming other events without any restart.
  5. Also, removed one more incident of explicit constructor anti pattern, although that should not have caused any problem as it is not composing any nested promise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant