Skip to content

Bug: BaseBatchProcessingError.child_exceptions doesn't handle None case #1203

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
peterschutt opened this issue May 14, 2022 · 23 comments · Fixed by #2155
Closed

Bug: BaseBatchProcessingError.child_exceptions doesn't handle None case #1203

peterschutt opened this issue May 14, 2022 · 23 comments · Fixed by #2155
Labels
batch Batch processing utility help wanted Could use a second pair of eyes/hands tech-debt Technical Debt tasks typing Static typing definition related issues (mypy, pyright, etc.)

Comments

@peterschutt
Copy link
Contributor

Expected Behaviour

This is a bit of a nitpick but something I noticed on my travels...

https://github.com/awslabs/aws-lambda-powertools-python/blob/1ca74d36af7165f41dc2d6b39a8f36caacd69677/aws_lambda_powertools/utilities/batch/exceptions.py#L11-L24

child_exceptions parameter is typed Optional[List[ExceptionInfo]] but the None case isn't handled anywhere before iteration over the attribute occurs in format_exceptions() method.

Current Behaviour

Instantiating the exception and calling format_exceptions() without providing a list for the child_exceptions param will raise an exception.

Code snippet

from aws_lambda_powertools.utilities.batch.exceptions import BaseBatchProcessingError

err = BaseBatchProcessingError(msg="something")  # types ok
err.format_exceptions("parent exception string")

Possible Solution

Most likely just make sure attrib is always a list without having to change the signature of the class.

class BaseBatchProcessingError(Exception):
    def __init__(self, msg="", child_exceptions: Optional[List[ExceptionInfo]] = None):
        ...
        self.child_exceptions = child_exceptions or []

Steps to Reproduce

See code snippet above.

AWS Lambda Powertools for Python version

latest

AWS Lambda function runtime

3.9

Packaging format used

PyPi

Debugging logs

No response

@peterschutt peterschutt added bug Something isn't working triage Pending triage from maintainers labels May 14, 2022
@boring-cyborg
Copy link

boring-cyborg bot commented May 14, 2022

Thanks for opening your first issue here! We'll come back to you as soon as we can.

@michaelbrewer
Copy link
Contributor

More likely to fail before this, so maybe child_exceptions should not be optional.

            raise SQSBatchProcessingError(
                msg=f"Not all records processed successfully. {len(self.exceptions)} individual errors logged "
                f"separately below.",
                child_exceptions=self.exceptions,
            )

michaelbrewer added a commit to gyft/aws-lambda-powertools-python that referenced this issue May 16, 2022
Change:
- Allow for child_exceptions being none in format_exceptions

fixes aws-powertools#1203
@michaelbrewer
Copy link
Contributor

@peterschutt if this is accepted as an 'issue', i made a PR (#1205) based on your fix.

@sthulb
Copy link
Contributor

sthulb commented May 16, 2022

I can’t stress this enough @michaelbrewer, In an effort to prevent you from wasting time, you should wait for issues to prioritised, this won't be merged for a long time until we're ready to add new features.

@michaelbrewer
Copy link
Contributor

I fully understand the risk of putting my unpaid time into this.

In this case, this is a bug and not a feature, so out of respect for my fellow Powertools users i submitted a fix PR for the one offending line

Feel free to drop these PRs or ignore until August 2022

@heitorlessa
Copy link
Contributor

hey @peterschutt thank you for flagging another piece we can improve 🙏. We'll look into it after some major changes in exposing the roadmap, list of maintainers, and our new end-to-end testing framework (internal, but we might explore ways to expose to customers).

If you however ran into an issue with Batch without explicitly calling format_exceptions, please do let us know and either of the maintainers[1] can prioritize it ASAP.

From memory, the Optional was a leftover type hint after several refactoring attempts (legacy SQS -> new SQS), and while we're passing an actual exception as part of our failure_handler mechanics.... we can surely do better - besides this, I just spotted a few other areas for internal improvements so we can handle them altogether.

[1] Currrent maintainers: @am29d @mploski @sthulb @heitorlessa.

Thank you!

@heitorlessa heitorlessa added typing Static typing definition related issues (mypy, pyright, etc.) area/batch and removed bug Something isn't working triage Pending triage from maintainers labels May 17, 2022
@michaelbrewer
Copy link
Contributor

michaelbrewer commented May 17, 2022

At Fiserv, our developers run into this kind of misleading typing of public apis and error handling within the library all of the time. Hopefully some stability comes through.

from aws_lambda_powertools.utilities.batch.exceptions import SQSBatchProcessingError
print(SQSBatchProcessingError("invalid input"))

Outputs:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/michaelbrewer/projects/python/aws-lambda-powertools-python/aws_lambda_powertools/utilities/batch/exceptions.py", line 37, in __str__
    return self.format_exceptions(parent_exception_str)
  File "/Users/michaelbrewer/projects/python/aws-lambda-powertools-python/aws_lambda_powertools/utilities/batch/exceptions.py", line 19, in format_exceptions
    for exception in self.child_exceptions:
TypeError: 'NoneType' object is not iterable

@michaelbrewer
Copy link
Contributor

michaelbrewer commented May 17, 2022

A typing based change would actually have to be a breaking change:

As msg becomes a required parameter to allow for child_exceptions to be an none-optional list

patch below:

diff --git a/aws_lambda_powertools/utilities/batch/exceptions.py b/aws_lambda_powertools/utilities/batch/exceptions.py
index d90c25f1..f33f6aef 100644
--- a/aws_lambda_powertools/utilities/batch/exceptions.py
+++ b/aws_lambda_powertools/utilities/batch/exceptions.py
@@ -9,7 +9,7 @@ ExceptionInfo = Tuple[Optional[Type[BaseException]], Optional[BaseException], Op


 class BaseBatchProcessingError(Exception):
-    def __init__(self, msg="", child_exceptions: Optional[List[ExceptionInfo]] = None):
+    def __init__(self, msg: str, child_exceptions: List[ExceptionInfo]):
         super().__init__(msg)
         self.msg = msg
         self.child_exceptions = child_exceptions
@@ -27,7 +27,7 @@ class BaseBatchProcessingError(Exception):
 class SQSBatchProcessingError(BaseBatchProcessingError):
     """When at least one message within a batch could not be processed"""

-    def __init__(self, msg="", child_exceptions: Optional[List[ExceptionInfo]] = None):
+    def __init__(self, msg: str, child_exceptions: List[ExceptionInfo]):
         super().__init__(msg, child_exceptions)

     # Overriding this method so we can output all child exception tracebacks when we raise this exception to prevent
@@ -40,7 +40,7 @@ class SQSBatchProcessingError(BaseBatchProcessingError):
 class BatchProcessingError(BaseBatchProcessingError):
     """When all batch records failed to be processed"""

-    def __init__(self, msg="", child_exceptions: Optional[List[ExceptionInfo]] = None):
+    def __init__(self, msg: str, child_exceptions: List[ExceptionInfo]):
         super().__init__(msg, child_exceptions)

     def __str__(self):
diff --git a/tests/functional/test_utilities_batch.py b/tests/functional/test_utilities_batch.py
index a5e1e706..3db2ec20 100644
--- a/tests/functional/test_utilities_batch.py
+++ b/tests/functional/test_utilities_batch.py
@@ -908,3 +908,7 @@ def test_batch_processor_error_when_entire_batch_fails(sqs_event_factory, record

     # THEN raise BatchProcessingError
     assert "All records failed processing. " in str(e.value)
+
+
+def test_child_exceptions_is_required():
+    assert SQSBatchProcessingError(msg="required", child_exceptions=[]).format_exceptions("example") == "example\n"

@peterschutt
Copy link
Contributor Author

A typing based change would actually have to be a breaking change:

Quite possible that somewhere downstream people are relying on self.child_exceptions being None if not passed in. If you accept that, then my original suggestion of child_exceptions or [] in __init__() isn't really a safe change either.

The safest change would be to address it right at the point of iteration in format_exceptions(), e.g., :

     def format_exceptions(self, parent_exception_str): 
         ...
         for exception in self.child_exceptions or []: 
         ...

... then could use warnings and deprecate the behavior of not passing in the child_exceptions param when instantiating the error with the aim to eventually have it all nice and tidy.

Another point, the exception class doesn't really care that the exceptions are a list or not, just that they can be iterated over in format_exceptions(), so probably the constructor could be typed a little more liberally as well, Iterable or Collection.

@michaelbrewer
Copy link
Contributor

michaelbrewer commented May 17, 2022

If you rely on self.child_exceptions being None, then you are likely running into the same bug and have a runtime error. Either runtime fix makes sense. My PR is for your one.

Either way, being part of the public api, this bug will trip up people.

@peterschutt
Copy link
Contributor Author

If you rely on self.child_exceptions being None, then you are likely running into the same bug and have a runtime error.

... if the way you use the exception results in it getting serialized via __str__(), sure.

My PR is for your one.

If I was lodging the same bug again I'd definitely make the change in format_exceptions() and not in the constructor. It maintains the existing interface, ensures the class at least honors the existing interface, and eliminates any issues arising from the surface area of the change.

In any case, the impression that I get from @sthulb and @heitorlessa is that there are some higher level things worthy of more attention at the moment. I only lodged the report because it didn't look right, not because I was directly affected by the issue in any meaningful way.

@michaelbrewer Thanks for doing the PR though, and if @sthulb or @heitorlessa want to merge it, I'll be happy to make any changes at that point.

@heitorlessa
Copy link
Contributor

heitorlessa commented May 18, 2022 via email

@heitorlessa heitorlessa added the internal Maintenance changes label May 18, 2022
@michaelbrewer
Copy link
Contributor

PR is already up

Screen Shot 2022-05-17 at 8 56 52 PM

@michaelbrewer
Copy link
Contributor

@peterschutt sure, i update the PR to match all of the feedback. As we use powertools in production at Fiserv, this will likely help us there too.

@peterschutt
Copy link
Contributor Author

@heitorlessa I was curious why this wasn't already picked up by mypy - as you prob. know it is because that method itself is untyped so mypy is ignoring. Is more complete typing the sort of thing that you are working on?

I've been developing in python for quite a number of years now, we static check everything strict here and I'm using lambda and this library significantly for the first time. If I can help in any way, let me know how best to find something useful to do and I'll look to allocate a bit of time. This is a nice library, cheers!

@heitorlessa
Copy link
Contributor

@heitorlessa I was curious why this wasn't already picked up by mypy - as you prob. know it is because that method itself is untyped so mypy is ignoring. Is more complete typing the sort of thing that you are working on?

I've been developing in python for quite a number of years now, we static check everything strict here and I'm using lambda and this library significantly for the first time. If I can help in any way, let me know how best to find something useful to do and I'll look to allocate a bit of time. This is a nice library, cheers!

Exactly my thought too - it'd be great to hear from you on strategies to do that (strict mode) without increasing the bar to contribution too high for those inexperienced.

Mind shooting us an email and we can arrange a call?

[email protected]

Strict typing is new to us, we experimented with Cython, Mypyc, and MyPy in strict mode but far from experts. We introduced mypy last year and addresses roughly ~614 issues.

We tried mypy --strict earlier this year but had a number of false positives, areas we couldn't see solutions for, or changes that would affect performance negatively -- help here will be amazing.

@peterschutt
Copy link
Contributor Author

Sure, I'm in UTC + 10, so I'll email when I'm in the office tomorrow.

It would be good to hear more about what you were trying to optimize with cython/mypyc. I've got a cython module in prod that I built well before typing was popular and have an upcoming opportunity to refactor it. I'm looking forward to experiment with those tools again and see how they look now.

Hopefully I can make some meaningful contribution.

@michaelbrewer
Copy link
Contributor

@peterschutt @heitorlessa Pylance picks this up fine. However we can't add it as part of the project as the library still supports Python 3.6. Also MyPy has not been upgraded for a while.

Screen Shot 2022-05-18 at 7 06 40 AM

With the fix applied

Screen Shot 2022-05-18 at 7 08 24 AM

@michaelbrewer
Copy link
Contributor

I did some work after adding the initial MyPy updates (#357 , #365 , #508 etc..), i started looking at the fixes recommended by Pylance and found that some of our code is just to make work around for MyPy

And example being this code in aws_lambda_powertools/utilities/parameters/appconfig.py

Screen Shot 2022-05-19 at 10 52 13 AM

client: Any = None is a hack to make MyPy happy as the assignment would fail.

@peterschutt
Copy link
Contributor Author

Hey @michaelbrewer. I haven't spent any time looking deeper at the typing yet, but appreciate the extra context. As per @heitorlessa's earlier message I've emailed to arrange a chat, just waiting on that as I don't really want to put the time into it until I understand what the objectives/constraints are and how the work would fit into their roadmap.

However we can't add it as part of the project as the library still supports Python 3.6.

I assume this library just has to track the lambda runtime environment support, which is August 17 per this (8 months after the true EOL of 3.6). Assuming that 3.7 is supported for as long after actual EOL, then we've probably got until early 2024 to support 3.7. Supporting 3.7+ with this work means we could use typing-extensions which has already dropped 3.6.

client: Any = None is a hack to make MyPy happy as the assignment would fail.

Mypy should just assume the type based on the self assignment in __init__() (ref). If I comment out the client: Any = None class variable assignment, mypy doesn't raise any issue for me (using project's mypy config). However, that is on line 63 in the develop branch, and on line 68 in your image above and the self.client assignment is different in the develop version to your image above, so maybe not comparing apples with apples.

I'll talk about typing all day @michaelbrewer but we seem to have jumped away from the original issue here and prob. just making unnecessary noise for the maintainers. Feel free to email me.

@michaelbrewer
Copy link
Contributor

@peterschutt i am sure you can. In this case, it would result in runtime errors. So it is good that the fix for this would be simple and not a breaking change.

Having used pylance for a while now, i had found as good results in catching bugs as MyPy.

@heitorlessa heitorlessa added the batch Batch processing utility label Nov 9, 2022
@heitorlessa heitorlessa added the help wanted Could use a second pair of eyes/hands label Mar 10, 2023
@heitorlessa heitorlessa added tech-debt Technical Debt tasks and removed internal Maintenance changes labels Apr 17, 2023
@heitorlessa heitorlessa linked a pull request Apr 21, 2023 that will close this issue
7 tasks
@github-actions
Copy link
Contributor

⚠️COMMENT VISIBILITY WARNING⚠️

This issue is now closed. Please be mindful that future comments are hard for our team to see.

If you need more assistance, please either tag a team member or open a new issue that references this one.

If you wish to keep having a conversation with other community members under this issue feel free to do so.

@github-actions github-actions bot added the pending-release Fix or implementation already in dev waiting to be released label Apr 21, 2023
@github-actions
Copy link
Contributor

This is now released under 2.14.1 version!

@github-actions github-actions bot removed the pending-release Fix or implementation already in dev waiting to be released label Apr 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
batch Batch processing utility help wanted Could use a second pair of eyes/hands tech-debt Technical Debt tasks typing Static typing definition related issues (mypy, pyright, etc.)
Projects
None yet
4 participants