Skip to content

PYTHON-5172 bugfix: Add __repr__ and __eq__ to bson.binary.BinaryVector #2162

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Mar 10, 2025
Merged
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions bson/binary.py
Original file line number Diff line number Diff line change
Expand Up @@ -247,6 +247,9 @@ def __init__(self, data: Sequence[float | int], dtype: BinaryVectorDtype, paddin
self.dtype = dtype
self.padding = padding

def __repr__(self) -> str:
return f"BinaryVector - {self.dtype=}, {self.dtype},{self.padding=}, {self.data=}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to follow Python's convention for repr: https://docs.python.org/3/library/functions.html#repr

And we should add tests for this like we do for our other custom type reprs.


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hold up, the real bug here is that we're using @dataclass but also defining __init__. Why is that? The whole point of dataclass is that it provides __init__ and __repr__ automatically.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that we had issues with the combo of {dataclass, slots, defaults}. So I've just removed the dataclass decorator.

Copy link
Member

@ShaneHarvey ShaneHarvey Mar 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there any other features that will break by removing @dataclass? We may be stuck with it until 5.0 if it's a breaking change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so. As mentioned in the closed PR, we've done init, and repr, and we don't want to rely on equality except via the binary representation.

Copy link
Contributor Author

@caseyclements caseyclements Mar 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code seems happy to be a dataclass or not. I have no strong preference.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

@ShaneHarvey ShaneHarvey Mar 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you show an example of == and > before and after this change? I'm wondering if the old code (with @dataclass) even works at all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. It's easy to try. Put a breakpoint on line 83 of test_bson_binary.

# With the decorator:
vector_obs == vector_exp.as_vector()
True
vector_obs > vector_exp.as_vector()
# TypeError: '>' not supported between instances of 'BinaryVector' and 'BinaryVector'

# Without the decorator:
vector_obs == vector_exp.as_vector()
False
vector_obs > vector_exp.as_vector()
# TypeError: '>' not supported between instances of 'BinaryVector' and 'BinaryVector'

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

== is already broken with our current @dataclass approach:

>>> from bson.binary import *
>>> BinaryVector([1], BinaryVectorDtype.INT8) == BinaryVector([2], BinaryVectorDtype.INT8)
True

So we need to remove dataclass and implement == as well as __repr__.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Take a look. I hate subtleties with == and floats..


class Binary(bytes):
"""Representation of BSON binary data.
Expand Down
Loading