Skip to content

Add issubset, issuperset and size to Range type #563

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Aug 2, 2021
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
46 changes: 46 additions & 0 deletions asyncpg/types.py
Original file line number Diff line number Diff line change
Expand Up @@ -85,6 +85,52 @@ def upper_inf(self):
def isempty(self):
return self._empty

def issubset(self, other):
if self._empty:
return True
if other._empty:
return False

if other._lower is None:
lowerOk = True
else:
if self._lower is None:
return False
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would read better if you set lowerOk = False here and use an elif. Also, please use lower_ok for the variable name for style consistency.

if other._lower_inc or not self._lower_inc:
lowerOk = self._lower >= other._lower
else:
lowerOk = self._lower > other._lower

if not lowerOk:
return False

if other._upper is None:
upperOk = True
else:
if self._upper is None:
return False
if other._upper_inc or not self._upper_inc:
upperOk = self._upper <= other._upper
else:
upperOk = self._upper < other._upper

return lowerOk and upperOk
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lowerOk is always True by this point, so returning just upperOk is sufficient.


def issuperset(self, other):
return other.issubset(self)

def size(self, step=None):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not understand why the step argument is necessary. The size of the range is determined unambiguously by whether the upper and lower bounds are inclusive.

Copy link
Contributor Author

@kdorsel kdorsel Apr 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The returned size for [1,2) and [1,2] should be different. In postgres they have a distinction between a discrete and continuous range. If this was a discrete integer range the results would be 2-1 = 1 and 2-1+1=2. +1 since the discrete step for an integer is 1. What is the step value for continuous range like a float? Or if using datetime what is your lowest time increment you care about.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I think the generic Range class shouldn't be implementing the size() method then. It's too dependent on the data type and, since Postgres allows user-defined range types, hardcoding it for int, float, and date types wouldn't save us.

Copy link
Contributor

@lelit lelit Oct 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The distinction between "discrete" and "continuous" ranges has an impact also on issubset() and issuperset().

Consider the following two ranges: (1,2]::int4range and [2,3)::int4range. They represent the very same set of values, and indeed select '(1,2]'::int4range <@ '[2,3)'::int4range returns True, but

>>> r1 = asyncpg.Range(1, 2, lower_inc=False, upper_inc=True)
>>> r2 = asyncpg.Range(2, 3, lower_inc=True, upper_inc=False)
>>> r1.issubset(r2)
False

if self._upper is None or self._lower is None:
return None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be better to raise a ValueError here. size() on an open range is meaningless and akin to division by zero, returning None in this case risks masking bugs.

Copy link
Contributor Author

@kdorsel kdorsel Apr 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I understood it, a None in either lower/upper means infinity which matches the null/missing bound definition of postgres. Hence why I return the "infinity". I'm ok either way.

For numbers this could be changed from None to float("inf"), and for datetimes to datetime.datetime.min/max.

Copy link
Contributor Author

@kdorsel kdorsel Apr 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since postgres only has numeric and date/time ranges adding type check for datetime and numeric and returning values for each one and removing None completely might be worthwhile refactor.

It would also simply the logic in here since there is no longer a need for checking None.

if step is None or self._upper_inc != self._lower_inc:
step = 0
else:
step = abs(step)
if not self._upper_inc:
step *= -1

return step + self._upper - self._lower

def __bool__(self):
return not self._empty

Expand Down