Expand Series.between() function with only left/right inclusive #36768

Closed
Maikel0J opened this issue Oct 1, 2020 · 3 comments
Labels
Duplicate Report Duplicate issue or pull request Enhancement Indexing Related to indexing on series/frames, not to indexes themselves

Comments

@Maikel0J

Maikel0J commented Oct 1, 2020

pandas/pandas/core/series.py

Lines 4690 to 4763 in 2a7d332

def between(self, left, right, inclusive=True) -> "Series":
    """
    Return boolean Series equivalent to left <= series <= right.
    This function returns a boolean vector containing `True` wherever the
    corresponding Series element is between the boundary values `left` and
    `right`. NA values are treated as `False`.
    Parameters
    ----------
    left : scalar or list-like
        Left boundary.
    right : scalar or list-like
        Right boundary.
    inclusive : bool, default True
        Include boundaries.
    Returns
    -------
    Series
        Series representing whether each element is between left and
        right (inclusive).
    See Also
    --------
    Series.gt : Greater than of series and other.
    Series.lt : Less than of series and other.
    Notes
    -----
    This function is equivalent to ``(left <= ser) & (ser <= right)``
    Examples
    --------
    >>> s = pd.Series([2, 0, 4, 8, np.nan])
    Boundary values are included by default:
    >>> s.between(1, 4)
    0     True
    1    False
    2     True
    3    False
    4    False
    dtype: bool
    With `inclusive` set to ``False`` boundary values are excluded:
    >>> s.between(1, 4, inclusive=False)
    0     True
    1    False
    2    False
    3    False
    4    False
    dtype: bool
    `left` and `right` can be any scalar value:
    >>> s = pd.Series(['Alice', 'Bob', 'Carol', 'Eve'])
    >>> s.between('Anna', 'Daniel')
    0    False
    1     True
    2     True
    3    False
    dtype: bool
    """
    if inclusive:
        lmask = self >= left
        rmask = self <= right
    else:
        lmask = self > left
        rmask = self < right
    return lmask & rmask

This is my first post; I hope it conforms to the contributing guidelines and code of conduct.
The code for between has a parameter 'inclusive' that controls whether the left and right boundary values themselves count as matches, i.e. whether the comparisons use <= or <.
My suggestion is to extend between slightly. Sometimes you want to include the left value but exclude the right value, or the other way around.

This change could be implemented in several ways. One example:

def between(self, left, right, left_inclusive=True, right_inclusive=True) -> "Series":
        """
        Return boolean Series equivalent to left <= series <= right.
        This function returns a boolean vector containing `True` wherever the
        corresponding Series element is between the boundary values `left` and
        `right`. NA values are treated as `False`.
        Parameters
        ----------
        left : scalar or list-like
            Left boundary.
        right : scalar or list-like
            Right boundary.
        left_inclusive : bool, default True
            Include the left boundary.
        right_inclusive : bool, default True
            Include the right boundary.
        Returns
        -------
        Series
            Series representing whether each element is between left and
            right (inclusive).
        See Also
        --------
        Series.gt : Greater than of series and other.
        Series.lt : Less than of series and other.
        Notes
        -----
        This function is equivalent to ``(left <= ser) & (ser <= right)``
        Examples
        --------
        >>> s = pd.Series([2, 0, 4, 8, np.nan])
        Boundary values are included by default:
        >>> s.between(1, 4)
        0     True
        1    False
        2     True
        3    False
        4    False
        dtype: bool
        With `right_inclusive` set to ``False`` the right boundary value is excluded:
        >>> s.between(1, 4, right_inclusive=False)
        0     True
        1    False
        2    False
        3    False
        4    False
        dtype: bool
        `left` and `right` can be any scalar value:
        >>> s = pd.Series(['Alice', 'Bob', 'Carol', 'Eve'])
        >>> s.between('Anna', 'Daniel')
        0    False
        1     True
        2     True
        3    False
        dtype: bool
        """
        if left_inclusive:
            lmask = self >= left
        else:
            lmask = self > left
        
        if right_inclusive:
            rmask = self <= right
        else:
            rmask = self < right

        return lmask & rmask
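
To illustrate what the two flags would allow, here is a minimal standalone sketch of the same logic. The function name `between_mask` is hypothetical and chosen only for this example; it is not part of pandas, and it uses nothing beyond the comparison operators the proposal already relies on.

```python
import pandas as pd


def between_mask(ser, left, right, left_inclusive=True, right_inclusive=True):
    """Hypothetical helper mirroring the proposed signature (not a pandas API)."""
    # Pick >= or > for the left boundary, <= or < for the right boundary.
    lmask = ser >= left if left_inclusive else ser > left
    rmask = ser <= right if right_inclusive else ser < right
    return lmask & rmask


s = pd.Series([2, 0, 4, 8, float("nan")])

closed = between_mask(s, 2, 4)                             # 2 <= x <= 4
left_only = between_mask(s, 2, 4, right_inclusive=False)   # 2 <= x < 4
right_only = between_mask(s, 2, 4, left_inclusive=False)   # 2 < x <= 4

print(closed.tolist())      # [True, False, True, False, False]
print(left_only.tolist())   # [True, False, False, False, False]
print(right_only.tolist())  # [False, False, True, False, False]
```

The NaN entry is False in every case because comparisons against NaN evaluate to False, which matches the "NA values are treated as `False`" note in the docstring.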
@jreback
Contributor

jreback commented Oct 1, 2020

see #13027 and then linked issues which proposed exactly this

i will revise my comments and suggest we should simply deprecate at this point

@Maikel0J
Author

Maikel0J commented Oct 1, 2020

@jreback Should I close this thread or what's the convention?

@jreback
Contributor

jreback commented Oct 1, 2020

yes, duplicate of #12398 as well.

@jreback jreback closed this as completed Oct 1, 2020
@jreback jreback added Duplicate Report Duplicate issue or pull request Enhancement Indexing Related to indexing on series/frames, not to indexes themselves labels Oct 1, 2020
@jreback jreback added this to the No action milestone Oct 1, 2020
2 participants