Skip to content

ENH: gb.is_monotonic_increasing #17015 #17453

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

No-Stream
Copy link
Contributor

@No-Stream No-Stream commented Sep 6, 2017

  • closes #17015
  • tests added / passed:
    updated test_tab_completion to expect additional functions is_monotonic_increasing and is_monotonic_decreasing
    added (semi-redundant) tests of gb.is_monotonically_increasing() - strictly increasing/decreasing, non-strictly, w/ inf vals, w/ nan vals
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry
    added at the end of the Other Enhancements section

This is my first PR to Pandas, so I don't doubt that I'll need to edit things.

@jreback
Copy link
Contributor

jreback commented Sep 7, 2017

tests go in pandas/tests/groupby/test_groupby.py

@@ -130,7 +130,7 @@ Other Enhancements
- `read_*` methods can now infer compression from non-string paths, such as ``pathlib.Path`` objects (:issue:`17206`).
- :func:`pd.read_sas()` now recognizes much more of the most frequently used date (datetime) formats in SAS7BDAT files (:issue:`15871`).
- :func:`DataFrame.items` and :func:`Series.items` is now present in both Python 2 and 3 and is lazy in all cases (:issue:`13918`, :issue:`17213`)

- :func: groupby.is_monotonic_increasing and .is_monotonic_decreasing extend Series.is_monotonic_increasing to groups, returning whether each group is monotonically increasing or decreasing, respectively. (:issue:`17015`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this won't work for :func: as you have it written, I think enough is to say .is_monotonic_increasing/decreasing is added to .groupby()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this fix and the above test location, which now seems painfully obvious. Will patch up the PR and fix tomorrow.

@No-Stream
Copy link
Contributor Author

Added basic tests, for strictly monotonically increasing/decreasing, non-strictly, groups with inf, groups with nan, and to check method result == .apply(lambda x: x.is_monotonic_increasing) (though the method is just syntactic sugar in this implementation). (Not sure how much detail should be in the tests of gb.is_monotonic_increasing vs. Series.is_monotonic_increasing.)

df = pd.DataFrame(source_dict)
result = df.groupby(['B']).C.is_monotonic_decreasing()
expected = pd.Series(index=['a', 'b', 'c', 'd'],
data=[False, False, False, False],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI some of these may be less verbose as pytest parameterized tests, though not a big impact

Thanks for the PR!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback. I did consider parametrizing, but when I've done it before, it inflated the test count (i.e. if I parametrized using the same test cases for both is_monotonic_increasing() and is_monotonic_decreasing, I'd have added two tests, rather than one. Maybe there's a way around that.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually we would prefer to parametrize, it reduces the amount of duplication in code. feel free.

@@ -3843,6 +3843,128 @@ def test_cummin_cummax(self):
expected = pd.Series([1, 2, 1], name='b')
tm.assert_series_equal(result, expected)

def test_is_increasing_is_decreasing(self):
# GH 17015

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't really need all this, as is_montonic and friends are already well tested in Index (where they are actually implemented), but no big deal.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

figured as much but wasn't sure whether to test here and risk duplication or not test here and have "untested" code.

name='C')
expected.index.name = 'B'
tm.assert_series_equal(result, expected)
# Also check result equal to manually taking x.is_monotonic_increasing.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

blank line

name='C')
expected.index.name = 'B'
tm.assert_series_equal(result, expected)
# Also check result equal to manually taking x.is_monotonic_increasing.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

df = pd.DataFrame(source_dict)
result = df.groupby(['B']).C.is_monotonic_decreasing()
expected = pd.Series(index=['a', 'b', 'c', 'd'],
data=[False, False, False, False],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually we would prefer to parametrize, it reduces the amount of duplication in code. feel free.

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you also update api.rst for these methods (and I suspect others might be missing as well for GroupBy and sub-classes)

@codecov
Copy link

codecov bot commented Sep 8, 2017

Codecov Report

Merging #17453 into master will decrease coverage by 0.03%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #17453      +/-   ##
==========================================
- Coverage   91.61%   91.58%   -0.04%     
==========================================
  Files         150      150              
  Lines       48807    48867      +60     
==========================================
+ Hits        44716    44755      +39     
- Misses       4091     4112      +21
Flag Coverage Δ
#multiple 89.96% <ø> (-0.04%) ⬇️
#single 41.75% <ø> (+0.02%) ⬆️
Impacted Files Coverage Δ
pandas/core/groupby.py 92.2% <ø> (ø) ⬆️
pandas/io/s3.py 72.72% <0%> (-13.64%) ⬇️
pandas/plotting/_converter.py 65.22% <0%> (-1.74%) ⬇️
pandas/io/packers.py 87.31% <0%> (-1.34%) ⬇️
pandas/core/arrays/base.py 60% <0%> (-0.61%) ⬇️
pandas/io/json/json.py 92.23% <0%> (-0.37%) ⬇️
pandas/io/common.py 68.77% <0%> (-0.3%) ⬇️
pandas/io/parquet.py 71.79% <0%> (-0.28%) ⬇️
pandas/io/parsers.py 95.32% <0%> (-0.25%) ⬇️
pandas/core/series.py 94.56% <0%> (-0.05%) ⬇️
... and 14 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 324379c...787c436. Read the comment docs.

@No-Stream
Copy link
Contributor Author

@jreback I went ahead and parametrized the tests. Wanted to have the method itself as a param as I could then just use one test function to test both is_monotonic_increasing and decreasing, but I couldn't figure out a way to pass a method as a param.

Also added these to SeriesGroupBy in api.rst. Not sure what else I should add. If I run the api_rst_coverage.py script I found mentioned in the docs, I get a gigantic laundry list of Class members (other than those beginning with '_') missing from api.rst. Let me know if you have a suggestion for that.

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you rebase

@@ -130,7 +130,7 @@ Other Enhancements
- `read_*` methods can now infer compression from non-string paths, such as ``pathlib.Path`` objects (:issue:`17206`).
- :func:`pd.read_sas()` now recognizes much more of the most frequently used date (datetime) formats in SAS7BDAT files (:issue:`15871`).
- :func:`DataFrame.items` and :func:`Series.items` is now present in both Python 2 and 3 and is lazy in all cases (:issue:`13918`, :issue:`17213`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move to 0.21.1

@No-Stream
Copy link
Contributor Author

I rebased to try to integrate changes to pandas-dev/pandas, but it seems the commit history is profoundly screwed up now. (It does look to me like the upstream changes are now reflected, though.) Could someone suggest how to fix this?

@jreback
Copy link
Contributor

jreback commented Nov 1, 2017

@AdamShamlian you need to build the extensions

python setup.py build_ext --inplace

you should do this whenever you pull (it has no effect if they are already built).

extensions are undergoing a lot of renovation lately.

@jreback jreback force-pushed the is-monotonic-increasing-gb-method branch from 95d4ba8 to ea42697 Compare November 1, 2017 01:40
@jreback
Copy link
Contributor

jreback commented Nov 1, 2017

@No-Stream I rebased you. please have a look thru and make any additional corrections

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small changes. pls rebase. ping on green.

@@ -22,7 +22,7 @@ Other Enhancements
^^^^^^^^^^^^^^^^^^

- :meth:`Timestamp.timestamp` is now available in Python 2.7. (:issue:`17329`)
-
- :func: groupby.is_monotonic_increasing and .is_monotonic_decreasing extend Series.is_monotonic_increasing to groups, returning whether each group is monotonically increasing or decreasing, respectively. (:issue:`17015`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can add :func: where needed here

'C': in_vals}
df = pd.DataFrame(source_dict)
result = df.groupby(['B']).C.is_monotonic_increasing()
expected = pd.Series(index=['a', 'b', 'c', 'd'],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can write this as Index(list('abcd'), name='B')

@jreback jreback added this to the 0.21.1 milestone Nov 10, 2017
ghasemnaddaf and others added 2 commits November 13, 2017 15:01
…ev#18269)

ENH: gb.is_monotonic_increasing pandas-dev#17015 fix rebase conflicts

parametrized tests for gb.is_monotonic_increasing/decreasing

ENH: gb.is_monotonic_increasing, is_monotonic_decreasing  pandas-dev#17015

added tests for gb.is_monotonically_increasing()/decreasing

parametrized tests for gb.is_monotonic_increasing/decreasing

ENH: gb.is_monotonic_increasing pandas-dev#17015 fix rebase conflicts

ENH: gb.is_monotonic_increasing pandas-dev#17015 fix rebase conflicts

rebase and cleanup

simplified test format

fixed whatsnew to include method tags
@jorisvandenbossche jorisvandenbossche added this to the 0.22.0 milestone Nov 30, 2017
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor comments, otherwise lgtm.

@@ -1172,4 +1172,3 @@ Other
^^^^^
- Bug where some inplace operators were not being wrapped and produced a copy when invoked (:issue:`12962`)
- Bug in :func:`eval` where the ``inplace`` parameter was being incorrectly handled (:issue:`16732`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you can revert this file

@@ -23,6 +23,7 @@ Other Enhancements

- :meth:`Timestamp.timestamp` is now available in Python 2.7. (:issue:`17329`)
- :class:`Grouper` and :class:`TimeGrouper` now have a friendly repr output (:issue:`18203`).
- :meth: groupby.is_monotonic_increasing and :meth: .is_monotonic_decreasing extend :meth: Series.is_monotonic_increasing to groups, returning whether each group is monotonically increasing or decreasing, respectively. (:issue:`17015`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's move to 0.22

# abs val increasing (F), non-strictly increasing (T)
([1, 2, 5, 3, 2, 0, 4, 5, -6, 1, 1],
[True, False, False, True]),
# Test with inf vals
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a blank line before comments

'C': in_vals}
df = pd.DataFrame(source_dict)
result = df.groupby('B').C.is_monotonic_increasing()
expected = pd.Series(index=list('abcd'), data=out_vals, name='C')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can construct the index=Index(list('abcd'), name='B')

@jreback
Copy link
Contributor

jreback commented Dec 10, 2017

can you rebase and update according to comments

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. small doc-change.

@@ -138,6 +138,8 @@ Other Enhancements
- :func:`Series` / :func:`DataFrame` tab completion also returns identifiers in the first level of a :func:`MultiIndex`. (:issue:`16326`)
- :func:`read_excel()` has gained the ``nrows`` parameter (:issue:`16645`)
- :func:``DataFrame.to_json`` and ``Series.to_json`` now accept an ``index`` argument which allows the user to exclude the index from the JSON output (:issue:`17394`)
- :meth: groupby.is_monotonic_increasing and :meth: .is_monotonic_decreasing extend :meth: Series.is_monotonic_increasing to groups, returning whether each group is monotonically increasing or decreasing, respectively. (:issue:`17015`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these need to be

:func:`SeriesGroupBy.is_monotonic_increasing` (and so on)

Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check if it is possible to completely remove the is_monotonic_increasing function definitions by adding them to _series_apply_whitelist

@No-Stream
Copy link
Contributor Author

@jorisvandenbossche was able to implement it as you requested. This change does make the syntax a bit uglier. i.e. it's series_groupby.is_monotonic_increasing rather than series_groupby.is_monotonic_increasing(). That makes it inconsistent with other groupby methods, but it's working fine.

@jreback
Copy link
Contributor

jreback commented Feb 11, 2018

@gfyoung can you rebase this

@gfyoung
Copy link
Member

gfyoung commented Feb 11, 2018

@jreback : Resolved the merge conflict, though I don't know if the CI will be all green.

@jorisvandenbossche
Copy link
Member

@jorisvandenbossche was able to implement it as you requested. This change does make the syntax a bit uglier. i.e. it's series_groupby.is_monotonic_increasing rather than series_groupby.is_monotonic_increasing(). That makes it inconsistent with other groupby methods, but it's working fine.

That's actually a good point. We don't have many precedents for attributes on groupby objects (or for methods that are the equivalent of attributes).
From a a quick look, we have .dtype (an attribute both on Series and SeriesGroupBy, but not sure of the usefulness of this one), and .size() (an attribute on Series, but method on SeriesGroupBy)

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i am ok with this being an attribute
it is consistent with Series/Index

@@ -323,6 +323,7 @@ Other Enhancements

- ``IntervalIndex.astype`` now supports conversions between subtypes when passed an ``IntervalDtype`` (:issue:`19197`)
- :class:`IntervalIndex` and its associated constructor methods (``from_arrays``, ``from_breaks``, ``from_tuples``) have gained a ``dtype`` parameter (:issue:`19262`)
- :func: groupby.is_monotonic_increasing and :func: .is_monotonic_decreasing extend :func: Series.is_monotonic_increasing to groups, returning whether each group is monotonically increasing or decreasing, respectively. (:issue:`17015`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these will not render correctly

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -323,6 +323,7 @@ Other Enhancements

- ``IntervalIndex.astype`` now supports conversions between subtypes when passed an ``IntervalDtype`` (:issue:`19197`)
- :class:`IntervalIndex` and its associated constructor methods (``from_arrays``, ``from_breaks``, ``from_tuples``) have gained a ``dtype`` parameter (:issue:`19262`)
- :func:`groupby.is_monotonic_increasing` and :func:`.is_monotonic_decreasing` extend `:func: Series.is_monotonic_increasing` to groups, returning whether each group is monotonically increasing or decreasing, respectively. (:issue:`17015`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be

:func:`SeriesGroupBy.is_monotonic_increasing`

I also don't think "extend" is the right word. Perhaps just

Added :func:`SeriesGroupBy.is_monotonic_increasing` and :func:`SeriesGroupBy.is_monotonic_decreasing` (:issue:`17015`)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. Fixed.

@jreback
Copy link
Contributor

jreback commented Feb 14, 2018

look ok to me. @gfyoung can you rebase again.

@gfyoung
Copy link
Member

gfyoung commented Feb 15, 2018

@jreback @jorisvandenbossche : CI is passing, and the PR is mergeable. PTAL.

@jorisvandenbossche jorisvandenbossche merged commit d59aad6 into pandas-dev:master Feb 15, 2018
@jorisvandenbossche
Copy link
Member

Thanks @No-Stream and @gfyoung

harisbal pushed a commit to harisbal/pandas that referenced this pull request Feb 28, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants