Skip to content

BUG: Fix _binop for operators for serials which has more than one returns (divmod/rdivmod). #25588

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Mar 20, 2019

Conversation

sighingnow
Copy link
Contributor

When for binary operator returns more than one Series as result, instantiate Series for every element in the result tuple.

@sighingnow sighingnow changed the title Fix _binop for operators for serials which has more than one returns (divmod/rdivmod). BUG: Fix _binop for operators for serials which has more than one returns (divmod/rdivmod). Mar 7, 2019
@TomAugspurger TomAugspurger added the Numeric Operations Arithmetic, Comparison, and Logical operations label Mar 7, 2019
@@ -31,6 +31,7 @@ Fixed Regressions
- Fixed regression in ``IntervalDtype`` construction where passing an incorrect string with 'Interval' as a prefix could result in a ``RecursionError``. (:issue:`25338`)
- Fixed regression in :class:`Categorical`, where constructing it from a categorical ``Series`` and an explicit ``categories=`` that differed from that in the ``Series`` created an invalid object which could trigger segfaults. (:issue:`25318`)
- Fixed pip installing from source into an environment without NumPy (:issue:`25193`)
- Fixed bug in :meth:`Series._binop` to handle binary operators that returns more than one :class:`Series` correctly (:issue:`25557`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reference :class:=Series.divmod=. These note are for users, and _binop isn't public.

You can remove the "that returns more than one Series correctly" part.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will refactor this part, thanks!

@@ -2502,6 +2502,15 @@ def _binop(self, other, func, level=None, fill_value=None):
-------
Series
"""

def mk_ret(result, new_index, name):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you reuse anything from core/ops.py here? In particular, _construct_divmod_result.

# When name is None, __finalize__ overwrites current name
result.name = None
return result
if isinstance(result, tuple):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hopefully replace with _construct_result and _construct_divmod_result.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for reminding me that two utilities. After _construct_result or _construct_divmod_result we still need to invoke __finalize__ to propagate metadata and overwrite name again. Any suggestion on implementing such construction in a more concise way ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in core/ops.py, we call res_name = get_op_result_name(left, right). Is that an option here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However after _construct_(divmod_)result, we still need __finalize__, which may overwrite name. Thus after __finalize__ we need assign name again.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or should we do __finalize__ in _construct_result in core/ops.py ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's ok to do __finalize__ in _construct_result since _construct_result is only used in core/ops.py at where the __finalize__ should be called (but not now).

Will do that.

@@ -741,6 +741,20 @@ def test_op_duplicate_index(self):
expected = pd.Series([11, 12, np.nan], index=[1, 1, 2])
assert_series_equal(result, expected)

def test_divmod(self):
a = Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add the GitHub issue number as a comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.

@sighingnow
Copy link
Contributor Author

Update:

  1. do __finalize__ in _construct_result
  2. use _construct_result and _construct_divmod_result in _binop, remove duplicate code
  3. revise doc

@codecov
Copy link

codecov bot commented Mar 7, 2019

Codecov Report

Merging #25588 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #25588      +/-   ##
==========================================
+ Coverage   91.26%   91.26%   +<.01%     
==========================================
  Files         173      173              
  Lines       52966    52966              
==========================================
+ Hits        48337    48338       +1     
+ Misses       4629     4628       -1
Flag Coverage Δ
#multiple 89.83% <100%> (ø) ⬆️
#single 41.71% <42.85%> (-0.01%) ⬇️
Impacted Files Coverage Δ
pandas/core/ops.py 91.64% <100%> (+0.01%) ⬆️
pandas/core/series.py 93.67% <100%> (-0.01%) ⬇️
pandas/util/testing.py 87.66% <0%> (+0.09%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 74a9ae3...87d1636. Read the comment docs.

@codecov
Copy link

codecov bot commented Mar 7, 2019

Codecov Report

Merging #25588 into master will decrease coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #25588      +/-   ##
==========================================
- Coverage   91.26%   91.26%   -0.01%     
==========================================
  Files         173      173              
  Lines       52982    52981       -1     
==========================================
- Hits        48356    48354       -2     
- Misses       4626     4627       +1
Flag Coverage Δ
#multiple 89.83% <100%> (-0.01%) ⬇️
#single 41.76% <20%> (ø) ⬆️
Impacted Files Coverage Δ
pandas/core/ops.py 91.74% <100%> (ø) ⬆️
pandas/core/series.py 93.67% <100%> (-0.01%) ⬇️
pandas/util/testing.py 89.3% <0%> (-0.11%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4663951...3bde353. Read the comment docs.

@jreback jreback added the Bug label Mar 9, 2019
out.name = name
out.__finalize__(left)
if name is None:
out.name = name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, this i pretty odd, what exactly is the issue here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have realized that the if condition is not correct here. Assign name to out.name directly should be ok.

Fixed now.

@@ -31,6 +31,7 @@ Fixed Regressions
- Fixed regression in ``IntervalDtype`` construction where passing an incorrect string with 'Interval' as a prefix could result in a ``RecursionError``. (:issue:`25338`)
- Fixed regression in :class:`Categorical`, where constructing it from a categorical ``Series`` and an explicit ``categories=`` that differed from that in the ``Series`` created an invalid object which could trigger segfaults. (:issue:`25318`)
- Fixed pip installing from source into an environment without NumPy (:issue:`25193`)
- Fixed bug in :meth:`Series.divmod` and :meth:`Series.rdivmod` (:issue:`25557`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move to 0.25.0. pls explain what exactly the bug is.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

a = Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])
b = Series([2, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])

r1 = divmod(a, b)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use the format
result=
expected=
assert_series_equal

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@@ -1660,18 +1660,19 @@ def _construct_result(left, result, index, name, dtype=None):
not be enough; we still need to override the name attribute.
"""
out = left._constructor(result, index=index, dtype=dtype)

out.__finalize__(left)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't this be out = out.__finalize__(left)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Thanks!

result.name = None
return result
if func.__name__ in ['divmod', 'rdivmod']:
ret = ops._construct_divmod_result(self, result, new_index, name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could just return on each statement

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previous review comment suggested me to use _construct_divmod_result.

@@ -140,6 +140,7 @@ Bug Fixes
- Bug in :func:`to_datetime` which would raise an (incorrect) ``ValueError`` when called with a date far into the future and the ``format`` argument specified instead of raising ``OutOfBoundsDatetime`` (:issue:`23830`)
- Bug in an error message in :meth:`DataFrame.plot`. Improved the error message if non-numerics are passed to :meth:`DataFrame.plot` (:issue:`25481`)
- Bug in error messages in :meth:`DataFrame.corr` and :meth:`Series.corr`. Added the possibility of using a callable. (:issue:`25729`)
- Bug in :meth:`Series.divmod` and :meth:`Series.rdivmod` which would raise an (incorrect) ``ValueError`` rather than return a pair of :class:`Series` object as result (:issue:`25557`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

object -> objects as a result

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. I have moved this line to the Numeric section.

@jreback
Copy link
Contributor

jreback commented Mar 19, 2019

can you merge master and update small comments, ping on green.

* origin/master:
  DOC: clean bug fix section in whatsnew (pandas-dev#25792)
  DOC: Fixed PeriodArray api ref (pandas-dev#25526)
  Move locale code out of tm, into _config (pandas-dev#25757)
  Unpin pycodestyle (pandas-dev#25789)
  Add test for rdivmod on EA array (GH23287) (pandas-dev#24047)
  ENH: Support datetime.timezone objects (pandas-dev#25065)
  Cython language level 3 (pandas-dev#24538)
  API: concat on sparse values (pandas-dev#25719)
  TST: assert_produces_warning works with filterwarnings (pandas-dev#25721)
  make core.config self-contained (pandas-dev#25613)
  CLN: replace %s syntax with .format in pandas.io.parsers (pandas-dev#24721)
  TST: Check pytables<3.5.1 when skipping (pandas-dev#25773)
  DOC: Fix typo in docstring of DataFrame.memory_usage  (pandas-dev#25770)
  Replace dicts with OrderedDicts in groupby aggregation functions (pandas-dev#25693)
  TST: Fixturize tests/frame/test_missing.py (pandas-dev#25640)
  DOC: Improve the docsting of Series.iteritems (pandas-dev#24879)
  DOC: Fix function name. (pandas-dev#25751)
  Implementing iso_week_year support for to_datetime (pandas-dev#25541)
  DOC: clarify corr behaviour when using a callable (pandas-dev#25732)
  remove unnecessary check_output (pandas-dev#25755)

# Conflicts:
#	doc/source/whatsnew/v0.25.0.rst
Signed-off-by: HE, Tao <[email protected]>
@sighingnow
Copy link
Contributor Author

CI is green. /cc @jreback

@jreback jreback added this to the 0.25.0 milestone Mar 20, 2019
@jreback jreback merged commit 37d04a3 into pandas-dev:master Mar 20, 2019
@jreback
Copy link
Contributor

jreback commented Mar 20, 2019

thanks @sighingnow

@sighingnow sighingnow deleted the fix-25557 branch March 20, 2019 12:51
thoo added a commit to thoo/pandas that referenced this pull request Mar 20, 2019
* upstream/master: (55 commits)
  PERF: Improve performance of StataReader (pandas-dev#25780)
  Speed up tokenizing of a row in csv and xstrtod parsing (pandas-dev#25784)
  BUG: Fix _binop for operators for serials which has more than one returns (divmod/rdivmod). (pandas-dev#25588)
  BUG-24971 copying blocks also considers ndim (pandas-dev#25521)
  CLN: Panel reference from documentation (pandas-dev#25649)
  ENH: Quoting column names containing spaces with backticks to use them in query and eval. (pandas-dev#24955)
  BUG: reading windows utf8 filenames in py3.6 (pandas-dev#25769)
  DOC: clean bug fix section in whatsnew (pandas-dev#25792)
  DOC: Fixed PeriodArray api ref (pandas-dev#25526)
  Move locale code out of tm, into _config (pandas-dev#25757)
  Unpin pycodestyle (pandas-dev#25789)
  Add test for rdivmod on EA array (GH23287) (pandas-dev#24047)
  ENH: Support datetime.timezone objects (pandas-dev#25065)
  Cython language level 3 (pandas-dev#24538)
  API: concat on sparse values (pandas-dev#25719)
  TST: assert_produces_warning works with filterwarnings (pandas-dev#25721)
  make core.config self-contained (pandas-dev#25613)
  CLN: replace %s syntax with .format in pandas.io.parsers (pandas-dev#24721)
  TST: Check pytables<3.5.1 when skipping (pandas-dev#25773)
  DOC: Fix typo in docstring of DataFrame.memory_usage  (pandas-dev#25770)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Numeric Operations Arithmetic, Comparison, and Logical operations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: ValueError in Series.divmod
3 participants