-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: Fix _binop for operators for serials which has more than one returns (divmod/rdivmod). #25588
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: HE, Tao <[email protected]>
doc/source/whatsnew/v0.24.2.rst
Outdated
@@ -31,6 +31,7 @@ Fixed Regressions | |||
- Fixed regression in ``IntervalDtype`` construction where passing an incorrect string with 'Interval' as a prefix could result in a ``RecursionError``. (:issue:`25338`) | |||
- Fixed regression in :class:`Categorical`, where constructing it from a categorical ``Series`` and an explicit ``categories=`` that differed from that in the ``Series`` created an invalid object which could trigger segfaults. (:issue:`25318`) | |||
- Fixed pip installing from source into an environment without NumPy (:issue:`25193`) | |||
- Fixed bug in :meth:`Series._binop` to handle binary operators that returns more than one :class:`Series` correctly (:issue:`25557`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reference :class:=Series.divmod=
. These note are for users, and _binop isn't public.
You can remove the "that returns more than one Series correctly" part.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will refactor this part, thanks!
pandas/core/series.py
Outdated
@@ -2502,6 +2502,15 @@ def _binop(self, other, func, level=None, fill_value=None): | |||
------- | |||
Series | |||
""" | |||
|
|||
def mk_ret(result, new_index, name): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you reuse anything from core/ops.py
here? In particular, _construct_divmod_result
.
pandas/core/series.py
Outdated
# When name is None, __finalize__ overwrites current name | ||
result.name = None | ||
return result | ||
if isinstance(result, tuple): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hopefully replace with _construct_result
and _construct_divmod_result
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for reminding me that two utilities. After _construct_result
or _construct_divmod_result
we still need to invoke __finalize__
to propagate metadata and overwrite name
again. Any suggestion on implementing such construction in a more concise way ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in core/ops.py, we call res_name = get_op_result_name(left, right)
. Is that an option here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However after _construct_(divmod_)result
, we still need __finalize__
, which may overwrite name
. Thus after __finalize__
we need assign name
again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or should we do __finalize__
in _construct_result
in core/ops.py ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's ok to do __finalize__
in _construct_result
since _construct_result
is only used in core/ops.py at where the __finalize__
should be called (but not now).
Will do that.
@@ -741,6 +741,20 @@ def test_op_duplicate_index(self): | |||
expected = pd.Series([11, 12, np.nan], index=[1, 1, 2]) | |||
assert_series_equal(result, expected) | |||
|
|||
def test_divmod(self): | |||
a = Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add the GitHub issue number as a comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do.
Signed-off-by: HE, Tao <[email protected]>
Update:
|
Codecov Report
@@ Coverage Diff @@
## master #25588 +/- ##
==========================================
+ Coverage 91.26% 91.26% +<.01%
==========================================
Files 173 173
Lines 52966 52966
==========================================
+ Hits 48337 48338 +1
+ Misses 4629 4628 -1
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #25588 +/- ##
==========================================
- Coverage 91.26% 91.26% -0.01%
==========================================
Files 173 173
Lines 52982 52981 -1
==========================================
- Hits 48356 48354 -2
- Misses 4626 4627 +1
Continue to review full report at Codecov.
|
pandas/core/ops.py
Outdated
out.name = name | ||
out.__finalize__(left) | ||
if name is None: | ||
out.name = name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, this i pretty odd, what exactly is the issue here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have realized that the if
condition is not correct here. Assign name
to out.name
directly should be ok.
Fixed now.
doc/source/whatsnew/v0.24.2.rst
Outdated
@@ -31,6 +31,7 @@ Fixed Regressions | |||
- Fixed regression in ``IntervalDtype`` construction where passing an incorrect string with 'Interval' as a prefix could result in a ``RecursionError``. (:issue:`25338`) | |||
- Fixed regression in :class:`Categorical`, where constructing it from a categorical ``Series`` and an explicit ``categories=`` that differed from that in the ``Series`` created an invalid object which could trigger segfaults. (:issue:`25318`) | |||
- Fixed pip installing from source into an environment without NumPy (:issue:`25193`) | |||
- Fixed bug in :meth:`Series.divmod` and :meth:`Series.rdivmod` (:issue:`25557`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move to 0.25.0. pls explain what exactly the bug is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
a = Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd']) | ||
b = Series([2, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e']) | ||
|
||
r1 = divmod(a, b) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use the format
result=
expected=
assert_series_equal
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
# Conflicts: # doc/source/whatsnew/v0.24.2.rst
…ition. Signed-off-by: HE, Tao <[email protected]>
pandas/core/ops.py
Outdated
@@ -1660,18 +1660,19 @@ def _construct_result(left, result, index, name, dtype=None): | |||
not be enough; we still need to override the name attribute. | |||
""" | |||
out = left._constructor(result, index=index, dtype=dtype) | |||
|
|||
out.__finalize__(left) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't this be out = out.__finalize__(left)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed. Thanks!
result.name = None | ||
return result | ||
if func.__name__ in ['divmod', 'rdivmod']: | ||
ret = ops._construct_divmod_result(self, result, new_index, name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you could just return on each statement
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previous review comment suggested me to use _construct_divmod_result
.
doc/source/whatsnew/v0.25.0.rst
Outdated
@@ -140,6 +140,7 @@ Bug Fixes | |||
- Bug in :func:`to_datetime` which would raise an (incorrect) ``ValueError`` when called with a date far into the future and the ``format`` argument specified instead of raising ``OutOfBoundsDatetime`` (:issue:`23830`) | |||
- Bug in an error message in :meth:`DataFrame.plot`. Improved the error message if non-numerics are passed to :meth:`DataFrame.plot` (:issue:`25481`) | |||
- Bug in error messages in :meth:`DataFrame.corr` and :meth:`Series.corr`. Added the possibility of using a callable. (:issue:`25729`) | |||
- Bug in :meth:`Series.divmod` and :meth:`Series.rdivmod` which would raise an (incorrect) ``ValueError`` rather than return a pair of :class:`Series` object as result (:issue:`25557`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
object -> objects as a result
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed. I have moved this line to the Numeric
section.
can you merge master and update small comments, ping on green. |
* origin/master: DOC: clean bug fix section in whatsnew (pandas-dev#25792) DOC: Fixed PeriodArray api ref (pandas-dev#25526) Move locale code out of tm, into _config (pandas-dev#25757) Unpin pycodestyle (pandas-dev#25789) Add test for rdivmod on EA array (GH23287) (pandas-dev#24047) ENH: Support datetime.timezone objects (pandas-dev#25065) Cython language level 3 (pandas-dev#24538) API: concat on sparse values (pandas-dev#25719) TST: assert_produces_warning works with filterwarnings (pandas-dev#25721) make core.config self-contained (pandas-dev#25613) CLN: replace %s syntax with .format in pandas.io.parsers (pandas-dev#24721) TST: Check pytables<3.5.1 when skipping (pandas-dev#25773) DOC: Fix typo in docstring of DataFrame.memory_usage (pandas-dev#25770) Replace dicts with OrderedDicts in groupby aggregation functions (pandas-dev#25693) TST: Fixturize tests/frame/test_missing.py (pandas-dev#25640) DOC: Improve the docsting of Series.iteritems (pandas-dev#24879) DOC: Fix function name. (pandas-dev#25751) Implementing iso_week_year support for to_datetime (pandas-dev#25541) DOC: clarify corr behaviour when using a callable (pandas-dev#25732) remove unnecessary check_output (pandas-dev#25755) # Conflicts: # doc/source/whatsnew/v0.25.0.rst
Signed-off-by: HE, Tao <[email protected]>
CI is green. /cc @jreback |
thanks @sighingnow |
* upstream/master: (55 commits) PERF: Improve performance of StataReader (pandas-dev#25780) Speed up tokenizing of a row in csv and xstrtod parsing (pandas-dev#25784) BUG: Fix _binop for operators for serials which has more than one returns (divmod/rdivmod). (pandas-dev#25588) BUG-24971 copying blocks also considers ndim (pandas-dev#25521) CLN: Panel reference from documentation (pandas-dev#25649) ENH: Quoting column names containing spaces with backticks to use them in query and eval. (pandas-dev#24955) BUG: reading windows utf8 filenames in py3.6 (pandas-dev#25769) DOC: clean bug fix section in whatsnew (pandas-dev#25792) DOC: Fixed PeriodArray api ref (pandas-dev#25526) Move locale code out of tm, into _config (pandas-dev#25757) Unpin pycodestyle (pandas-dev#25789) Add test for rdivmod on EA array (GH23287) (pandas-dev#24047) ENH: Support datetime.timezone objects (pandas-dev#25065) Cython language level 3 (pandas-dev#24538) API: concat on sparse values (pandas-dev#25719) TST: assert_produces_warning works with filterwarnings (pandas-dev#25721) make core.config self-contained (pandas-dev#25613) CLN: replace %s syntax with .format in pandas.io.parsers (pandas-dev#24721) TST: Check pytables<3.5.1 when skipping (pandas-dev#25773) DOC: Fix typo in docstring of DataFrame.memory_usage (pandas-dev#25770) ...
When for binary operator returns more than one
Series
as result, instantiateSeries
for every element in the result tuple.git diff upstream/master -u -- "*.py" | flake8 --diff