Skip to content

BUG: Series dot product __rmatmul__ doesn't allow matrix vector multiplication #21578

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Jun 22, 2018

Conversation

minggli
Copy link
Contributor

@minggli minggli commented Jun 21, 2018

the new __rmatmul__ implementation in Series seems to be missing matrix vector multiplication case as raised in the issue in question. Only inner product of two vectors was supported in __rmatmul__ method.

This PR uses the same implementation in DataFrame to add support in Series and test case.

matmul operator (@, @=) was added in python 3.5 in https://www.python.org/dev/peps/pep-0465/

class A(object):
    def __matmul__(self, other):
        print('__matmul__ is called in A.')

class B(object):
    def __rmatmul__(self, other):
        print('__rmatmul__ is called in B.')

A() @ B()
B() @ A()
del A.__matmul__
A() @ B()

>>> A() @ B()
__matmul__ is called in A.
>>> B() @ A()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for @: 'B' and 'A'
>>> del A.__matmul__
>>> A() @ B()
__rmatmul__ is called in B.

@gfyoung gfyoung added Bug Algos Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff labels Jun 21, 2018
@gfyoung
Copy link
Member

gfyoung commented Jun 21, 2018

@minggli : Looks like a good start! Can we also add a whatsnew too?

@codecov
Copy link

codecov bot commented Jun 21, 2018

Codecov Report

Merging #21578 into master will not change coverage.
The diff coverage is 100%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master   #21578   +/-   ##
=======================================
  Coverage    91.9%    91.9%           
=======================================
  Files         153      153           
  Lines       49549    49549           
=======================================
  Hits        45539    45539           
  Misses       4010     4010
Flag Coverage Δ
#multiple 90.3% <100%> (ø) ⬆️
#single 41.78% <0%> (ø) ⬆️
Impacted Files Coverage Δ
pandas/core/series.py 94.19% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 028c9c0...d5a0d28. Read the comment docs.

@minggli
Copy link
Contributor Author

minggli commented Jun 21, 2018

raised an issue #21581 regarding error message when __rmatmul__ is called in dataframe and series. as dot method is implemented treating dataframe or series as left. when __rmatmul__ is called the error message doesn't work well.

this is an existing error.

need advice. fix it in the same PR and how to do that or should we handle it separately? @jreback

result = operator.matmul(a.values, a)
# GH 21530
# vector (list) @ Series (__rmatmul__)
result = operator.matmul(a.values.tolist(), a)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about the np.array @ Series ? is that always calling matmul (and not rmatmul)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for both 1d and 2d. the OP example was 2 d, can you add that as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

np.array @ Series should call __rmatmul__, so either list or np.array should be the same, i.e. calling __rmatmul__
np.array @ DataFrame however only calls np.array.__matmul__

actioned.

result = operator.matmul(a.values, a)
# GH 21530
# vector (list) @ Series (__rmatmul__)
result = operator.matmul(a.values.tolist(), a)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for both 1d and 2d. the OP example was 2 d, can you add that as well.

@minggli minggli force-pushed the bugfix/dotproduct branch from 7a74af7 to 3db45a9 Compare June 22, 2018 12:16
@jreback jreback added this to the 0.24.0 milestone Jun 22, 2018
@jreback jreback merged commit 7d8626d into pandas-dev:master Jun 22, 2018
@jreback
Copy link
Contributor

jreback commented Jun 22, 2018

thanks @minggli very nice

@minggli
Copy link
Contributor Author

minggli commented Jun 23, 2018

Thanks @jreback and @gfyoung

@minggli minggli deleted the bugfix/dotproduct branch June 23, 2018 07:34
Sup3rGeo pushed a commit to Sup3rGeo/pandas that referenced this pull request Oct 1, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Algos Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff Bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: np.array @ pd.Series swaps arguments
3 participants