-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
ENH: Add matmul to DataFrame, Series #19035
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## master #19035 +/- ##
==========================================
- Coverage 91.83% 91.82% -0.02%
==========================================
Files 152 152
Lines 49260 49269 +9
==========================================
+ Hits 45240 45241 +1
- Misses 4020 4028 +8
Continue to review full report at Codecov.
|
pandas/tests/frame/test_analytics.py
Outdated
@@ -2067,6 +2067,58 @@ def test_dot(self): | |||
with tm.assert_raises_regex(ValueError, 'aligned'): | |||
df.dot(df2) | |||
|
|||
def test_matmul(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reference issue number.
pandas/core/series.py
Outdated
@@ -1625,6 +1625,13 @@ def dot(self, other): | |||
else: # pragma: no cover | |||
raise TypeError('unsupported type: %s' % type(other)) | |||
|
|||
def __matmul__(self, other): | |||
from pandas.core.frame import DataFrame | |||
dot_types = (Index, DataFrame, Series, np.ndarray, list) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your tests need to ensure that for every supported dot type, this operation works. I don't think that's the case currently in your changes.
pandas/core/frame.py
Outdated
@@ -866,6 +866,12 @@ def dot(self, other): | |||
else: # pragma: no cover | |||
raise TypeError('unsupported type: %s' % type(other)) | |||
|
|||
def __matmul__(self, other): | |||
dot_types = (Index, DataFrame, Series, np.ndarray, list) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like this should become a common resource, since you use it in multiple places.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pls add a whatsnew note in other enhancements
pandas/core/frame.py
Outdated
dot_types = (Index, DataFrame, Series, np.ndarray, list) | ||
if not isinstance(other, dot_types): | ||
return NotImplemented | ||
return self.dot(other) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you don't need to specify the dot_types here, these are already handled in .dot (though maybe not tested fully). if something is rejected it should be there
pandas/tests/frame/test_analytics.py
Outdated
@@ -2067,6 +2067,58 @@ def test_dot(self): | |||
with tm.assert_raises_regex(ValueError, 'aligned'): | |||
df.dot(df2) | |||
|
|||
def test_matmul(self): | |||
a = DataFrame(np.random.randn(3, 4), index=['a', 'b', 'c'], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of writing a new test, just parameterize test_dot on dot/matmul as this is pretty much the same
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you will need to skip the matmul tests if not PY35
@@ -837,6 +837,32 @@ def test_dot(self): | |||
pytest.raises(Exception, a.dot, a.values[:3]) | |||
pytest.raises(ValueError, a.dot, b.T) | |||
|
|||
def test_matmul(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same
5519e13
to
d5110ed
Compare
Per @jreback 's comment I changed the implementation to just run Re: Python<3.5: it seems like we can still run the test for |
pandas/core/series.py
Outdated
@@ -1596,6 +1596,12 @@ def dot(self, other): | |||
else: # pragma: no cover | |||
raise TypeError('unsupported type: %s' % type(other)) | |||
|
|||
def __matmul__(self, other): | |||
try: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can u add a docstring
pandas/core/frame.py
Outdated
@@ -874,6 +874,12 @@ def dot(self, other): | |||
else: # pragma: no cover | |||
raise TypeError('unsupported type: %s' % type(other)) | |||
|
|||
def __matmul__(self, other): | |||
try: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doc string
pandas/tests/frame/test_analytics.py
Outdated
@@ -2075,41 +2075,42 @@ def test_clip_with_na_args(self): | |||
self.frame) | |||
|
|||
# Matrix-like | |||
|
|||
@pytest.mark.parametrize('dot_fn', [DataFrame.dot, DataFrame.__matmul__]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can u add a test that hits the not implemented error
doc/source/whatsnew/v0.23.0.txt
Outdated
@@ -202,6 +202,7 @@ Other Enhancements | |||
- ``Resampler`` objects now have a functioning :attr:`~pandas.core.resample.Resampler.pipe` method. | |||
Previously, calls to ``pipe`` were diverted to the ``mean`` method (:issue:`17905`). | |||
- :func:`~pandas.api.types.is_scalar` now returns ``True`` for ``DateOffset`` objects (:issue:`18943`). | |||
- :class:`DataFrame` and :class:`Series` now support matrix multiplication (```@```) operator (:issue:`10259`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
say for >= 3.5
@@ -1596,6 +1596,12 @@ def dot(self, other): | |||
else: # pragma: no cover | |||
raise TypeError('unsupported type: %s' % type(other)) | |||
|
|||
def __matmul__(self, other): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can u update the .dot doc string that ‘@‘ operator is supported in >=3.5
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you do this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean update the .dot doc string? I believe that is done
can you rebase / update |
36a0cb8
to
419c9d2
Compare
@jreback rebased and added docstrings. I also tried to add a test for the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
small comments. lgtm otherwise. yeah I think the np.asarray(other)
call is actually ok.
pandas/core/frame.py
Outdated
def __matmul__(self, other): | ||
""" Matrix multiplication using binary `@` operator in Python>=3.5 """ | ||
try: | ||
return self.dot(other) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so let's remove the try/except then
pandas/core/series.py
Outdated
def __matmul__(self, other): | ||
""" Matrix multiplication using binary `@` operator in Python>=3.5 """ | ||
try: | ||
return self.dot(other) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here
removed! |
lgtm. ping on green! |
@jreback 💚 |
@@ -895,28 +895,30 @@ def test_count(self): | |||
ts.iloc[[0, 3, 5]] = nan | |||
assert_series_equal(ts.count(level=1), right - 1) | |||
|
|||
def test_dot(self): | |||
@pytest.mark.parametrize('dot_fn', [Series.dot, Series.__matmul__]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prefer operator.matmul
(which corresponds to the @
operator) to Series.__matmul__
.
You really should implement |
I don't think this is a great excuse for not implementing That said, it seems that pandas doesn't currently defer to other objects for any arithmetic operations, so I don't see any particularly good reasons to start here first. |
can you rebase and update |
9922a78
to
41bc7b1
Compare
Hello @bnaul! Thanks for updating the PR. Cheers ! There are no PEP8 issues in this Pull Request. 🍻 Comment last updated on March 27, 2018 at 23:15 Hours UTC |
e391293
to
236bae4
Compare
Added
|
@bnaul can you rebase |
@@ -1596,6 +1596,12 @@ def dot(self, other): | |||
else: # pragma: no cover | |||
raise TypeError('unsupported type: %s' % type(other)) | |||
|
|||
def __matmul__(self, other): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you do this
@jbrockmendel do these belong in |
I think the PR defines the methods in the right place. Defining things directly in the class makes for more obvious code than pinning from afar. The tests might belong in @bnaul can I get you to add a couple of mixed-dtype tests? e.g. a DataFrame with a mix of integer/float columns. |
@jbrockmendel What would a mixed dtype test case test exactly? I can't think of a scenario where everything wouldn't just get coerced to floats. |
exactly that it does get coerced correctly. Some methods don't play nicely with mixed-dtypes |
Makes sense, lmk if you think the tests I added missed anything @jreback I think I already made the change you requested |
Looks good, thanks |
thanks @bnaul nice patch! |
Fixes #10259. Not sure about the exact set of types that should be supported, for now I just copied the classes that are referenced in
dot
.