-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
ENH: Add columns argument to read_feather() (#24025) #24034
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: Add columns argument to read_feather() (#24025) #24034
Conversation
Hello @nixphix! Thanks for submitting the PR.
|
Codecov Report
@@ Coverage Diff @@
## master #24034 +/- ##
=======================================
Coverage 42.46% 42.46%
=======================================
Files 161 161
Lines 51557 51557
=======================================
Hits 21892 21892
Misses 29665 29665
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #24034 +/- ##
===========================================
+ Coverage 42.38% 92.25% +49.86%
===========================================
Files 161 161
Lines 51701 51701
===========================================
+ Hits 21914 47696 +25782
+ Misses 29787 4005 -25782
Continue to review full report at Codecov.
|
Any reason to not just specify the columns as a list? It’d make the test easier to read.
________________________________
From: Prabakaran Kumaresshan <[email protected]>
Sent: Saturday, December 1, 2018 11:43 AM
To: pandas-dev/pandas
Cc: Tom Augspurger; Mention
Subject: Re: [pandas-dev/pandas] ENH: Add columns argument to read_feather() (#24025) (#24034)
@nixphix commented on this pull request.
________________________________
In pandas/tests/io/test_feather.py<#24034 (comment)>:
@@ -74,6 +77,18 @@ def test_stringify_columns(self):
df = pd.DataFrame(np.arange(12).reshape(4, 3)).copy()
self.check_error_on_write(df, ValueError)
+ def test_read_columns(self):
+
+ df = pd.DataFrame({'col1': list('abc'),
+ 'col2': list(range(1, 4)),
+ 'col3': list('xyz'),
+ 'col4': list(range(4, 7))})
+ self.check_round_trip(df, columns=None)
+ self.check_round_trip(df, columns=df.columns)
+ random_cols = np.random.choice(df.columns, 2)
@TomAugspurger<https://github.com/TomAugspurger> I missed replace=Flase argument that will ensure we are getting unique columns, this should work
random_cols = np.random.choice(df.columns, 2, replace=False)
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub<#24034 (comment)>, or mute the thread<https://github.com/notifications/unsubscribe-auth/ABQHIq2LF0wkf9uXwQ5c22CFCGQfVx8gks5u0r-_gaJpZM4Y803h>.
|
@TomAugspurger It's just that I'm not a fan of hard coded values, thought it might make the test case more dynamic and robust. I will change it to list for better readability. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pandas/tests/io/test_feather.py
Outdated
@@ -74,6 +77,21 @@ def test_stringify_columns(self): | |||
df = pd.DataFrame(np.arange(12).reshape(4, 3)).copy() | |||
self.check_error_on_write(df, ValueError) | |||
|
|||
@pytest.mark.parametrize("columns", [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hate to harp on this, but what are we testing by parametrizing this test? IMO, we should only be concerned with testing that pandas passes through the columns
argument, so these tests seem redundant. I'd only keep the ['col1', 'col2']
test.
Are we worried about pyarrow breaking something, so this is some kind of integration test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, let's not worry about how the integration works, removed None
and ['col1', 'col2', 'col3', 'col4']
lgtm. @TomAugspurger |
Thanks @nixphix! |
passedgit diff upstream/master -u -- "*.py" | flake8 --diff
I have added test case but when running
test cases (including some of already existing test cases) fails with error
let me know if this is expected or I'm missing something