added missing numeric_only option for DataFrame.std/var/sem #9209

Merged: 1 commit, Mar 12, 2015

Contributor

@mortada mortada commented Jan 7, 2015

Closes #9201. The numeric_only option is missing from DataFrame.std() (and also from DataFrame.var() and DataFrame.sem()); this PR adds it.
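
For readers unfamiliar with the flag, here is a rough sketch of what a numeric_only option does for a column-wise reduction. It is plain Python rather than pandas code: `column_std` and the dict-of-lists table are hypothetical names invented for this illustration, and the real DataFrame.std() implementation is not shown in this thread.

```python
import statistics
from numbers import Real


def column_std(table, numeric_only=False):
    """Sample standard deviation per column of a dict-of-lists table.

    Hypothetical helper for illustration only; this is not pandas code.
    """
    result = {}
    for name, values in table.items():
        # A column counts as numeric only if every entry is a real number
        # (bool is excluded on purpose).
        is_numeric = all(
            isinstance(v, Real) and not isinstance(v, bool) for v in values
        )
        if not is_numeric:
            if numeric_only:
                continue  # drop the non-numeric column from the result
            raise TypeError(f"column {name!r} contains non-numeric values")
        result[name] = statistics.stdev(values)
    return result


table = {"foo": [1.0, 2.0, 3.0], "label": ["a", "b", "c"]}
stds = column_std(table, numeric_only=True)  # only "foo" is reduced
```

With numeric_only=False the same call raises TypeError in this sketch, because the "label" column cannot be reduced.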

@jreback jreback added the Bug and Numeric Operations (Arithmetic, Comparison, and Logical operations) labels Jan 8, 2015
@jreback jreback added this to the 0.16.0 milestone Jan 8, 2015
@@ -11458,6 +11458,21 @@ def test_var_std(self):
        self.assertFalse((result < 0).any())
        nanops._USE_BOTTLENECK = True

    def test_numeric_only_flag(self):
        methods = ['sem', 'var', 'std']
Contributor

pls add the issue number in a comment here

Contributor

jreback commented Jan 8, 2015

pls add a release note in the Bug Fix section of 0.16

Contributor Author

mortada commented Jan 8, 2015

@jreback updated, please take a look, thanks


df2 = DataFrame(np.random.randn(5, 3), columns=['foo', 'bar', 'baz'])
# set one entry in a non-existing column to a str
df2.ix[0, 'a'] = '150'
Contributor

I mean set the entry to 'a', e.g. make it an object series that has a completely invalid string (and not a string-like number)

@mortada mortada force-pushed the numeric_only branch 2 times, most recently from c5adc3e to c74b932 Compare January 8, 2015 17:17
Contributor Author

mortada commented Jan 8, 2015

@jreback PR updated, please take a look

Contributor

jreback commented Jan 8, 2015

I think that if you have a string '100' in a column and the rest are actual float/int values (thus the dtype is object), then this should always raise a ValueError and not do the coercion

it hides errors to evaluate the string as a number

Contributor Author

mortada commented Jan 9, 2015

that's a good point. I'll dig into this.

Contributor

jreback commented Jan 18, 2015

@mortada pls rebase and update according to the last comment. thanks

Contributor

jreback commented Mar 5, 2015

pls rebase

@jreback jreback modified the milestones: 0.16.1, 0.16.0 Mar 5, 2015
Contributor Author

mortada commented Mar 6, 2015

@jreback just rebased, sorry I've been pretty swamped and haven't got around to this. But I should have more time going forward.

@mortada mortada force-pushed the numeric_only branch 2 times, most recently from 2e51776 to 414b31a Compare March 8, 2015 08:52
Contributor Author

mortada commented Mar 10, 2015

@jreback rebased and addressed the issue you mentioned about type coercion. Please take a look.

With this PR _nanvar will only convert values for int types, and not coerce str or other types into float like it used to. It will throw a TypeError if numeric_only is False and there's a str type.

This is now consistent with the behavior of other methods such as DataFrame.mean.
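
The coercion rule described above can be sketched in plain Python. This is only an illustration of the idea (upcast ints to float, refuse to parse strings); `strict_var` is a hypothetical name made up for this example, not the actual `_nanvar` code from pandas.

```python
def strict_var(values, ddof=1):
    """Variance that upcasts ints to float but never parses strings.

    Hypothetical illustration; not the pandas _nanvar implementation.
    """
    converted = []
    for v in values:
        # Only real int/float values are accepted. A string such as '150'
        # is rejected instead of being silently coerced to 150.0.
        if isinstance(v, bool) or not isinstance(v, (int, float)):
            raise TypeError(
                f"cannot compute variance of {type(v).__name__} value {v!r}"
            )
        converted.append(float(v))
    n = len(converted)
    if n - ddof <= 0:
        return float("nan")  # not enough observations
    mean = sum(converted) / n
    return sum((x - mean) ** 2 for x in converted) / (n - ddof)


int_result = strict_var([1, 2, 3, 4])  # ints are converted to float first
```

Calling `strict_var([1.0, 2.0, '150'])` raises TypeError in this sketch, which mirrors the point made earlier in the review: evaluating the string as a number would hide errors.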

@jreback jreback modified the milestones: 0.16.0, 0.16.1 Mar 12, 2015
jreback added a commit that referenced this pull request Mar 12, 2015
added missing numeric_only option for DataFrame.std/var/sem
@jreback jreback merged commit 09ea608 into pandas-dev:master Mar 12, 2015
Contributor

jreback commented Mar 12, 2015

thanks!

@mortada mortada deleted the numeric_only branch April 29, 2015 15:31
Labels
Bug, Numeric Operations (Arithmetic, Comparison, and Logical operations)
Development

Successfully merging this pull request may close these issues.

Issues with numeric_only for DataFrame.std()
2 participants