Skip to content

Correct bool to datetime conversion (behaviour based on int/float processing) #13176

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

gliptak
Copy link
Contributor

@gliptak gliptak commented May 14, 2016

@@ -2230,7 +2230,7 @@ cpdef array_to_datetime(ndarray[object] values, errors='raise',
seen_integer=1
else:
try:
if len(val) == 0 or val in _nat_strings:
if not hasattr(val, "__len__") or len(val) == 0 or val in _nat_strings:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of this make a new path for bool detection

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

modified

@jreback jreback added Bug Datetime Datetime data dtype Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate labels May 14, 2016
@@ -2214,7 +2214,8 @@ cpdef array_to_datetime(ndarray[object] values, errors='raise',
iresult[i] = NPY_NAT
continue
raise

elif is_bool_object(val):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is only true for coerce, pls follow the existing patterns

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed

@codecov-io
Copy link

codecov-io commented May 14, 2016

Current coverage is 84.12%

Merging #13176 into master will decrease coverage by <.01%

@@             master     #13176   diff @@
==========================================
  Files           138        138          
  Lines         50587      50385   -202   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
- Hits          42593      42383   -210   
- Misses         7994       8002     +8   
  Partials          0          0          

Powered by Codecov. Last updated by b4e2d34...e27a444

self.assertTrue(to_datetime(True, errors="coerce") is tslib.NaT)
self.assertEqual(to_datetime(True, errors="raise"), to_datetime(1))
self.assertEqual(to_datetime(True, errors="ignore"), to_datetime(1))
self.assert_numpy_array_equal(to_datetime([True, False, tslib.NaT]),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use tm.assert_index_equal

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed

@jreback
Copy link
Contributor

jreback commented May 14, 2016

add a whatsnew entry

@@ -608,6 +608,18 @@ def test_parsers(self):
self.assertTrue(result1 is tslib.NaT)
self.assertTrue(result1 is tslib.NaT)

def test_datetime_bool(self):
self.assertEqual(to_datetime(False), to_datetime(0))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add the issue number as a comment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

@gliptak gliptak changed the title Convert bool to datetime returns NaT Correct bool to datetime conversion (behaviour based on int/float processing) May 14, 2016
@gliptak
Copy link
Contributor Author

gliptak commented May 14, 2016

added whatsnew

@gliptak
Copy link
Contributor Author

gliptak commented May 15, 2016

build is green

@@ -608,6 +608,19 @@ def test_parsers(self):
self.assertTrue(result1 is tslib.NaT)
self.assertTrue(result1 is tslib.NaT)

def test_datetime_bool(self):
# GH13176
self.assertEqual(to_datetime(False), to_datetime(0))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you also add some tests where bools are intermixed with valid (and also invalid conversions)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

@gliptak
Copy link
Contributor Author

gliptak commented May 17, 2016

Would you like to see any other changes? Thanks

1 similar comment
@gliptak
Copy link
Contributor Author

gliptak commented May 19, 2016

Would you like to see any other changes? Thanks

@jreback
Copy link
Contributor

jreback commented May 19, 2016

@gliptak actually I think I steered you wrong here. These should raise if errors='raise' (more informative message though), pass thru for errors='ignore', and become NaT for errors='coerce'.

They should not convert to 0/1 datetimes.

@gliptak
Copy link
Contributor Author

gliptak commented May 19, 2016

What is the informative message when raise? Also what does pass through mean when ignore? Thanks

@jreback
Copy link
Contributor

jreback commented May 20, 2016

# this is what an unconvertible string does
In [1]: pd.to_datetime('foo', errors='raise')
ValueError: Unknown string format

# ignore just passes thru the values
In [2]: pd.to_datetime('foo', errors='ignore')
Out[2]: 'foo'

so a messge like booleans are not convertible to datetimes is prob enough.

@gliptak
Copy link
Contributor Author

gliptak commented May 21, 2016

@jreback Is this the expected processing?

if val == NPY_NAT:
iresult[i] = NPY_NAT
elif is_raise:
raise ValueError("boolean is not convertible to datetime")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

follow the existing pattern for ignore

Copy link
Contributor

@jreback jreback May 21, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you need to raise TypeError so it kicks out to the object loop

add a test that has something like

pd.to_datetime(['20130101',True])

@jreback jreback added this to the 0.18.2 milestone May 24, 2016
@jreback
Copy link
Contributor

jreback commented May 24, 2016

ok we have another case.

I accidently gave it this (from the original issue)

In [5]: pd.to_datetime(df.bool)
TypeError: object of type 'instancemethod' has no len()

bool is a method! so this hit the else clause. I think need to make that an elif and is_string_object, then have the else fall thru and raise/coerce as needed. (you could even remove the is_bool_object and just do that in the else and in the exception do type(v)

@gliptak
Copy link
Contributor Author

gliptak commented May 25, 2016

updated

@gliptak
Copy link
Contributor Author

gliptak commented May 26, 2016

build is green

@jreback jreback closed this in f2ce0ac May 26, 2016
@jreback
Copy link
Contributor

jreback commented May 26, 2016

thanks @gliptak

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Datetime Datetime data dtype Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ERR: better exception for converting bool to datetime with 0.17
3 participants