BUG: Convert non-dates in xls date cells to number #13042

Closed
wants to merge 1 commit into from

kordek
Contributor

@kordek kordek commented Apr 30, 2016

closes GH10001

@jreback
Contributor

jreback commented Apr 30, 2016

pls add some tests

@jreback jreback added the IO Excel read_excel, to_excel label Apr 30, 2016
@@ -329,29 +329,38 @@ def _parse_cell(cell_contents, cell_typ):
    appropriate object"""

    if cell_typ == XL_CELL_DATE:
        if xlrd_0_9_3:
            # Use the newer xlrd datetime handling.
            cell_contents = xldate.xldate_as_datetime(cell_contents,

rather than a giant try: except: can you narrow down its placement
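
A rough sketch of what narrowing the try/except could look like, assuming the overflow comes from the date conversion call alone. It does not use xlrd: `serial_to_datetime` and `parse_date_cell` are hypothetical stand-ins built on the standard library, and the warning text is illustrative only.

```python
import warnings
from datetime import datetime, timedelta

# Hypothetical stand-in for xlrd's date conversion (1900 date system).
EXCEL_EPOCH = datetime(1899, 12, 30)


def serial_to_datetime(serial):
    """Convert an Excel serial day number to a datetime."""
    return EXCEL_EPOCH + timedelta(days=serial)


def parse_date_cell(cell_contents):
    """Return a datetime, or the raw number if the conversion overflows."""
    try:
        # Only the call that can overflow sits inside the try block.
        converted = serial_to_datetime(cell_contents)
    except OverflowError:
        warnings.warn("date cell holds a non-date number; keeping the number")
        return cell_contents
    # Later processing stays outside the try, so its errors are not swallowed.
    return converted
```

With this shape, an unrelated bug in later processing surfaces as its own exception instead of being silently treated as an overflow.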

@kordek kordek force-pushed the #10001 branch 2 times, most recently from d4ebfdb to 515a5fb Compare April 30, 2016 18:07
@codecov-io

codecov-io commented Apr 30, 2016

Current coverage is 84.14%

Merging #13042 into master will decrease coverage by -<.01%

@@             master     #13042   diff @@
==========================================
  Files           137        137          
  Lines         50214      50220     +6   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits          42253      42256     +3   
- Misses         7961       7964     +3   
  Partials          0          0          

Powered by Codecov. Last updated by b42d1dc...5724ea0

@kordek kordek force-pushed the #10001 branch 3 times, most recently from 77951ed to c3523db Compare May 1, 2016 10:40
@kordek
Contributor Author

kordek commented May 1, 2016

Previously, tests failed on the lines that codecov now reports as uncovered. I don't understand this.

@jreback
Contributor

jreback commented May 1, 2016

if you want to rebase I think I fixed codecov

@kordek
Contributor Author

kordek commented May 1, 2016

Thanks, seems ok now

@jreback
Contributor

jreback commented May 1, 2016

can u post a screen shot of what this excel file looks like

@kordek
Contributor Author

kordek commented May 1, 2016

[Screenshot of the Excel test file]

The cell shown as hashes is the 10^n number.

# GH 10001 : pandas.ExcelFile ignore parse_dates=False
expected_value = 100000000000000000000
act_series = self.get_exceldf('testdateoverflow')['DateColWithBigInt']
act_value = act_series.iloc[2]

construct the expected frame and use assert_frame_equal
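
A minimal sketch of this suggestion, under a few assumptions: the frame is built in memory rather than read from the `testdateoverflow` file, the second column name `StringCol` is hypothetical, and `assert_frame_equal` is imported from `pandas.testing` (its current public location; at the time of this PR it lived in `pandas.util.testing`).

```python
import pandas as pd
from pandas.testing import assert_frame_equal


def build_expected():
    """Build the frame the parsed sheet is expected to match."""
    return pd.DataFrame(
        [[pd.Timestamp('2016-03-12'), 'Marc Johnson'],
         [pd.Timestamp('2016-03-16'), 'Jack Black'],
         [1e+20, 'Timothy Brown']],
        # 'StringCol' is a hypothetical name for the second column.
        columns=['DateColWithBigInt', 'StringCol'])


expected = build_expected()
# Stand-in for the frame the test would read from the Excel file.
actual = build_expected()

# Compares every cell, the dtypes, the index, and the column labels at once.
assert_frame_equal(actual, expected)
```

Comparing whole frames also checks the two ordinary date rows, which a single `iloc[2]` lookup would not cover.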

@kordek
Contributor Author

kordek commented May 1, 2016

Same thing with codecov

@jreback
Contributor

jreback commented May 1, 2016

you need to rebase on master

git rebase -i origin/master

you are not picking up the change I made recently.

@kordek kordek force-pushed the #10001 branch 2 times, most recently from 2441fcb to 05683f6 Compare May 1, 2016 20:40
@kordek
Contributor Author

kordek commented May 1, 2016

check now pls

# GH 10001 : pandas.ExcelFile ignore parse_dates=False
refdf = pd.DataFrame([[pd.Timestamp('2016-03-12'), 'Marc Johnson'],
                      [pd.Timestamp('2016-03-16'), 'Jack Black'],
                      [1e+20, 'Timothy Brown']],

I think this would be converted to an integer, right? (Or is it actually a float?)

Contributor Author

@kordek kordek May 2, 2016

In the current implementation it is a float (cell_contents is simply returned). Or would something like this be preferable?

val = int(cell_contents)
if val == cell_contents:
    cell_contents = val
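
To make the proposal concrete, here is the same check wrapped in a small function. The name `number_from_cell` is hypothetical and this is only a sketch of the idea, not the code that was merged.

```python
def number_from_cell(cell_contents):
    """Return an int when the float holds a whole number, else the float."""
    val = int(cell_contents)
    if val == cell_contents:
        # Whole number: the int compares equal to the original float.
        return val
    # Fractional value: keep the float so nothing is truncated.
    return cell_contents


whole = number_from_cell(1e20)       # float in, int out
fractional = number_from_cell(2.5)   # stays a float
```

Note that `int()` raises for NaN and infinity, which this sketch does not handle.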

closes GH10001

If a date column in an Excel file contains cells with big integers that
cause an int/long overflow when parsed as dates, issue a warning and
convert the value to int or float.
@jreback jreback added this to the 0.18.2 milestone May 6, 2016
@jreback jreback closed this in 1296ab3 May 6, 2016
@jreback
Contributor

jreback commented May 6, 2016

thanks @kordek

@kordek kordek deleted the #10001 branch June 21, 2016 19:34
Development

Successfully merging this pull request may close these issues.

pandas.ExcelFile ignore parse_dates=False
3 participants