API: Can't override type sniffing in df.from_csv()? #3171

Closed
ghost opened this issue Mar 25, 2013 · 8 comments
Labels: API Design, Bug, IO Data (IO issues that don't fit into a more specific label)
Comments

@ghost

ghost commented Mar 25, 2013

In [10]: ix = [-352.737091, 183.575577]
    ...: df=pd.DataFrame([0,1],index=ix)
    ...: df.to_csv("/tmp/1.csv")
    ...: df2=pd.DataFrame.from_csv("/tmp/1.csv")
    ...: print df
    ...: print df2
             0
-352.737091  0
 183.575577  1
                            0
2105-11-21 22:43:41.128654  0
1936-11-21 22:43:41.128654  1

and if you try:


In [13]: df2.index.astype('f')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-13-8d8276604fe4> in <module>()
----> 1 df2.index.astype('f')

/home/user1/src/pandas/pandas/tseries/index.pyc in astype(self, dtype)
    624             return self.asi8.copy()
    625         else:  # pragma: no cover
--> 626             raise ValueError('Cannot cast DatetimeIndex to dtype %s' % dtype)
    627 
    628     @property

ValueError: Cannot cast DatetimeIndex to dtype float32

related SO

@stephenwlin
Contributor

Wow, what a coincidence. I just hit this same thing while processing the results I posted at #3146, and was about to report it:

(.env0)stephen@anacreon:~$ head -5 take.csv
count,naive,memmove,contig,sse
2,29.94,62.24,34.41,22.34
3,27.41,46.79,23.07,17.95
4,23.94,40.03,17.38,13.02
5,22.30,35.95,18.39,14.99
(.env0)stephen@anacreon:~$ ipython
In [1]: import pandas as p; p.DataFrame.from_csv("take.csv").head(4)
Out[1]: 
            naive  memmove  contig    sse
count                                    
2013-03-02  29.94    62.24   34.41  22.34
2013-03-03  27.41    46.79   23.07  17.95
2013-03-04  23.94    40.03   17.38  13.02
2013-03-05  22.30    35.95   18.39  14.99
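
For comparison, here is a minimal sketch of reading the same rows with `pd.read_csv`, which does not parse dates unless asked to. It assumes a current pandas, where `read_csv` replaces the since-deprecated `DataFrame.from_csv`. The inline `csv_text` string is a stand-in for `take.csv`, and the name `take` is only illustrative.

```python
import io

import pandas as pd

# Stand-in for take.csv: the five lines shown by `head -5` above.
csv_text = """count,naive,memmove,contig,sse
2,29.94,62.24,34.41,22.34
3,27.41,46.79,23.07,17.95
4,23.94,40.03,17.38,13.02
5,22.30,35.95,18.39,14.99
"""

# Pin the dtype of `count` explicitly, then promote it to the index.
# No parse_dates argument is passed, so nothing is coerced to datetimes.
take = pd.read_csv(io.StringIO(csv_text), dtype={"count": "int64"}).set_index("count")

print(take.index.dtype)
print(take.head(4))
```

Pinning the dtype before calling `set_index` keeps the override explicit instead of relying on inference for the index column.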

@ghost
Author

ghost commented Mar 25, 2013

That means users are hitting it too. This needs to get fixed.

@jreback
Contributor

jreback commented Mar 25, 2013

This is probably being forced because parse_dates=True and index_col=0 are the defaults, and I guess the index somehow looks datelike...

@ghost
Author

ghost commented Mar 25, 2013

saved by jeff... again.

I guess -352.737091 does look a little like October if you squint.

@jreback
Contributor

jreback commented Mar 25, 2013

but that still shouldn't be forcibly parsing into a datelike

@jreback
Contributor

jreback commented Mar 25, 2013

here's the correct parsing

but as I said above, the parser shouldn't interpret those floats as dates anyhow

In [12]: df3=pd.DataFrame.from_csv('/tmp/1.csv',parse_dates=False)

In [13]: df3
Out[13]: 
             0
-352.737091  0
 183.575577  1

In [14]: df3.index
Out[14]: Index([-352.737091,  183.575577], dtype=object)
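
The same round trip can be sketched with `pd.read_csv`, again assuming a current pandas rather than the 2013 `from_csv` code path discussed here. An in-memory buffer stands in for `/tmp/1.csv`, and `buf` and `df3` are illustrative names.

```python
import io

import pandas as pd

ix = [-352.737091, 183.575577]
df = pd.DataFrame([0, 1], index=ix)

# Write to an in-memory buffer instead of /tmp/1.csv.
buf = io.StringIO()
df.to_csv(buf)
buf.seek(0)

# Use the first column as the index and explicitly turn date parsing off,
# mirroring the parse_dates=False workaround above.
df3 = pd.read_csv(buf, index_col=0, parse_dates=False)

print(df3)
print(df3.index)
```

Passing `parse_dates=False` is redundant with `read_csv` when no date parsing is requested, but it documents the intent and matches the workaround used with `from_csv`.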

@ghost
Author

ghost commented Mar 25, 2013

Yeah, I already used it in my test to get things working. I'll leave this issue open
for a look at the type-detection code in the future.

@ghost
Author

ghost commented Aug 18, 2013

Moved the type-inference bug into its own issue, #4601.

ghost closed this as completed Aug 18, 2013