I case of dtype issues, read_csv doesn't give an error as useful as pd.to_numeric does

A follow-up to #13237 . Copied examples:
Here's what to_numeric shows:
```python
In [137]: pd.to_numeric(o)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
pandas/src/inference.pyx in pandas.lib.maybe_convert_numeric (pandas/lib.c:55708)()

ValueError: Unable to parse string "52721156299871854681072370812488856336599863860274272781560003496046130816295143841376557767523688876482371118681808060318819187029855011862637267061548684520491431327693943042785705302632892888308342787190965977140539558800921069356177102235514666302335984730634641934384020650987601849970936578094137344.00000"
```

And here's what `read_csv` shows (the data is at ftp://ftp.sanger.ac.uk/pub/consortia/ibdgenetics/iibdgc-trans-ancestry-summary-stats.tar):
```python
In [138]: d = pd.read_csv('EUR.UC.gwas.assoc', delim_whitespace=True, usecols=['OR'], dtype={'OR': np.float64})
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
pandas/parser.pyx in pandas.parser.TextReader._convert_tokens (pandas/parser.c:14411)()

TypeError: Cannot cast array from dtype('O') to dtype('float64') according to the rule 'safe'

During handling of the above exception, another exception occurred:

[... long stacktrace ...]

pandas/parser.pyx in pandas.parser.TextReader._convert_tokens (pandas/parser.c:14632)()

ValueError: cannot safely convert passed user dtype of float64 for object dtyped data in column 8
```

@jreback 
I've finally started looking into it, and it seems that I can't implement it in a good way without changing NumPy because, in the end, it's NumPy who doesn't give any row/value information, albeit Pandas conditionally changes the exception to its own.

I can write an ad-hoc implementation for numeric conversion using `pd.to_numeric` though, and use its row/value information in case it raises an exception. What do you think?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

I case of dtype issues, read_csv doesn't give an error as useful as pd.to_numeric does #15898

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

I case of dtype issues, read_csv doesn't give an error as useful as pd.to_numeric does #15898

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions