BUG: Respect value of raise_on_error #14877

tomderuijter · 2016-12-14T08:41:37Z

Code Sample, a copy-pastable example if possible

Pasteable snippet.

import numpy as np
import pandas as pd
x = pd.DataFrame(data=[[1,2],[3,4]], columns=['A','B'])
x.astype({'C': np.int32}, raise_on_error=False)

Output

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/tomdr/VirtualEnvs/work-py3/lib/python3.5/site-packages/pandas/core/generic.py", line 3041, in astype
    raise KeyError('Only a column name can be used for the '
KeyError: 'Only a column name can be used for the key in a dtype mappings argument.'

Problem description

The raise_on_error argument of DataFrame.astype suggests that errors caused by a mismatch in column names can be ignored. Currently, an exception is raised regardless of the value of this argument. This pull request fixes that.

Before: https://github.com/pandas-dev/pandas/blob/v0.19.1/pandas/core/generic.py#L3041

Expected Output

An exception should not be raised.

Output of `pd.show_versions()`

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.1.final.0
python-bits: 64
OS: Darwin
OS-release: 15.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.19.1
nose: 1.3.7
pip: 9.0.1
setuptools: 21.0.0
Cython: 0.24.1
numpy: 1.11.2
scipy: 0.18.1
statsmodels: None
xarray: None
IPython: 5.0.0
sphinx: 1.4.8
patsy: None
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.5.3
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: 4.5.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.14
pymysql: 0.7.9.None
psycopg2: 2.6.2 (dt dec pq3 ext lo64)
jinja2: 2.8
boto: None
pandas_datareader: None

sinhrks · 2016-12-14T09:51:51Z

I understand the keyword is meant to ignore dtype conversion errors, not input validation errors. So I think the documentation should be fixed to clarify this.

If we're ignoring all kinds of errors, you should also fix the KeyError raised from Series.

jorisvandenbossche · 2017-01-02T10:20:05Z

I agree with @sinhrks that the current meaning of the raise_on_error keyword is to raise/ignore conversion errors, not errors for column names.

The keyword is renamed in #14967, so let's make this clearer in the docs there.

@tomderuijter Ignoring keys in the dict that are not present in the column names is certainly a valid use case as well. So we could discuss whether we want to support this in a certain way (I think similar requests came up for drop when dropping columns). Feel free to open a new issue to discuss this!
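
For reference, skipping dict keys that are not column names can already be done on the caller's side, without any change to pandas. A minimal sketch, assuming a hypothetical helper named `astype_existing` (not part of pandas) that filters the mapping before calling `DataFrame.astype`:

```python
import numpy as np
import pandas as pd


def astype_existing(df, dtypes):
    """Cast only the columns of ``df`` that appear in ``dtypes``.

    Hypothetical helper, not part of pandas: keys of ``dtypes`` that are
    not column names are dropped before calling ``DataFrame.astype``.
    """
    present = {col: dtype for col, dtype in dtypes.items() if col in df.columns}
    return df.astype(present)


x = pd.DataFrame(data=[[1, 2], [3, 4]], columns=['A', 'B'])
# 'C' is not a column, so it is skipped instead of raising KeyError.
y = astype_existing(x, {'A': np.float64, 'C': np.int32})
```

Here only column A is cast; column B keeps its dtype and the unknown key C is silently ignored, which may or may not be the behaviour one wants by default.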

ratman

Reporting the problematic key's name in the exception message would make one's life much easier.
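
Roughly what I have in mind, as a sketch only: `check_dtype_keys` is a hypothetical name and the extra message text is my own wording, not what pandas does today.

```python
import pandas as pd


def check_dtype_keys(df, dtypes):
    """Raise KeyError listing every key of ``dtypes`` that is not a column.

    Hypothetical helper, not part of pandas.
    """
    missing = [key for key in dtypes if key not in df.columns]
    if missing:
        raise KeyError(
            'Only a column name can be used for the key in a dtype '
            'mappings argument. Keys not found in columns: '
            '{!r}'.format(missing)
        )


x = pd.DataFrame(data=[[1, 2], [3, 4]], columns=['A', 'B'])
try:
    check_dtype_keys(x, {'A': 'int32', 'C': 'int32'})
except KeyError as exc:
    # The offending key is named in the message.
    message = exc.args[0]
```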

Respect value of raise_on_error

35b86e1

tomderuijter changed the title ~~Respect value of raise_on_error~~ BUG: Respect value of raise_on_error Dec 14, 2016

sinhrks added the Error Reporting label Dec 14, 2016

sinhrks added the Dtype Conversions label Dec 14, 2016

jreback mentioned this pull request Dec 14, 2016

DEPR: change .astype(.., raise_on_error) -> errors #14878

Closed

jorisvandenbossche closed this Jan 2, 2017

jorisvandenbossche added this to the No action milestone Jan 2, 2017

jorisvandenbossche mentioned this pull request Jan 2, 2017

[Depr] raise_on_error kwarg with errors kwarg in astype#14878 #14967

Merged

4 tasks

ratman reviewed May 26, 2017
