Dataframe constructor fails when given dict with None value #14381
So this works correctly in the following cases.
The behavior in 0.18.1 is actually wrong; this should coerce to NaN. Pull requests to fix are welcome.
Hey @brandonmburroughs, I saw that you're working on this too and beat me to the PR. No worries, I wasn't as far along. Just wanted to let you know that the same problem shows up with the Series constructor too, i.e. Series([None]) fails to coerce to NaN. I looked at fixing it a little further down the stack in series.py, but didn't check with any tests yet. Feel free to see my commit above that referenced this.
I was going to work on a PR but looks like you guys are on top of it. Thanks!
@shawnheide I actually noticed this problem after I created my PR. I created an issue (#14393) about this and there is some discussion going on there as to how to handle this as the cases are different. Depending upon how they want to handle the API design, your fix may be better suited to handle all cases.
@jreback Given your comment in #14393 (comment), I would personally say that the above case should not coerce to NaN, but keep the None. Thoughts? But in that case, @brandonmburroughs, your PR should be updated.
Yeah, open to having it be pre-0.19.0 behavior (IOW, remain as None).
To illustrate, in pandas 0.18:
So for 0.19.1, I would choose to go back to 0.18.1 behaviour, so not coercing to NaN (keep as None).
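For anyone who wants to see which of the two outcomes discussed here (None kept as-is vs. coerced to NaN) their installed pandas produces for the Series case, here is a minimal sketch. It only inspects the result and does not assume either outcome; the variable names are illustrative, and `isna()` assumes a pandas version recent enough to provide it.

```python
import pandas as pd

# Build the Series from the case mentioned in this thread.
s = pd.Series([None])

# Inspect what the constructor produced: the dtype and the raw stored value.
# Depending on the pandas version this may be object/None or float64/nan.
print("dtype:", s.dtype)
print("value:", repr(s.iloc[0]))

# isna() treats both None and NaN as missing, so this check does not
# depend on which of the two behaviours is in effect.
print("is missing:", bool(s.isna().all()))
```

Checking with `isna()` rather than `is None` or `== np.nan` keeps downstream code working whichever way the API design is settled.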
A small, complete example of the issue
Expected Output
This previously worked with a sensible output in 0.18.1:
In [2]: pd.DataFrame(dict(a=None),index=[0])
Out[2]:
      a
0  None
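As a rough sketch of the call above next to a variant that passes the value as a one-element list, the snippet below builds both frames and inspects them. The list form is only an illustration chosen here, not something proposed in this thread, and no particular output is assumed since the stored value may differ between versions.

```python
import pandas as pd

# The call from the report: a scalar None broadcast over an explicit index.
df_scalar = pd.DataFrame(dict(a=None), index=[0])

# Illustrative variant (an assumption, not from the thread): wrap the value
# in a list so the column length is explicit instead of broadcast.
df_list = pd.DataFrame({"a": [None]}, index=[0])

for name, df in [("scalar", df_scalar), ("list", df_list)]:
    # Show the column dtype and the raw stored value for each construction.
    print(name, df["a"].dtype, repr(df["a"].iloc[0]))
```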
Output of pd.show_versions()
commit: None
python: 2.7.12.final.0
python-bits: 64
OS: Linux
OS-release: 3.2.0-4-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
pandas: 0.18.1
nose: 1.3.7
pip: 8.1.2
setuptools: 27.2.0
Cython: 0.24
numpy: 1.11.2
scipy: 0.17.0
statsmodels: 0.6.1
xarray: None
IPython: 4.2.0
sphinx: 1.4.1
patsy: 0.4.1
dateutil: 2.5.3
pytz: 2016.7
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.4.4
matplotlib: 1.5.1
openpyxl: 2.3.2
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: 0.8.5
lxml: 3.6.0
bs4: 4.3.2
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.12
pymysql: None
psycopg2: None
jinja2: 2.8
boto: 2.39.0
pandas_datareader: None
Broken version:
INSTALLED VERSIONS
commit: None
python: 2.7.12.final.0
python-bits: 64
OS: Linux
OS-release: 3.2.0-4-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: None.None
pandas: 0.19.0
nose: 1.3.7
pip: 8.1.2
setuptools: 27.2.0
Cython: 0.24
numpy: 1.11.2
scipy: 0.17.0
statsmodels: 0.6.1
xarray: None
IPython: 4.2.0
sphinx: 1.4.1
patsy: 0.4.1
dateutil: 2.5.3
pytz: 2016.7
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.4.4
matplotlib: 1.5.1
openpyxl: 2.3.2
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: 0.8.5
lxml: 3.6.0
bs4: 4.3.2
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.12
pymysql: None
psycopg2: None
jinja2: 2.8
boto: 2.39.0
pandas_datareader: None