Broadcasting issue with 2d index in DataFrames #25416

Closed
dbelgrod opened this issue Feb 22, 2019 · 4 comments · Fixed by #29083
Labels
Needs Tests (Unit test(s) needed to prevent regressions), Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)

@dbelgrod

dbelgrod commented Feb 22, 2019

# Under pandas==0.23.0
import pandas as pd
df1 = pd.DataFrame([[1]], columns = [[1]], index = [ 1, 2 ])
print(df1)
df2 = pd.DataFrame([[1]], columns = [[1]], index = [ [ 1, 2 ] ])
print(df2)

Output:
   1
1  1
2  1

ValueError: Shape of passed values is (1, 1), indices imply (1, 2)

# Under pandas==0.22.0, this issue did not arise.
import pandas as pd
df1 = pd.DataFrame([[1]], columns = [[1]], index = [ 1, 2 ])
print(df1)
df2 = pd.DataFrame([[1]], columns = [[1]], index = [ [ 1, 2 ] ])
print(df2)
Output:
   1
1  1
2  1
   1
1  1
2  1

It seems that when the index is a flat list, the values are broadcast to fill it. When it is a list of lists, the index still has the same shape, but the values are not broadcast.
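A minimal sketch for checking which behavior a given installation shows. It only wraps the two constructor calls from the report in a `try`/`except`; the variable names and the printed messages are illustrative, and the result depends on the installed pandas version.

```python
import pandas as pd

# Flat index: one row of data, two row labels.
df1 = pd.DataFrame([[1]], columns=[[1]], index=[1, 2])
print(f"flat index shape: {df1.shape}")

# Nested index: a list holding one list of two labels.
try:
    df2 = pd.DataFrame([[1]], columns=[[1]], index=[[1, 2]])
except ValueError as exc:
    # This is the failure reported above for pandas 0.23.0.
    df2 = None
    print(f"pandas {pd.__version__}: nested index raised ValueError: {exc}")
else:
    print(f"pandas {pd.__version__}: nested index shape: {df2.shape}")
```

If the nested case succeeds, both frames should have two rows and one column.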

Output of pd.show_versions()

pandas: 0.23.0
pytest: None
pip: 18.1
setuptools: 40.6.3
Cython: None
numpy: 1.11.3
scipy: 1.0.0
pyarrow: None
xarray: None
IPython: 7.2.0
sphinx: 1.8.2
patsy: 0.5.1
dateutil: 2.7.5
pytz: 2018.9
blosc: None
bottleneck: 1.2.1
tables: 3.2.2
numexpr: 2.6.8
feather: None
matplotlib: 3.0.1
openpyxl: None
xlrd: 1.2.0
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: 1.2.7
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: 0.7.0

@Liam3851
Contributor

I cannot reproduce this under the latest version, 0.24.1:

In [1]: import pandas as pd

In [2]: df1 = pd.DataFrame([[1]], columns = [[1]], index = [ 1, 2 ])

In [3]: df1
Out[3]:
   1
1  1
2  1

In [4]: df2 = pd.DataFrame([[1]], columns = [[1]], index = [ [ 1, 2 ] ])

In [5]: df2
Out[5]:
   1
1  1
2  1

In [6]: pd.__version__
Out[6]: u'0.24.1'

Can you try the latest version?

@gfyoung added the Can't Repro, Index (Related to the Index class or subclasses), and Testing (pandas testing functions or related to the test suite) labels Feb 23, 2019
@gfyoung
Member

gfyoung commented Feb 23, 2019

@Liam3851 : I can't reproduce either. We should add the example as a test to close this unless someone can reproduce on the latest releases.
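A regression test along these lines might look like the sketch below. It is not the test that was eventually merged; the function name is hypothetical, and the expected frame assumes that a list of lists is converted to a single-level `MultiIndex` and that the single data row is repeated for each label.

```python
import pandas as pd


def test_frame_constructor_nested_index_broadcasts():
    # GH 25416: one row of data with a nested (list-of-lists) index.
    result = pd.DataFrame([[1]], columns=[[1]], index=[[1, 2]])

    # Assumption: nested lists become single-level MultiIndex objects,
    # and the value 1 is repeated for both row labels.
    expected = pd.DataFrame(
        [1, 1],
        index=pd.MultiIndex.from_arrays([[1, 2]]),
        columns=pd.MultiIndex.from_arrays([[1]]),
    )
    pd.testing.assert_frame_equal(result, expected)
```

Run under pytest, this should fail on a version with the reported `ValueError` and pass on one where the nested index is handled.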

@jorisvandenbossche added the Needs Tests (Unit test(s) needed to prevent regressions) label and removed the Can't Repro, Index (Related to the Index class or subclasses), and Testing (pandas testing functions or related to the test suite) labels Mar 4, 2019
@jorisvandenbossche
Member

I can reproduce it: this was indeed failing on 0.23.4 but works on 0.24.1, so it would be good to add a test for it.

However, do we actually want to support 2D input as columns / index?
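For context on what 2D input currently means, the sketch below shows how the constructor interprets a list of lists: each inner list is treated as one level of a `MultiIndex`. The data here already has one row per label, so no broadcasting is involved; the frame and column names are illustrative only.

```python
import pandas as pd

# One inner list: a MultiIndex with a single level on each axis.
df = pd.DataFrame([[1], [1]], columns=[[1]], index=[[1, 2]])
print(type(df.index).__name__, df.index.nlevels)
print(type(df.columns).__name__, df.columns.nlevels)

# Two inner lists: a two-level MultiIndex, one label array per level.
two_level = pd.DataFrame([[1], [1]], columns=["x"], index=[["a", "a"], [1, 2]])
print(two_level.index.nlevels)
print(list(two_level.index))
```

So a nested list with a single inner list is not a plain `Index`, which is part of why the flat and nested cases can take different code paths.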

@jhereth
Contributor

jhereth commented Oct 19, 2019

Added tests in #29083. CI is failing (for unrelated reasons). I will try to re-push later.
