df.explode is required for reset_index #29479

vaaaaanquish · 2019-11-08T08:19:01Z

Code Sample, a copy-pastable example if possible

import pandas as pd

A = pd.DataFrame({'x': [['hoge', 'piyo'], ['fuga']], 'y': ['1', '2']})
B = pd.DataFrame({'x': [['hoge2', 'piyo2'], ['fuga2'], ['meta2']], 'y': ['1', '2', '3']})
A = A.append(B)  # append keeps each frame's own index labels, so 0 and 1 now appear twice

# >>> print(A)
                x  y
0    [hoge, piyo]  1
1          [fuga]  2
0  [hoge2, piyo2]  1
1         [fuga2]  2
2         [meta2]  3

# >>> A.explode('x')
        x  y
0   hoge  1
0   piyo  1
0  hoge2  1
0  piyo2  1
0   hoge  1
0   piyo  1
0  hoge2  1
0  piyo2  1
1   fuga  2
1  fuga2  2
1   fuga  2
1  fuga2  2
2  meta2  3

Problem description

The ideal output of A.explode('x') is the following:

       x  y
    hoge  1
    piyo  1
    fuga  2
   hoge2  1
   piyo2  1
   fuga2  2
   meta2  3

However, the actual output contains extra rows because the index of DataFrame A has duplicate labels (0 and 1 each appear twice after the append).
This can easily lead to mistakes for users of the explode method.
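A quick way to confirm that duplicate index labels are the trigger is to inspect the index before calling explode. The following is only a minimal sketch: it uses `pd.concat` in place of `A.append(B)` (an assumption on my part, chosen because it also preserves each frame's labels and is available in newer pandas releases), and the names `combined`, `has_unique_index` and `repeated_labels` are just illustrative.

```python
import pandas as pd

A = pd.DataFrame({'x': [['hoge', 'piyo'], ['fuga']], 'y': ['1', '2']})
B = pd.DataFrame({'x': [['hoge2', 'piyo2'], ['fuga2'], ['meta2']], 'y': ['1', '2', '3']})

# Like A.append(B) in the report, pd.concat keeps the original labels of
# each frame, so the combined index is [0, 1, 0, 1, 2].
combined = pd.concat([A, B])

# is_unique is False whenever any label occurs more than once.
has_unique_index = combined.index.is_unique

# Labels that occur more than once (hypothetical helper variable).
repeated_labels = combined.index[combined.index.duplicated()].unique().tolist()

print(has_unique_index)  # False
print(repeated_labels)   # [0, 1]
```

If `has_unique_index` is False on pandas 0.25.x, the explode result should be treated with suspicion.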

Expected Output

A workaround is to call reset_index(drop=True) before explode, so that every row has a unique index label.

# >>> A.reset_index(drop=True).explode('x')
         x  y
0   hoge  1
0   piyo  1
1   fuga  2
2  hoge2  1
2  piyo2  1
3  fuga2  2
4  meta2  3
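The same result can probably be reached without a separate reset_index call by building the combined frame with a fresh index in the first place. This is a sketch under the assumption that `pd.concat(..., ignore_index=True)` is an acceptable substitute for `A.append(B)` followed by `reset_index(drop=True)`; the names `combined` and `exploded` are illustrative only.

```python
import pandas as pd

A = pd.DataFrame({'x': [['hoge', 'piyo'], ['fuga']], 'y': ['1', '2']})
B = pd.DataFrame({'x': [['hoge2', 'piyo2'], ['fuga2'], ['meta2']], 'y': ['1', '2', '3']})

# ignore_index=True discards the original labels and assigns 0..4,
# so no label is repeated before explode runs.
combined = pd.concat([A, B], ignore_index=True)

# Each list element becomes its own row; the row's index label is repeated
# once per element, which is the expected explode behaviour.
exploded = combined.explode('x')

print(exploded)
```

The printed frame has seven rows, one per list element, matching the reset_index output shown above.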

Output of `pd.show_versions()`

INSTALLED VERSIONS

commit : None
python : 3.6.8.final.0
python-bits : 64
OS : Darwin
OS-release : 18.7.0
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : ja_JP.UTF-8
LOCALE : ja_JP.UTF-8

pandas : 0.25.3
numpy : 1.16.4
pytz : 2018.9
dateutil : 2.7.5
pip : 19.2.3
setuptools : 41.0.1
Cython : 0.29.4
pytest : 4.4.0
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.3.3
html5lib : None
pymysql : None
psycopg2 : 2.8.1 (dt dec pq3 ext lo64)
jinja2 : 2.10.1
IPython : 7.4.0
pandas_datareader: None
bs4 : 4.7.1
bottleneck : None
fastparquet : None
gcsfs : 0.3.0+7.g59da8cd
lxml.etree : 4.3.3
matplotlib : 3.0.3
numexpr : None
odfpy : None
openpyxl : 2.6.0
pandas_gbq : 0.10.0
pyarrow : None
pytables : None
s3fs : None
scipy : 1.3.0
sqlalchemy : 1.3.4
tables : None
xarray : None
xlrd : None
xlwt : None
xlsxwriter : None

Sorry for ugly English. Thanks.


vaaaaanquish · 2019-11-08T09:12:29Z

Sorry, this seems to have already been fixed in #28010.

vaaaaanquish · 2019-11-08T09:17:52Z

I'll close this and wait for the release. Thanks :)

vaaaaanquish closed this as completed Nov 8, 2019
