Fix Issue #34923: Inferred dtype at the end of df explode method #35011

Closed
wants to merge 13 commits into from

Conversation

SanthoshBala18
Contributor

@SanthoshBala18 SanthoshBala18 commented Jun 26, 2020

  • closes #34923 (ENH: Infer dtype when using df.explode())
  • tests added / passed
    pandas/tests/frame/methods/test_explode.py, method test_inferred_dtype
  • passes black pandas
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

@pep8speaks

pep8speaks commented Jun 26, 2020

Hello @SanthoshBala18! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-07-20 11:26:43 UTC

@jreback added the Dtype Conversions (unexpected or buggy dtype conversions) and Reshaping (Concat, Merge/Join, Stack/Unstack, Explode) labels Jun 26, 2020
@jreback
Contributor

jreback commented Jun 26, 2020

you may need to update the example in the docs / doc-string as well

Pulling latest master branch
Contributor

@jreback jreback left a comment

lgtm pls merge master and ping on green

@SanthoshBala18
Contributor Author

@jreback, merged the master branch.

@simonjayhawkins
Member

I'm -1 on this. This is a breaking change and not a bug.

With this change, exploding lists of ints where some rows contain empty lists now casts the result to float. That should wait until a major release, and maybe use the nullable integer type instead.

current

>>> pd.__version__
'1.1.0rc0'
>>>
>>> s = pd.Series([[1, 2, 3, 4], [], [5, 6, 7]])
>>> s
0    [1, 2, 3, 4]
1              []
2       [5, 6, 7]
dtype: object
>>>
>>> res = s.explode()
>>> res
0      1
0      2
0      3
0      4
1    NaN
2      5
2      6
2      7
dtype: object
>>>
>>> type(res.iloc[1])
<class 'int'>
>>>

this PR

>>> pd.__version__
'1.1.0rc0+13.gf6144512d'
>>>
>>> s = pd.Series([[1, 2, 3, 4], [], [5, 6, 7]])
>>> s
0    [1, 2, 3, 4]
1              []
2       [5, 6, 7]
dtype: object
>>>
>>> res = s.explode()
>>> res
0    1.0
0    2.0
0    3.0
0    4.0
1    NaN
2    5.0
2    6.0
2    7.0
dtype: float64
>>>
>>> type(res.iloc[1])
<class 'numpy.float64'>
>>>
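
The nullable integer alternative mentioned above can be sketched as follows. This is only an illustration, assuming an explicit `astype("Int64")` cast applied after `explode()`; it is not what the PR implements. The names `exploded` and `as_nullable` are made up for the example.

```python
import pandas as pd

s = pd.Series([[1, 2, 3, 4], [], [5, 6, 7]])

# explode() keeps object dtype; the empty list becomes a single NaN row.
exploded = s.explode()

# Explicit cast to the nullable integer dtype: the values stay integers
# and the missing row is represented as pd.NA rather than forcing float64.
as_nullable = exploded.astype("Int64")

print(as_nullable.dtype)
print(as_nullable.isna().sum())
```

With this approach the integer values are not widened to float, at the cost of the caller opting in to an extension dtype.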

@simonjayhawkins added the Needs Discussion (requires discussion from core team before further action) label Sep 15, 2020
@jreback
Contributor

jreback commented Nov 26, 2020

closing this. see my comments on the issue (basically need to add an option downcast= instead here).
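
A rough sketch of what an opt-in approach could look like from user code. `explode()` has no `downcast=` parameter in the versions shown in this thread, so the wrapper `explode_opt_in` and its `infer_dtype` flag below are hypothetical names used only for illustration. The default path leaves the current behaviour untouched.

```python
import pandas as pd


def explode_opt_in(series: pd.Series, infer_dtype: bool = False) -> pd.Series:
    """Hypothetical wrapper, not a pandas API.

    By default this returns exactly what Series.explode() returns.
    Only when infer_dtype=True does it ask pandas to infer a tighter
    dtype from the exploded values.
    """
    exploded = series.explode()
    if infer_dtype:
        # infer_objects() attempts a soft conversion of object data
        # to a more specific dtype when the values allow it.
        return exploded.infer_objects()
    return exploded


s = pd.Series([[1, 2, 3, 4], [], [5, 6, 7]])
print(explode_opt_in(s).dtype)                    # default: same as s.explode()
print(explode_opt_in(s, infer_dtype=True).dtype)  # opt-in inference
```

Making the inference opt-in avoids the breaking change raised earlier in the thread, since existing callers keep getting the object-dtype result.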

@jreback closed this Nov 26, 2020
@jreback added this to the No action milestone Nov 26, 2020
Labels
Dtype Conversions (unexpected or buggy dtype conversions), Needs Discussion (requires discussion from core team before further action), Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)
4 participants