Skip to content

REGR: to_frame with name=None no longer gives default name #45448

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
jorisvandenbossche opened this issue Jan 18, 2022 · 6 comments · Fixed by #45523
Closed

REGR: to_frame with name=None no longer gives default name #45448

jorisvandenbossche opened this issue Jan 18, 2022 · 6 comments · Fixed by #45523
Labels
Regression Functionality that used to work in a prior pandas version
Milestone

Comments

@jorisvandenbossche
Copy link
Member

With released pandas we have:

In [1]: pd.__version__
Out[1]: '1.3.5'

In [2]: pd.Series([1, 2, 3]).to_frame(None)
Out[2]: 
   0
0  1
1  2
2  3

while in master / pandas 1.4, the same code gives a DataFrame with column name of NaN (actually None, but the display is wrong):

In [1]: pd.Series([1, 2, 3]).to_frame(None)
Out[1]: 
   NaN
0    1
1    2
2    3

This change gives a bunch of failures in the dask test suite (xref dask/dask#8580)

@jorisvandenbossche jorisvandenbossche added the Regression Functionality that used to work in a prior pandas version label Jan 18, 2022
@jorisvandenbossche jorisvandenbossche added this to the 1.4 milestone Jan 18, 2022
@jorisvandenbossche
Copy link
Member Author

jorisvandenbossche commented Jan 18, 2022

Hmm, I see this was changed deliberately to "honor" passed None values (#44212).

But so if you want to be able to also pass a default value as third party package (as dask is doing at https://github.com/dask/dask/blob/06e49039764ff29e1de65705e16ff1748731d75b/dask/dataframe/core.py#L3657-L3659), you either have to pass lib.no_default as well (but I don't think we made this public?), or have to resort to some if/else statements to include or not include the keyword, which quickly gets a bit ugly.

We could also first deprecate this change.

cc @jbrockmendel

@jorisvandenbossche
Copy link
Member Author

There are also other places where we don't honor name=None, eg in the Series constructor:

In [3]: pd.Series(pd.Series([1, 2, 3], name="test"), name=None)
Out[3]: 
0    1
1    2
2    3
Name: test, dtype: int64

@jbrockmendel
Copy link
Member

but I don't think we made this public?

its in api.extensions, not sure why that file in particular

There are also other places where we don't honor name=None, eg in the Series constructor:

id change/deprecate that too. in general if a user explicitly asks for None, they should get it.

We could also first deprecate this change.

IIRC this change was needed for some of the groupby work I was doing which has since started using to_2d_mgr instead of to_frame, so reverting+deprecating might not break things.

@jsignell
Copy link
Contributor

Thanks for raising this @jorisvandenbossche. So for Dask, it seems like long-term we'll want to switch to using lib.no_default rather than None?

@jorisvandenbossche
Copy link
Member Author

but I don't think we made this public?

its in api.extensions, not sure why that file in particular

Ah, yes, forgot about that.

So for Dask, it seems like long-term we'll want to switch to using lib.no_default rather than None?

Yes, using that (through pd.api.extensions.no_default). Or the approach to not pass the keyword, as you are already doing in dask/dask#8554 is also perfectly fine (the no_default might give a bit cleaner code, but that is only available since pandas 1.0, so depending if you have still support for older pandas, your current PR might be simpler)

@jsignell
Copy link
Contributor

Ok thanks, yeah I'll stick with the current approach since I think it is safer for cudf.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Regression Functionality that used to work in a prior pandas version
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants