BUG: Behavior change on DataFrame instantiation from 2.0.0 to 2.0.1 #53100
Labels
Bug
Constructors (Series/DataFrame/Index/pd.array Constructors)
DataFrame (DataFrame data structure)
Regression (Functionality that used to work in a prior pandas version)
Pandas version checks
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
Issue Description
On 2.0.0, this returned:
On 2.0.1, this returns:
This was changed in #52404, and I'd previously reported a similar issue around the 2.0.0rcs in #51725.
Happy to continue this conversation on #51725 if that's more appropriate.
I would note that:
Still returns dtype('O') in 2.0.1, as it did in 2.0.0 and in the 1.x series.

I have lots of tests that rely on creating an empty DataFrame whose column dtype is "string like", or for which pd.api.types.infer_dtype returns "empty" (e.g. my library only works with DataFrames whose column names are strings, similar to pyarrow).

From
It sounds like there is a desire to move to "empty" as the dtype. I am a fan of this, and think it resembles the 1.x behavior.
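A minimal sketch of the kind of check described above, not code taken from this report. The helper names columns_are_string_like and empty_frame_with_object_columns are hypothetical and not part of pandas; only pd.DataFrame, pd.Index and pd.api.types.infer_dtype are real pandas APIs. It assumes that passing an explicit object-dtype Index for the columns avoids depending on whatever the constructor infers for an empty input.

```python
import pandas as pd
from pandas.api.types import infer_dtype


def columns_are_string_like(df: pd.DataFrame) -> bool:
    """Hypothetical helper: True if the column labels are inferred as
    strings, or if there are no labels at all ("empty")."""
    return infer_dtype(df.columns) in ("string", "empty")


def empty_frame_with_object_columns() -> pd.DataFrame:
    """Hypothetical helper: build an empty DataFrame whose columns are an
    explicit object-dtype Index, rather than relying on inference."""
    return pd.DataFrame(columns=pd.Index([], dtype=object))


empty = empty_frame_with_object_columns()
print(empty.columns.dtype)             # dtype of the explicitly supplied Index
print(infer_dtype(empty.columns))      # classification used by the check
print(columns_are_string_like(empty))
```

Being explicit about the columns Index is one possible way to keep such tests stable across releases; it does not address what the bare pd.DataFrame({}) constructor should return.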
Expected Behavior
Ideally, the behavior of pd.DataFrame({}) would not change between bug fix releases.

This could also be addressed with information on whether pd.DataFrame(columns=[]) will also change in a future version.

Installed Versions