BUG: Fix DataFrame construction regression #22232

Merged
merged 8 commits into from
Aug 8, 2018

Member

@cpcloud cpcloud commented Aug 7, 2018

               'Ohio': {2000: 1.5, 2001: 1.7, 2002: 3.6}}
        result = pd.DataFrame(pop, index=[2001, 2002, 2003], columns=columns)
        expected = pd.DataFrame(
            {'Nevada': [2.4, 2.9, np.nan], 'Ohio': [1.7, 3.6, np.nan]},
Member

Maybe better to construct this with a list of lists? That would put an extra safeguard in place in case something wonky is going on with dict construction.

Member Author

Done
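
A minimal sketch of the suggestion in this thread: the expected frame is built row by row from a list of lists, so the comparison does not depend on the dict-construction path under test. The values come from the test snippet above; the exact layout used in the merged test is an assumption.

```python
import numpy as np
import pandas as pd

# Row-wise data: one inner list per index label, one entry per column.
# (Illustrative layout; the merged test may be written differently.)
expected = pd.DataFrame(
    [[2.4, 1.7],          # 2001
     [2.9, 3.6],          # 2002
     [np.nan, np.nan]],   # 2003 is absent from both inner dicts
    index=pd.Index([2001, 2002, 2003]),
    columns=pd.Index(['Nevada', 'Ohio']),
)
```

Passing explicit `pd.Index` objects here also keeps the expected value away from the list-to-Index conversion that the regression touched.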

@cpcloud cpcloud added this to the 0.23.5 milestone Aug 7, 2018
@cpcloud cpcloud added the Bug and Regression (functionality that used to work in a prior pandas version) labels Aug 7, 2018
Contributor

@jreback jreback left a comment

minor comments. ping on green.

@@ -268,3 +268,15 @@ def test_deepcopy_empty(self):
        empty_frame_copy = deepcopy(empty_frame)

        self._compare(empty_frame_copy, empty_frame)

    def test_nested_dict_construction(self):
        columns = ['Nevada', 'Ohio']
Contributor

Can you add the GitHub issue number as a comment here?

Member Author

done


    def test_nested_dict_construction(self):
        columns = ['Nevada', 'Ohio']
        pop = {'Nevada': {2001: 2.4, 2002: 2.9},
Contributor

Can you move this to pandas/tests/frame/test_construction instead?

Member Author

done

Contributor

jreback commented Aug 7, 2018

@WillAyd would you mind pushing an empty 0.23.5 whatsnew?

@jreback jreback modified the milestones: 0.23.5, 0.24.0 Aug 7, 2018
Contributor

jreback commented Aug 7, 2018

lgtm. let's leave on 0.24, can think about backporting.

Member

WillAyd commented Aug 7, 2018

@jreback done via #22233


- Constructing a DataFrame with an index argument that wasn't already an
  instance of :class:`~pandas.core.Index` was broken in `4efb39f
  <https://github.com/pandas-dev/pandas/commit/4efb39f01f5880122fa38d91e12d217ef70fad9e>`_ (:issue:`22227`).
Member

Is there a reason why we created a separate Regressions section in the whatsnew? Generally, we just put regression bugs in the bugs section and don't "call out" the offending commits like this.

Member Author

@cpcloud cpcloud Aug 7, 2018

Yes, keeping track of the number of regressions over time is an important indicator of the stability of a project. I think pandas is fairly stable with respect to regressions, and I think it's important that we know how often regressions are happening.

The specific commit is there for minimizing the number of clicks someone has to perform to see exactly what code introduced the regression.

Member

Works for me. 👍

Contributor

So actually the 0.23.5 whatsnew is now pushed; let's backport this. Can you move the whatsnew entry? Ping when pushed.
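
For context, the case described by the whatsnew entry in this thread can be sketched as follows: a nested dict is passed together with an `index` argument that is a plain list rather than an `Index` instance. The data is taken from the test snippets earlier in the review; this is an illustration of the scenario on a pandas version that includes the fix, not the test as merged.

```python
import pandas as pd

columns = ['Nevada', 'Ohio']
pop = {'Nevada': {2001: 2.4, 2002: 2.9},
       'Ohio': {2000: 1.5, 2001: 1.7, 2002: 3.6}}

# index is a plain Python list here, not a pandas Index instance;
# this is the kind of call the whatsnew entry reports as having broken.
result = pd.DataFrame(pop, index=[2001, 2002, 2003], columns=columns)

# The constructor aligns each inner dict to the requested labels:
# 2000 is dropped, and 2003 is filled with NaN in both columns.
print(result)
```

On a fixed version the list is converted to an `Index` internally, so `result.index` is an `Index` holding 2001, 2002 and 2003.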

@jreback jreback modified the milestones: 0.24.0, 0.23.5 Aug 7, 2018
Member Author

cpcloud commented Aug 7, 2018

@jreback Pushed

codecov bot commented Aug 7, 2018

Codecov Report

Merging #22232 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #22232      +/-   ##
==========================================
+ Coverage   92.06%   92.06%   +<.01%     
==========================================
  Files         169      169              
  Lines       50698    50699       +1     
==========================================
+ Hits        46676    46677       +1     
  Misses       4022     4022

Flag        Coverage Δ
#multiple   90.47% <100%> (ø) ⬆️
#single     42.32% <100%> (ø) ⬆️

Impacted Files         Coverage Δ
pandas/core/frame.py   97.26% <100%> (ø) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 737b329...8039b33.

@jreback jreback merged commit 3c4ecb7 into pandas-dev:master Aug 8, 2018
Contributor

jreback commented Aug 8, 2018

thanks @cpcloud

Labels
Bug, Regression (functionality that used to work in a prior pandas version)
Development

Successfully merging this pull request may close these issues.

BUG: Regression creating DataFrame from nested dict
6 participants