BUG: DatetimeIndex(tz) & single column name, return empty df (GH19157) #19330

Merged 1 commit on Jan 27, 2018

Contributor

@JQGoh JQGoh commented Jan 21, 2018

This issue arises because self._init_dict({0: values}, index, columns, dtype=dtype) filters the data whenever columns is passed (data = {k: v for k, v in compat.iteritems(data) if k in columns}); see the function _init_dict in frame.py.

So self._init_dict({0: values}, index, columns, dtype=dtype) expects the values to be stored under the column name '0'. Because we pass a column with a different name, the filter drops that entry and the result is an empty DataFrame.
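To illustrate the filtering step described above, here is a minimal pure-Python sketch. The function name `filter_by_columns` and the sample values are made up for illustration; only the dict comprehension mirrors the line quoted from _init_dict, with `.items()` standing in for `compat.iteritems`.

```python
def filter_by_columns(data, columns):
    """Keep only the entries whose key appears in `columns`.

    Hypothetical helper: it mirrors the comprehension quoted from
    _init_dict and is not the pandas implementation itself.
    """
    return {k: v for k, v in data.items() if k in columns}


# Stand-in for a tz-aware DatetimeIndex (assumed sample values).
values = ["2018-01-01 00:00+00:00", "2018-01-02 00:00+00:00"]

# The values are stored under the key 0, but the caller asks for "dates",
# so nothing survives the filter.
print(filter_by_columns({0: values}, ["dates"]))  # {}

# When the requested column matches the key, the values are kept.
print(filter_by_columns({0: values}, [0]))
```

The empty dict in the first call corresponds to the empty DataFrame reported in the issue.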

My solution assumes that a single DatetimeIndex with tz_info is being converted, so the DataFrame is initialized with the given column name. If no column name is specified, the column label '0' is chosen. I introduced an assertion to warn users if multiple column names are passed.

Update 27-01-2018: The updated PR includes a test and a whatsnew entry. The revised solution uses _arrays_to_mgr instead, so that a default column name of 0 is used if columns is not specified.
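The idea behind the revised approach can be sketched in plain Python: label the arrays by position rather than filtering a dict by key, and fall back to 0 when no columns are given. The helper `pair_arrays_with_names` is hypothetical and only approximates what passing the columns through to _arrays_to_mgr achieves; it is not the pandas internals.

```python
def pair_arrays_with_names(arrays, columns=None):
    """Label each array with the column name at the same position.

    Hypothetical helper for illustration only. If `columns` is None,
    the labels default to 0, 1, ..., len(arrays) - 1.
    """
    if columns is None:
        columns = list(range(len(arrays)))
    if len(columns) != len(arrays):
        raise ValueError(
            "expected %d column name(s), got %d" % (len(arrays), len(columns))
        )
    return dict(zip(columns, arrays))


# Stand-in for a tz-aware DatetimeIndex (assumed sample values).
values = ["2018-01-01 00:00+00:00", "2018-01-02 00:00+00:00"]

# The user-supplied name is attached to the data, so nothing is dropped.
named = pair_arrays_with_names([values], ["dates"])

# With no columns given, the single array is labelled 0.
default = pair_arrays_with_names([values])
```

Because the name is attached positionally, a single column with any label keeps its data, which is the behaviour the test from the original issue checks.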

Contributor

@jreback jreback left a comment

needs tests from the original issue. add tests first locally, then make sure they fail! then add a fix to see if it passes.

add a whatsnew note

@@ -518,7 +518,17 @@ def _get_axes(N, K, index=index, columns=columns):
             return _arrays_to_mgr([values], columns, index, columns,
                                   dtype=dtype)
         elif is_datetimetz(values):
             return self._init_dict({0: values}, index, columns, dtype=dtype)
I think we can just call _arrays_to_mgr here instead, w/o all of this additional code

@jreback added the Datetime (Datetime data dtype) and Reshaping (Concat, Merge/Join, Stack/Unstack, Explode) labels Jan 21, 2018
Contributor

jreback commented Jan 21, 2018

you can add tests for #13407 as well (if this fixes it).

@jreback added the Timezones (Timezone data dtype) label Jan 21, 2018

codecov bot commented Jan 22, 2018

Codecov Report

Merging #19330 into master will decrease coverage by 0.02%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #19330      +/-   ##
==========================================
- Coverage   91.65%   91.63%   -0.03%     
==========================================
  Files         150      150              
  Lines       48724    48726       +2     
==========================================
- Hits        44658    44648      -10     
- Misses       4066     4078      +12

| Flag | Coverage Δ |
|---|---|
| #multiple | 90% <100%> (-0.03%) ⬇️ |
| #single | 41.74% <0%> (-0.01%) ⬇️ |

| Impacted Files | Coverage Δ |
|---|---|
| pandas/core/frame.py | 97.62% <100%> (ø) ⬆️ |
| pandas/plotting/_converter.py | 65.22% <0%> (-1.74%) ⬇️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 5f6c80b...7e258d6.

Contributor Author

JQGoh commented Jan 27, 2018

@jreback Thank you for your suggestion. I have revised the changes accordingly, adding a test case and a whatsnew entry.

However, this does not solve issue #13407 yet.

@jreback jreback added this to the 0.23.0 milestone Jan 27, 2018
@jreback jreback merged commit 5f79123 into pandas-dev:master Jan 27, 2018
Contributor

jreback commented Jan 27, 2018

thanks @JQGoh!

happy to take a PR for #13407 !

Successfully merging this pull request may close these issues.

DataFrame() returns an empty DataFrame, if DatetimeIndex with timezone-info and column label are passed