BUG-22796 Concat multicolumn tz-aware DataFrame #23036

Merged
merged 15 commits into from
Oct 9, 2018
1 change: 1 addition & 0 deletions doc/source/whatsnew/v0.24.0.txt
@@ -892,6 +892,7 @@ Reshaping
- Bug in :func:`pandas.wide_to_long` when a string is passed to the stubnames argument and a column name is a substring of that stubname (:issue:`22468`)
- Bug in :func:`merge` when merging ``datetime64[ns, tz]`` data that contained a DST transition (:issue:`18885`)
- Bug in :func:`merge_asof` when merging on float values within defined tolerance (:issue:`22981`)
- Bug in :func:`pandas.concat` when concatenating a multicolumn DataFrame with tz-aware data against a DataFrame with a different number of columns (:issue:`22796`)
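
The situation described in the `pandas.concat` entry can be sketched as follows. This is an illustrative reproduction, not part of the PR: the frame names `narrow` and `wide` and the dates are assumptions, and the snippet only uses public pandas calls.

```python
import pandas as pd

# Two tz-aware frames whose column sets differ: `narrow` has no "B" column.
ts = pd.to_datetime(["2015-01-01", "2015-01-02"]).tz_localize("UTC")
narrow = pd.DataFrame({"A": ts})
wide = pd.DataFrame({"A": ts, "B": ts})

# Concatenating forces pandas to invent values for the missing "B" rows.
result = pd.concat([narrow, wide], ignore_index=True)

# The rows that came from `narrow` should hold NaT in "B",
# and "B" should remain a tz-aware datetime column.
print(result)
print(result.dtypes)
```

With the fix in place, the rows taken from `narrow` are filled with `NaT` and column `B` keeps its tz-aware dtype.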

Build Changes
^^^^^^^^^^^^^
11 changes: 11 additions & 0 deletions pandas/core/dtypes/dtypes.py
@@ -546,6 +546,17 @@ def __new__(cls, unit=None, tz=None):
            cls._cache[key] = u
            return u

    @classmethod
    def construct_array_type(cls):
        """Return the array type associated with this dtype

        Returns
        -------
        type
        """
        from pandas import DatetimeIndex
        return DatetimeIndex

    @classmethod
    def construct_from_string(cls, string):
        """ attempt to construct this type from a string, raise a TypeError if
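
A minimal sketch of how calling code can use `construct_array_type`, assuming only the public `pd.DatetimeTZDtype` class. Which class comes back depends on the pandas version (this PR returns `DatetimeIndex`; other releases may return a dedicated datetime array class), so the snippet prints the name instead of assuming it.

```python
import pandas as pd

# Build the tz-aware dtype explicitly.
dtype = pd.DatetimeTZDtype(unit="ns", tz="UTC")

# Ask the dtype which array class stores values of this type.
array_type = dtype.construct_array_type()
print(array_type.__name__)

# The same dtype can also be obtained from its string form.
same = pd.api.types.pandas_dtype("datetime64[ns, UTC]") == dtype
print(same)
```

The point of the hook is that generic code, such as the concat internals, can obtain a suitable container from the dtype alone without special-casing datetimes.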
4 changes: 4 additions & 0 deletions pandas/core/internals/concat.py
@@ -186,6 +186,10 @@ def get_reindexed_values(self, empty_dtype, upcasted_na):

                if getattr(self.block, 'is_datetimetz', False) or \
                        is_datetimetz(empty_dtype):
                    if self.block is None:
                        array = empty_dtype.construct_array_type()
                        missing_arr = array([fill_value], dtype=empty_dtype)
                        return missing_arr.repeat(self.shape[1])
                    pass
                elif getattr(self.block, 'is_categorical', False):
                    pass
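
The new branch builds a one-element array holding the fill value and repeats it to the required length. A rough standalone sketch of that idea follows; `make_missing_column` is a hypothetical helper, not a pandas function, and it uses the public `pd.array` constructor rather than calling the array class directly as the diff does.

```python
import pandas as pd


def make_missing_column(dtype, length):
    """Hypothetical helper: an all-NaT array of the given tz-aware dtype."""
    # One-element array holding the fill value, typed with the target dtype.
    one = pd.array([pd.NaT], dtype=dtype)
    # Repeat that single missing value to the requested length.
    return one.repeat(length)


dtype = pd.DatetimeTZDtype(unit="ns", tz="UTC")
missing = make_missing_column(dtype, 3)
print(missing)
```

Typing the fill array up front is what keeps the timezone: the placeholder values already carry the tz-aware dtype when they are combined with the real data.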
32 changes: 32 additions & 0 deletions pandas/tests/frame/test_combine_concat.py
@@ -54,6 +54,38 @@ def test_concat_multiple_tzs(self):
        expected = DataFrame(dict(time=[ts2, ts3]))
        assert_frame_equal(results, expected)

    @pytest.mark.parametrize(
        't1',
        [
            '2015-01-01',
            pytest.param(pd.NaT, marks=pytest.mark.xfail(
                reason='GH23037 incorrect dtype when concatenating',
                strict=True))])
    def test_concat_tz_NaT(self, t1):
        # GH 22796
        # Concatenating tz-aware multicolumn DataFrames
        ts1 = Timestamp(t1, tz='UTC')
        ts2 = Timestamp('2015-01-01', tz='UTC')
        ts3 = Timestamp('2015-01-01', tz='UTC')

        df1 = DataFrame([[ts1, ts2]])
        df2 = DataFrame([[ts3]])

        result = pd.concat([df1, df2])
        expected = DataFrame([[ts1, ts2], [ts3, pd.NaT]], index=[0, 0])

        assert_frame_equal(result, expected)

    def test_concat_tz_not_aligned(self):
        # GH 22796
        ts = pd.to_datetime([1, 2]).tz_localize("UTC")
        a = pd.DataFrame({"A": ts})
        b = pd.DataFrame({"A": ts, "B": ts})
        result = pd.concat([a, b], sort=True, ignore_index=True)
        expected = pd.DataFrame({"A": list(ts) + list(ts),
                                 "B": [pd.NaT, pd.NaT] + list(ts)})
        assert_frame_equal(result, expected)

    def test_concat_tuple_keys(self):
        # GH 14438
        df1 = pd.DataFrame(np.ones((2, 2)), columns=list('AB'))
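
The two new tests differ in how they treat the row index: `test_concat_tz_NaT` keeps the original labels (hence `index=[0, 0]` in its expected frame), while `test_concat_tz_not_aligned` passes `ignore_index=True` and `sort=True`. A small sketch of what those two options do, using illustrative frames `a` and `b` that are not taken from the test suite:

```python
import pandas as pd

ts = pd.to_datetime([1, 2]).tz_localize("UTC")
a = pd.DataFrame({"B": ts})
b = pd.DataFrame({"B": ts, "A": ts})

# sort=True orders the union of column labels;
# ignore_index=True renumbers the rows from 0.
result = pd.concat([a, b], sort=True, ignore_index=True)
print(result.columns.tolist())
print(result.index.tolist())

# Without ignore_index the original row labels are kept, duplicates included.
kept = pd.concat([a, b], sort=True)
print(kept.index.tolist())
```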