Min/max does not work for dates with timezones if there are missing values in the data frame #27794
Comments
May I take a look at it?
That'd be great!
Here is a workaround I'm using, but it is slow:
def __column_max_without_nans(df):
    """Take the row-wise max across columns, avoiding a bug in pandas.

    Link to GitHub issue: https://github.com/pandas-dev/pandas/issues/27794
    """
    # Only rows without missing values are reduced; reindexing restores the
    # original index and leaves the dropped rows empty.
    return df[df.notnull().all(axis=1)].max(axis=1).reindex(df.index)
Great! My workaround: get rid of the timezones altogether.
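A minimal sketch of what "getting rid of the timezones" could look like, assuming every column is tz-aware. The helper name `rowwise_max_ignoring_tz`, the `tz` parameter, and the sample data are illustrative assumptions, not code from the thread. It converts each column to tz-naive UTC, reduces row-wise, then re-attaches the timezone.

```python
import pandas as pd


def rowwise_max_ignoring_tz(df, tz="UTC"):
    """Row-wise max of tz-aware datetime columns via tz-naive UTC values.

    Hypothetical helper for illustration; not part of pandas.
    """
    # Convert every column to UTC, then drop the timezone information.
    naive = df.apply(lambda col: col.dt.tz_convert("UTC").dt.tz_localize(None))
    # Reduce the tz-naive frame; missing values (NaT) are skipped by default.
    result = naive.max(axis=1)
    # Re-attach UTC and convert to the requested timezone.
    return result.dt.tz_localize("UTC").dt.tz_convert(tz)


# Made-up sample data: column "b" has a missing value in the second row.
df = pd.DataFrame({
    "a": pd.to_datetime(["2019-01-01", "2019-01-02"]).tz_localize("UTC"),
    "b": pd.to_datetime(["2019-01-03", None]).tz_localize("UTC"),
})
row_max = rowwise_max_ignoring_tz(df)
print(row_max)
```

The round trip through UTC keeps the comparison well defined even if the columns carry different timezones, at the cost of an extra conversion per column.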
@MaximeWeyl, your workaround only solves the issue partially, i.e. when all timestamps (with timezone) are present in an axis. Consider the following example:
Expected output of
Applying your workaround:
gives:
As you can see, your workaround helps for the 0th row, but returns an incorrect value for the 1st row which has an |
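The limitation described in this comment can be illustrated with a small sketch. The data below is made up for illustration and is not the example originally posted; the function body is the same expression as the workaround quoted earlier in the thread, under a shortened name.

```python
import pandas as pd


def column_max_without_nans(df):
    # Same expression as the workaround quoted earlier in the thread:
    # rows containing any missing value are dropped before the reduction.
    return df[df.notnull().all(axis=1)].max(axis=1).reindex(df.index)


# Made-up sample data: row 0 is complete, row 1 has a missing value in "b".
df = pd.DataFrame({
    "a": pd.to_datetime(["2019-01-01", "2019-01-02"]).tz_localize("UTC"),
    "b": pd.to_datetime(["2019-01-03", None]).tz_localize("UTC"),
})
partial = column_max_without_nans(df)
# Row 0 gets a value; row 1 stays missing even though "a" holds a valid timestamp.
print(partial.isna())
```

Because the incomplete row is filtered out before the reduction, its valid timestamp in column "a" never reaches the result.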
Here's my workaround - instead of using:
I'm using:
The underlying issue here is that |
Sadly, it's still alive in 1.2.3. Any idea how we can fix it?
part of the way there: passing |
Fixed by the removal of numeric_only=None, closing. |
Code Sample, a copy-pastable example if possible
Problem description
Without tz_localize, everything works as expected. Also, if both columns have the same length and time zones, it works just fine. However, the combination of time zones and different lengths results in output of type float64.
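A hedged reproduction sketch of the situation described above; the original code sample did not survive on this page, so the column names and dates are assumptions. Two tz-aware series of different lengths are combined into one frame, which leaves a missing value in the shorter column, and the frame is then reduced row-wise.

```python
import pandas as pd

# Two tz-aware series of different lengths (made-up data).
a = pd.Series(pd.to_datetime(["2019-01-01", "2019-01-02"])).dt.tz_localize("UTC")
b = pd.Series(pd.to_datetime(["2019-01-03"])).dt.tz_localize("UTC")

# Aligning on the index pads the shorter column with a missing value (NaT).
df = pd.DataFrame({"a": a, "b": b})

row_max = df.max(axis=1)
# The resulting dtype depends on the pandas version in use.
print(pd.__version__, row_max.dtype)
```

On the affected version reported here (pandas 0.25.0) the result was of type float64 rather than a list of dates; the thread was later closed as fixed following the removal of numeric_only=None, so newer versions should behave differently.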
Expected Output
List of dates
Output of pd.show_versions()
INSTALLED VERSIONS
commit : None
python : 3.6.5.final.0
python-bits : 64
OS : Linux
OS-release : 4.4.0-17134-Microsoft
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : C.UTF-8
LOCALE : en_US.UTF-8
pandas : 0.25.0
numpy : 1.17.0
pytz : 2019.1
dateutil : 2.8.0
pip : 19.1.1
setuptools : 41.0.1
Cython : 0.29.10
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : 1.1.8
lxml.etree : 4.3.4
html5lib : None
pymysql : None
psycopg2 : 2.8.2 (dt dec pq3 ext lo64)
jinja2 : 2.10.1
IPython : 7.5.0
pandas_datareader: None
bs4 : None
bottleneck : 1.2.1
fastparquet : None
gcsfs : None
lxml.etree : 4.3.4
matplotlib : 3.1.1
numexpr : 2.6.9
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
s3fs : None
scipy : 1.3.0
sqlalchemy : 1.3.3
tables : 3.5.2
xarray : None
xlrd : 1.2.0
xlwt : None
xlsxwriter : 1.1.8
I have found some related fixed issues, but not exactly this one.