Join several dataframes with MultiIndex containing dates raises ValueError #33692
Comments
Thanks for the report @neshitov. Something strange is going on here. Are you interested in investigating more? Can you start by looking at the behavior of Note that we're hitting this date vs. datetime vs. Timestamp issue in several places right now (#35830 and related issues). |
I looked into it a bit. The error is in fast_unique_multiple in pandas/_libs/lib.pyx, which gets a datetime.date and a Timestamp as input. Are they supposed to be equal? |
We have several issues dealing with date vs. Timestamp equality right now
that we'd like to address consistently, but I don't recall what the current
preference is. cc @jbrockmendel.
…On Fri, Sep 11, 2020 at 9:01 PM patrick ***@***.***> wrote:
@TomAugspurger <https://github.com/TomAugspurger>
I looked a bit into it. The error is in here:
fast_unique_multiple in
https://github.com/pandas-dev/pandas/blob/master/pandas/_libs/lib.pyx
Gets a datetime.date and a Timestamp as input. Are they supposed to be
equal?
|
#36131 deprecates date vs Timestamp comparisons to match the stdlib datetime. |
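For reference, the standard-library behavior mentioned above can be checked with a minimal sketch that uses only `datetime` (no pandas involved; the variable names are illustrative):

```python
import datetime

d = datetime.date(2020, 1, 1)
dt = datetime.datetime(2020, 1, 1)

# In the stdlib, a datetime never compares equal to a plain date,
# even when both fall on the same calendar day.
same = (dt == d)

# Ordering comparisons between the two raise TypeError instead of coercing.
try:
    dt < d
    ordering_raises = False
except TypeError:
    ordering_raises = True

# Converting explicitly to a common type makes the comparison well-defined.
same_day = (dt.date() == d)
```

Here `same` is `False`, `ordering_raises` is `True`, and `same_day` is `True`.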
It appears this works in master now. Could use a test
|
Give me this issue. I will work on it |
Can I have a look at this issue? |
take |
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
(optional) I have confirmed this bug exists on the master branch of pandas.
Code Sample
When I try to join three dataframes whose MultiIndexes contain dates, I get a ValueError:
Traceback
Problem description
The issue seems to be that groupby produces dataframes df1 and df2 whose MultiIndex has a 'date' level consisting of 'datetime.date' objects, while df3 has a MultiIndex whose 'date' level consists of 'pandas._libs.tslibs.timestamps.Timestamp' objects. This does not lead to an error in subsequent joins:
or when dealing with single-level indices, but it creates a problem when joining three dataframes at once.
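The type mismatch described above can be illustrated with a small sketch. This is not the original code sample from the report: the data, the column names `key`, `date`, `x`, `y`, `z`, and the helper `normalize_date_level` are all hypothetical. It shows how the two kinds of `date` level arise from groupby, and one possible way to convert the level to a common type before joining several frames at once.

```python
import datetime

import pandas as pd

stamps = pd.to_datetime(["2020-01-01", "2020-01-02"])

# Hypothetical frame whose "date" column holds plain datetime.date objects.
df_dates = pd.DataFrame(
    {"key": ["a", "a"], "date": [ts.date() for ts in stamps], "x": [1, 2]}
)
# Hypothetical frame whose "date" column is datetime64, i.e. Timestamps.
df_stamps = pd.DataFrame({"key": ["a", "a"], "date": stamps, "y": [3, 4]})

g_dates = df_dates.groupby(["key", "date"]).sum()
g_stamps = df_stamps.groupby(["key", "date"]).sum()

# The "date" level keeps the element type of the column it came from.
date_type = type(g_dates.index.get_level_values("date")[0])
stamp_type = type(g_stamps.index.get_level_values("date")[0])


def normalize_date_level(df, level="date"):
    """Return a copy whose `level` index level is converted with pd.to_datetime."""
    position = df.index.names.index(level)
    new_level = pd.to_datetime(df.index.levels[position])
    out = df.copy()
    out.index = df.index.set_levels(new_level, level=level)
    return out


# With all "date" levels converted to Timestamps, the three-way join
# no longer mixes datetime.date and Timestamp values.
g_third = g_stamps.rename(columns={"y": "z"})
joined = normalize_date_level(g_dates).join([g_stamps, g_third])
```

In this sketch `date_type` is `datetime.date` and `stamp_type` is `pd.Timestamp`; `joined` has the columns `x`, `y`, and `z`. Whether the un-normalized three-way join raises depends on the pandas version, so the sketch does not attempt it.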
Expected Output
Output of pd.show_versions()
INSTALLED VERSIONS
commit : None
python : 3.7.6.final.0
python-bits : 64
OS : Linux
OS-release : 4.4.0-124-generic
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.0.3
numpy : 1.18.1
pytz : 2019.1
dateutil : 2.8.0
pip : 19.1.1
setuptools : 45.2.0.post20200210
Cython : 0.29.12
pytest : 5.0.1
hypothesis : None
sphinx : 2.1.2
blosc : None
feather : None
xlsxwriter : 1.1.8
lxml.etree : 4.3.4
html5lib : 1.0.1
pymysql : None
psycopg2 : 2.8.4 (dt dec pq3 ext lo64)
jinja2 : 2.11.1
IPython : 7.7.0
pandas_datareader: None
bs4 : 4.7.1
bottleneck : 1.2.1
fastparquet : None
gcsfs : 0.3.0
lxml.etree : 4.3.4
matplotlib : 3.1.0
numexpr : 2.6.9
odfpy : None
openpyxl : 2.6.2
pandas_gbq : None
pyarrow : 0.15.1
pytables : None
pytest : 5.0.1
pyxlsb : None
s3fs : None
scipy : 1.4.1
sqlalchemy : 1.3.5
tables : 3.5.2
tabulate : 0.8.3
xarray : None
xlrd : 1.2.0
xlwt : 1.3.0
xlsxwriter : 1.1.8
numba : 0.45.0