You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I wonder if it possible to implement conditional join (merge) between pandas dataframes. Basically, I am thinking some conditional SQL-like joins:
select a.id, a.date, a.var1, a.var2,
b.var3
from data1 as a left join data2 as b
on (a.id<b.key+2 and a.id>b.key-3) and (a.date>b.date-10 and a.date<b.date+10);
Above code is just a made up example to show the purpose.
the linked SO soln seems quite reasonable to me. I think adding a where to merge could be done, but would make the syntax even more cumbersome and add even more options. This is makes this it more non-pythonic / pandonic IMHO.
Python/pandas is all about doing relatively simple, chained operations, in a pipeline. Easier for the user to grok what is going on and examine. A where followed by a merge, is IMHO, by a black box (like SQL). Yes power is important, but I think most would prefer to have a statement that can be expressed in a short statement.
That said I marked it as an issue. If you'd like to attempt an impl. Pls do so.
I wonder if it possible to implement conditional join (merge) between pandas dataframes. Basically, I am thinking some conditional SQL-like joins:
Above code is just a made up example to show the purpose.
http://stackoverflow.com/questions/23508351/how-to-do-a-conditional-join-in-python-pandas
This question on stackoverflow is ask for the same purpose. The answer provided in the question solves the problem, but it's quite ad hoc.
Can we have something like:
It seems until now, the SQL statements are still way more flexible than pandas "merge" or "join". It would be nice to have it in pandas itself.
The text was updated successfully, but these errors were encountered: