BUG: fix #59950 handle duplicate column names in dataframe queries #59971

miguelcsx · 2024-10-04T19:42:45Z

Issue Description

Calling DataFrame.query() on a DataFrame with duplicate column names raises an unexpected TypeError because of a change introduced in pandas 2.2.1. The root cause is that self.dtypes[k] returns a Series rather than a single dtype when the column name k is duplicated, which makes query evaluation fail.
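The label lookup described above can be reproduced outside of query(). The snippet below is a minimal sketch; the frame and its column names are made up for illustration and are not taken from the linked issue.

```python
import pandas as pd

# Illustrative frame: two columns share the label "a", while "b" is unique.
df = pd.DataFrame([[1, 2.0, 3]], columns=["a", "a", "b"])

# Looking up a unique label returns a single dtype object.
unique_lookup = df.dtypes["b"]

# Looking up a duplicated label returns a Series with one dtype per match.
duplicate_lookup = df.dtypes["a"]

print(isinstance(unique_lookup, pd.Series))
print(isinstance(duplicate_lookup, pd.Series), len(duplicate_lookup))
```

Any code that passes the result of the second lookup where a single dtype is expected will fail, which matches the failure mode described in this PR.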

Changes Made

Fixed an issue where DataFrame.query() raised an unexpected error on DataFrames with duplicate column names
The error was caused by self.dtypes[k], which returns a Series for a duplicated column name
Adjusted the behavior to match the behavior prior to the change described above
Added tests to ensure that DataFrame.query() works as expected
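One way to obtain exactly one dtype per column, even with duplicate names, is to pair labels and dtypes by position rather than by label. The sketch below only illustrates that idea on a made-up frame; it is an assumption about the general approach and not necessarily the code merged in this PR.

```python
import pandas as pd

# Illustrative frame with a duplicated column label.
df = pd.DataFrame([[1, 2.0, 3]], columns=["a", "a", "b"])

# Iterating over df.dtypes yields one dtype per column in column order,
# so zipping with df.columns never performs a label-based lookup.
pairs = list(zip(df.columns, df.dtypes))

for label, dtype in pairs:
    print(label, dtype)
```

Because the pairing is positional, each of the two "a" columns keeps its own dtype instead of both resolving to the same two-element Series.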

PR Checklist

closes BUG: DataFrame.query() throws error when df has duplicate column names #59950
Tests added and passed if fixing a bug or adding a new feature
All code checks passed.
Added type annotations to new arguments/methods/functions.
Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

mroeschke · 2024-11-05T19:01:14Z

Thanks @miguelcsx

miguelcsx force-pushed the fix/dataframe-query branch 2 times, most recently from 68259e1 to 2637a24 Compare October 4, 2024 20:06

miguelcsx marked this pull request as draft October 4, 2024 21:23

miguelcsx force-pushed the fix/dataframe-query branch from 2637a24 to a5bd34b Compare October 5, 2024 03:01

Merge branch 'main' into fix/dataframe-query

a817ff4

miguelcsx marked this pull request as ready for review October 5, 2024 03:02

miguelcsx changed the title ~~fix: #59950 handle duplicate column names in dataframe queries~~ BUG: fix #59950 handle duplicate column names in dataframe queries Oct 5, 2024

miguelcsx added 2 commits October 7, 2024 19:08

Merge branch 'main' into fix/dataframe-query

c5f273d

Merge branch 'main' into fix/dataframe-query

c8a9528

mroeschke mentioned this pull request Oct 8, 2024

Fix TypeError problem with the DataFrame.query() method when the DataFrame contains duplicate column names. #60005

Closed

miguelcsx and others added 2 commits October 17, 2024 17:40

Merge branch 'main' into fix/dataframe-query

9521b4f

Merge branch 'main' into fix/dataframe-query

cd2ecf5

mroeschke added the expressions pd.eval, query label Nov 4, 2024

Merge branch 'main' into fix/dataframe-query

c9f3be4

mroeschke approved these changes Nov 5, 2024

mroeschke added this to the 3.0 milestone Nov 5, 2024

mroeschke merged commit 6631202 into pandas-dev:main Nov 5, 2024
50 of 51 checks passed
