
Allow for positional arguments instead of str | list[str] #284


Merged: 2 commits, Oct 24, 2023


MarcoGorelli
Contributor

I'd like to suggest passing multiple columns positionally, rather than as str | list[str].

First, this is more ergonomic to write, as it saves an extra pair of brackets (and an extra level
of indentation...).

Second, because this is actually already supported in both pandas and Polars:

# pandas

In [2]: df.assign(
   ...:     d=lambda x: x['a']*2,
   ...:     e=lambda x: x['b'] - x['a']
   ...: )
Out[2]:
   a  b  c  d  e
0  1  4  4  2  3
1  2  5  2  4  3
2  3  6  1  6  3

# Polars

In [1]: df.with_columns(
   ...:     d=pl.col('a')*2,
   ...:     e=pl.col('b')-pl.col('a')
   ...: )
Out[1]:
shape: (3, 5)
┌─────┬─────┬─────┬─────┬─────┐
│ abcde   │
│ --------------- │
│ i64i64i64i64i64 │
╞═════╪═════╪═════╪═════╪═════╡
│ 14423   │
│ 25243   │
│ 36163   │
└─────┴─────┴─────┴─────┴─────┘

Third, because it doesn't restrict multiple arguments to being a list: as long as they're in an iterable,
they can be passed.

(Reminder: the reason we can't have str | Iterable[str] is that in Python str is itself an Iterable[str],
so it's ambiguous whether columns='abc' means "just column 'abc'" or "columns 'a', 'b', and 'c'".)
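To make the ambiguity and the proposed alternative concrete, here is a minimal sketch. The functions `normalize_columns` and `select` are hypothetical and exist only for illustration; they are not part of the standard's API, and the real methods touched by this PR may differ in name and behaviour.

```python
from collections.abc import Iterable


def normalize_columns(columns):
    """Hypothetical helper for an old-style ``str | list[str]`` parameter.

    A lone string needs a special case, otherwise it would be iterated
    character by character.
    """
    if isinstance(columns, str):
        return [columns]
    return list(columns)


def select(*columns: str) -> list[str]:
    """Hypothetical positional-style signature; returns the names it received."""
    for name in columns:
        if not isinstance(name, str):
            raise TypeError(f"expected str, got {type(name).__name__}")
    return list(columns)


# A str is itself an iterable of one-character strings, hence the ambiguity.
assert isinstance("abc", Iterable)
assert list("abc") == ["a", "b", "c"]

# Old style: the special case is what keeps 'abc' a single column name.
assert normalize_columns("abc") == ["abc"]
assert normalize_columns(["a", "b"]) == ["a", "b"]

# Positional style: one name or several, with no extra brackets.
assert select("abc") == ["abc"]
assert select("a", "b") == ["a", "b"]

# Any iterable (tuple, generator, dict keys, ...) can be unpacked with *.
names = (c for c in ["a", "b", "c"])
assert select(*names) == ["a", "b", "c"]
```

With the positional form, splitting a string into characters only happens if the caller explicitly writes `select(*"abc")`, so the intent is always visible at the call site.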

@kkraus14
Collaborator

I completely agree with the intention of this PR to always treat the column keys as something "Iterable-like". Is there ever a situation where someone wants to pass something "Column-like" to any of these parameters where we then can't iterate (and implementations would maybe want to take advantage of a non-Python fastpath)?

@MarcoGorelli
Contributor Author

MarcoGorelli commented Oct 21, 2023

thanks

not sure what you're asking sorry, could you give an example please?

@kkraus14
Collaborator

> thanks
>
> not sure what you're asking sorry, could you give an example please?

I was thinking that there might be situations where you'd want to feed a Column[str] into any of these APIs, but we're consistently using Python strings for column names where this just makes sense. Approving.

@MarcoGorelli
Contributor Author

thanks - moving forwards then as this is relatively minor

@MarcoGorelli MarcoGorelli merged commit b891d2e into data-apis:main Oct 24, 2023