Skip to content

Infer dtype of Series in more cases #766

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Aug 18, 2023
Merged

Conversation

twoertwein
Copy link
Member

@twoertwein
Copy link
Member Author

I don't know why both mypy and pyright think that Series[int].xs is int!

Copy link
Collaborator

@Dr-Irv Dr-Irv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw this is in draft form, but I think my comments would need to be addressed anyway

assert assert_type(s.replace(1, 2, inplace=True), None) is None


def test_cat_accessor() -> None:
# GH 43
s = pd.Series(pd.Categorical(["a", "b", "a"], categories=["a", "b"]))
s: pd.Series[str] = pd.Series(
pd.Categorical(["a", "b", "a"], categories=["a", "b"])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should have a CategoricalSeries ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we might need to reconsider S1 at some point (not in this PR). I would slightly prefer S1 = TypeVar("S1") as any type can be in a Series.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but I'd rather encourage "typical" types to be in a Series, and let people do cast or ignore if they are putting other types in a Series.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could do a CategoricalSeries as a future PR

@twoertwein twoertwein marked this pull request as ready for review August 18, 2023 13:51
Copy link
Collaborator

@Dr-Irv Dr-Irv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It all looks good. Some open comments to address. I think this is the list:

  1. Where you changed the construction of the Series to shorten them, I'd rather leave them as is to have less changes to the tests.
  2. For Series.update(), need to surround that ignore with a TYPE_CHECKING_INVALID_USAGE pattern
  3. Use np.integer rather than np.intp
  4. For strings passed into Series.agg(), maybe we just need to have an overload for np.mean and "mean" to say they return Series[float] (as opposed to what I wrote elsewhere about returning Any) and then you can remove the ignore in the tests.

@twoertwein
Copy link
Member Author

I believe I fixed everything but agg/aggregate. will look into whether we can do something like:

  • special case "mean"
  • any string return S1
  • for callables: return the actual return type

@@ -834,7 +859,7 @@ class Series(IndexOpsMixin[S1], NDFrame):
axis: AxisIndex = ...,
*args,
**kwargs,
) -> Series[S1]: ...
) -> Self: ...
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe make this Series since we really don't know the type of the Series - could be mix of int and float, or one, or the other

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, that made it work!

Copy link
Collaborator

@Dr-Irv Dr-Irv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty close. Just that one case where there is a mix of funcs on the Series.agg() call. Maybe the solution is that if a list is passed, the result is Series instead of Series[S1] . See comment

Copy link
Collaborator

@Dr-Irv Dr-Irv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @twoertwein

@Dr-Irv Dr-Irv merged commit 3c7df2f into pandas-dev:main Aug 18, 2023
@twoertwein twoertwein deleted the series branch February 10, 2024 20:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Create pd.Series initializer overload to propagate the generic type of an input container
2 participants