ENH: Add more 1.5.0 features #338

Merged 10 commits on Sep 30, 2022
Conversation

bashtage
Contributor

  • Closes #xxxx (Replace xxxx with the GitHub issue number)
  • Tests added: Please use assert_type() to assert the type of any return value

@bashtage bashtage mentioned this pull request Sep 28, 2022
@bashtage
Contributor Author

More OSX timeout.

image

@bashtage
Contributor Author

Second timeout on OSX 3.10.

image

@bashtage
Contributor Author

image

@bashtage
Contributor Author

Third attempt timed out.

image

@bashtage
Contributor Author

Attempt 4 timed out.

image

@Dr-Irv
Collaborator

Dr-Irv commented Sep 28, 2022

We need to increase the time limit. Can you modify the other PR to do that?

@bashtage
Contributor Author

> We need to increase the time limit. Can you modify the other PR to do that?

I'll try, but I think the job is stuck, so no timeout will fix it. The step it is on when it stops working should take only a couple of seconds to finish.

Collaborator

@Dr-Irv Dr-Irv left a comment

One small change on read_csv()

@@ -44,7 +48,7 @@ def read_csv(
         | npt.NDArray
         | Callable[[str], bool]
         | None = ...,
-    dtype: DtypeArg | None = ...,
+    dtype: DtypeArg | defaultdict[str, Dtype] | None = ...,
Collaborator

Doesn't this have to be changed for all the overloads in read_csv()?

Contributor Author

Probably, and also for read_clipboard.

Contributor Author

Fixed.

Collaborator

I'm still seeing that only one of the overloads was changed.

Collaborator

@Dr-Irv Dr-Irv left a comment

  1. Move a test as indicated
  2. Not all overloads of read_csv() appear to have been changed
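
To illustrate the overload point: when a function has several @overload signatures, a widened parameter annotation has to be repeated on each of them, otherwise type checkers still reject the new argument type on the overloads that were missed. The sketch below is not the pandas-stubs code. `read_rows`, `DtypeSketch`, and the caster table are hypothetical stand-ins, chosen only so the example runs without pandas.

```python
from collections import defaultdict
from typing import Any, Dict, Iterator, List, Literal, Union, overload

# Hypothetical alias standing in for the stubs' dtype annotation.
# Note that the bare defaultdict appears here once and is reused everywhere.
DtypeSketch = Union[str, Dict[str, str], defaultdict, None]


@overload
def read_rows(
    rows: List[Dict[str, str]],
    *,
    iterator: Literal[True],
    dtype: DtypeSketch = ...,
) -> Iterator[Dict[str, Any]]: ...


@overload
def read_rows(
    rows: List[Dict[str, str]],
    *,
    iterator: Literal[False] = ...,
    dtype: DtypeSketch = ...,
) -> List[Dict[str, Any]]: ...


def read_rows(rows, *, iterator=False, dtype=None):
    """Runtime implementation; the overloads above only guide type checkers."""
    casters = {"int": int, "float": float, "str": str}

    def convert(row):
        out = {}
        for column, value in row.items():
            if dtype is None:
                name = "str"
            elif isinstance(dtype, str):
                name = dtype
            else:
                # Index (rather than .get) so a defaultdict supplies its default.
                name = dtype[column]
            out[column] = casters[name](value)
        return out

    converted = (convert(row) for row in rows)
    return converted if iterator else list(converted)


# Every column defaults to "float" unless listed explicitly.
dtypes = defaultdict(lambda: "float")
dtypes["id"] = "int"
rows = [{"id": "1", "price": "2.5"}]
result = read_rows(rows, dtype=dtypes)
```

Because both overloads share the one alias, widening it once covers every signature. That is one way to avoid the situation flagged in this review, assuming the stubs can route the annotation through a shared alias.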

@@ -821,6 +821,19 @@ def test_types_to_feather() -> None:
 df.to_feather(file)


+def test_arrow_dtype() -> None:
Collaborator

This should go in test_pandas.py, not test_frame.py.

@twoertwein
Member

> We need to increase the time limit. Can you modify the other PR to do that?

> I'll try but I think it is stuck and any timeout won't fix it. It should only take a couple of seconds to finish what it is doing when it stops working.

We could try caching the entire venv again. The only downside is that retrieving a large cache occasionally takes ages, which can negate any speedup. Currently, we cache only the tiny poetry.lock.

Collaborator

@Dr-Irv Dr-Irv left a comment

thanks @bashtage

@Dr-Irv
Collaborator

Dr-Irv commented Sep 29, 2022

@bashtage ping when green. PR looks good, but testing is failing.

@bashtage
Contributor Author

bashtage commented Sep 30, 2022

defaultdict typing is very problematic. The best we can do for now seems to be a plain defaultdict with no type parameters. The obvious choices, typing.DefaultDict[Hashable, Dtype] or [HashableT, Dtype], do not work. Even [str, Dtype] doesn't work. This could be due to invariance/covariance.
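
A minimal sketch of the invariance reading of this problem, offered as one plausible explanation rather than a confirmed diagnosis. `DtypeSketch` and the three `takes_*` functions are hypothetical and only stand in for the stubs' `Dtype` and for a `dtype` parameter. Since `DefaultDict` is invariant in its value type, a `DefaultDict[str, str]` is not accepted where `DefaultDict[str, Union[str, type]]` is required, whereas `Mapping` (covariant in its value type) and a bare `defaultdict` both accept it.

```python
from collections import defaultdict
from typing import DefaultDict, Mapping, Union

# Hypothetical stand-in for the stubs' Dtype union.
DtypeSketch = Union[str, type]


def takes_defaultdict(dtype: DefaultDict[str, DtypeSketch]) -> int:
    return len(dtype)


def takes_mapping(dtype: Mapping[str, DtypeSketch]) -> int:
    return len(dtype)


def takes_bare_defaultdict(dtype: defaultdict) -> int:
    return len(dtype)


# What a user would naturally write: values are plain strings.
str_valued: DefaultDict[str, str] = defaultdict(lambda: "float64")
str_valued["a"] = "int64"

# A static checker is expected to reject the next call, because
# DefaultDict[str, str] is not a DefaultDict[str, Union[str, type]]:
# takes_defaultdict(str_valued)

# Mapping is covariant in its value type, so this is accepted.
n_mapping = takes_mapping(str_valued)

# A bare defaultdict carries no type parameters, so this is accepted too.
n_bare = takes_bare_defaultdict(str_valued)
```

Annotations are not enforced at runtime, so all three calls would execute; the difference only shows up in a type checker. `Mapping` would not be a drop-in substitute here if the goal is to distinguish a defaultdict from an ordinary dict, which may be why the bare `defaultdict` was the fallback.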

@bashtage bashtage requested a review from Dr-Irv September 30, 2022 10:01
Collaborator

@Dr-Irv Dr-Irv left a comment

thanks @bashtage

@Dr-Irv Dr-Irv merged commit 2111a97 into pandas-dev:main Sep 30, 2022
@bashtage bashtage deleted the more-1.5.0 branch September 30, 2022 17:42