Skip to content

REF: move union_categoricals call outside of cython #40964

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Apr 28, 2021

Conversation

jbrockmendel
Copy link
Member

There's no real perf bump to calling union_categoricals inside cython, better to do it in the python code where we can e.g. get the benefit of mypy. Plus gets us closer to dependency structure goals.

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good, request

cc @gfyoung

return result


def ensure_dtype_objs(dtype):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't we do this more generally elsewhere? e.g. astype should be relocated

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we did it would probably be in io.common and i dont see it there

@jreback jreback added IO CSV read_csv, to_csv Refactor Internal refactoring of code labels Apr 15, 2021
@jreback jreback added this to the 1.3 milestone Apr 15, 2021
@jbrockmendel
Copy link
Member Author

gentle ping; lots of annotations to be done around these functions

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rebase. also puzzled why the concat logic lives where it does (except i guess historically), is moving this part of the plan?

warning_columns = []

result = {}
for name in names:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we not have a generic pandas concat call to handle this? or is that for a followup

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we not have a generic pandas concat call to handle this? or is that for a followup

im hopeful we can use something more generic, but havent tried it out yet (comment on L336)

@jbrockmendel
Copy link
Member Author

id like to get this in before the upcoming PR doing annotations for io.parsers, which has recently become a blocker for #41059

"""
# Conserve intermediate space
# Caller is responsible for concatenating chunks,
# see c_parser_wrapper._concatenatve_chunks
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo

@jreback jreback merged commit 770e0e9 into pandas-dev:master Apr 28, 2021
@jreback
Copy link
Contributor

jreback commented Apr 28, 2021

thanks, can you catch @simonjayhawkins comment in next pass.

@jbrockmendel jbrockmendel deleted the ref-cy-parsers branch April 28, 2021 15:54
yeshsurya pushed a commit to yeshsurya/pandas that referenced this pull request May 6, 2021
JulianWgs pushed a commit to JulianWgs/pandas that referenced this pull request Jul 3, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
IO CSV read_csv, to_csv Refactor Internal refactoring of code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants