CLN: Some code cleanups in pandas/_libs/parsers.pyx #32369

Merged 10 commits on Mar 16, 2020

Conversation

ShaharNaveh
Member

There are a lot of unused cdef variables in pandas/_libs/parsers.pyx; this PR removes some of them.

coliter_t it
const char *word = NULL

coliter_setup(&it, parser, col, line_start)

- for i in range(line_end - line_start):
+ for _ in range(line_end - line_start):
Member

can you justify this change to someone who doesn't know much about cython.

Member

agreed with @simonjayhawkins

moreover, I'd leave this alone. seemingly-innocuous changes can have performance effects in cython, and I don't think this is worth the effort of profiling

Member Author

can you justify this change to someone who doesn't know much about cython.

@simonjayhawkins No, I cannot. I tend to forget that Cython files are not Python, and Python idioms are not always applicable when dealing with Cython.

Member

@simonjayhawkins simonjayhawkins left a comment

can you keep cosmetic changes separate from other changes, #32177 (review)


na_count[0] = 0
coliter_setup(&it, parser, col, line_start)

if na_filter:
- for i in range(lines):
+ for _ in range(lines):
Member

@MomIsBestFriend need to be careful making changes like this. Previously this loop was done with a variable i declared as Py_ssize_t, and with this change the iteration is done with assignment to a Python object. I suspect this is slower than it was previously

Unless there is another upside or you have benchmarks showing otherwise it's not worth making changes like this

@@ -1614,7 +1612,7 @@ cdef inline void _to_fw_string_nogil(parser_t *parser, int64_t col,
int64_t line_start, int64_t line_end,
size_t width, char *data) nogil:
cdef:
- int64_t i
+ Py_ssize_t i
Member Author

This is the only place where I am not removing a declaration of an unused variable, but I am changing the cdef of it.

I can revert if this is unrelated to this PR.

Member

I would revert this. Technically this is used as the difference of line_start and line_end which are both int64_t, so makes sense to keep that declaration consistent particularly for 32 bit platforms
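
The width concern above can be sketched in plain Python with `ctypes`. This is only an illustration, not code from the PR: `ctypes.c_ssize_t` stands in for `Py_ssize_t`, `ctypes.c_int32` stands in for a `Py_ssize_t` on a 32-bit build, and the helper `store_as` and the value `span` are hypothetical.

```python
import ctypes


def store_as(ctype, value):
    """Return `value` as it reads back after being stored in `ctype`.

    ctypes integer types do no overflow checking, so an out-of-range
    value wraps the way a typical two's-complement conversion would.
    """
    return ctype(value).value


# int64_t is 8 bytes on every platform; Py_ssize_t follows the pointer
# size. c_ssize_t is used as a stand-in for Py_ssize_t (assumption).
INT64_BYTES = ctypes.sizeof(ctypes.c_int64)
SSIZE_BYTES = ctypes.sizeof(ctypes.c_ssize_t)

# Hypothetical value of line_end - line_start that fits in int64_t
# but not in a 32-bit signed integer.
span = 2**31 + 5

as_int64 = store_as(ctypes.c_int64, span)  # value is preserved
as_32bit = store_as(ctypes.c_int32, span)  # value wraps to a negative number

print(INT64_BYTES, SSIZE_BYTES)
print(as_int64, as_32bit)
```

On a 64-bit build both types are 8 bytes wide, so the difference only shows up on 32-bit platforms, which is why keeping the declaration as int64_t is the conservative choice.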

@simonjayhawkins simonjayhawkins added this to the 1.1 milestone Mar 5, 2020

Member

@simonjayhawkins simonjayhawkins left a comment

@MomIsBestFriend lgtm ping on green

Member

@datapythonista datapythonista left a comment

Thanks @MomIsBestFriend, nice clean up.

Sphinx failed to load a web page when building the docs. Can you merge master to get the CI green please?

@WillAyd
Member

WillAyd commented Mar 11, 2020

@MomIsBestFriend can you merge master?

@jreback
Contributor

jreback commented Mar 14, 2020

can you merge master once again, we lost azure coverage; ping on green.

@jreback
Contributor

jreback commented Mar 14, 2020

/azp run

@azure-pipelines
Contributor

Azure Pipelines successfully started running 1 pipeline(s).

@ShaharNaveh
Member Author

/azp run

That's very cool, didn't know it's possible.

@WillAyd WillAyd merged commit ad81de1 into pandas-dev:master Mar 16, 2020
@WillAyd
Member

WillAyd commented Mar 16, 2020

Great thanks @MomIsBestFriend

@jbrockmendel
Member

nice

@ShaharNaveh ShaharNaveh deleted the CLN-_libs-parsers branch March 20, 2020 00:06
SeeminSyed pushed a commit to CSCD01-team01/pandas that referenced this pull request Mar 22, 2020
6 participants