CLN: Some code cleanups in pandas/_libs/parsers.pyx #32369
pandas/_libs/parsers.pyx
Outdated
         coliter_t it
         const char *word = NULL

     coliter_setup(&it, parser, col, line_start)

-    for i in range(line_end - line_start):
+    for _ in range(line_end - line_start):
Can you justify this change to someone who doesn't know much about Cython?
Agreed with @simonjayhawkins.
Moreover, I'd leave this alone. Seemingly innocuous changes can have performance effects in Cython, and I don't think this is worth the effort of profiling.
> can you justify this change to someone who doesn't know much about cython.
@simonjayhawkins No, I cannot. I tend to forget that Cython files are not Python, and that Python idioms are not always applicable when dealing with Cython.
Can you keep cosmetic changes separate from other changes? See #32177 (review).
pandas/_libs/parsers.pyx
Outdated
     na_count[0] = 0
     coliter_setup(&it, parser, col, line_start)

     if na_filter:
-        for i in range(lines):
+        for _ in range(lines):
@MomIsBestFriend you need to be careful making changes like this. Previously this loop ran over i, a variable declared as Py_ssize_t; with this change the iteration is done by assignment to a Python object. I suspect this is slower than it was previously.
Unless there is another upside, or you have benchmarks showing otherwise, it's not worth making changes like this.
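One way to produce the kind of benchmark asked for above is a small timeit harness. The sketch below is only an illustration in plain Python: loop_named and loop_underscore are hypothetical stand-ins, and in pure Python the two loops behave the same, so the numbers here say nothing about Cython. To answer the actual question, one would point best_time at the compiled functions built before and after the change.

```python
import timeit


def loop_named(n):
    # Hypothetical stand-in for the original loop, which uses an index named `i`.
    total = 0
    for i in range(n):
        total += 1
    return total


def loop_underscore(n):
    # Hypothetical stand-in for the proposed loop, which discards the index as `_`.
    total = 0
    for _ in range(n):
        total += 1
    return total


def best_time(func, n=10_000, repeat=5, number=20):
    # Run the timing several times and keep the minimum,
    # which is the run least disturbed by other activity on the machine.
    return min(timeit.repeat(lambda: func(n), repeat=repeat, number=number))


timings = {f.__name__: best_time(f) for f in (loop_named, loop_underscore)}
for name, seconds in timings.items():
    print(f"{name}: {seconds:.6f} s")
```

If the two timings for the compiled variants are within noise of each other, the change is at least not a regression; a clear gap in either direction is what the reviewers would want to see before merging.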
pandas/_libs/parsers.pyx
Outdated
@@ -1614,7 +1612,7 @@ cdef inline void _to_fw_string_nogil(parser_t *parser, int64_t col,
                                      int64_t line_start, int64_t line_end,
                                      size_t width, char *data) nogil:
     cdef:
-        int64_t i
+        Py_ssize_t i
This is the only place where I am not removing a declaration of an unused variable, but changing its cdef type instead.
I can revert if this is unrelated to this PR.
I would revert this. Technically this is used as the difference of line_start and line_end, which are both int64_t, so it makes sense to keep that declaration consistent, particularly for 32-bit platforms.
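The 32-bit concern can be illustrated with a short sketch. Py_ssize_t follows the platform's pointer width, so on a 32-bit build it holds 32 bits, while int64_t is always 64 bits. The example below is plain Python, not the parser code: it uses ctypes.c_int32 to emulate a 32-bit Py_ssize_t, and the line_start and line_end values are hypothetical, chosen only to exceed the 32-bit range.

```python
import ctypes


def as_int64(value):
    # Store the value in a 64-bit signed C integer and read it back.
    return ctypes.c_int64(value).value


def as_int32(value):
    # Store the value in a 32-bit signed C integer and read it back.
    # ctypes does no overflow checking, so out-of-range values wrap around,
    # which is what a 32-bit Py_ssize_t would do with the same bits.
    return ctypes.c_int32(value).value


# Hypothetical bounds; the span is larger than 2**31 - 1.
line_start = 0
line_end = 3_000_000_000

span = line_end - line_start
print(as_int64(span))  # 3000000000
print(as_int32(span))  # -1294967296
```

Declaring the counter with the same width as the bounds avoids this kind of silent wraparound, which is the consistency argument made in the comment above.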
@MomIsBestFriend lgtm ping on green
Thanks @MomIsBestFriend, nice cleanup.
Sphinx failed to load a web page when building the docs. Can you merge master to get the CI green, please?
@MomIsBestFriend can you merge master?
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
That's very cool, didn't know it's possible.
Great, thanks @MomIsBestFriend
nice
There are a lot of unused cdef variables in pandas/_libs/parsers.pyx; this PR removes some of them.