Skip to content

STYLE: Fix incorrect path to ujson directory #39352

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 16 commits into from
Jan 26, 2021

Conversation

gustavocmaciel
Copy link
Contributor

I fixed the path to ujson directory so cpplint will run on pandas/_libs/src/ujson/ instead of pandas/_libs/ujson, which doesn't exist.

But the title of the issue says cpplint (unintentionally?) skips pandas/_libs/src/ujson/, so I understand there's a chance the pandas/_libs/src/ujson/ is getting intentionally skipped.

Assuming this was unintentional, I believe this small fix solves the issue.

@MarcoGorelli
Copy link
Member

Thanks @gcmaciel , but now that they are not skipped, there are lots of issues to fixup in them - is that something to want to work on?

@gustavocmaciel
Copy link
Contributor Author

Yeah, I'd like to work on it.

@MarcoGorelli
Copy link
Member

awesome! I'll mark this as "draft" then until that's done then, ping when it's ready and we'll review

@MarcoGorelli MarcoGorelli marked this pull request as draft January 23, 2021 19:53
@gustavocmaciel
Copy link
Contributor Author

@MarcoGorelli , so far I've fixed 35 out of 42 issues. But before I start working on the remaining 7, I have a few questions about them.

5 of the remaining issues says Missing username in TODO; it should look like "// TODO(my_username): Stuff.", which can be fixed by adding any word between those parenthesis like TODO(username) or TODO(contributor). But I'd like to know first if this project is following some convention or anything I choose would be fine.

The other two issues has to do with a missing copyright message on two files (date_conversions.c and date_conversions.h). In the beginning, I had no clue which copyright message should be included, but after I inspected the other files on the same directory, I've noticed all of them share the same copyright message, which is this one:

/*
Copyright (c) 2011-2013, ESN Social Software AB and Jonas Tarnstrom
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
    * Redistributions of source code must retain the above copyright
      notice, this list of conditions and the following disclaimer.
    * Redistributions in binary form must reproduce the above copyright
      notice, this list of conditions and the following disclaimer in the
      documentation and/or other materials provided with the distribution.
    * Neither the name of the ESN Social Software AB nor the
      names of its contributors may be used to endorse or promote products
      derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL ESN SOCIAL SOFTWARE AB OR JONAS TARNSTROM BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Portions of code from MODP_ASCII - Ascii transformations (upper/lower, etc)
https://github.com/client9/stringencoders
Copyright (c) 2007  Nick Galbreath -- nickg [at] modp [dot] com. All rights reserved.
Numeric decoder derived from from TCL library
https://www.opensource.apple.com/source/tcl/tcl-14/tcl/license.terms
 * Copyright (c) 1988-1993 The Regents of the University of California.
 * Copyright (c) 1994 Sun Microsystems, Inc.
*/

Which makes me think this is the copyright that should be included. But I'm not sure about it.

@MarcoGorelli
Copy link
Member

I don't know I'm afraid, this isn't something I've (yet) worked on - cc @gfyoung @WillAyd

Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm

@MarcoGorelli
Copy link
Member

Thanks @WillAyd - do you know if the same copyright notice (pasted in #39352 (comment)) can be used here and needs to be included in each file?

@WillAyd
Copy link
Member

WillAyd commented Jan 24, 2021

The date_conversions files were created by pandas and not vendored from ultrajson, so they are fine without that copyright notice

@WillAyd
Copy link
Member

WillAyd commented Jan 24, 2021

Sorry approved from my phone without seeing the checks - need to be green before really approving / merging, but conceptually this is all OK

@gustavocmaciel
Copy link
Contributor Author

Unfortunately, without the copyright message, the linting test will fail.

The way it is right now, when I run:

./ci/code_checks.sh lint

This is the output:

Linting .pyx code for spacing conventions in casting
Linting .pyx code for spacing conventions in casting DONE
Linting .c and .h
pandas/_libs/src/ujson/python/date_conversions.c:0:  No copyright message found.  You should have a line: "Copyright [year] <Copyright Owner>"  [legal/copyright] [5]
pandas/_libs/src/ujson/python/date_conversions.c:111:  Missing username in TODO; it should look like "// TODO(my_username): Stuff."  [readability/todo] [2]
Done processing pandas/_libs/src/ujson/python/date_conversions.c
pandas/_libs/src/ujson/python/date_conversions.h:0:  No copyright message found.  You should have a line: "Copyright [year] <Copyright Owner>"  [legal/copyright] [5]
pandas/_libs/src/ujson/python/date_conversions.h:17:  Missing username in TODO; it should look like "// TODO(my_username): Stuff."  [readability/todo] [2]
Done processing pandas/_libs/src/ujson/python/date_conversions.h
pandas/_libs/src/ujson/python/objToJSON.c:1522:  Missing username in TODO; it should look like "// TODO(my_username): Stuff."  [readability/todo] [2]
pandas/_libs/src/ujson/python/objToJSON.c:1621:  Missing username in TODO; it should look like "// TODO(my_username): Stuff."  [readability/todo] [2]
Done processing pandas/_libs/src/ujson/python/objToJSON.c
pandas/_libs/src/ujson/python/ujson.c:76:  Missing username in TODO; it should look like "// TODO(my_username): Stuff."  [readability/todo] [2]
Done processing pandas/_libs/src/ujson/python/ujson.c
Total errors found: 7
Linting .c and .h DONE

So I still need to fix that lack of copyright (and put something in that TODO part).

@WillAyd
Copy link
Member

WillAyd commented Jan 24, 2021 via email

@gustavocmaciel
Copy link
Contributor Author

Great! Thanks @WillAyd

@jreback jreback added IO JSON read_json, to_json, json_normalize Code Style Code style, linting, code_checks labels Jan 24, 2021
@jreback jreback modified the milestone: 1.3 Jan 24, 2021
@gustavocmaciel
Copy link
Contributor Author

gustavocmaciel commented Jan 25, 2021

I fixed all the issues raised by cpplint.

But one of the changes I've made led to some fails. After inspecting those logs I've identified that when I've replaced sprintf with snprinf at objToJSON.c, line 1399, to fix the issue Never use sprintf. Use snprintf instead. [runtime/printf] [5], it led to these errors when it tried to build pandas error: passing argument 2 of ‘snprintf’ makes integer from pointer without a cast [-Werror=int-conversion] and error: passing argument 3 of ‘snprintf’ makes pointer from integer without a cast [-Werror=int-conversion].

So to make it work, I reverted snprinf to sprintf so that function will keep working as it was before, and, in code_checks.sh, I added this new filter -runtime/printf to the list of filters so cpplint will not raise the Never use sprintf. Use snprintf instead error again. I also included this new line of comment:

    # runtime/printf: Warnings about using sprintf instead of snprintf

@gustavocmaciel gustavocmaciel marked this pull request as ready for review January 25, 2021 15:41
@gustavocmaciel
Copy link
Contributor Author

It's all green now! @MarcoGorelli @WillAyd , this PR is ready to be reviewed

@MarcoGorelli MarcoGorelli self-requested a review January 25, 2021 15:52
@WillAyd
Copy link
Member

WillAyd commented Jan 25, 2021 via email

@jreback jreback added this to the 1.3 milestone Jan 25, 2021
@gustavocmaciel
Copy link
Contributor Author

gustavocmaciel commented Jan 25, 2021

@WillAyd sure, but actually I wasn't writing a new function or anything, I only replaced sprintf with snprintf because it was required by cpplint. So this is the code from https://github.com/pandas-dev/pandas/blob/master/pandas/_libs/src/ujson/python/objToJSON.c#L1398, line 1398, before the replacement:

cLabel = PyObject_Malloc(21); // 21 chars for int64
sprintf(cLabel, "%" NPY_DATETIME_FMT,
        NpyDateTimeToEpoch(nanosecVal, base));
len = strlen(cLabel);

Then, I've replaced sprintf with snprintf and it looked like this:

cLabel = PyObject_Malloc(21); // 21 chars for int64
snprintf(cLabel, "%" NPY_DATETIME_FMT,
        NpyDateTimeToEpoch(nanosecVal, base));
len = strlen(cLabel);

But this second code led to those errors. I tried to find a way to make it work without removing the rule but, in the end, I decided to revert that to sprinft (which was working before) and add -runtime/printf to the list of filters because I've noticed there were some other rules that had been already removed and also because at least it could have a working version until this could be better discussed.

@WillAyd
Copy link
Member

WillAyd commented Jan 25, 2021 via email

@gustavocmaciel
Copy link
Contributor Author

Like this, right?

cLabel = PyObject_Malloc(21); // 21 chars for int64
snprintf(cLabel, 21, "%" NPY_DATETIME_FMT,
        NpyDateTimeToEpoch(nanosecVal, base));
len = strlen(cLabel);

@WillAyd
Copy link
Member

WillAyd commented Jan 25, 2021 via email

@gustavocmaciel gustavocmaciel marked this pull request as draft January 25, 2021 23:14
@gustavocmaciel
Copy link
Contributor Author

It didn't work as expected because cpplint raised this error If you can, use sizeof(cLabel) instead of 21 as the 2nd arg to snprintf. [runtime/printf] [3], so I went there and replaced 21 with sizeof(cLabel), but if failed the test when it tried to build pandas giving this error error: ‘clabel’ undeclared (first use in this function).

So I decided to declare int size_of_cLabel = 21 and pass it to snprintf, which is basically the same as passing 21 to the function, but this time cpplint won't raise any error. And, as expected, it worked.

The final version looks like this:

cLabel = PyObject_Malloc(21); // 21 chars for int64
int size_of_cLabel = 21;
snprintf(cLabel, size_of_cLabel, "%" NPY_DATETIME_FMT,
        NpyDateTimeToEpoch(nanosecVal, base));
len = strlen(cLabel);

Now I'm going to remove -runtime/printf from the list of filters (which I also agree it wasn't the optimal solution) and push the changes here (I'll keep this as a draft until then).

@WillAyd
Copy link
Member

WillAyd commented Jan 26, 2021

Can you push the change? Would be a little cleaner if you move the variable declaration up one line and pass it as an argument to PyObject_Malloc as well

@gustavocmaciel
Copy link
Contributor Author

I pushed the changes.

I'm pasting here too to make it easier:

int size_of_cLabel = 21;  // 21 chars for int 64
cLabel = PyObject_Malloc(size_of_cLabel);
snprintf(cLabel, size_of_cLabel, "%" NPY_DATETIME_FMT,
        NpyDateTimeToEpoch(nanosecVal, base));
len = strlen(cLabel);

@gustavocmaciel gustavocmaciel marked this pull request as ready for review January 26, 2021 13:54
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. @WillAyd merge when ready

@WillAyd WillAyd merged commit 3b4aec2 into pandas-dev:master Jan 26, 2021
@WillAyd
Copy link
Member

WillAyd commented Jan 26, 2021

Great thanks @gcmaciel

@gustavocmaciel gustavocmaciel deleted the issue-39347 branch January 27, 2021 16:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Code Style Code style, linting, code_checks IO JSON read_json, to_json, json_normalize
Projects
None yet
Development

Successfully merging this pull request may close these issues.

STYLE: cpplint (unintentionally?) skips pandas/_libs/src/ujson/
4 participants