Using LGTM.com to find and fix issues in Python and C code #26664

Closed
AlexTereshenkov opened this issue Jun 5, 2019 · 2 comments
Labels
Duplicate Report Duplicate issue or pull request

Comments

@AlexTereshenkov
Contributor

LGTM.com has flagged a few issues in the pandas code: https://lgtm.com/projects/g/pandas-dev/pandas/alerts/?mode=tree. Other numerical computing repositories, such as numpy and scipy, have been analyzed there as well.

You can see issues in both Python and C code. For instance, the alert "Implicit scaling of pointer arithmetic expressions can cause buffer overflow conditions" is raised in SciPy's C source code. Some alerts may matter more to the project than others (e.g. "Allocating memory with a size controlled by an external user can result in integer overflow" vs. "A pure expression whose value is ignored is likely to be the result of a typo"). It is completely up to the developers to pick what's relevant.

Due to the dynamic nature of Python and the heavy use of all kinds of magic in numpy (e.g. "Comparison of identical values, the intent of which is unclear") and scipy (e.g. "Using a named argument whose name does not correspond to a parameter of the `__init__` method of the class being instantiated, will result in a TypeError at runtime"), some of the alerts may look like false positives. I would love to find out which of them are, so that I can fix the code to avoid raising them.
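To make the two Python alerts mentioned above more concrete, here is a minimal sketch of the kind of code that can trigger them. It is not taken from numpy or scipy: `is_nan`, `Window`, and `build_window_with_wrong_keyword` are hypothetical names made up for illustration. The first case is a pattern that is often intentional, the second is usually a genuine bug.

```python
def is_nan(x):
    # Comparing a value with itself looks redundant, but NaN is the one
    # float that is not equal to itself, so this can be a deliberate idiom.
    return x != x


class Window:
    # Hypothetical class, used only to illustrate the second alert.
    def __init__(self, size):
        self.size = size


def build_window_with_wrong_keyword():
    # 'length' is not a parameter of Window.__init__, so the call below
    # raises TypeError at runtime; the message is returned for inspection.
    try:
        Window(length=3)
    except TypeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(is_nan(float("nan")), is_nan(1.0))
    print(build_window_with_wrong_keyword())
```

An alert on something like `is_nan` would arguably be a false positive, since the self-comparison is the whole point, whereas the call in `build_window_with_wrong_keyword` fails as soon as it runs.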

If you like, you can use LGTM to automatically review code in pull requests. Here's an example of how Google's AMPHTML uses it to flag security vulnerabilities in their code base: ampproject/amphtml#13060. This helps prevent new issues from being introduced, since you can see whether a pull request adds any before merging.

(full disclosure: I'm a huge fan of pandas and also part of the team that runs LGTM.com)

@AlexTereshenkov changed the title from "Using LGTM.com to find and fix some issues in Python and C code" to "Using LGTM.com to find and fix issues in Python and C code" on Jun 5, 2019
@mroeschke
Member

Duplicate of #20589. We would probably be open to integrating LGTM into our CI.

@mroeschke added the Duplicate Report label on Jun 5, 2019
@AlexTereshenkov
Contributor Author

@mroeschke Cool. For a step-by-step guide to setting this up, see https://lgtm.com/help/lgtm/github-apps-integration. Ping me any time if you need help with this!
