Skip to content

STY: Removed unconcatenated strings #30464

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Dec 25, 2019
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions ci/code_checks.sh
Original file line number Diff line number Diff line change
Expand Up @@ -207,6 +207,10 @@ if [[ -z "$CHECK" || "$CHECK" == "patterns" ]]; then
invgrep -R --include=*.{py,pyx} 'xrange' pandas
RET=$(($RET + $?)) ; echo $MSG "DONE"

MSG='Check for use of not concatenated strings' ; echo $MSG
python ./scripts/validate_string_concatenation.py pandas
RET=$(($RET + $?)) ; echo $MSG "DONE"

MSG='Check that no file in the repo contains trailing whitespaces' ; echo $MSG
INVGREP_APPEND=" <- trailing whitespaces found"
invgrep -RI --exclude=\*.{svg,c,cpp,html,js} --exclude-dir=env "\s$" *
Expand Down
73 changes: 73 additions & 0 deletions scripts/validate_string_concatenation.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,73 @@
#!/usr/bin/env python
"""
Check where there is a string that needs to be concatenated.
"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you flesh this out with a GH reference and a sentence about how black causes the issue


import os
import sys
import token
import tokenize

FILE_EXTENTIONS_TO_CHECK = [".py", ".pyx"]


def main():
path = sys.argv[1]

if not os.path.exists(path):
raise ValueError("Please enter a valid path, to a file/directory.")

if os.path.isfile(path):
# Means that the given path is of a single file.
sys.exit(is_concatenated(path))

status_codes = set()
# Means that the given path is of a directory.
for subdir, _, files in os.walk(path):
for file_name in files:
ext = os.path.splitext(os.path.join(subdir, file_name))[1]
if ext in FILE_EXTENTIONS_TO_CHECK:
status_codes.add(is_concatenated(os.path.join(subdir, file_name)))

if 1 in status_codes:
sys.exit(1)

sys.exit(0)


def is_concatenated(file_path):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could add type?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At first I typed this whole script, but it said it was a syntax error, same with f-strings, I think that the running platform of this is python 3.5

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm - we run validate_docstrings.py as part of code_checks.sh which has 3.6 + syntax?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm - we run validate_docstrings.py as part of code_checks.sh which has 3.6 + syntax?

I really don't know.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm - we run validate_docstrings.py as part of code_checks.sh which has 3.6 + syntax?

@alimcmaster1 #30467 (comment)

"""
Checking if the file containing strings that needs to be concatenated.

Parameters
----------
file_path : str
File path pointing to a single file.

Returns
-------
int
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feels like a bool would be a more appropriate return type if this function is only ever going to return 0/1

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did also thought so, but I am relying on an exit status, so int feels more native.

Status code representing if the file needs a fix.
0 - All good.
1 - Needs to be fixed.
"""
with open(file_path, "r") as file_name:
toks = list(tokenize.generate_tokens(file_name.readline))
for i in range(len(toks) - 1):
tok = toks[i]
tok2 = toks[i + 1]
if tok[0] == token.STRING and tok[0] == tok2[0]:
print(
"{file_path}:{line_number}:\t{start} and {end}".format(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

f string?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I already answered above 😄

file_path=file_path,
line_number=tok[2][0],
start=tok[1],
end=tok2[1],
)
)
return 1
return 0


if __name__ == "__main__":
main()