Skip to content

STYLE: specify encodings when opening files #52999

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 31 commits into from
May 7, 2023

Conversation

KetuPatel806
Copy link
Contributor

@KetuPatel806 KetuPatel806 commented Apr 29, 2023

@MarcoGorelli
Copy link
Member

let's stick to one PR, please don't open a new one which does the same thing

same comments from #52997, please address them

@MarcoGorelli MarcoGorelli changed the title Review it if there some errors. STYLE: specify encodings when opening files Apr 29, 2023
@KetuPatel806
Copy link
Contributor Author

This is my first PR so I was making mistake next time i was more carefull for making PR

Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getting there 💪

could you please:

  • remove b/.pre-commit-config.yaml
  • run pre-commit on the rest of the files you've modified?
pre-commit run --from-ref=upstream/main --to-ref=HEAD --all-files

Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like this is it, thanks @KetuPatel806 !

@@ -126,7 +126,7 @@ def ensure_clean(
handle_or_str: str | IO = str(path)
if return_filelike:
kwargs.setdefault("mode", "w+b")
handle_or_str = open(path, **kwargs)
handle_or_str = open(path, **kwargs, encoding="utf-8")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

apologies, looks like some tests are failing here due to

 TypeError: io.open() got multiple values for keyword argument 'encoding'

we may have to do something like

handle_or_str = open(path, encoding=kwargs.get('encoding', None), **{key: value for key, value in kwargs.items() if key != 'encoding'})

, could you try this please and see if it fixes the failing tests?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok,I'll do it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it run the pre-commit command and i'll commit it.

Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getting closer

@@ -1444,7 +1444,7 @@ def test_file_io_iterparse(datapath, parser, mode):
filename = datapath("io", "data", "xml", "books.xml")

funcIO = StringIO if mode == "r" else BytesIO
with open(filename, mode) as f:
with open(filename, mode, encoding="utf-8") as f:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this one's causing test failures - could we do encoding="utf-8" if mode=="r" else None?

@@ -1444,7 +1444,7 @@ def test_file_io_iterparse(datapath, parser, mode):
filename = datapath("io", "data", "xml", "books.xml")

funcIO = StringIO if mode == "r" else BytesIO
with open(filename, mode) as f:
with open(filename, mode, encoding="utf-8" if mode=="r" else None) as f:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still gonna have to run pre-commit on the files you've modifed though ;)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i was run it. it passed all of the hooks. only black (hook) failed it

Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's still some test failures

it may be worth parametrising over both mode and encoding. either that, or putting similar encoding='utf-8' if mode=='r' else None in the failing tests

with open(filename, mode) as f:
with open(
filename,
mode="utf-8" if mode == "r" else None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please check this

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you need to change encoding, not mode

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just keep mode as it is

@MarcoGorelli
Copy link
Member

this is close, a few tests left. would be great if you could finish it up, else I can add a commit on Tuesday and push this over the finish line

@mroeschke mroeschke added the Code Style Code style, linting, code_checks label May 1, 2023
@KetuPatel806
Copy link
Contributor Author

can i fixes that cases or you push this over the final line???

@MarcoGorelli
Copy link
Member

feel free to fix

@KetuPatel806
Copy link
Contributor Author

can you refer that test cases so i'll fix it?

@MarcoGorelli
Copy link
Member

check the failing tests in the CI logs

@MarcoGorelli
Copy link
Member

you'll need to fix the tests (in addition to merging upstream/main) to get this passing

@KetuPatel806
Copy link
Contributor Author

i have check the circle ci to find the error and i tired on a local to fix it.

@KetuPatel806
Copy link
Contributor Author

i try to fix read Binary mode error but if i remove the encoding then pre-commit command does not work it give me error massage called as "unspecified encoding"

@KetuPatel806
Copy link
Contributor Author

i thought it finish it up.

Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be good, thanks @KetuPatel806 !

@MarcoGorelli MarcoGorelli merged commit 87970b2 into pandas-dev:main May 7, 2023
@MarcoGorelli MarcoGorelli added this to the 2.1 milestone May 7, 2023
Rylie-W pushed a commit to Rylie-W/pandas that referenced this pull request May 19, 2023
* Changes Confirmed

* Encoding Completed

* Spaces Are Completed

* Pre-ccommit manually completed

* b.pre-commit removed

* Final Commit

* Some Changed reflected

* test_xml Updated

* Pre-commit check passed

* Mode changed in xml file

* mode reverted

* Try to fix errors

* error-checks

* Fix Some errors

* Unspecified-encodingFixed

* final commited

* simplify

---------

Co-authored-by: MarcoGorelli <>
Daquisu pushed a commit to Daquisu/pandas that referenced this pull request Jul 8, 2023
* Changes Confirmed

* Encoding Completed

* Spaces Are Completed

* Pre-ccommit manually completed

* b.pre-commit removed

* Final Commit

* Some Changed reflected

* test_xml Updated

* Pre-commit check passed

* Mode changed in xml file

* mode reverted

* Try to fix errors

* error-checks

* Fix Some errors

* Unspecified-encodingFixed

* final commited

* simplify

---------

Co-authored-by: MarcoGorelli <>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Code Style Code style, linting, code_checks
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants