REF: Unify _set_noconvert_dtype_columns for parsers #39365

phofl · 2021-01-24T00:57:26Z

xref REF: consolidate in CSV parser module #39345
tests added / passed
Ensure all linting tests pass, see here for how to run them

I unified the code, which led to a bugfix. The test previously failed for the python parser case.

jreback · 2021-01-25T16:46:24Z

pandas/io/parsers/base_parser.py

@@ -546,6 +548,65 @@ def _convert_to_ndarrays(
                print(f"Filled {na_count} NA values in column {c!s}")
        return result

+    def _set_noconvert_dtype_columns(self, col_indices, names):


can you type this in any way? (esp the return value) and add a doc-string
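
For illustration, a typed and documented helper in the spirit of this request might look like the sketch below. It is a standalone function, not the method from this PR: the name `noconvert_columns`, its parameters, and the rule that date-parsed columns are the ones skipped are all assumptions made for the example.

```python
from typing import List, Optional, Set, Union


def noconvert_columns(
    parse_dates: List[Union[int, str]],
    names: List[str],
    usecols: Optional[List[int]] = None,
) -> Set[int]:
    """
    Return the positions of columns that should skip dtype conversion.

    Hypothetical helper: entries of ``parse_dates`` are either column
    names or integer positions. When ``usecols`` is given, integer
    positions are taken relative to the sorted ``usecols`` rather than
    to all columns.
    """
    selected = sorted(usecols) if usecols is not None else None
    result: Set[int] = set()
    for col in parse_dates:
        if isinstance(col, str):
            # Names are looked up in the full list of column names.
            result.add(names.index(col))
        elif selected is not None:
            # Integer positions index into the selected columns.
            result.add(selected[col])
        else:
            result.add(col)
    return result
```

With `names=["A", "B", "C"]`, `parse_dates=[1]` and `usecols=[1, 2]`, this sketch resolves the date column to position 2 (column `C`), because position 1 is read relative to the selected columns.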

jreback · 2021-01-25T16:46:37Z

pandas/io/parsers/base_parser.py

+        if self.usecols_dtype == "integer":
+            # A set of integers will be converted to a list in
+            # the correct order every single time.
+            usecols = list(self.usecols)


Safe-sort, because it could be mixed.

No, it could not, sorry. Used sorted.
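
The ordering concern can be shown with a small sketch. Python sets do not guarantee an iteration order, so `list` applied to a set is not guaranteed to be sorted, while `sorted` always is. The variable names here are illustrative only and do not come from the PR.

```python
usecols = {10, 2, 33, 1}

# list() follows the set's internal iteration order, which is an
# implementation detail and should not be relied on.
unordered = list(usecols)

# sorted() always returns the integers in ascending order.
ordered = sorted(usecols)
print(ordered)  # [1, 2, 10, 33]
```

Plain `sorted` is enough when every element is an integer. It would raise a `TypeError` on a mix of integers and strings, which is the case a safe-sort would have to handle.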

jreback

lgtm. if you can try that suggestion and, if it works, ping (or comment that it's a no-go)

jreback · 2021-01-26T00:12:26Z

pandas/io/parsers/base_parser.py

+
+            # pandas\io\parsers.py:2030: error: Incompatible types in
+            # assignment (expression has type "None", variable has type
+            # "List[Any]")  [assignment]


i think if you predeclare at the top

usecols: Optional[List[Any]] = []

you can remove the ignore

Yep, this works. I typed it a bit more specifically.
I've got #39342 for the mypy errors in there and will rebase when this is merged.
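
A minimal sketch of the predeclaration idea follows. The function `pick_usecols` and its parameters are hypothetical and only loosely mirror the branches discussed here. The sketch uses a bare annotation rather than an initial value, which is a variant of the suggestion above.

```python
from typing import Any, List, Optional


def pick_usecols(
    usecols_dtype: Optional[str], raw: Any, col_indices: List[int]
) -> Optional[List[Any]]:
    # Declaring the annotation up front tells mypy that None is allowed,
    # so the assignment in the else branch needs no "type: ignore".
    usecols: Optional[List[Any]]
    if usecols_dtype == "integer":
        usecols = sorted(raw)
    elif usecols_dtype is not None:
        usecols = list(col_indices)
    else:
        usecols = None
    return usecols
```

Without the up-front annotation, mypy infers `List[Any]` from the first assignment and then reports the `None` assignment as incompatible, which is the error quoted in the diff comment.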

jreback · 2021-01-26T00:14:06Z

pandas/tests/io/parser/test_parse_dates.py

+        parse_dates=[1],
+        usecols=[1, 2],
+        thousands="-",
+    )


this is a bug yes? can you add a whatsnew note

phofl · 2021-01-26T20:18:34Z

@jreback green

jreback · 2021-01-27T14:01:39Z

thanks !

phofl added 2 commits January 24, 2021 01:54

REF: Unify _set_noconvert_dtype_columns for parsers

19e5d35

Ad pr reference

19a1c3c

phofl added the IO CSV (read_csv, to_csv) and Refactor (Internal refactoring of code) labels Jan 24, 2021

jreback requested changes Jan 25, 2021

Type and document function

8aee58c

jreback requested changes Jan 26, 2021

jreback reviewed Jan 26, 2021

phofl added 3 commits January 26, 2021 20:20

Type usecols

50c0699

Add whatsnew

0d6fa64

Change whatsnew

df7000a

jreback added this to the 1.3 milestone Jan 27, 2021

jreback approved these changes Jan 27, 2021

jreback merged commit bc3adf2 into pandas-dev:master Jan 27, 2021

phofl deleted the ref_noconvert branch January 27, 2021 18:36
