REF: simplify CSVFormatter #36046

Merged: 39 commits, Sep 9, 2020
Commits
4223349 REF: extract properties cols and has_mi_columns (ivanovmg, Sep 1, 2020)
58ef283 REF: extract property chunksize (ivanovmg, Sep 1, 2020)
f4fe66d REF: extract property quotechar (ivanovmg, Sep 1, 2020)
59a2d21 REF: extract properties data_index and nlevels (ivanovmg, Sep 1, 2020)
29256d4 REF: refactor _save_chunk (ivanovmg, Sep 1, 2020)
a6e84e1 REF: refactor _save (ivanovmg, Sep 1, 2020)
c840b3f REF: extract method _save_body (ivanovmg, Sep 1, 2020)
6828146 REF: reorder _save-like methods (ivanovmg, Sep 1, 2020)
98d4e47 REF: extract compression property (ivanovmg, Sep 1, 2020)
d6b2827 REF: Extract property index_label (ivanovmg, Sep 1, 2020)
15dbc83 REF: extract helper properties (ivanovmg, Sep 1, 2020)
5e7b778 REF: delete local variables in _save_header (ivanovmg, Sep 1, 2020)
6e3b389 REF: extract method _get_header_rows (ivanovmg, Sep 1, 2020)
d733f0f REF: move check for header into _save function (ivanovmg, Sep 1, 2020)
cdeb115 TYP: add several type annotations (ivanovmg, Sep 1, 2020)
417e74a FIX: fix index labels (ivanovmg, Sep 2, 2020)
9df1d82 FIX: fix multiindex (ivanovmg, Sep 4, 2020)
9fd8d13 Merge branch 'master' into refactor/csvs (ivanovmg, Sep 4, 2020)
22955db FIX: fix test failures on compression (ivanovmg, Sep 4, 2020)
5dcff8e REF: eliminate preallocation of self.data (ivanovmg, Sep 4, 2020)
ff144d8 REF: extract method _convert_to_native_types (ivanovmg, Sep 4, 2020)
3da7207 REF: rename regular -> flat as reviewed (ivanovmg, Sep 5, 2020)
6041666 TYP: add type annotations as reviewed (ivanovmg, Sep 5, 2020)
46f593d REF: refactor number formatting (ivanovmg, Sep 5, 2020)
080e6e1 FIX: mypy error with index_label (ivanovmg, Sep 5, 2020)
1e35f87 FIX: reorder if-statements in index_label (ivanovmg, Sep 5, 2020)
ba353a5 TYP: move IndexLabel to pandas._typing (ivanovmg, Sep 5, 2020)
a49dd63 TYP: quotechar, has_mi_columns, _need_to_save... (ivanovmg, Sep 5, 2020)
f1e1ac8 TYP: chunksize, but ignored assignment check (ivanovmg, Sep 5, 2020)
1346995 TYP: cols property (ivanovmg, Sep 5, 2020)
bebdfcf TYP: nlevels and _has_aliases (ivanovmg, Sep 5, 2020)
b381e8a Merge branch 'master' into refactor/csvs (ivanovmg, Sep 5, 2020)
ca888c1 CLN: move GH21227 check to pandas/io/common.py (ivanovmg, Sep 5, 2020)
b7dae11 TYP: remove redundant bool from IndexLabel type (ivanovmg, Sep 5, 2020)
1f8c488 TYP: add to _get_index_label... methods (ivanovmg, Sep 5, 2020)
1a750b4 TYP: use Iterator instead of Generator (ivanovmg, Sep 5, 2020)
7b89921 TYP: explicitly use List type (ivanovmg, Sep 5, 2020)
2478084 TYP: correct dict typing (ivanovmg, Sep 6, 2020)
e08f656 TYP: remaining properties (ivanovmg, Sep 6, 2020)
2 changes: 2 additions & 0 deletions pandas/_typing.py
@@ -15,6 +15,7 @@
     List,
     Mapping,
     Optional,
+    Sequence,
     Type,
     TypeVar,
     Union,
@@ -82,6 +83,7 @@

 Axis = Union[str, int]
 Label = Optional[Hashable]
+IndexLabel = Optional[Union[Label, Sequence[Label]]]
Review comment by @simonjayhawkins (Member), Sep 9, 2020:

The Optional in Label above is there to include None in Label. (In hindsight, I think this could have been Union[Hashable, None] for clarity, but we don't use the Union[..., None] pattern.)

Also, I'm not sure how we got the Ordered alias below.

In pandas._typing, I would generally prefer that we don't include the Optional in the alias and just add it where needed in the annotations. I think this generally allows more use of the aliases (i.e. after setting a default, since we don't always set defaults in the signatures).

So in the code, you would have

index_label: Optional[IndexLabel] = None

instead of

index_label: IndexLabel = None

Of course, Label already includes None, so the Optional isn't strictly needed here. It's just a stylistic preference.

Reply from the PR author (Member):

@simonjayhawkins, I agree that it is more reasonable to define Label and IndexLabel without Optional. I will take a look at this in a separate PR.

 Level = Union[Label, int]
 Ordered = Optional[bool]
 JSONSerializable = Optional[Union[PythonScalar, List, Dict]]
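
The review point about Optional inside aliases can be checked with the standard typing module alone. The sketch below copies Label and IndexLabel from the diff; StrictIndexLabel is a hypothetical name used only for illustration and is not part of pandas. It shows that wrapping the current alias in Optional changes nothing, while an alias defined without None would make the Optional in an annotation meaningful.

```python
from typing import Hashable, Optional, Sequence, Union, get_args

# Aliases as they appear in the pandas/_typing.py diff.
Label = Optional[Hashable]
IndexLabel = Optional[Union[Label, Sequence[Label]]]

# Hypothetical alias without None baked in (not part of pandas).
StrictIndexLabel = Union[Hashable, Sequence[Hashable]]

# typing flattens nested unions and drops duplicate members, so adding
# Optional to an alias that already contains None has no effect.
optional_is_noop = Optional[IndexLabel] == IndexLabel

# Label already admits None, which is why the outer Optional is redundant.
label_includes_none = type(None) in get_args(Label)

# With the stricter alias, Optional at the annotation site carries information.
optional_adds_none = Optional[StrictIndexLabel] != StrictIndexLabel

print(optional_is_noop, label_includes_none, optional_adds_none)
```

Under this reading, `index_label: Optional[StrictIndexLabel] = None` would state the None default explicitly at the call site, which appears to be the stylistic preference described in the comment.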
3 changes: 2 additions & 1 deletion pandas/core/generic.py
@@ -40,6 +40,7 @@
     CompressionOptions,
     FilePathOrBuffer,
     FrameOrSeries,
+    IndexLabel,
     JSONSerializable,
     Label,
     Level,
@@ -3160,7 +3161,7 @@ def to_csv(
         columns: Optional[Sequence[Label]] = None,
         header: Union[bool_t, List[str]] = True,
         index: bool_t = True,
-        index_label: Optional[Union[bool_t, str, Sequence[Label]]] = None,
+        index_label: IndexLabel = None,
         mode: str = "w",
         encoding: Optional[str] = None,
         compression: CompressionOptions = "infer",
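
The new annotation drops the explicit bool_t and str members, which is consistent with the commit "TYP: remove redundant bool from IndexLabel type": both are hashable, so Label already covers them. A rough runtime approximation, assuming a hypothetical helper `fits_index_label` that is not part of pandas (type checkers do this statically), might look like this:

```python
from collections.abc import Hashable, Sequence


def fits_index_label(value: object) -> bool:
    """Hypothetical runtime approximation of the IndexLabel alias."""
    # None and any hashable value (bool, str, int, tuple, ...) match Label.
    if value is None or isinstance(value, Hashable):
        return True
    # Unhashable sequences such as lists match Sequence[Label]
    # when every element is itself None or hashable.
    if isinstance(value, Sequence):
        return all(item is None or isinstance(item, Hashable) for item in value)
    return False


accepts_false = fits_index_label(False)          # formerly covered by bool_t
accepts_str = fits_index_label("idx")            # formerly covered by str
accepts_list = fits_index_label(["lvl0", None])  # Sequence[Label]
rejects_nested = fits_index_label([["lvl0"]])    # inner list is unhashable
rejects_dict = fits_index_label({"a": 1})        # neither hashable nor a sequence
```

This is only a sketch of what the alias admits; it says nothing about which values to_csv handles meaningfully at runtime.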
15 changes: 15 additions & 0 deletions pandas/io/common.py
@@ -208,6 +208,21 @@ def get_filepath_or_buffer(
     # handle compression dict
     compression_method, compression = get_compression_method(compression)
     compression_method = infer_compression(filepath_or_buffer, compression_method)
+
+    # GH21227 internal compression is not used for non-binary handles.
+    if (
+        compression_method
+        and hasattr(filepath_or_buffer, "write")
+        and mode
+        and "b" not in mode
+    ):
+        warnings.warn(
+            "compression has no effect when passing a non-binary object as input.",
+            RuntimeWarning,
+            stacklevel=2,
+        )
+        compression_method = None
+
     compression = dict(compression, method=compression_method)

     # bz2 and xz do not write the byte order mark for utf-16 and utf-32
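
To see how the GH21227 condition behaves in isolation, here is a minimal standalone sketch using only the standard library. The helper name `drop_compression_for_text_handle` is hypothetical; the condition and warning text are copied from the hunk above, but the real code lives inside get_filepath_or_buffer and this is not a substitute for it.

```python
import io
import warnings
from typing import Optional


def drop_compression_for_text_handle(
    compression_method: Optional[str], filepath_or_buffer, mode: Optional[str]
) -> Optional[str]:
    """Hypothetical helper mirroring the GH21227 check from the diff."""
    if (
        compression_method
        and hasattr(filepath_or_buffer, "write")  # an open handle, not a path
        and mode
        and "b" not in mode  # text mode: compression cannot be applied
    ):
        warnings.warn(
            "compression has no effect when passing a non-binary object as input.",
            RuntimeWarning,
            stacklevel=2,
        )
        return None
    return compression_method


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Text handle opened in text mode: compression is dropped with a warning.
    text_result = drop_compression_for_text_handle("gzip", io.StringIO(), "w")
    # Binary handle: compression is kept.
    binary_result = drop_compression_for_text_handle("gzip", io.BytesIO(), "wb")
    # Plain path string has no write attribute: compression is kept.
    path_result = drop_compression_for_text_handle("gzip", "out.csv.gz", "w")
```

Only the first call should trigger the RuntimeWarning, since it is the only one that combines a writable handle with a non-binary mode.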