REF: simplify CSVFormatter (#36046)

ivanovmg · web-flow · commit 44e933a765f9 · 2020-09-09T06:33:10.000-04:00
* REF: extract properties cols and has_mi_columns

* REF: extract property chunksize

* REF: extract property quotechar

* REF: extract properties data_index and nlevels

* REF: refactor _save_chunk

* REF: refactor _save

* REF: extract method _save_body

* REF: reorder _save-like methods

* REF: extract compression property

* REF: Extract property index_label

* REF: extract helper properties

* REF: delete local variables in _save_header

* REF: extract method _get_header_rows

* REF: move check for header into _save function

* TYP: add several type annotations

* FIX: fix index labels

* FIX: fix multiindex

* FIX: fix test failures on compression

Needed to eliminate compression setter
due to the interdependencies between ioargs and compression.

* REF: eliminate preallocation of self.data

* REF: extract method _convert_to_native_types

* REF: rename regular -> flat as reviewed

* TYP: add type annotations as reviewed

* REF: refactor number formatting

Replace the _convert_to_native_types method
with a number formatting dictionary.

* FIX: mypy error with index_label

* FIX: reorder if-statements in index_label

To make sure that the newer mypy (v0.782) passes.

* TYP: move IndexLabel to pandas._typing

This eliminates repetition of the type annotations
for index label in multiple places.
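
A minimal sketch of how the shared alias can be used in a signature.
The two aliases mirror the ones in pandas/_typing.py from this commit;
describe_index_label is a hypothetical helper for illustration only
and is not part of pandas.

```python
from typing import Hashable, Optional, Sequence, Union

# Aliases mirroring the definitions in pandas/_typing.py from this commit.
Label = Optional[Hashable]
IndexLabel = Optional[Union[Label, Sequence[Label]]]


def describe_index_label(index_label: IndexLabel = None) -> str:
    """Hypothetical helper (not pandas API): classify an index label."""
    if index_label is None:
        return "default"
    # Only list/tuple are treated as sequences here, so a plain string
    # counts as a single label.
    if isinstance(index_label, (list, tuple)):
        return "sequence of {} labels".format(len(index_label))
    return "single label"
```

Any function that accepts an index label can then be annotated with
IndexLabel instead of repeating the full Union.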

* TYP: quotechar, has_mi_columns, _need_to_save...

* TYP: chunksize, but ignored assignment check

For some reason mypy would not recognize that chunksize
turns from Optional[int] to int inside the setter.
Even setting an intentional assertion
``assert chunksize is not None``
does not help.
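
A rough sketch of the pattern that leads to the ignored check.
ChunkedWriter is a hypothetical stand-in, not the pandas class, and the
default of about 100000 cells per chunk is an assumption made for this
example.

```python
from typing import Optional, Sequence


class ChunkedWriter:
    """Hypothetical stand-in for the formatter, not the pandas class."""

    def __init__(self, cols: Sequence[str], chunksize: Optional[int] = None) -> None:
        self.cols = cols
        # The getter is declared to return int while the value assigned
        # here is Optional[int], so a type checker may flag this line.
        self.chunksize = chunksize  # type: ignore[assignment]

    @property
    def chunksize(self) -> int:
        return self._chunksize

    @chunksize.setter
    def chunksize(self, chunksize: Optional[int]) -> None:
        if chunksize is None:
            # Assumed default: roughly 100000 cells per chunk, at least 1 row.
            chunksize = (100000 // (len(self.cols) or 1)) or 1
        self._chunksize = int(chunksize)
```

The setter always stores an int, so the getter stays non-optional even
though callers may pass None.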

* TYP: cols property

Limitations:
 - an assignment error still has to be silenced
 with ``type: ignore[assignment]``.
 - an additional method _refine_cols was created to allow
 conversion from Optional[Sequence[Label]] to Sequence[Label].
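
The narrowing can be sketched as a standalone function. refine_cols
below is a hypothetical analogue written for this note, not the
_refine_cols method itself; falling back to a caller-supplied default
is an assumption.

```python
from typing import Hashable, List, Optional, Sequence

Label = Optional[Hashable]


def refine_cols(
    cols: Optional[Sequence[Label]], default: Sequence[Label]
) -> List[Label]:
    """Hypothetical analogue of _refine_cols (not the pandas code)."""
    if cols is None:
        # Assumed behaviour: use the default columns when none are given.
        return list(default)
    return list(cols)
```

Because the return type is not Optional, code after the call no longer
needs a None check.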

* TYP: nlevels and _has_aliases

* CLN: move GH21227 check to pandas/io/common.py
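
The moved check can be exercised in isolation with a small sketch. The
condition and warning text follow the pandas/io/common.py hunk of this
commit; the wrapper drop_compression_for_text_handle is a hypothetical
name used only for this example.

```python
import io
import warnings
from typing import Any, Optional


def drop_compression_for_text_handle(
    handle: Any, compression_method: Optional[str], mode: Optional[str]
) -> Optional[str]:
    """Hypothetical wrapper around the GH21227 condition."""
    if compression_method and hasattr(handle, "write") and mode and "b" not in mode:
        warnings.warn(
            "compression has no effect when passing a non-binary object as input.",
            RuntimeWarning,
            stacklevel=2,
        )
        return None
    return compression_method


# Usage: a text buffer opened in a non-binary mode loses its compression.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = drop_compression_for_text_handle(io.StringIO(), "gzip", "w")
```

A path string has no write attribute, so it keeps its compression
method and no warning is raised.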

* TYP: remove redundant bool from IndexLabel type

* TYP: add to _get_index_label... methods

* TYP: use Iterator instead of Generator

* TYP: explicitly use List type

* TYP: correct dict typing

* TYP: remaining properties
diff --git a/pandas/_typing.py b/pandas/_typing.py
@@ -15,6 +15,7 @@
     List,
     Mapping,
     Optional,
+    Sequence,
     Type,
     TypeVar,
     Union,
@@ -82,6 +83,7 @@
 
 Axis = Union[str, int]
 Label = Optional[Hashable]
+IndexLabel = Optional[Union[Label, Sequence[Label]]]
 Level = Union[Label, int]
 Ordered = Optional[bool]
 JSONSerializable = Optional[Union[PythonScalar, List, Dict]]
diff --git a/pandas/core/generic.py b/pandas/core/generic.py
@@ -40,6 +40,7 @@
     CompressionOptions,
     FilePathOrBuffer,
     FrameOrSeries,
+    IndexLabel,
     JSONSerializable,
     Label,
     Level,
@@ -3160,7 +3161,7 @@ def to_csv(
         columns: Optional[Sequence[Label]] = None,
         header: Union[bool_t, List[str]] = True,
         index: bool_t = True,
-        index_label: Optional[Union[bool_t, str, Sequence[Label]]] = None,
+        index_label: IndexLabel = None,
         mode: str = "w",
         encoding: Optional[str] = None,
         compression: CompressionOptions = "infer",
diff --git a/pandas/io/common.py b/pandas/io/common.py
@@ -208,6 +208,21 @@ def get_filepath_or_buffer(
     # handle compression dict
     compression_method, compression = get_compression_method(compression)
     compression_method = infer_compression(filepath_or_buffer, compression_method)
+
+    # GH21227 internal compression is not used for non-binary handles.
+    if (
+        compression_method
+        and hasattr(filepath_or_buffer, "write")
+        and mode
+        and "b" not in mode
+    ):
+        warnings.warn(
+            "compression has no effect when passing a non-binary object as input.",
+            RuntimeWarning,
+            stacklevel=2,
+        )
+        compression_method = None
+
     compression = dict(compression, method=compression_method)
 
     # bz2 and xz do not write the byte order mark for utf-16 and utf-32
diff --git a/pandas/io/formats/csvs.py b/pandas/io/formats/csvs.py