Commit 05e1eae

NasaGeek authored and Pingviinituutti committed
BUG: Infer compression by default in read_fwf() (pandas-dev#22200)
Closes pandas-dev gh-22199.
1 parent 2a0f5fc commit 05e1eae

3 files changed, +12 -5 lines changed

doc/source/whatsnew/v0.24.0.rst

+1
@@ -1536,6 +1536,7 @@ Notice how we now instead output ``np.nan`` itself instead of a stringified form
 - Bug in :meth:`DataFrame.to_dict` when the resulting dict contains non-Python scalars in the case of numeric data (:issue:`23753`)
 - :func:`DataFrame.to_string()`, :func:`DataFrame.to_html()`, :func:`DataFrame.to_latex()` will correctly format output when a string is passed as the ``float_format`` argument (:issue:`21625`, :issue:`22270`)
 - Bug in :func:`read_csv` that caused it to raise ``OverflowError`` when trying to use 'inf' as ``na_value`` with integer index column (:issue:`17128`)
+- Bug in :func:`read_fwf` in which the compression type of a file was not being properly inferred (:issue:`22199`)
 - Bug in :func:`pandas.io.json.json_normalize` that caused it to raise ``TypeError`` when two consecutive elements of ``record_path`` are dicts (:issue:`22706`)
 - Bug in :meth:`DataFrame.to_stata`, :class:`pandas.io.stata.StataWriter` and :class:`pandas.io.stata.StataWriter117` where a exception would leave a partially written and invalid dta file (:issue:`23573`)
 - Bug in :meth:`DataFrame.to_stata` and :class:`pandas.io.stata.StataWriter117` that produced invalid files when using strLs with non-ASCII characters (:issue:`23573`)

pandas/io/parsers.py

+1 -1
@@ -401,7 +401,7 @@ def _read(filepath_or_buffer, kwds):
         encoding = re.sub('_', '-', encoding).lower()
         kwds['encoding'] = encoding
 
-    compression = kwds.get('compression')
+    compression = kwds.get('compression', 'infer')
     compression = _infer_compression(filepath_or_buffer, compression)
     filepath_or_buffer, _, compression, should_close = get_filepath_or_buffer(
         filepath_or_buffer, encoding, compression)
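
Note: the one-line change above matters because a missing `compression` key previously became `None`, which means "uncompressed" and skips inference. The following is a minimal sketch of that difference using only the standard library. `infer_compression_sketch` and its extension table are hypothetical stand-ins written for illustration; they are not pandas' `_infer_compression`.

```python
# Simplified stand-in for the inference step; NOT pandas' actual
# _infer_compression. The extension table is an assumption for illustration.
_EXTENSION_TO_COMPRESSION = {
    ".gz": "gzip",
    ".bz2": "bz2",
    ".zip": "zip",
    ".xz": "xz",
}


def infer_compression_sketch(path, compression):
    """Return the compression method to use for `path`, or None."""
    if compression is None:
        # None means "treat the file as uncompressed"; nothing is inferred.
        return None
    if compression == "infer":
        # Pick the method from the file extension, if it is a known one.
        for extension, method in _EXTENSION_TO_COMPRESSION.items():
            if path.endswith(extension):
                return method
        return None
    # An explicit method such as "gzip" is passed through unchanged.
    return compression


kwds = {}  # models read_fwf() called without a compression argument

# Before the fix: the missing key becomes None, so inference is skipped.
before = infer_compression_sketch("data.txt.gz", kwds.get("compression"))

# After the fix: the missing key becomes "infer", so the extension decides.
after = infer_compression_sketch("data.txt.gz", kwds.get("compression", "infer"))

print(before, after)
```

With the old default, a caller had to pass `compression="gzip"` explicitly to read a `.gz` fixed-width file; with the new default, the extension alone is enough.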

pandas/tests/io/parser/test_read_fwf.py

+10 -4
@@ -555,20 +555,26 @@ def test_default_delimiter():
     tm.assert_frame_equal(result, expected)
 
 
-@pytest.mark.parametrize("compression", ["gzip", "bz2"])
-def test_fwf_compression(compression):
+@pytest.mark.parametrize("infer", [True, False, None])
+def test_fwf_compression(compression_only, infer):
     data = """1111111111
 2222222222
 3333333333""".strip()
 
+    compression = compression_only
+    extension = "gz" if compression == "gzip" else compression
+
     kwargs = dict(widths=[5, 5], names=["one", "two"])
     expected = read_fwf(StringIO(data), **kwargs)
 
     if compat.PY3:
         data = bytes(data, encoding="utf-8")
 
-    with tm.ensure_clean() as path:
+    with tm.ensure_clean(filename="tmp." + extension) as path:
         tm.write_to_compressed(compression, path, data)
 
-        result = read_fwf(path, compression=compression, **kwargs)
+        if infer is not None:
+            kwargs["compression"] = "infer" if infer else compression
+
+        result = read_fwf(path, **kwargs)
         tm.assert_frame_equal(result, expected)
