Commit 171a640

NasaGeek authored and gfyoung committed
BUG: Infer compression by default in read_fwf()
Closes gh-22199.
1 parent f6cf7d9 commit 171a640

3 files changed, +10 -4 lines changed

doc/source/whatsnew/v0.24.0.rst

+1
@@ -1534,6 +1534,7 @@ Notice how we now instead output ``np.nan`` itself instead of a stringified form
 - Bug in :meth:`DataFrame.to_dict` when the resulting dict contains non-Python scalars in the case of numeric data (:issue:`23753`)
 - :func:`DataFrame.to_string()`, :func:`DataFrame.to_html()`, :func:`DataFrame.to_latex()` will correctly format output when a string is passed as the ``float_format`` argument (:issue:`21625`, :issue:`22270`)
 - Bug in :func:`read_csv` that caused it to raise ``OverflowError`` when trying to use 'inf' as ``na_value`` with integer index column (:issue:`17128`)
+- Bug in :func:`read_fwf` in which the compression type of a file was not being properly inferred (:issue:`22199`)
 - Bug in :func:`pandas.io.json.json_normalize` that caused it to raise ``TypeError`` when two consecutive elements of ``record_path`` are dicts (:issue:`22706`)
 - Bug in :meth:`DataFrame.to_stata`, :class:`pandas.io.stata.StataWriter` and :class:`pandas.io.stata.StataWriter117` where a exception would leave a partially written and invalid dta file (:issue:`23573`)
 - Bug in :meth:`DataFrame.to_stata` and :class:`pandas.io.stata.StataWriter117` that produced invalid files when using strLs with non-ASCII characters (:issue:`23573`)

pandas/io/parsers.py

+1 -1
@@ -401,7 +401,7 @@ def _read(filepath_or_buffer, kwds):
         encoding = re.sub('_', '-', encoding).lower()
         kwds['encoding'] = encoding
 
-    compression = kwds.get('compression')
+    compression = kwds.get('compression', 'infer')
     compression = _infer_compression(filepath_or_buffer, compression)
     filepath_or_buffer, _, compression, should_close = get_filepath_or_buffer(
         filepath_or_buffer, encoding, compression)
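
The effect of this one-line change can be sketched with the standard library alone. `dict.get` returns `None` when the key is missing, and `None` is presumably read as "no compression", so a caller that never set `compression` got no inference. The `infer_compression` function and its extension table below are hypothetical stand-ins written for illustration. They are not pandas' private `_infer_compression`.

```python
# Hypothetical stand-in for pandas' private _infer_compression helper;
# the extension table and function name are assumptions for illustration.
EXTENSIONS = {".gz": "gzip", ".bz2": "bz2", ".zip": "zip", ".xz": "xz"}


def infer_compression(path, compression):
    """Resolve 'infer' from the file extension; pass other values through."""
    if compression is None:
        return None  # caller asked for no decompression
    if compression == "infer":
        for ext, method in EXTENSIONS.items():
            if path.endswith(ext):
                return method
        return None  # unknown extension: treat as uncompressed
    return compression  # explicit method such as "gzip"


# Assumption: a caller that passes no compression argument leaves the key absent.
kwds = {"widths": [5, 5]}

# Old lookup: the missing key yields None, so nothing is inferred.
before = infer_compression("tmp.gz", kwds.get("compression"))

# New lookup: the missing key falls back to "infer".
after = infer_compression("tmp.gz", kwds.get("compression", "infer"))
```

In this sketch `before` is `None` and `after` is `"gzip"`, which is the behavioural difference the fix targets.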

pandas/tests/io/parser/test_read_fwf.py

+8 -3
@@ -556,19 +556,24 @@ def test_default_delimiter():
 
 
 @pytest.mark.parametrize("compression", ["gzip", "bz2"])
-def test_fwf_compression(compression):
+@pytest.mark.parametrize("infer", [True, False, None])
+def test_fwf_compression(compression, infer):
     data = """1111111111
     2222222222
     3333333333""".strip()
 
+    extension = "gz" if compression == "gzip" else "bz2"
     kwargs = dict(widths=[5, 5], names=["one", "two"])
     expected = read_fwf(StringIO(data), **kwargs)
 
     if compat.PY3:
         data = bytes(data, encoding="utf-8")
 
-    with tm.ensure_clean() as path:
+    with tm.ensure_clean(filename="tmp." + extension) as path:
         tm.write_to_compressed(compression, path, data)
 
-        result = read_fwf(path, compression=compression, **kwargs)
+        if infer is not None:
+            kwargs["compression"] = ("infer" if infer else compression)
+
+        result = read_fwf(path, **kwargs)
         tm.assert_frame_equal(result, expected)
