
WARN read_table with infer_datetime_format doesn't show FutureWarning #51017


Closed
MarcoGorelli opened this issue Jan 27, 2023 · 11 comments · Fixed by #51048

@MarcoGorelli
Member

MarcoGorelli commented Jan 27, 2023

Running

import pandas as pd
import io

timestamp_format = '%Y-%d-%m %H:%M:%S'
date_index = pd.date_range(start='1900', end='2000')
dates_df = date_index.strftime(timestamp_format).to_frame(name='ts_col')
data = dates_df.to_csv()
df = pd.read_csv(
    io.StringIO(data),
    date_parser=lambda x: pd.to_datetime(x, format=timestamp_format),
    parse_dates=['ts_col'],
    infer_datetime_format=True,
    sep=',',
)

results in

t.py:34: UserWarning: The argument 'infer_datetime_format' is deprecated and will be removed in a future version. A strict version of it is now the default, see https://pandas.pydata.org/pdeps/0004-consistent-to-datetime-parsing.html. You can safely remove this argument.
  df = pd.read_csv(

However,

df = pd.read_table(
    io.StringIO(data),
    date_parser=lambda x: pd.to_datetime(x, format=timestamp_format),
    parse_dates=['ts_col'],
    infer_datetime_format=True,
    sep=',',
)

shows no warning

Task is just to add a warning to this function

def read_table(
    filepath_or_buffer: FilePath | ReadCsvBuffer[bytes] | ReadCsvBuffer[str],
    *,
    sep: str | None | lib.NoDefault = lib.no_default,
    delimiter: str | None | lib.NoDefault = None,
    # Column and Index Locations and Names
    header: int | Sequence[int] | None | Literal["infer"] = "infer",
    names: Sequence[Hashable] | None | lib.NoDefault = lib.no_default,
    index_col: IndexLabel | Literal[False] | None = None,
    usecols=None,
    # General Parsing Configuration
    dtype: DtypeArg | None = None,
    engine: CSVEngine | None = None,
    converters=None,
    true_values=None,
    false_values=None,
    skipinitialspace: bool = False,
    skiprows=None,
    skipfooter: int = 0,
    nrows: int | None = None,
    # NA and Missing Data Handling
    na_values=None,
    keep_default_na: bool = True,
    na_filter: bool = True,
    verbose: bool = False,
    skip_blank_lines: bool = True,
    # Datetime Handling
    parse_dates: bool | Sequence[Hashable] = False,
    infer_datetime_format: bool | lib.NoDefault = lib.no_default,
    keep_date_col: bool = False,
    date_parser=None,
    dayfirst: bool = False,
    cache_dates: bool = True,
    # Iteration
    iterator: bool = False,
    chunksize: int | None = None,
    # Quoting, Compression, and File Format
    compression: CompressionOptions = "infer",
    thousands: str | None = None,
    decimal: str = ".",
    lineterminator: str | None = None,
    quotechar: str = '"',
    quoting: int = csv.QUOTE_MINIMAL,
    doublequote: bool = True,
    escapechar: str | None = None,
    comment: str | None = None,
    encoding: str | None = None,
    encoding_errors: str | None = "strict",
    dialect: str | csv.Dialect | None = None,
    # Error Handling
    on_bad_lines: str = "error",
    # Internal
    delim_whitespace: bool = False,
    low_memory=_c_parser_defaults["low_memory"],
    memory_map: bool = False,
    float_precision: str | None = None,
    storage_options: StorageOptions = None,
    use_nullable_dtypes: bool | lib.NoDefault = lib.no_default,
) -> DataFrame | TextFileReader:

similarly to how it is already done here:

if infer_datetime_format is not lib.no_default:
    warnings.warn(
        "The argument 'infer_datetime_format' is deprecated and will "
        "be removed in a future version. "
        "A strict version of it is now the default, see "
        "https://pandas.pydata.org/pdeps/0004-consistent-to-datetime-parsing.html. "
        "You can safely remove this argument.",
        stacklevel=find_stack_level(),
    )
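For anyone picking this up, here is a minimal, standard-library-only sketch of the sentinel pattern the check above relies on. `_NO_DEFAULT` and `read_table_sketch` are hypothetical stand-ins (for `lib.no_default` and the real `read_table`), `stacklevel=2` replaces pandas' `find_stack_level()`, and the `FutureWarning` category follows the issue title rather than the snippet above.

```python
import warnings

# Hypothetical stand-in for pandas' lib.no_default sentinel.
_NO_DEFAULT = object()


def read_table_sketch(data, *, infer_datetime_format=_NO_DEFAULT):
    """Toy reader: warns only when the deprecated keyword is passed explicitly."""
    if infer_datetime_format is not _NO_DEFAULT:
        # Any explicit value, True or False, counts as "the user passed it".
        warnings.warn(
            "The argument 'infer_datetime_format' is deprecated and will "
            "be removed in a future version.",
            FutureWarning,
            stacklevel=2,  # point at the caller, not at this function
        )
    return data.splitlines()
```

Because the default is a sentinel rather than `False`, calling `read_table_sketch("a\nb")` stays silent while `read_table_sketch("a\nb", infer_datetime_format=False)` still warns.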

Finally, you'll need to add a test: you can duplicate

def test_parse_dates_infer_datetime_format_warning(all_parsers):
    # GH 49024
    parser = all_parsers
    data = "Date,test\n2012-01-01,1\n,2"
    parser.read_csv_check_warnings(
        UserWarning,
        "The argument 'infer_datetime_format' is deprecated",
        StringIO(data),
        parse_dates=["Date"],
        infer_datetime_format=True,
    )

but for read_table (or parametrise over parser.read_csv_check_warnings and parser.read_table_check_warnings). Note that you'll need to pass sep=',' for read_table, since its default separator is a tab.

@MarcoGorelli added the Warnings and good first issue labels Jan 27, 2023
@MarcoGorelli added this to the 2.0 milestone Jan 27, 2023
@MarcoGorelli
Member Author

To parametrise, you might want to use

@pytest.mark.parametrize( 'reader', [ 'read_csv_check_warnings', 'read_table_check_warnings' ])

and then use getattr(parser, reader) instead of parser.read_csv_check_warnings
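A rough illustration of the `getattr` dispatch, kept free of pytest and pandas so it runs anywhere. `FakeParser` and `run_reader` are hypothetical stand-ins for the `all_parsers` fixture object and the test body; only the two method names come from the suggestion above.

```python
class FakeParser:
    """Hypothetical stand-in for the object provided by `all_parsers`."""

    def read_csv_check_warnings(self, data):
        return ("read_csv", data)

    def read_table_check_warnings(self, data):
        return ("read_table", data)


def run_reader(parser, reader, data):
    # getattr looks the method up by its string name, so one test body
    # can serve every name in the parametrize list.
    return getattr(parser, reader)(data)


results = [
    run_reader(FakeParser(), name, "Date,test")
    for name in ("read_csv_check_warnings", "read_table_check_warnings")
]
```

In the real test, the loop would be replaced by the `@pytest.mark.parametrize` decorator, with `reader` arriving as a test argument.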

@MarcoGorelli
Member Author

Also, this should be a FutureWarning
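For context on why the category matters: `warnings.warn` falls back to `UserWarning` when no category is passed, which is why the output quoted in the issue says `UserWarning`. A small standard-library sketch (the message strings are made up for illustration):

```python
import warnings

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # No category argument: Python uses UserWarning.
    warnings.warn("no category given")
    # Explicit category, as requested for the deprecation message.
    warnings.warn("explicit category", FutureWarning)

categories = [w.category for w in caught]
```

Here `categories` holds `UserWarning` first and `FutureWarning` second, so the fix amounts to passing `FutureWarning` as the second argument to `warnings.warn`.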

@kathleenhang
Contributor

take

@kathleenhang
Contributor

Running your first code snippet which should have produced a FutureWarning, I did not receive any warning.

I tried figuring out how to enable warnings in pandas by running pd.reset_option('all').

This made other FutureWarnings appear, but it still didn't produce a FutureWarning like the one you received: "FutureWarning: The argument 'date_parser' is deprecated..."

I'm using pandas v1.5.3. Do you know why this is the case?

For now, I'll work on adding in the warning for the second code snippet and seeing if that one appears properly.

@MarcoGorelli
Member Author

hey @kathleenhang ,

sorry, my bad, I was running that from a branch. updating now, sorry for the confusion, thanks for having asked

@MarcoGorelli
Member Author

@kathleenhang I've updated the example in the issue - do you receive a warning now if you run it?

@kathleenhang
Contributor

@MarcoGorelli Hey there, I still don't receive any warning. I'm using Python 3.9.1 if that helps.

@MarcoGorelli
Member Author

Does it work if you run

pytest pandas/tests/io/parser/test_parse_dates.py -k test_parse_dates_infer_datetime_format_warning

?

@kathleenhang
Contributor

I ran it twice in two different folders. Here are the results:

[Two screenshots of the runs, taken 2023-01-27; images not preserved]

pandas-kathleenhang is my forked pandas repo
pandas-dev just contains a file called t.py

@MarcoGorelli
Member Author

looks like you need to rebuild the C extensions: https://pandas.pydata.org/docs/dev/development/contributing_environment.html#step-3-build-and-install-pandas

@kathleenhang
Contributor

I also was not inside of my Docker virtual environment. I just set that up yesterday, and I am still understanding how it works, its purpose, and when I should have it activated.

I had built my C extensions inside of the Docker virtual environment, but since I was working outside of the virtual environment, it didn't have the updated C extensions.

I see the warning now. Thanks @MarcoGorelli !
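In case it helps others who hit the same mismatch, one way to check which interpreter and which copy of a package the current environment would use is sketched below. `describe_import` is a hypothetical helper, not part of pandas; it only uses the standard library.

```python
import importlib.util
import sys


def describe_import(module_name):
    """Report the running interpreter and where a top-level module would load from."""
    spec = importlib.util.find_spec(module_name)
    return {
        "interpreter": sys.executable,
        # None means the module is not importable from this environment.
        "origin": None if spec is None else spec.origin,
    }
```

Running `describe_import("pandas")` once inside the Docker environment and once outside should show different paths if the two environments do not share the same build.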
