Skip to content

TYP: check_untyped_defs io.sas.sasreader #30659

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
23 changes: 16 additions & 7 deletions pandas/io/sas/sasreader.py
Original file line number Diff line number Diff line change
@@ -1,16 +1,24 @@
"""
Read SAS sas7bdat or xport files.
"""
from typing import TYPE_CHECKING, AnyStr, Optional, Union

from pandas._typing import FilePathOrBuffer

from pandas.io.common import stringify_path

if TYPE_CHECKING:
from pandas.io.sas.sas_xport import XportReader # noqa: F401
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

acutally i think you can just import both readers at the top and call it a day.

from pandas.io.sas.sas7bdat import SAS7BDATReader # noqa: F401


def read_sas(
filepath_or_buffer,
format=None,
filepath_or_buffer: FilePathOrBuffer[AnyStr],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we should change the definition in _typing?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you elaborate?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FilePathOrBuffer[AnyStr] should be in _typing instead of just the not parametric FilePathOrBuffer

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for some read/write methods only a string buffer (or bytes buffer) is allowed, i.e. to_html and we need to annotate as FilePathOrBuffer[str] (or FilePathOrBuffer[bytes]) otherwise we get mypy errors.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that’s a very limited case compared to the more general which accepts strings and buffers

these should be distinct / parametrized

not just in this PR - this is my point

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure I understand, AnyStr is a TypeVar and therefore parametrized https://docs.python.org/3/library/typing.html#typing.AnyStr

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so, you are adding this here, rather than as a generailized soln in _typing. I would think it would be better there than adding to each of the IO readers.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i reverted these additions for now to keep this PR focused on the untyped_defs error, #30659 (comment)

format: Optional[str] = None,
index=None,
encoding=None,
chunksize=None,
iterator=False,
encoding: Optional[str] = None,
chunksize: Optional[int] = None,
iterator: bool = False,
):
"""
Read SAS files stored as either XPORT or SAS7BDAT format files.
Expand Down Expand Up @@ -63,14 +71,15 @@ def read_sas(
else:
raise ValueError("unable to infer format of SAS file")

reader: Union["XportReader", "SAS7BDATReader"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do these have a shared base class that is directly importable? Seems like this requires a bit of import gymnastics not sure worth it

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

they share abc.Iterable ...

pandas\io\sas\sasreader.py:93: error: "Iterator[Any]" has no attribute "read"
pandas\io\sas\sasreader.py:94: error: "Iterator[Any]" has no attribute "close"

many options:

Use protocol (python 3.8) from typing_extensions
define abc base class with read and close methods (add comment to use Protocol from py3.8)
use dynamically typed code, i.e. reader: Any
use ignore
leave as is with import gymnastics and Union.

we normally try to avoid most of these.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

define abc base class with read and close methods (add comment to use Protocol from py3.8)

gone with this option.

if format.lower() == "xport":
from pandas.io.sas.sas_xport import XportReader
from pandas.io.sas.sas_xport import XportReader # noqa: F811

reader = XportReader(
filepath_or_buffer, index=index, encoding=encoding, chunksize=chunksize
)
elif format.lower() == "sas7bdat":
from pandas.io.sas.sas7bdat import SAS7BDATReader
from pandas.io.sas.sas7bdat import SAS7BDATReader # noqa: F811

reader = SAS7BDATReader(
filepath_or_buffer, index=index, encoding=encoding, chunksize=chunksize
Expand Down
3 changes: 0 additions & 3 deletions setup.cfg
Original file line number Diff line number Diff line change
Expand Up @@ -313,9 +313,6 @@ check_untyped_defs=False
[mypy-pandas.io.sas.sas7bdat]
check_untyped_defs=False

[mypy-pandas.io.sas.sasreader]
check_untyped_defs=False

[mypy-pandas.io.sql]
check_untyped_defs=False

Expand Down