Skip to content

Remove Internal Imports from Top Level Pandas Module #25923

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
WillAyd opened this issue Mar 29, 2019 · 4 comments
Closed

Remove Internal Imports from Top Level Pandas Module #25923

WillAyd opened this issue Mar 29, 2019 · 4 comments
Labels
Typing type annotations, mypy/pyright type checking

Comments

@WillAyd
Copy link
Member

WillAyd commented Mar 29, 2019

Mostly driven by MyPy (which has it's own import mechanism) we get 67 failures as of now on master for imports coming from the top level pandas namespace:

pandas/io/stata.py:32: error: Module 'pandas' has no attribute 'DatetimeIndex'
pandas/io/stata.py:32: error: Module 'pandas' has no attribute 'concat'
pandas/io/stata.py:32: error: Module 'pandas' has no attribute 'isna'
pandas/io/stata.py:32: error: Module 'pandas' has no attribute 'to_datetime'
pandas/io/stata.py:32: error: Module 'pandas' has no attribute 'to_timedelta'
pandas/io/pytables.py:30: error: Module 'pandas' has no attribute 'DataFrame'
pandas/io/pytables.py:30: error: Module 'pandas' has no attribute 'DatetimeIndex'
pandas/io/pytables.py:30: error: Module 'pandas' has no attribute 'Index'
pandas/io/pytables.py:30: error: Module 'pandas' has no attribute 'Int64Index'
pandas/io/pytables.py:30: error: Module 'pandas' has no attribute 'MultiIndex'
pandas/io/pytables.py:30: error: Module 'pandas' has no attribute 'PeriodIndex'
pandas/io/pytables.py:30: error: Module 'pandas' has no attribute 'Series'
pandas/io/pytables.py:30: error: Module 'pandas' has no attribute 'SparseDataFrame'
pandas/io/pytables.py:30: error: Module 'pandas' has no attribute 'SparseSeries'
pandas/io/pytables.py:30: error: Module 'pandas' has no attribute 'TimedeltaIndex'
pandas/io/pytables.py:30: error: Module 'pandas' has no attribute 'concat'
pandas/io/pytables.py:30: error: Module 'pandas' has no attribute 'isna'
pandas/io/pytables.py:30: error: Module 'pandas' has no attribute 'to_datetime'
pandas/io/parquet.py:8: error: Module 'pandas' has no attribute 'DataFrame'
pandas/io/packers.py:58: error: Module 'pandas' has no attribute 'Categorical'
pandas/io/packers.py:58: error: Module 'pandas' has no attribute 'CategoricalIndex'
pandas/io/packers.py:58: error: Module 'pandas' has no attribute 'DataFrame'
pandas/io/packers.py:58: error: Module 'pandas' has no attribute 'DatetimeIndex'
pandas/io/packers.py:58: error: Module 'pandas' has no attribute 'Float64Index'
pandas/io/packers.py:58: error: Module 'pandas' has no attribute 'Index'
pandas/io/packers.py:58: error: Module 'pandas' has no attribute 'Int64Index'
pandas/io/packers.py:58: error: Module 'pandas' has no attribute 'Interval'
pandas/io/packers.py:58: error: Module 'pandas' has no attribute 'IntervalIndex'
pandas/io/packers.py:58: error: Module 'pandas' has no attribute 'MultiIndex'
pandas/io/packers.py:58: error: Module 'pandas' has no attribute 'NaT'
pandas/io/packers.py:58: error: Module 'pandas' has no attribute 'Panel'
pandas/io/packers.py:58: error: Module 'pandas' has no attribute 'Period'
pandas/io/packers.py:58: error: Module 'pandas' has no attribute 'PeriodIndex'
pandas/io/packers.py:58: error: Module 'pandas' has no attribute 'RangeIndex'
pandas/io/packers.py:58: error: Module 'pandas' has no attribute 'Series'
pandas/io/packers.py:58: error: Module 'pandas' has no attribute 'TimedeltaIndex'
pandas/io/packers.py:58: error: Module 'pandas' has no attribute 'Timestamp'
pandas/io/html.py:17: error: Module 'pandas' has no attribute 'Series'
pandas/io/feather_format.py:7: error: Module 'pandas' has no attribute 'DataFrame'
pandas/io/feather_format.py:7: error: Module 'pandas' has no attribute 'Int64Index'
pandas/io/feather_format.py:7: error: Module 'pandas' has no attribute 'RangeIndex'
pandas/io/json/table_schema.py:15: error: Module 'pandas' has no attribute 'DataFrame'
pandas/io/json/normalize.py:11: error: Module 'pandas' has no attribute 'DataFrame'
pandas/io/json/json.py:14: error: Module 'pandas' has no attribute 'DataFrame'
pandas/io/json/json.py:14: error: Module 'pandas' has no attribute 'MultiIndex'
pandas/io/json/json.py:14: error: Module 'pandas' has no attribute 'Series'
pandas/io/json/json.py:14: error: Module 'pandas' has no attribute 'isna'
pandas/io/json/json.py:14: error: Module 'pandas' has no attribute 'to_datetime'
pandas/core/reshape/tile.py:16: error: Module 'pandas' has no attribute 'Categorical'
pandas/core/reshape/tile.py:16: error: Module 'pandas' has no attribute 'Index'
pandas/core/reshape/tile.py:16: error: Module 'pandas' has no attribute 'Interval'
pandas/core/reshape/tile.py:16: error: Module 'pandas' has no attribute 'IntervalIndex'
pandas/core/reshape/tile.py:16: error: Module 'pandas' has no attribute 'Series'
pandas/core/reshape/tile.py:16: error: Module 'pandas' has no attribute 'Timedelta'
pandas/core/reshape/tile.py:16: error: Module 'pandas' has no attribute 'Timestamp'
pandas/core/reshape/tile.py:16: error: Module 'pandas' has no attribute 'to_datetime'
pandas/core/reshape/tile.py:16: error: Module 'pandas' has no attribute 'to_timedelta'
pandas/core/reshape/merge.py:25: error: Module 'pandas' has no attribute 'Categorical'
pandas/core/reshape/merge.py:25: error: Module 'pandas' has no attribute 'DataFrame'
pandas/core/reshape/merge.py:25: error: Module 'pandas' has no attribute 'Index'
pandas/core/reshape/merge.py:25: error: Module 'pandas' has no attribute 'MultiIndex'
pandas/core/reshape/merge.py:25: error: Module 'pandas' has no attribute 'Series'
pandas/core/reshape/merge.py:25: error: Module 'pandas' has no attribute 'Timedelta'
pandas/core/reshape/concat.py:9: error: Module 'pandas' has no attribute 'DataFrame'
pandas/core/reshape/concat.py:9: error: Module 'pandas' has no attribute 'Index'
pandas/core/reshape/concat.py:9: error: Module 'pandas' has no attribute 'MultiIndex'
pandas/core/reshape/concat.py:9: error: Module 'pandas' has no attribute 'Series'

Specifically with MyPy doing explicit imports of these versus current wildcard helps, as does importing directly from the modules where these are defined.

With MyPy if we wanted to we could always ignore these imports, but at the same time I was wondering if we really should be importing from the top level within the code base as it theoretically makes our import machinery more convoluted than it needs to be.

I spoke with @jorisvandenbossche on Gitter yesterday about this and there were two things we discussed:

  • Option 1: Always import from where the object is defined, so you would always do from pandas.core.frame import DataFrame
  • Option 2: Leverage the API of particular subpackages when outside of them, import directly within. So if you are in pandas.core you would still do from pandas.core.frame import DataFrame, but outside of core you would do from pandas.core.api import DataFrame

FWIW I think option 2 is kind of nice, because it abstracts where within sub packages we choose to place particular items and allows things like from pandas.core.api import DataFrame, Series instead of having two separate imports

Curious if anyone has any thoughts or objections to either of the options presented above versus current allowance of imports from top level

@WillAyd WillAyd added the Needs Discussion Requires discussion from core team before further action label Mar 29, 2019
@jreback
Copy link
Contributor

jreback commented Mar 30, 2019

we should always be using the pandas.core.frame import, but this was never enforced. We have a number of places, to avoid circular imports, where we need to import Dataframe and/or Series inside functions; these are generally from pandas import DataFrame, but could also be fully specified imports.

Note we generally also should/mostly are using ABCDataFrame et all when merely type checking (and these are freely importable at the top level), or _constructor from an existing frame, so explicit imports is mostly not needed.

So option 1 lgtm (though could be option 2 as well, not a big deal)

@WillAyd
Copy link
Member Author

WillAyd commented Mar 30, 2019

Great thanks for the direction Jeff.

Note we generally also should/mostly are using ABCDataFrame et all when merely type checking (and these are freely importable at the top level), or _constructor from an existing frame, so explicit imports is mostly not needed.

Yea this issue was mostly spurred by existing places in the code base where we are already importing DataFrame. Good to have this as a standard when we need to add typing to modules where that isn't already explicitly imported though

@WillAyd
Copy link
Member Author

WillAyd commented Apr 10, 2019

Alternately solved by #26019

@WillAyd WillAyd closed this as completed Apr 10, 2019
@WillAyd WillAyd added Typing type annotations, mypy/pyright type checking and removed Needs Discussion Requires discussion from core team before further action labels Apr 10, 2019
@jorisvandenbossche
Copy link
Member

Ah, that seems as the better solution in the end!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Typing type annotations, mypy/pyright type checking
Projects
None yet
Development

No branches or pull requests

3 participants