Deprecation warning for SparseDataFrame shown for repr / every operation #26555

Closed
jorisvandenbossche opened this issue May 29, 2019 · 3 comments
Labels: Deprecate (Functionality to remove in pandas), Sparse (Sparse Data Type)


@jorisvandenbossche
Member

If you are still using the SparseDataFrame class, the warnings are quite sticky:

In [5]: sparse_df = pd.SparseDataFrame(np.array([[1,2, np.nan], [np.nan, 5, 6]]), columns=['a','b','c'])
/home/joris/miniconda3/envs/dev37/bin/ipython:1: FutureWarning: SparseDataFrame is deprecated and will be removed in a future version.
Use a regular DataFrame whose columns are SparseArrays instead.

See http://pandas.pydata.org/pandas-docs/stable/user_guide/sparse.html#migrating for more.

  #!/home/joris/miniconda3/envs/dev37/bin/python

In [6]: sparse_df                                      
Out[6]: /home/joris/scipy/pandas/pandas/core/frame.py:3337: FutureWarning: SparseSeries is deprecated and will be removed in a future version.
Use a Series with sparse values instead.

    >>> series = pd.Series(pd.SparseArray(...))

See http://pandas.pydata.org/pandas-docs/stable/user_guide/sparse.html#migrating for more.

  return klass(values, index=self.index, name=items, fastpath=True)

     a    b    c
0  1.0  2.0  NaN
1  NaN  5.0  6.0

In [7]: sparse_df + 1                               
/home/joris/scipy/pandas/pandas/core/frame.py:3337: FutureWarning: SparseSeries is deprecated and will be removed in a future version.
Use a Series with sparse values instead.

    >>> series = pd.Series(pd.SparseArray(...))

See http://pandas.pydata.org/pandas-docs/stable/user_guide/sparse.html#migrating for more.

  return klass(values, index=self.index, name=items, fastpath=True)
/home/joris/scipy/pandas/pandas/core/ops.py:2304: FutureWarning: SparseSeries is deprecated and will be removed in a future version.
Use a Series with sparse values instead.

    >>> series = pd.Series(pd.SparseArray(...))

See http://pandas.pydata.org/pandas-docs/stable/user_guide/sparse.html#migrating for more.

  name=self.name)
/home/joris/scipy/pandas/pandas/core/sparse/frame.py:298: FutureWarning: SparseDataFrame is deprecated and will be removed in a future version.
Use a regular DataFrame whose columns are SparseArrays instead.

See http://pandas.pydata.org/pandas-docs/stable/user_guide/sparse.html#migrating for more.

  default_fill_value=self.default_fill_value).__finalize__(self)
Out[7]: 
     a    b    c
0  2.0  3.0  NaN
1  NaN  6.0  7.0

So the warnings are raised each time you display the object (its repr) and on each operation (like the addition above).

I know users can filter them, but ideally they would be shown less often, so as not to annoy users too much.
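As a rough illustration of the user-side filtering mentioned above, here is a minimal sketch that uses only the standard-library `warnings` module. The `make_sparse_frame` function is a hypothetical stand-in that merely emits a FutureWarning with similar text; it is not pandas code, and the exact filter pandas users need may differ.

```python
import warnings


def make_sparse_frame():
    # Hypothetical stand-in for pd.SparseDataFrame(...): it only emits a
    # FutureWarning whose text starts like the real deprecation message.
    warnings.warn(
        "SparseDataFrame is deprecated and will be removed in a future version.",
        FutureWarning,
        stacklevel=2,
    )
    return "frame"


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # start from a state where everything is shown
    # Ignore only FutureWarnings whose message starts with "Sparse";
    # `message` is a regex matched against the start of the warning text.
    warnings.filterwarnings("ignore", message="Sparse", category=FutureWarning)
    make_sparse_frame()
    # An unrelated FutureWarning is still reported.
    warnings.warn("something else is deprecated", FutureWarning)

remaining = [str(w.message) for w in caught]
print(remaining)
```

Matching on the message prefix keeps the filter narrow, so other FutureWarnings are not hidden along with the sparse ones.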

cc @TomAugspurger

jorisvandenbossche added this to the 0.25.0 milestone on May 29, 2019
jorisvandenbossche added the Deprecate and Sparse labels on May 29, 2019
@TomAugspurger
Contributor

This is why I initially added docs on how to filter the warnings :)

So we can either add the once filter ourselves, or we can do some finer-grained filters in places like __repr__. Do you have a preference? All else equal, I'd just add a once filter in pandas/__init__.py since it's less work.
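For reference, this is roughly what the "once" action does in isolation, sketched with the standard library only. `deprecated_call` is a hypothetical stand-in for a deprecated constructor, and the sketch says nothing about how such a filter would interact with pandas internals.

```python
import warnings


def deprecated_call():
    # Hypothetical stand-in for a deprecated constructor.
    warnings.warn("SparseDataFrame is deprecated", FutureWarning, stacklevel=2)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("once", FutureWarning)
    deprecated_call()  # first call site: reported
    deprecated_call()  # second call site, same message: suppressed by "once"
    for _ in range(3):
        deprecated_call()  # repeated calls from one line: also suppressed

once_count = len(caught)
print(once_count)
```

With "once", a given message and category pair is reported a single time regardless of the source location, which is what makes it attractive as a low-effort option.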

@TomAugspurger
Contributor

I don't think a filter in pandas will do the trick. IIUC, the default is to show the warning once per source line.

import pandas as pd


def f():
    for i in range(2):
        print(f"{i}")
        pd.SparseDataFrame({"A": []})


def main():
    f()
    f()


if __name__ == '__main__':
    main()

This prints the warning twice:

0
foo.py:7: FutureWarning: SparseDataFrame is deprecated and will be removed in a future version.
Use a regular DataFrame whose columns are SparseArrays instead.

See http://pandas.pydata.org/pandas-docs/stable/user_guide/sparse.html#migrating for more.

  pd.SparseDataFrame({"A": []})
1
foo.py:7: FutureWarning: SparseDataFrame is deprecated and will be removed in a future version.
Use a regular DataFrame whose columns are SparseArrays instead.

See http://pandas.pydata.org/pandas-docs/stable/user_guide/sparse.html#migrating for more.

  pd.SparseDataFrame({"A": []})
0
1

Not much we can do about that. Which leaves us with finer-grained filters. I'll make a PR for the repr, but I don't think I'll go beyond that.
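A finer-grained filter of the kind described here could look something like the following sketch. `LegacyFrame` and its `_column` helper are hypothetical stand-ins written for illustration; this is not the code from the eventual pandas PR.

```python
import warnings


class LegacyFrame:
    """Hypothetical stand-in for a deprecated container class."""

    def __init__(self, data):
        self.data = data

    def _column(self, key):
        # Internal helper that builds a deprecated object and therefore warns.
        warnings.warn("SparseSeries is deprecated", FutureWarning, stacklevel=2)
        return self.data[key]

    def __repr__(self):
        # Silence only the known deprecation message, and only while the
        # repr is being built; the caller's filters are restored on exit.
        with warnings.catch_warnings():
            warnings.filterwarnings("ignore", "SparseSeries", FutureWarning)
            cols = {key: self._column(key) for key in self.data}
        return f"LegacyFrame({cols})"


frame = LegacyFrame({"a": [1.0, 2.0]})

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    text = repr(frame)   # no warning escapes from __repr__
    frame._column("a")   # direct use still warns

print(text, len(caught))
```

Scoping the filter to `__repr__` keeps the deprecation visible for explicit use while avoiding a warning on every display. One caveat is that `warnings.catch_warnings` changes process-wide filter state, so by default it is not thread-safe.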

@jorisvandenbossche
Member Author

Closing this. The repr has been fixed; operations that create a new SparseDataFrame/Series under the hood probably have not been. If somebody cares enough, we are open to a fix.
