Commit 0580391

DOC Address comments
1 parent add7a8a commit 0580391

1 file changed

+18
-6
lines changed

slep018/proposal.rst

Lines changed: 18 additions & 6 deletions
Original file line number | Diff line number | Diff line change
@@ -65,6 +65,21 @@ transformers::
6565
The global default configuration is ``"default"`` where the transformer
6666
determines the output container.
6767

68+
The configuration can also be set locally using the ``config_context`` context
69+
manager:
70+
71+
from sklearn import config_context
72+
with config_context(transform_output="pandas"):
73+
    num_prep = make_pipeline(SimpleImputer(), StandardScaler(), PCA())
74+
    num_prep.fit_transform(X_df)
75+
76+
The following specifies the precedence levels for the three ways to configure
77+
the output container:
78+
79+
1. Locally configure a transformer: ``transformer.set_output``
80+
2. Context manager: ``config_context``
81+
3. Global configuration: ``set_config``
82+
6883
Implementation
6984
--------------
7085

@@ -84,10 +99,7 @@ Alternatives to this SLEP includes:
8499

85100
1. `SLEP014 <https://github.com/scikit-learn/enhancement_proposals/pull/37>`__
86101
proposes that if the input is a DataFrame then the output is a DataFrame.
87-
2. :ref:`SLEP012 <slep_012>` proposes a custom scikit-learn container for dense
88-
and sparse data that contains feature names. This SLEP also proposes a custom
89-
container for sparse data, but pandas for dense data.
90-
3. Prototype `#20100
102+
2. Prototype `#20100
91103
<https://github.com/scikit-learn/scikit-learn/pull/20100>`__ showcases
92104
``array_out="pandas"`` in `transform`. This API is limited because it does not
93105
directly support fitting on a pipeline where the steps require data frames
@@ -107,8 +119,8 @@ For information only!
107119
Sparse Data
108120
...........
109121

110-
The Pandas DataFrame is not suitable to provide column names because it has
111-
performance issues as shown in `#16772
122+
The Pandas DataFrame is not suitable to provide column names for sparse data
123+
because it has performance issues as shown in `#16772
112124
<https://github.com/scikit-learn/scikit-learn/pull/16772#issuecomment-615423097>`__.
113125
A future extension to this SLEP is to have a ``"pandas_or_namedsparse"`` option.
114126
This option will use a scikit-learn specific sparse container that subclasses
