|
| 1 | +================================================== |
| 2 | +Clone Override Protocol with ``__sklearn_clone__`` |
| 3 | +================================================== |
| 4 | + |
| 5 | +:Author: Joel Nothman |
| 6 | +:Status: Draft |
| 7 | +:Type: Standards Track |
| 8 | +:Created: 2022-03-19 |
| 9 | +:Resolution: (required for Accepted | Rejected | Withdrawn) |
| 10 | + |
| 11 | +Abstract |
| 12 | +-------- |
| 13 | + |
| 14 | +The ability to clone Scikit-learn estimators -- removing any state due to |
| 15 | +previous fitting -- is essential to ensuring estimator configurations are |
| 16 | +reusable across multiple instances in cross validation. |
| 17 | +A centralised implementation of :func:`sklearn.base.clone` regards |
| 18 | +an estimator's constructor parameters as the state that should be copied. |
| 19 | +This proposal allows for an estimator class to implement custom cloning |
| 20 | +functionality with a ``__sklearn_clone__`` method, which will default to |
| 21 | +the current ``clone`` behaviour. |
| 22 | + |
| 23 | +Detailed description |
| 24 | +-------------------- |
| 25 | + |
| 26 | +Cloning estimators is one way that Scikit-learn ensures that there is no |
| 27 | +data leakage across data splits in cross-validation: by only copying an |
| 28 | +estimator's configuration, with no data from previous fitting, the |
| 29 | +estimator must fit with a cold start. Cloning an estimator often also |
| 30 | +occurs prior to parallelism, ensuring that a minimal version of the |
| 31 | +estimator -- without a large stored model -- is serialised and distributed. |
| 32 | + |
| 33 | +Cloning is currently governed by the implementation of |
| 34 | +:func:`sklearn.base.clone`, which recursively descends and copies the |
| 35 | +parameters of the passed object. For an estimator, it constructs a new |
| 36 | +instance of the estimator's class, passing to it cloned versions of the |
| 37 | +parameter values returned by its ``get_params``. It then performs some |
| 38 | +sanity checks to ensure that the values passed to the constructor are |
| 39 | +identical to what is then returned by the clone's ``get_params``. |
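
The behaviour described above can be sketched in plain Python. This is a simplified stand-in, not the actual source of :func:`sklearn.base.clone`: ``ToyEstimator`` and ``simple_clone`` are hypothetical names introduced for illustration, and the real function handles further cases (such as collections of estimators) that are omitted here.

```python
import copy


class ToyEstimator:
    """Hypothetical estimator following the ``get_params`` convention."""

    def __init__(self, alpha=1.0, tags=None):
        self.alpha = alpha
        self.tags = tags

    def get_params(self, deep=True):
        return {"alpha": self.alpha, "tags": self.tags}

    def fit(self, X):
        # Fitted state: stored on the instance, but not a constructor parameter.
        self.mean_ = sum(X) / len(X)
        return self


def simple_clone(estimator):
    """Simplified illustration of the current cloning logic (assumption)."""
    if not hasattr(estimator, "get_params"):
        # Ordinary parameter values are deep-copied.
        return copy.deepcopy(estimator)
    # Recursively clone each constructor parameter.
    params = {
        name: simple_clone(value)
        for name, value in estimator.get_params(deep=False).items()
    }
    new = type(estimator)(**params)
    # Sanity check: the constructor must store each value unchanged.
    new_params = new.get_params(deep=False)
    for name, value in params.items():
        if new_params[name] is not value:
            raise RuntimeError(f"constructor modified parameter {name!r}")
    return new


fitted = ToyEstimator(alpha=0.5, tags=["a"]).fit([1.0, 2.0, 3.0])
fresh = simple_clone(fitted)  # same configuration, no fitted state
```

In this sketch the clone keeps ``alpha`` and a copy of ``tags`` but has no ``mean_`` attribute, which is the cold-start property the section describes.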
| 40 | + |
| 41 | +The current equivalence between constructor parameters and what is cloned |
| 42 | +means that whenever an estimator or library developer deems it necessary |
| 43 | +to have further configuration of an estimator reproduced in a clone, |
| 44 | +they must include this configuration as a constructor parameter. |
| 45 | + |
| 46 | +Cases where this need has been raised in Scikit-learn development include: |
| 47 | + |
| 48 | +* ensuring metadata requests are cloned with an estimator |
| 49 | +* ensuring parameter spaces are cloned with an estimator |
| 50 | +* building a simple wrapper that can "freeze" a pre-fitted estimator |
| 51 | +* allowing existing options for using prefitted models in ensembles |
| 52 | + to work under cloning |
| 53 | + |
| 54 | +The current design also limits the ability for an estimator developer to |
| 55 | +define an exception to the sanity checks (see :issue:`15371`). |
| 56 | + |
| 57 | +This proposal empowers estimator developers to extend the base implementation |
| 58 | +of ``clone`` by providing a ``__sklearn_clone__`` method, which ``clone`` will |
| 59 | +delegate to when available. The default implementation will match current |
| 60 | +``clone`` behaviour. It will be provided through |
| 61 | +``BaseEstimator.__sklearn_clone__`` but also |
| 62 | +provided for estimators not inheriting from :obj:`~sklearn.base.BaseEstimator`. |
| 63 | + |
| 64 | +This shifts the paradigm from ``clone`` being a fixed operation that |
| 65 | +Scikit-learn must be able to perform on an estimator to ``clone`` being a |
| 66 | +behaviour that each Scikit-learn compatible estimator may implement. |
| 67 | + |
| 68 | +Developers who define ``__sklearn_clone__`` are expected to be responsible |
| 69 | +for maintaining the fundamental properties of cloning. Ordinarily, they |
| 70 | +can achieve this through use of ``super().__sklearn_clone__``. Core behaviours, |
| 71 | +such as constructor parameters being preserved through ``clone`` operations, |
| 72 | +can be ensured through estimator checks. |
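
As a minimal sketch of that pattern, the following shows an override that extends the default via ``super().__sklearn_clone__`` rather than replacing it. Everything here is an assumption made for illustration: ``ToyBase`` stands in for a base class providing the default hook, and ``set_param_space`` is a hypothetical configuration method, not an existing Scikit-learn API.

```python
import copy


class ToyBase:
    """Hypothetical base class providing the default clone hook."""

    def get_params(self, deep=True):
        raise NotImplementedError

    def __sklearn_clone__(self):
        # Default behaviour: rebuild from constructor parameters only.
        params = {k: copy.deepcopy(v) for k, v in self.get_params().items()}
        return type(self)(**params)


class WithParamSpace(ToyBase):
    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def get_params(self, deep=True):
        return {"alpha": self.alpha}

    def set_param_space(self, **space):
        # Hypothetical configuration set through a method, not __init__.
        self._param_space = space
        return self

    def fit(self, X):
        self.n_seen_ = len(X)  # fitted state, must not survive cloning
        return self

    def __sklearn_clone__(self):
        # Keep the core guarantees by delegating to the default first...
        new = super().__sklearn_clone__()
        # ...then carry over the extra configuration.
        if hasattr(self, "_param_space"):
            new._param_space = copy.deepcopy(self._param_space)
        return new


est = WithParamSpace(alpha=0.1).set_param_space(alpha=[0.1, 1.0]).fit([1, 2])
cloned = est.__sklearn_clone__()
```

Because the override only adds to what the default returns, constructor parameters are still preserved and fitted attributes such as ``n_seen_`` are still dropped.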
| 73 | + |
| 74 | +Implementation |
| 75 | +-------------- |
| 76 | + |
| 77 | +Implementing this SLEP will require: |
| 78 | + |
| 79 | +1. Factoring out ``clone_parametrized`` from ``clone``, being the portion of its |
| 80 | + implementation that handles objects with ``get_params``. |
| 81 | +2. Modifying ``clone`` to call ``__sklearn_clone__`` when available on an |
| 82 | + object with ``get_params``, or ``clone_parametrized`` when not available. |
| 83 | +3. Defining ``BaseEstimator.__sklearn_clone__`` to call ``clone_parametrized``. |
| 84 | +4. Documenting the above. |
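
Steps 1 to 3 can be sketched as follows. This is a toy model under stated assumptions, not the proposed Scikit-learn code: the ``BaseEstimator`` below is a local stand-in whose ``get_params`` simply treats plain instance attributes as parameters, and ``ToyRegressor``, ``DuckEstimator`` and ``Frozen`` are hypothetical classes added to exercise each code path.

```python
import copy


def clone_parametrized(estimator):
    """Step 1: the constructor-parameter cloning logic, factored out."""
    params = {
        name: clone(value)
        for name, value in estimator.get_params(deep=False).items()
    }
    return type(estimator)(**params)


def clone(obj):
    """Step 2: delegate to ``__sklearn_clone__`` when the object defines it."""
    if not hasattr(obj, "get_params"):
        return copy.deepcopy(obj)  # ordinary parameter value
    if hasattr(obj, "__sklearn_clone__"):
        return obj.__sklearn_clone__()
    return clone_parametrized(obj)


class BaseEstimator:
    """Step 3: the default hook reproduces the existing behaviour."""

    def get_params(self, deep=True):
        # Toy assumption: attributes without a leading or trailing
        # underscore are the constructor parameters.
        return {
            k: v
            for k, v in vars(self).items()
            if not k.startswith("_") and not k.endswith("_")
        }

    def __sklearn_clone__(self):
        return clone_parametrized(self)


class ToyRegressor(BaseEstimator):
    def __init__(self, alpha=1.0):
        self.alpha = alpha


class DuckEstimator:
    """Hypothetical estimator that does not inherit from BaseEstimator."""

    def __init__(self, depth=3):
        self.depth = depth

    def get_params(self, deep=True):
        return {"depth": self.depth}


class Frozen(BaseEstimator):
    """Hypothetical wrapper that keeps a pre-fitted estimator as is."""

    def __init__(self, estimator):
        self.estimator = estimator

    def __sklearn_clone__(self):
        return self


reg = ToyRegressor(alpha=0.1)
reg.coef_ = [1.0]  # pretend the estimator has been fitted
reg_clone = clone(reg)  # goes through BaseEstimator.__sklearn_clone__
duck_clone = clone(DuckEstimator(depth=5))  # falls back to clone_parametrized
frozen = Frozen(reg)
frozen_clone = clone(frozen)  # custom hook returns the same object
```

The ``Frozen`` class illustrates the kind of wrapper the proposal is meant to enable; whether returning ``self`` is appropriate for a given wrapper is a design decision for its author.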
| 85 | + |
| 86 | +Backward compatibility |
| 87 | +---------------------- |
| 88 | + |
| 89 | +No breakage is expected, since the default ``__sklearn_clone__`` matches |
| 89 | +current ``clone`` behaviour. |
| 90 | + |
| 91 | +Alternatives |
| 92 | +------------ |
| 93 | + |
| 94 | +Instead of allowing estimators to override the entire clone process, |
| 95 | +the core clone process could be obligatory, with the ability for an |
| 96 | +estimator class to customise additional steps. |
| 97 | + |
| 98 | +One API would allow for an estimator class to provide |
| 99 | +``__sklearn__post_clone__(self, source)`` for operations in addition |
| 100 | +to the core cloning, or ``__sklearn__clone_attrs__`` could be defined |
| 101 | +on a class to specify additional attributes that should be copied for |
| 102 | +that class and its descendants. |
| 103 | + |
| 104 | +Alternative solutions include continuing to force developers into providing |
| 105 | +sometimes-awkward constructor parameters for any clonable material, and |
| 106 | +Scikit-learn core developers having the exceptional ability to extend |
| 107 | +the ``clone`` function as needed. |
| 108 | + |
| 109 | +Discussion |
| 110 | +---------- |
| 111 | + |
| 112 | +:issue:`5080` raised the proposal of polymorphism for ``clone`` as the right |
| 113 | +way to provide an object-oriented API, and as a way to enable the |
| 114 | +implementation of wrappers around estimators for model memoisation and |
| 115 | +freezing. |
| 116 | +The naming of ``__sklearn_clone__`` was further proposed and discussed in |
| 117 | +:issue:`21838`. |
| 118 | + |
| 119 | +Making cloning more flexible either enables or simplifies the design and |
| 120 | +implementation of several features, including wrapping pre-fitted estimators, |
| 121 | +and providing estimator configuration through methods without adding new |
| 122 | +constructor arguments (e.g. through mixins). |
| 123 | + |
| 124 | +Related issues include: |
| 125 | + |
| 126 | +- :issue:`6451`, :issue:`8710`, :issue:`19848`: CalibratedClassifierCV with |
| 127 | + prefitted base estimator |
| 128 | +- :issue:`7382`: VotingClassifier with prefitted base estimator |
| 129 | +- :issue:`16748`: Stacking estimator with prefitted base estimator |
| 130 | +- :issue:`8370`, :issue:`9464`: generic estimator wrapper for model freezing |
| 131 | +- :issue:`5082`: configuring parameter search spaces |
| 132 | +- :issue:`16079`: configuring the routing of sample-aligned metadata |
| 133 | +- :issue:`16185`: configuring selected parameters to not be deep-copied |
| 134 | + |
| 135 | +Under the incumbent monolithic clone implementation, designing such additional |
| 136 | +per-estimator configuration requires resolving whether to: |
| 137 | + |
| 138 | +- adjust the monolithic ``clone`` to account for the new configuration |
| 139 | + attributes (an option only available to the Scikit-learn core developer |
| 140 | + team); |
| 141 | +- add constructor parameters for each new configuration option; or |
| 142 | +- not clone estimator configurations, and accept that some use cases may not |
| 143 | + be possible. |
| 144 | + |
| 145 | +A more flexible cloning operation provides a simpler pattern for adding new |
| 146 | +configuration options through mixins. |
| 147 | +It should be noted that adding new capabilities to *all* estimators remains |
| 148 | +possible only through modifying the default ``__sklearn_clone__`` |
| 149 | +implementation. |
| 150 | + |
| 151 | +There are, however, notable concerns in relation to this proposal. |
| 152 | +Introducing a generic clone handler on each estimator gives a developer |
| 153 | +complete freedom to disregard existing conventions regarding parameter |
| 154 | +setting and construction in Scikit-learn. |
| 155 | +In this vein, objections to :issue:`5080` cited the notion that "``clone`` |
| 156 | +has a simple contract," and that "extension to it would open the door to |
| 157 | +violations of that contract" [2]_. |
| 158 | + |
| 159 | +While these objections identify considerable risks, developers of many |
| 160 | +public libraries regularly work around Scikit-learn conventions and |
| 161 | +contracts, in part because they are backed into a "design corner", |
| 162 | +wherein it is not always obvious how to build an acceptable UX while adhering |
| 163 | +to established conventions; in this case, that everything to be cloned must |
| 164 | +go into ``__init__``. This proposal gives developers a sanctioned way to |
| 165 | +address functionality and UX limitations of the core library, rather than |
| 166 | +inviting custom workarounds. |
| 167 | + |
| 168 | +References and Footnotes |
| 169 | +------------------------ |
| 170 | + |
| 171 | +.. [1] Each SLEP must either be explicitly labeled as placed in the public |
| 172 | + domain (see this SLEP as an example) or licensed under the `Open |
| 173 | + Publication License`_. |
| 174 | +.. _Open Publication License: https://www.opencontent.org/openpub/ |
| 175 | + |
| 176 | +.. [2] `Gael Varoquaux's comments on #5080 in 2015 |
| 177 | + <https://github.com/scikit-learn/scikit-learn/issues/5080#issuecomment-127128808>`__ |
| 178 | + |
| 179 | + |
| 180 | +Copyright |
| 181 | +--------- |
| 182 | + |
| 183 | +This document has been placed in the public domain. [1]_ |