Merging BayesianEstimator into ModelBuilder #165

michaelraczycki · 2023-05-10T14:05:49Z

Complete merge of BayesianEstimator and its functionality into ModelBuilder.

Implementation details:
- splitting create_sample_input into 3 new functions:
  * generate_model_data - a method that takes over data generation/pre-formatting
  * default_sampler_config - a property that gives access to the default sampler config
  * default_model_config - a property that gives access to the default model config

This split simplifies ModelBuilder initialisation in the default mode and gives the user easy access to the data structures, while keeping all parameters optional preserves full customisation.
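A minimal sketch of how the split could look from the user's side. The class name `SketchModel`, the config keys and the generated data are all hypothetical and only illustrate the pattern of two default-config properties plus a data-generation method with fully optional constructor arguments; this is not the actual ModelBuilder code.

```python
class SketchModel:
    """Hypothetical stand-in for a ModelBuilder subclass (illustration only)."""

    def __init__(self, data=None, model_config=None, sampler_config=None):
        # Every argument is optional: anything not supplied falls back to a default.
        self.model_config = (
            model_config if model_config is not None else self.default_model_config
        )
        self.sampler_config = (
            sampler_config if sampler_config is not None else self.default_sampler_config
        )
        self.data = data if data is not None else self.generate_model_data()

    @property
    def default_model_config(self):
        # Illustrative values, not the library's real defaults.
        return {"intercept": {"loc": 0.0, "scale": 10.0}}

    @property
    def default_sampler_config(self):
        # Illustrative values, not the library's real defaults.
        return {"draws": 1000, "tune": 1000, "chains": 4}

    def generate_model_data(self, n=5):
        # Builds a tiny noise-free linear data set, y = 2x + 1, as placeholder data.
        x = [i / (n - 1) for i in range(n)]
        y = [2.0 * xi + 1.0 for xi in x]
        return {"x": x, "y": y}
```

With this shape, `SketchModel()` works as a sandbox with no arguments, while `SketchModel(sampler_config={"draws": 10})` overrides only the sampler settings.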

All model-related functions now take X and y as input, mimicking scikit-learn's model input parameters. This is intended to allow sklearn Transformers and Pipelines to be used with PyMC models.
For now, data can still be provided at each step where it could be provided before (class init, build_model function). This allows for a live test period that will determine whether the combined data provisioning system stays or we move to the sklearn-like way.
Fit now takes an optional parameter 'predictor_names', which allows custom naming of the predictors in fit_data when the predictors are given as a 2D array. When they are given as a pandas DataFrame, the predictor names are taken from the data frame itself.
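The naming rule for predictors can be sketched as below. The helper `resolve_predictor_names`, the `FrameLike` class and the `x0, x1, ...` fallback are assumptions made for illustration; the sketch uses only the standard library, with `FrameLike` standing in for a pandas DataFrame and a list of rows standing in for a 2D array.

```python
class FrameLike:
    """Hypothetical stand-in for a DataFrame: it only exposes `columns`."""

    def __init__(self, columns, rows):
        self.columns = list(columns)
        self.rows = rows


def resolve_predictor_names(X, predictor_names=None):
    """Pick predictor names for X following the rule described above."""
    columns = getattr(X, "columns", None)
    if columns is not None:
        # Frame-like input: names come from the frame itself.
        return [str(c) for c in columns]

    n_predictors = len(X[0])  # 2D array-like input: one entry per column
    if predictor_names is None:
        # Assumed fallback when no names are given; the PR does not specify one.
        return [f"x{i}" for i in range(n_predictors)]
    if len(predictor_names) != n_predictors:
        raise ValueError(
            f"expected {n_predictors} predictor names, got {len(predictor_names)}"
        )
    return list(predictor_names)
```

For example, `resolve_predictor_names([[1, 2], [3, 4]], ["price", "spend"])` keeps the custom names, while a frame-like input ignores `predictor_names` and uses its own column labels.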

michaelraczycki · 2023-05-10T14:09:43Z

I decided to adapt some of the favourite solution ideas that were introduced with the BayesianEstimator class. I also decided not to skip their implementation entirely, to avoid possible runtime issues that I encountered in the last development.

michaelraczycki · 2023-05-11T08:37:10Z

The pytest exception needed to be added because numpy 1.24 introduced a deprecation warning for bool8, which is used in the bokeh library. This was causing all tests to fail on collection in any env with the latest numpy.
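The effect of such an exception can be illustrated with the standard-library `warnings` module. This is only a sketch of the mechanism, assuming the test setup turns warnings into errors; the exact pytest configuration used in this PR is not shown here, and both the warning text and the `emit_bool8_warning` stand-in are assumptions.

```python
import warnings

# Assumed wording of the NumPy 1.24 warning; adjust the pattern if it differs.
BOOL8_MESSAGE = "`np.bool8` is a deprecated alias for `np.bool_`.  (Deprecated NumPy 1.24)"


def emit_bool8_warning():
    """Stand-in for the third-party import that triggers the warning."""
    warnings.warn(BOOL8_MESSAGE, DeprecationWarning)


def runs_cleanly(with_exception: bool) -> bool:
    """Return True if the simulated import survives a warnings-as-errors setup."""
    with warnings.catch_warnings():
        # Strict setup: every warning is raised as an exception.
        warnings.simplefilter("error")
        if with_exception:
            # Narrow exception: ignore only the bool8 deprecation message.
            warnings.filterwarnings(
                "ignore", message=".*np.bool8.*", category=DeprecationWarning
            )
        try:
            emit_bool8_warning()
        except DeprecationWarning:
            return False
        return True
```

Without the exception the simulated import fails, and with it the import passes while all other warnings are still treated as errors.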

michaelraczycki · 2023-05-11T09:42:20Z

This PR is needed for #144, as the ModelBuilder and BayesianEstimator changes will be used in a PoC of model builds in DelayedSaturatedMMM in pymc-experimental; therefore the changes need to be released.

twiecki · 2023-05-11T09:44:59Z

pymc_experimental/model_builder.py

+
+        Examples
+        --------
+        >>> import arviz as az


These are pretty weird examples, more like tests. This should give the user info on how to use the method to achieve something.

Doctests provide an option to perform additional checks based on the docstrings. They might look a bit odd, but you're right, this one is not the best. I'll try to refactor it. Maybe putting a comment on the last line that is actually both doctest and user manual (in this case 280) would help?
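As a generic illustration of a docstring example that reads like usage documentation and still runs as a check, here is a small sketch using the standard-library `doctest` module. The function `scale_predictor` and the helper `run_docstring_example` are hypothetical and unrelated to the ModelBuilder docstrings under review.

```python
import doctest


def scale_predictor(values, factor=2.0):
    """Multiply every predictor value by ``factor``.

    Examples
    --------
    >>> scale_predictor([1.0, 2.5])
    [2.0, 5.0]
    """
    return [v * factor for v in values]


def run_docstring_example():
    """Run the example in the docstring above; return (failed, attempted)."""
    test = doctest.DocTestParser().get_doctest(
        scale_predictor.__doc__,
        {"scale_predictor": scale_predictor},  # names visible to the example
        "scale_predictor",
        None,
        0,
    )
    result = doctest.DocTestRunner(verbose=False).run(test)
    return result.failed, result.attempted
```

The example shows the user what the call returns, and the runner confirms that the documented output still matches the code.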

twiecki · 2023-05-11T10:22:08Z

pymc_experimental/model_builder.py

-        >>> idata = az.InferenceData()
-        >>> self.set_idata_attrs(idata=idata)
-        >>> assert "id" in idata.attrs
+        >>> model = YourModel(ModelBuilder)


Maybe keep it consistent between MyModel, YourModel.

My mistake, didn't notice that one. It's fixed now.

fixing indentation issue, adding exception for pytest, adding forgotten decorator to generate_model_data, making doctest more user-manual like, renaming example model for consistency, changing YourClass to MyClass for consistency

twiecki · 2023-05-11T20:43:23Z

pymc_experimental/bayesian_estimator_linearmodel.py

@@ -60,6 +60,7 @@ class BayesianEstimator(ModelBuilder):

    def __init__(
        self,
+        data: Union[np.ndarray, pd.DataFrame, pd.Series] = None,


Why are you adding this back in?

As discussed, I want to experiment with adding a no-parameter initialization that will act as a sandbox for new users, but keeping the data in the constructor helps to reduce the number of adaptations before implementing the model builder in other classes. Also, it has low impact and should be easy to remove later if we don't like it.

michaelraczycki · 2023-05-19T12:44:50Z

Please do not squash merge it; the commits were split on purpose to make it reversible in case it's needed.

michaelraczycki · 2023-05-22T08:15:12Z

@twiecki @ricardoV94 any suggestions on the current state?

ricardoV94 · 2023-05-23T12:42:27Z

.pylintrc

@@ -46,7 +46,6 @@ enable=import-self,
       used-before-assignment,
       cell-var-from-loop,
       global-variable-undefined,
-       dangerous-default-value,


Didn't have time to check the rest yet, but this doesn't seem right

There's a problem with pymc-marketing and pymc-experimental, as they don't use the same linting rules. So it's something I forgot to 'undo' after installing pymc-experimental locally from my local repo.
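For context, pylint's dangerous-default-value check flags mutable objects used as default arguments, because the default is created once and then shared between calls. A minimal sketch of the pitfall and the usual `None` replacement; both function names are hypothetical and not taken from this PR.

```python
def add_option_shared(key, value, config={}):  # pylint flags this default
    """Buggy variant: the same dict object is reused across calls."""
    config[key] = value
    return config


def add_option_safe(key, value, config=None):
    """Safe variant: a fresh dict is created whenever none is supplied."""
    if config is None:
        config = {}
    config[key] = value
    return config
```

Calling `add_option_shared` twice without a `config` argument returns the same dict both times, so the second call also contains the first call's key; `add_option_safe` returns an independent dict on every call.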

splitting create_sample_input, adapting tests

db13bde

michaelraczycki added the enhancements and request discussion labels May 10, 2023

michaelraczycki requested review from twiecki and ricardoV94 May 10, 2023 14:05

michaelraczycki changed the title ~~splitting create_sample_input, adapting tests~~ splitting create_sample_input, adapting tests in ModelBuilder May 10, 2023

step 2: removing duplications and adapting BayesianEstimator

c272052

twiecki reviewed May 11, 2023

pymc_experimental/model_builder.py

twiecki reviewed May 11, 2023

adding and updating doctests

539659d

fixing indentation issue, adding exception for pytest, adding forgotten decorator to generate_model_data, making doctest more user-manual like, renaming example model for consistency, changing YourClass to MyClass for consistency

michaelraczycki force-pushed the model_builder_compatibility_with_sklearn branch from 6a61dc1 to 539659d Compare May 11, 2023 12:15

michaelraczycki requested a review from twiecki May 11, 2023 12:53

twiecki reviewed May 11, 2023

michaelraczycki requested a review from twiecki May 12, 2023 07:22

michaelraczycki added 5 commits May 15, 2023 13:31

moving new load to ModelBuilder

a17539c

merging fit from BayesianEstimator to ModelBuilder, adapting tests

0215404

moving predict and sample_posterior_predictive, adapting tests

4e7879b

canibalize BayesianEstimator by ModelBuilder

ece5406

adaptation of ModelBuiler to make Linearmodel tests pass

9e557ab

michaelraczycki changed the title ~~splitting create_sample_input, adapting tests in ModelBuilder~~ Merging BayesianEstimator into ModelBuilder May 16, 2023

final adjustments for pymc-marketing compatibility with sklearn

ad21234

fixing incorrect if clause

52354ff

ricardoV94 reviewed May 23, 2023

replacing dangerous default value

d812138

twiecki approved these changes May 26, 2023

twiecki merged commit 573e5bc into pymc-devs:main May 26, 2023

michaelraczycki deleted the model_builder_compatibility_with_sklearn branch July 5, 2023 09:51
