Skip to content

Update first half of generalized linear models to "Best Practices" #148

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
May 25, 2021

Conversation

KavyaJaiswal
Copy link
Contributor

@KavyaJaiswal KavyaJaiswal commented Apr 29, 2021

Description

File: https://github.com/pymc-devs/pymc-examples/blob/main/examples/generalized_linear_models/GLM.ipynb

Addresses issue #91 and aims to advance the notebook to the 'Best Practices' State.

The new code uses the new numpy generator instead of the numpy random. The code also has an ArviZ related addition of return_inferencedata=True

The PR has a try... except clause in the code. It will ensure that users who have cloned pymc-examples repo will read their local copy of the data while also downloading the data from github for those who don't have a local copy.

try:
    sat_data = pd.read_csv("../data/Guber1999data.txt")
except:
    sat_data = pd.read_csv(pm.get_data("Guber1999data.txt"))

This PR also addresses the error message in the Hierarchical GLM section of the code. The error in the code is shown in the screenshots attached below. The fix for this error can be discussed and worked upon.

Error1
Error2
Error3

@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@KavyaJaiswal KavyaJaiswal changed the title Replaced numpy.random with the new numpy generator #91 Replaced numpy.random with the new numpy generator Apr 29, 2021
@review-notebook-app
Copy link

review-notebook-app bot commented Apr 29, 2021

View / edit / reply to this conversation on ReviewNB

OriolAbril commented on 2021-04-29T21:27:29Z
----------------------------------------------------------------

This is the old global seeding version of the numpy random number generator. You have to use the new random generator as shown in the references.


OriolAbril commented on 2021-05-01T20:43:59Z
----------------------------------------------------------------

Now a generator is initialized, but it is not stored nor used afterwards to generate the random samples. It should be stored (i.e. as rng) and then random samples should be generated with rng.normal, rng.poisson... there should be no instances of np.random.<distribution> anywhere in the code anymore.

KavyaJaiswal commented on 2021-05-03T10:54:44Z
----------------------------------------------------------------

I have made the required changes.

@review-notebook-app
Copy link

review-notebook-app bot commented Apr 29, 2021

View / edit / reply to this conversation on ReviewNB

OriolAbril commented on 2021-04-29T21:27:30Z
----------------------------------------------------------------

Here return_inferencedata=True should be used in pm.sample. plot_posterior_predictive_glm works with both multitrace and inferencedata so everything else should work. Same goes for all other calls to pm.sample


KavyaJaiswal commented on 2021-05-03T10:55:31Z
----------------------------------------------------------------

Done!

@review-notebook-app
Copy link

View / edit / reply to this conversation on ReviewNB

OriolAbril commented on 2021-04-29T21:27:31Z
----------------------------------------------------------------

Here you have to use a try...except clause as indicated in https://github.com/pymc-devs/pymc3/wiki/PyMC3-Jupyter-Notebook-Style-Guide. This is completely independent from the glm issue in the cell below.

This cell does work and you have been able to execute it without problem. But right now it's downloading the data from github every time the cell is executed. If the users have cloned pymc-examples repo, we want to avoid that and use the local copy. This should be faster for those users as no download is needed in the cloned repo case, and will allow people with the cloned repo to run notebooks offline without problem which is currently not possible.


@review-notebook-app
Copy link

review-notebook-app bot commented Apr 29, 2021

View / edit / reply to this conversation on ReviewNB

OriolAbril commented on 2021-04-29T21:27:32Z
----------------------------------------------------------------

Asked around for help with this error


OriolAbril commented on 2021-05-01T22:34:36Z
----------------------------------------------------------------

It is probably not worth it to fix those errors. Long term, bambi will replace the glm module, but the timeline is not clear yet. I would suggest working on the other comments as practice (together with the other PRs) and for now leaving this as is until we have had some time to plan the deprecation of the glm module

KavyaJaiswal commented on 2021-05-03T10:59:33Z
----------------------------------------------------------------

Fair enough.

@review-notebook-app
Copy link

review-notebook-app bot commented Apr 29, 2021

View / edit / reply to this conversation on ReviewNB

OriolAbril commented on 2021-04-29T21:27:32Z
----------------------------------------------------------------

You need to install the watermark library for this to work, make sure you install it to the right environment, not to "general" or "base" ones.


KavyaJaiswal commented on 2021-05-03T10:58:36Z
----------------------------------------------------------------

Updated!

@KavyaJaiswal KavyaJaiswal changed the title Replaced numpy.random with the new numpy generator Replaced numpy.random with the new random generator May 1, 2021
Copy link
Member

Now a generator is initialized, but it is not stored nor used afterwards to generate the random samples. It should be stored (i.e. as rng) and then random samples should be generated with rng.normal, rng.poisson... there should be no instances of np.random.<distribution> anywhere in the code anymore.


View entire conversation on ReviewNB

Copy link
Member

It is probably not worth it to fix those errors. Long term, bambi will replace the glm module, but the timeline is not clear yet. I would suggest working on the other comments as practice (together with the other PRs) and for now leaving this as is until we have had some time to plan the deprecation of the glm module


View entire conversation on ReviewNB

@review-notebook-app
Copy link

review-notebook-app bot commented May 1, 2021

View / edit / reply to this conversation on ReviewNB

OriolAbril commented on 2021-05-01T22:34:46Z
----------------------------------------------------------------

The code should be try except directly, not be commented. The ellipsis should not be there either. They are in the guide because many notebooks need to pass extra arguments to read_csv. Also note that we have just updated the guide to show the specific exception that needs to be catched.


KavyaJaiswal commented on 2021-05-03T10:56:16Z
----------------------------------------------------------------

Updated accordingly.

OriolAbril commented on 2021-05-04T13:54:00Z
----------------------------------------------------------------

I would also remove the comment and use a specific exception as Marco pointed out and added to the style guide: https://github.com/pymc-devs/pymc3/wiki/PyMC3-Jupyter-Notebook-Style-Guide

@KavyaJaiswal
Copy link
Contributor Author

I have added all the required changes.

Copy link
Contributor Author

I have made the required changes.


View entire conversation on ReviewNB

Copy link
Contributor Author

Done!


View entire conversation on ReviewNB

Copy link
Contributor Author

Updated accordingly.


View entire conversation on ReviewNB

Copy link
Contributor Author

Updated!


View entire conversation on ReviewNB

Copy link
Contributor Author

Fair enough.


View entire conversation on ReviewNB

Copy link
Member

I would also remove the comment and use a specific exception as Marco pointed out and added to the style guide: https://github.com/pymc-devs/pymc3/wiki/PyMC3-Jupyter-Notebook-Style-Guide


View entire conversation on ReviewNB

Copy link
Member

@OriolAbril OriolAbril left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two general comments.

First one is to update the PR title so when we merge it and it becomes a commit message, it will be informative of the changes, to do so it needs to mention the notebook that was modified. Most notebooks will be updated to use the new generator, so without that we'd end up with ~70 identical commit messages.

Second one is to remember to run pre-commit to ensure that CI passes. There is some guidance on that at https://github.com/pymc-devs/pymc3/wiki/PyMC3-Python-Code-Style

@KavyaJaiswal KavyaJaiswal changed the title Replaced numpy.random with the new random generator Update generalized linear models to "Best Practices" State May 4, 2021
@KavyaJaiswal
Copy link
Contributor Author

Updated the Pull Request with the suggested changes,

@OriolAbril OriolAbril changed the title Update generalized linear models to "Best Practices" State Update first half of generalized linear models to "Best Practices" May 25, 2021
@OriolAbril OriolAbril merged commit 5255505 into pymc-devs:main May 25, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants