Skip to content

Commit e98b9a6

Browse files
committed
[Reliability Bayesian pymc-devs#474] updated with failure count prediction interval
Signed-off-by: Nathaniel <[email protected]>
1 parent 702debc commit e98b9a6

File tree

2 files changed

+1140
-428
lines changed

2 files changed

+1140
-428
lines changed

examples/case_studies/reliability_and_calibration.ipynb

+924-397
Large diffs are not rendered by default.

myst_nbs/case_studies/reliability_and_calibration.myst.md

+216-31
Original file line numberDiff line numberDiff line change
@@ -22,6 +22,7 @@ kernelspec:
2222

2323
```{code-cell} ipython3
2424
import os
25+
import random
2526
2627
from io import StringIO
2728
@@ -404,20 +405,36 @@ plot_ln_pi(
404405

405406
### Bootstrap Calibration and Coverage Estimation
406407

407-
We want now to estimate the coverage implied by this prediction interval, and to do so we will bootstrap estimates for the lower and upper bounds of the 95% confidence interval and ultimately assess their coverage conditional on the MLE fit.
408+
We want now to estimate the coverage implied by this prediction interval, and to do so we will bootstrap estimates for the lower and upper bounds of the 95% confidence interval and ultimately assess their coverage conditional on the MLE fit. We will use the fractional weighted (bayesian) bootstrap. We'll report two methods of estimating the coverage statistic - the first is the empirical coverage based on sampling a random value from within the known range and assess if it falls between the 95% MLE lower bound and upper bounds.
408409

409-
```{code-cell} ipython3
410-
import random
410+
The second method we'll use to assess coverage is to bootstrap estimates of a 95% lower bound and upper bound and then assess how much those bootstrapped values would cover conditional on the MLE fit.
411411

412+
```{code-cell} ipython3
413+
def bayes_boot(df, lb, ub, seed=100):
414+
w = np.random.dirichlet(np.ones(len(df)), 1)[0]
415+
lnf = LogNormalFitter().fit(df["t"] + 1e-25, df["failed"], weights=w)
416+
## Sample random choice from 95% percentile interval of bootstrapped dist
417+
# choices = draws['t'].values
418+
choices = np.linspace(df["t"].min(), df["t"].max(), 1000)
419+
future = random.choice(choices)
420+
## Check if choice is contained within the MLE 95% PI
421+
contained = (future >= lb) & (future <= ub)
422+
## Record 95% interval of bootstrapped dist
423+
lb = lognorm(s=lnf.sigma_, scale=np.exp(lnf.mu_)).ppf(0.025)
424+
ub = lognorm(s=lnf.sigma_, scale=np.exp(lnf.mu_)).ppf(0.975)
425+
return lb, ub, contained, future, lnf.sigma_, lnf.mu_
426+
```
412427

413-
def bootstrap(lb, ub):
414-
draws = actuarial_table_shock[["t", "failed"]].sample(replace=True, frac=1)
428+
```{code-cell} ipython3
429+
def bootstrap(df, lb, ub, seed=100):
430+
draws = df[["t", "failed"]].sample(replace=True, frac=1, random_state=seed)
415431
draws.sort_values("t", inplace=True)
416432
## Fit Lognormal Dist to
417433
lnf = LogNormalFitter().fit(draws["t"] + 1e-25, draws["failed"])
418434
## Sample random choice from 95% percentile interval of bootstrapped dist
419435
# choices = draws['t'].values
420-
choices = np.linspace(draws["t"].min(), draws["t"].max(), draws.shape[0])
436+
## Essentially sampling from a uniform interval
437+
choices = np.linspace(draws["t"].min(), draws["t"].max(), 1000)
421438
future = random.choice(choices)
422439
## Check if choice is contained within the MLE 95% PI
423440
contained = (future >= lb) & (future <= ub)
@@ -427,7 +444,7 @@ def bootstrap(lb, ub):
427444
return lb, ub, contained, future, lnf.sigma_, lnf.mu_
428445
429446
430-
CIs = [bootstrap(8928, 70188) for i in range(1000)]
447+
CIs = [bayes_boot(actuarial_table_shock, 8928, 70188, seed=i) for i in range(1000)]
431448
draws = pd.DataFrame(
432449
CIs, columns=["Lower Bound PI", "Upper Bound PI", "Contained", "future", "Sigma", "Mu"]
433450
)
@@ -461,8 +478,8 @@ axs[1].scatter(
461478
axs[0].set_title("Bootstrapped Mu against Bootstrapped 95% Lower Bound")
462479
prop = draws["Contained"].sum() / len(draws)
463480
axs[0].annotate(
464-
f"Estimated Prediction \nEmpirical Coverage Based on Sampling : {np.round(prop, 3)}",
465-
xy=(10.4, 16000),
481+
f"Estimated Prediction \nEmpirical Coverage \nBased on Sampling : {np.round(prop, 3)}",
482+
xy=(10.4, 12000),
466483
fontweight="bold",
467484
)
468485
axs[1].set_title("Bootstrapped Sigma against Bootstrapped 95% Lower Bound")
@@ -514,6 +531,10 @@ axs[2].annotate(
514531
);
515532
```
516533

534+
These simulations should be repeated a far larger number of times than we do here. We can also vary the interval size to achieve the desired coverage level.
535+
536+
+++
537+
517538
### Bearing Cage Data: A Study in Bayesian Reliability Analysis
518539

519540
Next we'll look at a data set which has a slightly less clean parametric fit. The most obvious feature of this data is the small amount of failing records. The data is recorded in the **period** format with counts showing the extent of the `risk set` in each period.
@@ -778,7 +799,7 @@ ax2.legend()
778799
ax.set_ylabel("Fraction Failing");
779800
```
780801

781-
## Bayesian Modelling
802+
## Bayesian Modelling of Reliability Data
782803

783804
We've now seen how to model and visualise the parametric model fits to sparse reliability using a frequentist or MLE framework. We want to now show how the same style of inferences can be achieved in the Bayesian paradigm.
784805

@@ -793,7 +814,7 @@ where $\delta_{i}$ is an indicator for whether the observation is a faiure or a
793814

794815
### Direct PYMC implementation of Weibull Survival
795816

796-
We'll first model the Weibull likelihood directly in terms of the parameters $\alpha, \beta$, and then consider an alternative parameterisation.
817+
We fit two versions of this model with different specifications for the priors, one takes a **vague** uniform prior over the data, and another specifies priors closer to the MLE fits. We will show the implications of the prior specification has for the calibration of the model with the observed data below.
797818

798819
```{code-cell} ipython3
799820
def weibull_lccdf(y, alpha, beta):
@@ -805,26 +826,38 @@ item_period_max = item_period_bearing_cage.groupby("id")[["t", "failed"]].max()
805826
y = item_period_max["t"].values
806827
censored = ~item_period_max["failed"].values.astype(bool)
807828
808-
with pm.Model() as model:
809829
810-
beta = pm.Uniform("beta", 100, 15_000)
811-
alpha = pm.TruncatedNormal("alpha", 2, 0.5, lower=0.02, upper=8)
830+
priors = {"beta": [100, 15_000], "alpha": [2, 0.5, 0.02, 8]}
831+
priors_informative = {"beta": [9500, 12_000], "alpha": [2, 0.5, 0.02, 3]}
832+
833+
834+
def make_model(p):
835+
with pm.Model() as model:
836+
837+
beta = pm.Uniform("beta", p["beta"][0], p["beta"][1])
838+
alpha = pm.TruncatedNormal(
839+
"alpha", p["alpha"][0], p["alpha"][1], lower=p["alpha"][2], upper=p["alpha"][3]
840+
)
812841
813-
y_obs = pm.Weibull("y_obs", alpha=alpha, beta=beta, observed=y[~censored])
814-
y_cens = pm.Potential("y_cens", weibull_lccdf(y[censored], alpha, beta))
815-
idata = pm.sample_prior_predictive()
816-
idata.extend(pm.sample(random_seed=100, target_accept=0.95))
817-
idata.extend(pm.sample_posterior_predictive(idata))
842+
y_obs = pm.Weibull("y_obs", alpha=alpha, beta=beta, observed=y[~censored])
843+
y_cens = pm.Potential("y_cens", weibull_lccdf(y[censored], alpha, beta))
844+
idata = pm.sample_prior_predictive()
845+
idata.extend(pm.sample(random_seed=100, target_accept=0.95))
846+
idata.extend(pm.sample_posterior_predictive(idata))
818847
819-
pm.model_to_graphviz(model)
848+
return idata, model
849+
850+
851+
idata, model = make_model(priors)
852+
idata_informative, model = make_model(priors_informative)
820853
```
821854

822855
```{code-cell} ipython3
823856
idata
824857
```
825858

826859
```{code-cell} ipython3
827-
az.plot_trace(idata);
860+
az.plot_trace(idata, kind="rank_vlines");
828861
```
829862

830863
```{code-cell} ipython3
@@ -848,8 +881,17 @@ def ecdf(sample):
848881
```
849882

850883
```{code-cell} ipython3
851-
alphas = az.extract_dataset(idata, group="posterior", num_samples=200)["alpha"].values
852-
betas = az.extract_dataset(idata, group="posterior", num_samples=200)["beta"].values
884+
joint_draws = az.extract_dataset(idata, group="posterior", num_samples=1000)[
885+
["alpha", "beta"]
886+
].to_dataframe()
887+
alphas = joint_draws["alpha"].values
888+
betas = joint_draws["beta"].values
889+
890+
joint_draws_informative = az.extract_dataset(
891+
idata_informative, group="posterior", num_samples=1000
892+
)[["alpha", "beta"]].to_dataframe()
893+
alphas_informative = joint_draws_informative["alpha"].values
894+
betas_informative = joint_draws_informative["beta"].values
853895
854896
mosaic = """AAAA
855897
BBCC"""
@@ -859,26 +901,40 @@ ax = axs[0]
859901
ax1 = axs[2]
860902
ax2 = axs[1]
861903
hist_data = []
862-
for i in range(200):
904+
for i in range(1000):
863905
draws = pm.draw(pm.Weibull.dist(alpha=alphas[i], beta=betas[i]), 1000)
864906
qe, pe = ecdf(draws)
865907
lkup = dict(zip(pe, qe))
866908
hist_data.append([lkup[0.1], lkup[0.05]])
867-
ax.plot(qe, pe, color="slateblue", alpha=0.4)
909+
ax.plot(qe, pe, color="slateblue", alpha=0.1)
910+
hist_data_info = []
911+
for i in range(1000):
912+
draws = pm.draw(pm.Weibull.dist(alpha=alphas_informative[i], beta=betas_informative[i]), 1000)
913+
qe, pe = ecdf(draws)
914+
lkup = dict(zip(pe, qe))
915+
hist_data_info.append([lkup[0.1], lkup[0.05]])
916+
ax.plot(qe, pe, color="pink", alpha=0.1)
868917
hist_data = pd.DataFrame(hist_data, columns=["p10", "p05"])
918+
hist_data_info = pd.DataFrame(hist_data_info, columns=["p10", "p05"])
869919
draws = pm.draw(pm.Weibull.dist(alpha=np.mean(alphas), beta=np.mean(betas)), 1000)
870920
qe, pe = ecdf(draws)
871-
ax.plot(qe, pe, color="cyan", label="Expected CDF")
921+
ax.plot(qe, pe, color="purple", label="Expected CDF Uninformative")
922+
draws = pm.draw(
923+
pm.Weibull.dist(alpha=np.mean(alphas_informative), beta=np.mean(betas_informative)), 1000
924+
)
925+
qe, pe = ecdf(draws)
926+
ax.plot(qe, pe, color="magenta", label="Expected CDF Informative")
872927
ax.plot(
873928
actuarial_table_bearings["t"],
874929
actuarial_table_bearings["logit_CI_95_ub"],
875930
"--",
876931
label="Non-Parametric CI UB",
877932
color="black",
878933
)
879-
ax.scatter(
934+
ax.plot(
880935
actuarial_table_bearings["t"],
881936
actuarial_table_bearings["F_hat"],
937+
"-o",
882938
label="Non-Parametric CDF",
883939
color="black",
884940
alpha=1,
@@ -892,18 +948,147 @@ ax.plot(
892948
)
893949
ax.set_xlim(0, 2500)
894950
ax.set_title(
895-
"Bayesian Estimation of Uncertainty in the CDF \n and the non-parametric estimates", fontsize=20
951+
"Bayesian Estimation of Uncertainty in the Posterior Predictive CDF \n Informative and Non-Informative Priors",
952+
fontsize=20,
896953
)
897954
ax.set_ylabel("Fraction Failing")
898955
ax.set_xlabel("Time")
899-
ax1.hist(hist_data["p10"], bins=30, ec="black", color="skyblue", alpha=0.4)
956+
ax1.hist(
957+
hist_data["p10"], bins=30, ec="black", color="skyblue", alpha=0.4, label="Uninformative Prior"
958+
)
959+
ax1.hist(
960+
hist_data_info["p10"],
961+
bins=30,
962+
ec="black",
963+
color="slateblue",
964+
alpha=0.4,
965+
label="Informative Prior",
966+
)
900967
ax1.set_title("Distribution of 10% failure Time", fontsize=20)
901-
ax2.hist(hist_data["p05"], bins=30, ec="black", color="cyan", alpha=0.4)
968+
ax1.legend()
969+
ax2.hist(hist_data["p05"], bins=30, ec="black", color="cyan", alpha=0.4, label="Uninformative")
970+
ax2.hist(hist_data_info["p05"], bins=30, ec="black", color="pink", alpha=0.4, label="Informative")
971+
ax2.legend()
902972
ax2.set_title("Distribution of 5% failure Time", fontsize=20)
973+
wbf = WeibullFitter().fit(item_period_bearing_cage["t"] + 1e-25, item_period_bearing_cage["failed"])
974+
wbf.plot_cumulative_density(ax=ax, color="cyan", label="Weibull MLE Fit")
975+
ax.legend()
976+
ax.set_ylim(0, 0.25);
977+
```
978+
979+
We can see here how the Bayesian uncertainty estimates driven by our deliberately vague priors encompasses more uncertainty than our MLE fit and the uninformative prior implies a wider predictive distribution for the 5% and 10% failure times. The Bayesian model with uninformative priors seems to do a better job of capturing the uncertainty in the non-parametric estimates of our CDF.
980+
981+
The concrete estimates of failure percentage over time of each model fit are especially crucial in a situation where we have sparse data. It is a meaningful sense check that we can consult with subject matter experts about how plausible the expectation and range for the 10% failure time is for their product.
982+
983+
+++
984+
985+
## Predicting the Number of Failures in an Interval
986+
987+
Because our data on observed failures is extremely sparse, we have to be very careful about extrapolating beyond the observed range of time, but we can ask about the predictable number of failures in the lower tail of our cdf. Another view on this data which can be helpful for discussing with subject matters experts is the number of implied failures over a time interval.
988+
989+
### The Plugin Estimate
990+
991+
Imagine we want to know how many bearings will fail between 150 and 600 hours based of service. We can calculate this based on the estimated CDF and number of new future bearings. We first calculate:
992+
993+
$$ \hat{\rho} = \dfrac{\hat{F}(t_1) - \hat{F}(t_0)}{1 - \hat{F}(t_0)} $$
994+
995+
to establish a probability for the failure occurring in the interval and then a point prediction for the number of failures in the interval is given by `risk_set`*$\hat{\rho}$.
996+
997+
```{code-cell} ipython3
998+
mle_fit = weibull_min(c=2, scale=10_000)
999+
rho = mle_fit.cdf(600) - mle_fit.cdf(150) / (1 - mle_fit.cdf(150))
1000+
print("Rho:", rho)
1001+
print("N at Risk:", 1700)
1002+
print("Expected Number Failing in first 300 hours:", 1700 * rho)
1003+
from scipy.stats import binom
1004+
1005+
print("Lower Bound 95% PI :", binom(1700, rho).ppf(0.05))
1006+
print("Upper Bound 95% PI:", binom(1700, rho).ppf(0.95))
1007+
```
1008+
1009+
### Applying the Same Procedure on the Bayesian Posterior
1010+
1011+
We'll use the posterior predictive distribution of the uniformative model. We show here how to derive the uncertainty in the estimates of the 95% prediction interval for the number of failures in a time interval.
1012+
1013+
```{code-cell} ipython3
1014+
def PI_failures(joint_draws, lp, up, n_at_risk):
1015+
records = []
1016+
alphas = joint_draws["alpha"].values
1017+
betas = joint_draws["beta"].values
1018+
for i in range(len(joint_draws)):
1019+
fit = weibull_min(c=alphas[i], scale=betas[i])
1020+
rho = fit.cdf(up) - fit.cdf(lp) / (1 - fit.cdf(lp))
1021+
lb = binom(n_at_risk, rho).ppf(0.05)
1022+
ub = binom(n_at_risk, rho).ppf(0.95)
1023+
point_prediction = n_at_risk * rho
1024+
records.append([alphas[i], betas[i], rho, n_at_risk, lb, ub, point_prediction])
1025+
return pd.DataFrame(
1026+
records, columns=["alpha", "beta", "rho", "n_at_risk", "lb", "ub", "expected"]
1027+
)
1028+
1029+
1030+
output_df = PI_failures(joint_draws, 150, 600, 1700)
1031+
output_df
1032+
```
1033+
1034+
```{code-cell} ipython3
1035+
mosaic = """AAAA
1036+
BBBB"""
1037+
fig, axs = plt.subplot_mosaic(mosaic=mosaic, figsize=(20, 12))
1038+
axs = [axs[k] for k in axs.keys()]
1039+
ax = axs[0]
1040+
ax1 = axs[1]
1041+
1042+
ax.scatter(
1043+
output_df["alpha"],
1044+
output_df["expected"],
1045+
c=output_df["beta"],
1046+
cmap=cm.cool,
1047+
alpha=0.3,
1048+
label="Coloured by function of Beta values",
1049+
)
9031050
ax.legend()
904-
ax.set_ylim(0, 0.1);
1051+
ax.set_ylabel("Expected Failures")
1052+
ax.set_xlabel("Alpha")
1053+
ax.set_title(
1054+
"Posterior Predictive Expected Failure Count between 150-600 hours \nas a function of Weibull(alpha, beta)",
1055+
fontsize=20,
1056+
)
1057+
1058+
ax1.hist(
1059+
output_df["lb"],
1060+
ec="black",
1061+
color="slateblue",
1062+
label="95% PI Lower Bound on Failure Count",
1063+
alpha=0.3,
1064+
)
1065+
ax1.axvline(output_df["lb"].mean(), label="Expected 95% PI Lower Bound on Failure Count")
1066+
ax1.axvline(output_df["ub"].mean(), label="Expected 95% PI Upper Bound on Failure Count")
1067+
ax1.hist(
1068+
output_df["ub"],
1069+
ec="black",
1070+
color="cyan",
1071+
label="95% PI Upper Bound on Failure Count",
1072+
bins=20,
1073+
alpha=0.3,
1074+
)
1075+
ax1.hist(
1076+
output_df["expected"], ec="black", color="pink", label="Expected Count of Failures", bins=20
1077+
)
1078+
ax1.set_title("Uncertainty in the Posterior Prediction Interval of Failure Counts", fontsize=20)
1079+
ax1.legend();
9051080
```
9061081

1082+
The choice of model in such cases is crucial. The decision about which failure profile is apt, has to be informed by a subject matter expert because extrapolation from such sparse data is always risky. An understanding of the uncertainty is crucial if real costs attach to the failures and the subject matter expert is usually better placed to tell if you 2 or 7 failures can be expected within 600 hours of service.
1083+
1084+
# Conclusion
1085+
1086+
We've seen how to analyse and model reliability from both a frequentist and Bayesian perspective and compare against the non-parametric estimates. We've shown how prediction intervals can be derived for a number of key statistics by both a bootstrapping and a bayesian approach. We've seen approaches to calibrating these prediction intervals through re-sampling methods and informative prior specification. These views on the problem are complementary and the choice of technique which is appropriate should be driven by factors of the questions of interest, not some ideological commitment.
1087+
1088+
In particular we've seen how the MLE fits to our bearings data provide a decent first guess approach to establishing priors in the Bayesian analysis. We've also seen how subject matter expertise can be elicited by deriving key quantities from the implied models and subjecting these implications to scrutiny. The choice of Bayesian prediction interval is calibrated to our priors expectations, and where we have none - we can supply vague or non-informative priors. The implications of these priors can again be checked and analysed against an appropriate cost function.
1089+
1090+
+++
1091+
9071092
## Authors
9081093

9091094
Nathaniel Forde

0 commit comments

Comments
 (0)