Skip to content

Commit 98fd63e

Browse files
lucianopaztwiecki
authored andcommitted
Made Broken Pipe Error more verbose (#3292)
* Fix for #3225. Made Triangular `c` attribute be handled consistently with scipy.stats. Added test and updated example code. * Added a more detailed error message for Broken pipes. * Not a fix for #3140 * Fixed import of time. Trimmed the broken pipe exception handling. Added release notes. * Moved maintenance message to release notes of pymc3.7
1 parent e67c476 commit 98fd63e

File tree

2 files changed

+51
-1
lines changed

2 files changed

+51
-1
lines changed

RELEASE-NOTES.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -6,6 +6,7 @@
66

77
### Maintenance
88

9+
- Made `BrokenPipeError` for parallel sampling more verbose on Windows.
910
- Added the `broadcast_distribution_samples` function that helps broadcasting arrays of drawn samples, taking into account the requested `size` and the inferred distribution shape. This sometimes is needed by distributions that call several `rvs` separately within their `random` method, such as the `ZeroInflatedPoisson` (Fix issue #3310).
1011
- The `Wald`, `Kumaraswamy`, `LogNormal`, `Pareto`, `Cauchy`, `HalfCauchy`, `Weibull` and `ExGaussian` distributions `random` method used a hidden `_random` function that was written with scalars in mind. This could potentially lead to artificial correlations between random draws. Added shape guards and broadcasting of the distribution samples to prevent this (Similar to issue #3310).
1112

pymc3/parallel_sampling.py

Lines changed: 50 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -6,6 +6,7 @@
66
from collections import namedtuple
77
import traceback
88
from pymc3.exceptions import SamplingError
9+
import errno
910

1011
import numpy as np
1112

@@ -14,6 +15,34 @@
1415
logger = logging.getLogger("pymc3")
1516

1617

18+
def _get_broken_pipe_exception():
19+
import sys
20+
if sys.platform == 'win32':
21+
return RuntimeError("The communication pipe between the main process "
22+
"and its spawned children is broken.\n"
23+
"In Windows OS, this usually means that the child "
24+
"process raised an exception while it was being "
25+
"spawned, before it was setup to communicate to "
26+
"the main process.\n"
27+
"The exceptions raised by the child process while "
28+
"spawning cannot be caught or handled from the "
29+
"main process, and when running from an IPython or "
30+
"jupyter notebook interactive kernel, the child's "
31+
"exception and traceback appears to be lost.\n"
32+
"A known way to see the child's error, and try to "
33+
"fix or handle it, is to run the problematic code "
34+
"as a batch script from a system's Command Prompt. "
35+
"The child's exception will be printed to the "
36+
"Command Promt's stderr, and it should be visible "
37+
"above this error and traceback.\n"
38+
"Note that if running a jupyter notebook that was "
39+
"invoked from a Command Prompt, the child's "
40+
"exception should have been printed to the Command "
41+
"Prompt on which the notebook is running.")
42+
else:
43+
return None
44+
45+
1746
class ParallelSamplingError(Exception):
1847
def __init__(self, message, chain, warnings=None):
1948
super().__init__(message)
@@ -83,10 +112,19 @@ def run(self):
83112
pass
84113
except BaseException as e:
85114
e = ExceptionWithTraceback(e, e.__traceback__)
115+
# Send is not blocking so we have to force a wait for the abort
116+
# message
86117
self._msg_pipe.send(("error", None, e))
118+
self._wait_for_abortion()
87119
finally:
88120
self._msg_pipe.close()
89121

122+
def _wait_for_abortion(self):
123+
while True:
124+
msg = self._recv_msg()
125+
if msg[0] == "abort":
126+
break
127+
90128
def _make_numpy_refs(self):
91129
shape_dtypes = self._step_method.vars_shape_dtype
92130
point = {}
@@ -200,7 +238,18 @@ def __init__(self, draws, tune, step_method, chain, seed, start):
200238
seed,
201239
)
202240
# We fork right away, so that the main process can start tqdm threads
203-
self._process.start()
241+
try:
242+
self._process.start()
243+
except IOError as e:
244+
# Something may have gone wrong during the fork / spawn
245+
if e.errno == errno.EPIPE:
246+
exc = _get_broken_pipe_exception()
247+
if exc is not None:
248+
# Sleep a little to give the child process time to flush
249+
# all its error message
250+
time.sleep(0.2)
251+
raise exc
252+
raise
204253

205254
@property
206255
def shared_point_view(self):

0 commit comments

Comments
 (0)