Commit 547c784

Alex Alekseyev authored and jreback committed
ENH: Allow exponentially weighted functions to specify alpha directly
Closes #10789

Author: Alex Alekseyev <[email protected]>

Closes #12492 from evectant/issue-10789 and squashes the following commits:

a8c2753 [Alex Alekseyev] ENH: Allow exponentially weighted functions to specify alpha directly
1 parent a0aaad9 commit 547c784

6 files changed: +171 -76 lines changed

doc/source/computation.rst

+16-11
@@ -733,24 +733,29 @@ therefore there is an assumption that :math:`x_0` is not an ordinary value
733733
but rather an exponentially weighted moment of the infinite series up to that
734734
point.
735735

736-
One must have :math:`0 < \alpha \leq 1`, but rather than pass :math:`\alpha`
737-
directly, it's easier to think about either the **span**, **center of mass
738-
(com)** or **halflife** of an EW moment:
736+
One must have :math:`0 < \alpha \leq 1`, and while since version 0.18.0
737+
it has been possible to pass :math:`\alpha` directly, it's often easier
738+
to think about either the **span**, **center of mass (com)** or **half-life**
739+
of an EW moment:
739740

740741
.. math::
741742
742743
\alpha =
743744
\begin{cases}
744-
\frac{2}{s + 1}, & s = \text{span}\\
745-
\frac{1}{1 + c}, & c = \text{center of mass}\\
746-
1 - \exp^{\frac{\log 0.5}{h}}, & h = \text{half life}
745+
\frac{2}{s + 1}, & \text{for span}\ s \geq 1\\
746+
\frac{1}{1 + c}, & \text{for center of mass}\ c \geq 0\\
747+
1 - \exp^{\frac{\log 0.5}{h}}, & \text{for half-life}\ h > 0
747748
\end{cases}
748749
749-
One must specify precisely one of the three to the EW functions. **Span**
750-
corresponds to what is commonly called a "20-day EW moving average" for
751-
example. **Center of mass** has a more physical interpretation. For example,
752-
**span** = 20 corresponds to **com** = 9.5. **Halflife** is the period of
753-
time for the exponential weight to reduce to one half.
750+
One must specify precisely one of **span**, **center of mass**, **half-life**
751+
and **alpha** to the EW functions:
752+
753+
- **Span** corresponds to what is commonly called an "N-day EW moving average".
754+
- **Center of mass** has a more physical interpretation and can be thought of
755+
in terms of span: :math:`c = (s - 1) / 2`.
756+
- **Half-life** is the period of time for the exponential weight to reduce to
757+
one half.
758+
- **Alpha** specifies the smoothing factor directly.
754759

755760
Here is an example for a univariate time series:
756761
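The case formula in the documentation hunk above can be checked numerically. The following is a minimal sketch in plain Python, not code from the commit; the helper names `alpha_from_span`, `alpha_from_com` and `alpha_from_halflife` are hypothetical and exist only for illustration. It shows that a span and its associated center of mass `c = (s - 1) / 2` give the same alpha, and that after one half-life the weight `(1 - alpha)**h` has dropped to one half.

```python
import math


def alpha_from_span(span):
    # alpha = 2 / (s + 1), valid for span >= 1
    return 2.0 / (span + 1.0)


def alpha_from_com(com):
    # alpha = 1 / (1 + c), valid for com >= 0
    return 1.0 / (1.0 + com)


def alpha_from_halflife(halflife):
    # alpha = 1 - exp(log(0.5) / h), valid for halflife > 0
    return 1.0 - math.exp(math.log(0.5) / halflife)


span = 20
com = (span - 1) / 2.0  # center of mass associated with the span: 9.5

# Both parameterizations describe the same smoothing factor.
print(alpha_from_span(span), alpha_from_com(com))

# After h periods the weight (1 - alpha)**h has decayed to one half.
h = 5
print((1.0 - alpha_from_halflife(h)) ** h)
```

This is why the four parameters are mutually exclusive: each is a different way of stating the same alpha.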

doc/source/whatsnew/v0.18.0.txt

+1
@@ -947,6 +947,7 @@ Other API Changes
947947
- More helpful error message when constructing a ``DataFrame`` with empty data but with indices (:issue:`8020`)
948948
- ``.describe()`` will now properly handle bool dtype as a categorical (:issue:`6625`)
949949
- More helpful error message invalid ``.transform`` with user defined input (:issue:`10165`)
950+
- Exponentially weighted functions now allow specifying alpha directly (:issue:`10789`) and raise ``ValueError`` if parameters violate ``0 < alpha <= 1`` (:issue:`12492`)
950951

951952
.. _whatsnew_0180.deprecations:
952953

pandas/core/generic.py

+4-3
@@ -5165,11 +5165,12 @@ def expanding(self, min_periods=1, freq=None, center=False, axis=0):
51655165
cls.expanding = expanding
51665166

51675167
@Appender(rwindow.ewm.__doc__)
5168-
def ewm(self, com=None, span=None, halflife=None, min_periods=0,
5169-
freq=None, adjust=True, ignore_na=False, axis=0):
5168+
def ewm(self, com=None, span=None, halflife=None, alpha=None,
5169+
min_periods=0, freq=None, adjust=True, ignore_na=False,
5170+
axis=0):
51705171
axis = self._get_axis_number(axis)
51715172
return rwindow.ewm(self, com=com, span=span, halflife=halflife,
5172-
min_periods=min_periods, freq=freq,
5173+
alpha=alpha, min_periods=min_periods, freq=freq,
51735174
adjust=adjust, ignore_na=ignore_na, axis=axis)
51745175

51755176
cls.ewm = ewm

pandas/core/window.py

+43-28
@@ -1038,13 +1038,21 @@ class EWM(_Rolling):
10381038
10391039
Parameters
10401040
----------
1041-
com : float. optional
1042-
Center of mass: :math:`\alpha = 1 / (1 + com)`,
1041+
com : float, optional
1042+
Specify decay in terms of center of mass,
1043+
:math:`\alpha = 1 / (1 + com),\text{ for } com \geq 0`
10431044
span : float, optional
1044-
Specify decay in terms of span, :math:`\alpha = 2 / (span + 1)`
1045+
Specify decay in terms of span,
1046+
:math:`\alpha = 2 / (span + 1),\text{ for } span \geq 1`
10451047
halflife : float, optional
1046-
Specify decay in terms of halflife,
1047-
:math:`\alpha = 1 - exp(log(0.5) / halflife)`
1048+
Specify decay in terms of half-life,
1049+
:math:`\alpha = 1 - exp(log(0.5) / halflife),\text{ for } halflife > 0`
1050+
alpha : float, optional
1051+
Specify smoothing factor :math:`\alpha` directly,
1052+
:math:`0 < \alpha \leq 1`
1053+
1054+
.. versionadded:: 0.18.0
1055+
10481056
min_periods : int, default 0
10491057
Minimum number of observations in window required to have a value
10501058
(otherwise result is NA).
@@ -1063,16 +1071,10 @@ class EWM(_Rolling):
10631071
10641072
Notes
10651073
-----
1066-
Either center of mass, span or halflife must be specified
1067-
1068-
EWMA is sometimes specified using a "span" parameter `s`, we have that the
1069-
decay parameter :math:`\alpha` is related to the span as
1070-
:math:`\alpha = 2 / (s + 1) = 1 / (1 + c)`
1071-
1072-
where `c` is the center of mass. Given a span, the associated center of
1073-
mass is :math:`c = (s - 1) / 2`
1074-
1075-
So a "20-day EWMA" would have center 9.5.
1074+
Exactly one of center of mass, span, half-life, and alpha must be provided.
1075+
Allowed values and relationship between the parameters are specified in the
1076+
parameter descriptions above; see the link at the end of this section for
1077+
a detailed explanation.
10761078
10771079
The `freq` keyword is used to conform time series data to a specified
10781080
frequency by resampling the data. This is done with the default parameters
@@ -1096,14 +1098,15 @@ class EWM(_Rolling):
10961098
(if adjust is True), and 1-alpha and alpha (if adjust is False).
10971099
10981100
More details can be found at
1099-
http://pandas.pydata.org/pandas-docs/stable/computation.html#exponentially-weighted-moment-functions
1101+
http://pandas.pydata.org/pandas-docs/stable/computation.html#exponentially-weighted-windows
11001102
"""
11011103
_attributes = ['com', 'min_periods', 'freq', 'adjust', 'ignore_na', 'axis']
11021104

1103-
def __init__(self, obj, com=None, span=None, halflife=None, min_periods=0,
1104-
freq=None, adjust=True, ignore_na=False, axis=0):
1105+
def __init__(self, obj, com=None, span=None, halflife=None, alpha=None,
1106+
min_periods=0, freq=None, adjust=True, ignore_na=False,
1107+
axis=0):
11051108
self.obj = obj
1106-
self.com = _get_center_of_mass(com, span, halflife)
1109+
self.com = _get_center_of_mass(com, span, halflife, alpha)
11071110
self.min_periods = min_periods
11081111
self.freq = freq
11091112
self.adjust = adjust
@@ -1320,20 +1323,32 @@ def dataframe_from_int_dict(data, frame_template):
13201323
return _flex_binary_moment(arg2, arg1, f)
13211324

13221325

1323-
def _get_center_of_mass(com, span, halflife):
1324-
valid_count = len([x for x in [com, span, halflife] if x is not None])
1326+
def _get_center_of_mass(com, span, halflife, alpha):
1327+
valid_count = len([x for x in [com, span, halflife, alpha]
1328+
if x is not None])
13251329
if valid_count > 1:
1326-
raise Exception("com, span, and halflife are mutually exclusive")
1327-
1328-
if span is not None:
1329-
# convert span to center of mass
1330+
raise ValueError("com, span, halflife, and alpha "
1331+
"are mutually exclusive")
1332+
1333+
# Convert to center of mass; domain checks ensure 0 < alpha <= 1
1334+
if com is not None:
1335+
if com < 0:
1336+
raise ValueError("com must satisfy: com >= 0")
1337+
elif span is not None:
1338+
if span < 1:
1339+
raise ValueError("span must satisfy: span >= 1")
13301340
com = (span - 1) / 2.
13311341
elif halflife is not None:
1332-
# convert halflife to center of mass
1342+
if halflife <= 0:
1343+
raise ValueError("halflife must satisfy: halflife > 0")
13331344
decay = 1 - np.exp(np.log(0.5) / halflife)
13341345
com = 1 / decay - 1
1335-
elif com is None:
1336-
raise Exception("Must pass one of com, span, or halflife")
1346+
elif alpha is not None:
1347+
if alpha <= 0 or alpha > 1:
1348+
raise ValueError("alpha must satisfy: 0 < alpha <= 1")
1349+
com = (1.0 - alpha) / alpha
1350+
else:
1351+
raise ValueError("Must pass one of com, span, halflife, or alpha")
13371352

13381353
return float(com)
13391354
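For readers who want to try the new validation rules outside pandas, here is a standalone sketch that mirrors the added lines of `_get_center_of_mass` in the hunk above. It is an illustration, not the library code: it uses `math` instead of `numpy`, and the names `center_of_mass` and `is_rejected` are hypothetical.

```python
import math


def center_of_mass(com=None, span=None, halflife=None, alpha=None):
    """Convert exactly one decay parameter to a center of mass."""
    given = [x for x in (com, span, halflife, alpha) if x is not None]
    if len(given) > 1:
        raise ValueError("com, span, halflife, and alpha "
                         "are mutually exclusive")

    # Each domain check guarantees that the implied alpha is in (0, 1].
    if com is not None:
        if com < 0:
            raise ValueError("com must satisfy: com >= 0")
    elif span is not None:
        if span < 1:
            raise ValueError("span must satisfy: span >= 1")
        com = (span - 1) / 2.0
    elif halflife is not None:
        if halflife <= 0:
            raise ValueError("halflife must satisfy: halflife > 0")
        decay = 1 - math.exp(math.log(0.5) / halflife)
        com = 1 / decay - 1
    elif alpha is not None:
        if alpha <= 0 or alpha > 1:
            raise ValueError("alpha must satisfy: 0 < alpha <= 1")
        com = (1.0 - alpha) / alpha
    else:
        raise ValueError("Must pass one of com, span, halflife, or alpha")

    return float(com)


def is_rejected(**kwargs):
    # Helper for the demonstration: True if the arguments raise ValueError.
    try:
        center_of_mass(**kwargs)
    except ValueError:
        return True
    return False


print(center_of_mass(span=20))    # the "20-day" case from the old docstring
print(center_of_mass(alpha=0.5))
print(is_rejected(alpha=0), is_rejected(span=20, alpha=0.5))
```

Note that the old code raised a bare `Exception` for conflicting or missing parameters; with this commit every failure mode is a `ValueError`, so callers can catch one exception type.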

pandas/stats/moments.py

+35-27
@@ -67,13 +67,21 @@
6767
"""
6868

6969

70-
_ewm_kw = r"""com : float. optional
71-
Center of mass: :math:`\alpha = 1 / (1 + com)`,
70+
_ewm_kw = r"""com : float, optional
71+
Specify decay in terms of center of mass,
72+
:math:`\alpha = 1 / (1 + com),\text{ for } com \geq 0`
7273
span : float, optional
73-
Specify decay in terms of span, :math:`\alpha = 2 / (span + 1)`
74+
Specify decay in terms of span,
75+
:math:`\alpha = 2 / (span + 1),\text{ for } span \geq 1`
7476
halflife : float, optional
75-
Specify decay in terms of halflife,
76-
:math:`\alpha = 1 - exp(log(0.5) / halflife)`
77+
Specify decay in terms of half-life,
78+
:math:`\alpha = 1 - exp(log(0.5) / halflife),\text{ for } halflife > 0`
79+
alpha : float, optional
80+
Specify smoothing factor :math:`\alpha` directly,
81+
:math:`0 < \alpha \leq 1`
82+
83+
.. versionadded:: 0.18.0
84+
7785
min_periods : int, default 0
7886
Minimum number of observations in window required to have a value
7987
(otherwise result is NA).
@@ -92,16 +100,10 @@
92100
_ewm_notes = r"""
93101
Notes
94102
-----
95-
Either center of mass, span or halflife must be specified
96-
97-
EWMA is sometimes specified using a "span" parameter `s`, we have that the
98-
decay parameter :math:`\alpha` is related to the span as
99-
:math:`\alpha = 2 / (s + 1) = 1 / (1 + c)`
100-
101-
where `c` is the center of mass. Given a span, the associated center of mass is
102-
:math:`c = (s - 1) / 2`
103-
104-
So a "20-day EWMA" would have center 9.5.
103+
Exactly one of center of mass, span, half-life, and alpha must be provided.
104+
Allowed values and relationship between the parameters are specified in the
105+
parameter descriptions above; see the link at the end of this section for
106+
a detailed explanation.
105107
106108
When adjust is True (default), weighted averages are calculated using weights
107109
(1-alpha)**(n-1), (1-alpha)**(n-2), ..., 1-alpha, 1.
@@ -121,7 +123,7 @@
121123
True), and 1-alpha and alpha (if adjust is False).
122124
123125
More details can be found at
124-
http://pandas.pydata.org/pandas-docs/stable/computation.html#exponentially-weighted-moment-functions
126+
http://pandas.pydata.org/pandas-docs/stable/computation.html#exponentially-weighted-windows
125127
"""
126128

127129
_expanding_kw = """min_periods : int, default None
@@ -323,14 +325,15 @@ def rolling_corr(arg1, arg2=None, window=None, pairwise=None, **kwargs):
323325
@Substitution("Exponentially-weighted moving average", _unary_arg, _ewm_kw,
324326
_type_of_input_retval, _ewm_notes)
325327
@Appender(_doc_template)
326-
def ewma(arg, com=None, span=None, halflife=None, min_periods=0, freq=None,
327-
adjust=True, how=None, ignore_na=False):
328+
def ewma(arg, com=None, span=None, halflife=None, alpha=None, min_periods=0,
329+
freq=None, adjust=True, how=None, ignore_na=False):
328330
return ensure_compat('ewm',
329331
'mean',
330332
arg,
331333
com=com,
332334
span=span,
333335
halflife=halflife,
336+
alpha=alpha,
334337
min_periods=min_periods,
335338
freq=freq,
336339
adjust=adjust,
@@ -341,14 +344,15 @@ def ewma(arg, com=None, span=None, halflife=None, min_periods=0, freq=None,
341344
@Substitution("Exponentially-weighted moving variance", _unary_arg,
342345
_ewm_kw + _bias_kw, _type_of_input_retval, _ewm_notes)
343346
@Appender(_doc_template)
344-
def ewmvar(arg, com=None, span=None, halflife=None, min_periods=0, bias=False,
345-
freq=None, how=None, ignore_na=False, adjust=True):
347+
def ewmvar(arg, com=None, span=None, halflife=None, alpha=None, min_periods=0,
348+
bias=False, freq=None, how=None, ignore_na=False, adjust=True):
346349
return ensure_compat('ewm',
347350
'var',
348351
arg,
349352
com=com,
350353
span=span,
351354
halflife=halflife,
355+
alpha=alpha,
352356
min_periods=min_periods,
353357
freq=freq,
354358
adjust=adjust,
@@ -361,14 +365,15 @@ def ewmvar(arg, com=None, span=None, halflife=None, min_periods=0, bias=False,
361365
@Substitution("Exponentially-weighted moving std", _unary_arg,
362366
_ewm_kw + _bias_kw, _type_of_input_retval, _ewm_notes)
363367
@Appender(_doc_template)
364-
def ewmstd(arg, com=None, span=None, halflife=None, min_periods=0, bias=False,
365-
freq=None, how=None, ignore_na=False, adjust=True):
368+
def ewmstd(arg, com=None, span=None, halflife=None, alpha=None, min_periods=0,
369+
bias=False, freq=None, how=None, ignore_na=False, adjust=True):
366370
return ensure_compat('ewm',
367371
'std',
368372
arg,
369373
com=com,
370374
span=span,
371375
halflife=halflife,
376+
alpha=alpha,
372377
min_periods=min_periods,
373378
freq=freq,
374379
adjust=adjust,
@@ -383,9 +388,9 @@ def ewmstd(arg, com=None, span=None, halflife=None, min_periods=0, bias=False,
383388
@Substitution("Exponentially-weighted moving covariance", _binary_arg_flex,
384389
_ewm_kw + _pairwise_kw, _type_of_input_retval, _ewm_notes)
385390
@Appender(_doc_template)
386-
def ewmcov(arg1, arg2=None, com=None, span=None, halflife=None, min_periods=0,
387-
bias=False, freq=None, pairwise=None, how=None, ignore_na=False,
388-
adjust=True):
391+
def ewmcov(arg1, arg2=None, com=None, span=None, halflife=None, alpha=None,
392+
min_periods=0, bias=False, freq=None, pairwise=None, how=None,
393+
ignore_na=False, adjust=True):
389394
if arg2 is None:
390395
arg2 = arg1
391396
pairwise = True if pairwise is None else pairwise
@@ -401,6 +406,7 @@ def ewmcov(arg1, arg2=None, com=None, span=None, halflife=None, min_periods=0,
401406
com=com,
402407
span=span,
403408
halflife=halflife,
409+
alpha=alpha,
404410
min_periods=min_periods,
405411
bias=bias,
406412
freq=freq,
@@ -414,8 +420,9 @@ def ewmcov(arg1, arg2=None, com=None, span=None, halflife=None, min_periods=0,
414420
@Substitution("Exponentially-weighted moving correlation", _binary_arg_flex,
415421
_ewm_kw + _pairwise_kw, _type_of_input_retval, _ewm_notes)
416422
@Appender(_doc_template)
417-
def ewmcorr(arg1, arg2=None, com=None, span=None, halflife=None, min_periods=0,
418-
freq=None, pairwise=None, how=None, ignore_na=False, adjust=True):
423+
def ewmcorr(arg1, arg2=None, com=None, span=None, halflife=None, alpha=None,
424+
min_periods=0, freq=None, pairwise=None, how=None, ignore_na=False,
425+
adjust=True):
419426
if arg2 is None:
420427
arg2 = arg1
421428
pairwise = True if pairwise is None else pairwise
@@ -430,6 +437,7 @@ def ewmcorr(arg1, arg2=None, com=None, span=None, halflife=None, min_periods=0,
430437
com=com,
431438
span=span,
432439
halflife=halflife,
440+
alpha=alpha,
433441
min_periods=min_periods,
434442
freq=freq,
435443
how=how,
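To make concrete what passing `alpha` directly means, the sketch below computes an exponentially weighted mean with the `adjust=True` weights quoted in the notes above, `(1-alpha)**(n-1), ..., 1-alpha, 1`. It is a plain-Python illustration under stated assumptions (no missing values, no `min_periods` handling), not the pandas implementation, and `ew_mean` is a hypothetical name. In pandas 0.18.0 and later the corresponding call would be along the lines of `Series.ewm(alpha=0.5).mean()`.

```python
def ew_mean(values, alpha):
    """Adjusted exponentially weighted mean for a given smoothing factor."""
    if alpha <= 0 or alpha > 1:
        raise ValueError("alpha must satisfy: 0 < alpha <= 1")

    result = []
    numerator = 0.0    # running sum of weight * value
    denominator = 0.0  # running sum of weights
    for x in values:
        # Older observations are discounted by (1 - alpha) at every step,
        # and the newest observation always enters with weight 1.
        numerator = (1 - alpha) * numerator + x
        denominator = (1 - alpha) * denominator + 1
        result.append(numerator / denominator)
    return result


print(ew_mean([1.0, 2.0, 3.0], alpha=0.5))
```

With `alpha=1` every older observation gets weight zero, so the result is the input series itself, which is the upper boundary case permitted by `0 < alpha <= 1`.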
