ENH: Allow exponentially weighted functions to specify alpha directly #12492

Closed
wants to merge 1 commit into from
27 changes: 16 additions & 11 deletions doc/source/computation.rst
@@ -733,24 +733,29 @@ therefore there is an assumption that :math:`x_0` is not an ordinary value
but rather an exponentially weighted moment of the infinite series up to that
point.

One must have :math:`0 < \alpha \leq 1`, but rather than pass :math:`\alpha`
directly, it's easier to think about either the **span**, **center of mass
(com)** or **halflife** of an EW moment:
One must have :math:`0 < \alpha \leq 1`, and while since version 0.18.0
it has been possible to pass :math:`\alpha` directly, it's often easier
to think about either the **span**, **center of mass (com)** or **half-life**
of an EW moment:

.. math::

\alpha =
\begin{cases}
\frac{2}{s + 1}, & s = \text{span}\\
\frac{1}{1 + c}, & c = \text{center of mass}\\
1 - \exp^{\frac{\log 0.5}{h}}, & h = \text{half life}
\frac{2}{s + 1}, & \text{for span}\ s \geq 1\\
\frac{1}{1 + c}, & \text{for center of mass}\ c \geq 0\\
1 - \exp^{\frac{\log 0.5}{h}}, & \text{for half-life}\ h > 0
\end{cases}

One must specify precisely one of the three to the EW functions. **Span**
corresponds to what is commonly called a "20-day EW moving average" for
example. **Center of mass** has a more physical interpretation. For example,
**span** = 20 corresponds to **com** = 9.5. **Halflife** is the period of
time for the exponential weight to reduce to one half.
One must specify precisely one of **span**, **center of mass**, **half-life**
and **alpha** to the EW functions:

- **Span** corresponds to what is commonly called an "N-day EW moving average".
- **Center of mass** has a more physical interpretation and can be thought of
in terms of span: :math:`c = (s - 1) / 2`.
- **Half-life** is the period of time for the exponential weight to reduce to
one half.
- **Alpha** specifies the smoothing factor directly.

Here is an example for a univariate time series:

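Reviewer-style note on the hunk above: as a quick sanity check on the documented formulas, here is a minimal pure-Python sketch of the three conversions to alpha. It is not part of the patch, and the helper names `alpha_from_span`, `alpha_from_com`, and `alpha_from_halflife` are hypothetical, chosen only for illustration.

```python
import math


def alpha_from_span(span):
    # alpha = 2 / (s + 1), documented for span >= 1
    return 2.0 / (span + 1.0)


def alpha_from_com(com):
    # alpha = 1 / (1 + c), documented for com >= 0
    return 1.0 / (1.0 + com)


def alpha_from_halflife(halflife):
    # alpha = 1 - exp(log(0.5) / h), documented for halflife > 0
    return 1.0 - math.exp(math.log(0.5) / halflife)


# A span of 20 and a center of mass of (20 - 1) / 2 = 9.5 give the same alpha.
span = 20
com = (span - 1) / 2.0
assert math.isclose(alpha_from_span(span), alpha_from_com(com))

# With half-life h, an observation h periods old has weight (1 - alpha) ** h,
# which should come out as one half.
halflife = 5
decayed_weight = (1.0 - alpha_from_halflife(halflife)) ** halflife
assert math.isclose(decayed_weight, 0.5)
```

The second assertion is just the half-life bullet restated numerically: after `h` periods the exponential weight has dropped to one half.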
1 change: 1 addition & 0 deletions doc/source/whatsnew/v0.18.0.txt
@@ -947,6 +947,7 @@ Other API Changes
- More helpful error message when constructing a ``DataFrame`` with empty data but with indices (:issue:`8020`)
- ``.describe()`` will now properly handle bool dtype as a categorical (:issue:`6625`)
- More helpful error message invalid ``.transform`` with user defined input (:issue:`10165`)
- Exponentially weighted functions now allow specifying alpha directly (:issue:`10789`) and raise ``ValueError`` if parameters violate ``0 < alpha <= 1`` (:issue:`12492`)

.. _whatsnew_0180.deprecations:

7 changes: 4 additions & 3 deletions pandas/core/generic.py
@@ -5165,11 +5165,12 @@ def expanding(self, min_periods=1, freq=None, center=False, axis=0):
cls.expanding = expanding

@Appender(rwindow.ewm.__doc__)
def ewm(self, com=None, span=None, halflife=None, min_periods=0,
freq=None, adjust=True, ignore_na=False, axis=0):
def ewm(self, com=None, span=None, halflife=None, alpha=None,
min_periods=0, freq=None, adjust=True, ignore_na=False,
axis=0):
axis = self._get_axis_number(axis)
return rwindow.ewm(self, com=com, span=span, halflife=halflife,
min_periods=min_periods, freq=freq,
alpha=alpha, min_periods=min_periods, freq=freq,
adjust=adjust, ignore_na=ignore_na, axis=axis)

cls.ewm = ewm
71 changes: 43 additions & 28 deletions pandas/core/window.py
@@ -1038,13 +1038,21 @@ class EWM(_Rolling):

Parameters
----------
com : float. optional
Center of mass: :math:`\alpha = 1 / (1 + com)`,
com : float, optional
Specify decay in terms of center of mass,
:math:`\alpha = 1 / (1 + com),\text{ for } com \geq 0`
span : float, optional
Specify decay in terms of span, :math:`\alpha = 2 / (span + 1)`
Specify decay in terms of span,
:math:`\alpha = 2 / (span + 1),\text{ for } span \geq 1`
halflife : float, optional
Specify decay in terms of halflife,
:math:`\alpha = 1 - exp(log(0.5) / halflife)`
Specify decay in terms of half-life,
:math:`\alpha = 1 - exp(log(0.5) / halflife),\text{ for } halflife > 0`
alpha : float, optional
Specify smoothing factor :math:`\alpha` directly,
:math:`0 < \alpha \leq 1`

.. versionadded:: 0.18.0

min_periods : int, default 0
Minimum number of observations in window required to have a value
(otherwise result is NA).
@@ -1063,16 +1071,10 @@ class EWM(_Rolling):

Notes
-----
Either center of mass, span or halflife must be specified

EWMA is sometimes specified using a "span" parameter `s`, we have that the
decay parameter :math:`\alpha` is related to the span as
:math:`\alpha = 2 / (s + 1) = 1 / (1 + c)`

where `c` is the center of mass. Given a span, the associated center of
mass is :math:`c = (s - 1) / 2`

So a "20-day EWMA" would have center 9.5.
Exactly one of center of mass, span, half-life, and alpha must be provided.
Allowed values and relationship between the parameters are specified in the
parameter descriptions above; see the link at the end of this section for
a detailed explanation.

The `freq` keyword is used to conform time series data to a specified
frequency by resampling the data. This is done with the default parameters
@@ -1096,14 +1098,15 @@ class EWM(_Rolling):
(if adjust is True), and 1-alpha and alpha (if adjust is False).

More details can be found at
http://pandas.pydata.org/pandas-docs/stable/computation.html#exponentially-weighted-moment-functions
http://pandas.pydata.org/pandas-docs/stable/computation.html#exponentially-weighted-windows
"""
_attributes = ['com', 'min_periods', 'freq', 'adjust', 'ignore_na', 'axis']

def __init__(self, obj, com=None, span=None, halflife=None, min_periods=0,
freq=None, adjust=True, ignore_na=False, axis=0):
def __init__(self, obj, com=None, span=None, halflife=None, alpha=None,
min_periods=0, freq=None, adjust=True, ignore_na=False,
axis=0):
self.obj = obj
self.com = _get_center_of_mass(com, span, halflife)
self.com = _get_center_of_mass(com, span, halflife, alpha)
self.min_periods = min_periods
self.freq = freq
self.adjust = adjust
@@ -1320,20 +1323,32 @@ def dataframe_from_int_dict(data, frame_template):
return _flex_binary_moment(arg2, arg1, f)


def _get_center_of_mass(com, span, halflife):
valid_count = len([x for x in [com, span, halflife] if x is not None])
def _get_center_of_mass(com, span, halflife, alpha):
valid_count = len([x for x in [com, span, halflife, alpha]
if x is not None])
if valid_count > 1:
raise Exception("com, span, and halflife are mutually exclusive")

if span is not None:
# convert span to center of mass
raise ValueError("com, span, halflife, and alpha "
"are mutually exclusive")

# Convert to center of mass; domain checks ensure 0 < alpha <= 1
if com is not None:
if com < 0:
raise ValueError("com must satisfy: com >= 0")
elif span is not None:
if span < 1:
raise ValueError("span must satisfy: span >= 1")
com = (span - 1) / 2.
elif halflife is not None:
# convert halflife to center of mass
if halflife <= 0:
raise ValueError("halflife must satisfy: halflife > 0")
decay = 1 - np.exp(np.log(0.5) / halflife)
com = 1 / decay - 1
elif com is None:
raise Exception("Must pass one of com, span, or halflife")
elif alpha is not None:
if alpha <= 0 or alpha > 1:
raise ValueError("alpha must satisfy: 0 < alpha <= 1")
com = (1.0 - alpha) / alpha
else:
raise ValueError("Must pass one of com, span, halflife, or alpha")

return float(com)

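Reviewer-style note on `_get_center_of_mass`: to illustrate why the new domain checks matter, here is a small sketch that repeats the conversions to center of mass without the checks, then maps the result back to alpha via `alpha = 1 / (1 + com)`. It is not part of the patch; the function name `unchecked_alpha` is hypothetical, and the sketch assumes exactly one argument is passed.

```python
import math


def unchecked_alpha(com=None, span=None, halflife=None, alpha=None):
    # Same conversions to center of mass as in the patch, minus the domain
    # checks. Assumes exactly one argument is not None.
    if span is not None:
        com = (span - 1) / 2.0
    elif halflife is not None:
        decay = 1 - math.exp(math.log(0.5) / halflife)
        com = 1 / decay - 1
    elif alpha is not None:
        com = (1.0 - alpha) / alpha
    # Map the center of mass back to the smoothing factor.
    return 1.0 / (1.0 + com)


# Values accepted by the patch give alpha within (0, 1].
assert math.isclose(unchecked_alpha(com=0), 1.0)
assert math.isclose(unchecked_alpha(span=1), 1.0)
assert math.isclose(unchecked_alpha(halflife=1), 0.5)
assert math.isclose(unchecked_alpha(alpha=0.3), 0.3)

# Values rejected by the patch would give alpha outside (0, 1].
assert unchecked_alpha(com=-0.5) > 1     # alpha = 1 / 0.5
assert unchecked_alpha(span=0.5) > 1     # alpha = 2 / 1.5
assert unchecked_alpha(halflife=-1) < 0  # alpha = 1 - exp(log 2)
```

Each rejected input corresponds to one of the `ValueError` branches above, which is consistent with the comment that the domain checks ensure `0 < alpha <= 1`.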
62 changes: 35 additions & 27 deletions pandas/stats/moments.py
@@ -67,13 +67,21 @@
"""


_ewm_kw = r"""com : float. optional
Center of mass: :math:`\alpha = 1 / (1 + com)`,
_ewm_kw = r"""com : float, optional
Specify decay in terms of center of mass,
:math:`\alpha = 1 / (1 + com),\text{ for } com \geq 0`
span : float, optional
Specify decay in terms of span, :math:`\alpha = 2 / (span + 1)`
Specify decay in terms of span,
:math:`\alpha = 2 / (span + 1),\text{ for } span \geq 1`
halflife : float, optional
Specify decay in terms of halflife,
:math:`\alpha = 1 - exp(log(0.5) / halflife)`
Specify decay in terms of half-life,
:math:`\alpha = 1 - exp(log(0.5) / halflife),\text{ for } halflife > 0`
alpha : float, optional
Specify smoothing factor :math:`\alpha` directly,
:math:`0 < \alpha \leq 1`

.. versionadded:: 0.18.0

min_periods : int, default 0
Minimum number of observations in window required to have a value
(otherwise result is NA).
@@ -92,16 +100,10 @@
_ewm_notes = r"""
Notes
-----
Either center of mass, span or halflife must be specified

EWMA is sometimes specified using a "span" parameter `s`, we have that the
decay parameter :math:`\alpha` is related to the span as
:math:`\alpha = 2 / (s + 1) = 1 / (1 + c)`

where `c` is the center of mass. Given a span, the associated center of mass is
:math:`c = (s - 1) / 2`

So a "20-day EWMA" would have center 9.5.
Exactly one of center of mass, span, half-life, and alpha must be provided.
Allowed values and relationship between the parameters are specified in the
parameter descriptions above; see the link at the end of this section for
a detailed explanation.

When adjust is True (default), weighted averages are calculated using weights
(1-alpha)**(n-1), (1-alpha)**(n-2), ..., 1-alpha, 1.
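
Reviewer-style note on the `adjust=True` weights quoted above: they can be sketched in plain Python as follows. This is an illustration only, not the pandas implementation; `ew_mean_adjusted` is a hypothetical helper, and it ignores `min_periods`, `ignore_na`, and missing values.

```python
def ew_mean_adjusted(values, alpha):
    # `values` is ordered oldest first. The newest observation gets weight 1,
    # the one before it 1 - alpha, and so on back to (1 - alpha) ** (n - 1).
    n = len(values)
    weights = [(1.0 - alpha) ** (n - 1 - i) for i in range(n)]
    weighted_sum = sum(w * x for w, x in zip(weights, values))
    return weighted_sum / sum(weights)


# With alpha = 0.5 the weights are 0.25, 0.5, 1, so the result is
# (0.25 * 1 + 0.5 * 2 + 1 * 3) / 1.75 = 17 / 7.
result = ew_mean_adjusted([1.0, 2.0, 3.0], alpha=0.5)
assert abs(result - 17.0 / 7.0) < 1e-12
```

With `alpha = 1` every weight except the newest is zero, so the sketch returns the latest observation unchanged.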
@@ -121,7 +123,7 @@
True), and 1-alpha and alpha (if adjust is False).

More details can be found at
http://pandas.pydata.org/pandas-docs/stable/computation.html#exponentially-weighted-moment-functions
http://pandas.pydata.org/pandas-docs/stable/computation.html#exponentially-weighted-windows
"""

_expanding_kw = """min_periods : int, default None
@@ -323,14 +325,15 @@ def rolling_corr(arg1, arg2=None, window=None, pairwise=None, **kwargs):
@Substitution("Exponentially-weighted moving average", _unary_arg, _ewm_kw,
_type_of_input_retval, _ewm_notes)
@Appender(_doc_template)
def ewma(arg, com=None, span=None, halflife=None, min_periods=0, freq=None,
adjust=True, how=None, ignore_na=False):
def ewma(arg, com=None, span=None, halflife=None, alpha=None, min_periods=0,
freq=None, adjust=True, how=None, ignore_na=False):
return ensure_compat('ewm',
'mean',
arg,
com=com,
span=span,
halflife=halflife,
alpha=alpha,
min_periods=min_periods,
freq=freq,
adjust=adjust,
@@ -341,14 +344,15 @@ def ewma(arg, com=None, span=None, halflife=None, min_periods=0, freq=None,
@Substitution("Exponentially-weighted moving variance", _unary_arg,
_ewm_kw + _bias_kw, _type_of_input_retval, _ewm_notes)
@Appender(_doc_template)
def ewmvar(arg, com=None, span=None, halflife=None, min_periods=0, bias=False,
freq=None, how=None, ignore_na=False, adjust=True):
def ewmvar(arg, com=None, span=None, halflife=None, alpha=None, min_periods=0,
bias=False, freq=None, how=None, ignore_na=False, adjust=True):
return ensure_compat('ewm',
'var',
arg,
com=com,
span=span,
halflife=halflife,
alpha=alpha,
min_periods=min_periods,
freq=freq,
adjust=adjust,
@@ -361,14 +365,15 @@ def ewmvar(arg, com=None, span=None, halflife=None, min_periods=0, bias=False,
@Substitution("Exponentially-weighted moving std", _unary_arg,
_ewm_kw + _bias_kw, _type_of_input_retval, _ewm_notes)
@Appender(_doc_template)
def ewmstd(arg, com=None, span=None, halflife=None, min_periods=0, bias=False,
freq=None, how=None, ignore_na=False, adjust=True):
def ewmstd(arg, com=None, span=None, halflife=None, alpha=None, min_periods=0,
bias=False, freq=None, how=None, ignore_na=False, adjust=True):
return ensure_compat('ewm',
'std',
arg,
com=com,
span=span,
halflife=halflife,
alpha=alpha,
min_periods=min_periods,
freq=freq,
adjust=adjust,
@@ -383,9 +388,9 @@ def ewmstd(arg, com=None, span=None, halflife=None, min_periods=0, bias=False,
@Substitution("Exponentially-weighted moving covariance", _binary_arg_flex,
_ewm_kw + _pairwise_kw, _type_of_input_retval, _ewm_notes)
@Appender(_doc_template)
def ewmcov(arg1, arg2=None, com=None, span=None, halflife=None, min_periods=0,
bias=False, freq=None, pairwise=None, how=None, ignore_na=False,
adjust=True):
def ewmcov(arg1, arg2=None, com=None, span=None, halflife=None, alpha=None,
min_periods=0, bias=False, freq=None, pairwise=None, how=None,
ignore_na=False, adjust=True):
if arg2 is None:
arg2 = arg1
pairwise = True if pairwise is None else pairwise
@@ -401,6 +406,7 @@ def ewmcov(arg1, arg2=None, com=None, span=None, halflife=None, min_periods=0,
com=com,
span=span,
halflife=halflife,
alpha=alpha,
min_periods=min_periods,
bias=bias,
freq=freq,
@@ -414,8 +420,9 @@ def ewmcov(arg1, arg2=None, com=None, span=None, halflife=None, min_periods=0,
@Substitution("Exponentially-weighted moving correlation", _binary_arg_flex,
_ewm_kw + _pairwise_kw, _type_of_input_retval, _ewm_notes)
@Appender(_doc_template)
def ewmcorr(arg1, arg2=None, com=None, span=None, halflife=None, min_periods=0,
freq=None, pairwise=None, how=None, ignore_na=False, adjust=True):
def ewmcorr(arg1, arg2=None, com=None, span=None, halflife=None, alpha=None,
min_periods=0, freq=None, pairwise=None, how=None, ignore_na=False,
adjust=True):
if arg2 is None:
arg2 = arg1
pairwise = True if pairwise is None else pairwise
@@ -430,6 +437,7 @@ def ewmcorr(arg1, arg2=None, com=None, span=None, halflife=None, min_periods=0,
com=com,
span=span,
halflife=halflife,
alpha=alpha,
min_periods=min_periods,
freq=freq,
how=how,