Request saturation (rule 3.17) #196


Closed
DougLea opened this issue Jan 5, 2015 · 17 comments · Fixed by #203

Comments

@DougLea
Contributor

DougLea commented Jan 5, 2015

The part of rule 3.17 saying...

"If demand becomes higher than 2^63-1 then the Publisher MUST signal an onError with java.lang.IllegalStateException on the given Subscriber."

... is not clear about whether it is OK to always saturate at Long.MAX_VALUE, as in the following overflow-detection code (assuming a field `demand` holding cumulative unfilled requests):
`request(long n) { long prev = demand, d = prev + n; demand = (d < prev) ? Long.MAX_VALUE : d; ... }`

I think it ought to be OK (if not strictly required). One reason implementations ought to do this anyway is that Subscribers should not have to keep track of whether they have already called request(Long.MAX_VALUE) before issuing further requests.
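
For illustration, here is a thread-safe sketch of the same saturate-on-overflow idea (just one possible shape, using AtomicLong; nothing about the class or field names below comes from the spec):

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch: cumulative unfilled demand that saturates at Long.MAX_VALUE instead of overflowing.
final class SaturatingDemand {
    private final AtomicLong demand = new AtomicLong();

    // n is assumed to have been validated as positive already (per the rule rejecting non-positive requests).
    long add(long n) {
        return demand.accumulateAndGet(n, (current, extra) -> {
            long sum = current + extra;
            // The sum of two non-negative longs is negative exactly when it overflows.
            return sum < 0 ? Long.MAX_VALUE : sum;
        });
    }

    long current() {
        return demand.get();
    }
}
```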

@viktorklang
Contributor

Hi @DougLea,

The reason for this rule was that we couldn't find a single plausible situation where an overflow of 2^63 would make sense, so an overflow would mean a buggy Subscriber, in which case we wanted to fail fast.

Remember that signalled demand == current buffer capacity (since sync Subscribers would be able to call request(Long.MAX_VALUE) in onSubscribe, which is specced to be called only once).

Sane modes of operation (please feel free to add more here; a minimal sketch of the last one follows the list):

- N buffered lock-step == call request(N) after receiving N elements == need to keep track of demand
- N buffered concurrent == call request(N) initially and then call request(N-X) after receiving every X elements == need to keep track of demand
- Unbounded sync Subscriber == call request(Long.MAX_VALUE) in onSubscribe == no need for demand tracking
- 1-buffered == call request(1) in onSubscribe and then request(1) after receiving every element == no need to keep track of demand
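
To make the last mode concrete, here is a minimal sketch of such a 1-buffered Subscriber against the org.reactivestreams interfaces (illustrative only; a real implementation has more rules to honour, and the class name is made up):

```java
import org.reactivestreams.Subscriber;
import org.reactivestreams.Subscription;

// Sketch of the 1-buffered mode: at most one element outstanding, no demand bookkeeping needed.
final class OneByOneSubscriber<T> implements Subscriber<T> {
    private Subscription subscription;

    @Override public void onSubscribe(Subscription s) {
        subscription = s;
        s.request(1);            // prime with a single element
    }

    @Override public void onNext(T element) {
        process(element);
        subscription.request(1); // ask for exactly one more
    }

    @Override public void onError(Throwable t) { t.printStackTrace(); }

    @Override public void onComplete() { }

    private void process(T element) {
        System.out.println("received: " + element);
    }
}
```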

@DougLea
Contributor Author

DougLea commented Jan 6, 2015

Do you mean that the non-error saturation/overflow cases are only those in which demand is already Long.MAX_VALUE, or in which there is a request(Long.MAX_VALUE) call? If so, the "may treat as unbounded" must be changed to "must treat as unbounded" -- otherwise demand will become less than Long.MAX_VALUE after a consume.

This rule seems to target a narrow class of bad usages while prohibiting a narrow class of valid usages, so it might not be worthwhile.

The use cases I had in mind include those with a known-to-be-slow producer (like a once-an-hour timer) for which you'd set request(Long.MAX_VALUE) up front. But you might then call some utility method that must ensure demand > 0 and so calls request(1). (Similar issues arise with other sets of request values that happen to add up to more than Long.MAX_VALUE.) There's no good way for users to cope unless saturating on overflow is legal (and I'm still not sure the current wording makes it illegal).

@viktorklang
Contributor

Hi Doug!

Answers inline:

On 6 Jan 2015 13:19, "DougLea" [email protected] wrote:

> Do you mean that the non-error saturation/overflow cases are only those in which demand is already Long.MAX_VALUE, or there is a request(MAX_VALUE) call? In which case, the "may treat as unbounded" must be changed to "must treat as unbounded" -- otherwise demand will become less than Long.MAX_VALUE after a consume.

Which is fine; the rationale is in the footnote of the rule: "[1]: As it is not feasibly reachable with current or foreseen hardware within a reasonable amount of time (1 element per nanosecond would take 292 years) to fulfill a demand of 2^63-1."

> This rule seems to target a narrow class of bad usages while prohibiting a narrow class of valid usages, so might not be worthwhile.

It is less about preventing intentionally bad usage than about catching buggy implementations and failing fast.

> The use cases I had in mind include those with a known-to-be-slow producer (like a once-an-hour timer) for which you'd set request(Long.MAX_VALUE) up front.

A Subscriber does not know (at compile time) whether a Publisher is fast or slow, though. And it is still fine to issue request(Long.MAX_VALUE) in onSubscribe if one is willing to suffer the consequences (effectively disabling backpressure).

For a slow Publisher the overhead of calling request(1) in onNext has virtually zero performance impact, since the Subscriber will be waiting for the Publisher most or all of the time anyway.

> But then might call some utility method that must ensure demand > 0 so calls request(1). (Similar issues with other sets of request values that happen to add up to > Long.MAX_VALUE.) There's no good way for users to cope unless saturating on overflow is legal (which I'm still not sure is not legal according to current wording.)

I am not sure I follow -- isn't this covered by one of the scenarios I listed? (And if not, why not?)

Overflowing is not compliant; IIRC the TCK checks for that.
Can we improve the wording of the rule to be more crisp? Or is this another good use case for the proposed rule rationale section?



@DougLea
Contributor Author

DougLea commented Jan 6, 2015

I think that this comes down to either (1) adding wording that amounts to saying that upon consume, a demand of Long.MAX_VALUE is not decremented (otherwise you cannot define/detect some violations of 3.17), or (2) just omitting the wording and allowing saturation. (2) still seems like the better tradeoff. I imagine that usages will eventually include some specialized Pub/Sub pairs that do know approximate rates beforehand and try to set up the best flow statically, and they will then become unhappy when well-intentioned rules prevent them from doing so.
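
To make option (1) concrete, here is a sketch of Publisher-side bookkeeping in which a demand of Long.MAX_VALUE is never decremented on delivery (single-threaded for brevity; the class and method names are illustrative, not spec wording):

```java
// Sketch of option (1): demand saturates at Long.MAX_VALUE on request and, once there,
// is treated as effectively unbounded and never decremented when elements are delivered.
final class StickyUnboundedDemand {
    private long demand; // cumulative unfilled requests; external synchronization assumed

    void onRequested(long n) {
        long sum = demand + n;
        demand = (sum < 0) ? Long.MAX_VALUE : sum; // saturate on overflow
    }

    // Returns true if one more element may be emitted to the Subscriber.
    boolean tryClaimOne() {
        if (demand == 0) return false;
        if (demand != Long.MAX_VALUE) demand--;    // unbounded demand stays unbounded
        return true;
    }
}
```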

@viktorklang
Contributor

I think I understand your point now: is your question whether it is legal to call request(1) when processing an element in onNext if one has previously issued request(Long.MAX_VALUE)?

Cheers,


@DougLea
Contributor Author

DougLea commented Jan 6, 2015

Well, I think that if you answer that question, then you've answered all of them!

@viktorklang
Contributor

Yes, that would be legal since it would not overflow 2^63-1.

I'm open to being convinced that we can drop the onError-on-overflow requirement, but it does prevent discrepancies between Subscriber demand tracking and Publisher demand tracking.

Dropping that rule may have some adverse effects on implementations, though (unless it is OK to decrement demand when onNext-ing, and request only fills demand up to Long.MAX_VALUE). That would mean Subscribers that do demand tracking have to replicate this bookkeeping rule to avoid diverging from Publisher demand tracking.

The question is how important that'd be given it'd take > 292 years to find
out you were out of sync.

Cheers,


@viktorklang
Contributor

> Yes, that would be legal since it would not overflow 2^63-1.

Of course, this assumes that no previous demand existed prior to issuing the request(Long.MAX_VALUE).

Cheers,


@DougLea
Contributor Author

DougLea commented Jan 6, 2015

How about calling request(17) in onNext if you've already called request(Long.MAX_VALUE)?

Since there is no Subscription.getDemand() method (and I don't think there needs to be one), the simpler the rule the better. Saturate-at-max is one way to reduce Publisher/Subscriber implementor errors as they try to account for these rules when tracking demand. Anything simpler would be even better, but I can't think of anything.
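
A hypothetical walk-through (my own numbers, not from the spec) of how saturate-at-max keeps both sides consistent without any getDemand() accessor:

```java
// Hypothetical sequence under saturate-at-max bookkeeping:
// demand = 0
// request(Long.MAX_VALUE)  -> demand = Long.MAX_VALUE       (saturated)
// request(17)              -> demand stays Long.MAX_VALUE   (the sum would overflow, so clamp)
// element delivered        -> demand stays Long.MAX_VALUE   (treated as effectively unbounded)
// Both sides can apply these two rules independently and never diverge.
```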

@viktorklang
Contributor

On Tue, Jan 6, 2015 at 3:25 PM, DougLea [email protected] wrote:

> How about calling request(17) in onNext if you've already called request(Long.MAX_VALUE)?

Now THAT is a good question, and as with all good questions the answer is "it depends", as the Publisher could have registered the Long.MAX_VALUE and already sent 17 elements downstream by the time the request(17) arrives. Now, the questions are:

1. When does request(X) after request(Long.MAX_VALUE) make sense?
2. Does it really matter that we can detect demand overflow if it may take 292 years to do so? (Of course, we would be able to detect overflow in the sense of request(Long.MAX_VALUE)^2.)
3. Is it good enough to have the spec state that Publishers are only required to track demand up to Long.MAX_VALUE, and if it overflows the Publisher can decrement from that?
4. Is there any unforeseen risk of having Subscriber and Publisher demand tracking diverge at that level? (I.e. 1 element per ns for 292 years.)

What does Mr Protocol say (@tmontgomery)?


Cheers,

@DougLea
Contributor Author

DougLea commented Jan 6, 2015

In case it is helpful: Note that the current demand is an upper bound on how much buffering producers need to guarantee not to drop or block. Capacities past a billion or so (and usually far far less) are never going to be preallocated. So subscription.request(Long.MAX_VALUE) means that the caller risks potential block/drop/fail whenever resources run out (or whenever the Publisher feels like it) rather than in a controlled fashion. This may be a common initial/casual usage idiom for simple consumers, who will then sometimes find out that they should do something better.

@viktorklang
Contributor

On Tue, Jan 6, 2015 at 7:30 PM, DougLea [email protected] wrote:

> In case it is helpful: Note that the current demand is an upper bound on how much buffering producers need to guarantee not to drop or block.

Am I correct that you mean "Subscriber" instead of "producer" above? (The demand signals how many elements a Publisher is allowed to send to a Subscriber, i.e. the buffering is done on the Subscriber side.)

> Capacities past a billion or so (and usually far far less) are never going to be preallocated.

The problem I see is that requesting a lot of elements means that you (i.e. the Subscriber) must be able to buffer them as they arrive, so it may be fine when you only have a few instances of Subscribers with huge buffers, but as soon as you end up with many of those you can quite easily get into OOME territory.

> So subscription.request(Long.MAX_VALUE) means that the caller risks potential block/drop/fail whenever resources run out (or whenever the Publisher feels like it) rather than in a controlled fashion.

FWIW subscription.request(Long.MAX_VALUE) most typically(?) has little impact on the Publisher: say the Publisher is a time-publisher with 1ns resolution; it would then try to publish 1 element per nanosecond for the next 292 years. This does not require it to buffer any of the nanos.

> This may be a common initial/casual usage idiom for simple consumers, who will then sometimes find out that they should do something better.

The tradeoff with request(large number) is that you need to be willing and
able to receive it, so for sync subscribers it is typically not a problem,
but for async ones it is more of a gamble (for generic Publishers).



Cheers,

@viktorklang
Contributor

Currently we have:

| <a name="3.17">17</a>     | A `Subscription` MUST support an unbounded number of calls to request and MUST support a demand (sum requested - sum delivered) up to 2^63-1 (`java.lang.Long.MAX_VALUE`). A demand of exactly 2^63-1 (`java.lang.Long.MAX_VALUE`) MAY be considered by the `Publisher` as `effectively unbounded`[[1](#footnote-3-1)]. If demand becomes higher than 2^63-1 then the `Publisher` MUST signal an onError with `java.lang.IllegalStateException` on the given `Subscriber`. The cause message MUST include a reference to this rule and/or quote the full rule |

[<a name="footnote-3-1">1</a>] : As it is not feasibly reachable with current or foreseen hardware within a reasonable amount of time (1 element per nanosecond would take 292 years) to fulfill a demand of 2^63-1.

How about we change it to:

| <a name="3.17">17</a>     | A `Subscription` MUST support an unbounded number of calls to request and MUST support a demand (sum requested - sum delivered) up to 2^63-1 (`java.lang.Long.MAX_VALUE`). A demand equal or greater than 2^63-1 (`java.lang.Long.MAX_VALUE`) MAY be considered by the `Publisher` as `effectively unbounded`[[1](#footnote-3-1)]. |

[<a name="footnote-3-1">1</a>] : As it is not feasibly reachable with current or foreseen hardware within a reasonable amount of time (1 element per nanosecond would take 292 years) to fulfill a demand of 2^63-1, it is allowed for a `Publisher` to stop tracking demand beyond this point.

@reactive-streams/contributors What do you think about this? The implication for the spec would be that the Publisher no longer has to emit onError on overflow of long, and overflow can silently be discarded.

@DougLea
Contributor Author

DougLea commented Jan 10, 2015

Yes please. Your rationale is better than mine!

@rkuhn
Member

rkuhn commented Jan 12, 2015

AFAICS this does not place an extra burden on the Publisher and does not hurt error reporting in practice: if we overflow due to a flawed Subscriber, then that would mean requesting A LOT of elements already, i.e. an outstanding demand of 1000000 would already be “buggy”.

So: LGTM

@drewhk
Contributor

drewhk commented Jan 12, 2015

If the Subscriber erroneously reports more demand than it can handle, it will fail anyway. If the Subscriber can handle an arbitrary number of elements (for example, a synchronous one), then again it does not matter. So yes, +1 to the proposal.

@rkuhn
Member

rkuhn commented Jan 16, 2015

We should fix this before 1.0 in order to minimize confusion later (since changing this will break compatibility). I’ll open a PR shortly to inch forward in this direction.

rkuhn added a commit that referenced this issue Jan 16, 2015
fixes #196

Also terminate spec rule sentences with a full stop where missing.
rkuhn added a commit that referenced this issue Jan 23, 2015
fixes #196

Also terminate spec rule sentences with a full stop where missing.