bug#10511 Make Ordering methods consistent #6323


Closed
wants to merge 1 commit

Conversation

NthPortal
Contributor

Remove overridden method implementations in Ordering.

Reverts 460bbc1

Fixes scala/bug#10511

@scala-jenkins scala-jenkins added this to the 2.13.0-M4 milestone Feb 13, 2018
@NthPortal
Contributor Author

/sync

@SethTisue
Member

see also scala/bug#5104

this is the kind of thing it can be hard to get a reviewer for...

I'm somewhat uncomfortable with losing the ScalaCheck tests that were added in 460bbc1. Are any of them still applicable, or easily adapted to be applicable?

@khernyo are you still around to weigh in...?

@NthPortal
Contributor Author

@SethTisue all of the ScalaCheck tests check that various operations are inconsistent with Ordering.compare, and thus are not applicable (given the premise of this PR/issue is accepted)

@NthPortal
Contributor Author

NthPortal commented Feb 28, 2018

The root problem is that IEEE floating point values do not have a total ordering; only a partial ordering. Thus, Ordering cannot be implemented for Float or Double without either violating the contract for Ordering or the IEEE spec.
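The conflict is easy to see directly in the REPL; here is a small sketch (plain Scala, no library assumptions) of the two specs pulling apart:

```scala
// With NaN involved, every IEEE primitive comparison is false,
// so no compare-style total order can agree with all of them.
val nan = Double.NaN
assert(!(nan < 1.0) && !(nan > 1.0) && !(nan == 1.0) && !(nan == nan))

// A total ordering must still put NaN somewhere. java.lang.Double.compare
// chooses "NaN is greater than everything" and orders -0.0 before 0.0:
assert(java.lang.Double.compare(Double.NaN, Double.PositiveInfinity) > 0)
assert(java.lang.Double.compare(-0.0, 0.0) < 0)
```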

To complicate matters further, Numeric extends Ordering (rather than, perhaps, PartialOrdering). Float and Double are certainly Numeric (they're numbers!), but now cannot properly fulfill part of their contract. As can be seen in 460bbc1, it is the use of Numeric to compare floating point numbers that is particularly confusing. To me at least, it is not a stretch to say that Ordering.max and Ordering.min violate the IEEE spec in order to make a total ordering; however, I would expect Numeric.max and Numeric.min to conform to the IEEE spec. But Numeric.FloatIsFractional inherits from Ordering.FloatOrdering (and similarly for Double).

Since 460bbc1, Floats and Doubles could not reliably be used in sorted collections if the collection used Ordering.lteq, as putting NaN (and perhaps -0.0) in the collection would create inconsistencies (and perhaps lead to non-deterministic sort order). For that reason, I think it is imperative that Ordering.FloatOrdering and Ordering.DoubleOrdering be made internally consistent, and behave as a total ordering.

A possible solution is to use Orderings other than Ordering.FloatOrdering and Ordering.DoubleOrdering to implement Numeric.FloatIsFractional and Numeric.DoubleIsFractional. With this behaviour, it would be possible to keep the ScalaCheck tests from 460bbc1. However, Numeric[Float] and Numeric[Double] would no longer behave the same as Ordering[Float] and Ordering[Double], which could potentially be very confusing (especially if you have the wrong one in the implicit scope by accident).

Another solution is to change Numeric to extend PartialOrdering instead of Ordering (Integral could still extend Ordering). Then, Numeric.FloatIsFractional and Numeric.DoubleIsFractional could implement PartialOrderings complying with the IEEE spec, and Ordering.FloatOrdering and Ordering.DoubleOrdering could implement consistent total orderings which violate the IEEE spec. Of course, this would be a significant change, potentially breaking lots of code.

Remove overridden method implementations in Ordering.

Reverts 460bbc1

Fixes scala/bug#10511
@khernyo
Contributor

khernyo commented Feb 28, 2018

I just skimmed it for now but as I see it, you are trading one bugfix for another. I can understand where you are coming from, but accepting this MR will break currently working code, so let's not do that.

@adriaanm
Contributor

adriaanm commented Feb 28, 2018

Sorry for not responding sooner. I'm afraid I'm not the best person to review this. (I like floats, but don't know much about comparing them :-))

@adriaanm adriaanm removed their request for review February 28, 2018 15:50
@NthPortal
Contributor Author

NthPortal commented Feb 28, 2018

No worries. To be honest, I'm not sure who's the best person to review it.

Edit: Sorry the bot added you back @adriaanm - I've removed the original comment, and hopefully it won't do that again

@NthPortal
Contributor Author

@khernyo can I ask what currently working code it breaks?

@khernyo
Contributor

khernyo commented Mar 3, 2018

@NthPortal, the commit you reverted is about consistency between the primitive relational operators and the Ordering of floats. So any code that was counting on these behaving the same might break.
E.g. you might write a function using Double and then rewrite it using Numeric[T] (which extends Ordering[T]). The two functions will look exactly the same, but behave subtly differently. Or at least that's what I remember about the problem.

I'd say that 460bbc1 was a short sighted fix but the consistency it provided was good to have. The fixes in the current MR are also desired, but as you concluded, the two fixes are incompatible. I have no idea which one is more important to have, so it's fine by me either way.

I'd vote for using PartialOrdering instead of Ordering for floats but I know it's a huge undertaking. AFAIK, all other solutions will end up with inconsistencies.

Contributor

@Ichoran Ichoran left a comment


There is a fundamental problem with ordering floating-point values, as people have already pointed out. However, this commit doesn't, I think, meaningfully improve the situation because it breaks as much as it fixes.

The fundamental problem is in the IEEE spec, but we still should try to make things work as naturally as possible. In particular, there are three really natural identities, at least one of which we will have to break.

def invariants(x: Double, y: Double) = {
  val O = implicitly[Ordering[Double]]
  def bits(d: Double) =
    java.lang.Double.doubleToRawLongBits(d)
  ((x < y) == O.lt(x, y)) &&                       // #1
  !(O.lt(x, y) && bits(O.min(x, y)) != bits(x)) && // #2
  O.min(x, y) == (x min y)                         // #3
}

The current state of affairs is that #1 and #3 are true but #2 is false; this PR swaps it so that #1 and #3 are false but #2 is true (but has the advantage of a simpler implementation).

There is an additional constraint, which is that orderings should be usable for sorting things. Unfortunately, #1 is incompatible with this; it fails the basic invariant that a total ordering must satisfy, !(x < y && y < x). Sorts often assume this is true, and breaking it can cause not just mis-sorting but runtime failures in the algorithm. Unfortunately, the current state of affairs is that the ordering isn't an ordering.

If we want an ordering, we have to abandon #1, but arguably < has pretty batty behavior anyway for an operator named that.

We still can choose, though, whether #2 or #3 is true. And here, I think, the resolution falls in favor of #3 being true.

This still leaves us with two possibilities regarding the behavior of min. And here I think it's very dangerous to switch from min returning NaN to it returning a value in any case where you might stumble into it by accident. (This could be literally dangerous if anyone relies upon the behavior to detect an error condition.) The most problematic place this could occur subtly is in something like List(1.0, 2.0, Double.NaN).min.

It is also possible to have a typeclass weaker than Ordering, something like Extrema, which just defines min and max methods. It would be implemented by Ordering, but with a higher-precedence implicit for Extrema[Double] and Extrema[Float], so that the collections' min/max can work the "right way". Failing that, I would say that overriding min and max in FloatOrdering and DoubleOrdering is needed in this PR for it to be considered (it's always more dangerous to go from an error-signaling to an error-ignoring behavior).
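That idea could look roughly like this. This is a hypothetical sketch: the Extrema name, the low-priority trait, and the instance names are illustrative, not an existing API:

```scala
// A typeclass weaker than Ordering: only min and max.
trait Extrema[A] {
  def min(x: A, y: A): A
  def max(x: A, y: A): A
}

trait LowPriorityExtrema {
  // Default: any Ordering induces an Extrema.
  implicit def fromOrdering[A](implicit ord: Ordering[A]): Extrema[A] =
    new Extrema[A] {
      def min(x: A, y: A): A = ord.min(x, y)
      def max(x: A, y: A): A = ord.max(x, y)
    }
}

object Extrema extends LowPriorityExtrema {
  // Higher-precedence instance following IEEE semantics (NaN-propagating),
  // shadowing the Ordering-derived one for Double.
  implicit val doubleExtrema: Extrema[Double] = new Extrema[Double] {
    def min(x: Double, y: Double): Double = math.min(x, y)
    def max(x: Double, y: Double): Double = math.max(x, y)
  }
}
```

The low-priority trait makes the Double instance win implicit resolution over the generic Ordering-derived one, which is the precedence scheme described above.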

I'm still not sure that this PR is an improvement even with that change. It might be better to simply add a separate total ordering that users can apply when they know they want the total ordering.

@NthPortal
Contributor Author

NthPortal commented Mar 3, 2018

The problem is, Ordering[Double] already violates #3. The primary method of Ordering - compare - is already inconsistent with math.min and math.max. The question is, should Ordering also violate #2 (sometimes)?

My concern isn't that methods in Ordering do not conform to an actual total ordering; if the methods were consistent with each other but conformed to IEEE instead of a total ordering, I'd be fine. However, some of the methods are consistent with a total ordering, some are consistent with IEEE (and some with neither), but never both at once. It should not matter whether you say O.lt(x, y) or O.compare(x, y) < 0, but it does. It should not matter whether you say if (O.lteq(x, y)) x else y or O.min(x, y), but it does. The various methods lt, lteq, gt, max, etc. on Ordering are meant (I believe) for convenience, to make code more readable - O.min(x, y) is much clearer than if (O.compare(x, y) <= 0) x else y. But as it is now, which method you use changes behaviour in often unexpected ways. TraversableOnce.min is implemented using Ordering.lteq, which leads to the unexpected and incorrect result that Seq(5.0, 3.0, NaN, 4.0).min == 4.0.
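That TraversableOnce.min behaviour is easy to reproduce by hand; a sketch of the same reduceLeft, with lteq written as the primitive IEEE <=:

```scala
// Mirrors TraversableOnce.min's reduceLeft((x, y) => if (ord.lteq(x, y)) x else y)
// when lteq is the primitive IEEE <=.
def minByLteq(xs: Seq[Double]): Double =
  xs.reduceLeft((x, y) => if (x <= y) x else y)

// Every comparison against NaN is false, so NaN "resets" the running minimum:
// (5.0, 3.0) -> 3.0; (3.0, NaN) -> NaN; (NaN, 4.0) -> 4.0
minByLteq(Seq(5.0, 3.0, Double.NaN, 4.0))  // 4.0
```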

The purpose of this PR is to make it so that Ordering does one thing, consistently. And due to the signature of compare, it is not actually possible for it to comply with IEEE, unless it throws an exception when NaN is encountered (which would be very bad).

I could change the PR so that Ordering.min and .max forward to math.min and math.max, but that still leaves Ordering internally inconsistent (and with TraversableOnce.min and .max not being consistent with Ordering.min and .max).


Regarding #1, I don't believe there are any values for which x < y && y < x; feel free to correct me if I'm wrong.

@NthPortal
Contributor Author

NthPortal commented Mar 3, 2018

If others are amenable to it, I would be happy to work on the significant change of having Numeric extend PartialOrdering instead of Ordering (I'm with you here @khernyo). I don't know how much this would break, but I think it would make more sense in the long run (for the reasons given by several people above).

I don't know how much code this would break in the community. We could potentially add min and max methods to Fractional (or Numeric?) which don't inherit from Ordering, if it helps reduce breakage, though of course PartialOrderings are not guaranteed to have min and max for arbitrary elements.

@Ichoran
Contributor

Ichoran commented Mar 6, 2018

@NthPortal - Right now, #3 is not violated:

Welcome to Scala 2.12.4 (OpenJDK 64-Bit Server VM, Java 1.8.0_151).
Type in expressions for evaluation. Or try :help.

scala> val o = implicitly[Ordering[Double]]
o: Ordering[Double] = scala.math.Ordering$Double$@5699578f

scala> o.min(1.0, Double.NaN)
res0: Double = NaN

scala> (1.0 min Double.NaN)
res1: Double = NaN

However, the inconsistency you point out is a problem. But I think the resolution is that min and max should be inconsistent, but that implementations of min and max should defer to those of their ordering and not try to recompute it. In particular, this is the law we would break:

xs.isEmpty || (xs.sorted.head == xs.reduce(_ min _))

rather than breaking

xs.isEmpty || (xs.reduce(_ min _) == xs.min)

Both are surprising, but the latter seems even more surprising to me.
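For what it's worth, the first law can be checked against NaN without any proposed change, since sorted uses the compare-based total order (which places NaN last) while min on doubles is IEEE math.min:

```scala
val xs = List(1.0, Double.NaN, 2.0)
// sorted uses Ordering.compare (java.lang.Double.compare: NaN sorts last)
assert(xs.sorted.head == 1.0)
// but reduce(_ min _) uses IEEE math.min, which propagates NaN
assert(xs.reduce(_ min _).isNaN)
// so xs.sorted.head == xs.reduce(_ min _) already fails when NaN is present
```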

Anyway, I am not fine with Ordering not being an ordering in the case of IEEE floating point, except if we think it's necessary to avoid breaking people's stuff. But I am fine with having min and max propagate errors instead of blindly following lt and such.

Note that you can't actually have the consistency you wanted if you follow IEEE rules, because IEEE rules are that everything is false. But of course we want O.lt(x, y) == !O.gteq(x, y)--so, no, you can't have a consistent result for compare, but that's because it needs to be consistent with two contradictory things (which are contradictory regardless of whether compare is there at all).

So I think the marginally sensible possibilities are

  1. compare is the odd one out, giving an ordering, and everything else follows IEEE rules
  2. min and max are the odd ones out, following IEEE rules, while all else follows the ordering of compare
  3. We don't even have a default ordering for Float and Double (but provide them as options the user can choose), but we do have a default min/max that follows IEEE rules.

None of these are great, but I maintain that everything else is worse. Again, I don't think it's acceptable that xs.min be different from xs.reduce(_ min _). That's about as fundamental of a relation as you can get.

@Ichoran
Contributor

Ichoran commented Mar 6, 2018

Incidentally, I don't think PartialOrdering really helps here--it isn't to IEEE spec, it's more expensive to compute, and it doesn't preserve the API for things that we need (like min over floats, because min isn't definable). As I mentioned before, I do think a new typeclass that computed extrema would help.

@NthPortal
Contributor Author

NthPortal commented Mar 6, 2018

@Ichoran ah, but xs.min is already different from xs.reduce(_ min _), because TraversableOnce.min is implemented as reduceLeft((x, y) => if (ord.lteq(x, y)) x else y)

Edit: perhaps that should be changed, but it is the current behaviour

@Ichoran
Contributor

Ichoran commented Mar 6, 2018

@NthPortal - Yes, but TraversableOnce is doing min wrong by not deferring to Ordering#min.

@NthPortal
Contributor Author

NthPortal commented Mar 6, 2018

@Ichoran Why doesn't PartialOrdering help? AFAICT, it agrees with IEEE spec

trait DoublePartialOrdering extends PartialOrdering[Double] {
  def tryCompare(x: Double, y: Double): Option[Int] = {
    if (x < y) Some(-1)
    else if (x > y) Some(1)
    else if (x == y) Some(0) // includes -0.0 == 0.0
    else None                // one or both is NaN
  }
  def lteq(x: Double, y: Double): Boolean = x <= y
  override def gteq(x: Double, y: Double): Boolean = x >= y
  override def lt(x: Double, y: Double): Boolean = x < y
  override def gt(x: Double, y: Double): Boolean = x > y
  override def equiv(x: Double, y: Double): Boolean = x == y
}

And if you don't use tryCompare, it just uses primitive operations

@Ichoran
Contributor

Ichoran commented Mar 6, 2018

My mistake--PartialOrdering can agree with IEEE spec. But it doesn't solve the min/max problem which is what we had trouble with to begin with; and it doesn't let us sort doubles, which was also what we wanted to do to begin with. It's just a more faithful model of (the comparison operation part of) the spec for IEEE floating point behavior.

@NthPortal
Contributor Author

If we have Numeric[Double] use a PartialOrdering instead, then we can have Ordering[Double] be fully consistent and ignore IEEE spec (like Java does)

@Ichoran
Contributor

Ichoran commented Mar 6, 2018

Does that resolve the max/min problem?

@NthPortal
Contributor Author

I think so? It makes it so that Numeric[Double].max/.min follow IEEE (as one would expect), and Ordering[Double].max/.min follow the total ordering (which one also might expect).

As you mentioned, it would involve adding an Extrema typeclass as well, probably (or the methods can just be added to Numeric itself).
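Concretely, the two mins under that split would behave like this (a sketch; the names totalMin and ieeeMin are illustrative, standing in for the proposed Ordering[Double].min and Numeric[Double].min):

```scala
// Total-ordering min, as Ordering[Double].min would behave under the proposal
// (java.lang.Double.compare treats NaN as greater than everything).
def totalMin(x: Double, y: Double): Double =
  if (java.lang.Double.compare(x, y) <= 0) x else y

// IEEE min, as Numeric[Double].min would behave (NaN-propagating).
def ieeeMin(x: Double, y: Double): Double = math.min(x, y)

totalMin(1.0, Double.NaN)  // 1.0 (NaN sorts after every other value)
ieeeMin(1.0, Double.NaN)   // NaN
```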

@Ichoran
Contributor

Ichoran commented Mar 6, 2018

So the proposal is to change the behavior of Array(1.0, Double.NaN, 2.0).min such that it returns 1.0, instead of returning 2.0 (which is just nonsense, as it does now) or NaN (which is what you'd get from .reduce(_ min _))?

@NthPortal
Contributor Author

Correct; I am proposing that it would return 1.0 (and .max would return NaN).

If you wanted both min and max to return NaN, you could do Array(1.0, Double.NaN, 2.0).reduce(Numeric[Double].min) as well. Or potentially, the signatures could be changed to require an implicit Extrema instead of Ordering, in which case .min(Numeric[Double]) would work (I don't know if that would break the implicit search).

@Ichoran
Contributor

Ichoran commented Mar 6, 2018

I think the xs.min == xs.reduce(_ min _) invariant is more important than the min-according-to-ordering.

But I guess there is an issue with sorted sets, then, as the trivial implementation wouldn't work.

I do think we should do something; I just don't think this reversion is enough to get there.

@NthPortal
Contributor Author

NthPortal commented Mar 6, 2018

I have some other issues with Numeric/Ordering as well, so I might start a thread over at https://contributors.scala-lang.org/

Edit: your concerns are another good reason to start a thread

@NthPortal
Contributor Author

Funnily enough, I've looked at sorted sets, and there's no easy way to get the min/max by the set's Ordering - it just uses TraversableOnce's implementation (which also seems like something worth changing). So the trivial implementation only exists for maxBefore and minAfter, but not max and min (which is really weird).

@Ichoran
Contributor

Ichoran commented Mar 7, 2018

Do you think it's worth going to all the trouble of changing the ordering type, adding new traits, and so on? Or do you think it's maybe better to define a total ordering in, say, Double.totalOrdering and let people invoke it explicitly when it is desired?

@NthPortal
Contributor Author

I do think it's worth it. Though adding a Double.totalOrdering would help a little bit, most people would never know it exists, and it wouldn't solve the underlying problems.

I've started working on a rewrite/redesign over at #6375

@NthPortal
Contributor Author

NthPortal commented Mar 7, 2018

Suppose we change Numeric to extend PartialOrdering instead. What does Numeric.reverse return?

If it returns a PartialOrdering, then it might break existing code. But if it returns an Ordering, then it breaks future code that uses the Numeric as a PartialOrdering.

(It shouldn't be called regardless, but still)

@Ichoran
Contributor

Ichoran commented Mar 7, 2018

It has to return PartialOrdering because there is no "reverse total ordering" because there is no total ordering.

@NthPortal
Contributor Author

Hopefully no one calls reverse explicitly on a Numeric (which, again, doesn't make a lot of sense in the first place)

@NthPortal
Contributor Author

@Ichoran I just realized a major problem with using the current Ordering in a SortedMap or SortedSet. If Ordering.equiv is used, NaN keys will not be retrievable from the map, because they are never equal.
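The pitfall in two lines: IEEE equality never matches NaN, while the compare-based total order does, so whether a lookup uses equiv or compare determines whether a NaN key can ever be found:

```scala
// IEEE ==: NaN is not equal to anything, including itself,
// so an equiv-based lookup can never match a NaN key.
assert(!(Double.NaN == Double.NaN))

// The compare-based total order treats all NaNs as equivalent,
// so a compare-based tree lookup can still find a NaN key.
assert(java.lang.Double.compare(Double.NaN, Double.NaN) == 0)
```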

@Ichoran
Contributor

Ichoran commented Mar 11, 2018

@NthPortal - Good point. However, this can be fixed by making sure that SortedMap and SortedSet always use compare to decide what to return.

@NthPortal
Contributor Author

@Ichoran true (and I believe that's what they already do); it's still a possible pitfall though

@NthPortal
Contributor Author

I agree with the suggestion you made on Discourse, though I'm not certain of the best way to implement it

@Ichoran
Contributor

Ichoran commented Mar 11, 2018

@NthPortal - Just figure out wherever Ordering[Double] comes from and make sure that both Numeric[Double] and RealOrder[Double] are brought into scope together in such a situation. Then you'll get a conflict, and people will have to go look at the docs where we can explain carefully how to select one to have priority over the other. They might need to come in at low priority so that it's easier to override them.

@NthPortal
Contributor Author

superseded by #6410


6 participants