Added the Minkowski distance function #10143

Merged Oct 9, 2023 (3 commits)

41 changes: 41 additions & 0 deletions maths/minkowski_distance.py
@@ -0,0 +1,41 @@
def minkowski_distance(
    point_a: list[float],
    point_b: list[float],
    order: int,
) -> float:
    """
    This function calculates the Minkowski distance for a given order between
    two n-dimensional points represented as lists. For the case of order = 1,
    the Minkowski distance degenerates to the Manhattan distance. For
    order = 2, the usual Euclidean distance is obtained.

    https://en.wikipedia.org/wiki/Minkowski_distance

    >>> minkowski_distance([1.0, 1.0], [2.0, 2.0], 1)
    2.0
    >>> minkowski_distance([1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0], 2)
    8.0
    >>> minkowski_distance([1.0], [2.0], -1)
    Traceback (most recent call last):
        ...
    Exception: The order must be greater than or equal to 1.
    >>> minkowski_distance([1.0], [1.0, 2.0], 1)
    Traceback (most recent call last):
        ...
    Exception: Both points must have the same dimension.
    """
    if order < 1:
        raise Exception("The order must be greater than or equal to 1.")

    if len(point_a) != len(point_b):
        raise Exception("Both points must have the same dimension.")

    return float(
        sum(abs(a - b) ** order for a, b in zip(point_a, point_b)) ** (1 / order)
Contributor

I'm concerned that the ** (1 / order) could produce inaccuracies in the output. For example, 125 ** (1/3) returns 4.999... instead of 5 because of floating-point errors. I don't know of any better alternative, but I think it's worth looking into.
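The concern above can be reproduced in a few lines. This is only an illustrative sketch; the variable name `cube_root` is not part of the PR, and the exact digits printed may vary by platform.

```python
import math

# 1 / 3 has no exact binary representation, so the exponent is already
# rounded before the power is computed.
cube_root = 125 ** (1 / 3)

print(cube_root)                     # typically a value just below 5
print(cube_root == 5)                # exact comparison is unreliable here
print(math.isclose(cube_root, 5.0))  # tolerance-based comparison
```

The tolerance-based comparison is the stable way to check such a result, since exact equality depends on how the rounding happens to fall.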

Contributor Author

Yeah, I can't find a good solution for this either. I've tried Decimal, pow, and numpy, but I get the same result as you. I think adding some sort of approximation scheme like Newton's method would unnecessarily complicate the code and detract from the main thrust of this algorithm. What do you think?

Contributor

In order to not overcomplicate the code, I think it should be fine to keep it as is. We should leave a note in the comments acknowledging this issue so that future contributors know about it, however. For doctests, we could use np.isclose() to compare doctest outputs to the true, mathematical values.
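A minimal sketch of what that suggestion might look like follows. It swaps in `math.isclose` from the standard library in place of `np.isclose` so the example has no third-party dependency; the wording of the note and the extra doctest are assumptions, not the text that was actually merged.

```python
import math


def minkowski_distance(
    point_a: list[float], point_b: list[float], order: int
) -> float:
    """
    Note: ``** (1 / order)`` is evaluated in floating point, so a result that
    is mathematically an integer may be off by a small rounding error.

    >>> math.isclose(minkowski_distance([0.0], [5.0], 3), 5.0)
    True
    """
    if order < 1:
        raise Exception("The order must be greater than or equal to 1.")

    if len(point_a) != len(point_b):
        raise Exception("Both points must have the same dimension.")

    return float(
        sum(abs(a - b) ** order for a, b in zip(point_a, point_b)) ** (1 / order)
    )
```

Comparing through a tolerance keeps the doctest stable across platforms, at the cost of showing `True` rather than the distance itself.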

    )


if __name__ == "__main__":
    import doctest

    doctest.testmod()