Implemented KD Tree Data Structure #11532

Merged
0d6985c
Implemented KD-Tree Data Structure
Ramy-Badr-Ahmed Aug 28, 2024
6665d23
Implemented KD-Tree Data Structure. updated DIRECTORY.md.
Ramy-Badr-Ahmed Aug 28, 2024
6b3d47e
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 28, 2024
4203cda
Create __init__.py
Ramy-Badr-Ahmed Aug 28, 2024
3222bd3
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 28, 2024
a41ae5b
Replaced legacy `np.random.rand` call with `np.random.Generator` in k…
Ramy-Badr-Ahmed Aug 28, 2024
1668d73
Replaced legacy `np.random.rand` call with `np.random.Generator` in k…
Ramy-Badr-Ahmed Aug 28, 2024
81d6917
added typehints and docstrings
Ramy-Badr-Ahmed Aug 28, 2024
6cddcbd
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 28, 2024
8b238d1
docstring for search()
Ramy-Badr-Ahmed Aug 28, 2024
cd1dd9f
Merge remote-tracking branch 'origin/feature/kd-tree-implementation' …
Ramy-Badr-Ahmed Aug 28, 2024
ead2838
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 28, 2024
543584c
Added tests. Updated docstrings/typehints
Ramy-Badr-Ahmed Aug 28, 2024
ad31f83
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 28, 2024
1322921
updated tests and used | for type annotations
Ramy-Badr-Ahmed Aug 28, 2024
4608a9f
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 28, 2024
7c1aa7e
E501 for build_kdtree.py, hypercube_points.py, nearest_neighbour_sear…
Ramy-Badr-Ahmed Aug 29, 2024
ba24e75
I001 for example_usage.py and test_kdtree.py
Ramy-Badr-Ahmed Aug 29, 2024
05975a3
I001 for example_usage.py and test_kdtree.py
Ramy-Badr-Ahmed Aug 29, 2024
31782d1
Update data_structures/kd_tree/build_kdtree.py
Ramy-Badr-Ahmed Sep 3, 2024
6a9b3e1
Update data_structures/kd_tree/example/hypercube_points.py
Ramy-Badr-Ahmed Sep 3, 2024
2fd24d4
Update data_structures/kd_tree/example/hypercube_points.py
Ramy-Badr-Ahmed Sep 3, 2024
2cf9d92
Added new test cases requested in Review. Refactored the test_build_k…
Ramy-Badr-Ahmed Sep 3, 2024
a3803ee
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 3, 2024
f1f5862
Considered ruff errors
Ramy-Badr-Ahmed Sep 3, 2024
ec6559d
Merge remote-tracking branch 'origin/feature/kd-tree-implementation' …
Ramy-Badr-Ahmed Sep 3, 2024
5c07a1a
Considered ruff errors
Ramy-Badr-Ahmed Sep 3, 2024
3c09ac1
Apply suggestions from code review
cclauss Sep 3, 2024
bab43e7
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 3, 2024
a10ff15
Update kd_node.py
cclauss Sep 3, 2024
d77a285
imported annotations from __future__
Ramy-Badr-Ahmed Sep 3, 2024
0426806
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 3, 2024
6 changes: 6 additions & 0 deletions DIRECTORY.md
@@ -285,6 +285,12 @@
  * Trie
    * [Radix Tree](data_structures/trie/radix_tree.py)
    * [Trie](data_structures/trie/trie.py)
  * KD Tree
    * [KD Tree Node](data_structures/kd_tree/kd_node.py)
    * [Build KD Tree](data_structures/kd_tree/build_kdtree.py)
    * [Nearest Neighbour Search](data_structures/kd_tree/nearest_neighbour_search.py)
    * [Hypercube Points](data_structures/kd_tree/example/hypercube_points.py)
    * [Example Usage](data_structures/kd_tree/example/example_usage.py)

## Digital Image Processing
  * [Change Brightness](digital_image_processing/change_brightness.py)
Empty file.
31 changes: 31 additions & 0 deletions data_structures/kd_tree/build_kdtree.py
@@ -0,0 +1,31 @@
from typing import Optional
from .kd_node import KDNode


def build_kdtree(points: list[list[float]], depth: int = 0) -> Optional[KDNode]:
    """
    Builds a KD-Tree from a list of points.

    Args:
        points (list[list[float]]): The list of points to build the KD-Tree from.
        depth (int): The current depth in the tree (used to determine axis for splitting).

    Returns:
        Optional[KDNode]: The root node of the KD-Tree.
    """
    if not points:
        return None

    k = len(points[0])  # Dimensionality of the points
    axis = depth % k

    # Sort point list and choose median as pivot element
    points.sort(key=lambda point: point[axis])
    median_idx = len(points) // 2

    # Create node and construct subtrees
    return KDNode(
        point=points[median_idx],
        left=build_kdtree(points[:median_idx], depth + 1),
        right=build_kdtree(points[median_idx + 1 :], depth + 1),
    )

Check failure on line 5 (GitHub Actions / ruff, I001): data_structures/kd_tree/build_kdtree.py:1:1: I001 Import block is un-sorted or un-formatted
Check failure on line 5 (GitHub Actions / ruff, UP007): data_structures/kd_tree/build_kdtree.py:5:64: UP007 Use `X | Y` for type annotations
Check failure on line 11 (GitHub Actions / ruff, E501): data_structures/kd_tree/build_kdtree.py:11:89: E501 Line too long (90 > 88)
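One detail of the function above is that `points.sort(...)` reorders the caller's list in place. The following is a minimal sketch, not the PR's code, of a non-mutating variant built on `sorted`. The names `Node` and `build_kdtree_copy` are hypothetical and exist only for this illustration.

```python
class Node:
    """Hypothetical stand-in for KDNode, used only in this sketch."""

    def __init__(
        self,
        point: list[float],
        left: "Node | None" = None,
        right: "Node | None" = None,
    ) -> None:
        self.point = point
        self.left = left
        self.right = right


def build_kdtree_copy(points: list[list[float]], depth: int = 0) -> "Node | None":
    """Build a KD-tree without reordering the caller's list."""
    if not points:
        return None

    axis = depth % len(points[0])  # cycle through the axes by depth
    ordered = sorted(points, key=lambda point: point[axis])  # returns a new list
    median_idx = len(ordered) // 2

    return Node(
        point=ordered[median_idx],
        left=build_kdtree_copy(ordered[:median_idx], depth + 1),
        right=build_kdtree_copy(ordered[median_idx + 1 :], depth + 1),
    )


original = [[7.0, 2.0], [5.0, 4.0], [9.0, 6.0], [2.0, 3.0], [4.0, 7.0], [8.0, 1.0]]
snapshot = [point[:] for point in original]  # copy kept for comparison
root = build_kdtree_copy(original)
```

The trade-off is one extra list allocation per recursion level in exchange for leaving the input untouched.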
Empty file.
37 changes: 37 additions & 0 deletions data_structures/kd_tree/example/example_usage.py
@@ -0,0 +1,37 @@
import numpy as np
from hypercube_points import hypercube_points
from data_structures.kd_tree.build_kdtree import build_kdtree
from data_structures.kd_tree.nearest_neighbour_search import nearest_neighbour_search


def main() -> None:
    """
    Demonstrates the use of KD-Tree by building it from random points
    in a 10-dimensional hypercube and performing a nearest neighbor search.
    """
    num_points: int = 5000
    cube_size: float = 10.0  # Size of the hypercube (edge length)
    num_dimensions: int = 10

    # Generate random points within the hypercube
    points: np.ndarray = hypercube_points(num_points, cube_size, num_dimensions)
    hypercube_kdtree = build_kdtree(points.tolist())

    # Generate a random query point within the same space
    rng = np.random.default_rng()
    query_point: list[float] = rng.random(num_dimensions).tolist()

    # Perform nearest neighbor search
    nearest_point, nearest_dist, nodes_visited = nearest_neighbour_search(
        hypercube_kdtree, query_point
    )

    # Print the results
    print(f"Query point: {query_point}")
    print(f"Nearest point: {nearest_point}")
    print(f"Distance: {nearest_dist:.4f}")
    print(f"Nodes visited: {nodes_visited}")


if __name__ == "__main__":
    main()

Check failure on line 7 (GitHub Actions / ruff, I001): data_structures/kd_tree/example/example_usage.py:1:1: I001 Import block is un-sorted or un-formatted
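Note that the search function's docstring describes its second return value as the squared distance, while the example prints it under the label "Distance". A minimal sketch of reporting both follows; the value `25.0` is a made-up stand-in for a `nearest_dist` result, not output from the PR's code.

```python
import math

squared = 25.0  # hypothetical stand-in for the nearest_dist value
euclidean = math.sqrt(squared)  # convert squared distance to Euclidean distance

print(f"Squared distance: {squared:.4f}")
print(f"Euclidean distance: {euclidean:.4f}")
```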
19 changes: 19 additions & 0 deletions data_structures/kd_tree/example/hypercube_points.py
@@ -0,0 +1,19 @@
import numpy as np


def hypercube_points(
    num_points: int, hypercube_size: float, num_dimensions: int
) -> np.ndarray:
    """
    Generates random points uniformly distributed within an n-dimensional hypercube.

    Args:
        num_points (int): Number of points to generate.
        hypercube_size (float): Size of the hypercube.
        num_dimensions (int): Number of dimensions of the hypercube.

    Returns:
        np.ndarray: An array of shape (num_points, num_dimensions) with generated points.
    """
    rng = np.random.default_rng()
    return hypercube_size * rng.random((num_points, num_dimensions))

Check failure on line 16 (GitHub Actions / ruff, E501): data_structures/kd_tree/example/hypercube_points.py:16:89: E501 Line too long (89 > 88)
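Because `np.random.default_rng()` is called without a seed, each run produces different points. For reproducible experiments or tests, a seed could be passed through. The sketch below assumes a hypothetical `seed` parameter and function name `seeded_hypercube_points`; neither is part of the PR.

```python
import numpy as np


def seeded_hypercube_points(
    num_points: int, hypercube_size: float, num_dimensions: int, seed: int = 0
) -> np.ndarray:
    """Like hypercube_points, but with a fixed seed (hypothetical parameter)."""
    rng = np.random.default_rng(seed)
    # rng.random samples uniformly from [0, 1); scaling maps it to [0, size)
    return hypercube_size * rng.random((num_points, num_dimensions))


first = seeded_hypercube_points(5, 10.0, 3, seed=42)
second = seeded_hypercube_points(5, 10.0, 3, seed=42)
```

Two calls with the same seed return identical arrays, which makes a failing test repeatable.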
30 changes: 30 additions & 0 deletions data_structures/kd_tree/kd_node.py
@@ -0,0 +1,30 @@
from typing import Optional


class KDNode:
    """
    Represents a node in a KD-Tree.

    Attributes:
        point (list[float]): The point stored in this node.
        left (Optional[KDNode]): The left child node.
        right (Optional[KDNode]): The right child node.
    """

    def __init__(
        self,
        point: list[float],
        left: Optional["KDNode"] = None,
        right: Optional["KDNode"] = None,
    ) -> None:
        """
        Initializes a KDNode with the given point and child nodes.

        Args:
            point (list[float]): The point stored in this node.
            left (Optional[KDNode]): The left child node.
            right (Optional[KDNode]): The right child node.
        """
        self.point = point
        self.left = left
        self.right = right
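Ruff's UP007 rule, flagged on other files in this diff, asks for `X | Y` instead of `Optional[X]`. As a rough illustration only, a node class written in that style might look like the sketch below. `KDNodeSketch` is a hypothetical name, and the quoted annotations are one way to handle the self-reference; with `from __future__ import annotations` at the top of the module the quotes could be dropped.

```python
class KDNodeSketch:
    """Hypothetical rewrite of KDNode using `X | Y` style annotations."""

    def __init__(
        self,
        point: list[float],
        left: "KDNodeSketch | None" = None,  # quoted: class is not yet defined
        right: "KDNodeSketch | None" = None,
    ) -> None:
        self.point = point
        self.left = left
        self.right = right


leaf = KDNodeSketch([1.0, 2.0])
parent = KDNodeSketch([3.0, 4.0], left=leaf)
```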
70 changes: 70 additions & 0 deletions data_structures/kd_tree/nearest_neighbour_search.py
@@ -0,0 +1,70 @@
from typing import Optional
from data_structures.kd_tree.kd_node import KDNode


def nearest_neighbour_search(
    root: Optional[KDNode], query_point: list[float]
) -> tuple[Optional[list[float]], float, int]:
    """
    Performs a nearest neighbor search in a KD-Tree for a given query point.

    Args:
        root (Optional[KDNode]): The root node of the KD-Tree.
        query_point (list[float]): The point for which the nearest neighbor is being searched.

    Returns:
        tuple[Optional[list[float]], float, int]:
            - The nearest point found in the KD-Tree to the query point.
            - The squared distance to the nearest point.
            - The number of nodes visited during the search.
    """
    nearest_point: Optional[list[float]] = None
    nearest_dist: float = float("inf")
    nodes_visited: int = 0

    def search(node: Optional[KDNode], depth: int = 0) -> None:
        """
        Recursively searches the KD-Tree for the nearest neighbor.

        Args:
            node (Optional[KDNode]): The current node in the KD-Tree.
            depth (int): The current depth in the tree.
        """
        nonlocal nearest_point, nearest_dist, nodes_visited
        if node is None:
            return

        nodes_visited += 1

        # Calculate the current distance (squared distance)
        current_point = node.point
        current_dist = sum(
            (query_coord - point_coord) ** 2
            for query_coord, point_coord in zip(query_point, current_point)
        )

        # Update nearest point if the current node is closer
        if nearest_point is None or current_dist < nearest_dist:
            nearest_point = current_point
            nearest_dist = current_dist

        # Determine which subtree to search first (based on axis and query point)
        k = len(query_point)  # Dimensionality of points
        axis = depth % k

        if query_point[axis] <= current_point[axis]:
            nearer_subtree = node.left
            further_subtree = node.right
        else:
            nearer_subtree = node.right
            further_subtree = node.left

        # Search the nearer subtree first
        search(nearer_subtree, depth + 1)

        # If the further subtree has a closer point
        if (query_point[axis] - current_point[axis]) ** 2 < nearest_dist:
            search(further_subtree, depth + 1)

    search(root, 0)
    return nearest_point, nearest_dist, nodes_visited

Check failure on line 5 (GitHub Actions / ruff, I001): data_structures/kd_tree/nearest_neighbour_search.py:1:1: I001 Import block is un-sorted or un-formatted
Check failure on line 6 (GitHub Actions / ruff, UP007): data_structures/kd_tree/nearest_neighbour_search.py:6:11: UP007 Use `X | Y` for type annotations
Check failure on line 7 (GitHub Actions / ruff, UP007): data_structures/kd_tree/nearest_neighbour_search.py:7:12: UP007 Use `X | Y` for type annotations
Check failure on line 13 (GitHub Actions / ruff, E501): data_structures/kd_tree/nearest_neighbour_search.py:13:89: E501 Line too long (94 > 88)
Check failure on line 21 (GitHub Actions / ruff, UP007): data_structures/kd_tree/nearest_neighbour_search.py:21:20: UP007 Use `X | Y` for type annotations
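To sanity-check the pruning rule in `search` (descend into the further subtree only when the squared gap to the splitting plane is below the best squared distance so far), the result can be compared with a linear scan. The sketch below is self-contained and is not the PR's code: `build`, `kd_nearest`, `brute_force_nearest`, and the tuple-based node layout are assumptions made for brevity.

```python
import math
import random


def build(points, depth=0):
    """Hypothetical compact builder: nodes are (point, left, right) tuples."""
    if not points:
        return None
    axis = depth % len(points[0])
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return (
        ordered[mid],
        build(ordered[:mid], depth + 1),
        build(ordered[mid + 1 :], depth + 1),
    )


def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kd_nearest(root, query):
    """Return (nearest point, squared distance, nodes visited)."""
    best_point, best_dist, visited = None, float("inf"), 0

    def search(node, depth):
        nonlocal best_point, best_dist, visited
        if node is None:
            return
        visited += 1
        point, left, right = node
        dist = squared_distance(query, point)
        if dist < best_dist:
            best_point, best_dist = point, dist
        axis = depth % len(query)
        diff = query[axis] - point[axis]
        nearer, further = (left, right) if diff <= 0 else (right, left)
        search(nearer, depth + 1)
        # The far side can only hold a closer point if the squared gap to
        # the splitting plane is smaller than the best squared distance.
        if diff**2 < best_dist:
            search(further, depth + 1)

    search(root, 0)
    return best_point, best_dist, visited


def brute_force_nearest(points, query):
    """Reference answer: scan every point."""
    best = min(points, key=lambda p: squared_distance(query, p))
    return best, squared_distance(query, best)


rng = random.Random(0)  # fixed seed so the comparison is repeatable
points = [[rng.uniform(0, 10) for _ in range(3)] for _ in range(200)]
query = [rng.uniform(0, 10) for _ in range(3)]

kd_point, kd_dist, visited = kd_nearest(build(points), query)
bf_point, bf_dist = brute_force_nearest(points, query)
```

If the pruning is correct, `kd_point` and `bf_point` agree, and `visited` is at most the total number of points.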
Empty file.
73 changes: 73 additions & 0 deletions data_structures/kd_tree/tests/test_kdtree.py
@@ -0,0 +1,73 @@
import unittest
import numpy as np
from data_structures.kd_tree.build_kdtree import build_kdtree
from data_structures.kd_tree.nearest_neighbour_search import nearest_neighbour_search
from data_structures.kd_tree.kd_node import KDNode
from data_structures.kd_tree.example.hypercube_points import hypercube_points


class TestKDTree(unittest.TestCase):
    def setUp(self):
        """
        Set up test data.
        """
        self.num_points = 10
        self.cube_size = 10.0
        self.num_dimensions = 2
        self.points = hypercube_points(
            self.num_points, self.cube_size, self.num_dimensions
        )
        self.kdtree = build_kdtree(self.points.tolist())

    def test_build_kdtree(self):
        """
        Test that KD-Tree is built correctly.
        """
        # Check if root is not None
        self.assertIsNotNone(self.kdtree)

        # Check if root has correct dimensions
        self.assertEqual(len(self.kdtree.point), self.num_dimensions)

        # Check that the root is a KDNode instance (balance is not verified here)
        self.assertIsInstance(self.kdtree, KDNode)

    def test_nearest_neighbour_search(self):
        """
        Test the nearest neighbor search function.
        """
        rng = np.random.default_rng()
        query_point = rng.random(self.num_dimensions).tolist()

        nearest_point, nearest_dist, nodes_visited = nearest_neighbour_search(
            self.kdtree, query_point
        )

        # Check that nearest point is not None
        self.assertIsNotNone(nearest_point)

        # Check that distance is a non-negative number
        self.assertGreaterEqual(nearest_dist, 0)

        # Check that nodes visited is a non-negative integer
        self.assertGreaterEqual(nodes_visited, 0)

    def test_edge_cases(self):
        """
        Test edge cases such as an empty KD-Tree.
        """
        empty_kdtree = build_kdtree([])
        query_point = [0.0] * self.num_dimensions

        nearest_point, nearest_dist, nodes_visited = nearest_neighbour_search(
            empty_kdtree, query_point
        )

        # With an empty KD-Tree, nearest_point should be None
        self.assertIsNone(nearest_point)
        self.assertEqual(nearest_dist, float("inf"))
        self.assertEqual(nodes_visited, 0)


if __name__ == "__main__":
    unittest.main()