Add Quantum k-Means Clustering Implementation #11664

79 changes: 79 additions & 0 deletions quantum/quantum_kmeans_clustering.py
@@ -0,0 +1,79 @@
import cirq
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.preprocessing import MinMaxScaler

def generate_data(n_samples=100, n_features=2, n_clusters=2):

Check failure on line 7 in quantum/quantum_kmeans_clustering.py

GitHub Actions / ruff

Ruff (I001)

quantum/quantum_kmeans_clustering.py:1:1: I001 Import block is un-sorted or un-formatted
    data, labels = make_blobs(n_samples=n_samples, centers=n_clusters, n_features=n_features, random_state=42)

Check failure on line 8 in quantum/quantum_kmeans_clustering.py

GitHub Actions / ruff

Ruff (E501)

quantum/quantum_kmeans_clustering.py:8:89: E501 Line too long (110 > 88)
    return MinMaxScaler().fit_transform(data), labels

def quantum_distance(point1, point2):
    """
    Quantum circuit explanation:
    1. Use a single qubit to encode the distance between two points.
    2. Apply Ry rotation based on the normalized Euclidean distance.
    3. Measure the qubit to get a probabilistic distance metric.
    The probability of measuring |1> correlates with the distance between points.
    """
    qubit = cirq.LineQubit(0)
    diff = np.clip(np.linalg.norm(point1 - point2), 0, 1)
    theta = 2 * np.arcsin(diff)

Check failure on line 22 in quantum/quantum_kmeans_clustering.py

GitHub Actions / ruff

Ruff (W293)

quantum/quantum_kmeans_clustering.py:22:1: W293 Blank line contains whitespace
    circuit = cirq.Circuit(
        cirq.ry(theta)(qubit),
        cirq.measure(qubit, key='result')
    )

Check failure on line 27 in quantum/quantum_kmeans_clustering.py

GitHub Actions / ruff

Ruff (W293)

quantum/quantum_kmeans_clustering.py:27:1: W293 Blank line contains whitespace
    result = cirq.Simulator().run(circuit, repetitions=1000)
    return result.histogram(key='result').get(1, 0) / 1000
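
A note on what this circuit estimates: Ry(theta) applied to |0> puts amplitude sin(theta/2) on |1>, so with theta = 2 * arcsin(d) the probability of measuring 1 is d squared, where d is the clipped Euclidean distance. The sketch below checks that with plain NumPy and no Cirq. The helper names `ry_matrix` and `exact_p1` are hypothetical and are not part of this PR.

```python
import numpy as np


def ry_matrix(theta: float) -> np.ndarray:
    """Standard single-qubit Ry rotation matrix."""
    half = theta / 2
    return np.array(
        [[np.cos(half), -np.sin(half)], [np.sin(half), np.cos(half)]]
    )


def exact_p1(point1: np.ndarray, point2: np.ndarray) -> float:
    """Exact probability of measuring 1 for the circuit in quantum_distance."""
    diff = np.clip(np.linalg.norm(point1 - point2), 0, 1)
    theta = 2 * np.arcsin(diff)
    state = ry_matrix(theta) @ np.array([1.0, 0.0])  # Ry(theta) applied to |0>
    return float(abs(state[1]) ** 2)


a, b = np.array([0.1, 0.2]), np.array([0.4, 0.6])
print(exact_p1(a, b))  # approximately 0.25, i.e. the squared distance 0.5**2
```

Since d squared is monotonic in d on [0, 1], the nearest centroid is unchanged in expectation, but the 1000-shot estimate adds sampling noise (standard error up to about 0.016), which can flip close comparisons. Distances above 1 all map to probability 1, and points in the min-max scaled unit square can be up to sqrt(2) apart.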

def initialize_centroids(data, k):

Please provide return type hint for the function: initialize_centroids. If the function does not return a value, please provide the type hint as: def function() -> None:

As there is no test file in this pull request nor any test function or class in the file quantum/quantum_kmeans_clustering.py, please provide doctest for the function initialize_centroids

Please provide type hint for the parameter: data

Please provide descriptive name for the parameter: k

Please provide type hint for the parameter: k

    return data[np.random.choice(len(data), k, replace=False)]

Check failure on line 32 in quantum/quantum_kmeans_clustering.py

GitHub Actions / ruff

Ruff (NPY002)

quantum/quantum_kmeans_clustering.py:32:17: NPY002 Replace legacy `np.random.choice` call with `np.random.Generator`
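
One possible way to address the type-hint, doctest, parameter-name and NPY002 findings on `initialize_centroids` together is sketched below. The `seed` parameter and the name `n_clusters` are assumptions added for illustration; the PR itself takes only `data` and `k`.

```python
from __future__ import annotations

import numpy as np


def initialize_centroids(
    data: np.ndarray, n_clusters: int, seed: int | None = None
) -> np.ndarray:
    """
    Pick n_clusters distinct rows of data as the starting centroids.

    >>> points = np.arange(10.0).reshape(5, 2)
    >>> centroids = initialize_centroids(points, 2, seed=0)
    >>> centroids.shape
    (2, 2)
    >>> all(any((row == p).all() for p in points) for row in centroids)
    True
    """
    rng = np.random.default_rng(seed)  # Generator API, as NPY002 suggests
    indices = rng.choice(len(data), size=n_clusters, replace=False)
    return data[indices]
```

Passing a fixed `seed` makes the doctest and any later test run reproducible; leaving it as `None` keeps the random initialization of the original.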

def assign_clusters(data, centroids):
    clusters = [[] for _ in range(len(centroids))]
    for point in data:
        closest = min(range(len(centroids)), key=lambda i: quantum_distance(point, centroids[i]))

Check failure on line 37 in quantum/quantum_kmeans_clustering.py

GitHub Actions / ruff

Ruff (E501)

quantum/quantum_kmeans_clustering.py:37:89: E501 Line too long (97 > 88)

Please provide descriptive name for the parameter: i

        clusters[closest].append(point)
    return clusters
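
A sketch of how `assign_clusters` might look with a descriptive lambda parameter and the long line wrapped under 88 characters. The `distance` argument is an assumption made so the example runs without Cirq; the PR calls `quantum_distance` directly, and the `euclidean` stand-in below is hypothetical.

```python
from collections.abc import Callable

import numpy as np


def assign_clusters(
    data: np.ndarray,
    centroids: np.ndarray,
    distance: Callable[[np.ndarray, np.ndarray], float],
) -> list[list[np.ndarray]]:
    """Group each point with the centroid that minimises distance(point, c)."""
    clusters: list[list[np.ndarray]] = [[] for _ in range(len(centroids))]
    for point in data:
        closest = min(
            range(len(centroids)),
            key=lambda centroid_index: distance(point, centroids[centroid_index]),
        )
        clusters[closest].append(point)
    return clusters


def euclidean(point: np.ndarray, centroid: np.ndarray) -> float:
    """Classical stand-in for quantum_distance, used only in this example."""
    return float(np.linalg.norm(point - centroid))


points = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0]])
centers = np.array([[0.0, 0.0], [1.0, 1.0]])
groups = assign_clusters(points, centers, euclidean)
print([len(group) for group in groups])  # [2, 1]
```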

def recompute_centroids(clusters):
    return np.array([np.mean(cluster, axis=0) for cluster in clusters if cluster])
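
One thing to watch here: the `if cluster` filter drops empty clusters, so the returned array can have fewer rows than the number of centroids passed in. The `np.allclose` check in `quantum_kmeans` then compares arrays of different shapes, and later iterations run with fewer centroids. A possible variant is sketched below, assuming it is acceptable for an empty cluster to keep its previous centroid. The extra `previous_centroids` parameter is hypothetical and would need a matching change at the call site.

```python
import numpy as np


def recompute_centroids(
    clusters: list[list[np.ndarray]], previous_centroids: np.ndarray
) -> np.ndarray:
    """Mean of each cluster; an empty cluster keeps its previous centroid."""
    return np.array(
        [
            np.mean(cluster, axis=0) if cluster else previous
            for cluster, previous in zip(clusters, previous_centroids)
        ]
    )
```

With this shape-preserving version the result always has one row per input centroid, so the convergence check compares like with like.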

def quantum_kmeans(data, k, max_iters=10):
    centroids = initialize_centroids(data, k)

Check failure on line 46 in quantum/quantum_kmeans_clustering.py

GitHub Actions / ruff

Ruff (W293)

quantum/quantum_kmeans_clustering.py:46:1: W293 Blank line contains whitespace
    for _ in range(max_iters):
        clusters = assign_clusters(data, centroids)
        new_centroids = recompute_centroids(clusters)
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids

Check failure on line 53 in quantum/quantum_kmeans_clustering.py

GitHub Actions / ruff

Ruff (W293)

quantum/quantum_kmeans_clustering.py:53:1: W293 Blank line contains whitespace
    return centroids, clusters

# Main execution
n_samples, n_clusters = 10, 2
data, labels = generate_data(n_samples, n_clusters=n_clusters)

plt.figure(figsize=(12, 5))

plt.subplot(121)
plt.scatter(data[:, 0], data[:, 1], c=labels)
plt.title("Generated Data")

final_centroids, final_clusters = quantum_kmeans(data, n_clusters)

plt.subplot(122)
for i, cluster in enumerate(final_clusters):
    cluster = np.array(cluster)
    plt.scatter(cluster[:, 0], cluster[:, 1], label=f'Cluster {i+1}')
plt.scatter(final_centroids[:, 0], final_centroids[:, 1], color='red', marker='x', s=200, linewidths=3, label='Centroids')

Check failure on line 72 in quantum/quantum_kmeans_clustering.py

GitHub Actions / ruff

Ruff (E501)

quantum/quantum_kmeans_clustering.py:72:89: E501 Line too long (122 > 88)
plt.title("Quantum k-Means Clustering with Cirq")
plt.legend()

plt.tight_layout()
plt.show()

print(f"Final Centroids:\n{final_centroids}")

Check failure on line 79 in quantum/quantum_kmeans_clustering.py

GitHub Actions / ruff

Ruff (W292)

quantum/quantum_kmeans_clustering.py:79:46: W292 No newline at end of file