My goal with this mini-project was, to get a very basic idea about how K-Means Clustering works.
Here are the different steps that I performed:
Step 1 Importing relevant Libraries and defining a Cluster of Size 1.000
import numpy as np import pandas as pd import matplotlib.pyplot as plt from sklearn.cluster import KMeans SIZE = 1000 DIVISOR = 10 NUMBER_OF_CLUSTERS = 10 Cluster = {"X": np.random.randint(0, 50, size=SIZE)/DIVISOR, "Y": np.random.randint(100, 150, size=SIZE)/DIVISOR}
Step 2 Transform Cluster to Dataframe
df = pd.DataFrame(Cluster, columns=["X", "Y"]) df = df.sample(frac=1)
Step 3 Apply K-Means Algorithm and plot the Results using Matplotlib
kmeans = KMeans(n_clusters=NUMBER_OF_CLUSTERS).fit(df) centroids = kmeans.cluster_centers_ plt.scatter(df["X"], df["Y"], c= kmeans.labels_.astype(float),s=50, alpha=0.5) ### Plotting the different Clusters with different colours plt.scatter(centroids[:, 0], centroids[:, 1], c="red", s=50) ### Plotting the centroids plt.show()
There you have it, a very simple implementation of K-Means Clustering.