Image Segmentation with K-Means Clustering

Oct 27, 2020

Outline:

Definition of Image Segmentation
What is K-Means?
How to perform the image segmentation with the K-Means clustering algorithm?

What is Image Segmentation?

What is an image?

An image expresses visual information. Images are stored on our computers as tensors, which can be 2D (a grayscale image), 3D (a color image with height, width, and channels), and so on. For instance, let’s take the example of the Gandalf picture.

We see the picture exactly as it is, but how do computers see it? As tensors.
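To make the tensor idea concrete, here is a minimal sketch with NumPy. The 2x2 image below is a made-up example (it is not the Gandalf picture), and the name `tiny_image` is only for illustration.

```python
import numpy as np

# A hypothetical 2x2 RGB image: height x width x channels, 8-bit values.
tiny_image = np.array(
    [
        [[255, 0, 0], [0, 255, 0]],      # red, green
        [[0, 0, 255], [255, 255, 255]],  # blue, white
    ],
    dtype=np.uint8,
)

print(tiny_image.shape)  # (2, 2, 3): a 3D tensor
print(tiny_image[0, 1])  # the RGB values of the pixel in row 0, column 1

# Averaging over the channel axis gives a 2D tensor, one value per pixel.
gray = tiny_image.mean(axis=2)
print(gray.shape)        # (2, 2)
```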

Image Segmentation…

Image segmentation is simply the partitioning of an image into different regions based on the characteristics of its pixels. So, if we apply image segmentation to the previous image, we get:

Image segmentation is a very useful technique. It is widely used in medical imaging (for example, cancer detection), autonomous driving, and other fields.

We will try to use the K-Means algorithm to perform this task.

K-Means algorithm

K-Means is an unsupervised machine learning algorithm. It is a clustering algorithm, which means it divides unlabeled data into clusters according to the data’s characteristics. The steps of the K-Means algorithm are:

Randomly choose K of the data points. K is the number of clusters, and we call the chosen points centroids.
For each example in the data set, calculate the distance to every centroid and assign the example to the nearest one (we can mark the assignments with colors).
This gives the first partition into clusters. For each cluster, calculate the mean of the coordinates of its points; these mean points become the new centroids.
Repeat the previous two steps until convergence, that is, until the centroids stop changing.
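The steps above can be sketched in plain NumPy. This is only an illustration under a few assumptions: the function name `kmeans`, the `max_iters` and `seed` parameters, and the rule of keeping an old centroid when a cluster ends up empty are my own choices, not part of any library.

```python
import numpy as np

def kmeans(points, k, max_iters=100, seed=0):
    """Plain K-Means on an array of shape (n_samples, n_features)."""
    rng = np.random.default_rng(seed)
    # Step 1: pick K distinct data points as the initial centroids.
    start = rng.choice(len(points), size=k, replace=False)
    centroids = points[start].astype(float)
    for _ in range(max_iters):
        # Step 2: distance from every point to every centroid,
        # then assign each point to its nearest centroid.
        diffs = points[:, None, :] - centroids[None, :, :]
        distances = np.linalg.norm(diffs, axis=2)
        labels = distances.argmin(axis=1)
        # Step 3: move each centroid to the mean of its assigned points.
        new_centroids = centroids.copy()
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # assumption: keep the old centroid if empty
                new_centroids[j] = members.mean(axis=0)
        # Step 4: stop once the centroids no longer move.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels

# Usage on two well-separated synthetic blobs (made-up data).
rng = np.random.default_rng(1)
blob_a = rng.normal(loc=0.0, scale=0.1, size=(50, 2))
blob_b = rng.normal(loc=5.0, scale=0.1, size=(50, 2))
data = np.vstack([blob_a, blob_b])
centroids, labels = kmeans(data, k=2)
print(centroids)
```

With blobs this far apart, each blob should end up in its own cluster, whichever two points are picked at the start.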

Image Segmentation with K-Means algorithm

Now that we know the K-Means clustering algorithm groups data according to their characteristics, we can apply the same technique to images and see what happens…

1. Let’s have a look at the color distribution of the Gandalf image.

As we can see, there are plenty of similar color tones. We can group them and reduce the image to 2 colors, or to any other number of clusters.
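One simple way to inspect such a distribution is a per-channel histogram. The sketch below uses a random array as a stand-in for the real photo (an assumption, since the image file is not available here), so its histograms are roughly flat; a real picture would show peaks around its dominant tones.

```python
import numpy as np

# Stand-in for the photo: random 8-bit RGB values (assumption, not real data).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(375, 500, 3), dtype=np.uint8)

# Count how many pixels fall on each intensity 0..255, per channel.
histograms = {}
for index, name in enumerate("RGB"):
    counts, _ = np.histogram(image[..., index], bins=256, range=(0, 256))
    histograms[name] = counts

# Every pixel is counted exactly once in each channel's histogram.
print({name: int(counts.sum()) for name, counts in histograms.items()})
```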

First, we need to read the image as a tensor.

The result is the tensor that we showed before. To help the optimization, we can scale (normalize) the values to the range [0, 1] by dividing them by 255.0.
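A minimal sketch of this scaling step follows. Since the image file itself is not available here, a random 8-bit array of the same shape stands in for the loaded picture; in practice the array would come from an image-loading library.

```python
import numpy as np

# Stand-in for the loaded image: 8-bit RGB values in 0..255 (assumption).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(375, 500, 3), dtype=np.uint8)

# Convert to float first, then divide so every value lands in [0, 1].
scaled = image.astype(np.float64) / 255.0

print(scaled.dtype)
print(scaled.min() >= 0.0, scaled.max() <= 1.0)
```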

The shape of our tensor is (375, 500, 3). The 3 is the number of channels (RGB), while (375, 500) are the spatial dimensions. We must flatten the spatial dimensions before sending the data to the K-Means model, so that each pixel becomes one row of 3 values.
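The flattening can be sketched with NumPy’s `reshape`. The arrays below are placeholders filled with zeros (an assumption, just to show the shapes); clustering routines that expect one row per sample, such as scikit-learn’s `KMeans`, take input in this (n_samples, n_features) layout.

```python
import numpy as np

# Placeholder for the scaled image tensor (zeros, only the shape matters here).
scaled = np.zeros((375, 500, 3), dtype=np.float64)
height, width, channels = scaled.shape

# Flatten the spatial dimensions: one row per pixel, one column per channel.
pixels = scaled.reshape(-1, channels)
print(pixels.shape)  # (187500, 3)

# After clustering we would have one label per pixel; placeholder labels here.
labels = np.zeros(len(pixels), dtype=int)

# Reshaping the labels back restores the spatial layout of the image.
label_map = labels.reshape(height, width)
print(label_map.shape)  # (375, 500)
```

Reshaping the labels back to (375, 500) is what lets us display the clusters as a segmented image later.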