- The internet holds huge amounts of image data, which makes image compression techniques important for reducing storage space.
- In this post, we implement and test the simple yet highly effective 2D image compression algorithm developed by Jordi Warmenhoven.

- It is based on K-means clustering, one of the simplest and most popular unsupervised Machine Learning (ML) algorithms, which groups an unlabeled dataset into clusters.
- Here K is the number of clusters, which must be chosen in advance; for example, K=2 produces two clusters.
- In image compression, K represents the number of colors.
- The K-means algorithm lets us cluster the pixels of a 2D image into segments. It discovers these groups in the unlabeled dataset on its own, without the need for any labeled training data.
- It is a centroid-based algorithm: each cluster is associated with a centroid, and the algorithm aims to minimize the sum of squared distances between the data points and the centroids of their clusters.
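
To make the centroid idea concrete, here is a minimal NumPy sketch of the two alternating steps (assign each point to its nearest centroid, then move each centroid to the mean of its points). The function name `kmeans`, its parameters, and the toy data are assumptions made for illustration; this is not the scikit-learn implementation used later in the post.

```python
import numpy as np

def kmeans(points, k, n_iter=20, seed=0):
    """Minimal Lloyd's-algorithm sketch: returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random.
    centroids = points[rng.choice(len(points), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Assignment step: squared distance from every point to every centroid.
        dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each centroid to the mean of its assigned points.
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    # Final assignment so the labels match the returned centroids.
    dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return centroids, dists.argmin(axis=1)

# Two well-separated blobs as toy data (made up for illustration).
rng = np.random.default_rng(1)
toy = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(5, 0.5, (50, 2))])
centroids, labels = kmeans(toy, k=2)
print(centroids.round(1))
```

On this toy data the two centroids should land close to the blob centers at (0, 0) and (5, 5). Production code would normally rely on `sklearn.cluster.KMeans`, which adds better initialization and a convergence check.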

## Performance Test

Let’s set the working directory to YOUR PATH and import the key Python libraries:

import os

os.chdir('YOUR PATH')

os.getcwd()

import pandas as pd

import numpy as np

import matplotlib as mpl

import matplotlib.pyplot as plt

from scipy.io import loadmat

from sklearn.cluster import KMeans

from sklearn.preprocessing import StandardScaler

from scipy import linalg

pd.set_option('display.notebook_repr_html', False)

pd.set_option('display.max_columns', None)

pd.set_option('display.max_rows', 150)

pd.set_option('display.max_seq_items', None)

%matplotlib inline

import seaborn as sns

sns.set_context('notebook')

sns.set_style('white')

Let’s load the synthetic test data

data1 = loadmat('ex7data2.mat')

X1 = data1['X']

print('X1:', X1.shape)

X1: (300, 2)

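Let’s add uncorrelated Gaussian noise to each feature. The noise is drawn with the feature's own mean and standard deviation and then divided by scale=0.1, so its standard deviation is 10 times that of the signal (signal/noise=0.1).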

x0mean=X1[:,0].mean()

x1mean=X1[:,1].mean()

x0std=X1[:,0].std()

x1std=X1[:,1].std()

dim=300

scale=0.1

noise0 = np.random.normal(x0mean,x0std,dim)

noise1 = np.random.normal(x1mean,x1std,dim)

X1[:,0]=X1[:,0]+noise0/scale

X1[:,1]=X1[:,1]+noise1/scale
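
As a quick sanity check of that recipe, the sketch below applies it to a made-up feature column and measures the resulting ratio of standard deviations. The array `signal` and its parameters are hypothetical stand-ins for a column of `X1`; only the noise recipe itself is taken from the code above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for one feature column of X1.
signal = rng.normal(loc=3.0, scale=2.0, size=100_000)

scale = 0.1  # target signal/noise ratio, as in the post
# Same recipe as above: noise with the feature's mean and std, divided by scale.
noise = rng.normal(signal.mean(), signal.std(), signal.size) / scale

ratio = signal.std() / noise.std()
print(f"signal/noise std ratio: {ratio:.3f}")
print(f"mean shift added by the noise: {noise.mean():.1f}")
```

The ratio should come out close to 0.1. Because the noise is not zero-mean, it also shifts every point by roughly mean/scale; that constant offset moves the whole cloud but does not change which points are close to each other.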

Let’s run KMeans with K=3

km1 = KMeans(3)

km1.fit(X1)

KMeans(n_clusters=3)

Let’s plot the output

plt.scatter(X1[:,0], X1[:,1], s=40, c=km1.labels_, cmap=plt.cm.prism)

plt.title('K-Means Clustering Results with K=3')

plt.scatter(km1.cluster_centers_[:,0], km1.cluster_centers_[:,1], marker='+', s=100, c='k', linewidth=2);

## Image Compression

Let’s load the image

img = plt.imread('youtubewatcher.png')

img_shape = img.shape

img_shape

(1440, 2560, 4)

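and flatten it into one row per pixel with one column per channel. Note that plt.imread already returns PNG data as floats in the range 0 to 1, so the division by 255 below only rescales the values; it is undone by the B*255 when plotting.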

A = img/255

AA = A.reshape(img_shape[0]*img_shape[1], img_shape[2])

AA.shape

(3686400, 4)
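
The reshape is easier to follow on a tiny array. The sketch below uses a hypothetical 2 x 3 image with 4 channels (the names `tiny`, `pixels`, and `restored` are made up for illustration) and shows that each row of the flattened array is one pixel and that the operation is reversible.

```python
import numpy as np

# Hypothetical 2 x 3 image with 4 channels (RGBA), filled with distinct values.
height, width, channels = 2, 3, 4
tiny = np.arange(height * width * channels, dtype=float).reshape(height, width, channels)

# One row per pixel, one column per channel: the layout KMeans expects.
pixels = tiny.reshape(height * width, channels)
print(pixels.shape)  # (6, 4)

# Row i of `pixels` is the pixel at (i // width, i % width) in the image.
assert np.array_equal(pixels[4], tiny[1, 1])

# Reshaping back restores the original image exactly.
restored = pixels.reshape(height, width, channels)
assert np.array_equal(restored, tiny)
```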

Let’s apply K-means with K=64

km2 = KMeans(64)

km2.fit(AA)

KMeans(n_clusters=64)

B = km2.cluster_centers_[km2.labels_].reshape(img_shape[0], img_shape[1], img_shape[2])
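
The expression above replaces every pixel by the centroid color of its cluster. A small sketch with a hypothetical 3-color palette and a 2 x 2 image shows the mechanism; `palette` plays the role of `km2.cluster_centers_` and `labels` the role of `km2.labels_`, and both are made up here.

```python
import numpy as np

# Hypothetical palette of 3 colors (what km2.cluster_centers_ would hold for K=3).
palette = np.array([[0.0, 0.0, 0.0],   # black
                    [1.0, 0.0, 0.0],   # red
                    [1.0, 1.0, 1.0]])  # white

# Hypothetical cluster index per pixel for a 2 x 2 image (the role of km2.labels_).
labels = np.array([0, 1, 1, 2])

# Integer-array indexing picks one palette row per label ...
flat = palette[labels]                 # shape (4, 3)
# ... and reshape restores the height x width x channels layout.
quantized = flat.reshape(2, 2, 3)

print(quantized[0, 1])  # [1. 0. 0.]
```

The result has the same shape as the input image but contains at most K distinct colors.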

Let’s plot the outcome

fig, (ax1, ax2) = plt.subplots(1,2, figsize=(13,9))

ax1.imshow(img)

ax1.set_title('Original')

ax2.imshow(B*255)

ax2.set_title('Compressed, with 64 colors')

for ax in fig.axes:

    ax.axis('off')

Let’s load another image

img = plt.imread('bird_small.png')

and repeat the same sequence as above.

The outcome is again a side-by-side plot of the original and the compressed image.

## Summary

- We have looked at image compression using K-means clustering, an unsupervised ML algorithm.
- In this case study, both images were compressed with K=64 colors.
- For the YouTube image, the compression ratio is 5,712/131 = 43.60.
- For the Parrot image, the compression ratio is 33/4 = 8.25.
- Results show that the K-means algorithm works well and can be used to compress 2D images without changing their resolution; the only loss is the reduction of the color palette to K colors.
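
For comparison with the ratios above, one can also estimate a raw, uncompressed ratio: storing K palette colors plus one palette index per pixel instead of the full channel values for every pixel. The sketch below is only such an estimate under stated assumptions (8 bits per channel, no further file compression); the function name `palette_compression_ratio` is hypothetical, and the result is a different measure from the ratios reported above, which it does not reproduce.

```python
import math

def palette_compression_ratio(height, width, channels, k, bits_per_channel=8):
    """Raw-size ratio for storing a K-color palette plus one index per pixel."""
    n_pixels = height * width
    original_bits = n_pixels * channels * bits_per_channel
    index_bits = math.ceil(math.log2(k))            # bits needed to address K colors
    palette_bits = k * channels * bits_per_channel  # the K centroid colors
    compressed_bits = n_pixels * index_bits + palette_bits
    return original_bits / compressed_bits

# Shape of the first image above: 1440 x 2560 pixels, 4 channels, K = 64.
print(round(palette_compression_ratio(1440, 2560, 4, 64), 2))
```

With K=64, each pixel needs a 6-bit index instead of 32 bits, so the raw ratio is a little below 32/6.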

## Explore More

ML/AI Breast Cancer Diagnosis with 98% Confidence

K-means clustering algorithm (unsupervised learning) for image compression

Image Compression with K-means Clustering

## One response to “Effective 2D Image Compression with K-means Clustering”

Love This !! my thoughts on this ….

The internet has a vast amount of image data that needs compression to reduce storage space. To achieve this, image compression techniques are important, and one such effective and simple algorithm is the 2D image compression algorithm developed by Jordi Warmenhoven. The algorithm is based on K-means clustering, which is a popular unsupervised Machine Learning algorithm that groups an unlabeled dataset into clusters. In the case of image compression, K means the number of colors used in the image. The K-means algorithm clusters the 2D image into different segments, reducing the number of colors in each segment, resulting in a compressed image. The algorithm is implemented by first converting the image to a 2D array, and then applying the K-means clustering algorithm to the array. The compressed image is obtained by applying the reduced color palette to the original image. The algorithm is tested on various images, and the results show that it is highly effective and simple.

Thanks – PomKing

http://www.pomeranianpuppies.uk
