MMKK++ algorithm for clustering heterogeneous images into an unknown number of clusters

Dávid Papp, Gábor Szűcs


In this paper we present an automatic clustering procedure whose main aim is to predict the number of clusters in a set of unknown, heterogeneous images. We used the state-of-the-art Fisher vector as the mathematical representation of the images, and these vectors served as the input data points for the clustering algorithm. As the clustering method we implemented a novel variant of K-means, kernel K-means++, and in addition min-max kernel K-means++ (MMKK++). The proposed approach examines several candidate cluster numbers and uses the law of large numbers to choose the optimal number of clusters. We conducted experiments on four image sets to demonstrate the efficiency of our solution. The first two image sets are subsets of different popular collections; the third is their union; the fourth is the complete Caltech101 image set.
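
To make the idea of kernel K-means with K-means++ seeding concrete, the following is a minimal sketch in Python. It is not the authors' MMKK++ implementation: the min-max weighting, the Fisher-vector encoding, and the cluster-number selection are all omitted. The RBF kernel, the parameter `gamma`, and the function names `rbf_kernel` and `kernel_kmeans_pp` are assumptions chosen for illustration only.

```python
import numpy as np


def rbf_kernel(X, gamma=1.0):
    """Gram matrix of an RBF kernel (an assumed kernel choice)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))


def kernel_kmeans_pp(K, k, n_iter=50, seed=0):
    """Kernel K-means on Gram matrix K with K-means++ style seeding.

    All distances are feature-space distances computed from K only.
    Assumes the data contain at least k distinct points.
    """
    rng = np.random.default_rng(seed)
    n = K.shape[0]
    diag = np.diag(K)

    # Seeding: the first center is uniform; each further center is drawn
    # with probability proportional to the squared feature-space distance
    # to the nearest center chosen so far.
    centers = [int(rng.integers(n))]
    d2 = diag + K[centers[0], centers[0]] - 2.0 * K[:, centers[0]]
    for _ in range(1, k):
        d2 = np.maximum(d2, 0.0)
        c = int(rng.choice(n, p=d2 / d2.sum()))
        centers.append(c)
        d2 = np.minimum(d2, diag + K[c, c] - 2.0 * K[:, c])

    # Initial assignment: nearest seed point in feature space.
    C = np.array(centers)
    dist = diag[:, None] + diag[C][None, :] - 2.0 * K[:, C]
    labels = np.argmin(dist, axis=1)

    # Lloyd-style iterations with implicit cluster means:
    # ||phi(x) - mu_j||^2 = K(x,x) - (2/m) sum_i K(x,x_i) + (1/m^2) sum_{i,l} K(x_i,x_l)
    for _ in range(n_iter):
        dist = np.full((n, k), np.inf)
        for j in range(k):
            mask = labels == j
            m = int(mask.sum())
            if m == 0:
                continue  # an empty cluster stays at infinite distance
            dist[:, j] = (diag
                          - 2.0 * K[:, mask].sum(axis=1) / m
                          + K[np.ix_(mask, mask)].sum() / m ** 2)
        new_labels = np.argmin(dist, axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels
```

Calling `kernel_kmeans_pp(rbf_kernel(X, gamma=0.1), k)` on an array `X` of feature vectors returns one integer label per row. Running it for several candidate values of `k` would be the starting point for a cluster-number search, although the selection criterion used in the paper is not reproduced here.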


image clustering; kernel K-means; cluster number; Fisher-vector

Copyright (c) 2018 Dávid Papp, Gábor Szűcs