Spherical k-Means Clustering

Buchta, Christian and Kober, Martin and Feinerer, Ingo and Hornik, Kurt ORCID: https://orcid.org/0000-0003-4198-9911 (2012) Spherical k-Means Clustering. Journal of Statistical Software, 50 (10). pp. 1-22. ISSN 1548-7660




Clustering text documents is a fundamental task in modern data analysis, requiring approaches which perform well both in terms of solution quality and computational efficiency. Spherical k-means clustering is one approach to address both issues, employing cosine dissimilarities to perform prototype-based partitioning of term weight representations of the documents. This paper presents the theory underlying the standard spherical k-means problem and suitable extensions, and introduces the R extension package skmeans which provides a computational environment for spherical k-means clustering featuring several solvers: a fixed-point and genetic algorithm, and interfaces to two external solvers (CLUTO and Gmeans). Performance of these solvers is investigated by means of a large scale benchmark experiment. (authors' abstract)
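The fixed-point idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard formulation in which documents and prototypes are scaled to unit length, each document is assigned to the prototype with the smallest cosine dissimilarity, and each prototype is then replaced by the normalized sum of its members. The sketch is written in plain Python rather than R, and the names `normalize`, `cosine_dissimilarity`, and `spherical_kmeans` are hypothetical helpers for illustration only; they are not the interface of the skmeans package.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cosine_dissimilarity(x, p):
    """Return 1 - cos(x, p); assumes x and p already have unit length."""
    return 1.0 - sum(a * b for a, b in zip(x, p))


def spherical_kmeans(docs, k, n_iter=20, seed=0):
    """Illustrative fixed-point iteration (hypothetical helper, not the skmeans API)."""
    rng = random.Random(seed)
    data = [normalize(d) for d in docs]
    # Assumption: prototypes are initialized from k randomly chosen documents.
    prototypes = [list(p) for p in rng.sample(data, k)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: nearest prototype by cosine dissimilarity.
        new_labels = [
            min(range(k), key=lambda j: cosine_dissimilarity(x, prototypes[j]))
            for x in data
        ]
        if new_labels == labels:
            break  # assignments stopped changing: a fixed point was reached
        labels = new_labels
        # Update step: each prototype becomes the normalized sum of its members.
        for j in range(k):
            members = [x for x, lab in zip(data, labels) if lab == j]
            if members:  # an empty cluster keeps its previous prototype
                prototypes[j] = normalize([sum(col) for col in zip(*members)])
    return labels, prototypes


# Toy term-weight vectors: two documents per apparent topic.
toy_docs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
toy_labels, toy_prototypes = spherical_kmeans(toy_docs, k=2)
```

Like other k-means-style procedures, an iteration of this kind only reaches a local optimum that depends on the initialization, which is one reason a package might offer several solvers.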

Item Type: Article
Keywords: spherical / clustering / text mining / cosine dissimilarity / R
Divisions: Departments > Finance, Accounting and Statistics > Statistics and Mathematics > Hornik
Version of the Document: Published
Depositing User: ePub Administrator
Date Deposited: 28 Oct 2013 16:21
Last Modified: 24 Oct 2019 13:41
FIDES Link: https://bach.wu.ac.at/d/research/results/60056/
URI: https://epub.wu.ac.at/id/eprint/4000



