Text Mining Infrastructure in R

Meyer, David and Hornik, Kurt ORCID: https://orcid.org/0000-0003-4198-9911 and Feinerer, Ingo (2008) Text Mining Infrastructure in R. Journal of Statistical Software, 25 (5). pp. 1-54. ISSN 1548-7660

Available under License Creative Commons Attribution 3.0 Austria (CC BY 3.0 AT).

During the last decade, text mining has become a widely used discipline utilizing statistical and machine learning methods. We present the tm package, which provides a framework for text mining applications within R. We give a survey of text mining facilities in R and explain how typical application tasks can be carried out using our framework. We present techniques for count-based analysis methods, text clustering, text classification, and string kernels. (authors' abstract)

Item Type: Article
Additional Information: Article contains supplementary files. See http://dx.doi.org/10.18637/jss.v025.i05
Keywords: text mining / R / count-based evaluation / text clustering / text classification / string kernels
Divisions: Departments > Finance, Accounting and Statistics > Statistics and Mathematics > Hornik
Version of the Document: Published
Variance from Published Version: None
Depositing User: ePub Administrator
Date Deposited: 08 Oct 2013 08:53
Last Modified: 24 Oct 2019 13:41
FIDES Link: https://bach.wu.ac.at/d/research/results/43299/
URI: https://epub.wu.ac.at/id/eprint/3978

