Modeling Mortality Rates In The WikiLeaks Afghanistan War Logs

Rusch, Thomas and Hofmarcher, Paul and Hatzinger, Reinhold and Hornik, Kurt (2011) Modeling Mortality Rates In The WikiLeaks Afghanistan War Logs. Research Report Series / Department of Statistics and Mathematics, 112. WU Vienna University of Economics and Business, Vienna.

Abstract

The WikiLeaks Afghanistan war logs contain more than 76 000 reports about fatalities and their circumstances in the US-led Afghanistan war, covering the period from January 2004 to December 2009. In this paper we use those reports to build statistical models that help us understand the mortality rates associated with specific circumstances. We choose an approach that combines Latent Dirichlet Allocation (LDA) with negative binomial based recursive partitioning. LDA is used to process the natural language information contained in each report summary. We estimate latent topics and assign each report to one of them. These topics, in addition to other variables in the data set, subsequently serve as explanatory variables for modeling the number of fatalities of the civilian population, ISAF forces, Anti-Coalition Forces, and the Afghan National Police or military, as well as the combined number of fatalities. Modeling is carried out with manifest mixtures of negative binomial distributions estimated with model-based recursive partitioning. For each group of fatalities, we identify segments with different mortality rates that correspond to a small number of topics and other explanatory variables as well as their interactions. Furthermore, we carve out the similarities between segments and connect them to stories that have been covered in the media. This provides an unprecedented description of the war in Afghanistan covered by the war logs. Additionally, our approach can serve as an example of how modern statistical methods may lead to extra insight when applied to problems of data journalism. (author's abstract)

Item Type: Paper
Keywords: WikiLeaks / Afghanistan / topic models / model-based recursive partitioning / mixture models / negative binomial / fatalities / data journalism / count data
Divisions: Departments > Finance, Accounting and Statistics > Statistics and Mathematics
Depositing User: Josef Leydold
Date Deposited: 20 Sep 2011 08:40
Last Modified: 24 Feb 2017 14:01
URI: http://epub.wu.ac.at/id/eprint/3210
