It is difficult to estimate the number of civilians killed in Darfur. The region is remote and the Sudanese government has no interest in the truth being known. Two investigators map out a method for finding these elusive numbers.

When one country attacks another, the open nature of the conflict makes it relatively easy to gauge the nature and extent of the violence. But what about government violence within a country? If a government shuts off all access to the country (as happened in Cambodia and Sudan), how can we get a clear picture of what is happening?

The solution isn't easy, but Andreas Hofer Petersen and Lise-Lotte Tullin devised a way to get a reliable sense of the violence in Darfur. Their careful analysis lends confidence to their claims.

But how do they do it?

Gather Good Data

We live in an age of information. But not all of it can be trusted. Sifting reliable information from opinion and hearsay is difficult without a set of guiding principles. This is particularly a challenge when a government actively seeks to hide information. Petersen and Tullin use the following principles to gather information on the events in Darfur.

  • Cast a wide net. Don't rely on one piece or type of information, even if you think it is trustworthy. Petersen and Tullin wanted eye-witness testimonies, and they drew them from a wide range of sources: extensive searches of newspapers, the internet and electronic libraries, and all available reports by NGOs, human rights organizations, the UN and the African Union.
  • Set rules ahead of time for what data you'll accept. A set of rules helps you sort through a welter of information and keeps your data collection from being haphazard. Petersen and Tullin decided to accept eye-witness testimonies only if witnesses could name the village or locality of the attack, knew the date or the month of the attack, and could identify the perpetrators. Other information, some of it perhaps compelling, was available, but the authors' procedure limited the data to what they considered the most reliable.
  • Get all sides of the story. Every story has a number of different facets. Even if you know that a source is unreliable (for instance the Sudanese government claiming that it is only mounting a counterinsurgency) and don't plan on using their account as evidence for what really happened, the information is important in order to fill out the larger picture.
  • Avoid duplicating data. Drawing on different sources for eye-witness testimony runs the risk of one witness's testimony showing up in more than one source. Counting the same account twice inflates the numbers and opens your data to criticism. Take special care to avoid this mistake.
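
The inclusion and de-duplication rules above can be sketched in code. This is only an illustration, not the authors' actual procedure: the `Testimony` record and its field names are hypothetical, and treating two records as duplicates when the witness, village and date coincide is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Testimony:
    # All field names are hypothetical; the report does not define a schema.
    witness_id: str
    source: str
    village: Optional[str]
    attack_date: Optional[str]   # a full date or just a month, e.g. "2004-02"
    perpetrators: Optional[str]


def meets_inclusion_rules(t: Testimony) -> bool:
    """Keep a testimony only if it names the place, the time and the perpetrators."""
    return bool(t.village) and bool(t.attack_date) and bool(t.perpetrators)


def deduplicate(testimonies):
    """Drop repeats of the same account that appear in more than one source."""
    seen = set()
    unique = []
    for t in testimonies:
        key = (t.witness_id, t.village, t.attack_date)  # assumed duplicate key
        if key not in seen:
            seen.add(key)
            unique.append(t)
    return unique


def collect(testimonies):
    """Apply the inclusion rules first, then remove duplicates."""
    return deduplicate([t for t in testimonies if meets_inclusion_rules(t)])
```

In practice, deciding that two published accounts come from the same witness is a judgment call; the key used here simply stands in for that judgment.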

Exclude Questionable Data

Setting rules for what data to include is only one part of the process of gathering good data. You also need to set ahead of time what information you will not include.

For instance, you may decide that you want to use only eye-witness testimony. However, can you trust all sources of eye-witness testimony to be equally accurate? Is testimony reported by Sudanese news sources as trustworthy as eye-witness testimony reported in UN documents? Probably not.

Petersen and Tullin excluded sources from groups associated in any way with the fighting. Both government and rebel sources were excluded. The authors also excluded information from NGOs linked to Darfur ethnic groups overseas. It is possible that the testimony provided by some of these sources is accurate. However, because these groups have a vested interest in the conflict, there is always the suspicion that the information has been “spun” to fit the group's objectives. Better to play it safe and throw out some good information than to be too relaxed and include faulty information.

The investigators also excluded witness testimony that did not provide exact numbers of people killed or specific reference to dead people (for example, “my father,” “my daughter,” etc.). Expressions like “people were killed” or “many people died” were excluded from the data set.
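
As a rough sketch, the exclusion rules for the death tally could be written as a single check. The source-type labels and the function name below are hypothetical, chosen only for illustration; the report describes the rules in prose, not in code.

```python
# Hypothetical labels for the kinds of sources the authors left out.
EXCLUDED_SOURCE_TYPES = {"government", "rebel", "diaspora_ngo"}


def is_excluded(source_type, deaths_count, named_victims):
    """Return True if an account should be left out of the death tally.

    source_type:   label for who published the account (assumed labels)
    deaths_count:  exact number of dead given by the witness, or None
    named_victims: specific people the witness said were killed,
                   e.g. ["my father"]; empty if none were mentioned
    """
    if source_type in EXCLUDED_SOURCE_TYPES:
        return True
    has_exact_count = deaths_count is not None
    has_specific_victims = len(named_victims) > 0
    # Vague statements such as "many people died" give neither.
    return not (has_exact_count or has_specific_victims)
```

For example, `is_excluded("un", None, [])` is `True` (a vague account), while `is_excluded("un", None, ["my father"])` is `False`.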

Define Terms Carefully

In order to analyze data you need to be able to categorize the data. In order to categorize data you have to have clear definitions. This may be particularly difficult with testimony—people don't always tell their stories the same way.

For instance, Petersen and Tullin wanted to get an estimate of how many villages in Darfur had been attacked. The problem was that eye-witnesses didn't always give an exact number. When a witness stated that there were “attacks on villages,” did they mean two villages or fifty villages? When a witness said that “several” villages were attacked, did “several” mean three? Seven? Eleven? There is no way to tell.

Petersen and Tullin decided to define the terms in this way:

  • “Attacks on villages” counted as two villages,
  • “Several villages” counted as three villages.

This allowed the authors to translate testimony into numbers. Calculating the number of villages attacked in this way almost certainly underestimated the number of villages attacked. However, taking a conservative approach protected the authors from being accused of exaggeration.
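
The two definitions above amount to a small lookup table. The sketch below assumes the witness's wording has already been reduced to one of the two phrases; the function name is hypothetical, and counting any other vague wording as zero is an assumption of this example, not something the report states.

```python
# Conservative counts for vague phrases, as defined by Petersen and Tullin.
VAGUE_COUNTS = {
    "attacks on villages": 2,
    "several villages": 3,
}


def villages_counted(phrase):
    """Translate a vague phrase into a conservative number of villages.

    Phrases not in the table count as 0 here (an assumption for this sketch).
    """
    return VAGUE_COUNTS.get(phrase.strip().lower(), 0)


print(villages_counted("Attacks on villages"))  # 2
print(villages_counted("several villages"))     # 3
```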

Cross-Check with Other Reliable Sources

There is no perfect data source. Even reliable data sources admit the possibility of error. Different interpretations are possible from the same set of data. One way to address these issues is to check the facts in one data source against other reliable data sources.

For instance, an eye-witness may say that a particular village was attacked on a particular date or in a specific month. However, other sources may indicate that this isn't plausible (for instance, because of other evidence that the village had been destroyed some time before that). This doesn't mean that a particular witness is lying. It only means that human memories are not perfect.

Petersen and Tullin corroborate accounts of attacks on Darfur villages in two ways:

  • Where did the attack occur? Names of villages were generally transliterated from one of the local languages or dialects into another language. This opened up the possibility of mistakes and confusion about the number of villages attacked. If the name of a single village was transliterated in multiple ways, it could lead the investigators to think that multiple villages had been attacked. Petersen and Tullin included only villages named by witnesses that matched a list of Darfur villages compiled by the UN Office for the Coordination of Humanitarian Affairs (OCHA). They noted whether each village named by a witness was an exact match or a very close one. This not only helped to avoid overestimating the number of villages attacked, but also tied the testimony to a public list that other researchers could use to check the data.
  • How many people were killed? A single villager is unlikely to see or know about everyone killed in an attack. However, survivors often fled together or met other survivors when they reached safety. Petersen and Tullin suggest that by sharing stories and comparing accounts, survivors were able to gain a more comprehensive and accurate knowledge of the number of people killed in their villages. So, reliable estimates of people killed could be culled from the shared accounts and consensus of survivors from the same village or area.
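
The village-name check can be illustrated with approximate string matching. The report does not say how "very close" was judged, so the use of Python's `difflib`, the 0.8 similarity cutoff, and the placeholder village names below are all assumptions made for this sketch.

```python
import difflib


def match_village(reported_name, reference_names, cutoff=0.8):
    """Match a transliterated name against a reference list.

    Returns (reference_name, kind), where kind is "exact" or "close",
    or (None, None) if nothing on the list is similar enough.
    The cutoff is an illustrative choice, not the authors' criterion.
    """
    lookup = {name.lower(): name for name in reference_names}
    key = reported_name.strip().lower()
    if key in lookup:
        return lookup[key], "exact"
    close = difflib.get_close_matches(key, list(lookup), n=1, cutoff=cutoff)
    if close:
        return lookup[close[0]], "close"
    return None, None


# Placeholder names, invented for the example (not real village names).
reference = ["Alpha Town", "Beta Village"]
print(match_village("alpha town", reference))  # ('Alpha Town', 'exact')
print(match_village("Alfa Town", reference))   # ('Alpha Town', 'close')
print(match_village("Gamma", reference))       # (None, None)
```

Recording the kind of match alongside the name mirrors the authors' practice of noting whether a match was exact or only very close, which lets later readers re-examine the borderline cases.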

Calculate Estimates Using Different Assumptions

Even with all these precautions, there is still room for error. And, since all the relevant information cannot be known in these situations, the investigators have no choice but to make certain assumptions. Others can always question your assumptions. So, two ways to lessen the likelihood that your calculations will be discredited are:

  1. Make your assumptions explicit. If you don't make it clear how you are calculating the numbers, then it will look like you have something to hide. Give reasons for making the assumptions you do and demonstrate how your calculations follow from these assumptions. Realize, however, that assumptions can't be proved (that's why they are assumptions) and others may make different assumptions.
  2. Recalculate the numbers based on different assumptions. A useful strategy is to not rely on only a single set of assumptions. Carry out the analysis using different assumptions. Even if one set of assumptions strikes you as most plausible, others may not be convinced. Providing estimates based on different approaches is unlikely to give you a hard and fast number (more likely, you'll get a range of numbers), but it will increase the credibility of your calculations.

Petersen and Tullin's approach to estimating the number of civilians killed in Darfur is a good example of this.

Using the above criteria, Petersen and Tullin were able to identify 101 villages where they could get a specific death toll. The total death toll for these 101 villages was 5733, an average of about 57 people killed per attacked village (5733 ÷ 101 ≈ 56.8).

But, how to estimate the number of people killed in all of Darfur? Surely there were people killed outside these 101 villages?

The investigators knew that there were 2800 villages on the OCHA list. They combined satellite scans and the information from the 101 villages in their sample to estimate that 58% of the villages in Darfur had been burned by September 2005. This gave them one way to calculate the number killed in Darfur:

  • If there are 2800 villages and 58% of them were burned, this equals 1624 villages burned,
  • If an average of 57 people were killed in each of these 1624 villages, then the total number killed would be approximately 92,500 (1624 × 57 = 92,568).
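
The arithmetic of this first estimate is easy to reproduce. The figures below are taken from the text; the variable names are only for illustration, and rounding the average to a whole number follows the text's use of 57.

```python
TOTAL_VILLAGES = 2800   # villages on the OCHA list
PERCENT_BURNED = 58     # share estimated burned by September 2005
SAMPLE_VILLAGES = 101   # villages with a specific death toll
SAMPLE_DEATHS = 5733    # deaths reported in those villages

villages_burned = round(TOTAL_VILLAGES * PERCENT_BURNED / 100)
average_deaths = round(SAMPLE_DEATHS / SAMPLE_VILLAGES)  # rounded as in the text
estimated_killed = villages_burned * average_deaths

print(villages_burned)   # 1624
print(average_deaths)    # 57
print(estimated_killed)  # 92568
```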

This is a simple and straightforward calculation. But is it right? Can we really trust that an average of 57 people were killed per village? Petersen and Tullin take a different approach to estimating the average number of people killed per village.

  • First, there were 101 villages where a specific number of deaths was given. However, there were 43 other accounts where no specific number of deaths was reported. If we add these together we get a total of 144 villages.
  • In eight of the additional 43 accounts, the witnesses reported deaths, but didn't say how many.
  • Let's assume that, like the 101 villages where we have numbers, each of these additional eight villages had 57 people killed. This increases our estimate of the number killed by 456 and raises our total from 5733 to 6189 killed.
  • Let's assume that there were no deaths in the remaining 35 villages, where no deaths were mentioned at all.
  • We now have a total of 6189 people killed in 144 villages, and this gives us an average of 43 people killed per village (compared to the average of 57 in the initial estimate).
  • If we place this new average back into the first equation (of how many villages were attacked), then we have 58% of the 2800 total villages (1624) times 43. Our new estimate of the total number killed is approximately 70,000. This is a difference of roughly 22,500 people from the first calculation!
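
The effect of changing this one assumption can be checked with a short script. The helper function and its name are hypothetical, written only to make the two calculations comparable; the input figures come from the text, and averages are rounded to whole numbers as the text does.

```python
def extrapolate(sample_deaths, sample_villages,
                total_villages=2800, percent_burned=58):
    """Scale a per-village average up to all villages estimated burned."""
    villages_burned = round(total_villages * percent_burned / 100)
    average = round(sample_deaths / sample_villages)  # rounded as in the text
    return average, villages_burned * average


# First assumption: use only the 101 villages with a stated death toll.
avg_first, total_first = extrapolate(5733, 101)

# Second assumption: add 8 villages at 57 deaths each and 35 villages with none.
deaths_second = 5733 + 8 * avg_first
villages_second = 101 + 8 + 35
avg_second, total_second = extrapolate(deaths_second, villages_second)

print(avg_first, total_first)      # 57 92568
print(avg_second, total_second)    # 43 69832
print(total_first - total_second)  # 22736
```

The unrounded gap is 22,736; the figure of 22,500 quoted above comes from subtracting the rounded totals (92,500 and 70,000).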

The above change in the estimate of people killed results from changing just one assumption (the average number of people killed per village). But what if we changed the assumptions about how many villages there are in Darfur? The OCHA numbers are a good reference point, but are they right? What if we change the number of deaths we know about (5733) by taking out deaths reported in “prison camps”? Prison camps are not the same as villages, after all (are people killed more frequently or less frequently in prison camps?). How would this change our calculation?

It should be clear that changing assumptions can make for significant changes in the estimates of the number of people killed.

Petersen and Tullin recalculate their estimates by changing a number of assumptions. After recalculating several times they finally come up with a range. They provide a minimum and maximum estimate (based on different assumptions) of the number of people killed in the violence in Darfur: 56,840 to 128,000.

Data and Methods:

Data Source:

Data consisted of information from a wide range of resources, including: extensive searches of newspapers, the internet and electronic libraries; all available reports by NGOs, human rights organizations, the UN and the African Union. The authors used a strict set of inclusion and exclusion criteria to determine whether to include a particular eye-witness account in the data set.

From the eye-witness accounts the authors were able to estimate the number of villages attacked and the number of people who died in each village.


The authors used several alternative assumptions to extrapolate from their sample to the general population in Darfur. This allowed them to calculate a range of people killed in Darfur during the target period.

Funding Source:

Not reported. [Bloodhound?]


Petersen, A.H. and Tullin, L. 2006. The Scorched Earth of Darfur: Patterns in Death and Destruction Reported by the People of Darfur. January 2001-September 2005. Copenhagen: Bloodhound.

