
Data-driven journalism: Visualizing the lie vs. revealing the truth.

Recommended article. Milagros Salazar defends the idea that journalistic research must establish a methodology for handling data, with the organization, analysis, and verification of data as its fundamental aspects in order to find a real story.


By Sasha Muñoz Vergara. Published July 19, 2021

The research “Data-driven journalism: Visualizing the lie versus revealing the truth” was presented at the Academic Track of the Global Investigative Journalism Conference in 2017, organized by the International Consortium of Investigative Journalists. It argues that journalistic research must establish a methodology for handling data, and that the organization, analysis, and verification of data must be a fundamental part of it in order to find a real story.

Journalism is full of data, but not everything is data journalism, notes the introduction of the research by Milagros Salazar, director of Convoca, an investigative journalism and data analysis outlet in Peru that brings together reporters and programmers.

Data verification is essential

It is crucial to verify the data and to find a story that is as human as possible; after all, that is what the audience is looking for. If the data is not contrasted with the situation “on the ground,” there is a danger that it will show us lies instead of helping us tell the truth so that people can make better decisions about their lives, the Peruvian author argues in her research.

Without a sound methodology based on ethical criteria, databases in journalism can lead to lousy journalism on a grand scale.

Databases may lie

Salazar studied the trend in databases and their methodologies in journalism, along with the challenges, lessons learned, and practices in Latin America, based on her experience at Convoca and interviews with journalists across the region. The work also examines the trends of journalism projects in the region among the nominees and winners of the Data Journalism Awards, organized by the Global Editors Network and Google, between 2012 and 2017.

Databases can lie more than people. But a diligent reporter can detect their lies with the best weapon journalism has: fact-checking. The author notes that the added value journalists bring to a database lies in what we have always learned: contact with reality, reporting from the scene of the facts, and cross-checking information with various sources.

In general, statistics play an increasingly important role. Think tanks, government agencies, independent researchers, academics, and other sources of information routinely produce data that can inform public debate and people’s decision-making.

Accurate interpretation of the data is vital

Most people do not read the raw data or the methodology behind it but rely primarily on the media’s interpretation of statistical information. That puts considerable pressure on journalists: in today’s 24-hour news culture, they have little time to understand how statistics are produced and communicate their broader meaning.

The article calls for closer cooperation between journalists and programmers, since it brings together an enriching mix of different points of view, especially in a context where technology is constantly evolving.

The implementation of data units in newsrooms is still not enough, according to Salazar. Media owners are afraid to invest in database work because they are looking for stories in the short term and are unwilling to commit the resources needed to do so.

Recommendations for a better use of data

The research presents an interview with Costa Rican journalist Giannina Segnini, who defines a series of recommendations that we list below:

1. Question everything and everyone

There is no such thing as a completely reliable source when using data to do meticulous journalism.

2. Check for completeness of data

“A good way to start is to scan the extreme values (maxima and minima of each variable in a data set) and count in Excel how many rows appear within each group to determine if we have all the information. Otherwise, we may reach the wrong conclusion,” the article states.
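The same check can be scripted outside a spreadsheet. Below is a minimal sketch in Python with pandas, not part of Segnini’s original recommendation; the file name contracts.csv and the column region are hypothetical placeholders for whatever data set is being audited.

```python
import pandas as pd

# Load the data set we want to audit (hypothetical file name).
df = pd.read_csv("contracts.csv")

# Scan the extreme values: min and max of each numeric variable
# reveal outliers, typos, and implausible entries.
print(df.describe())

# Count how many rows fall within each group (hypothetical column)
# to check whether the data set appears complete.
print(df.groupby("region").size())
```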

3. Determine whether there are duplicates in the data and eliminate them to obtain correct results

Duplicate information, even if not false, will change the results.
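As an illustration, duplicates can be flagged and removed with a short pandas sketch; the identifying column contract_id is an assumption, and the real key would depend on the data set.

```python
import pandas as pd

df = pd.read_csv("contracts.csv")  # hypothetical file

# Flag every row whose identifying field repeats (assumed key: contract_id).
duplicates = df[df.duplicated(subset=["contract_id"], keep=False)]
print(f"{len(duplicates)} duplicated rows found")

# Keep only the first occurrence so counts and totals are not inflated.
clean = df.drop_duplicates(subset=["contract_id"], keep="first")
```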

4. Verify if the data is accurate

It is necessary to test the database from the beginning.

5. Evaluate the integrity of the data

Since the databases we obtain usually pass through several stages of “input, storage, transmission and registration” of the information, it is essential to run integrity tests whenever necessary, for example, to check that the data has not been altered by people or information systems since the last time we ran the previous checks.
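One simple way to support such a test, not prescribed by the research itself, is to record a checksum of the raw file when it is first received and compare it against later copies. The sketch below uses Python’s standard hashlib module; the file name is hypothetical.

```python
import hashlib

def file_sha256(path: str) -> str:
    """Compute a SHA-256 checksum of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

# Record this value when the data arrives; if it changes later, the file
# was altered somewhere along the input-storage-transmission chain.
print(file_sha256("contracts.csv"))  # hypothetical file name
```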

6. Decipher the acronyms and codes used to classify the data

This makes it possible to describe the significance of the information and find relevant stories.
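In practice, this often means building a code book and mapping it onto the data. The sketch below is illustrative only: the file, column names, and agency acronyms are invented for the example.

```python
import pandas as pd

df = pd.read_csv("contracts.csv")  # hypothetical file

# Hypothetical code book: map agency acronyms to readable names.
code_book = {
    "MEF": "Ministry of Economy and Finance",
    "MINSA": "Ministry of Health",
}
df["agency_name"] = df["agency_code"].map(code_book)

# Codes missing from the code book show up as empty values,
# pointing to classifications that still need to be deciphered.
print(df[df["agency_name"].isna()]["agency_code"].unique())
```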

Datasketch shares with you this and other article reviews on journalism and data visualization.