Services

Data-capture

We help you collect and organize information from public data sources or websites through web scraping.

Data-cleaning

We build structured databases from information spread across multiple sources and formats, organizing the information and standardizing its variables.

Visualization apps

We create public data visualization applications so that your users can explore and understand databases. We use the latest data visualization technologies to communicate information.

Algorithms

We implement artificial intelligence algorithms to facilitate your work with data, from predictive algorithms to pattern recognition.

Web specials

We develop interactive web specials based on data. The specials have different visual components to guide your readers. See examples of our specials.

About us

Datasketch is a digital platform for investigative and data journalism. Our portal allows journalists, data scientists, social scientists and citizens in general to learn about and consult data visualizations, tools, software and in-depth research on various current issues. We offer free data tools and projects that bridge the gap between data and citizens, helping to democratize knowledge and to support a critical review of social realities by contrasting information.

Our team

Juan Pablo Marín

Electronic engineer with a master's degree in computational statistics. Expert in data science with applications in multiple areas such as economics, hydrology and journalism.

Camila Achuri

Statistician and expert in the R programming language. She has developed various data visualization applications on mobility and open data.

Juliana Galvis

Political scientist and candidate for a Master's in Digital Humanities. She is currently leading the development of the Who Is database, as well as supporting journalistic research and the creation of databases.

David Daza

Bachelor of Electronics. Expert in developing applications and websites, with an emphasis on data journalism and content management across multiple databases.

Verónica Toro

Anthropologist and researcher. Responsible for managing and organizing the data community in Colombia and Latin America; she also provides support for journalistic investigations and the creation of databases.

Andrea Cervera

Journalist responsible for writing articles, providing investigative support and managing the community.

Mariana Villamizar

Systems engineer and designer. Expert in user experience, data visualization and graphic communication. Feminist.

Contact

Why every journalist should use The Archive's Wayback Machine

May 23, 2017

"I would certainly be open to closing areas where we are at war with somebody. I sure as hell don’t want to let people that want to kill us and kill our nation use our internet. Yes, sir, I am." - Donald Trump. CNN December 15, 2015.

Right after Donald Trump took office, the White House website took down many of its pages, including all pages in Spanish and pages about civil rights and the LGBT community. Cases like this show why it is useful to be able to access copies of content later in time. And it is not only government sites: discontinued sites, fake news and tweets can be deleted at any moment. It is very useful, in fact necessary, to check sources against the precise information they held at the moment you consulted them. It is also necessary to be able to share this information from a trusted third party, rather than relying only on screenshots, which anyone can easily manipulate. If you want a trusted source that keeps the actual contents of a page as they were when you visited it, you can use the Internet Archive and its Wayback Machine.

The Wayback Machine is one of the services of the Internet Archive. The Archive is essentially a collection of historical snapshots of the internet; its main objective is to provide universal access to all knowledge. It began in 1996 as a project to download all the public pages available on the internet and keep them as a reference. It now holds over 286 billion pages saved over time, amounting to more than 9 petabytes of data, and it currently adds more than 20 terabytes every week. For reference, 2 petabytes correspond to the information held by all US academic research libraries.

The Wayback Machine allows anyone to explore historical captures (snapshots) of internet pages. For example, this was the homepage of the Washington Post on September 12th, 2001.

The Archive is an attempt to keep our digital memory alive, so that we don't lose years and years of collective intelligence and knowledge in the event of a major disaster or accident, as happened to the Library of Alexandria.

In November 2016, the Archive set out to keep a full copy of its data on servers in another country. It currently has partial copies of the Internet Archive in Alexandria, Egypt, and in Amsterdam, the Netherlands. During Donald Trump's presidential campaign, the nature of his statements pushed the non-profit to plan an additional full copy of the Archive in Canada in case of institutional failure in the United States.

Many pages do not have a capture for every single day, so users need to manually save the pages they are interested in if they need a specific snapshot. The process is rather simple: visit http://archive.org/web/ and save your page. You will get a link that records the webpage and the time you saved it, which you can share or publish so your readers know exactly where and when the information was captured.
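For readers who prefer to script this step, here is a minimal sketch of the links involved. It assumes the Wayback Machine's public URL patterns, `https://web.archive.org/save/<url>` to request a capture and `https://web.archive.org/web/<timestamp>/<url>` to address a snapshot. The function names are illustrative only, and the network call is kept in a separate function because the service may rate-limit requests or change its behavior.

```python
from datetime import datetime, timezone
from urllib.request import Request, urlopen

# Assumed base address of the Wayback Machine.
WAYBACK = "https://web.archive.org"


def save_url(page_url: str) -> str:
    """Build the URL that asks the Wayback Machine to capture page_url."""
    return f"{WAYBACK}/save/{page_url}"


def snapshot_url(page_url: str, captured_at: datetime) -> str:
    """Build a link to the snapshot nearest to captured_at.

    The timestamp is written as 14 digits (YYYYMMDDhhmmss) in UTC.
    captured_at should be timezone-aware.
    """
    stamp = captured_at.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S")
    return f"{WAYBACK}/web/{stamp}/{page_url}"


def request_capture(page_url: str, timeout: float = 60.0) -> str:
    """Request a capture and return the final URL after any redirects.

    Needs network access, so it is never called automatically here.
    What the service sends back is up to the service.
    """
    req = Request(save_url(page_url), headers={"User-Agent": "wayback-example/0.1"})
    with urlopen(req, timeout=timeout) as resp:
        return resp.geturl()


if __name__ == "__main__":
    when = datetime(2001, 9, 12, tzinfo=timezone.utc)
    print(snapshot_url("http://www.washingtonpost.com/", when))
```

Publishing the timestamped link rather than the live address is what lets readers see the page as it was when you consulted it.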

You can also save pages straight from your browser using this Chrome extension. Besides saving the web page to the Archive, the extension can retrieve the latest snapshot of a page when it is no longer available.
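The "latest snapshot" lookup can also be done without a browser. The sketch below assumes the Internet Archive's availability endpoint at `https://archive.org/wayback/available` and the JSON shape it returns (an `archived_snapshots` object with a `closest` entry). The helper names are hypothetical, and the sample response is made up for illustration.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed address of the availability endpoint.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_url(page_url: str, timestamp: str = "") -> str:
    """Build the query URL; timestamp is optional (for example "20170523")."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return f"{AVAILABILITY_ENDPOINT}?{urlencode(params)}"


def closest_snapshot(payload: dict) -> Optional[str]:
    """Return the snapshot link from a decoded response, or None if absent."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def lookup(page_url: str, timestamp: str = "", timeout: float = 30.0) -> Optional[str]:
    """Query the endpoint over the network (not called automatically here)."""
    with urlopen(availability_url(page_url, timestamp), timeout=timeout) as resp:
        return closest_snapshot(json.load(resp))


# Made-up response, shaped like the assumed JSON format.
SAMPLE_RESPONSE = {
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/20170523000000/http://example.com/",
            "timestamp": "20170523000000",
            "status": "200",
        }
    }
}

if __name__ == "__main__":
    print(availability_url("example.com", "20170523"))
    print(closest_snapshot(SAMPLE_RESPONSE))
```

Keeping the parsing separate from the network call makes it easy to check the logic against a saved response before relying on the live service.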

So, go ahead and use the Wayback Machine to document your publications and sources before they are taken down.

Juan Pablo Marín Díaz

Juan Pablo is a data scientist. His work in computational statistics has been applied in fields like macroeconomic analysis, hydrology and data journalism.
