Human Rights Documentation In The Digital Age: Why Machine Learning Isn’t A Silver Bullet

When the Syrian uprising started nearly 10 years ago, videos taken by citizens of attacks against them such as chemical and barrel bomb strikes started appearing on social media. While international human rights investigators couldn't get into the country, people on the ground documented and shared what was happening. Yet soon, videos and pictures of war atrocities were deleted from social media platforms – a pattern that has continued to date. Ashoka Fellow Hadi al-Khatib, founder of the Syrian Archive and Mnemonic, works to save these audiovisual documents so they are available as evidence for lawyers, human rights investigators, historians, prosecutors, and journalists. In the wake of the Facebook Leaks, which are drawing needed attention to the topic of content moderation and human rights, Ashoka’s Konstanze Frischen caught up with Hadi.

Hadi al-Khatib, founder of Mnemonic and the Syrian Archive, warns us against an over-reliance on machine learning for online content moderation.

HCPlambeck

Konstanze Frischen: Hadi, you verify and save images and videos that show potential human rights violations, and ensure that prosecutors and journalists can use them later to investigate crimes against humanity. How and why did you start this work?

Hadi al-Khatib: I come from Hama, a city north of Damascus in Syria, where the first uprising against the Syrian government happened in 1982, and thousands of people died at the hands of the Syrian military. Unfortunately, at the time, there was very little documentation about what happened. Growing up, when my family spoke about these incidents, they would speak very quietly, or avoid the topic when I asked them about it. They would say: ‘be careful, even the walls have ears.’ In 2011, during the second big uprising against the Syrian government, the situation was quite different. We immediately saw a huge scale of audio-visual documentation on social media - videos and photos captured by people witnessing the peaceful protests first, and then the violence against protesters. People wanted to make sure the crimes that they were witnessing were documented, in contrast to what happened in Hama in 1982. My work is to ensure that this documentation captured by people who risked their lives is not lost and is accessible in the future.

Frischen: With people publishing this on social media on a very large scale, many people might assume ‘It's all out there, so why do I need someone else to archive it?’

al-Khatib: Yes, good question. When we work with journalists, photographers, citizens from around the world, most of them do think of social media as a place where they can safely archive their materials. They think ‘We have the archive. It's on social media, Dropbox, or Google Drive.’ But it’s not safe there: once this media is uploaded to social media platforms, we lose control of it. From March 2011 until I founded the Syrian Archive in 2014, footage got deleted on a very large scale – and it still is today – because of social media platforms’ content moderation policies. It got worse after 2017 when social media companies like YouTube started to use machine learning to detect content that shows violence automatically.

Frischen: Why do you think the materials get removed from social media platforms?

al-Khatib: Because the machine learning algorithm they have developed doesn't really differentiate between a video that shows extremist content or graphic content, and a video that documents a human rights violation. They all get detected automatically and removed.

Frischen: Though it’s well intended, machine learning can’t handle the complexity?

al-Khatib: Exactly. The use of machine learning is very dangerous for human rights documentation, not just in Syria, but around the world. Social media platforms would need to invest more in human intelligence, not just machine intelligence, to make sound decisions.

Frischen: The Syrian Archive, one of the organizations you founded, has archived over 3.5 million records of digital content. How does that work in practice? How do you balance machine learning and manual work?

al-Khatib: The first step is to monitor specific sources, locations, and keywords around current or historical events. Once we discover content, we make sure that we preserve it automatically, as fast as possible. This is always our priority. Each of the 3.5 million records we have collected comes from social media platforms, websites, or apps like Telegram. We archive them all in a way that provides availability, accessibility and authenticity for these records. We use machine learning with the project VFRAME to help us discover what we have in the archives that is most relevant for human rights investigations, journalism reporting or legal case building within this large pool of media. Then, we manually verify the location, date, and time. We also verify any kind of objects we can see in the video, and make sure we are able to link it with other pieces of archived media and corroborate it with other types of evidence, to construct a “verified incident.” We also use blockchain to timestamp the materials, with a third-party company called Enigio. We want to provide long term, safe accessibility to the documents, and authenticate them in a way that proves we haven't tampered with the material during the archival process.
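
To make the idea of tamper evidence concrete, here is a minimal sketch in Python of how an archived file can be fingerprinted at preservation time and checked again later. It is an illustration only, not the Syrian Archive's or Enigio's actual implementation: the function names, the record fields, and the example URL are all hypothetical, and a real system would also anchor the fingerprint with an independent timestamping service.

```python
import hashlib
from datetime import datetime, timezone


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 hex digest of the archived bytes."""
    return hashlib.sha256(data).hexdigest()


def make_record(source_url: str, data: bytes) -> dict:
    """Build a hypothetical archive record at preservation time.

    The field names are illustrative assumptions, not a real schema.
    """
    return {
        "source_url": source_url,
        "sha256": fingerprint(data),
        "archived_at": datetime.now(timezone.utc).isoformat(),
    }


def is_unmodified(record: dict, data: bytes) -> bool:
    """Check that the bytes still match the fingerprint stored in the record."""
    return fingerprint(data) == record["sha256"]


if __name__ == "__main__":
    original = b"example video bytes"
    record = make_record("https://example.org/video", original)
    print(is_unmodified(record, original))          # unchanged file
    print(is_unmodified(record, original + b"x"))   # altered file
```

Changing even a single byte of the file produces a different digest, so a fingerprint recorded at archiving time, and timestamped by a party the archive does not control, lets anyone later check that the material was not altered.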

Frischen: Machine learning is great for analyzing large data sets, but then human judgment and a deep knowledge of history, politics, and the region must be brought to bear?

al-Khatib: Exactly. Knowledge of context, language, and history is vital for verification. This is all a manual process where researchers use certain tools and techniques to verify the location, date, time of every record, and make sure that it's clustered together into incidents. Those incidents are also clustered together into collections to form a bigger picture understanding of the pattern of violence and the impact it has on people.

Frischen: These findings can in turn be leveraged: You feed the results of your investigations to governments and prosecutors. What has the impact been?

al-Khatib: We realize that any legal accountability is going to take a long time. One of the main legal cases we are working on right now is about the use of chemical weapons in Syria. We focus on two incidents in two locations in Syria, in Eastern Ghouta (2013) and in Khan Sheikhoun (2017), where we saw the biggest uses of chemical weapons (i.e. Sarin gas) in recent history. We submitted a legal complaint to the German, French and Swedish prosecutors in collaboration with the Syrian Center for Media and Freedom of Expression, Civil Rights Defenders, and the Open Society Justice Initiative. Part of that submission was media evidence verified and collected by the Syrian Archive. Our investigations into the Syrian chemical supply chain resulted in the conviction of three Belgian firms that violated European Union sanctions, an internal audit of the Belgian customs system, parliamentary inquiries in multiple countries, a change in Swiss export laws to reflect European Union sanctions laws on specific chemicals, and the filing of complaints urging the governments of Germany and Belgium to initiate investigations into additional shipments to Syria.

Frischen: Wow. Let me come back to the automated content removal on social media platforms – when this happens, i.e. when pieces of evidence of atrocities by the government are deleted, does this then open up windows of opportunity for actors like the Syrian government to flood social media with other, positive images, and thus take over newsfeeds?

al-Khatib: Yes, absolutely. Over the last 10 years, we've seen this kind of information propaganda coming from all sides of the conflict in Syria. And our role within this information environment is to counter disinformation by archiving, collecting and verifying visual materials to reconstruct what really happened and to make sure that this reconstruction is based on facts. And we are doing this transparently, so anyone can see our methodology and the tools we are using.

Frischen: How are the big social media companies responding? Do you see them as collaborative or as distant?

al-Khatib: Many civil society organizations from around the world have been engaging with social media companies and asking them to invest more resources into this issue. So far, nothing has changed. The use of machine learning is still happening. A huge amount of content related to human rights documentation is still being removed. But there has absolutely been engagement and collaboration throughout the years, especially since 2017. We worked with YouTube for example to reinstate some of the channels that were removed, as well as thousands of videos that were published by credible human rights and media organizations in Syria. But unfortunately, a big part of this documentation is still being removed. The Facebook Leaks reveal the company knew about this problem, but they are continuing to use machine learning, erasing the history and memory of people around the world.

Frischen: How do you attend to the wellbeing of the humans involved in gathering and triaging violent and traumatic content?

al-Khatib: This is a very important question. We need to make sure there is a system of support for all researchers looking at this content – practical assistance from psychologists that understand all the challenges and mitigate some of them. We are setting up protocols, so the researchers have access to experts. There are also some technical efforts underway. For example, we work with machine learning to blur images at the beginning, so researchers are not seeing graphic images directly on their screen. This is something that we want to do more work on.

Frischen: What gives you hope?

al-Khatib: The will of people who are facing the violence firsthand, and the families of victims. Whether in Syria or other countries, they did not yet get the accountability they deserve, but regardless, they are asking for it, fighting for it. This is what gives me hope – working together with them, adding value by linking documentation to justice and accountability, and using this process to reconstruct the future of the country again.

Hadi al-Khatib (@Hadi_alkhatib) is the founder of Syrian Archive and its umbrella organization Mnemonic.

This conversation was condensed and edited. Watch the full conversation & browse more insights on Tech & Humanity.
