Censorship on Centralized Code Collaboration platform?

Censorship on Code Collaboration Platforms

“Censorship reflects a society’s lack of confidence in itself. It is a hallmark of an authoritarian regime.” - Potter Stewart


When the internet was created, it was meant to be a utopian space where people could communicate and learn free of restrictions. But as countries increasingly build firewalls to censor online content and ban internet services that do not comply with government regulations, that freedom has been compromised. What was once a free space has been tamed, curtailed, and limited.

But internet censorship is a complicated issue, because it is not always bad. Depending on your stance, there are many circumstances in which censorship can be seen as justified. Many people agree on this; what they do not agree on is who gets to decide which content is censored. Most people do not accept representatives of a central authority making that decision on their behalf.

This blog is about one form of censorship that has become a problem in recent years: censorship on code hosting and collaboration platforms.

When discussing censorship on code hosting and collaboration platforms, it is enough to consider the cases on the most popular of them, GitHub. With over 60 million developers building their projects on it, GitHub has a near-monopoly in the market.

Censorship of GitHub

GitHub has faced various kinds of censorship since its inception, including:

  • DMCA takedowns
  • Trade control compliance
  • Being targeted by governments of different countries
  • Removal of content due to abuse-related violations

Being a centralized platform, GitHub has little choice but to comply with most of the censorship demands that come its way, and its users have no say in the matter. Let’s look at each of these scenarios in detail.

DMCA takedown

Takedown Notice processed vs. the number of repositories affected. Image source: 2020 Transparency Report | The GitHub Blog

The risk of a DMCA takedown is a growing concern among developers who host their projects on GitHub. The issue gained wide attention when GitHub removed a popular open-source repository, youtube-dl, along with its public forks, after receiving a DMCA takedown notice from the RIAA.

The Electronic Frontier Foundation (EFF) explained that the youtube-dl project was never in violation of the DMCA. According to the EFF, what the RIAA described as a suggestion in the youtube-dl documentation to pirate certain songs was only a test that streams a few seconds of those videos to show that the software is working. That is well within fair use rights.

GitHub’s decision to remove youtube-dl drew criticism and protest from youtube-dl users and the wider open-source community. Users reposted the source code across the internet in multiple formats; on Twitter, people posted images that contained the entire youtube-dl source code encoded in the colors of the pixels. The incident made developers aware that their own projects could one day be removed through a malicious DMCA takedown.
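The image trick mentioned above can be illustrated with a small sketch. This is not the scheme the Twitter users actually used, which is not described here; it is only an assumed, minimal way to pack source bytes into RGB pixel values and recover them again. The function names `encode_to_pixels` and `decode_from_pixels` are hypothetical, and turning the pixel list into a real image file would need an image library, which is left out.

```python
from __future__ import annotations


def encode_to_pixels(data: bytes) -> list[tuple[int, ...]]:
    """Pack bytes into RGB pixels, three bytes per pixel (illustrative scheme)."""
    # Prefix the payload with its length (4 bytes, big-endian) so that
    # the padding added below can be stripped again when decoding.
    payload = len(data).to_bytes(4, "big") + data
    # Pad with zero bytes so the payload splits evenly into 3-byte pixels.
    payload += b"\x00" * (-len(payload) % 3)
    return [tuple(payload[i:i + 3]) for i in range(0, len(payload), 3)]


def decode_from_pixels(pixels: list[tuple[int, ...]]) -> bytes:
    """Recover the original bytes from a list of RGB pixels."""
    raw = bytes(channel for pixel in pixels for channel in pixel)
    length = int.from_bytes(raw[:4], "big")
    return raw[4:4 + length]


if __name__ == "__main__":
    source = b"print('hello')\n"  # stand-in for a source file
    pixels = encode_to_pixels(source)
    print(len(pixels), "pixels")
    print(decode_from_pixels(pixels) == source)
```

Because each pixel's color channels are just numbers from 0 to 255, any file can be carried this way, which is why an image host can end up distributing code it never intended to host.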

youtube-dl Removal Notice Image source: GitHub blocks public access to youtube-dl after RIAA issues DMCA notice - Wikinews, the free news source

In the end, following the public outcry and with the help of the EFF, GitHub reversed its decision and reinstated the youtube-dl repository.

youtube-dl is just one example of a GitHub repository taken down because of a DMCA notice. According to GitHub’s 2020 transparency report, GitHub received and processed 2,097 valid DMCA takedown notices in 2020 alone. This number may include legitimate projects whose owners chose not to file a counter-notice because the reinstatement process is tedious.

Another major problem is that the forks of a project named in a notice are taken down with it, so a single takedown notice can affect many projects. The DMCA takedown of youtube-dl affected its 12,000 forks. Counting forks, GitHub took down a total of 36,173 projects in 2020, an average of roughly 17 projects per notice.

DMCA takedowns have been increasing over the years. Although the total is tiny compared with the number of projects hosted on GitHub, the upward trend is a cause for concern.

Trade Control Compliance

Repository removal notice due to trade compliance issues. Source: GitHub blocked my account and they think I’m developing nuclear weapons | by Hamed | Medium

To comply with US trade sanctions, GitHub has barred users in some countries from accessing parts of its services. The US government has imposed sanctions on several countries and regions (Crimea, Cuba, Iran, North Korea, and Syria), which means GitHub is not fully available in those places. GitHub’s CEO said on this topic that GitHub, like any other company doing business in the US, is subject to US trade laws.

This is a major cause for concern, since it means that developers and communities in your country could be barred from GitHub at any time if the US government so decides. That may make GitHub unsuitable for hosting collaborative open-source projects, because it goes against open-source values. Sudden interruptions of GitHub services due to trade sanctions can also disrupt the normal functioning of many services and harm the economies of the sanctioned countries.

An example makes the problem faced by ordinary developers clearer. Say you are collaborating on a project with developers from across the world, and the US government unexpectedly imposes sanctions on a country where one of your collaborators lives. That developer can no longer access GitHub, and you have to find another way to finish the project: either move the project to another platform or find another developer to take over the work.

Neither option is easy to carry out at short notice. Moving means giving up the familiarity and ease you have built up from using GitHub for a long time in exchange for a new and unfamiliar platform. Finding a suitable replacement developer can also be difficult and time-consuming. Either way, your project is likely to be delayed.

Being Targeted by Governments of Various Countries

Over its history, GitHub has been the target of censorship by the governments of many countries, including China, India, Russia, and Turkey, through methods such as blocks by local internet service providers and denial-of-service attacks on GitHub’s servers. In each case GitHub was eventually unblocked, mainly because of backlash from users and technology businesses or because GitHub complied with the country’s demands.

China

China heavily regulates its internet traffic, and it has used many different methods to make GitHub comply with its demands.

  • On January 21, 2013, GitHub was blocked in China using DNS hijacking. The block was later lifted on January 23, 2013, after an online protest on Sina Weibo.
  • On January 26, 2013, GitHub users in China experienced a man-in-the-middle attack in which attackers could have intercepted traffic between the site and its users in China. A similar attack was carried out again on March 26, 2020, against GitHub Pages, and on March 27, 2020, against GitHub.com.
  • On March 26, 2015, GitHub was the target of a distributed denial-of-service (DDoS) attack originating from China. It targeted two anti-censorship projects: GreatFire and cn-nytimes, the latter including instructions on how to access the Chinese version of The New York Times. GitHub later blocked China-based IP addresses from visiting these repositories.

India

India selectively censors websites at the federal and state levels. On December 17, 2014, the Indian Department of Telecom issued an order to ISPs to block 32 websites which included GitHub.

The stated purpose was to force these websites to cooperate with the Indian government on investigations and to remove “objectionable content”. GitHub was unblocked on January 2, 2015.

Russia

Under the Russian Internet Restriction Bill, which is meant to protect children, the Russian government blacklists websites that contain child pornography, drug-related material, advocacy of suicide, extremist material, and other illegal content. Roskomnadzor, Russia’s regulatory agency, maintains this list.

In October and December 2014, GitHub was temporarily blocked for hosting a page containing (mostly) satirical suicide instructions, frequently used to troll the Russian censorship system.

Turkey

On October 8, 2016, following the leak of Turkish Minister Berat Albayrak’s emails by RedHack, the Information and Communication Technologies Authority (BTK) ordered ISPs to block several file-sharing websites, including GitHub, Dropbox, Microsoft OneDrive, and Google Drive. RedHack had deliberately spread the emails across multiple services, expecting Turkey to block them, in order to exploit the Streisand effect and draw attention to the leak. GitHub was unblocked 18 hours later.

After these government bans, GitHub began taking takedown requests from governments seriously and acting on them swiftly. When it receives a valid takedown request from a government agency, GitHub primarily responds by geoblocking the content, that is, blocking it only in the jurisdiction the request came from.

In 2020, GitHub received and processed 44 government takedown requests based on local laws — all from Russia. These takedowns resulted in 44 projects being blocked in Russia. In comparison, in 2019, GitHub processed 16 takedowns that affected 54 projects. In addition to requests based on violations of local law, GitHub processed 13 requests from governments to take down content as a Terms of Service violation, affecting 12 accounts and one repository in 2020.

Removal of Content due to abuse-related violations

GitHub’s Terms of Service include content and conduct restrictions. They cover discriminatory content, doxxing, harassment, sexually obscene content, incitement to violence, disinformation, and impersonation. GitHub takes action according to the type and severity of the violation: the actions range from disabling the offending repository, to hiding a user account from public view while still allowing the user access, to restricting the user from accessing their account.

According to GitHub’s 2020 transparency report, in 2020:

  • GitHub hid 4,826 accounts and reinstated 415 hidden accounts.
  • GitHub restricted an account owner’s access to 47 accounts and reinstated it for 15 accounts.
  • For 1,178 accounts, GitHub both hid and restricted the account owner’s access, lifting both of those restrictions to fully reinstate 29 accounts and lifting one but not the other to partially reinstate 12 accounts.
  • As for abuse-related restrictions at the project level, GitHub disabled 2,405 projects and reinstated only four in 2020.

Few people object to censorship in these cases. The main cause for concern is that GitHub sometimes falls prey to the Scunthorpe problem when moderating content this way.

The Scunthorpe problem arises because, while computers can easily find strings of text within a document, interpreting words and phrases requires understanding a wide range of contexts, perhaps across many cultures, which is a much harder task. As a result, broad blocking rules can produce false positives that flag innocent phrases.
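A short sketch shows how such false positives come about. It is an illustration only and says nothing about how GitHub’s moderation actually works: the blocklist, the sample strings, and the function names `naive_flag` and `word_flag` are all hypothetical. The first function flags text whenever a blocked term appears anywhere as a substring; the second only flags whole words.

```python
import re

# Hypothetical blocklist with one mild term, chosen only for illustration.
BLOCKLIST = {"hell"}


def naive_flag(text: str) -> bool:
    """Flag text if any blocked term appears anywhere as a substring."""
    lowered = text.lower()
    return any(term in lowered for term in BLOCKLIST)


def word_flag(text: str) -> bool:
    """Flag text only if a blocked term appears as a whole word."""
    alternatives = "|".join(re.escape(term) for term in sorted(BLOCKLIST))
    pattern = r"\b(?:" + alternatives + r")\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


if __name__ == "__main__":
    samples = [
        "Run the shell script before deploying",
        "print('Hello, world')",
        "go to hell",
    ]
    for sample in samples:
        print(f"{sample!r}: naive={naive_flag(sample)}, word={word_flag(sample)}")
```

The substring check flags the harmless "shell" and "Hello" lines, while the whole-word check does not. Even the whole-word check cannot tell whether a word is used as an insult or as an ordinary technical term, which is the part that still needs human judgment about context.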

GitHub nuked repositories over the use of the word ‘retard’. Source: [TW: Not tumblr] Github disables repo for use of the word "retard" - Imgur