Language homogenization on the web

Motherboard says The Internet Is Killing Most Languages:

The great flat, globalized world of the internet operates pretty much as a monoculture, Kornai says. Only about 250 languages can be called well-established online, and another 140 are borderline. Of the 7,000 languages still alive, perhaps 2,500 will survive, in the classical sense, for another century, and many fewer will make it on to the internet.

Globalization of the world and the web could lead to homogenization of the languages in both places.

The adage “If it’s not on the web, it does not exist,” neatly encapsulates the loss of prestige. And as a generation of digital natives comes up, their online tongue is likely not to be their mother tongue—a loss of competence.

Languages on the web matter for self-identity

Is homogenization of language on the web an instantiation of totalitarianism?

Boston Review on Herta Müller’s Language of Resistance:

Since language plays such an important part in the construction of the self, when the state subjects you to constant acts of linguistic aggression, whether you realize it or not, your sense of who you are and of your place in the world are seriously affected. Your language is not just something you use, but an essential part of what you are. For this reason any political disruption of the way language is normally used can in the long run cripple you mentally, socially, and existentially. When you are unable to think clearly you cannot act coherently. Such an outcome is precisely what a totalitarian system wants: a population perpetually caught in a state of civic paralysis.

What if the web is the state, in this context? What does it mean for self-identity, power, and a neutral web?

Top-level domains and nationalism

In 2010, Irina Shklovski and David M. Struthers wrote an excellent article on Kazakh national identity and how it is reflected in top-level domain name choices: Of States and Borders on the Internet: The Role of Domain Name Extensions in Expressions of Nationalism Online in Kazakhstan. The Oxford Internet Institute makes a PDF available.

The space on the internet is easily traversable and state boundaries in the form of domain extensions can be crossed with no more effort than a click of a mouse. Yet, what might such traversals of imagined state boundaries on the internet mean to the people doing the traversing? This question is especially relevant when considering people from Kazakhstan, a country where notions of statehood and nationalism are contested and are in the process of being renegotiated. Results presented here suggest that residents of Kazakhstan are acutely aware of national boundary traversals as they navigate the internet. The naming of a state-controlled space on the internet, through the use of ccTLDs, does in fact matter to the average user. Citizens of Kazakhstan often identified their activity on the internet as happening within or outside the space of the state to which they felt allegiance and attachment. We argue that naming matters for the creation of not only imagined communities online but also for individual expressions of nationalism on the internet.

Kazakhstan was previously part of the USSR.

There are several ways online spaces such as websites or other internet resources might signal their national affiliation. One such way is through the use of “country-code top-level domain names” (ccTLDs) that are in fact managed by an organization affiliated with the country in question that is the “designated manager” of second-level domain names (DNS) with the defined ccTLD (Postel 1994). The presence of a ccTLD often does not imply that the server that houses the page is in fact physically located on the territory of the country that the ccTLD denotes. However, symbolically, the webpage or an internet resource would display its national affiliation regardless of its actual physical location. We argue that the majority of internet users do not know and likely do not care where the resources they use online are physically located, but pay attention to the symbolic information embedded in the URLs as well as in the content they consume. In fact, prior research demonstrates that barring the physical locations of online resources, a direct analysis of links between sites based exclusively on their URLs indicated that most sites tend to link within a given ccTLD rather than across ccTLDs (Halavais 2000).

ccTLDs can operate as a national identity signifier, a way to entrench political borders on the web.
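To make the URL-only link analysis described above concrete, here is a minimal Python sketch. It is not the method used by Halavais or by Shklovski and Struthers; it only illustrates the idea of reading national affiliation off a URL. It assumes that any two-letter ASCII top-level domain is a country code, and the example.kz, example.ru, and example.com links are made up for illustration.

```python
from collections import Counter
from urllib.parse import urlsplit


def top_level_domain(url):
    """Return the last label of the URL's hostname, or None if there is none."""
    host = urlsplit(url).hostname  # already lowercased by urlsplit
    if not host or "." not in host:
        return None
    return host.rstrip(".").rsplit(".", 1)[-1]


def is_cctld(tld):
    """Assumption: two ASCII letters means a country-code TLD (kz, ru, us)."""
    return tld is not None and len(tld) == 2 and tld.isascii() and tld.isalpha()


def link_scope(source_url, target_url):
    """Classify a link as staying within one TLD or crossing to another."""
    src = top_level_domain(source_url)
    dst = top_level_domain(target_url)
    if src is None or dst is None:
        return "unknown"
    return "within" if src == dst else "across"


# Hypothetical (source page, linked page) pairs, not real data.
LINKS = [
    ("https://news.example.kz/story", "https://blog.example.kz/post"),
    ("https://news.example.kz/story", "https://www.example.com/"),
    ("https://shop.example.ru/item", "https://forum.example.ru/thread"),
]

counts = Counter(link_scope(source, target) for source, target in LINKS)
print(counts)
```

Nothing here looks at where a server physically sits. Like the users in the study, it only reads the symbolic marker in the URL.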

Although ccTLDs are the most common marker of national affiliation, they are rarely used in the US, suggesting a largely US-centric structure of generic TLD use such as .com, .net or .org (Leiner et al 2002). The lack of a country-identification for US businesses and personal sites may have been one of the drivers for the idea that the internet can be a borderless space. The use of ccTLDs is far more common in countries other than the US. We suggest that one of the reasons for this could be an attempt to carve out a national space on the internet where borders are deliniated, to clearly mark non-US territories and to provide symbolic markers for internet users.

The United States is an exception to this sort of identification. We’re the white people of the web in this way—the rarely-acknowledged default that doesn’t experience a national identification with our ccTLD because so much of the web (and our web) is US-built and US-centric.

Although much rhetoric in western countries still speaks of the one single internet that spans the world, the experience of talking about the internet in Kazakhstan begins to question this notion of a global undifferentiated online space.

Is there really a global single internet, or is there a series of differentiable internets that happen to use similar architecture and integrated infrastructure?

In Kazakhstan, many commented on the importance of both Russian and English for simply navigating online. For many young Kazakh-speaking respondents, however, use of Kazakh was an important marker of ethnic identity and a deliniation [sic] of national space online. Initially, young ethnic Kazakh activists translated interfaces of existing Western resources such as Facebook and WordPress into Kazakh by contacting the companies and offering translation services for free.

Language matters too, as a means of communication and also of self-expression.

I recommend reading the whole thing.

Politics and server locations

Theorizing the Web 2014 included a panel on World Wide Web(s): Theorizing the Non-Western Web. The participants, as listed in the program:

  • Presider / Jillet Sarah Sam @JilletSarahSam
  • Hashmod / Alice Samson @theclubinternet
  • Panelists:
    • David Peter Simon | @davidpetersimon | The Do-Gooder Industrial Complex?
    • Jason Q. Ng | @jasonqng | Fit for Public Display: Rethinking censorship via a comparison of Chinese Wikipedia with Hudong and Baidu Baike
    • Tolu Odumosu | @todumosu | Phoning the Web: A critical examination of Web infrastructure in Sub-Saharan Africa
    • Dalia Othman | @daliaothman | Social Media, Activism and the Middle East

The live tweets from the session included some interesting tidbits.

The borderless internet is a myth

The Atlantic covers The Myth of a Borderless Internet. Political borders are re-enshrined on the web in both a literal and a metaphorical sense.

Just like the cartographers of yore, multinational corporations—particularly Internet companies—play a role in defining and shaping political boundaries for the public’s consumption. This rise of huge, international corporations online has torn away at the Emerald Curtain that once obscured the variety of geopolitical boundaries that exist in the world, making clearer to the average person just how unsettled the planet’s borders really are.

Given the global nature of the Internet, corporate giants like Google and Microsoft are forced to define borders, often contending with demands from governments. The result? One’s view of certain countries’ borders is often dependent on the physical location from which one accesses Google or Bing maps. In other cases—such as that of the Western Sahara—jurisdiction is a determining factor. Microsoft, which has offices in Morocco, takes its cue from Rabat in determining the territory’s borders, while Google—which does not—draws a dotted line between Morocco and the Western Sahara, demarcating the disputed border.

Political borders are enshrined in mapping tools, but reflected differently based on the nation-state that you occupy. The web has clear political borders, and the map you see on the web does too.

Rather than remove content entirely as other companies do, Twitter created a system whereby content would be “withheld” from users in a given country. Users are notified that the content in question has been withheld due to a legal request from a government. In addition to Pakistan, the tool has been used in numerous countries, including France, Brazil, and Russia.

The tool’s usage means that one “view” of the platform from a given country is different from the view from another. In other words, a Pakistani Twitter user is provided a sanitized version of Twitter, while an American one has access to—as far as we know—whatever content they desire. Corporate decisions around controversial speech, such as this one, all too often result in the creation of an “iron curtain” of sorts, dividing the seemingly borderless Internet.

The web you see in one country might not be the same web you see in another country. Political borders matter.
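A rough way to picture the "withheld" mechanism is a per-country filter applied when the timeline is rendered. The Python sketch below is a toy model under my own assumptions, not Twitter's actual implementation or API: the Post class, the withheld_in field, the notice wording, and the sample posts are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Post:
    post_id: int
    text: str
    # Hypothetical field: country codes where a legal request applies.
    withheld_in: frozenset = field(default_factory=frozenset)


# Hypothetical notice text shown in place of withheld content.
WITHHELD_NOTICE = "This content has been withheld in your country."


def view_for(posts, country_code):
    """Render the same posts as seen by a viewer in the given country."""
    viewer = country_code.upper()
    return [
        WITHHELD_NOTICE if viewer in post.withheld_in else post.text
        for post in posts
    ]


# Made-up timeline: the second post is withheld only for viewers in PK.
timeline = [
    Post(1, "Hello, world"),
    Post(2, "A contested post", withheld_in=frozenset({"PK"})),
]

print(view_for(timeline, "PK"))
print(view_for(timeline, "US"))
```

The content is never deleted in this model. The same list of posts produces two different views, which is the point the article is making about borders.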

Who owns the ccTLDs?

Lawfare blog covers an interesting case that attempts to answer the question Are Top-level Domains Property? 

On December 28 [2015], the Justice Department filed an amicus brief in Weinstein v. Islamic Republic of Iran, a case pending before the D.C. Circuit. At issue is whether country-code top-level domains are the property of those countries’ foreign governments.

Does a country’s government own the country-code top level domain that represents that country?

DOJ argues first that ccTLDs are not attachable “property” or “assets” under the FSIA or TRIA. Rather, ccTLDs “merely [] designat[e] . . . the national affiliation of a subset of the global Internet community,” including “millions of private businesses and individuals.”

Although the right to designate its territory “Iran” is presumably valuable to the Iranian government, no one would suggest that the name “Iran” in an atlas or a newspaper—or even official publications—is itself the “property” of the Iranian government subject to attachment by creditors.  

The Justice Department focuses primarily on the practical mechanisms of Internet governance. To support its position, DOJ points to a 1994 Internet governance document describing the Internet naming authority as a responsibility, not a property right, as well as “the actual practice under which country-code top-level domains have been established and managed.” In practice, ICANN “delegat[es]” TLD management to regional managers on the basis of whether the manager will be a “technically competent trustee of the domain on behalf of the national and global Internet communities.” In this sense, TLDs differ from second-level domains, which private parties purchase from the TLD-managers. Importantly, DOJ does treat second-level domains as property.

Internet domain stewardship is complex. Per the court:

any court order treating TLDs as property would threaten “the multi-stakeholder model of Internet governance” because other countries would react by “turn[ing] their backs on ICANN for good.” This risk of root zone anarchy not only eliminates any potential value for plaintiffs—who had hoped to profit from licensing Iran’s ccTLD—but also would “be devastating for ICANN.”

ICANN > a national government, in this case.

Searching without words

Search could be moving to images, in which case language may play a smaller part in how people find things on the web.

Fast Company goes Inside Baidu’s Plan To Beat Google By Taking Search Out Of The Text Era

In many cases, text-based search is not ideal for finding information. For instance, if you’re out shopping and spot a handbag you might like, it is far better to take a picture than to try and describe it in words. The same is often true if you see a flower or animal species that you would like to identify.

The Balkanization of the Internet

Political and legal borders interact to create a potentially balkanized future internet. Time Magazine says The Future of the Internet is Balkanization and Borders.

Rousseff’s plan to create walled-off, national Intranets followed reports that the United States has been surveilling Rousseff’s email, intercepting internal government communications, and spying on the country’s national oil company, so it was somewhat understandable. But her move could lead to a powerful backlash against an open Internet – one that would transform it from a global commons to a fractured patchwork severely limited by the political boundaries on a map.

The former Brazilian president wanted to protect her privacy by reinforcing political borders on the web.

The NSA has also opened a Pandora’s box by treating “citizens” and “foreigners” differently (even defining both groups in myriad different ways). U.S. rules also impose geo-locational-based jurisdictional mandates (based upon the route of your Internet traffic or the location of the data services and databases you use). Already, a German citizen accessing a New York City data center via a Chinese fiber line may find their data covered by an array of conflicting legal requirements requiring privacy and active surveillance at the same time.

What does it mean to be a citizen vs a foreigner when browsing the web and using the internet?

Chance thanks Obama for us with a ccTLD

Chance the Rapper has a new fashion line full of clothing that celebrates Barack Obama’s presidency: https://www.thankuobama.us/

With obvious references to Obama (the king Obama t-shirt) and some more oblique ones (a jersey with 44, because he was the 44th president), it’s only appropriate that the URL contains some symbolism.

He put the site up on thankuobama.us, which reads two ways:

  • Us as in “us”: we the people thank him for being our president.
  • But also Us as in US, as in USA, as in the ccTLD for the USA. The USA thanks him, and how much more patriotic can a URL get in this context?

Thank a former president with a URL that refers to the USA in multiple ways. Good work.

The tech media isn’t flat

Model View Culture confronts “The App You’ve Never Heard Of”: Exploring Western Bias in Tech Media.

It is flabbergasting that LINE–an app that beats out Messenger and WhatsApp in Thailand and Indonesia–or WeChat or even Alibaba would ever be so baldly described as “little-known.” Little known to Americans or Europeans? Perhaps, since they were not part of the original target market. But “little known” to millions of people in Asia? Certainly not.

This pattern reflects the arrogance and shortsightedness of tech publications which, although often having primarily Western staff, are consumed globally: English speakers around the world — both within and outside of the tech industry — consume Western tech news; after all, Silicon Valley is home to international giants like Facebook, Apple, and Google. Such headlines erase huge populations of users, not only internationally but even in the West itself. Take, for example, the large population of immigrants to the West: just as WeChat remains significant to Chinese Australian immigrants and their families, many immigrants from non-Western cultural backgrounds remain connected to the technology of their (or their extended family’s) homeland. For example, South Korea’s most popular chat app, KakaoTalk, is installed on 93% of smartphones in the country; in America, the majority of KakaoTalk’s downloads are by Korean immigrants and Korean Americans. Not acknowledging just how significant KakaoTalk is to the Korean tech industry and to Korean Americans is exclusionary and, frankly, ignorant.

Just because some tech comes from Silicon Valley doesn’t mean that tech popular in non-Western markets is inherently “unknown.”

Yet a biased, narrow focus in tech journalism contradicts and subverts these outcomes. It’s time tech writers and bloggers educate themselves about what’s dominating the markets in parts of the non-Western globe, and move towards journalism that truly reflects a commitment to technology that is changing the world… not just Silicon Valley.