Showing posts with label endangered languages.

26 November 2010

Wikipedia as (Multilingual) Word-Hoard

Wikipedia is often regarded as little more than a poor person's encyclopedia, providing a handy reference collection of basic facts. But there's another side that I predict will be increasingly recognised: its role as a key corpus of texts in languages that lack traditional large-scale publishing to preserve their cultures.

Here's a good example:

"Some Indian-language Wikipedias are already the largest online repositories of information in their respective languages," Bhati said. "Regular community meetings such as the one we had today in Ahmedabad can help spread the word about our mission."

This facet is even more important for languages with a relatively small number of speakers, or perhaps threatened with outright extinction. Wikipedia acts as a natural focus for the creation of texts in these languages that might otherwise be missing - a repository of linguistic wisdom that can be shared and built on. In this way, it plays an important role not just in spreading knowledge about the world, but also about the languages that people use to talk about that world. (via @klang67)

Follow me @glynmoody on Twitter or identi.ca.

28 August 2008

Words Fail Us

linguistics professor and author shares a personal selection from the thousands of languages on the brink of disappearing

How about if we all volunteered to learn an endangered language? - You can put me down for Ket:

Ket is the only Siberian language with a tone system where the pitch of the voice can give what sound like identical words quite different meanings. (Much like Chinese or Yoruba). To add to the difficulty for any westerner wishing to learn it, it also has extremely complicated word structure and grammar.

23 April 2008

Well, Well, WALS

Now that's what I call open content:

WALS is a large database of structural (phonological, grammatical, lexical) properties of languages gathered from descriptive materials (such as reference grammars) by a team of more than 40 authors (many of them the leading authorities on the subject).

WALS consists of 141 maps with accompanying texts on diverse features (such as vowel inventory size, noun-genitive order, passive constructions, and "hand"/"arm" polysemy), each of which is the responsibility of a single author (or team of authors). Each map shows between 120 and 1110 languages, each language being represented by a symbol, and different symbols showing different values of the feature. Altogether 2,650 languages are shown on the maps, and more than 58,000 datapoints give information on features in particular languages.

WALS thus makes information on the structural diversity of the world's languages available to a large audience, including interested nonlinguists as well as linguists who would not normally read grammars of exotic languages or specialized works by comparative linguists. Although endangered languages are not particularly emphasized, they are automatically foregrounded because of the large sample of languages represented on each map, where each language (independently of its number of speakers) is shown by a single symbol.

(Via Languagehat.)