The Indian language Wikipedia communities have so far looked mostly at input statistics, such as the number of pages and page depth, to measure their growth. Absolute numbers based on content contribution are not the right metric for comparing languages. An outcome-based measure, such as the number of unique visitors per million speakers, is better. Instead of ranking by raw value, comparing groups of languages by grade gives better insight. I present this here for Indian languages with more than 1 million primary and secondary speakers. I have excluded Sanskrit and Bishnupriya Manipuri (scores of >10000). I have also dropped English (>10000), as it is a worldwide language and is not useful for comparison. I have included Chinese (zh), as that language has problems similar to ours with regard to use on computers.
The chart shows that Malayalam tops the list and is on par with Chinese (approximately 150). Tamil, Telugu and Marathi are in the second group (>50); Kannada, Urdu, Hindi, Gujarati and Bengali are in the third group (>25). Assuming that these language communities have similar issues with regard to Wikipedia, I expect this trend to continue for the next decade. The only surprise may be if Hindi beats the prediction and emerges on top, because it is treated as a national language and is part of the curriculum. I hope these statistics will make the Indian Wikipedia communities think about what needs to be done to move to the top grade. Another interesting analysis, on how soon these will grow, would be to look at the number of active Wikipedians and at what is being done to promote Wikipedia in these communities. I plan that for a later post. Meanwhile, I look forward to your feedback.
For those who are interested in the inputs for this analysis: I took the daily statistics for 2009-09-01 and the language population numbers from the wiki to generate this chart.
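The calculation behind the grading can be sketched roughly as follows. This is only an illustration of the idea, not the spreadsheet I actually used: the names (LanguageStats, grade_languages and so on) are made up for the example, the sample rows are placeholder numbers rather than real traffic data, the cut-off of 100 for the first grade is an assumption (the post only says the top group is at approximately 150), and applying the >10000 exclusion as a blanket rule is also an assumption.

```python
from dataclasses import dataclass

# Only languages with more than 1 million speakers are compared.
MIN_SPEAKERS = 1_000_000
# Scores above this are treated as outliers and left out (assumed rule).
OUTLIER_SCORE = 10_000.0
# (lower bound, grade). The 50 and 25 bounds come from the post;
# the 100 bound for the first grade is an assumption.
GRADE_CUTOFFS = [(100.0, "first"), (50.0, "second"), (25.0, "third")]


@dataclass
class LanguageStats:
    code: str        # language code (hypothetical values below)
    visitors: float  # unique visitors on the sampled day
    speakers: float  # primary plus secondary speakers


def visitors_per_million(stats: LanguageStats) -> float:
    """Outcome measure: visitors per million speakers."""
    return stats.visitors / (stats.speakers / 1_000_000)


def grade(score: float) -> str:
    """Map a score to the first grade whose lower bound it exceeds."""
    for cutoff, label in GRADE_CUTOFFS:
        if score > cutoff:
            return label
    return "ungraded"


def grade_languages(rows):
    """Filter, score and grade languages, highest score first."""
    graded = []
    for row in rows:
        if row.speakers <= MIN_SPEAKERS:
            continue  # too few speakers to be included
        score = visitors_per_million(row)
        if score > OUTLIER_SCORE:
            continue  # outlier, not useful for comparison
        graded.append((row.code, score, grade(score)))
    return sorted(graded, key=lambda item: item[1], reverse=True)


# Placeholder rows, NOT real statistics.
SAMPLE = [
    LanguageStats("lang_a", 3_000, 20_000_000),
    LanguageStats("lang_b", 1_200, 20_000_000),
    LanguageStats("lang_c", 50_000, 2_000_000),  # score 25000, dropped
    LanguageStats("lang_d", 100, 500_000),       # under 1 million speakers
]

for code, score, label in grade_languages(SAMPLE):
    print(f"{code}: {score:.0f} per million speakers, {label} grade")
```

Grading by cut-offs rather than by rank means that two languages with close scores land in the same group, which is the comparison I am interested in.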
Indian Wikipedias-Grading based on Traffic Analysis
Posted by Arjun at Saturday, November 14, 2009
Labels: indian , languages , statistics , telugu , wikipedia
3 comments:
Hoi,
I think you have the numbers wrong. Just above in your first source you find zh-yue, not zh, above ml.
The order in size is best gleaned from the official WMF numbers.
Thanks,
GerardM
PS I appreciate your blogging :)
@Gerard,
At first glance, I too thought that there were a lot of mistakes in the chart. But the numbers on the chart are the number of visitors per million language speakers, which should be correct in this case.
@Arjuna,
Your post title should have been "Indian Wikipedias - visitors per million speakers", or something similar, so that there wouldn't be much confusion for busy readers.
Thanks Gerard, Pradeep for your valuable feedback. I will be more careful with titles in future.