On Windows keyboard layouts for minority languages in Russia
By Anatoly Mironov
I can’t write in Chuvash in Windows 8 (or in any previous Windows release). Chuvash is a minority language of the Russian Federation. In this blog post I want to summarize the state of keyboard layout support for the minority languages of Russia and find a way to improve the situation.
Languages and Microsoft
There are thousands of languages. Of course it is hard to support them all. As of 2012-02-21, Windows 8 supports 109 (!) languages. In December 2012, support for the Cherokee language was added.
Display language, locale and keyboard layout
In Windows 8, when you go to Language preferences - Add a language, you’ll get “a language”. Behind this general word there are three parts which have to be distinguished in this post:
- Display language (labels, messages and other user interface in the particular language)
- Locale (a set of preferences for a particular language and region/country like currency, point or comma as a decimal delimiter, ltr vs rtl, encoding and much more)
- Keyboard layout (just an arrangement of keys, their placement, can be specific for a language or country, can have different systems like Dvorak)
This blog post is about the keyboard layouts, the easiest part of the “language” support in an operating system.
Russian Federation minorities
There are 160 ethnic groups in Russia speaking over 100 minority languages. Most of these ethnic groups are so-called stateless nations, meaning that the nation has no main country of its own (e.g. the Sami people in Sweden, but not Germans in the USA). Russia has 21 republics, each with its own official languages alongside Russian, whose purpose is to be home to particular ethnic groups. I’ll focus mostly on the official languages of these republics in this blog post, but it would be interesting to investigate smaller languages as well.
Almost all of the minority languages of stateless nations use the Cyrillic alphabet (often with additional letters). That makes it pretty simple to see how many of them are supported in Windows 8: just go to Language preferences -> Add a language and group the languages by writing system. See the screenshot above. Only three minority keyboard layouts are supported:
- Bashkir (1.45 million speakers)
- Sakha (Yakut, 360,000 native speakers)
- Tatar (4.3 million speakers)
Curiously, all three are Turkic languages. Two more languages are implicitly supported:
- Moksha
- Erzya
These two languages (the co-official languages of the Republic of Mordovia) don’t use any additional letters, so they can be written with the standard Russian keyboard layout. That’s it.
Keyboard layouts in Linux
Just a little comparison. Linux distributions support more minority languages of the Russian Federation. The supported ones can be found in the /usr/share/X11/xkb/symbols/ru file:
- Tatar / tt
- Ossetian / os
- Chuvash / cv
- Udmurt / udm
- Komi / kom
- Sakha (Yakut) / sah
- Kalmyk / xal
- Bashkir / bak
- Mari / chm
All these keyboard layouts were added by the community. I personally sent the Chuvash and Kalmyk fragments of that file to Sergey Udaltsov, who created patch files and pushed them to freedesktop.
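To see which variants a given system actually ships, one can scan that file for its section headers. The following is a minimal sketch in Python, assuming that each layout variant is declared with a line of the form `xkb_symbols "name" {`; the function name `list_variants` is made up for this illustration.

```python
import re
from pathlib import Path

# Matches section headers such as:  xkb_symbols "cv" {
VARIANT_RE = re.compile(r'xkb_symbols\s+"([^"]+)"')


def list_variants(symbols_text):
    """Return the variant names declared in an XKB symbols file, in file order."""
    return VARIANT_RE.findall(symbols_text)


if __name__ == "__main__":
    # Path taken from the post; it may differ on some distributions.
    path = Path("/usr/share/X11/xkb/symbols/ru")
    if path.exists():
        text = path.read_text(encoding="utf-8", errors="replace")
        print(list_variants(text))
```

A variant found this way can usually be activated with something like `setxkbmap -layout ru -variant cv`.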
Windows 8 keyboard layouts and Touch mode
When I tried the three supported minority language keyboard layouts of Russia in touch mode, only one worked: the Tatar keyboard layout. Tatar speakers can type all their additional letters in touch mode as well. The Bashkir and Sakha keyboard layouts place their additional letters on the row above qwerty: 12345…
Here is the preview for the classic Sakha keyboard layout. And what about the virtual touch keyboard layout for the Sakha language? As you can see, there are no keys for the additional Sakha letters (ҕ ҥ ө һ).
Summary
Many minority languages of the Russian Federation (most of them already endangered) lack native keyboard layout support in Microsoft Windows 8 and Windows 7. Windows is the prevalent operating system in Russia. Support for minority language keyboard layouts would help people use their languages and give those languages a better chance to survive. For now only 3 languages (besides Russian and, implicitly, some others like Moksha and Erzya) are supported in Windows 8 with a physical keyboard: Tatar, Bashkir and Sakha. And only one of them (!) also works in touch mode: Tatar.
The purpose of this post is only to identify the state of keyboard layout support for the minority languages of the Russian Federation in Windows 8. The Microsoft Local Language Program (LLP) seems very promising. I hope we will see more languages of Russia and other countries available in the “Add language” menu in Microsoft Windows 8.
Long tap and additional letters in Windows 8 (update 2013-03-16)
After I wrote this post I discovered some additional letters available when you long-tap the buttons on the virtual keyboard. Here is an excerpt from the Microsoft Blog about the “press-and-hold”-letters:
There is an interesting counter example in press-and-hold behavior. On a physical keyboard, when you press and hold a character, it repeats. On our touch keyboard when you press and hold, we show alternate characters or symbols. This is something a touch keyboard can do well and a physical keyboard can’t. If you don’t know the specific key combination to show ñ or é or š, for example, it’s painful to type on a physical keyboard. It’s easy to find on the touch keyboard. Practically no one has complained about this departure from convention. We built on it, in fact. You might discover that you can simply swipe from a key in the direction of the secondary key, and that character will be entered, without an explicit selection from the menu. So if you use accented characters a lot, you can get pretty fast with this.
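I appreciate this. Here are all the letters I found in the Russian keyboard layout:
Flyout letters
| Main letter | Additional letters |
| --- | --- |
| у | ү ұ |
| к | ҡ қ |
| н | ң ҥ |
| г | ғ ҕ |
| з | ҙ |
| х | һ |
| о | ө |
| э | ә |
| с | ҫ |
| и | і |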
Here is the full list of the Cyrillic additional letters:
| Letter | Name | Languages |
| --- | --- | --- |
| ү | Cyrillic Ue | Bashkir, Tatar, Kazakh, Buryat, Kalmyk, Kyrgyz, Mongolian |
| ұ | Straight U with stroke | Kazakh |
| ҡ | Bashkir Qa | Bashkir |
| қ | Ka with descender | Kazakh, Uyghur, Uzbek, Tajik, Abkhaz |
| ң | En with descender | Bashkir, Tatar, Kazakh, Dungan, Kalmyk, Khakas, Kyrgyz, Turkmen, Tuvan, Uyghur |
| ҥ | En-ghe (Cyrillic) | Sakha, Meadow Mari, Altai, Aleut |
| ғ | Ge with stroke | Bashkir, Kazakh, Uzbek, Tofa, Tajik |
| ҕ | Ge with middle hook | Sakha, Abkhaz |
| ҙ | Ze with descender | Bashkir |
| һ | Shha | Bashkir, Sakha, Tatar, Kazakh, Buryat, Kalmyk, Kildin Sami |
| ө | Barred O (Oe) | Bashkir, Sakha, Kazakh, Buryat, Kalmyk, Kyrgyz, Mongolian |
| ә | Cyrillic Schwa | Bashkir, Tatar, Kazakh, Abkhaz, Dungan, Itelmen, Kalmyk, Kurdish |
| ҫ | Cyrillic The | Bashkir, Chuvash |
| і | Dotted i | Kazakh, Ukrainian, Belarusian, Khakas, Komi, Rusyn |
These letters are missing:
| Letter | Name | Languages |
| --- | --- | --- |
| ӗ | E breve | Chuvash |
| ӑ | A breve | Chuvash |
| ӳ | U with double acute | Chuvash |
| ӝ | Zhe with diaeresis | Udmurt |
| ӟ | Ze with diaeresis | Udmurt |
| ӥ | I with diaeresis | Udmurt |
| ӧ | O with diaeresis | Udmurt, Meadow Mari, Hill Mari |
| ӵ | Che with diaeresis | Udmurt |
| ӓ | A with diaeresis | Hill Mari |
| ӱ | U with diaeresis | Meadow Mari, Hill Mari |
| ӹ | Yery with diaeresis | Hill Mari |
Here we have four fully functional language keyboard layouts if you are okay with long-tapping:
| Language | Letters |
| --- | --- |
| Bashkir | ғ ҡ ҙ ҫ ң һ ә ө ү |
| Sakha | ҕ ҥ ө һ ү |
| Tuvan | ң ү ө |
| Buryat | ө ү һ |
Bashkir and Sakha, I suppose, were considered when the keyboard layout was designed, while the Tuvan and Buryat letters just happen to fall within the range of the Bashkir and Sakha letters. The Tatar letters aren’t complete in the standard Russian keyboard layout; the reason must be, as I mentioned above, the fully functional virtual keyboard for Tatar (where there is no need for long-tapping). There is one more language whose letters are all available through long-tapping. Kazakh is certainly a minority language of Russia, but it doesn’t represent a stateless nation.
| Language | Letters |
| --- | --- |
| Kazakh | ғ ә қ ң ө ү ұ і һ |
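The coverage check behind these lists is plain set arithmetic: a language is fully covered when every one of its additional letters is among the long-tap letters. Here is a minimal sketch, using only the letter sets given in this post (not complete alphabets); the names `LONG_TAP`, `EXTRA_LETTERS`, `missing_letters` and `fully_covered` are made up for this illustration.

```python
# Letters reachable by long-tap on the Russian touch layout (from the list above).
LONG_TAP = set("үұҡқңҥғҕҙһөәҫі")

# Additional letters per language, as given in this post (not full alphabets).
EXTRA_LETTERS = {
    "Bashkir": "ғҡҙҫңһәөү",
    "Sakha": "ҕҥөһү",
    "Tuvan": "ңүө",
    "Buryat": "өүһ",
    "Kazakh": "ғәқңөүұіһ",
    "Chuvash": "ҫӑӗӳ",
    "Udmurt": "ӝӟӥӧӵ",
}


def missing_letters(language):
    """Letters of the language that long-tap does not offer, in listed order."""
    return [ch for ch in EXTRA_LETTERS[language] if ch not in LONG_TAP]


def fully_covered():
    """Languages whose additional letters are all reachable by long-tap."""
    return [lang for lang in EXTRA_LETTERS if not missing_letters(lang)]


print(fully_covered())
print(missing_letters("Chuvash"))
```

Extending the dictionary with further languages is enough to repeat the check for them.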
The long-tapping technique could be a solution for many minority languages of Russia:
| Language | Existing letters | To be added |
| --- | --- | --- |
| Chuvash | ҫ | ӑ ӗ ӳ |
| Udmurt | | ӝ ӟ ӥ ӧ ӵ |
| Meadow Mari | ҥ | ӧ ӱ |
| Hill Mari | | ӓ ӧ ӱ ӹ |
| Komi | і | ӧ |
| Altai | ҥ | ј ӧ ӱ |
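One possible way to decide under which key a new letter could be offered is to look at its Unicode canonical decomposition: letters with a breve, diaeresis or double acute decompose into a plain Russian letter plus a combining mark. The sketch below is only a suggestion and not how Windows assigns its flyouts; the names `base_key`, `group_by_base` and `TO_BE_ADDED` are made up for this illustration. Letters without a decomposition (such as ҥ or ј) map to themselves and would have to be placed by hand.

```python
import unicodedata


def base_key(letter):
    """Return the first code point of the canonical decomposition (NFD)."""
    return unicodedata.normalize("NFD", letter)[0]


def group_by_base(letters):
    """Group letters under the base letter they decompose to."""
    groups = {}
    for ch in letters:
        groups.setdefault(base_key(ch), []).append(ch)
    return groups


# The letters listed as missing in this post.
TO_BE_ADDED = "ӑӗӳӝӟӥӧӵӓӱӹ"

for base, extras in group_by_base(TO_BE_ADDED).items():
    print(base, "->", " ".join(extras))
```

With this grouping ӳ and ӱ would both land on у, and ӑ and ӓ on а, which matches how the existing flyouts hang ү and ұ off the same key.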
Comments from Wordpress.com
I really liked the article! Everything is clear and to the point. First of all, it made me wonder how to make an on-screen (Touch mode) version of the Bashkir keyboard. I wonder whether the standard layout creation tool from MS allows that. I also wanted to check whether the wording of this sentence is right: «Almost all of the minority languages of stateless nations use the Cyrillic alphabet (often with additional letters)». There is the Cyrillic script, and there is the Russian alphabet. In my view, all Cyrillic characters have long been part of the Cyrillic script, and these letters are “additional” only relative to the Russian alphabet (my personal opinion: it is not quite correct to count from it, since Russian is only one of many languages, and the wording is also self-deprecating). I would like to translate your article and post it on Habr :) I once published this there: http://habrahabr.ru/company/adv/blog/189556/ (by the way, is there a Chuvash keyboard for the Mac?), and the audience reacted well; the response was quite good.
Thank you very much for the comment. I apologize for taking so long to reply. I completely agree about the additional letters: they are additional only relative to the Russian alphabet. I will try to correct the article. I don’t know whether you have posted it on Habr; of course, I have nothing against it. As for Touch mode, unfortunately it is not yet possible to make such a keyboard: http://superuser.com/questions/730588/can-i-create-a-custom-touch-keyboard-in-windows-8 I try to follow BashSoft and TatSoft. There is a lot of inspiration to draw from your work. It is a pity that we don’t work together. It would be much easier to work jointly on the computing problems of the “minority” languages of Russia.