It's the language, stupid

Interview with Javier Solá from KhmerOS on Free Software for the Third World and why Mozilla will disappear

Phnom Penh, October 15th, 2005

The memory of the Khmer Rouge's victims can still be felt everywhere around Cambodia's capital, Phnom Penh. More than two decades after the fall of Pol Pot's terror regime, the country is slowly freeing itself from the deadlock of civil war. But apart from collecting guns and fighting the further spread of HIV, there is another issue Cambodia has to overcome – using computers. For people in the US and Europe, it is common to use computers in their own language. But imagine you had to use a Chinese word processor to write an English text, and this word processor could not really handle English well. That is roughly the situation in Cambodia. Most people speak only very little English and are therefore handicapped in doing even simple tasks. Javier Sola, a Spaniard living in Cambodia, and the NGO Open Forum of Cambodia want to change this by localizing open source software for Khmer, as the Cambodian language is officially called, and by providing people with a legal alternative to pirated copies of Windows and Microsoft Office.

Timo Kozlowski: You mentioned the “advantage of being poor” concerning software. How can being poor be an advantage?

Javier Sola: The advantage of being poor is that proprietary software companies are not interested in translating their software into your language, which in our case is Khmer, the official language of Cambodia. So if you localize your software, you have the advantage of language: The people in the country can understand your software, whereas they do not understand the proprietary software, because it is in English. Cambodians have learned the English alphabet, and they recognize the single characters but not the words. So they have to memorize all the words in a foreign language, and that does not make sense. When you learn something in another language, you usually forget it rather quickly. You do not really learn. In the end, language is the most powerful reason to change to open source – if the language is on your side.

In Cambodia, people do not care too much about the price of software, as you can find it here anyway. Whether you can modify the code or not, people do not really care about that.

Most people are not capable of doing so.


Yes, the user only wants software that works well, and if possible you should be able to use it in your own language. In any European country you do not find software in a language other than the language of the country. In Spain, you cannot find an English keyboard, because it is not on sale there. All the software is just Spanish. Once you have software in your own language, it is very hard to change back to a foreign language. So we are working on that.

We made a very strong localization effort. We have very good software in Khmer, and we work on the Unicode standard to overcome all the different encodings for Khmer which have been in use until now, so that we can move on to one encoding standard. Then, I think, the development of software in this country will explode. It will really grow as it has happened in many other countries, for example in Japan and Korea. Information and Communication Technology (ICT) really starts when you have it in your own language. The thing is, in those countries it was Microsoft programs. And here, we are the first country that has a complete localization of quite a number of products that Microsoft has not done.

And we are doing the localization in a very orderly way, so that we have not only just the software, but we have also created very high quality training materials. And now we are training teachers. At Open Forum we train about 20 teachers a week on how to use and how to teach OpenOffice in Khmer. We show these people a new way to teach, as they are professional teachers who used to teach using Microsoft programs, and we give them courses on how to teach OpenOffice – in Khmer.
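As an illustration of what moving to a single Unicode encoding means in practice, here is a minimal sketch that estimates how much of a text lies in the Unicode Khmer block (U+1780 to U+17FF). The function name `khmer_ratio` and the idea of using such a ratio to spot text still stored in a legacy font encoding are assumptions made for this example; nothing here is taken from the KhmerOS tools.

```python
# Minimal sketch: measure how much of a string is encoded in the
# Unicode Khmer block. Legacy font-based encodings typically store
# Khmer glyphs at Latin code points, so such text scores close to 0.

KHMER_BLOCK = range(0x1780, 0x1800)  # Unicode Khmer block, U+1780..U+17FF


def khmer_ratio(text: str) -> float:
    """Return the share of non-whitespace characters in the Khmer block."""
    chars = [ch for ch in text if not ch.isspace()]
    if not chars:
        return 0.0
    in_block = sum(1 for ch in chars if ord(ch) in KHMER_BLOCK)
    return in_block / len(chars)


if __name__ == "__main__":
    # The word "Khmer" in Khmer script, written with escape sequences.
    word = "\u1781\u17d2\u1798\u17c2\u179a"
    print(khmer_ratio(word))     # all characters are in the Khmer block
    print(khmer_ratio("Khmer"))  # Latin letters only
```

A check like this only says which code points are used; it does not say whether the text renders correctly, which depends on the rendering engine and the font.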

So the “orderly way” you mentioned is a package that consists of software on the one side, and training on the other?

Everything has to go together. You have to distribute your software, and you have to train people in using your software. People do not like to install software on their computers. They are afraid, and especially they do not want to change to software that they do not know. So fear is the main reason not to change software. This applies to everybody, not only here. It is because of fear of losing productivity, fear that your computer will break. So you do not touch your computer except for the work you normally do.

Never change a running system!

Right. So we have to reduce that fear by providing a pre-installation of our software, or doing the installation ourselves. We go to as many places as we can, but for that you need the computer vendors on your side. You do not get them on your side by simply changing to Linux. You have to follow a longer path. First, you use open source software on Windows, and they will support it, and pre-install it on new computers that they sell. And then you can slowly move to Linux, while you are creating know-how in the country, which is what we are doing right now. People spend about 95 percent of their time working in the applications. Once they get used to spending these 95 percent working with our software in Khmer, then you have to change only five percent more to have them using an operating system that is also in Khmer, and which happens to be Linux.

But we do not talk about open source software in Cambodia. We talk about Khmer software, because that is what people want. They want software in their own language. We do not try to build up an open source community, instead we are trying to build up a language community.

Microsoft released the “Windows XP Starter Edition” – a Windows XP version at a reduced price, but also with fewer features, and they provide localization for some Third World countries, e.g. Thailand. So would you welcome a Khmer version of “Windows XP Starter Edition”?

I doubt it would help them, because there is no market for it. No one is going to pay one cent, as the copyright laws are not enforced. For years to come, no one in Cambodia is going to pay any money for Microsoft software, and so Microsoft has no interest. Not in the near future.

And in addition, Khmer is not so easy to implement in computer software. Right now, the Khmer language still poses a lot of technical problems for Microsoft Office. It breaks all the time, as a matter of fact.


Our products support the Khmer language much better. The Windows versions use Microsoft's “uniscribe.dll” underneath, but on top of it we support Khmer script better. For example, when you use Microsoft Office to write Khmer, the program tends to crash easily. With other programs you cannot write Khmer correctly – wrong spacing, etc. But if you use OpenOffice, you can use Khmer without problems.

You said on your website that you can download a newer version of this DLL, and copy it into the right directory.

Right. The version that comes with Microsoft Office 2003 is unstable, but the newer version is stable. But we cannot distribute it ourselves, because it belongs to Microsoft.

How do you provide Khmer support on Linux systems?

There are three important libraries on Linux: Qt, Pango and ICU. We have developed our own rendering engines for Linux and included them as Khmer modules in these three libraries. So we render Khmer script with software that we have developed, and that works very well. ICU is used with OpenOffice, Java, and a number of other things. Once you are inside these libraries, you have full support of the interface in any Linux program that uses them. And most programs use them – either Qt, or Pango. Not all of the graphical programs make use of these libraries, but I think we will be getting there, as they offer better support for complex scripts.

Do you also support Graphite, the rendering system by SIL International?

Graphite was a great solution when all those languages which use complex scripts were not supported by Windows. SIL did a great job by defining all these scripts very well. In Graphite all the necessary information to display complex scripts correctly was stored inside the font. But the problem with Graphite is that it does not use standard fonts. So you can only use them with applications that support Graphite. So although they have done a great job, they will disappear in the long run, because now the standard font format is OpenType, and any rendering engine supports OpenType, but not Graphite.

But OpenType has its own pitfalls: Some information is inside the font, some is inside the rendering engine. And that makes it quite complicated, because now the rendering engine has to support your script as well. For example, Microsoft does not support Myanmar script. But we have been able to develop a font for Myanmar that works correctly using advanced OpenType functions. So now we can render Myanmar text correctly, and we are also preparing this script for ICU, Pango, and Qt.

The goal of our project is to support all the scripts in Southeast Asia. Apart from Khmer, we have also worked on Bhutanese, which uses a Tibetan script. We implemented it in OpenOffice and Pango. The only two scripts left are Myanmar and Lao, and right now we are working on them.

So you are not only working on Cambodia. Are there any other projects you are running?

We also do the Open Source Localization Toolkit. This is aimed at small economies, to provide them with the tools and a framework to do localization by themselves. The main problem in localization is the lack of information. We saw that when we worked on the documentation on how to localize OpenOffice from scratch. That was our start. Then we started localizing another program called the Translation Toolkit from South Africa. We started working together with them, and established the Localization Framework project. With this, we want to change the whole way open source is localized by providing better tools and by changing the formats in use for localization to formats that include much more information. It includes tools for workflow, and tools to organize the needed information – glossaries, translation memories. For example, when the translator has to translate a dialog, our system will tell him: “There are four words that also appear in the glossary with these translations. So please make sure to use them”, or “This sentence was translated before like this”. This is information for the translator that helps him to do a good job. With this localization framework you can really improve the translation.
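The two kinds of hints described above, glossary matches and earlier translations of the same sentence, can be sketched roughly as follows. This is only an illustration under stated assumptions: the data structures, the function names (`glossary_hits`, `memory_hit`, `hints`) and the placeholder translations are hypothetical and do not reflect the actual Localization Framework code.

```python
import re

# Hypothetical glossary: source term -> approved translation.
# The right-hand sides are placeholders, not real Khmer terminology.
GLOSSARY = {
    "file": "<approved term for 'file'>",
    "folder": "<approved term for 'folder'>",
    "printer": "<approved term for 'printer'>",
}

# Hypothetical translation memory: source strings translated earlier.
TRANSLATION_MEMORY = {
    "Save the file": "<earlier translation of 'Save the file'>",
}


def glossary_hits(source, glossary):
    """Return (term, translation) pairs for glossary terms found in source."""
    hits = []
    for word in re.findall(r"[a-z]+", source.lower()):
        if word in glossary and (word, glossary[word]) not in hits:
            hits.append((word, glossary[word]))
    return hits


def memory_hit(source, memory):
    """Return an earlier translation of exactly this string, or None."""
    return memory.get(source)


def hints(source, glossary=GLOSSARY, memory=TRANSLATION_MEMORY):
    """Build the messages a translator would see for one source string."""
    messages = []
    hits = glossary_hits(source, glossary)
    if hits:
        terms = ", ".join(f"{term} = {translation}" for term, translation in hits)
        messages.append(f"{len(hits)} word(s) also appear in the glossary: {terms}")
    previous = memory_hit(source, memory)
    if previous is not None:
        messages.append(f"This sentence was translated before like this: {previous}")
    return messages


if __name__ == "__main__":
    for line in hints("Save the file"):
        print(line)
```

The sketch only does exact lookups; real translation memories usually also offer approximate matches for sentences that are similar but not identical.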

Perhaps you could summarize the computer and software situation in Cambodia.

There are about 50,000 to 70,000 computers installed. That is a growth of 15,000 to 25,000 a year. I would guess that two thirds of them would be able to run Windows 2000 or Windows XP, which means that they can use Unicode. About 40 percent of the computers are bought by international or large organizations, 30 percent are bought by NGOs, 20 percent by development aid workers, and 10 percent by SMEs or private people. It seems, though, that some of the computers bought by NGOs are actually donations, for example to the government. So it is hard to know who has them.

Where do these 15,000 to 25,000 new computers per year get installed?

Mostly in Phnom Penh. But the growth of computers goes together with the growth of jobs, because you also need people to maintain them. We are going to start a project next year to support this development. Because ordinary people rarely have a chance to work on computers, they have very little and mostly theoretical experience in administration and programming. In Cambodia, you learn programming by simply doing it on the job. So we want to build up small projects with programming tasks, to create knowledge within the country, which is essential for the future. And you can only achieve this knowledge by experience.

Then how many software specialists are there in Cambodia?

It depends on what you call a software specialist, but not many. For example, there is only one open source software specialist, and he is German. There are a few people who know Linux. We are developing demonstration materials in Khmer for Linux, and we have people to be certified as Linux demonstrators. First we have to create a support network for Linux. Right now, it does not exist yet. We need a large number of Linux demonstrators in the country, and that is a part of our strategy, so that we will be able to support Linux later. All the vendors need to be able to support Linux.

We think that there are not more than 600 to 700 people in Cambodia at this time who are teaching the use of computers. We would like to reach all of them, if we can, to re-train them to use Khmer software. It takes some work because they are used to English software. The Khmer terminology sounds funny to them, but this is always the case. When you hear a fancy word you do not understand, it sounds right, but when you hear the same word in your own language, it sounds funny. Usually it takes two or three days to get used to the language. But it will change from the bottom. It will change, because for example people in the provinces often do not speak English. They cannot learn to use computers when they are in English. This is true not only in Cambodia, but also in other Third World countries. You cannot be successful in ICT as long as you have no computers that can use the local language.

How long did it take for you to learn Khmer?

Learning the characters and how they work together was quite fast. But that is just the more technical aspect of it. When it comes to reading, it is the same as for any other language: You need to have pictures for the words. If you do not have these pictures you get lost very quickly, especially for Khmer and other scripts like Thai and Lao that are based on Indic scripts, because they have no spaces between words.

How did you learn the Khmer script?

With a book. The writing is very systematic. Learning it by yourself is not so much the problem, but usually people here do not know very much about writing. One of the things we have to do is to prepare tools that teach people about writing.

I heard that there is no official dictionary of Khmer language.

There is one dictionary which was written in 1957. But the spellings have changed since then. Even the number of vowels has changed, there are many more nowadays. In this dictionary we have 20, now we have at least 23, some sources even say 28.

There is a new edition of this dictionary, but they copied it exactly without paying attention to the changes of the language since then. Khmer has gone through a simplification in writing. One way of writing a word is simpler than the other, but it is not accepted by some people. The people who write this dictionary would have to be more flexible. Right now, we have no other source and reference to develop our software, but the spelling in this dictionary is different from the spelling in normal school books.

And bilingual dictionaries?

There are very good ones. They only use the modern spelling, and they sort the words alphabetically. The monolingual dictionary uses phonetic ordering, so if you do not know how to pronounce a word, you are lost. This is especially hard for the many words that come from Pali and Sanskrit.

Does Khmer also have many composite words?

Yes, so some words can be found at two or three places in the dictionary.

How did you get involved in making computers able to run Khmer software?

I came here two and a half years ago, and I was working at a children's home for a while. That was when I saw the problem of the Khmer language in computers, and thought it would be an interesting thing to do. I did not do anything on open source at that time. I got into it slowly. My main goal was to get the children to use computers. But it had some other positive effects for the government and SMEs, too. For example, the training effort is greatly reduced when you can use software in your own language. And then, the language survives. That is also a very important issue. A language without technological terms is quickly taken over by other languages.

When did you start the KhmerOS project?

In 2004, when I met the NGO Open Forum of Cambodia, and I said, “Yes, let's do it with an NGO.” At the beginning we hired two people who were working on the glossary of computer terms in the Khmer language. They read all the relevant books and magazines in Khmer on computers. We compiled all the Khmer words used in these publications, and then completed this list. This glossary is the basis of all our efforts to translate software and to provide consistent use of terminology.

If you hire somebody, you need to pay them. From where do you receive your funds?

In the beginning the NGO already had funds because it had a German sponsor. And at that time the euro was very high – higher than when they got their funds, and so there was some money left. We did not have to spend a lot of money, only 600 dollars a month to cover all the expenses.

But then we had to get the grants from somewhere else. We started writing the documentation on how to localize OpenOffice, and we used the money to hire more people. By the middle of 2004, we had translated Thunderbird, but we needed more people to do the work on OpenOffice.

Then we got funding from the Internet Society, an international organization founded in 1992, which has been following the growth of the internet from the beginning. All the people who made important contributions to the internet are part of this society. They have been the major funders this year. They funded about 50 percent of our IT, altogether 20,000 dollars. Producing all of our localized software has cost less than that.

With these funds we are able to produce our training materials, hire trainers and train them, and send people all over the country to install our programs. We also work together with the government. They have developed the “Master Plan for Implementation of Open Source in Cambodia”. They have established a certification system based upon our training materials. So teachers and students in their courses on OpenOffice in Khmer can get certified by the government.

Did you approach the government to ask for assistance, or did they come to see you?

At the beginning we tried to contact them, but we had to find the right person in the government. That is always the problem: Finding the right person at the right time. Now we work together with the National Information Communications Technology Development Authority (NiDa), and this is a very nice relationship. They really do a lot of things, and they want to move ahead.

Were they already moving in the same direction as you before?

They knew something, but we had the necessary experience, and so it was natural to work together. We also participated in writing the already mentioned master plan, but it is not our work. We just advised the government.

What are the main steps in this master plan?

We have a Windows year, and a Linux year. In the Windows year, people use our localized software on Windows computers, and in the Linux year we plan to move on to Linux completely. Right now, we are working on that.

How many people are involved in KhmerOS at the moment?

At our NGO about 30 people, and within the government about 20 people.

Apart from OpenOffice you also localized Thunderbird and Firefox, but you changed the names of these programs.

We did that for two reasons. The first one is that Firefox and Thunderbird are really difficult names for Khmer speakers. But Moyura, as we call Thunderbird, and Mekhala, our version of Firefox, are better names for Cambodia. The Windows versions can be downloaded from our website.

The second reason is that we strongly disagree with the Mozilla Foundation's policies on trademarks because of two things. First, if you change anything, you cannot use their trademarks any more. Second, they make products for the First World, not for the Third World.

For example, the Mozilla Foundation's version of Thunderbird cannot be used on low resolution displays like 800x600 pixels because the windows are too big. They do not fit on the screen. But they refuse to fix this issue themselves. They say that they produce software for the US market. So we prepared the Windows version for displays like that, but now it cannot be called Thunderbird any more.

Or take the agreement the Mozilla Foundation made with Google to include a search button in Firefox. They get paid for this, but only if all Firefoxes have that. So they cannot allow an unofficial binary to be called Firefox, only one that they compile themselves on their servers. If you go through all their technical stuff on their servers, and put your translation in, then you have the right to call your localization Firefox. And this is not the way to go for us. So we localize in our own way, and use different names. It is sad that we cannot contribute back, but it is the way they want it.

The Mozilla Foundation has the best product ever in internationalization. You just make your translation, and put it in. Then it changes the language automatically. No other product has been able to do that, but they refuse to let people use it, because they want to maintain their trademark policy.

They really make it difficult, not only for us, but also for the people who work on the Linux distribution Debian, because Debian only accepts open source software. But Firefox is full of proprietary stuff.

As far as we are concerned, we are very happy with their products. We keep up with their source code, as it is open source, and then we use it. For cybercafes we sometimes do multilingual installers – Khmer, English, French, Korean, etc. – and then we make them all work. But none of them can be called Firefox, even if we make it better. Sorry! And this is a real pity. I think they will disappear over time, because they have the wrong localization policy.

Are there also localization policy problems with OpenOffice?

You can compile it yourself, but there are also people who compile it for you outside of the OpenOffice community. There is a really nice guy from the Czech Republic who builds OpenOffice in 20 different languages. You just upload your translated files onto his website, and afterwards you can download the recent version of OpenOffice in your language. It is really simple.

With which other open source projects do you work together?

We work with smaller projects. We are trying to improve the localization from here. For example the desktop environment – we decided to use KDE, so we go around all the applications in KDE and take a look at them. But as these applications are much smaller, there is no open source community for them in that sense. At least not when it comes to localization.

When you introduce Linux in Cambodia, which distribution will you choose?

It is hard to say. Right now we are working with all of them. KDE makes you lean towards SUSE, but we will make sure that all of them work very well with the Khmer language. And we will especially make sure that Debian will work very well. But we do not want to choose yet.

Javier Solá is the coordinator of the KhmerOS Initiative. Before that, he was Director of the Spanish Internet Users Association and was actively involved in the creation of ICANN, where he chaired the working group to introduce new top level domains (.info, .biz). He has taught at universities in France, Spain, the United States and Cambodia. He holds a BSc degree from Duke University and a Master's degree in Computer and Information Sciences from Ohio State University (both in the USA).

