User:Bluedwarf/Asian Phonetic Guide

From Apache OpenOffice Wiki

Abstract

OpenOffice.org Writer provides an Asian Phonetic Guide function, commonly called ruby, for Chinese and Japanese. It lets you attach annotative ruby characters to a corresponding kanji string. To enter ruby in OpenOffice.org Writer, you usually open the Asian Phonetic Guide dialog; however, this dialog is somewhat inconvenient for users in Japan.

This page first explains the background and details of the current inconvenience in that dialog. It then looks at the equivalent feature in other office suites. Finally, it discusses some ideas for improving OpenOffice.org.

Note that this article covers the phonetic function of Writer only, not that of Calc. There are strong requests for phonetic support in Calc, and work to implement it can be seen in the OpenOffice.org project, although it is lagging a little, but that is a separate problem.

Background

OpenOffice.org is good at splitting Japanese text into grammatical segments even though Japanese text has no spaces between words. It performs this segmentation without user intervention. However, the automatic segmentation sometimes fails because of the complexity and ambiguity of the grammar, splitting text at a wrong position or at a position you do not expect. Unfortunately, this problem is probably impossible to resolve completely.

As a result, when you edit ruby text you may run into several kinds of inconvenience, described below, because the incorrectly split segments are shown in the Asian Phonetic Guide dialog.

Problems

Integration

If OOo generates smaller segments than you expect, you will want to merge two or more segments into one. However, there is no good way to merge segments.

For example, take the word "国際連合貿易開発会議", which means UNCTAD in English. Try adding ruby to this word in OpenOffice.org 3.1 Writer. You will get the result shown in the picture below. As you can see, the word is recognized as a combination of 5 words, while you consider it 1 word.

100px‎

Then, to combine them into one segment, you would cut the last 4 parts and paste them together into the top text box, as in the picture below.

100px‎

This operation leads to the unexpected result shown in the picture below: the 4 parts you cut and pasted are duplicated.

Of course, there is a simple workaround: just delete the duplicated parts. But this is quite inconvenient, and it is especially troublesome when you have a lot of wrongly separated segments in one document.

100px‎

Separation

In contrast, OOo sometimes generates longer segments than you expect. In this case, you will want to split one segment into two or more.

For example, take the word "東京都", which means Tokyo Metropolis. Try adding ruby to this word in OpenOffice.org Writer. You will get the result shown in the picture below. As you can see, the word is recognized as 1 word.

100px‎

However, in some situations you might want to separate this word into 2 parts.

To do so, you have to close the Asian Phonetic Guide dialog, select the first part and open the dialog to add ruby, and then do the same for the second part. This is quite a tedious procedure.

100px‎

Ruby layout in vertical text

This is one of the greatest obstacles to handling ruby characters correctly. It has been worked on by http://jsdp2007.net/wiki/%E3%83%A1%E3%82%A4%E3%83%B3%E3%83%9A%E3%83%BC%E3%82%B8, which created some patches, but they have not been integrated into OpenOffice.org yet.

Related issues:

Competitors

Microsoft Office Word 2003

This is the ruby dialog of Microsoft Office Word 2003. When you add ruby text to the Japanese text "国際連合貿易開発会議", you will see the following picture.

100px‎‎

As you can see, the separation is similar to that of OpenOffice.org Writer, with one difference: some ruby strings are already filled in. Microsoft Office Word 2003 not only separates text into segments but also suggests ruby text automatically.

Now try to merge the 5 separated words into one. In Word, you can do that simply by clicking the "文字列全体" (whole string) button, which gives the correct result shown below.

100px‎‎

As for the "東京都" example, you can break this word into individual characters by clicking the "文字単位" (per character) button.

100px‎
100px‎‎

As described above, compared to OpenOffice.org, Microsoft Word provides good functionality for merging or splitting segments you don't like, but it is not sufficient because you can't merge or split them arbitrarily. The only two things you can do in Microsoft Word are to merge all the segments shown into one and to split all segments into characters. You can't set a separator at an arbitrary position.

Ichitaro

This is a very well-known word processor in Japan, focused mainly on Japanese text processing. Since some advertisements for this product emphasize its ruby-related functionality, we should investigate it.

Unfortunately, I don't have Ichitaro, so I hope somebody will report here how it deals with ruby text.

Ideas for the future

Integration and separation

Flexible merging and splitting of ruby could be realized by putting "integration" and "separation" buttons to the right of each ruby field. With these buttons, you could merge two adjacent segments or break one segment apart into separate characters.

100px‎

This idea would fulfill the following requests:

Technical comments:

  • This idea requires UI changes.
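
The two button operations proposed above can be sketched as plain list operations. This is only an illustrative model, not OpenOffice.org code: the `RubySegment` class and the functions `merge_with_next` and `split_into_characters` are hypothetical names introduced here, and the 5-way split of "国際連合貿易開発会議" used in the example is an assumption about how the dialog segments the word.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RubySegment:
    """One row of the dialog: base text plus its ruby (hypothetical model)."""
    base: str
    ruby: str = ""


def merge_with_next(segments: List[RubySegment], index: int) -> List[RubySegment]:
    """'Integration' button: merge segments[index] with the segment after it."""
    if not 0 <= index < len(segments) - 1:
        raise IndexError("no following segment to merge with")
    left, right = segments[index], segments[index + 1]
    merged = RubySegment(left.base + right.base, left.ruby + right.ruby)
    return segments[:index] + [merged] + segments[index + 2:]


def split_into_characters(segments: List[RubySegment], index: int) -> List[RubySegment]:
    """'Separation' button: break segments[index] into one segment per character.

    The existing ruby cannot be divided automatically, so the new
    segments start with empty ruby for the user to fill in.
    """
    if not 0 <= index < len(segments):
        raise IndexError("segment index out of range")
    parts = [RubySegment(ch) for ch in segments[index].base]
    return segments[:index] + parts + segments[index + 1:]


# Assumed segmentation of the UNCTAD example into 5 words.
words = [RubySegment(w) for w in ["国際", "連合", "貿易", "開発", "会議"]]
while len(words) > 1:
    words = merge_with_next(words, 0)
print(words[0].base)

tokyo = split_into_characters([RubySegment("東京都", "とうきょうと")], 0)
print([seg.base for seg in tokyo])
```

Because each operation only touches one segment and its neighbour, no text is cut and pasted, so the duplication described in the Problems section would not arise in this model.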

Automatic suggestion of ruby

It would be great if OpenOffice.org suggested ruby for Japanese text automatically. This could be realized with the help of MS-IME, as in Japanese_Reconversion, or a morphological analyzer such as the MeCab library. OpenOffice.org could suggest ruby by filling in the ruby text entries in the Asian Phonetic Guide dialog. This would make adding ruby much easier: you would only have to correct the wrong segments and could leave accurate segments as suggested.

This idea can be extended to suggesting and inserting ruby for the whole text, or for a widely selected range, without going through the Asian Phonetic Guide dialog. For some reason, the dialog limits the number of segments it can handle at once, which makes it impossible to suggest ruby for long text in one go. Instead, ruby suggestion can be done without the dialog: when users request ruby suggestions for long text, the suggested ruby text should be inserted directly.

This idea would fulfill the following requests:

Technical comments:

  • This idea doesn't change the UI.
  • The implementation of Japanese_Reconversion might be helpful to realize this idea.
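
As a rough sketch of how suggestion could work, the function below takes any analyzer that yields (surface, reading) pairs and turns them into suggested ruby. Everything here is an assumption for illustration: `suggest_ruby`, `contains_kanji`, and `toy_analyze` are hypothetical names, the analyzer is assumed to return readings in katakana, and `toy_analyze` is a hard-coded stand-in for a real analyzer such as MeCab or an IME.

```python
from typing import Callable, Iterable, List, Tuple

# An analyzer maps text to (surface, reading) pairs; assumed interface.
Analyzer = Callable[[str], Iterable[Tuple[str, str]]]


def katakana_to_hiragana(text: str) -> str:
    """Shift katakana letters (U+30A1..U+30F6) down to the hiragana block."""
    return "".join(
        chr(ord(c) - 0x60) if "\u30a1" <= c <= "\u30f6" else c for c in text
    )


def contains_kanji(text: str) -> bool:
    """True if any character is in the basic CJK Unified Ideographs block."""
    return any("\u4e00" <= c <= "\u9fff" for c in text)


def suggest_ruby(text: str, analyze: Analyzer) -> List[Tuple[str, str]]:
    """Return (base, suggested ruby) pairs; segments without kanji get no ruby."""
    suggestions = []
    for surface, reading in analyze(text):
        ruby = katakana_to_hiragana(reading) if contains_kanji(surface) else ""
        suggestions.append((surface, ruby))
    return suggestions


def toy_analyze(text: str) -> List[Tuple[str, str]]:
    """Stand-in analyzer that knows a single hard-coded phrase."""
    table = {"東京都に": [("東京都", "トウキョウト"), ("に", "ニ")]}
    return table.get(text, [(text, "")])


for base, ruby in suggest_ruby("東京都に", toy_analyze):
    print(base, ruby)
```

Keeping the analyzer behind a plain callable means the same suggestion step could fill the dialog fields or insert ruby directly into long text, whichever backend is eventually chosen.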

Automatic insertion of ruby

When you write a document in which you want ruby on the whole text, for example an article for children, it would be wonderful if OpenOffice.org inserted ruby as you typed, which would be more accurate than the suggestion mentioned above. This could be realized with the help of MS-IME or other kana-kanji conversion software such as ATOK through Win32 APIs. Technically, this can be done by requesting GCS_RESULTREADSTR from the ImmGetCompositionString() function provided by imm32.dll.

This idea would require a new mode for Writer (it might be called "ruby insertion mode"), in which ruby is inserted automatically as you type.

This idea would fulfill the following requests:

Technical comments:

Hidden ruby

Some users want to keep ruby text but make it invisible. For example, a user might want to hide all the ruby text inserted automatically as described above and make some of it visible when needed. Technically, a "visible" attribute on ruby text would enable this feature.

From the user's point of view, this attribute would let OpenOffice.org remember ruby strings obtained through automatic insertion or through the user's own operations, and restore them when needed.

This idea would fulfill the following requests:

Technical comments:

  • It would introduce a new attribute for ruby text.
  • This idea would cause UI changes to handle the new attribute.
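
A minimal sketch of the "visible" attribute follows, assuming a simple in-memory model rather than Writer's actual document format. The `RubySegment` class, `set_visibility`, and `render` are hypothetical names, and rendering ruby in parentheses after the base text is only a stand-in for real ruby layout.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class RubySegment:
    """Base text with ruby and a visibility flag (hypothetical model)."""
    base: str
    ruby: str = ""
    visible: bool = True


def set_visibility(segments: List[RubySegment], visible: bool) -> List[RubySegment]:
    """Show or hide every ruby string without discarding the stored text."""
    return [replace(seg, visible=visible) for seg in segments]


def render(segments: List[RubySegment]) -> str:
    """Plain-text rendering: visible ruby follows its base in parentheses."""
    out = []
    for seg in segments:
        if seg.ruby and seg.visible:
            out.append(f"{seg.base}({seg.ruby})")
        else:
            out.append(seg.base)
    return "".join(out)


text = [RubySegment("東京都", "とうきょうと"), RubySegment("に")]
hidden = set_visibility(text, False)
print(render(hidden))                        # ruby is kept but not shown
print(render(set_visibility(hidden, True)))  # ruby is restored from the stored text
```

The point of the flag is that hiding is not deleting: the ruby string stays attached to its segment, so it can be shown again later without re-entering or re-generating it.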