LoadICUBreakIterator

From Apache OpenOffice Wiki

Breaking encapsulation of ICU BreakIterator

Because of Issue 84467 (a duplicate of Issue 81519), we construct the iterator with the RuleBasedBreakIterator() constructor and then need to call setBreakType() on it.

There is a fix for this that removes the patch to ICU: it creates a subclass of RuleBasedBreakIterator, which can access the protected setBreakType() member. The fix is tracked in Issue 88411.
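The subclassing approach can be illustrated with a minimal, self-contained sketch. It does not use ICU: FakeRuleBasedBreakIterator is a hypothetical stand-in that only mimics the protected setBreakType() member, and ExposedBreakIterator and breakType() are names invented for this illustration, not the names used in the actual fix.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for icu::RuleBasedBreakIterator (NOT the real ICU class).
// It mimics only the property that matters here: setBreakType() is protected.
class FakeRuleBasedBreakIterator {
public:
    virtual ~FakeRuleBasedBreakIterator() = default;
    // Hypothetical getter, present only so the sketch can be checked.
    int32_t breakType() const { return fBreakType; }
protected:
    void setBreakType(int32_t type) { fBreakType = type; }
private:
    int32_t fBreakType = 0;
};

// The subclass re-exports the protected member as public, so callers can set
// the break type without a patched base class.
class ExposedBreakIterator : public FakeRuleBasedBreakIterator {
public:
    using FakeRuleBasedBreakIterator::setBreakType;
};
```

With this in place, code that holds an ExposedBreakIterator can call setBreakType() directly, while the base class itself stays unmodified.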

ICU code:

OpenOffice.org code:

Mailing list discussions:

Example reasons to use custom rules:

Use cases of loadICUBreakIterator

Questions:

  • Why does wordRule need to be static and preserved across the calls?
  • Is the rule string "word" used at all? What about other WordTypes?
Each group below lists the public method(s), the loadICU call they make, and the resulting rule text.

public method:
  nextCharacters(Text, nStartPos, rLocale, SKIPCELL, sal_Int32 nCount, nDone)
  prevCharacters(Text, nStartPos, rLocale, SKIPCELL, sal_Int32 nCount, nDone)
loadICU call:
  loadICUBreakIterator(rLocale, LOAD_CHARACTER_BREAKITERATOR, 0, "char", Text)
resulting rule text:
  char

public method:
  nextWord( const OUString& Text, sal_Int32 nStartPos, rLocale, ANYWORD_IGNOREWHITESPACES)
  previousWord(const OUString& Text, sal_Int32 nStartPos, rLocale, ANYWORD_IGNOREWHITESPACES)
  getWordBoundary( const OUString& Text, sal_Int32 nPos, rLocale, ANYWORD_IGNOREWHITESPACES, sal_Bool bDirection)
loadICU call:
  loadICUBreakIterator(rLocale, LOAD_WORD_BREAKITERATOR, ANYWORD_IGNOREWHITESPACES, NULL, Text)
resulting rule text:
  edit_word

public method:
  nextWord( const OUString& Text, sal_Int32 nStartPos, rLocale, DICTIONARY_WORD)
  previousWord(const OUString& Text, sal_Int32 nStartPos, rLocale, DICTIONARY_WORD)
  getWordBoundary( const OUString& Text, sal_Int32 nPos, rLocale, DICTIONARY_WORD, sal_Bool bDirection)
loadICU call:
  loadICUBreakIterator(rLocale, LOAD_WORD_BREAKITERATOR, DICTIONARY_WORD, NULL, Text)
resulting rule text:
  dict_word

public method:
  nextWord( const OUString& Text, sal_Int32 nStartPos, rLocale, WORD_COUNT)
  previousWord(const OUString& Text, sal_Int32 nStartPos, rLocale, WORD_COUNT)
  getWordBoundary( const OUString& Text, sal_Int32 nPos, rLocale, WORD_COUNT, sal_Bool bDirection)
loadICU call:
  loadICUBreakIterator(rLocale, LOAD_WORD_BREAKITERATOR, WORD_COUNT, NULL, Text)
resulting rule text:
  count_word

public method:
  nextWord( const OUString& Text, sal_Int32 nStartPos, rLocale, another_word_type)
  previousWord(const OUString& Text, sal_Int32 nStartPos, rLocale, another_word_type)
  getWordBoundary( const OUString& Text, sal_Int32 nPos, rLocale, another_word_type, sal_Bool bDirection)
loadICU call:
  loadICUBreakIterator(rLocale, LOAD_WORD_BREAKITERATOR, another_word_type, NULL, Text)
resulting rule text:
  word (???)

public method:
  beginOfSentence( const OUString& Text, sal_Int32 nStartPos, rLocale)
  endOfSentence( const OUString& Text, sal_Int32 nStartPos, rLocale)
loadICU call:
  loadICUBreakIterator(rLocale, LOAD_SENTENCE_BREAKITERATOR, 0, NULL, Text);
resulting rule text:
  NULL

public method:
  getLineBreak( const OUString& Text, sal_Int32 nStartPos, const lang::Locale& rLocale, sal_Int32 nMinBreakPos, const LineBreakHyphenationOptions& hOptions, const LineBreakUserOptions& /*rOptions*/ )
loadICU call:
  loadICUBreakIterator(rLocale, LOAD_LINE_BREAKITERATOR, 0, "line", Text);
resulting rule text:
  line
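
For the word-related calls listed above, the rule text appears to depend only on the word type passed in. A minimal sketch of that mapping follows, assuming nothing beyond what the listing shows: WordKind and wordRuleText are hypothetical names for this illustration, and the enum is a local stand-in rather than the real WordType constants.

```cpp
#include <cassert>
#include <string>

// Hypothetical local stand-ins for the word types named in the listing.
// These are NOT the real WordType constants and carry no particular values.
enum class WordKind { AnyWordIgnoreWhitespaces, DictionaryWord, WordCount, Other };

// Returns the rule text that the listing gives for each word type.
inline std::string wordRuleText(WordKind kind) {
    switch (kind) {
        case WordKind::AnyWordIgnoreWhitespaces: return "edit_word";
        case WordKind::DictionaryWord:           return "dict_word";
        case WordKind::WordCount:                return "count_word";
        case WordKind::Other:                    return "word";  // marked "(???)" in the listing
    }
    return "word";  // unreachable for valid enum values; keeps compilers quiet
}
```

The "word" fallback mirrors the uncertain last row; whether that rule string is ever actually used is one of the open questions on this page.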
  1. Check whether the locale's BreakIteratorRules ({edit_word, dict_word, count_word, char, line}) provide a rule for the requested locale.
  2. If not, try to load the rule named rule + "_" + lang anyway.
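
The two steps above can be sketched as a small lookup function. This is an illustration under stated assumptions, not the OpenOffice.org implementation: LocaleRules and resolveRuleName are hypothetical names, and the locale's BreakIteratorRules are modelled as a plain map from rule kind to the name of the rule data to load.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the per-locale BreakIteratorRules data:
// rule kind ("edit_word", "char", ...) -> name of the rule data to load.
using LocaleRules = std::map<std::string, std::string>;

// Step 1: use the locale's own entry if it has a non-empty one.
// Step 2: otherwise fall back to "<rule>_<lang>"; the caller then tries to
//         load that name, which may or may not exist.
inline std::string resolveRuleName(const LocaleRules& localeRules,
                                   const std::string& rule,
                                   const std::string& lang) {
    auto it = localeRules.find(rule);
    if (it != localeRules.end() && !it->second.empty()) {
        return it->second;
    }
    return rule + "_" + lang;
}
```

Returning a name rather than loading anything keeps the sketch free of ICU; in the real code, the load attempt in step 2 can still fail and has to be handled by the caller.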