Difference between revisions of "LoadICUBreakIterator"

From Apache OpenOffice Wiki
(Breaking encapsulation of ICU BreakIterator: Link to Bug 88411, which is a better implementation than the one that patches ICU 3.6)

Revision as of 16:55, 1 May 2008

Breaking encapsulation of ICU BreakIterator

Because of Issue 84467 (a duplicate of Issue 81519), we create the iterator with the RuleBasedBreakIterator() constructor and then need to call setBreakType() on it.

A fix for this removes the patch to ICU by creating a subclass of RuleBasedBreakIterator that can access the protected setBreakType() member. The fix is tracked in Issue 88411.
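
The subclassing idea can be sketched as follows. This is a minimal illustration, not the code from Issue 88411: MockRuleBasedBreakIterator is a stand-in for the real ICU class so that the example compiles without ICU, and the names OOoRuleBasedBreakIterator, publicSetBreakType() and breakType() are hypothetical.

```cpp
#include <cstdint>

// Stand-in for ICU's RuleBasedBreakIterator (assumption: NOT the real class).
// It only models the one detail that matters here: setBreakType() is protected.
class MockRuleBasedBreakIterator {
public:
    virtual ~MockRuleBasedBreakIterator() = default;

    // Hypothetical getter, added only so the effect can be observed.
    int32_t breakType() const { return fBreakType; }

protected:
    // Not callable from outside the class hierarchy.
    void setBreakType(int32_t type) { fBreakType = type; }

private:
    int32_t fBreakType = -1;  // -1 means "not set" in this sketch
};

// Hypothetical subclass: it may call the protected member because it derives
// from the base class, and it re-exposes that call through a public method.
class OOoRuleBasedBreakIterator : public MockRuleBasedBreakIterator {
public:
    void publicSetBreakType(int32_t type) { setBreakType(type); }
};
```

With this shape, the calling code constructs the subclass instead of the base class and sets the break type through the public wrapper, so no change to the ICU sources is needed.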

ICU code:

OpenOffice.org code:

Mailing list discussions:

Example reasons to use custom rules:

Use cases of loadICUBreakIterator

Questions:

  • Why does wordRule need to be static and preserved across the calls?
  • Is the rule string "word" used at all? For other WordTypes?

Public methods, the loadICU call each one makes, and the resulting rule text:

  Public methods:
    nextCharacters(Text, nStartPos, rLocale, SKIPCELL, sal_Int32 nCount, nDone)
    prevCharacters(Text, nStartPos, rLocale, SKIPCELL, sal_Int32 nCount, nDone)
  loadICU call: loadICUBreakIterator(rLocale, LOAD_CHARACTER_BREAKITERATOR, 0, "char", Text)
  Resulting rule text: char

  Public methods:
    nextWord( const OUString& Text, sal_Int32 nStartPos, rLocale, ANYWORD_IGNOREWHITESPACES)
    previousWord(const OUString& Text, sal_Int32 nStartPos, rLocale, ANYWORD_IGNOREWHITESPACES)
    getWordBoundary( const OUString& Text, sal_Int32 nPos, rLocale, ANYWORD_IGNOREWHITESPACES, sal_Bool bDirection)
  loadICU call: loadICUBreakIterator(rLocale, LOAD_WORD_BREAKITERATOR, ANYWORD_IGNOREWHITESPACES, NULL, Text)
  Resulting rule text: edit_word

  Public methods:
    nextWord( const OUString& Text, sal_Int32 nStartPos, rLocale, DICTIONARY_WORD)
    previousWord(const OUString& Text, sal_Int32 nStartPos, rLocale, DICTIONARY_WORD)
    getWordBoundary( const OUString& Text, sal_Int32 nPos, rLocale, DICTIONARY_WORD, sal_Bool bDirection)
  loadICU call: loadICUBreakIterator(rLocale, LOAD_WORD_BREAKITERATOR, DICTIONARY_WORD, NULL, Text)
  Resulting rule text: dict_word

  Public methods:
    nextWord( const OUString& Text, sal_Int32 nStartPos, rLocale, WORD_COUNT)
    previousWord(const OUString& Text, sal_Int32 nStartPos, rLocale, WORD_COUNT)
    getWordBoundary( const OUString& Text, sal_Int32 nPos, rLocale, WORD_COUNT, sal_Bool bDirection)
  loadICU call: loadICUBreakIterator(rLocale, LOAD_WORD_BREAKITERATOR, WORD_COUNT, NULL, Text)
  Resulting rule text: count_word

  Public methods:
    nextWord( const OUString& Text, sal_Int32 nStartPos, rLocale, another_word_type)
    previousWord(const OUString& Text, sal_Int32 nStartPos, rLocale, another_word_type)
    getWordBoundary( const OUString& Text, sal_Int32 nPos, rLocale, another_word_type, sal_Bool bDirection)
  loadICU call: loadICUBreakIterator(rLocale, LOAD_WORD_BREAKITERATOR, another_word_type, NULL, Text)
  Resulting rule text: word (???)

  Public methods:
    beginOfSentence( const OUString& Text, sal_Int32 nStartPos, rLocale)
    endOfSentence( const OUString& Text, sal_Int32 nStartPos, rLocale)
  loadICU call: loadICUBreakIterator(rLocale, LOAD_SENTENCE_BREAKITERATOR, 0, NULL, Text)
  Resulting rule text: NULL

  Public methods:
    getLineBreak( const OUString& Text, sal_Int32 nStartPos, const lang::Locale& rLocale, sal_Int32 nMinBreakPos, const LineBreakHyphenationOptions& hOptions, const LineBreakUserOptions& /*rOptions*/ )
  loadICU call: loadICUBreakIterator(rLocale, LOAD_LINE_BREAKITERATOR, 0, "line", Text)
  Resulting rule text: line

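
The word-type rows of the table above can be condensed into a small lookup. The sketch below is only an illustration of that mapping: the enum is a local stand-in whose numeric values are placeholders (not taken from the OpenOffice.org headers), wordRuleNameFor() is a hypothetical helper, and the "word" fallback mirrors the row marked "(???)", so it is an open question rather than confirmed behaviour.

```cpp
#include <string>

namespace sketch {

// Local stand-in for the word type constants named in the table.
// The numeric values are placeholders chosen for this example only.
enum WordTypeSketch {
    ANYWORD_IGNOREWHITESPACES,
    DICTIONARY_WORD,
    WORD_COUNT,
    ANOTHER_WORD_TYPE  // any word type not listed explicitly in the table
};

// Hypothetical helper: returns the rule text the table associates with a
// word type when the rule argument passed to loadICUBreakIterator is NULL.
inline std::string wordRuleNameFor(WordTypeSketch wordType) {
    switch (wordType) {
        case ANYWORD_IGNOREWHITESPACES: return "edit_word";
        case DICTIONARY_WORD:           return "dict_word";
        case WORD_COUNT:                return "count_word";
        default:                        return "word";  // the "(???)" row
    }
}

}  // namespace sketch
```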
  1. Check whether the locale's BreakIteratorRules ({edit_word, dict_word, count_word, char, line}) provide a rule for the requested locale.
  2. If not, try to load the rule named rule + "_" + lang anyway.
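
The two steps above can be sketched as a lookup with a fallback. This is an illustration under assumptions, not the actual OpenOffice.org implementation: LocaleRules and resolveRuleName() are hypothetical names, the locale data is modelled as a plain map from rule kind to rule name, and reading "rule+_ + lang" as rule + "_" + lang is an interpretation of the list.

```cpp
#include <map>
#include <string>

// Hypothetical model of a locale's BreakIteratorRules:
// rule kind ("edit_word", "dict_word", "count_word", "char", "line")
// mapped to the rule name that the locale data provides for it.
using LocaleRules = std::map<std::string, std::string>;

// Step 1: use the locale-provided rule name if there is a non-empty one.
// Step 2: otherwise fall back to rule + "_" + lang and try to load that.
inline std::string resolveRuleName(const LocaleRules& localeRules,
                                   const std::string& rule,
                                   const std::string& lang) {
    const auto it = localeRules.find(rule);
    if (it != localeRules.end() && !it->second.empty()) {
        return it->second;
    }
    return rule + "_" + lang;
}
```

For example, with locale data that only defines a "line" rule, a request for "char" in language "th" would resolve to "char_th" in this sketch, while "line" would resolve to whatever name the locale data supplies.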