Difference between revisions of "Calc/Performance/string handling in formula compiler"
From Apache OpenOffice Wiki
Revision as of 17:17, 3 March 2009
Compiling formulas almost unconditionally calls toUpper() on every token parsed and spends way too much time in the underlying i18n routines.
- No need to do this at all for tokens of operators, separators, parentheses, ... all tokens that do not involve letters.
- When loading ODF documents, only a simplified ASCII toUpper() needs to be called, since all function names are stored using English names.
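The two points above can be illustrated with a minimal sketch. This is not the actual ScCompiler code: the helper names hasAsciiLowercase, asciiToUpper and normalizeToken are hypothetical, and std::string stands in for the String class used in the real compiler. It assumes C++17 and that function names in the loaded document are plain ASCII, as stated above for ODF.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper (not part of ScCompiler): true if the token contains
// at least one ASCII lowercase letter, i.e. upper-casing would change it.
bool hasAsciiLowercase(std::string_view token)
{
    for (char c : token)
        if (c >= 'a' && c <= 'z')
            return true;
    return false;
}

// Hypothetical helper: upper-case ASCII letters only and leave every other
// character untouched. No locale or i18n lookup is involved.
std::string asciiToUpper(std::string_view token)
{
    std::string result(token);
    for (char& c : result)
        if (c >= 'a' && c <= 'z')
            c = static_cast<char>(c - 'a' + 'A');
    return result;
}

// Hypothetical entry point: operators, separators, parentheses and tokens
// that are already upper case are returned without any conversion.
std::string normalizeToken(std::string_view token)
{
    if (!hasAsciiLowercase(token))
        return std::string(token);
    return asciiToUpper(token);
}
```

With this sketch, a token such as "+" or "(" never reaches the conversion loop, and a function name such as "sum" is converted without calling into locale-aware routines. Formulas typed by the user in a localized UI would still need the full i18n toUpper().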
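A test case document (http://www.openoffice.org/nonav/issues/showattachment.cgi/60645/i99828_formula_compiler.ods) is attached to Issue 99828. It contains two columns of functions with 64k rows of formulas each; every formula has a function name, some references, a value and a few operators and separators. Profiling gave: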
 Level  Method                       Instr. (incl.)     %      Called
 ----------------------------------------------------------------------
        Application::Execute         19,000,207,876
        ScDocShell::LoadXML          18,500,074,143
 1      ScFormulaCell::CompileXML     5,119,853,402   26.9     131,072
 2      ScCompiler::CompileString     3,664,394,563   19.2     131,072
 3      ScCompiler::NextNewToken      3,171,730,381   16.6     983,040
 4      String::~String                 276,824,113    1.5   1,441,792
 4      CharClass::toUpper              948,874,151    5.0     589,824
After eliminating unnecessary toUpper() calls and rearranging the code a bit to create fewer temporary strings, the results were:
 Level  Method                       Instr. (incl.)     %      Called
 ----------------------------------------------------------------------
        Application::Execute         17,961,981,399
        ScDocShell::LoadXML          17,464,830,027
 1      ScFormulaCell::CompileXML     4,110,890,473   22.8     131,072
 2      ScCompiler::CompileString     2,641,827,274   14.6     131,072
 3      ScCompiler::NextNewToken      2,149,151,806   11.9     983,040
 4      String::~String                 171,048,974    0.9     983,040
        CharClass::toUpper                        0    0.0           0
which is an overall improvement of roughly 5% under LoadXML() (18,500,074,143 down to 17,464,830,027 instructions, about 5.6%).
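The improvement figure can be rechecked from the instruction counts in the two profiles. The following is a small sketch for that arithmetic; the function name reductionPercent is made up for illustration and is not part of any profiling tool.

```cpp
#include <cstdint>

// Relative reduction in percent when an instruction count drops from
// `before` to `after`. Values are converted to double before subtracting,
// so a regression (after > before) yields a negative result.
double reductionPercent(std::uint64_t before, std::uint64_t after)
{
    const double b = static_cast<double>(before);
    const double a = static_cast<double>(after);
    return 100.0 * (b - a) / b;
}
```

Calling reductionPercent(18500074143ULL, 17464830027ULL) for ScDocShell::LoadXML gives about 5.6, while the same calculation for ScCompiler::NextNewToken (3,171,730,381 down to 2,149,151,806) gives about 32, so the gain inside the tokenizer itself is considerably larger than the overall load-time figure suggests.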