Difference between revisions of "Calc/Performance/string handling in formula compiler"

From Apache OpenOffice Wiki
(toUpper() bottleneck)
 
(results and change)
Line 4:
  * toUpper() is not needed at all for tokens of operators, separators, parentheses, and any other tokens that do not involve letters.
  * When loading ODF documents, only a simplified ASCII toUpper() needs to be called, since all function names are stored using English names.
+
+ A [http://www.openoffice.org/nonav/issues/showattachment.cgi/60645/i99828_formula_compiler.ods test case document] is attached to {{bug|99828}}. It contains two columns of formulas with 64k rows each; each formula has a function name, some references, a value, and a few operators and separators. Profiling gave
+
+ Level   Method                      Instr. (incl.)     %       Called
+ ---------------------------------------------------------------------
+         Application::Execute        19,000,207,876
+         ScDocShell::LoadXML         18,500,074,143
+ 1       ScFormulaCell::CompileXML    5,119,853,402  26.9      131,072
+ 2       ScCompiler::CompileString    3,664,394,563  19.2      131,072
+ 3       ScCompiler::NextNewToken     3,171,730,381  16.6      983,040
+ 4       String::~String                276,824,113   1.5    1,441,792
+ 4       CharClass::toUpper             948,874,151   5.0      589,824
+
+ After eliminating the unnecessary toUpper() calls and rearranging the code to create fewer temporary strings, the results were
+
+         Application::Execute        17,961,981,399
+         ScDocShell::LoadXML         17,464,830,027
+ 1       ScFormulaCell::CompileXML    4,110,890,473  22.8      131,072
+ 2       ScCompiler::CompileString    2,641,827,274  14.6      131,072
+ 3       ScCompiler::NextNewToken     2,149,151,806  11.9      983,040
+ 4       String::~String                171,048,974   0.9      983,040
+         CharClass::toUpper                       0   0.0            0
+
+ which is an overall improvement of roughly 5.6% in instructions executed under LoadXML().

  [[Category:Calc|Performance/string_handling_in_formula_compiler]]
  [[Category:Performance]]
- [[Category:InProgress]]
+ [[Category:Done]]

Revision as of 17:17, 3 March 2009


Compiling formulas calls toUpper() almost unconditionally on every parsed token and spends far too much time in the underlying i18n routines.

  • toUpper() is not needed at all for tokens of operators, separators, parentheses, and any other tokens that do not involve letters.
  • When loading ODF documents, only a simplified ASCII toUpper() needs to be called, since all function names are stored using English names.
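
The two points above can be sketched roughly as follows. This is only an illustration, not the actual ScCompiler code: the helper names (needsAsciiUpper, asciiToUpper, normalizeOdfToken) and the use of std::u16string in place of the String class are assumptions made for the example. It covers only the ODF loading case, where function names are English; formulas typed in a localized UI would still need the locale-aware toUpper().

```cpp
#include <string>

// Hypothetical helpers for illustration only; not the actual ScCompiler code.

// True if the token contains at least one lowercase ASCII letter, i.e. if
// ASCII uppercasing would change it. Operators, separators, parentheses and
// numbers return false, so no uppercasing work is done for them.
bool needsAsciiUpper(const std::u16string& token)
{
    for (char16_t c : token)
        if (c >= u'a' && c <= u'z')
            return true;
    return false;
}

// Simplified ASCII-only toUpper: maps a-z to A-Z and leaves every other
// code unit untouched. No locale or i18n routine is involved.
std::u16string asciiToUpper(std::u16string token)
{
    for (char16_t& c : token)
        if (c >= u'a' && c <= u'z')
            c = static_cast<char16_t>(c - (u'a' - u'A'));
    return token;
}

// Assumed entry point for tokens read from an ODF document.
std::u16string normalizeOdfToken(const std::u16string& token)
{
    if (!needsAsciiUpper(token))
        return token;            // nothing to uppercase, skip the conversion
    return asciiToUpper(token);
}
```

With these helpers, a token such as "+" or "42" is returned unchanged, while "sum" becomes "SUM" without calling into the i18n layer.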

A test case document is attached to Issue 99828. It contains two columns of formulas with 64k rows each (131,072 formula cells in total); each formula has a function name, some references, a value, and a few operators and separators. Profiling gave

Level   Method                      Instr. (incl.)     %       Called
---------------------------------------------------------------------
        Application::Execute        19,000,207,876
        ScDocShell::LoadXML         18,500,074,143
1       ScFormulaCell::CompileXML    5,119,853,402  26.9      131,072
2       ScCompiler::CompileString    3,664,394,563  19.2      131,072
3       ScCompiler::NextNewToken     3,171,730,381  16.6      983,040
4       String::~String                276,824,113   1.5    1,441,792
4       CharClass::toUpper             948,874,151   5.0      589,824

After eliminating the unnecessary toUpper() calls and rearranging the code to create fewer temporary strings, the results were

        Application::Execute        17,961,981,399
        ScDocShell::LoadXML         17,464,830,027
1       ScFormulaCell::CompileXML    4,110,890,473  22.8      131,072
2       ScCompiler::CompileString    2,641,827,274  14.6      131,072
3       ScCompiler::NextNewToken     2,149,151,806  11.9      983,040
4       String::~String                171,048,974   0.9      983,040
        CharClass::toUpper                       0   0.0            0

which is an overall improvement of roughly 5.6% in instructions executed under LoadXML().
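
The improvement can be checked from the inclusive instruction counts in the two tables. The following is a minimal sketch of that arithmetic; the function name reductionPercent is made up for this example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: relative reduction, in percent, between the
// inclusive instruction count before and after the change.
double reductionPercent(std::uint64_t before, std::uint64_t after)
{
    assert(before > 0 && after <= before);
    return 100.0 * static_cast<double>(before - after)
                 / static_cast<double>(before);
}

// Instruction counts taken from the profiles above.
const std::uint64_t kLoadXmlBefore      = 18500074143ULL;
const std::uint64_t kLoadXmlAfter       = 17464830027ULL;
const std::uint64_t kNextNewTokenBefore = 3171730381ULL;
const std::uint64_t kNextNewTokenAfter  = 2149151806ULL;
```

Calling reductionPercent(kLoadXmlBefore, kLoadXmlAfter) gives the overall figure for LoadXML(); the same call with the NextNewToken counts shows that the saving inside the tokenizer itself is much larger, at roughly a third of its instructions.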
