Difference between revisions of "Calc/Performance/Specific Bottlenecks"

From Apache OpenOffice Wiki
(OOX import issue 96758 relocated to Calc/Performance/Import_of_XLSX)
(relocated the Ou case to Calc/Performance/The_Ou_case)
  
 
Content relocated to [[Calc/Performance/Zaske]], section preserved for external references linking here.
 
 
== The Ou case ==
 
 
Loading a large plain data file takes very long.
 
 
References:
 
* [http://blogs.zdnet.com/Ou/?p=120 George Ou's blog entry]
 
* [http://www.lanarchitect.net/Examples/200264-l.sxc The test case file] (.sxc)
 
* [http://www.lanarchitect.net/Examples/200264-l.zip Same data, but zip'ed Excel-XML]
 
 
Findings:
 
 
* source/filter/xml/xmlsubti.cxx
 
** 38% of the time is spent in ScMyTables::NewColumn() because of repeated use of aTableVec[nTableCount - 1] (vector::operator[]) <br> Note: the percentage may be off because the build was compiled without optimization to obtain exact line numbers, which may cause STLport's vector methods to be compiled differently.
 
*** Proposed fix: obtain the pointer once and reuse it.
 
** Similar for other places where aTableVec[xxx] is used.
 
 
* '''TODO:''' Check all ScMyTables::.*() and ScMyTableData::.*()
 
** Especially for 63342857 calls to AddColumn() and NewColumn() that result in 1168654944 calls to operator[] ...
 
** 63081776 calls to AddColumn() originate from ScXMLTableRowCellContext::EndElement()
 
** Those are highly suspicious and seem to indicate that too many temporary elements are created for empty columns/cells (needs verification).
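The proposed fix for the repeated aTableVec[nTableCount - 1] access can be sketched as follows. This is a minimal illustration, not the actual xmlsubti.cxx code: TableData and both function names are hypothetical stand-ins, and only the indexing pattern is taken from the findings above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for ScMyTableData; the real class holds more state.
struct TableData {
    int nColumn = 0;
    std::vector<int> aColCount;
};

// Before: each statement re-evaluates aTableVec[nTableCount - 1],
// so every call pays for several vector::operator[] invocations.
int newColumnRepeatedIndex(std::vector<TableData*>& aTableVec, std::size_t nTableCount) {
    aTableVec[nTableCount - 1]->nColumn += 1;
    aTableVec[nTableCount - 1]->aColCount.push_back(1);
    return aTableVec[nTableCount - 1]->nColumn;
}

// After: the pointer is fetched once and reused for all accesses.
int newColumnHoisted(std::vector<TableData*>& aTableVec, std::size_t nTableCount) {
    TableData* pTab = aTableVec[nTableCount - 1];
    pTab->nColumn += 1;
    pTab->aColCount.push_back(1);
    return pTab->nColumn;
}
```

Both functions have the same effect; the second performs one operator[] call instead of three. In an optimized build the compiler may already inline the access, so the gain should be confirmed by profiling.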
 
 
  
 
== Sorting values within functions ==
 

Revision as of 20:15, 12 February 2009

Specific bottlenecks to be worked on, identified using tools such as valgrind --tool=callgrind.


The Zaske case

Content relocated to Calc/Performance/Zaske, section preserved for external references linking here.

Sorting values within functions

Content relocated to Calc/Performance/sorting_values_within_functions, section preserved for external references linking here.

Querying data within functions

An internal customer's document (which cannot be published) performs lookup queries that do not fit into the current caching strategy.

Findings:

  • 8% in 51613353 calls to com::sun::star::i18n::casefolding::getNextChar() via
    • 39696595 calls to utl::TransliterationWrapper::isEqual() via
      • ScTable::ValidQuery() via
        • 8888 calls to ScQueryCellIterator::GetThis() via
          • lcl_LookupQuery()
  • 5% in ScTable::ValidQuery() itself, mostly in String() and ~String() of aCellStr
  • 200873636 calls to com::sun::star::i18n::casefolding::getNextChar() via
    • 33173401 calls to com::sun::star::i18n::Transliteration_caseignore::compare()
  • 5% in com::sun::star::i18n::oneToOneMappingWithFlag::find()
    • Repeated mpIndex[high] access; using a temporary pointer might be better.
  • 5% in com::sun::star::i18n::casefolding::getValue()
  • 58% overall in ScTable::ValidQuery() and below
    • TODO: Cache results of ValidQuery()? Similar to ScLookupCache?
  • 11% overall in 27341713 calls to ScBroadcastAreaSlot::StartListeningArea() and below, of which 10% are in ::std::set::insert() and below.
    • TODO: refactor implementation of broadcast slots.
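
The TODO about caching ValidQuery() results could take roughly the following shape. This is only a sketch under stated assumptions: QueryResultCache, its methods, and the column-of-strings data model are hypothetical and do not reflect the interface of ScLookupCache or ScTable. It uses exact string matching, whereas the profiled queries compare case-insensitively.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical cache keyed by (column index, query string).
class QueryResultCache {
public:
    using Key = std::pair<std::size_t, std::string>;

    // Returns the row of the first exact match in the column, or -1 if none.
    // The column is scanned only on a cache miss.
    long lookup(const std::vector<std::vector<std::string>>& columns,
                std::size_t col, const std::string& query) {
        const Key key(col, query);
        auto it = maCache.find(key);
        if (it != maCache.end()) {
            ++mnHits;
            return it->second;
        }
        long row = -1;
        const std::vector<std::string>& cells = columns[col];
        for (std::size_t i = 0; i < cells.size(); ++i) {
            if (cells[i] == query) {
                row = static_cast<long>(i);
                break;
            }
        }
        maCache.emplace(key, row);
        return row;
    }

    // Must be called whenever cell content changes, or results go stale.
    void invalidate() { maCache.clear(); }

    std::size_t hits() const { return mnHits; }

private:
    std::map<Key, long> maCache;
    std::size_t mnHits = 0;
};
```

Repeated lookups of the same value in the same column then skip the per-cell string comparison entirely. Whether this pays off depends on how often queries repeat between invalidations, which the findings above do not state.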