Difference between revisions of "Calc/Performance/sorting values within functions"

From Apache OpenOffice Wiki
Jump to: navigation, search
(done)
 
m (typo in URL)
 
Line 1: Line 1:
 
{{DISPLAYTITLE:Sorting Values within Spreadsheet Functions}}
 
{{DISPLAYTITLE:Sorting Values within Spreadsheet Functions}}
  
[http://www.openoffice.org/issues/show_bug.cgi?6id=89976 i89976] has a document attached:
+
[http://www.openoffice.org/issues/show_bug.cgi?id=89976 i89976] has a document attached:
  
 
test-huge_calculations-Median-detailed.ods
 
test-huge_calculations-Median-detailed.ods

Latest revision as of 20:04, 26 March 2009


i89976 has a document attached:

test-huge_calculations-Median-detailed.ods

NOTE: Some assumptions made by the submitter as documented in the test case are plain wrong.

Findings when testing with filling C4:C3003

  • 52% overall in interpr3.cxx lcl_QuickSort() and below, of which
    • 32% in vector<double>::operator[] and below,
      • 25% originating from the loops
       while (ni <= nHi && rSortArray[ni]  < rSortArray[nLo]) ni++;
       while (nj >= nLo && rSortArray[nLo] < rSortArray[nj])  nj--;

where rSortArray[nLo] should be a temporary variable instead.

Or all that be realized using simple double[].

  • 21% overall in ScValueIterator::GetThis() and below.


Solution is in CWS DEV300 calcperf03  , besides the loops above being changed to use a temporary value, which benefits other functions, the use of a fully sorted array for functions MEDIAN, PERCENTILE, QUARTILE, LARGE and SMALL was eliminated by using ::std::nth_element() instead. Filling entire column C as described in the document now needs ~12 minutes instead of an hour or so. Excel needed ~15 minutes, but that can't be exactly compared because it was on different machines.

Personal tools