Performance/load performance implement

From Apache OpenOffice Wiki

Performance Project

performance.openoffice.org



For the analysis, see Performance/Odf_document_load_performance_increase_feasibility_analysis. This document explains the implementation and records some discussion.

Implementation

Collect data through SAX:

  • The first step is simple and poses no problems.
  • The SAX parser can collect the valid data into a data structure whose elements can be looked up quickly by their location identifier. For now I have chosen a vector.
  • Each element has these base fields:
    • sal_uInt16 m_prefix;
    • rtl_uString* mp_localName;
    • sal_uInt32 m_local;
    • sal_uInt32 m_distance;
    • sal_uInt32 m_count;
    • and other fields such as namespace / full name / cached token enum ...
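The collection step can be sketched roughly as follows. This is only an illustration, not the actual importer code: the page does not define the three index fields, so the sketch assumes that "m_local" is the element's index in the vector, "m_distance" is the index offset back to its parent (0 for the root), and "m_count" is the number of descendants. `ElementCollector` and its methods are hypothetical names, and `std::string` stands in for `rtl_uString*` so the block compiles without the SAL headers.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the element record listed above.
struct Element {
    std::uint16_t m_prefix;    // namespace prefix id
    std::string   m_localName; // std::string instead of rtl_uString* (assumption)
    std::uint32_t m_local;     // ASSUMPTION: index of this element in the vector
    std::uint32_t m_distance;  // ASSUMPTION: index offset back to the parent, 0 for the root
    std::uint32_t m_count;     // ASSUMPTION: number of descendants of this element
};

// Hypothetical collector fed by SAX-style start/end callbacks.
class ElementCollector {
public:
    void startElement(std::uint16_t prefix, const std::string& localName) {
        const std::uint32_t index = static_cast<std::uint32_t>(m_elements.size());
        // The innermost open element, if any, is the parent.
        const std::uint32_t distance = m_open.empty() ? 0u : index - m_open.back();
        m_elements.push_back(Element{prefix, localName, index, distance, 0u});
        m_open.push_back(index);
    }

    void endElement() {
        const std::uint32_t index = m_open.back();
        m_open.pop_back();
        // Everything appended after this element belongs to its subtree.
        m_elements[index].m_count =
            static_cast<std::uint32_t>(m_elements.size()) - index - 1u;
    }

    const std::vector<Element>& elements() const { return m_elements; }

private:
    std::vector<Element> m_elements;   // elements in document order
    std::vector<std::uint32_t> m_open; // indices of the currently open elements
};
```

Storing the elements in document order keeps the structure a single flat allocation, which fits the choice of a vector mentioned above.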

Processing method

  • We can walk through the result element by element, using each element's "m_local", "m_distance" and "m_count" to find its parent element and its subelements.
  • Processing an element has two steps:
    • Process the element itself: its attributes and its subelements.
    • Commit the resulting information to the parent element.
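A minimal sketch of how the three fields could support this navigation. The field meanings are assumptions, since the page does not define them: "m_local" is taken to be the element's own index in the vector, "m_distance" the index offset back to the parent (0 for the root), and "m_count" the number of descendants. `parentOf` and `childrenOf` are hypothetical helper names.

```cpp
#include <cstdint>
#include <vector>

// Only the navigation fields are shown; meanings are assumptions (see lead-in).
struct Element {
    std::uint32_t m_local;    // ASSUMPTION: own index in the vector
    std::uint32_t m_distance; // ASSUMPTION: offset back to the parent, 0 for the root
    std::uint32_t m_count;    // ASSUMPTION: number of descendants
};

// Index of the parent; the root (m_distance == 0) maps to itself.
std::uint32_t parentOf(const std::vector<Element>& v, std::uint32_t i) {
    return v[i].m_local - v[i].m_distance;
}

// Indices of the direct subelements, in document order.
std::vector<std::uint32_t> childrenOf(const std::vector<Element>& v, std::uint32_t i) {
    std::vector<std::uint32_t> children;
    const std::uint32_t last = i + v[i].m_count; // last index inside the subtree
    // Jump from one child to the next by skipping over the child's own subtree.
    for (std::uint32_t c = i + 1; c <= last; c += v[c].m_count + 1) {
        children.push_back(c);
    }
    return children;
}
```

Under these assumptions the parent lookup is a single subtraction, and enumerating the subelements costs one step per child regardless of how deep each child's subtree is.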

[Figure: Odfcontext Process Implement.jpg]

      • processElement() : Process element
        • _startElement() : Start element
        • processSubContexts() : If this element has subelements, process them; each subelement is processed in three steps.
          • _createChildContext(): Create the child context
          • _processSubContext(): Process subelement
          • _collectSubContext(): Collect the subelement's result data
        • _characters(): Process element's content
        • _endElement(): End element
      • commit(): Commit the result data to the parent element
  • An element can be processed in one of three ways: serial, parallel or delayed processing.
    • Serial processing: every element is processed the same way. When processing of the current element ends, control returns to the parent. The parent moves on to its next child if there is one; otherwise it returns to its own parent.
    • Parallel processing: for an element with many subelements, each subelement's "_processSubContext()" is dispatched to a separate worker thread. Once all subelements have finished, "_collectSubContext()" runs serially.
    • Delayed processing: for an element with many subelements, the first subelement is processed immediately and the others are deferred until the end of document processing.
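The difference between serial and parallel processing can be sketched as below. This is a simplified illustration under stated assumptions, not the importer's context classes: `Node`, `Mode` and the free function `processElement` are hypothetical, the "result data" is just a string, and `std::async` stands in for whatever worker-thread mechanism the real code would use. Delayed processing is not covered.

```cpp
#include <cstddef>
#include <future>
#include <string>
#include <vector>

// Hypothetical element tree node used only for this sketch.
struct Node {
    std::string name;
    std::vector<Node> children;
};

enum class Mode { Serial, Parallel };

// Processes one element and returns the result that would be committed to the parent.
std::string processElement(const Node& node, Mode mode) {
    std::string result = "<" + node.name + ">"; // start of the element

    // Process the subelements: each one produces an independent partial result.
    std::vector<std::string> partial(node.children.size());
    if (mode == Mode::Parallel) {
        std::vector<std::future<std::string>> jobs;
        for (const Node& child : node.children) {
            jobs.push_back(std::async(std::launch::async, [&child, mode] {
                return processElement(child, mode);
            }));
        }
        for (std::size_t i = 0; i < jobs.size(); ++i) {
            partial[i] = jobs[i].get(); // wait for each worker to finish
        }
    } else {
        for (std::size_t i = 0; i < node.children.size(); ++i) {
            partial[i] = processElement(node.children[i], mode);
        }
    }

    // Collect step: always serial and in document order.
    for (const std::string& p : partial) {
        result += p;
    }

    result += "</" + node.name + ">"; // end of the element
    return result;
}
```

Because the collect step runs serially in document order, both modes produce the same result as long as the subelements do not depend on each other. One thread per subelement is used here only for brevity; a real implementation would more likely use a bounded worker pool.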

Processing

  • Taken as a whole, processing stays serial.
    • An ODF document has four base parts: "meta", "settings", "styles" and "content".
    • The dependency relation is: "content" -> "styles" -> "settings" -> "meta" (I think this is not a problem :-) )
    • So the order stays the same as now: "meta" -> "settings" -> "styles" -> "content".
  • Within each part, parallel processing is possible.
    • The meta, settings and styles parts can each use "Parallel processing".
    • I think the content part can use "Parallel processing" or "Delayed processing", because no other part depends on it.
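As a small worked example, the load order can be derived from the dependency relation with a depth-first traversal that emits each part only after the parts it depends on. The `Deps` type and the functions `visit` and `loadOrder` are hypothetical names for this sketch, and the dependency table simply restates the relation given above.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Maps each part to the parts it depends on.
using Deps = std::map<std::string, std::vector<std::string>>;

// Depth-first visit: emit the dependencies first, then the part itself.
// No cycle detection; the relation above is a simple chain.
void visit(const std::string& part, const Deps& deps,
           std::set<std::string>& done, std::vector<std::string>& order) {
    if (!done.insert(part).second) {
        return; // already handled
    }
    const auto it = deps.find(part);
    if (it != deps.end()) {
        for (const std::string& dependency : it->second) {
            visit(dependency, deps, done, order);
        }
    }
    order.push_back(part);
}

// Returns the parts in an order where every part follows its dependencies.
std::vector<std::string> loadOrder(const Deps& deps) {
    std::set<std::string> done;
    std::vector<std::string> order;
    for (const auto& entry : deps) {
        visit(entry.first, deps, done, order);
    }
    return order;
}
```

With "content" depending on "styles", "styles" on "settings" and "settings" on "meta", `loadOrder` yields "meta", "settings", "styles", "content", which matches the current serial order.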

Difficulty

Meta part

  • The subelements of "<office:meta>" look like "<meta:*>". I think the subelements are not related to each other, so they can be processed in parallel.
  • Currently this part processes a DOM object; I don't know why. So for now this part is processed serially.

Settings part

  • Every subelement of "<office:settings>" yields a "beans::PropertyValue", so they can be processed in parallel.

Styles part and Content part

  • The objects come from sfx2, sd, sc and sw, so this is complex.

Plan

  • Implement most of the sd-related source code as planned.
  • Debug until the serial processing path is correct.
  • Try to test parallel processing.
  • Analyze whether delayed processing is feasible.