Difference between revisions of "XML Load"

From Apache OpenOffice Wiki

Revision as of 06:58, 17 February 2006

An ODF document is a zipped archive of XML files, along with some auxiliary information and pictures. Reading and parsing the XML accounts for part of the time spent opening a document. Below is an analysis of XML processing.

We use a suite of 60 Performance Related Test Documents. The content.xml for each has been extracted for XML-only tests.
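As a rough illustration of the extraction step, the sketch below pulls content.xml out of a zipped package and runs it through expat. It uses Python's standard library rather than the C and C++ parsers measured here, and the helper names and the tiny sample package are assumptions made for the example, not part of the test suite.

```python
import io
import zipfile
import xml.parsers.expat


def extract_content_xml(odf_bytes: bytes) -> bytes:
    """Return the raw content.xml stream from a package held in memory."""
    with zipfile.ZipFile(io.BytesIO(odf_bytes)) as package:
        return package.read("content.xml")


def count_start_tags(xml_bytes: bytes) -> int:
    """Parse the XML with expat and count element start tags."""
    count = 0

    def on_start(name, attrs):
        nonlocal count
        count += 1

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.Parse(xml_bytes, True)  # True marks the final chunk of input
    return count


def make_sample_package() -> bytes:
    """Build a tiny stand-in package (hypothetical content, not a valid ODF file)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("content.xml", "<doc><p>Hello</p><p>World</p></doc>")
    return buffer.getvalue()
```

Separating extraction from parsing keeps the unzip cost out of the XML-only measurements.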

Niklas and Florian have prototyped a test component (FastXML) that tokenizes XML tags and passes the tokens around. This saves string allocation time and provides a speedup.
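The idea of tokenizing tags can be sketched as follows. This is not the FastXML prototype itself: the token table, the UNKNOWN value and the function name are hypothetical, and in Python the tag name string is still created by the parser, so the sketch only shows the shape of the interface (consumers receive small integers instead of strings), not the allocation savings.

```python
import xml.parsers.expat

# Hypothetical token table; a real one would cover the ODF vocabulary.
TOKENS = {"doc": 1, "p": 2, "span": 3}
UNKNOWN = 0  # assumed fallback for tags missing from the table


def tokenize_tags(xml_bytes: bytes) -> list:
    """Return the integer token for each start tag, in document order."""
    tokens = []

    def on_start(name, attrs):
        # Look the tag name up once; downstream code only sees the integer.
        tokens.append(TOKENS.get(name, UNKNOWN))

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.Parse(xml_bytes, True)
    return tokens
```

Consumers can then dispatch on the integer, for example with a lookup table, rather than comparing tag names as strings.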

Comparisons

  • Compare OpenOffice TestXML and TestFastXML for the sample documents
  • Compare different XML parsers & APIs in terms of processing content.xml
  • Compare time spent in XML parsing, building document model, and rendering

Methodology

  • Time is measured in the same way for each test, based on Time::GetSystemTicks.
  • File handling and parsing are done as similarly as possible, within the allowances of API differences.
  • Only C and C++ parsers are considered; Java-based parsers and wrappers are excluded.
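The methodology above amounts to feeding the same input to each parser and timing it with the same clock. A minimal sketch, assuming time.perf_counter as a stand-in for Time::GetSystemTicks and using two Python standard-library APIs in place of the C and C++ parsers under test (note that Python's xml.sax is itself built on expat, so this compares API layers rather than independent parsers):

```python
import time
import xml.parsers.expat
import xml.sax


def parse_with_expat(xml_bytes: bytes) -> int:
    """Count start tags using the expat bindings directly."""
    count = 0

    def on_start(name, attrs):
        nonlocal count
        count += 1

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.Parse(xml_bytes, True)
    return count


class _TagCounter(xml.sax.ContentHandler):
    """SAX handler that does the same minimal work as the expat callback."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        self.count += 1


def parse_with_sax(xml_bytes: bytes) -> int:
    """Count start tags through the SAX API."""
    handler = _TagCounter()
    xml.sax.parseString(xml_bytes, handler)
    return handler.count


def time_parse(parse_fn, xml_bytes: bytes, repeats: int = 5) -> float:
    """Return the best wall-clock time in seconds over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        parse_fn(xml_bytes)
        best = min(best, time.perf_counter() - start)
    return best


# Hypothetical input; the real tests use the extracted content.xml files.
SAMPLE = b"<doc>" + b"<p>text</p>" * 1000 + b"</doc>"
```

Both callbacks do the same trivial work, so any difference in time_parse(parse_with_expat, SAMPLE) and time_parse(parse_with_sax, SAMPLE) comes from the API layer rather than from the handler.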

Results

  • FastXML provides a good speedup across the test suite.
  • libxml2 (SAX API) is the fastest from a pure parsing point of view.
    • libxml2 (Reader/processNode) is slower, but comparable to expat.
    • expat is faster than xerces (SAX & SAX2) as well as OO.o.
    • The OO.o parser has some UNO interface overhead (to be measured).

Ongoing Work

  • Performance counter to measure proportion of time spent in:
    1. Opening & uncompressing container files
    2. XML Parser setup
    3. Actual Parsing
    4. String Allocation
    5. Building Doc Model
    6. Rendering
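One possible shape for such a counter is sketched below. It is an assumption about how the measurement could be organised, not the planned implementation: the PhaseTimer class and its method names are hypothetical, and phases are assumed to run one after another rather than nested, since nested phases would be counted twice.

```python
import time
from contextlib import contextmanager


class PhaseTimer:
    """Accumulate wall-clock time per named phase and report proportions."""

    def __init__(self):
        self.totals = {}

    @contextmanager
    def phase(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Add the elapsed time even if the phase body raised.
            elapsed = time.perf_counter() - start
            self.totals[name] = self.totals.get(name, 0.0) + elapsed

    def proportions(self):
        """Return each phase's share of the total measured time."""
        total = sum(self.totals.values())
        if total == 0:
            return {name: 0.0 for name in self.totals}
        return {name: spent / total for name, spent in self.totals.items()}
```

Each step of the load would be wrapped in a block such as `with timer.phase("Actual Parsing"):`, and `timer.proportions()` read once the document is open.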

Data

The data is in File:Xml-load-compare.ods.
