XML Load

ODF documents are zipped archives of XML files plus auxiliary data such as pictures. Reading and parsing the XML accounts for a significant share of the time spent opening a document. Below is an analysis of XML processing.

We use a suite of 60 performance-related test documents. The content.xml of each document has been extracted for the XML-only tests.
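
As an illustration of the extraction step, here is a minimal sketch that pulls content.xml out of a package held in memory. It uses Python's standard zipfile module rather than the tooling actually used for the suite, and make_sample_package is a hypothetical helper that builds a tiny stand-in archive, not a valid ODF document.

```python
import io
import zipfile


def extract_content_xml(odf_bytes: bytes) -> bytes:
    """Return the raw bytes of content.xml from an ODF package held in memory."""
    with zipfile.ZipFile(io.BytesIO(odf_bytes)) as package:
        return package.read("content.xml")


def make_sample_package() -> bytes:
    """Build a tiny stand-in package (hypothetical, not a valid ODF document)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("mimetype", "application/vnd.oasis.opendocument.text")
        package.writestr("content.xml", "<doc><p>Hello</p></doc>")
    return buffer.getvalue()


if __name__ == "__main__":
    print(extract_content_xml(make_sample_package()).decode("utf-8"))
```

For a real document, the bytes would come from reading the .odt or .ods file from disk instead of from the helper.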

Niklas and Florian have prototyped a test component (FastXML) that tokenizes XML tags and passes the tokens around instead of strings. This avoids string allocations and provides a speedup.
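
The tokenizing idea can be sketched roughly as follows. This is not the FastXML code: the token table, the UNKNOWN value and the tokenize function are assumptions made for illustration, and in Python the parser still hands each name over as a string, so the sketch shows the lookup scheme only, not the allocation savings.

```python
import xml.parsers.expat

# Hypothetical token table; a real component would derive it from the ODF schema.
TOKENS = {"doc": 1, "p": 2, "span": 3}
UNKNOWN = 0  # token for element names missing from the table


def tokenize(xml_bytes: bytes) -> list:
    """Return the integer tokens of all start tags, in document order."""
    tokens = []

    def on_start(name, attrs):
        # Look the name up once; downstream consumers compare integers only.
        tokens.append(TOKENS.get(name, UNKNOWN))

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.Parse(xml_bytes, True)
    return tokens


if __name__ == "__main__":
    print(tokenize(b"<doc><p>a<span>b</span></p></doc>"))
```

Consumers of the token stream can then dispatch on small integers rather than comparing tag names.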

Comparisons

Compare OpenOffice TestXML and TestFastXML on the sample documents
Compare different XML parsers & APIs in terms of processing content.xml
Compare time spent in XML parsing, building document model, and rendering

Methodology

Time is measured in the same way for each test, based on Time::GetSystemTicks.
File handling and parsing are done as similarly as possible, within the allowances of API differences.
Only C and C++ parsers are considered; Java-based parsers and wrappers are excluded.
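
The "same measurement for every test" rule might look like the sketch below. It uses time.perf_counter as a stand-in for Time::GetSystemTicks. The time_parse and expat_parse names, the repeat count and the choice to report the best run are all assumptions of this sketch, not something the tests above are known to do.

```python
import time
import xml.parsers.expat


def time_parse(parse, xml_bytes: bytes, repeats: int = 5) -> float:
    """Return the best wall-clock time in seconds over `repeats` calls of parse(xml_bytes)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        parse(xml_bytes)
        best = min(best, time.perf_counter() - start)
    return best


def expat_parse(xml_bytes: bytes) -> None:
    """Parse with Expat and discard all events, using a fresh parser per call."""
    parser = xml.parsers.expat.ParserCreate()
    parser.Parse(xml_bytes, True)


if __name__ == "__main__":
    sample = b"<doc>" + b"<p>x</p>" * 10000 + b"</doc>"
    print(f"expat: {time_parse(expat_parse, sample):.6f} s")
```

Any other parser can be compared by wrapping it in a function with the same signature as expat_parse and passing it to the same harness.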

Results

FastXML provides a good speedup across the test suite.
Expat is the fastest of the parsers compared.

Ongoing Work

Performance counters to measure the proportion of time spent in:
1. Container file uncompress & open
2. XML Parser setup
3. Actual Parsing
4. String Allocation
5. Building Doc Model
6. Rendering

The code has been partially instrumented; the runs have not been done yet.
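
A per-phase counter of this kind could be sketched as below. This is not the instrumentation mentioned above: PhaseTimer, load and the phase names are hypothetical, time.perf_counter stands in for the real tick source, and only the first three phases (container, parser setup, parsing) are covered.

```python
import io
import time
import xml.parsers.expat
import zipfile
from contextlib import contextmanager


class PhaseTimer:
    """Accumulates wall-clock seconds per named phase."""

    def __init__(self):
        self.seconds = {}

    @contextmanager
    def phase(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed = time.perf_counter() - start
            self.seconds[name] = self.seconds.get(name, 0.0) + elapsed

    def proportions(self):
        """Return each phase's share of the total measured time."""
        total = sum(self.seconds.values())
        return {name: (value / total if total else 0.0)
                for name, value in self.seconds.items()}


def load(odf_bytes, timer):
    """Open the package and parse content.xml, timing each phase separately."""
    with timer.phase("container"):
        with zipfile.ZipFile(io.BytesIO(odf_bytes)) as package:
            xml_bytes = package.read("content.xml")
    with timer.phase("parser setup"):
        parser = xml.parsers.expat.ParserCreate()
        elements = []
        parser.StartElementHandler = lambda name, attrs: elements.append(name)
    with timer.phase("parsing"):
        # Note: this phase also includes the time spent in the handler callback.
        parser.Parse(xml_bytes, True)
    return elements
```

Calling load with the bytes of a package and then timer.proportions() gives the share of time per phase. The remaining phases (string allocation, document model, rendering) would need hooks inside the application itself.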
