WriterFilter

From Apache OpenOffice Wiki
Revision as of 15:47, 24 September 2009 by Hbrinkm (Talk | contribs)

Installing WriterFilter

Prerequisites

The current CWS (child workspace) for WriterFilter is writerfilter07.

Changing Filter Detection

For DOCX, the new filter is the only one available and is therefore the default.

If you want to use the new filter for WW8 as well, do the following: in <SOINST>/share/registry/modules/org/openoffice/TypeDetection/Filter/fcfg_writer_filters.xcu, add a node with the attribute oor:name="MS Word 97":

    <node oor:name="MS Word 97" oor:op="replace">
        <prop oor:name="Flags"><value>IMPORT EXPORT ALIEN 3RDPARTYFILTER</value></prop>
        <prop oor:name="UIComponent"/>
        <prop oor:name="FilterService"><value>com.sun.star.comp.Writer.WriterFilter</value></prop>
        <prop oor:name="UserData"><value>CWW8</value></prop>
        <prop oor:name="UIName">
            <value xml:lang="x-default">Microsoft Word 97/2000/XP (new)</value>
        </prop>
        <prop oor:name="FileFormatVersion"><value>0</value></prop>
        <prop oor:name="Type"><value>writer_MS_Word_97</value></prop>
        <prop oor:name="TemplateName"/>
        <prop oor:name="DocumentService"><value>com.sun.star.text.TextDocument</value></prop>
    </node>

Remember to grant yourself write access to the .xcu file before editing it.

Now, if you load a file recognized as "Word 97", WriterFilter will be used.

Documentation

  • Documentation generated by Doxygen (will be available soon)

Debug tokenizers

You will need an installation of OpenOffice.org with the libraries from writerfilter delivered into the $SOINST/program directory.

Dump WW8 tokens

$SOINST/program/uno -l libwriterfilter.uno.so -c debugservices.doctok.ScannerTestService -- file://...

Dump WordprocessingML tokens

$SOINST/program/uno -l libwriterfilter.uno.so -c debugservices.ooxml.ScannerTestService -- file://...

Use TagLogger

The class writerfilter::TagLogger is defined in writerfilter/inc/resourcemodel/TagLogger.hxx.

It has a SAX-like interface.
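
To illustrate what a SAX-like interface means here, the following is a minimal sketch of a stand-in class. MiniTagLogger is hypothetical and is not writerfilter::TagLogger; it only mirrors the call names used on this page (startElement, attribute, chars, endElement) and builds the XML text in memory. The str() accessor and the escaping rules are assumptions made for the example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in, NOT writerfilter::TagLogger. It mirrors the call
// names shown on this page to demonstrate the event-style interface.
class MiniTagLogger {
public:
    // Open a new element; it becomes a child of the current element.
    void startElement(const std::string& name) {
        closeStartTag();
        out_ += "<" + name;
        open_.push_back(name);
        tagPending_ = true;
    }

    // Add an attribute; only valid directly after startElement().
    void attribute(const std::string& name, const std::string& value) {
        assert(tagPending_ && "attribute() must follow startElement()");
        out_ += " " + name + "=\"" + escape(value) + "\"";
    }

    // Add text to the current element.
    void chars(const std::string& text) {
        assert(!open_.empty());
        closeStartTag();
        out_ += escape(text);
    }

    // Close the most recently opened element.
    void endElement() {
        assert(!open_.empty());
        closeStartTag();
        out_ += "</" + open_.back() + ">";
        open_.pop_back();
    }

    // Assumed accessor for the example: the XML collected so far.
    const std::string& str() const { return out_; }

private:
    // Finish a start tag that is still waiting for attributes.
    void closeStartTag() {
        if (tagPending_) {
            out_ += ">";
            tagPending_ = false;
        }
    }

    // Replace the characters that are special in XML.
    static std::string escape(const std::string& s) {
        std::string r;
        for (char c : s) {
            switch (c) {
                case '&': r += "&amp;"; break;
                case '<': r += "&lt;"; break;
                case '>': r += "&gt;"; break;
                case '"': r += "&quot;"; break;
                default: r += c;
            }
        }
        return r;
    }

    std::string out_;
    std::vector<std::string> open_;  // stack of open element names
    bool tagPending_ = false;        // start tag not yet closed with '>'
};
```

The caller never builds a tree: it reports events in document order, and the logger keeps a stack of open elements so that endElement() needs no argument.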

Usage:

TagLogger::Pointer_t pLogger(TagLogger::getInstance(<name>)); // get the instance registered under <name>
pLogger->startDocument(); // start the document
pLogger->startElement(<element>); // start a new element
pLogger->attribute(<attribute>,<value>); // add an attribute to the current element
// add more attributes...
pLogger->chars(<string>); // add text to the current element
// add more text...
// add more nested elements...
pLogger->endElement(); // end the current element
// add more elements...
pLogger->endDocument(); // end the document
TagLogger::dump(<name>); // dump the TagLogger's content

The output will be written to

$TAGLOGGERTMP/writerfilter.<name>.xml

If $TAGLOGGERTMP is not set, /tmp will be used instead.
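
The path rule above can be sketched as a small helper. This is an illustration only: dumpPathIn and tagLoggerDumpPath are hypothetical names and are not part of writerfilter, and the sketch assumes that only an unset variable (not an empty one) triggers the /tmp fallback.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper: builds the dump path for a logger name, given the
// value of TAGLOGGERTMP (nullptr means the variable is not set).
std::string dumpPathIn(const char* dir, const std::string& name) {
    assert(!name.empty());
    const std::string base = (dir != nullptr) ? dir : "/tmp";
    return base + "/writerfilter." + name + ".xml";
}

// Hypothetical wrapper that reads the environment of the current process.
std::string tagLoggerDumpPath(const std::string& name) {
    return dumpPathIn(std::getenv("TAGLOGGERTMP"), name);
}
```

For a logger named "example" (a placeholder name) and no TAGLOGGERTMP in the environment, this yields /tmp/writerfilter.example.xml.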

TagLogger is already used in writerfilter/source/{filter,ooxml,dmapper}. To activate logging, set the following defines at compile time. See writerfilter/inc/resourcemodel/WW8ResourceModel.hxx for what attributes, properties, streams and resolving mean in this context.

  • DEBUG_ELEMENT
Logging of elements in ooxml. Must also be defined whenever DEBUG_DOMAINMAPPER is set.
  • DEBUG_CONTEXT_STACK
Logging of the context stack in ooxml
  • DEBUG_ATTRIBUTES
Logging of attributes in ooxml
  • DEBUG_PROPERTIES
Logging of properties in ooxml
  • DEBUG_RESOLVE
Logging of resolving references in ooxml
  • DEBUG_MEMORY
Logging of allocation and deallocation of memory
  • DEBUG_STREAM
Logging of stream content
  • DEBUG_TOKEN
Logging of token related events
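
The defines listed above are preprocessor symbols, so they are typically supplied at compile time, for example as -DDEBUG_ELEMENT with GCC. How they are passed in the OpenOffice.org build is not covered here. The following sketch shows how such symbols can gate code and how the stated dependency between DEBUG_DOMAINMAPPER and DEBUG_ELEMENT could be checked at compile time. The function activeDebugDefines is hypothetical and does not exist in writerfilter.

```cpp
#include <string>
#include <vector>

// Dependency stated above: DEBUG_DOMAINMAPPER needs DEBUG_ELEMENT as well.
// This check is an illustration, not taken from the writerfilter sources.
#if defined(DEBUG_DOMAINMAPPER) && !defined(DEBUG_ELEMENT)
#error "DEBUG_DOMAINMAPPER requires DEBUG_ELEMENT to be defined as well"
#endif

// Hypothetical helper: reports which of the logging defines are active
// in this translation unit.
std::vector<std::string> activeDebugDefines() {
    std::vector<std::string> active;
#ifdef DEBUG_ELEMENT
    active.push_back("DEBUG_ELEMENT");
#endif
#ifdef DEBUG_CONTEXT_STACK
    active.push_back("DEBUG_CONTEXT_STACK");
#endif
#ifdef DEBUG_ATTRIBUTES
    active.push_back("DEBUG_ATTRIBUTES");
#endif
#ifdef DEBUG_PROPERTIES
    active.push_back("DEBUG_PROPERTIES");
#endif
#ifdef DEBUG_RESOLVE
    active.push_back("DEBUG_RESOLVE");
#endif
#ifdef DEBUG_MEMORY
    active.push_back("DEBUG_MEMORY");
#endif
#ifdef DEBUG_STREAM
    active.push_back("DEBUG_STREAM");
#endif
#ifdef DEBUG_TOKEN
    active.push_back("DEBUG_TOKEN");
#endif
    return active;
}
```

Compiled without any of the defines, the function returns an empty list. Each define that is set adds its name, which makes it easy to verify that a build really picked up the intended flags.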

Notes

The properties of sprm 0x6a03 (sprmCPicLocation) have to be resolved. Otherwise, the FC of the picture that follows is not stored.
