Difference between revisions of "WriterFilter"

From Apache OpenOffice Wiki

Revision as of 15:42, 20 January 2009

Download Install Sets for Current Milestone

We provide downloads for the current milestone (Solaris SPARC, Linux and Windows) here:

http://ooo.services.openoffice.org/pub/OpenOffice.org/cws/upload/writerfilter2/

Getting Current Source Code for WriterFilter

WriterFilter is currently not included in the build system. You have to check it out manually:

cvs -d:pserver:anoncvs@anoncvs.services.openoffice.org:/cvs co sw/writerfilter 

Installing WriterFilter

Prerequisites

The current CWS for WriterFilter is xmlfilter02; WriterFilter was previously in writerfilter2. See EIS for details on the CWS.

You will need an installed StarOffice/OOo and the following projects:

  • comphelper
  • filter
  • jurt
  • offapi
  • offuh
  • scp2
  • svtools
  • svx
  • sw

Compile these projects and deliver the libraries to your installation in <SOINST>.

Changing Filter Detection

In <SOINST>/share/registry/modules/org/openoffice/TypeDetection/Filter/fcfg_writer_filters.xcu, add a node with the attribute oor:name="MS Word 97":

    <node oor:name="MS Word 97" oor:op="replace">
        <prop oor:name="Flags"><value>IMPORT EXPORT ALIEN 3RDPARTYFILTER</value></prop>
        <prop oor:name="UIComponent"/>
        <prop oor:name="FilterService"><value>com.sun.star.comp.Writer.WriterFilter</value></prop>
        <prop oor:name="UserData"><value>CWW8</value></prop>
        <prop oor:name="UIName">
            <value xml:lang="x-default">Microsoft Word 97/2000/XP (new)</value>
        </prop>
        <prop oor:name="FileFormatVersion"><value>0</value></prop>
        <prop oor:name="Type"><value>writer_MS_Word_97</value></prop>
        <prop oor:name="TemplateName"/>
        <prop oor:name="DocumentService"><value>com.sun.star.text.TextDocument</value></prop>
    </node>

Remember to grant yourself write access to the xcu.

Create Resource Database for New Interfaces

Note: This is only necessary if you do not use CWS xmlfilter02.

Generate the resource database for the new interfaces. In offapi/<platform>/ucr/com/sun/star/text, call:

regmerge /tmp/merge.rdb -UCR XTextAppend.urd

Register New Interfaces

Note: Again, this is only necessary if you do not use CWS xmlfilter02.

In <SOINST>/program call

./unopkg add /tmp/merge.rdb

Register filter component

In <SOINST>/program call

chmod u+w services.rdb
regcomp -register -r services.rdb -c <writerfilter lib>

Now, if you load a file recognized as "Word 97", WriterFilter will be used.

Documentation

  • Documentation generated by Doxygen (will be available soon)

Debug tokenizers

You will need an installation of OpenOffice.org with the libraries from writerfilter delivered into the $SOINST/program directory.

Dump WW8 tokens

$SOINST/program/uno -l libwriterfilter.uno.so -c debugservices.doctok.ScannerTestService -- file://...

Dump WordprocessingML tokens

$SOINST/program/uno -l libwriterfilter.uno.so -c debugservices.ooxml.ScannerTestService -- file://...

Use TagLogger

The class writerfilter::TagLogger is defined in writerfilter/inc/resourcemodel/TagLogger.hxx.

It has a SAX-like interface.

Usage:

TagLogger::Pointer_t pLogger(TagLogger::getInstance(<name>)); // get the logger instance for <name>
pLogger->startDocument(); // start the document
pLogger->startElement(<element>); // start a new element
pLogger->attribute(<attribute>,<value>); // add an attribute to the current element
// add more attributes...
pLogger->chars(<string>); // add text to the current element
// add more text...
// add more nested elements...
pLogger->endElement(); // end the current element
// add more elements...
pLogger->endDocument(); // end the document
TagLogger::dump(<name>); // dump the TagLogger's content
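
To make the SAX-like call order more concrete, here is a minimal, self-contained sketch of how such an interface can turn the calls into nested XML. MiniTagLogger is a hypothetical stand-in written for illustration only; it is not writerfilter::TagLogger, and its escaping rules, its str() accessor, and the omission of startDocument/endDocument and dump are assumptions of this sketch.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for illustration; NOT the real writerfilter::TagLogger.
// It only mirrors the startElement/attribute/chars/endElement call pattern.
class MiniTagLogger {
public:
    // Open a new element; any pending start tag of the parent is closed first.
    void startElement(const std::string& name) {
        closeStartTag();
        out_ += "<" + name;
        open_.push_back(name);
        inStartTag_ = true;
    }

    // Attributes are only accepted while the start tag is still open,
    // i.e. before any text or child element has been added.
    void attribute(const std::string& name, const std::string& value) {
        if (inStartTag_)
            out_ += " " + name + "=\"" + escape(value) + "\"";
    }

    // Add escaped text to the current element.
    void chars(const std::string& text) {
        closeStartTag();
        out_ += escape(text);
    }

    // Close the most recently opened element; ignored if none is open.
    void endElement() {
        if (open_.empty())
            return;
        closeStartTag();
        out_ += "</" + open_.back() + ">";
        open_.pop_back();
    }

    // Assumption of this sketch: the result is kept in memory as a string.
    const std::string& str() const { return out_; }

private:
    void closeStartTag() {
        if (inStartTag_) {
            out_ += ">";
            inStartTag_ = false;
        }
    }

    static std::string escape(const std::string& s) {
        std::string r;
        for (char c : s) {
            switch (c) {
                case '&': r += "&amp;"; break;
                case '<': r += "&lt;"; break;
                case '>': r += "&gt;"; break;
                case '"': r += "&quot;"; break;
                default:  r += c;
            }
        }
        return r;
    }

    std::string out_;
    std::vector<std::string> open_;  // stack of currently open element names
    bool inStartTag_ = false;        // true while "<name" has no closing ">"
};
```

The point of the sketch is the ordering constraint: attributes belong directly after startElement, and each endElement closes the innermost open element. The stand-in only builds a string in memory and writes no file.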

TagLogger::dump(<name>) writes its output to

$TAGLOGGERTMP/writerfilter.<name>.xml

If $TAGLOGGERTMP is not set, /tmp will be used instead.
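
The path rule above can be sketched as a small helper, which may be handy when a test script needs to locate the dump file. tagLoggerOutputPath is a hypothetical function, not part of writerfilter, and treating only an unset variable (not an empty one) as the fallback case is an assumption of this sketch.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper, not part of writerfilter: builds the file name that
// the text above describes, $TAGLOGGERTMP/writerfilter.<name>.xml.
std::string tagLoggerOutputPath(const std::string& name) {
    // std::getenv returns a null pointer when the variable is not set.
    const char* dir = std::getenv("TAGLOGGERTMP");

    // Fall back to /tmp if TAGLOGGERTMP is not set (as stated in the text).
    const std::string base = (dir != nullptr) ? dir : "/tmp";

    return base + "/writerfilter." + name + ".xml";
}
```

For a logger obtained under a placeholder name such as "example", the helper yields /tmp/writerfilter.example.xml when TAGLOGGERTMP is not set.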

TagLogger is already used in writerfilter/source/{filter,ooxml,dmapper}. To activate logging, set one or more of the following defines when compiling. Refer to writerfilter/inc/resourcemodel/WW8ResourceModel.hxx for more insight into what attributes, properties, streams and resolving mean in this context.

  • DEBUG_ELEMENT
    Logging of elements in ooxml. Must also be defined when DEBUG_DOMAINMAPPER is set.
  • DEBUG_CONTEXT_STACK
    Logging of the context stack in ooxml
  • DEBUG_ATTRIBUTES
    Logging of attributes in ooxml
  • DEBUG_PROPERTIES
    Logging of properties in ooxml
  • DEBUG_RESOLVE
    Logging of the resolving of references in ooxml
  • DEBUG_MEMORY
    Logging of allocation and deallocation of memory
  • DEBUG_STREAM
    Logging of stream content
  • DEBUG_TOKEN
    Logging of token-related events
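
As an illustration of how compile-time defines like these are typically consumed, the following sketch guards against the one dependency stated in the list (DEBUG_DOMAINMAPPER needs DEBUG_ELEMENT) and reports which of the listed defines are active in the current translation unit. enabledTagLoggerDefines is a hypothetical helper and is not taken from the writerfilter sources; how writerfilter itself checks these defines is not shown here.

```cpp
#include <string>
#include <vector>

// The list above states that DEBUG_ELEMENT has to be provided when
// DEBUG_DOMAINMAPPER is set; this guard turns a violation into a build error.
#if defined(DEBUG_DOMAINMAPPER) && !defined(DEBUG_ELEMENT)
#error "DEBUG_DOMAINMAPPER requires DEBUG_ELEMENT to be defined as well"
#endif

// Hypothetical helper (not part of writerfilter): returns the names of the
// logging defines that were set when this file was compiled.
std::vector<std::string> enabledTagLoggerDefines() {
    std::vector<std::string> enabled;
#ifdef DEBUG_ELEMENT
    enabled.push_back("DEBUG_ELEMENT");
#endif
#ifdef DEBUG_CONTEXT_STACK
    enabled.push_back("DEBUG_CONTEXT_STACK");
#endif
#ifdef DEBUG_ATTRIBUTES
    enabled.push_back("DEBUG_ATTRIBUTES");
#endif
#ifdef DEBUG_PROPERTIES
    enabled.push_back("DEBUG_PROPERTIES");
#endif
#ifdef DEBUG_RESOLVE
    enabled.push_back("DEBUG_RESOLVE");
#endif
#ifdef DEBUG_MEMORY
    enabled.push_back("DEBUG_MEMORY");
#endif
#ifdef DEBUG_STREAM
    enabled.push_back("DEBUG_STREAM");
#endif
#ifdef DEBUG_TOKEN
    enabled.push_back("DEBUG_TOKEN");
#endif
    return enabled;
}
```

With GCC or Clang, a define can usually be passed on the compiler command line as -DDEBUG_ELEMENT. How to pass it through the OpenOffice.org build system is not covered here.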

Notes

The properties of sprm 0x6a03 (sprmCPicLocation) have to be resolved. Otherwise the FC of the picture to come is not stored.
