Difference between revisions of "WriterFilter"

From Apache OpenOffice Wiki

Latest revision as of 15:49, 24 September 2009

Installing WriterFilter

Prerequisites

The current CWS (child workspace) for WriterFilter is writerfilter07.

Changing Filter Detection

For DOCX, the new filter is the only one available and is therefore the default.

To use the new filter for WW8 as well, add a node with the attribute oor:name="MS Word 97" to <SOINST>/share/registry/modules/org/openoffice/TypeDetection/Filter/fcfg_writer_filters.xcu:

    <node oor:name="MS Word 97" oor:op="replace">
        <prop oor:name="Flags"><value>IMPORT EXPORT ALIEN 3RDPARTYFILTER</value></prop>
        <prop oor:name="UIComponent"/>
        <prop oor:name="FilterService"><value>com.sun.star.comp.Writer.WriterFilter</value></prop>
        <prop oor:name="UserData"><value>CWW8</value></prop>
        <prop oor:name="UIName">
            <value xml:lang="x-default">Microsoft Word 97/2000/XP (new)</value>
        </prop>
        <prop oor:name="FileFormatVersion"><value>0</value></prop>
        <prop oor:name="Type"><value>writer_MS_Word_97</value></prop>
        <prop oor:name="TemplateName"/>
        <prop oor:name="DocumentService"><value>com.sun.star.text.TextDocument</value></prop>
    </node>

Remember to grant yourself write access to the .xcu file before editing it.

Now, if you load a file recognized as "Word 97", WriterFilter will be used.

Documentation

The directory writerfilter/documentation/doxygen contains the Doxyfile used to run doxygen.

Debug tokenizers

You will need an installation of OpenOffice.org with the libraries from writerfilter delivered into the $SOINST/program directory.

Dump WW8 tokens

$SOINST/program/uno -l libwriterfilter.uno.so -c debugservices.doctok.ScannerTestService -- file://...

Dump WordprocessingML tokens

$SOINST/program/uno -l libwriterfilter.uno.so -c debugservices.ooxml.ScannerTestService -- file://...

Use TagLogger

The class writerfilter::TagLogger is defined in writerfilter/inc/resourcemodel/TagLogger.hxx.

It has a SAX-like interface.

Usage:

TagLogger::Pointer_t pLogger(TagLogger::getInstance(<name>)); // get the logger instance for <name>
pLogger->startDocument(); // start the document
pLogger->startElement(<element>); // start a new element
pLogger->attribute(<attribute>,<value>); // add an attribute to the current element
// add more attributes...
pLogger->chars(<string>); // add text to the current element
// add more text...
// add more nested elements...
pLogger->endElement(); // end the current element
// add more elements...
pLogger->endDocument(); // end the document
TagLogger::dump(<name>); // dump the TagLogger's content
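
To make the call sequence above concrete, the following is a minimal, self-contained sketch of a SAX-like logger. It is not the real writerfilter::TagLogger: the class name SketchLogger, its str() method, and the exact XML it produces are assumptions made for illustration only. It shows how startElement, attribute, chars, and endElement calls nest into an XML fragment.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a SAX-like logger. NOT the real TagLogger;
// names and output format are assumptions for illustration.
class SketchLogger {
public:
    // Open a new element; it becomes the current element.
    void startElement(const std::string& name) {
        closePendingTag();
        out_ << '<' << name;
        open_.push_back(name);
        tagOpen_ = true;
    }

    // Add an attribute to the current element. Only has an effect directly
    // after startElement, before any text or child element is added.
    void attribute(const std::string& name, const std::string& value) {
        if (tagOpen_) out_ << ' ' << name << "=\"" << escape(value) << '"';
    }

    // Add text to the current element.
    void chars(const std::string& text) {
        closePendingTag();
        out_ << escape(text);
    }

    // Close the current element; does nothing if no element is open.
    void endElement() {
        if (open_.empty()) return;
        closePendingTag();
        out_ << "</" << open_.back() << '>';
        open_.pop_back();
    }

    // Return everything logged so far as a string.
    std::string str() const { return out_.str(); }

private:
    // Finish a start tag that is still waiting for attributes.
    void closePendingTag() {
        if (tagOpen_) { out_ << '>'; tagOpen_ = false; }
    }

    // Escape the characters that are special in XML text and attributes.
    static std::string escape(const std::string& s) {
        std::string r;
        for (char c : s) {
            switch (c) {
                case '&': r += "&amp;"; break;
                case '<': r += "&lt;"; break;
                case '>': r += "&gt;"; break;
                case '"': r += "&quot;"; break;
                default:  r += c;
            }
        }
        return r;
    }

    std::ostringstream out_;
    std::vector<std::string> open_;  // stack of currently open element names
    bool tagOpen_ = false;           // true while a start tag accepts attributes
};
```

Keeping a stack of open element names is what lets endElement() take no argument, which matches the shape of the calls in the usage listing.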

The output will be written to

$TAGLOGGERTMP/writerfilter.<name>.xml

If $TAGLOGGERTMP is not set, /tmp will be used instead.
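
The naming rule above can be sketched as a small helper. This is an illustration of the documented behaviour, not the code TagLogger actually uses; the function name tagLoggerOutputPath and its second parameter are hypothetical. The sketch treats only an unset variable as "not set"; how the real implementation handles an empty value is not stated here.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper (not part of writerfilter): builds the documented
// output path $TAGLOGGERTMP/writerfilter.<name>.xml, falling back to /tmp
// when the environment variable is not set.
// tmpDir defaults to the current value of TAGLOGGERTMP; passing it
// explicitly makes the rule easy to try out.
std::string tagLoggerOutputPath(const std::string& name,
                                const char* tmpDir = std::getenv("TAGLOGGERTMP")) {
    const std::string base = (tmpDir != nullptr) ? tmpDir : "/tmp";
    return base + "/writerfilter." + name + ".xml";
}
```

For example, tagLoggerOutputPath("ooxml", nullptr) yields /tmp/writerfilter.ooxml.xml.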

TagLogger is already used in writerfilter/source/{filter,ooxml,dmapper}. To activate it, set the following defines when compiling. (See writerfilter/inc/resourcemodel/WW8ResourceModel.hxx for what attributes, properties, streams, and resolving mean in this context.)

  • DEBUG_ELEMENT
    Logging of elements in ooxml. Must also be defined whenever DEBUG_DOMAINMAPPER is set.
  • DEBUG_CONTEXT_STACK
    Logging of the context stack in ooxml.
  • DEBUG_ATTRIBUTES
    Logging of attributes in ooxml.
  • DEBUG_PROPERTIES
    Logging of properties in ooxml.
  • DEBUG_RESOLVE
    Logging of reference resolution in ooxml.
  • DEBUG_MEMORY
    Logging of memory allocation and deallocation.
  • DEBUG_STREAM
    Logging of stream content.
  • DEBUG_TOKEN
    Logging of token-related events.
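
As a rough illustration of how such compile-time switches usually work, the sketch below guards a logging call with DEBUG_ELEMENT and enforces the dependency on DEBUG_DOMAINMAPPER described above. The names kElementLogging and logElement are hypothetical and do not come from the writerfilter sources; passing the define as a compiler flag such as -DDEBUG_ELEMENT is one common way to set it, and the way the OpenOffice.org build system passes defines may differ.

```cpp
#include <iostream>

// Documented rule: DEBUG_ELEMENT has to be set whenever DEBUG_DOMAINMAPPER is.
#if defined(DEBUG_DOMAINMAPPER) && !defined(DEBUG_ELEMENT)
#error "DEBUG_DOMAINMAPPER requires DEBUG_ELEMENT"
#endif

// Hypothetical flag, not part of writerfilter: true only when the code is
// compiled with DEBUG_ELEMENT defined.
#ifdef DEBUG_ELEMENT
constexpr bool kElementLogging = true;
#else
constexpr bool kElementLogging = false;
#endif

// Hypothetical helper: writes one line to the standard log stream when
// element logging is compiled in. Returns whether anything was logged.
bool logElement(const char* name) {
    if (!kElementLogging) return false;
    std::clog << "<element name=\"" << name << "\"/>\n";
    return true;
}
```

Without the define, logElement does nothing and returns false, so the logging calls can stay in the source at no runtime cost beyond the check.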

Notes

The properties of sprm 0x6a03 (sprmCPicLocation) have to be resolved. Otherwise, the FC of the picture that follows is not stored.
