Difference between revisions of "Documentation/DevGuide/OfficeDev/XML Filter Detection"
OOoWikiBot (Talk | contribs) m (FINAL VERSION FOR L10N) |
m (typos) |
Revision as of 21:11, 21 January 2010
The number of XML files that conform to differing DTD specifications means that a single filter and file type definition is insufficient to handle all of the possible formats available. In order to allow OpenOffice.org to handle multiple filter definitions and implementations, it is necessary to implement an additional filter detection module that is capable of determining the type of XML file being read, based on its DocType declaration.
To accomplish this, you can implement a filter detection service, com.sun.star.document.ExtendedTypeDetection, that handles and distinguishes between many different XML-based file formats. This type of service supersedes the basic flat detection, which uses only the file's suffix to determine the type, and instead carries out a deep detection, which uses the file's internal structure and content to detect its true type.
Requirements for Deep Detection
There are three requirements for implementing a deep detection module that is capable of identifying one or more unique XML types. These include:
- An extended type definition for describing the format in more detail (TypeDetection.xcu).
- A DetectService implementation.
- A DetectService definition (TypeDetection.xcu).
Extending the File Type Definition
Since many different XML files can conform to different DTDs, the type definition of a particular XML file needs to be extended. To do this, some or all of the DocType information can be contained as part of the file type definition. This information is held as part of the ClipboardFormat property of the type node. A unique namespace or prefix, such as doctype: in the sample definition, identifies the string at this point in the sequence as being a DocType declaration.
Sample Type definition:
<node oor:name="writer_DocBook_File" oor:op="replace">
    <prop oor:name="UIName">
        <value xml:lang="en-US">DocBook</value>
    </prop>
    <prop oor:name="Data">
        <value>
            0, , doctype:-//OASIS//DTD DocBook XML V4.1.2//EN, , XML, 20002,
        </value>
    </prop>
</node>
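To illustrate how the DocType string can be picked out of a Data value like the one in the sample, here is a minimal sketch in standard C++ rather than the UNO API. It assumes that the fields are separated by commas and that the ClipboardFormat field carries the doctype: prefix, as in the sample. The helper names trim and extractDocType are hypothetical and are not part of OpenOffice.org.

```cpp
#include <sstream>
#include <string>

// Remove leading and trailing whitespace from a field.
static std::string trim(const std::string& s)
{
    const char* ws = " \t\r\n";
    std::string::size_type b = s.find_first_not_of(ws);
    if (b == std::string::npos)
        return std::string();
    std::string::size_type e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Hypothetical helper: returns the text that follows the "doctype:" prefix
// in a comma-separated Data value, or an empty string if no field has it.
std::string extractDocType(const std::string& data)
{
    const std::string prefix("doctype:");
    std::istringstream fields(data);
    std::string field;
    while (std::getline(fields, field, ','))
    {
        field = trim(field);
        if (field.compare(0, prefix.size(), prefix) == 0)
            return field.substr(prefix.size());
    }
    return std::string();
}
```

Splitting on commas is only safe under the assumption that the DocType string itself contains no comma, which holds for the DocBook identifier in the sample.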
The ExtendedTypeDetection Service Implementation
In order for the type detection code to function as an ExtendedTypeDetection service, you must implement the detect() method as defined by the com.sun.star.document.XExtendedFilterDetection interface definition:
string detect( [inout]sequence<com::sun::star::beans::PropertyValue > Descriptor );
This method supplies you with a sequence of PropertyValues from which you can extract the current TypeName and the URL of the file being loaded:
::rtl::OUString SAL_CALL FilterDetect::detect(com::sun::star::uno::Sequence< com::sun::star::beans::PropertyValue >& aArguments ) throw (com::sun::star::uno::RuntimeException)
{
    const PropertyValue * pValue = aArguments.getConstArray();
    sal_Int32 nLength;
    ::rtl::OString resultString;
    ::rtl::OUString sTypeName; // type name passed in the descriptor
    ::rtl::OUString sUrl;      // location of the file being loaded
    nLength = aArguments.getLength();
    for (sal_Int32 i = 0; i < nLength; i++)
    {
        if (pValue[i].Name.equalsAsciiL(RTL_CONSTASCII_STRINGPARAM("TypeName")))
        {
            pValue[i].Value >>= sTypeName;
        }
        else if (pValue[i].Name.equalsAsciiL(RTL_CONSTASCII_STRINGPARAM("URL")))
        {
            pValue[i].Value >>= sUrl;
        }
    }
    // detect() continues in the fragments that follow
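The lookup by property name can also be tried outside of an office build. The following is a rough sketch in standard C++ that models the descriptor as a list of name and value strings. The Property typedef and the findProperty helper are hypothetical stand-ins for the UNO PropertyValue sequence, not the real API.

```cpp
#include <string>
#include <utility>
#include <vector>

// Stand-in for a UNO PropertyValue: a (name, value) pair of strings.
typedef std::pair<std::string, std::string> Property;

// Hypothetical helper: returns the value of the first property called
// `name`, or an empty string when the descriptor has no such property.
std::string findProperty(const std::vector<Property>& descriptor,
                         const std::string& name)
{
    for (const Property& p : descriptor)
    {
        if (p.first == name)
            return p.second;
    }
    return std::string();
}
```

In the real service the values are Any objects rather than strings, so each one has to be extracted with the >>= operator instead of being returned directly.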
Once you have the URL of the file, you can then use it to create a ::ucb::Content from which you can open an XInputStream to the file:
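// No interaction handler is needed, so an empty command environment is passed
Reference< com::sun::star::ucb::XCommandEnvironment > xEnv;
::ucb::Content aContent(sUrl, xEnv);
// Open the file for reading
Reference< com::sun::star::io::XInputStream > xInStream = aContent.openStream();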
You can now use this XInputStream to read the header of the file being loaded. Because the exact location of the DocType information within the file is not known, the first 1000 bytes of the file are read:
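// resultString was declared at the top of detect()
com::sun::star::uno::Sequence< sal_Int8 > aData;
// readBytes() returns the number of bytes actually read, which may be less than 1000
sal_Int32 bytesRead = xInStream->readBytes(aData, 1000);
resultString = ::rtl::OString(reinterpret_cast< const sal_Char * >(aData.getConstArray()), bytesRead);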
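The same bounded read can be sketched with the standard library, which may help when testing the detection logic outside of OpenOffice.org. This is only an illustration that uses std::istream in place of XInputStream. The function name readHeader is hypothetical, and its default limit of 1000 bytes mirrors the value used in the fragment.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads at most maxBytes from the stream and returns
// exactly the bytes that were available, so short files are handled too.
std::string readHeader(std::istream& in, std::size_t maxBytes = 1000)
{
    std::string buffer(maxBytes, '\0');
    in.read(&buffer[0], static_cast<std::streamsize>(maxBytes));
    // gcount() reports how many bytes the last read() actually delivered.
    buffer.resize(static_cast<std::size_t>(in.gcount()));
    return buffer;
}
```

As in the UNO fragment, the length of the result is taken from the number of bytes actually read rather than from the requested size.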
Once you have this information, you can start looking for a type that describes the file being loaded. In order to do this, you need to get a list of the types currently supported:
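// The TypeDetection service gives access to all registered types by name
Reference< XNameAccess > xTypeCont(mxMSF->createInstance(OUString::createFromAscii( "com.sun.star.document.TypeDetection" )), UNO_QUERY);
// Note the space after '<': without it, older compilers read "<:" as a digraph
Sequence< ::rtl::OUString > myTypes = xTypeCont->getElementNames();
nLength = myTypes.getLength();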
For each of these types, you must first determine whether the ClipboardFormat property contains a DocType:
Loc_of_ClipboardFormat = ...;
Sequence< ::rtl::OUString > ClipboardFormatSeq;
Type_Props[Loc_of_ClipboardFormat].Value >>= ClipboardFormatSeq;
// Check each entry of the sequence for the doctype: prefix
for (sal_Int32 j = 0; j < ClipboardFormatSeq.getLength(); j++)
{
    if (ClipboardFormatSeq[j].match(OUString::createFromAscii("doctype:")))
    {
        // if it contains a DocType, start to compare to header
    }
}
All of the possible DocType declarations of the file types can be checked to determine a match. If a match is found, the type corresponding to the match is returned. If no match is found, an empty string is returned. This will force OpenOffice.org into flat detection mode.
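The matching step described above can be sketched as follows in standard C++. This is an assumption about one reasonable way to do the comparison, namely a plain substring search of the header, and not necessarily how the shipped filter does it. The TypeEntry typedef and the detectType function are hypothetical names.

```cpp
#include <string>
#include <utility>
#include <vector>

// (type name, DocType string taken from the type's ClipboardFormat)
typedef std::pair<std::string, std::string> TypeEntry;

// Hypothetical helper: returns the name of the first type whose DocType
// string occurs in the header, or an empty string if none matches.
std::string detectType(const std::string& header,
                       const std::vector<TypeEntry>& types)
{
    for (const TypeEntry& entry : types)
    {
        // Types without a DocType cannot be matched by deep detection.
        if (!entry.second.empty() &&
            header.find(entry.second) != std::string::npos)
        {
            return entry.first;
        }
    }
    return std::string();
}
```

Returning an empty string for the no-match case corresponds to the fallback to flat detection described in the text.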
TypeDetection.xcu DetectServices Entry
Now that you have created the ExtendedTypeDetection service implementation, you need to tell OpenOffice.org when to use this service.
First create a DetectServices node, unless one already exists, and then add the information specific to the detection service that has been implemented, that is, the name of the service and the file types that use it.
<node oor:name="DetectServices">
    <node oor:name="com.sun.star.comp.filters.XMLDetect" oor:op="replace">
        <prop oor:name="ServiceName">
            <value xml:lang="en-US">com.sun.star.comp.filters.XMLDetect</value>
        </prop>
        <prop oor:name="Types">
            <value>writer_DocBook_File</value>
            <value>writer_Flat_XML_File</value>
        </prop>
    </node>
</node>
Content on this page is licensed under the Public Documentation License (PDL).