Difference between revisions of "Xml"
Latest revision as of 15:32, 1 February 2021
These are the Wiki pages of the Apache OpenOffice XML Project.
The following information can be found on these pages:
Current Efforts of the XML Project
- OpenDocument: Contributions to the formula specification that is developed by the OASIS OpenDocument Formula Subcommittee.
- OpenDocument: Contributions to the metadata enhancements for OpenDocument v1.2 that are developed by the OASIS OpenDocument Metadata Subcommittee.
- OpenDocument: Contributions to the OpenDocument v1.2 specification, including reviews of the specification documents.
- Performance improvements: The XML project participates in the Performance project.
Contribution Areas
The Apache OpenOffice XML project is looking for contributions in the areas listed below. If you are interested in working in one of these areas, please contact us at dev@openoffice.apache.org.
XML Based Filters
The following suggestions are related to the framework for XML based filters.
Detection of the file type
Every time a document is imported into OpenOffice, the so-called filter detection chooses among the existing filter implementations to find the one that matches the type of document to be loaded. For filters that load XML documents using the filter framework, the filter detection only searches for a string within the first 1000 characters of the document. This string can be specified in the XML Filter Settings dialog.
A better solution, which we would like to have instead, would be to check the name and/or XML namespace of the root node of the XML document.
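A minimal sketch of such a check, written in Python rather than in the filter framework's own code: it stops at the first start tag and compares the root element's local name and namespace. The function names are illustrative assumptions and not part of OpenOffice.

```python
import io
import xml.etree.ElementTree as ET


def root_name_and_namespace(stream):
    """Return (local_name, namespace) of the root element of an XML stream.

    Stops at the first start tag instead of building the whole tree.
    """
    for _event, elem in ET.iterparse(stream, events=("start",)):
        tag = elem.tag
        if tag.startswith("{"):
            # ElementTree writes qualified names as "{namespace}local".
            namespace, local_name = tag[1:].split("}", 1)
        else:
            namespace, local_name = None, tag
        return local_name, namespace
    raise ValueError("no root element found")


def matches_filter(stream, expected_name=None, expected_namespace=None):
    """Hypothetical detection rule: match on root name and/or namespace."""
    try:
        name, namespace = root_name_and_namespace(stream)
    except (ET.ParseError, ValueError):
        return False  # not well-formed XML, so this filter does not apply
    if expected_name is not None and name != expected_name:
        return False
    if expected_namespace is not None and namespace != expected_namespace:
        return False
    return True


if __name__ == "__main__":
    sample = b'<article xmlns="http://docbook.org/ns/docbook"><title>t</title></article>'
    print(matches_filter(io.BytesIO(sample), expected_name="article"))
```

Compared with a substring search in the first 1000 characters, this does not depend on how long the XML declaration, comments, or a DOCTYPE in front of the root element happen to be.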
Access to images
Currently, XML based filters have access only to the BASE64 encoded data of embedded images and objects. Access to the binary data as well (for instance by unpacking the zip file of an OpenDocument file), or a simple way to convert the BASE64 data into a binary file, would simplify the development of filters.
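Both ideas can be sketched with the Python standard library. This is only an illustration of the two conversions, not an OpenOffice API; the function names are hypothetical, and the `Pictures/` folder name is an assumption about where a package stores its images.

```python
import base64
import io
import zipfile


def decode_base64_image(encoded_text):
    """Turn the BASE64 text of an embedded image into raw bytes.

    Whitespace and line breaks inside the encoded text are removed first.
    """
    compact = "".join(encoded_text.split())
    return base64.b64decode(compact)


def read_package_images(odf_file, prefix="Pictures/"):
    """Return {member_name: bytes} for files stored under `prefix` in a zip package.

    `odf_file` may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(odf_file) as package:
        return {
            name: package.read(name)
            for name in package.namelist()
            if name.startswith(prefix) and not name.endswith("/")
        }
```

Writing the returned bytes to disk with `open(path, "wb")` would give a filter an ordinary binary image file to work with.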
Error logging
Currently, error logging is comparatively weak for the XML based / XSLT filters. The only way to enable logging is to set a Java system property (e.g. -DXSLTransformer.statsfile=/usr/local/offices/xslt_debug.txt) in the Office options for Java.
A few new features are imaginable, such as:
- Customizing the filter logging via GUI
- Use of defined log levels (analogous to Java Logging)
- GUI flag for a cumulative log file (instead of replacing the log for every transformation)
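How log levels and a cumulative log file could behave is sketched below, using Python's standard `logging` module as a stand-in for the filter's own logging. The logger name and the function are hypothetical and do not exist in OpenOffice.

```python
import logging


def make_filter_logger(log_path, level=logging.WARNING, cumulative=True):
    """Create a logger for (hypothetical) filter runs.

    cumulative=True appends to the existing log file; cumulative=False
    truncates it, which mirrors replacing the log on every transformation.
    """
    logger = logging.getLogger("xslt_filter")  # name is an assumption
    logger.setLevel(level)
    # Drop handlers from an earlier call so messages are not written twice.
    for old_handler in list(logger.handlers):
        logger.removeHandler(old_handler)
        old_handler.close()
    handler = logging.FileHandler(
        log_path, mode="a" if cumulative else "w", encoding="utf-8"
    )
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.propagate = False  # keep filter messages out of the root logger
    return logger
```

A GUI option for the log level and for the cumulative flag would then only need to map onto the two parameters `level` and `cumulative`.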
Validation
Currently, no XML validation takes place during the import or export of documents. Validation is only possible from a test dialog. This test dialog might be dropped in favor of external development tools and/or (optional) validation at runtime. Reusing the existing functionality is not sufficient for this, because it supports only DTDs and not RELAX NG or other XML schema languages such as Schematron.
OpenDocument Tools at OpenOffice
| Tools at OpenOffice | |
|---|---|
| Tool | Summary |
| SAXEcho | SAXEcho is a tool to be used with the OpenOffice application. It allows viewing the XML representation of a document at runtime. It can display either the native XML file format representation of an in-memory document or the output of an XSLT transformation on that document. It is written in Java. |
| xfilter tool | The xfilter tool lets you execute XML-based filters (as explained in the Filters using XML document) outside of Apache OpenOffice or StarOffice. |
External OpenDocument Tools
| External Tools | |
|---|---|
| Tool | Summary |
| The Perl OpenDocument Connector | This toolkit allows direct OpenDocument file update and creation from Perl scripts. It works with the main document classes (texts, spreadsheets, presentations, and drawings), and provides easy access to a large set of content and presentation elements. |
| OpenOffice Perl Library | The OpenOffice.org Perl Library (ooolib) can be used to create simple Apache OpenOffice Calc spreadsheets and Writer text documents. |
Articles and Books
| Articles and Books | |
|---|---|
| Document | Summary |
| OASIS OpenDocument Essentials | OASIS OpenDocument Essentials introduces you to the XML that serves as the native internal format of OpenOffice.org. You should read this book if you want to extract data from an OpenDocument document, convert your data to an OpenDocument document, or simply find out how OpenOffice.org stores its data under the hood. The book is written by J. David Eisenberg for O'Reilly & Associates; the content is currently licensed under a Creative Commons License. |
| Adventures with OpenOffice and XML | An article on XML.com, which shows how to use the XML File Format within the AxKit content management system. |
More articles and books on OpenDocument can be found at opendocument.xml.org.
Reports and Research Documents covering the OpenDocument format
| Reports and Research Documents | |
|---|---|
| Document | Summary |
| Enterprise Technical Reference Model - Version 3.6 | The Commonwealth of Massachusetts Information Technology Division's Enterprise Technical Reference Model (ETRM) provides an architectural framework used to identify the standards, specifications and technologies that support the Commonwealth's computing environment. |
| Open-source software in e-government | Analysis and recommendations drawn up by a working group under the Danish Board of Technology. |
More reports and research documents can be found on opendocument.xml.org.
Filters and Conversions based on OpenDocument/OpenOffice.org XML
| Filters and Conversions based on OpenDocument/OpenOffice.org XML | |
|---|---|
| Document | Summary |
| OpenOffice.org's DocBook filter | Allows you to load and save DocBook files with OpenOffice.org through XSLT. The filter is meant as proof of concept and does not support all DocBook elements. Community support is appreciated. |
| OpenOffice.org XML export to strict XHTML 1.0 | Svante Schubert's XSL transformation for the creation of strict XHTML 1.0 from OpenOffice.org XML. Used for the online version of the Developer's Guide. Delivered with StarOffice/OpenOffice.org as an optional sample filter. |
| libwpd | William Lachance's libwpd and WordPerfect filter for OpenOffice Writer |
| OpenSHORE SMRL Metaparser Filter Package | A JAR filter package that reads "Semantic Markup Rule Language" (SMRL) specifications and creates |
| RDF Ticker Template and Package | Daten RDF Ticker template and export using XSLT, by Wieser Informationstechnik |
Source Code Modules in the Apache OpenOffice XML Project
The following modules belong to the Apache OpenOffice XML project:
| Modules in XML Project | ||
|---|---|---|
| Module | Function | Browse Source |
| xmloff | The module xmloff contains the document type definitions (DTDs) for the Apache OpenOffice XML based file format. It also contains most of the C++ code to read and write these files and OpenDocument files through the SAX interface. Some additional code exists within the application modules. | source |
| sax | The module sax contains XML parser and XML writer components, both based on the SAX 1 interface. The parser itself uses James Clark's expat. | source |
| package | The package module contains the Zip file access API implementation, the "generic" package API implementation and support for the XML Manifest file. | source |
| xmerge | The xmerge module contains the sources for the "Document Editing on Small Devices" project. This project is documented here. | source |
| filtertools | The filtertools module contains a tool for running XML-based filters outside of OpenOffice.org. The tool is documented here. | source |