
From Apache OpenOffice Wiki

Current Implementation of the OpenOffice.org Bibliographic Component

In this article I explain the citation facilities of OpenOffice.org, examine the APIs available to the programmer for manipulating citation data, and show how these API calls map to real C++ classes in the OpenOffice.org source code. Once you have a grasp of this, it should be possible to extend this understanding to fields similar to the bibliographic fields, and indeed to many other areas of the OpenOffice.org C++ code.


OpenOffice.org can be used to author many types of documents, and has many invaluable features to help the document author. For those authoring documents which cite other documents, either printed or electronic, there are several features in OpenOffice.org which are indispensable.


Citations may be made in the body of the document as parenthesized citations, in footnotes at the bottom of pages, or in endnotes at the end of the document. They are generally also listed in a works cited page or section, also called the bibliography index, source list or list of references. In OpenOffice.org the word bibliography is used instead of citation. To avoid confusion I will also refer to citations as bibliography.

How are bibliographic references used?

When a user wants to cite another document, it is a good idea for them to add this citation to their bibliographic database which is built into OpenOffice.org. Once the information is in this database a reference to this data can be inserted into the document very easily and with a known format by using "Insert" -> "Indexes and Tables" -> "Bibliographic Entry". The resulting dialog allows the user to insert a bibliographic reference or to add a new reference into the database. The final interesting operation is for the user to add the bibliographic reference list. This is done with "Insert" -> "Indexes and Tables" -> "Indexes and Tables".


So what's wrong?

While the bibliographic features currently available in OpenOffice.org are quite useful, they also have some serious limitations. For example, many scientific journals expect a specific format for the bibliographic entry and the bibliographic reference list. However, the formatting possibilities are quite limited and often cannot produce the format a publication requires, so many authors have to maintain their citations and reference lists manually.


So how is the bibliographic component being improved?

The OASIS OpenDocument Technical Committee, which established the Open Document Format standard, has accepted that one of the first modifications required is a change to the citation coding. The changes were proposed by Daniel Vogelheim of Sun and Bruce D'Arcus, who has a lot of experience in the bibliographic world and has written many bibliographic tools. These changes will allow much more powerful formatting of the bibliographic and citation information, and will let the user either choose one of a number of predefined formats acceptable to some of the major publications or customize the formatting to suit their own needs.


So how is this relevant for developers?

Obviously any change to the file format must mean changes to the code - at the very least to be able to read the new format into current structures and to be able to output this new format to disk also.

However the new citation and bibliographic model also needs changes to the internal data structures of OpenOffice.org so that the user can interact with that data in a reasonable way.


For the rest of this article I will explain the current bibliographic implementation in OpenOffice.org. I have learned about this on my journey to understand what needs to be changed to support the new citation and bibliographic model.


However even though the rest of this article talks almost exclusively about the bibliographic programming model many of the concepts will also work for the other fields available to the user (and indeed to the C++ code behind the OpenOffice.org APIs).


Brief introduction to the UNO model

There are many components within OpenOffice.org and to be able to develop these components in a reasonably independent way the interfaces between components (and indeed inside components) are described in a language called IDL (Interface Definition Language). This definition is used to generate code and to provide a well defined, language independent definition of objects and interfaces available.

IDL objects and interfaces are not just for the OOo developer; power users can also use them from languages such as StarBasic. The IDL language supports object inheritance and containment-like relationships between objects, as well as the definition of attributes and actions. An object defined by this language is usually referred to as a UNO service.

Generally UNO services are defined in terms of interfaces and properties (i.e. attributes). An interface is normally a collection of methods and properties which make sense grouped together e.g. an interface for manipulating the cursor, or an interface to manipulate a text field. Interfaces are always easy to recognise in IDL from their names as they always start with an "X". Interfaces are just one facet of a service and cannot exist independently of a service.

The objects actually instantiated are services, and in order to work on an interface of a service we must use a UNO mechanism to get that interface. You will shortly see how that looks both in terms of the object model and the C++ code.
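
To illustrate the idea that the service is the instantiated object and an interface is a facet you ask it for, here is a small standalone C++ sketch. It does not use the UNO runtime at all: the names IInterface, IPropertySet, ICursor, ToyFieldMaster and queryFacet are invented for this example, and plain dynamic_pointer_cast stands in for the real UNO query mechanism.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical root type, standing in for UNO's XInterface.
struct IInterface {
    virtual ~IInterface() = default;
};

// Hypothetical interface: a group of related methods (compare XPropertySet).
struct IPropertySet : virtual IInterface {
    virtual void setProperty(const std::string& name, const std::string& value) = 0;
    virtual std::string getProperty(const std::string& name) const = 0;
};

// Hypothetical second interface that the toy service below does NOT implement.
struct ICursor : virtual IInterface {
    virtual void moveRight() = 0;
};

// Hypothetical "service": the concrete object that is actually instantiated.
class ToyFieldMaster : public IPropertySet {
public:
    void setProperty(const std::string& name, const std::string& value) override {
        props_[name] = value;
    }
    std::string getProperty(const std::string& name) const override {
        auto it = props_.find(name);
        return it == props_.end() ? std::string() : it->second;
    }
private:
    std::map<std::string, std::string> props_;
};

// Ask an object for one of its interfaces; an empty pointer means "not supported".
template <typename Facet>
std::shared_ptr<Facet> queryFacet(const std::shared_ptr<IInterface>& object) {
    assert(object && "query needs a valid object");
    return std::dynamic_pointer_cast<Facet>(object);
}
```

In this toy, queryFacet<IPropertySet>( obj ) on a ToyFieldMaster gives a usable pointer, while queryFacet<ICursor>( obj ) gives an empty one, so the caller always has to check the result before using it.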

The bibliographic UNO model

All of the objects for the bibliographic model are fully described in the IDL reference at http://api.openoffice.org -> "IDL reference" under com.sun.star.text. Note that I am using "Text." as a shorthand notation for the full IDL notation "com.sun.star.text". The main UNO services we will be talking about are Text.TextField.Bibliography and Text.FieldMaster.Bibliography.

The Text.TextField.Bibliography object contains the individual citation data and is associated with a particular point in the document.

The Text.FieldMaster.Bibliography object contains the general settings which are used to generate the bibliographic reference list.

The inheritance and containment relationships relevant to the bibliographic services are shown in the following figure:

[Image: bibliographic services]


You can see for example that the Text.FieldMaster.Bibliography service inherits from the Text.FieldMaster service and the Text.FieldMaster service has an interface called XPropertySet and has an attribute with a reference to the relevant Text.DependentTextField service.

Many Text.TextField objects have a reference to a Text.FieldMaster object which defines general settings for all of the dependent text fields. This is the case for the bibliographic fields. The Text.FieldMaster.Bibliography service defines properties such as sorting options, locale, prefix and suffix.

Above you can see that the Text.TextDocument service has an interface called Text.XTextFieldsSupplier which contains two methods. The first method, getTextFields(), returns a list of all of the Text.TextField derived services contained in the Text.TextDocument object. The second method, getTextFieldMasters(), returns a named list of the Text.FieldMaster derived services.
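
To make the two accessors more concrete, here is a small standalone C++ model of a document that hands out an unnamed list of fields and a named list of masters. This is not real UNO code: ToyDocument, ToyField, ToyMaster and countFieldsUsing are invented names, and the std::vector and std::map only approximate the enumeration and by-name containers that the real interface returns.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a field master: settings shared by dependent fields.
struct ToyMaster {
    std::string bracketBefore;
    std::string bracketAfter;
};

// Hypothetical stand-in for a text field placed somewhere in the document.
struct ToyField {
    std::string identifier;             // short name of the cited work
    std::shared_ptr<ToyMaster> master;  // settings shared with sibling fields
};

// Hypothetical document exposing the two accessors described above.
class ToyDocument {
public:
    void addMaster(const std::string& name, std::shared_ptr<ToyMaster> master) {
        assert(master && "a master must exist before it is registered");
        masters_[name] = master;
    }
    void addField(const ToyField& field) { fields_.push_back(field); }

    // Unnamed list of all fields (compare getTextFields()).
    const std::vector<ToyField>& getTextFields() const { return fields_; }
    // Masters looked up by name (compare getTextFieldMasters()).
    const std::map<std::string, std::shared_ptr<ToyMaster>>& getTextFieldMasters() const {
        return masters_;
    }
private:
    std::vector<ToyField> fields_;
    std::map<std::string, std::shared_ptr<ToyMaster>> masters_;
};

// Count the fields that depend on the master registered under masterName.
inline std::size_t countFieldsUsing(const ToyDocument& doc, const std::string& masterName) {
    auto it = doc.getTextFieldMasters().find(masterName);
    if (it == doc.getTextFieldMasters().end()) return 0;
    std::size_t count = 0;
    for (const ToyField& field : doc.getTextFields())
        if (field.master == it->second) ++count;
    return count;
}
```

The point of the sketch is only the shape of the data: fields are walked one after another, masters are found by name, and each field points back at the master whose settings it shares.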

Knowing the above IDL and examining the relevant properties of the service it is possible to write routines in C++, StarBasic, Java or Python to create, manipulate or delete bibliographic fields in your document. Below you will see some trivial C++ code using these UNO interfaces and services.

However for the purposes of this article, we will not be looking in depth at how to use the UNO APIs as we are much more interested in getting into the code which lies behind the UNO services described above.

So how is a bibliographic UNO service actually implemented in C++?

I shall use the following scenarios to give a reasonable overview of what C++ classes and methods are involved :

  1. creating a Text.TextField.Bibliography object
  2. creating a Text.FieldMaster.Bibliography object
  3. associating the Text.TextField.Bibliography service with a piece of text in the Text.TextDocument
  4. change properties in the Text.FieldMaster.Bibliography object
  5. rendering the Text.TextField.Bibliography object in the document

Most of the important code for the bibliographic implementation can be found in the sw module in CVS (you can browse this code from http://go-ooo.org/lxr/source/sw/sw ). This is the module which implements most of the Writer specific functionality ( hence "sw" ).


Within this module we will mostly be looking at three specific files :

  • source/core/unocore/unomap.cxx : this contains the mapping between many com.sun.star.text objects and the implementation of the properties of the object. e.g. search for PROPERTY_MAP_FLDTYP_BIBLIOGRAPHY and you will find the definitions of the bibliographic text field
  • source/core/unocore/unofield.cxx : this file contains the code which implements much of the public interfaces of the Text.TextField and Text.FieldMaster UNO services, from getting and setting properties in the FieldMaster to attaching the TextField object to a specific range of text in the document.
  • source/core/fields/authfld.cxx : this file contains the classes which actually implement the behaviours and interact with the core of the SW module. It is closely related to visualizing and storing the bibliographic entries.

Scenario 1 : creating a Text.TextField.Bibliography object

To create such an object the C++ code would look similar to :

Reference<XInterface> biblioField = xFactory->createInstance( OUString::createFromAscii("com.sun.star.text.TextField.Bibliography" ) );

The call above is to SwXTextDocument::createInstance(), which causes the SwXServiceProvider::MakeInstance() method to call the constructor of the class implementing the bibliographic service, i.e. SwXTextField. The object returned is an XInterface, which is a UNO type from which almost all other UNO types derive. This object represents the Text.TextField.Bibliography service.
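
The dispatch from a service name to a constructor can be pictured as a lookup table. The following standalone sketch is a toy and not the Writer code: ToyServiceProvider, ToyObject, ToyTextField, ToyFieldMaster and createKnownInstance are invented names, and the real SwXServiceProvider::MakeInstance() may well be organised differently.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>

// Hypothetical common base, standing in for XInterface.
struct ToyObject {
    virtual ~ToyObject() = default;
    virtual std::string implementationName() const = 0;
};

struct ToyTextField : ToyObject {
    std::string implementationName() const override { return "ToyTextField"; }
};

struct ToyFieldMaster : ToyObject {
    std::string implementationName() const override { return "ToyFieldMaster"; }
};

// Hypothetical factory: maps a service name to the code that constructs it.
class ToyServiceProvider {
public:
    ToyServiceProvider() {
        makers_["com.sun.star.text.TextField.Bibliography"] =
            [] { return std::shared_ptr<ToyObject>(new ToyTextField); };
        makers_["com.sun.star.text.FieldMaster.Bibliography"] =
            [] { return std::shared_ptr<ToyObject>(new ToyFieldMaster); };
    }

    // In this toy an unknown name yields an empty pointer, so callers must check.
    std::shared_ptr<ToyObject> createInstance(const std::string& serviceName) const {
        auto it = makers_.find(serviceName);
        return it == makers_.end() ? std::shared_ptr<ToyObject>() : it->second();
    }
private:
    std::map<std::string, std::function<std::shared_ptr<ToyObject>()>> makers_;
};

// Helper for callers that treat an unknown service name as a programming error.
inline std::shared_ptr<ToyObject> createKnownInstance(const ToyServiceProvider& provider,
                                                      const std::string& serviceName) {
    std::shared_ptr<ToyObject> object = provider.createInstance(serviceName);
    assert(object && "unknown service name");
    return object;
}
```

The caller only ever sees the common base type, which mirrors how the UNO call hands back an XInterface regardless of which implementation class was constructed.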


Scenario 2 : creating a Text.FieldMaster.Bibliography object

To properly use a TextField.Bibliography service we must also create a FieldMaster.Bibliography service. This is almost the same as the call above, except of course that a different service is created.

Reference<XInterface> master = xFactory->createInstance( OUString::createFromAscii("com.sun.star.text.FieldMaster.Bibliography"));

Again the call above is to SwXTextDocument::createInstance() which causes the SwXServiceProvider::MakeInstance() method to call the constructor of the Text.FieldMaster.Bibliography service i.e. SwXFieldMaster. The object returned represents the Text.FieldMaster.Bibliography service.


Scenario 3 : associating the Text.TextField.Bibliography service with a piece of text in the Text.TextDocument

The next step is to establish the relationship between these two objects. The IDL defined relationship is that the inherited DependentTextField service of the TextField.Bibliography service has an interface called XDependentTextField. However before we do anything with the returned XInterface we need to make sure that it is valid. This is simply done by calling the .is() method like:

if( biblioField.is() )

which returns true to indicate that the object is valid.

To obtain the Text.XDependentTextField interface from the biblioField object we simply query the object for the interface like :

Reference<XDependentTextField> xDepTextField(biblioField,UNO_QUERY);

Querying a service for its interfaces is a pretty standard UNO mechanism and is explained very well in the Developers Guide, so I am not going to add any more about it here.

Now there is a method on this interface to attach the master field; however, it only works on an XPropertySet interface, so we must obtain that interface from the master object like :

Reference<XPropertySet> masterProps( master, UNO_QUERY );

Finally we can associate the two objects :

xDepTextField->attachTextFieldMaster( masterProps );

However the call to attachTextFieldMaster() is interesting as it calls methods on the implementations of both objects.

The call to attachTextFieldMaster() is actually a call to SwXTextField::attachTextFieldMaster() where the type of the FieldMaster is obtained by calling the SwXFieldMaster::GetFldType() and is set in the attribute m_sTypeName of the instance of the SwXTextField.
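
The effect of the attach step, as described above, can be modelled in a few lines of standalone C++. This is only a sketch under my own assumptions: ToyFieldMaster, ToyTextField, getFieldTypeName() and hasMaster() are invented names, and the real SwXTextField does considerably more work than this.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical master: knows which kind of field it describes.
class ToyFieldMaster {
public:
    explicit ToyFieldMaster(const std::string& typeName) : typeName_(typeName) {}
    const std::string& getFieldTypeName() const { return typeName_; }
private:
    std::string typeName_;
};

// Hypothetical dependent field: remembers its master and the master's type name.
class ToyTextField {
public:
    // Returns false (and changes nothing) if no master is supplied.
    bool attachTextFieldMaster(const std::shared_ptr<ToyFieldMaster>& master) {
        if (!master) return false;
        master_ = master;
        typeName_ = master->getFieldTypeName();  // compare m_sTypeName above
        return true;
    }
    const std::string& typeName() const { return typeName_; }
    bool hasMaster() const { return static_cast<bool>(master_); }
private:
    std::shared_ptr<ToyFieldMaster> master_;
    std::string typeName_;
};

// Usage example: a field has no type name until a master is attached.
inline void demoAttach() {
    auto master = std::make_shared<ToyFieldMaster>("Bibliography");
    ToyTextField field;
    assert(!field.hasMaster());
    assert(field.attachTextFieldMaster(master));
    assert(field.typeName() == "Bibliography");
}
```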

Now that the master field is attached we only need to insert the field into the document. To do this we need to work on the Text.XTextContent interface from the biblioField instance. The following shows an example of how to insert the field in the document at a defined range :

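Reference<XTextContent> xTextContent( biblioField, UNO_QUERY );
Reference<XText> xText( document, UNO_QUERY );

// the third argument (bAbsorb) is sal_False so that the content of xRange is not replaced
xText->insertTextContent( xRange, xTextContent, sal_False );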

Scenario 4 : change properties in the Text.FieldMaster.Bibliography

Changing a property needs the XPropertySet interface. As you have already seen, getting an interface is quite easy :

Reference<XPropertySet> xPropSet( master, UNO_QUERY );

Now to set a property we need to call the method setPropertyValue() with the name of the property to set and the new value. The new value is passed as an Any object. The Any object has to have the new value set with the "<<=" operator like :

Any newValue;
newValue <<= OUString::createFromAscii( "string before the bibliographic field" );

The setPropertyValue() method is then called as :

xPropSet->setPropertyValue( OUString::createFromAscii( "BracketBefore"), newValue );

The setPropertyValue() call is actually a call to SwXFieldMaster::setPropertyValue(), where the first parameter (the name of the property) is compared with the known properties for the FieldMaster object and, when a match is found, the property in the object is set.

These UNO objects do not have a strong interaction with the internals of the Writer component. Rather there is a corresponding private object which implements much of the behaviour for the public UNO object. For the Text.TextField.Bibliography it is SwAuthorityField which is the private implementation class, and for Text.FieldMaster.Bibliography it is SwAuthorityFieldType. There are also equivalents for the base classes : for Text.DependentTextField it is SwField and for Text.FieldMaster it is SwFieldType.

Thus in the above scenario the properties are not actually stored in the UNO object but in the SwAuthorityFieldType object by calling the PutValue() method on that class.
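
The split between the public UNO object and the private core object can be sketched as follows. This is a standalone toy, not the Writer implementation: ToyCoreFieldType, ToyApiFieldMaster, putValue() and getValue() are invented names, values are plain strings rather than Any objects, and the two property names checked are only an illustrative subset.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical core object (compare SwAuthorityFieldType): owns the real data.
class ToyCoreFieldType {
public:
    void putValue(const std::string& key, const std::string& value) { values_[key] = value; }
    std::string getValue(const std::string& key) const {
        auto it = values_.find(key);
        return it == values_.end() ? std::string() : it->second;
    }
private:
    std::map<std::string, std::string> values_;
};

// Hypothetical API wrapper (compare SwXFieldMaster): checks names, stores nothing itself.
class ToyApiFieldMaster {
public:
    explicit ToyApiFieldMaster(std::shared_ptr<ToyCoreFieldType> core) : core_(core) {
        assert(core_ && "the wrapper needs a core object to forward to");
    }
    // Returns false for names that are not in the known-property table.
    bool setPropertyValue(const std::string& name, const std::string& value) {
        if (!isKnown(name)) return false;
        core_->putValue(name, value);  // the value lives in the core object
        return true;
    }
private:
    static bool isKnown(const std::string& name) {
        // Illustrative subset only, not the full property table.
        return name == "BracketBefore" || name == "BracketAfter";
    }
    std::shared_ptr<ToyCoreFieldType> core_;
};
```

After a successful setPropertyValue( "BracketBefore", "[" ) on the wrapper, the value can be read back from the core object, while an unknown property name leaves the core object untouched.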

Scenario 5 : rendering the Text.TextField.Bibliography

The actual call to render the contents of the field is to the Expand() method of the SwAuthorityField object. This can be triggered in many ways, including an edit occurring at a neighbouring position, loading of the document, refreshing the document, etc. The SwAuthorityField::Expand() method acts on the data of the bibliographic entry and formats the string as per the definition in the associated SwAuthorityFieldType. This string is then returned by this method and is rendered in the document view.
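
As a rough picture of what such a formatting step involves, the standalone sketch below combines per-entry data with shared settings to build the displayed string. It is an assumption-laden toy: ToyFieldType, ToyField, expand(), numberEntries and sequenceNumber are invented names, and the exact rules used by SwAuthorityField::Expand() should be checked in authfld.cxx.

```cpp
#include <cassert>
#include <string>

// Hypothetical shared settings (compare SwAuthorityFieldType).
struct ToyFieldType {
    std::string prefix = "[";
    std::string suffix = "]";
    bool numberEntries = false;  // show a running number instead of the identifier
};

// Hypothetical per-citation data (compare SwAuthorityField).
struct ToyField {
    std::string identifier;  // short name of the cited work
    int sequenceNumber = 0;  // position in the reference list, 0 = not yet assigned

    // Build the text shown in the document from the entry and the shared settings.
    std::string expand(const ToyFieldType& type) const {
        // A numbered entry is expected to have been given a position already.
        assert(!type.numberEntries || sequenceNumber > 0);
        std::string body = type.numberEntries ? std::to_string(sequenceNumber) : identifier;
        return type.prefix + body + type.suffix;
    }
};
```

Because the settings live in the shared type object, changing the prefix or switching to numbered entries there changes what every dependent field returns the next time it is expanded.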


Conclusion

This article has concentrated on the basic functionalities of the bibliographic component of OpenOffice.org. It has touched on the high level UNO services and examined how some of the common calls translate to calls in the C++ code.

Using what you have learned here, it should also be a bit easier to see the mapping from UNO API calls on Text.Fields to the C++ code implementing the actual behaviour. Indeed if you can understand the above, then it can be extended into other areas of OpenOffice.org to see how other UNO services are actually implemented.


About the author

CP Hennessy is CTO of OpenApp.biz, a content and document management company using the features of OpenOffice.org and Zope based enterprise content management systems. Using this powerful technology, systems have been developed for one of the leading news sites in Ireland, along with a customer relationship and billing system for an electrical supply company, a document management system for one of the largest hospitals in Ireland, and a richly integrated GIS and statistical data analysis portal. With Blackrock Education, OpenApp has also published the first ECDL accredited StarOffice training manual, and is launching a learning management system to support these materials.

CP Hennessy is also heavily involved in, and a moderator of, the users@openoffice.org mailing list.

Further Reading

  • http://bibliographic.openoffice.org is the home page of the bibliographic project
  • http://www.oasis-open.org/committees/office is the home page of the OASIS ODF committee
  • http://api.openoffice.org has a complete reference on the UNO services and a very complete Developers Guide
  • http://wiki.services.openoffice.org has lots of information on developing for OpenOffice.org