HSQL Text Table Integration

From Apache OpenOffice Wiki
Revision as of 23:05, 12 September 2006 by DrewJensen (Talk | contribs)


We plan to create a user interface for HSQLDB's feature of linking external Text/CSV files as if they were native HSQL tables. See the specification for details.

This page collects issues around this project.

HSQL Issues

Looking at the feature as a whole, there are a number of issues in the current implementation of HSQL, most of them feature-related.

Current Text Table Settings (P2)

We need a way to obtain the *current* text table settings.

Ideally (?), this would be a result set listing the various settings of a specific table (or of all tables?). But perhaps a simple

 CALL "<function>"( <tableName> )

which just returns the complete text source string, would be sufficient.
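If such a function did return the complete source string, the client could split it into individual settings itself. The following is a minimal sketch, assuming the string has the same shape as the one passed to SET TABLE ... SOURCE: a file name followed by semicolon-separated key=value options. The class name TextSourceSettings and the sample option values are hypothetical, and escape sequences inside option values are not interpreted.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: splits a text source string into file name and options.
final class TextSourceSettings {
    final String fileName;
    final Map<String, String> options;

    private TextSourceSettings(String fileName, Map<String, String> options) {
        this.fileName = fileName;
        this.options = options;
    }

    // Assumes the form "<file>;key=value;key=value...".
    static TextSourceSettings parse(String source) {
        String[] parts = source.split(";");
        String fileName = parts[0].trim();
        Map<String, String> options = new LinkedHashMap<>();
        for (int i = 1; i < parts.length; i++) {
            String part = parts[i];
            if (part.isEmpty()) {
                continue; // tolerate ";;"
            }
            int eq = part.indexOf('=');
            if (eq < 0) {
                options.put(part.trim(), ""); // option without a value
            } else {
                // Values are kept untrimmed: a separator may legitimately be whitespace.
                options.put(part.substring(0, eq).trim(), part.substring(eq + 1));
            }
        }
        return new TextSourceSettings(fileName, options);
    }

    public static void main(String[] args) {
        TextSourceSettings s = parse("customers.csv;fs=,;encoding=UTF-8;ignore_first=true");
        System.out.println(s.fileName + " " + s.options);
    }
}
```

A result set with one row per setting would spare the client this parsing step, which is one argument for the "ideal" variant.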

Date Formats (P2)

Currently, only YYYY-MM-DD is accepted as a date format.

I think using a java.text.(Simple)DateFormat, probably even with a (per-table) format string, would be better. In an international world, we cannot expect all our users to normalize the date formats in their files.
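To illustrate the idea, here is a minimal sketch of how a per-table format string could be used to convert a date from the file into the YYYY-MM-DD form. The class TableDateFormat and its method are hypothetical, and the patterns are just examples; nothing here reflects how HSQLDB would actually implement this.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper: normalizes a date using a per-table format string.
final class TableDateFormat {
    static String toIsoDate(String value, String pattern) {
        TimeZone utc = TimeZone.getTimeZone("UTC");
        SimpleDateFormat in = new SimpleDateFormat(pattern, Locale.ROOT);
        in.setTimeZone(utc);
        in.setLenient(false); // reject impossible dates such as 31.02.2006

        ParsePosition pos = new ParsePosition(0);
        Date parsed = in.parse(value, pos);
        // Also reject trailing garbage after an otherwise valid date.
        if (parsed == null || pos.getIndex() != value.length()) {
            throw new IllegalArgumentException(
                "'" + value + "' does not match date format '" + pattern + "'");
        }

        SimpleDateFormat out = new SimpleDateFormat("yyyy-MM-dd", Locale.ROOT);
        out.setTimeZone(utc);
        return out.format(parsed);
    }

    public static void main(String[] args) {
        System.out.println(toIsoDate("12.09.2006", "dd.MM.yyyy")); // 2006-09-12
        System.out.println(toIsoDate("09/12/2006", "MM/dd/yyyy")); // 2006-09-12
    }
}
```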

Numeric Formats (P2)

Only "." is recognized as the decimal separator, and no thousands separator is recognized at all.

Similar to the previous item, this is pretty unacceptable for a non-en-US audience. I suppose java.text.DecimalFormat is the right class to use here. IMO, parameterizing a text table with a decimal separator and a group/thousands separator is sufficient; I don't intend to allow for the full functionality of a java.text.DecimalFormat.
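A minimal sketch of what such a restricted parametrization could look like: only the two separators are configurable, everything else about the DecimalFormat is fixed. The class TableNumberFormat is hypothetical and serves only to illustrate the proposal.

```java
import java.math.BigDecimal;
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.ParsePosition;
import java.util.Locale;

// Hypothetical helper: parses numbers with per-table separators.
final class TableNumberFormat {
    static BigDecimal parse(String value, char decimalSeparator, char groupSeparator) {
        DecimalFormatSymbols symbols = new DecimalFormatSymbols(Locale.ROOT);
        symbols.setDecimalSeparator(decimalSeparator);
        symbols.setGroupingSeparator(groupSeparator);

        DecimalFormat format = new DecimalFormat("#,##0.###", symbols);
        format.setParseBigDecimal(true); // avoid rounding through double

        ParsePosition pos = new ParsePosition(0);
        Number parsed = format.parse(value, pos);
        // Reject partial matches and special values such as NaN.
        if (!(parsed instanceof BigDecimal) || pos.getIndex() != value.length()) {
            throw new IllegalArgumentException("'" + value + "' is not a valid number");
        }
        return (BigDecimal) parsed;
    }

    public static void main(String[] args) {
        System.out.println(parse("1.234,56", ',', '.')); // German-style input
        System.out.println(parse("1,234.56", '.', ',')); // en-US-style input
    }
}
```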

Error Messages (P3)

In general, there are various reasons why creating a text table or setting its source can fail. Most interesting to me: the file could have the wrong format (e.g. not enough data in a row, or wrongly formatted data), or the data could fail to conform to the PK/Index restrictions.

The current error handling here is rather unspecific and does not tell the user what actually went wrong. We should improve this, as only meaningful error messages from HSQL enable the OOo user to find and fix the problem.

Relative Paths (P3)

Not all types of relative paths are allowed: trying to set a text table's source to something like "../filename" results in an "Access Denied" error message.

Charsets / Encodings (P3)

When specifying an encoding not supported as a Java Charset, a warning should be issued.

Currently, if an invalid encoding is specified, this problem is completely silenced. We need a mechanism here to better propagate this error.

Also (but this might be a different problem), we need a mechanism to tell unsupported encodings apart from supported ones *before* actually issuing the SET TABLE SOURCE command. (Fortunately Java uses IANA names for charsets, as does OOo internally, so we at least speak the same language. I love standards. :)
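On the Java side, such a pre-check could be as simple as the following sketch, which relies on java.nio.charset.Charset.isSupported. The class EncodingCheck is hypothetical, and how OOo would reach this check (directly or through HSQLDB) is left open here.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

// Hypothetical helper: checks an encoding name before it is used in a text source.
final class EncodingCheck {
    static boolean isUsable(String name) {
        if (name == null || name.isEmpty()) {
            return false;
        }
        try {
            // False for syntactically legal names the JVM does not know.
            return Charset.isSupported(name);
        } catch (IllegalCharsetNameException e) {
            // Thrown for names containing illegal characters.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isUsable("UTF-8"));           // true
        System.out.println(isUsable("no-such-charset")); // false
    }
}
```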

Quote Characters (P4)

Users should be able to choose the quote character. They're used to doing so from other CSV integrations in OpenOffice.org, but HSQLDB currently does not allow this.
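To make the request concrete, here is a minimal sketch of a line splitter where both the field separator and the quote character are parameters. It is an illustration only, not HSQLDB's parser; the class QuotedLineSplitter is hypothetical, and treating a doubled quote character inside a quoted field as a literal quote is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits one line using a configurable quote character.
final class QuotedLineSplitter {
    static List<String> split(String line, char separator, char quote) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c != quote) {
                    current.append(c);
                } else if (i + 1 < line.length() && line.charAt(i + 1) == quote) {
                    current.append(quote); // doubled quote = literal quote (assumption)
                    i++;
                } else {
                    inQuotes = false;      // closing quote
                }
            } else if (c == quote) {
                inQuotes = true;           // opening quote
            } else if (c == separator) {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        // Single quote as the quote character, semicolon as the separator.
        System.out.println(split("'Smith; John';42;'it''s'", ';', '\''));
    }
}
```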

Row Order (P?)

Normally, for database tables you cannot rely on the order of records, since it's a row *set*, not a row *sequence*.

However, for Text/CSV files, this is different. Users here might expect that the rows are in the same order as in the file.

In HSQLDB's text tables, though, this no longer holds as soon as the table has a PK/Index.

Design Issues


Should settings made in the linked table editor be remembered for future invocations? If yes: Per data source? Per session? Do we need a full-blown administration of "setting sets", similar to what MSA has (Load/Save)?

DrewJensen 01:05, 13 September 2006 (CEST)

The editor should IMO offer a choice: either apply the settings to this table only, leaving the default settings for the data source unchanged, or make them the defaults for the data source.

As to the editor's default settings, I would think per data source makes the most sense. HSQLDB databases already support this model with their textdb.* global settings feature. If support is added to read these settings, per the section "Current Text Table Settings", then the linked table editor simply needs to read them on startup.

Should supporting easy access to the textdb.* global settings prove not to be worth the effort, then perhaps using a configuration registry may be appropriate. Either way, I believe the effort would be well worth it.

Support for a "full-blown administration" feature might be worthwhile in the long term, depending on how much use this feature sees, but I would not think it worth the effort at this moment if it would push back the implementation of the feature overall.

Relative Path Compatibility

What happens when we open an old embedded HSQLDB .odb file, where the textdb.allow_full_path property is FALSE? Should we automatically set it to TRUE, assuming TEXT TABLES are a rarely used feature so far?
