Filter Options

From Apache OpenOffice Wiki
Revision as of 07:53, 18 September 2009 by Simon.says (Talk | contribs)


Loading and saving OpenOffice.org API documents is described in Handling Documents. This section lists all the filter names for spreadsheet documents and describes the filter options for text file import.

The filter name and options are passed when loading or saving a document, in a sequence of com.sun.star.beans.PropertyValue structs. The FilterName property contains the filter name, and the FilterOptions property contains the filter options.
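
As a rough illustration of how the two properties travel together, the sketch below collects them as name/value pairs. It is plain Python without any UNO imports; the helper name make_load_args is hypothetical, and a real client would wrap each pair in a com.sun.star.beans.PropertyValue before passing the sequence to the load or store call.

```python
def make_load_args(filter_name, filter_options=None):
    """Return (Name, Value) pairs mirroring the PropertyValue sequence.

    Hypothetical helper: a real UNO client would turn each pair into a
    com.sun.star.beans.PropertyValue struct.
    """
    args = [("FilterName", filter_name)]
    if filter_options is not None:
        args.append(("FilterOptions", filter_options))
    return args


# Filter name and option string as used for the CSV filter on this page.
csv_args = make_load_args("Text - txt - csv (StarCalc)",
                          "44,34,0,1,1/5/2/1/3/1/4/1")
print(csv_args)
```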


The list of filter names (the last two columns show the possible directions of the filters):

Filter name                         Description                                 Import  Export
StarOffice XML (Calc)               Standard XML filter
calc_StarOffice_XML_Calc_Template   XML filter for templates
StarCalc 5.0                        The binary format of StarOffice Calc 5.x
StarCalc 5.0 Vorlage/Template       StarOffice Calc 5.x templates
StarCalc 4.0                        The binary format of StarCalc 4.x
StarCalc 4.0 Vorlage/Template       StarCalc 4.x templates
StarCalc 3.0                        The binary format of StarCalc 3.x
StarCalc 3.0 Vorlage/Template       StarCalc 3.x templates
HTML (StarCalc)                     HTML filter
calc_HTML_WebQuery                  HTML filter for external data queries
MS Excel 97                         Microsoft Excel 97/2000/XP
MS Excel 97 Vorlage/Template        Microsoft Excel 97/2000/XP templates
MS Excel 95                         Microsoft Excel 5.0/95
MS Excel 5.0/95                     Different name for the same filter
MS Excel 95 Vorlage/Template        Microsoft Excel 5.0/95 templates
MS Excel 5.0/95 Vorlage/Template    Different name for the same filter
MS Excel 4.0                        Microsoft Excel 2.1/3.0/4.0
MS Excel 4.0 Vorlage/Template       Microsoft Excel 2.1/3.0/4.0 templates
Lotus                               Lotus 1-2-3
Text - txt - csv (StarCalc)         Comma separated values
Rich Text Format (StarCalc)
dBase
SYLK                                Symbolic Link
DIF                                 Data Interchange Format

Filter Options for Lotus, dBase and DIF Filters

These filters accept a string containing the numerical index of the character set used for single-byte characters, for example, 0 for the system character set.

The numerical indexes assigned to the character sets:

Character Set Index
RTL_TEXTENCODING_DONTKNOW 0
RTL_TEXTENCODING_MS_1252 1
RTL_TEXTENCODING_APPLE_ROMAN 2
RTL_TEXTENCODING_IBM_437 3
RTL_TEXTENCODING_IBM_850 4
RTL_TEXTENCODING_IBM_860 5
RTL_TEXTENCODING_IBM_861 6
RTL_TEXTENCODING_IBM_863 7
RTL_TEXTENCODING_IBM_865 8
Reserved: RTL_TEXTENCODING_SYSTEM 9
RTL_TEXTENCODING_SYMBOL 10
RTL_TEXTENCODING_ASCII_US 11
RTL_TEXTENCODING_ISO_8859_1 12
RTL_TEXTENCODING_ISO_8859_2 13
RTL_TEXTENCODING_ISO_8859_3 14
RTL_TEXTENCODING_ISO_8859_4 15
RTL_TEXTENCODING_ISO_8859_5 16
RTL_TEXTENCODING_ISO_8859_6 17
RTL_TEXTENCODING_ISO_8859_7 18
RTL_TEXTENCODING_ISO_8859_8 19
RTL_TEXTENCODING_ISO_8859_9 20
RTL_TEXTENCODING_ISO_8859_14 21
RTL_TEXTENCODING_ISO_8859_15 22
RTL_TEXTENCODING_IBM_737 23
RTL_TEXTENCODING_IBM_775 24
RTL_TEXTENCODING_IBM_852 25
RTL_TEXTENCODING_IBM_855 26
RTL_TEXTENCODING_IBM_857 27
RTL_TEXTENCODING_IBM_862 28
RTL_TEXTENCODING_IBM_864 29
RTL_TEXTENCODING_IBM_866 30
RTL_TEXTENCODING_IBM_869 31
RTL_TEXTENCODING_MS_874 32
RTL_TEXTENCODING_MS_1250 33
RTL_TEXTENCODING_MS_1251 34
RTL_TEXTENCODING_MS_1253 35
RTL_TEXTENCODING_MS_1254 36
RTL_TEXTENCODING_MS_1255 37
RTL_TEXTENCODING_MS_1256 38
RTL_TEXTENCODING_MS_1257 39
RTL_TEXTENCODING_MS_1258 40
RTL_TEXTENCODING_APPLE_ARABIC 41
RTL_TEXTENCODING_APPLE_CENTEURO 42
RTL_TEXTENCODING_APPLE_CROATIAN 43
RTL_TEXTENCODING_APPLE_CYRILLIC 44
RTL_TEXTENCODING_APPLE_DEVANAGARI 45
RTL_TEXTENCODING_APPLE_FARSI 46
RTL_TEXTENCODING_APPLE_GREEK 47
RTL_TEXTENCODING_APPLE_GUJARATI 48
RTL_TEXTENCODING_APPLE_GURMUKHI 49
RTL_TEXTENCODING_APPLE_HEBREW 50
RTL_TEXTENCODING_APPLE_ICELAND 51
RTL_TEXTENCODING_APPLE_ROMANIAN 52
RTL_TEXTENCODING_APPLE_THAI 53
RTL_TEXTENCODING_APPLE_TURKISH 54
RTL_TEXTENCODING_APPLE_UKRAINIAN 55
RTL_TEXTENCODING_APPLE_CHINSIMP 56
RTL_TEXTENCODING_APPLE_CHINTRAD 57
RTL_TEXTENCODING_APPLE_JAPANESE 58
RTL_TEXTENCODING_APPLE_KOREAN 59
RTL_TEXTENCODING_MS_932 60
RTL_TEXTENCODING_MS_936 61
RTL_TEXTENCODING_MS_949 62
RTL_TEXTENCODING_MS_950 63
RTL_TEXTENCODING_SHIFT_JIS 64
RTL_TEXTENCODING_GB_2312 65
RTL_TEXTENCODING_GBT_12345 66
RTL_TEXTENCODING_GBK 67
RTL_TEXTENCODING_BIG5 68
RTL_TEXTENCODING_EUC_JP 69
RTL_TEXTENCODING_EUC_CN 70
RTL_TEXTENCODING_EUC_TW 71
RTL_TEXTENCODING_ISO_2022_JP 72
RTL_TEXTENCODING_ISO_2022_CN 73
RTL_TEXTENCODING_KOI8_R 74
RTL_TEXTENCODING_UTF7 75
RTL_TEXTENCODING_UTF8 76
RTL_TEXTENCODING_ISO_8859_10 77
RTL_TEXTENCODING_ISO_8859_13 78
RTL_TEXTENCODING_EUC_KR 79
RTL_TEXTENCODING_ISO_2022_KR 80
RTL_TEXTENCODING_JIS_X_0201 81
RTL_TEXTENCODING_JIS_X_0208 82
RTL_TEXTENCODING_JIS_X_0212 83
RTL_TEXTENCODING_MS_1361 84
RTL_TEXTENCODING_GB_18030 85
RTL_TEXTENCODING_BIG5_HKSCS 86
RTL_TEXTENCODING_TIS_620 87
RTL_TEXTENCODING_KOI8_U 88
RTL_TEXTENCODING_ISCII_DEVANAGARI 89
RTL_TEXTENCODING_JAVA_UTF8 90
RTL_TEXTENCODING_ADOBE_STANDARD 91
RTL_TEXTENCODING_ADOBE_SYMBOL 92
RTL_TEXTENCODING_PT154 93
RTL_TEXTENCODING_UCS4 65534
RTL_TEXTENCODING_UNICODE(UCS2) 65535
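
Since the option string for these three filters is just the index written as text, a small lookup is enough to produce it. The sketch below copies a handful of entries from the table; the dictionary CHARSET_INDEX and the function name charset_filter_options are illustrative and not part of the API.

```python
# A few entries copied from the table above
# (name without the RTL_TEXTENCODING_ prefix -> index). Extend as needed.
CHARSET_INDEX = {
    "DONTKNOW": 0,
    "MS_1252": 1,
    "IBM_437": 3,
    "IBM_850": 4,
    "ISO_8859_1": 12,
    "ISO_8859_15": 22,
    "MS_1250": 33,
}


def charset_filter_options(charset_name):
    """Return the FilterOptions string for the Lotus, dBase, or DIF filter."""
    return str(CHARSET_INDEX[charset_name])


print(charset_filter_options("IBM_850"))  # 4
```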

Filter Options for the CSV Filter

This filter accepts an option string containing five tokens, separated by commas. The following table shows an example string for a file with four columns of type date - number - number - number. In the table the tokens are numbered from (1) to (5). Each token is explained below.

Example file format: four columns, date - number - number - number

Token                                        Content of example file   Token value
(1) Field separator                          ,                         44
(2) Text delimiter                           "                         34
(3) Character set                            System                    0
(4) Number of first line                     line no. 1                1
(5) Cell format codes for the four columns   column 1: YY/MM/DD = 5    1/5/2/1/3/1/4/1
                                             column 2: Standard = 1
                                             column 3: Standard = 1
                                             column 4: Standard = 1

For the filter options above, set the PropertyValue FilterOptions in the load arguments to "44,34,0,1,1/5/2/1/3/1/4/1". Each of the five tokens supports several settings.
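
Before going through the settings token by token, it may help to see the example string taken apart again. The following sketch is only an illustration: the function decode_csv_filter_options is a hypothetical name, and it handles only the plain separated-values form of the example (no FIX or /MRG).

```python
def decode_csv_filter_options(options):
    """Split a CSV FilterOptions string into its five tokens.

    Covers only the separated-values form shown in the example
    (no FIX or /MRG handling).
    """
    sep, delim, charset, first_line, formats = options.split(",")
    fields = formats.split("/")
    # The fifth token alternates column number and format code.
    column_formats = {
        int(col): int(code) for col, code in zip(fields[0::2], fields[1::2])
    }
    return {
        "separators": [chr(int(v)) for v in sep.split("/")],
        "text_delimiter": chr(int(delim)),
        "charset": int(charset),
        "first_line": int(first_line),
        "column_formats": column_formats,
    }


decoded = decode_csv_filter_options("44,34,0,1,1/5/2/1/3/1/4/1")
print(decoded)
```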

  1. Field separator(s) as ASCII values. Multiple values are separated by the slash sign ("/"); for example, if the values are separated by semicolons and horizontal tabs, the token is 59/9. To treat several consecutive separators as one, append the four characters /MRG to the token. If the file contains fixed-width fields, use the three letters FIX instead.
  2. The text delimiter as an ASCII value, for example, 34 for double quotes and 39 for single quotes.
  3. The character set used in the file, given as an index from the character set table above.
  4. Number of the first line to convert. The first line in the file has the number 1.
  5. Cell format of the columns. The content of this token depends on the value of the first token.
  • If value separators are used, the form of this token is column/format[/column/format/...], where column is the number of the column, with 1 being the leftmost column. The format is explained below.
  • If the first token is FIX, it has the form start/format[/start/format/...], where start is the number of the first character for this field, with 0 being the leftmost character in a line. The format is explained below.
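
The rules for the first token can be sketched as a small helper. This is a minimal illustration under the description above; separator_token and its parameters are hypothetical names, not part of the API.

```python
def separator_token(separators=(), merge=False, fixed_width=False):
    """Build token (1) of the CSV filter options.

    separators  -- iterable of single characters, such as a semicolon
                   and a tab
    merge       -- append /MRG to treat consecutive separators as one
    fixed_width -- return FIX for files with fixed-width fields
    """
    if fixed_width:
        return "FIX"
    token = "/".join(str(ord(ch)) for ch in separators)
    if merge:
        token += "/MRG"
    return token


print(separator_token([";", "\t"]))              # 59/9
print(separator_token([";", "\t"], merge=True))  # 59/9/MRG
print(separator_token(fixed_width=True))         # FIX
```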
Format specifies which cell format should be used for a field during import:
Format Code Meaning
1 Standard
2 Text
3 MM/DD/YY
4 DD/MM/YY
5 YY/MM/DD
6 -
7 -
8 -
9 ignore field (do not import)
10 US-English
Format code 10 indicates that the content of a field is US-English. This is useful if a field contains decimal numbers formatted according to the US convention (using "." as the decimal separator and "," as the thousands separator). Specifying 10 as the format for such a field tells OpenOffice.org to interpret its numerical content correctly, even if the decimal and thousands separators of the current language are different.
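
To make the fifth token less error-prone to write by hand, the format codes can be given names. The sketch below is an assumption-laden illustration: the constant names and the function format_token are invented here, and only the numeric codes come from the table above.

```python
# Format codes from the table above; the names are illustrative.
STANDARD, TEXT, MDY, DMY, YMD = 1, 2, 3, 4, 5
IGNORE, US_ENGLISH = 9, 10


def format_token(pairs):
    """Build token (5) from (position, format code) pairs.

    For separated values, position is the 1-based column number; with
    FIX, it is the 0-based character offset where the field starts.
    """
    return "/".join(f"{pos}/{code}" for pos, code in pairs)


# Separated file: date, US-formatted number, skipped column, text.
print(format_token([(1, YMD), (2, US_ENGLISH), (3, IGNORE), (4, TEXT)]))
# Fixed-width file: fields starting at characters 0, 8, and 20.
print(format_token([(0, TEXT), (8, STANDARD), (20, US_ENGLISH)]))
```

The first call yields 1/5/2/10/3/9/4/2 and the second 0/2/8/1/20/10; either result is placed after the fourth comma of the option string.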
Content on this page is licensed under the Public Documentation License (PDL).