Calc/Features/Numbers import for plain text files
Numbers import for plain text files
Specification Status | |
Author | Kohei Yoshida |
Last Change | See wiki history |
Status | In progress in CWS koheicsvimport |
Abstract
The CSV import options dialog has been enhanced to provide a language option and two additional options for numbers import. In addition, the dialog's state is now stored persistently in user configuration. Calc's HTML import filter now has its own import options dialog, which provides a language option and an option to enable or disable special number detection.
References
Reference Document | Check | Location (URL) |
Issue ID (required) | available | Issue 3687, Issue 97416, Issue 102141 |
Test case specification (required) | n/a |
Contacts
Role | Name | E-Mail Address |
Developer | Kohei Yoshida | kyoshida@novell.com |
Quality Assurance | Oliver Craemer | oliver.craemer@sun.com |
Documentation | up for grabs | |
User Experience | up for grabs |
Detailed Specification
CSV import options dialog
The CSV import options dialog now has a Language list box to allow users to specify a non-default language to use for CSV import. How this language selection influences how numbers in the CSV document are imported is explained later in this section.
In addition, the dialog provides the following two new check boxes in the Other options section:
- Quoted field as text
- Detect special numbers
Likewise, how these options influence numbers import is explained later in this section.
Persistence of dialog's state
The states of this dialog's controls (except for the character set) are stored persistently in user configuration so that the next launch of the dialog restores the previous settings. Because the states are written to user configuration, they survive termination of the application process.
HTML import options dialog
An HTML import options dialog allows users to specify a non-default language, which influences how numbers are parsed, and to enable or disable special number detection. How the language selection and the special number detection option influence numbers import is explained later in this section.
Options for parsing numbers
Language
The language (and the region, when a language is associated with multiple regions) determines how number strings are parsed during import. If the language option is set to Default (in the CSV Options dialog) or Automatic (in the HTML Import Options dialog), Calc will use the language that OOo uses globally. If the language option is set to a specific language, that language will be used when parsing numbers.
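The fallback rule above can be sketched in a few lines of Python. This is only an illustration: the function name `resolve_import_language` and the use of plain strings for languages are assumptions made for the example, not part of Calc's implementation.

```python
# Hypothetical sketch of the language fallback rule described above.
# "Default" is the CSV dialog's entry, "Automatic" the HTML dialog's entry.
FALLBACK_ENTRIES = ("Default", "Automatic")


def resolve_import_language(selected, global_language):
    """Return the language to use when parsing numbers during import."""
    if selected in FALLBACK_ENTRIES:
        # No specific language chosen: use the application-wide language.
        return global_language
    return selected


print(resolve_import_language("Default", "en-US"))  # falls back to en-US
print(resolve_import_language("de-DE", "en-US"))    # explicit choice wins
```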
Detect special numbers
When this option is enabled, Calc will automatically detect all number formats, including special number formats such as dates, times, and scientific notation. The selected language influences how such special numbers are detected, since different languages and regions may have different conventions for such special numbers.
When this option is disabled, Calc will detect and convert decimal numbers only; everything else will be imported as text. A decimal number string may contain the digits 0-9, thousands separators (also known as group separators), and a decimal separator. A decimal number may not have more than one decimal separator. Thousands separators and decimal separators may vary with the selected language and region. The term decimal number here does not include scientific notation, so numbers formatted in scientific notation will be imported as text when this option is disabled.
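As a rough illustration of the decimal number rule, the following Python sketch classifies a string the way the paragraph above describes. The function name `is_plain_decimal` and its parameters are hypothetical, and the sketch makes simplifying assumptions that the specification does not state: signs are not handled, the positions of thousands separators are not validated, and thousands separators are only accepted before the decimal separator.

```python
ASCII_DIGITS = set("0123456789")


def is_plain_decimal(text, group_sep=",", decimal_sep="."):
    """Return True if text consists only of digits 0-9, group separators,
    and at most one decimal separator (hypothetical helper)."""
    if text.count(decimal_sep) > 1:
        # More than one decimal separator is not allowed.
        return False
    integer_part, _, fraction_part = text.partition(decimal_sep)
    # Assumption: group separators may appear only in the integer part.
    integer_digits = integer_part.replace(group_sep, "")
    digits = integer_digits + fraction_part
    if not digits:
        # Empty strings and bare separators are not numbers.
        return False
    return all(ch in ASCII_DIGITS for ch in digits)


# English-style separators
print(is_plain_decimal("1,234.56"))   # True
print(is_plain_decimal("1.5e3"))      # False: scientific notation stays text
# German-style separators, passed explicitly
print(is_plain_decimal("1.234,56", group_sep=".", decimal_sep=","))  # True
```

Passing the separators as parameters mirrors the point that they vary with the selected language and region.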
Quoted field as text (CSV import only)
When this option is enabled, fields or cells whose values are quoted in their entirety (i.e. the first and last characters of the value equal the text delimiter character specified in the same dialog) are imported as text regardless of their contents.
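The "quoted in its entirety" test can be sketched as follows. The helper name `is_fully_quoted` is hypothetical; the sketch assumes a single-character text delimiter and additionally assumes that a field needs at least two characters to count as quoted, which the specification does not spell out.

```python
def is_fully_quoted(field, text_delimiter='"'):
    """Return True if the first and last characters of the field both
    equal the text delimiter (hypothetical helper)."""
    # Assumption: a lone delimiter character is not a quoted field.
    return (
        len(field) >= 2
        and field[0] == text_delimiter
        and field[-1] == text_delimiter
    )


print(is_fully_quoted('"00123"'))  # True: would be imported as text
print(is_fully_quoted('00123'))    # False: subject to number detection
```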
Migration
N/A
Configuration
This feature introduces the following new configuration nodes under the Calc/Dialogs/CSVImport node path to persistently store the states of controls in the CSV import options dialog.
Name | Type | Description |
MergeDelimiters | boolean | Status of the Merge delimiters check box. |
QuotedFieldAsText | boolean | Status of the Quoted field as text check box. |
DetectSpecialNumbers | boolean | Status of the Detect special numbers check box. |
Language | int | Selected language. The number corresponds with the internal ID of the selected language. |
Separators | string | The character that separates the fields. |
TextSeparators | string | The text delimiter character used to quote texts. |
FixedWidth | boolean | Status of whether the Fixed width or Separated by radio button is checked. |
FromRow | int | ID of the row where the data import begins. |
CharSet | int | Numerical ID of the selected character set. |
FixedWidthList | string | Set of numerical column positions where the fixed width separators are placed. The column positions are separated by semicolons (;). |
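As an illustration of the FixedWidthList format, the sketch below converts between the semicolon-separated string and a list of column positions. The function names are hypothetical, and skipping empty segments (for example from a trailing semicolon) is an assumption of this example, not documented behavior.

```python
def parse_fixed_width_list(value):
    """Split a semicolon-separated string into integer column positions."""
    # Assumption: empty segments are ignored rather than treated as errors.
    return [int(part) for part in value.split(";") if part.strip()]


def format_fixed_width_list(positions):
    """Join integer column positions back into the stored string form."""
    return ";".join(str(pos) for pos in positions)


positions = parse_fixed_width_list("5;12;20")
print(positions)                           # [5, 12, 20]
print(format_fixed_width_list(positions))  # 5;12;20
```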
File Format
N/A
Open Issues
Although not directly caused by this feature, opening an HTML file from File - Open opens the file in Writer, even if the file is opened from Calc's application frame. On the other hand, opening an HTML file from the command line with the scalc command correctly launches Calc to import the file.
Credits
The dialog state persistence feature was contributed by Muthu Subramanian.