CSV import enhancement

As I mentioned in my previous post, I had introduced two new HTML import options to control how numbers in cells are detected & converted during import. I had also hinted at the end of the post that I might add the same import options to the CSV import dialog. I’m writing this post to inform you guys that that’s exactly what I just did.

Here is the new CSV import options dialog:

which highlights the changes I’ve made. At the top of the dialog is the new Language list box, to select what language to use for the import. If it’s set to Default, it uses the language that OOo uses globally. Similar to the HTML import, this option affects how the numbers are parsed, based on the decimal and thousands separators for the selected language, as well as how the special numbers are detected & converted. I’ll talk more on the special number detection later.

In addition to the language option, located in the middle of the dialog are two check boxes to further control how cell values are to be interpreted.

  • Option Quoted field as text, when set, always imports quoted cell values as text, even if they are numbers. This option existed before this change, but was used only for the separator-based CSV imports. It is now used for the fixed width imports as well.
  • Option Detect special numbers, similar to the one in the HTML import option, controls whether or not to detect specially-formatted numbers, such as dates, scientific notations etc. When this option is set, Calc will try to detect special numbers and convert them into appropriate format. When this option is not set, Calc will only convert the simple decimal numbers. By default, this option is not set.

Anyway, that’s all there is to it. Hopefully this will solve, or at least make it easier to handle importing of CSV documents, which was previously hampered by Calc’s aggressive date detection and lack of support for alternative number separators. I doubt that this will make your CSV import experience a perfect one, but it will hopefully make it a much better one.

5 thoughts on “CSV import enhancement”

  1. Hello Kohei,

    Thanks a lot for your last improvements.
    We import a lot of .csv files at work ;-)

    Regards,

    Silvio

  2. Hi Gábor,

    Actually, I’m not sure if it’s indeed working from macro. If you could show me (either here or via email) a code snippet setting import options for csv document, I can take a look and see how it’s working. Thanks!

  3. OK; here is what I wanted to use this for. My input consist of lines like

    sometextnumbernumber

    where columns are separated by tabs, and cells may be padded by spaces. In the Calc sheet I want to get rid of the spaces and have the numbers as numbers.

    The macro I wanted to have is like:

    sub main
    dim doc2 as object
    doc2 = loadClusteringCSV(“file:///input.txt”)
    end sub

    sub loadClusteringCSV(byval url as string) as object
    dim doc as object
    dim props(1) as new com.sun.star.beans.PropertyValue
    ‘from http://wiki.services.openoffice.org/wiki/Documentation/DevGuide/Spreadsheets/Filter_Options
    props(0).name = “FilterName”
    props(0).value = “Text – txt – csv (StarCalc)”
    props(1).name = “FilterOptions”
    props(1).value = “9,34,0,1,”
    doc = StarDesktop.loadComponentFromURL(url, “_blank”, 0, props)
    ‘ see http://api.openoffice.org/docs/common/ref/com/sun/star/frame/XComponentLoader.html
    loadClusteringCSV = doc
    end sub

  4. Ah, yes. I remember I modified the FilterOptions handling code.

    Basically, the new FilterOptions value has two boolean values appended to the end after Cell Format Codes for the four Columns (5), and one integer value between the Character Set (3) and the Number of First Line (4). The new integer value is the language ID, while the two boolean values are for the “Quoted field as text” and “Detect special numbers” options in this order.

    The usable language ID’s are listed here:
    http://svn.services.openoffice.org/opengrok/xref/Current%20(trunk)/i18npool/inc/i18npool/lang.h

    That wiki documentation page should probably be updated once this feature lands…

Comments are closed.