CSV import enhancement

As I mentioned in my previous post, I had introduced two new HTML import options to control how numbers in cells are detected & converted during import. I had also hinted at the end of the post that I might add the same import options to the CSV import dialog. I’m writing this post to inform you guys that that’s exactly what I just did.

Here is the new CSV import options dialog:

which highlights the changes I’ve made. At the top of the dialog is the new Language list box, to select what language to use for the import. If it’s set to Default, it uses the language that OOo uses globally. Similar to the HTML import, this option affects how the numbers are parsed, based on the decimal and thousands separators for the selected language, as well as how the special numbers are detected & converted. I’ll talk more on the special number detection later.

In addition to the language option, located in the middle of the dialog are two check boxes to further control how cell values are to be interpreted.

  • Option Quoted field as text, when set, always imports quoted cell values as text, even if they are numbers. This option existed before this change, but was used only for the separator-based CSV imports. It is now used for the fixed width imports as well.
  • Option Detect special numbers, similar to the one in the HTML import option, controls whether or not to detect specially-formatted numbers, such as dates, scientific notations etc. When this option is set, Calc will try to detect special numbers and convert them into appropriate format. When this option is not set, Calc will only convert the simple decimal numbers. By default, this option is not set.

Anyway, that’s all there is to it. Hopefully this will solve, or at least make it easier to handle importing of CSV documents, which was previously hampered by Calc’s aggressive date detection and lack of support for alternative number separators. I doubt that this will make your CSV import experience a perfect one, but it will hopefully make it a much better one.

Multi-range selection copy & paste

New in Go-OO Calc is the ability to copy and paste multi-range selection, that is, you can select separate cell ranges that are not connected with each other, and copy and paste them in one single action. This was not possible previously; when you tried to copy multiple ranges, you would get an error message telling you that copying of multi-range selection is not supported (or something to that effect).

When pasting a multi-range selection, all the copied ranges get consolidated into a single range when pasted into a destination location. For example, copying this multi-range selection

and pasting it into another sheet will paste the data as follows

You can’t just copy a random set of ranges of varying sizes, however. Because the data gets pasted as a single range, all copied ranges must have either equal column size, or equal row size, or else you’ll be greeted with an error message.

Range selection and cursor

There is a slight tweak I’ve made in how Calc makes a range selection and how it places the cursor during selection. The old behavior was that the cursor always moves with the tip of the selection, as you can see in the following screenshot:

In the new selection behavior, the cursor stays at its original position while the selection is being made, as follows:

Aesthetically, the difference between the old and new behaviors is subtle, and probably is subject to personal preference. What makes a big difference is when editing multiple cells by first making a selection via shift-arrow, then editing them one by one within that selection by using the ENTER or TAB key to navigate through them. Or when copying & pasting a range of cells via keyboard, by first making a selection to copy, then moving the cursor to a destination cell to paste. Power users tend to do this, and they are being put off by how Calc places the cursor always at the “end” of the selection. In their mind, the current cursor placement is “wrong”.

Competitively speaking, the new behavior is what most other spreadsheet applications behave (Excel, Gnumeric and GoogleDoc, in particular). So, there is also a benefit in keeping it consistent with the rest of the spreadsheet applications, to make it easier to migrate power users to Calc.

HTML import option

What’s new?

I just checked in to the Go-OO master branch an enhancement to Calc’s HTML table import feature. With this enhancement, you can select a custom language to use when you import an HTML table, which affects how the numbers are interpreted during the import.

What’s this good for?

Each language adheres to a different set of conventions, such as date formats, thousands and decimal separators, and other special number formats. Previously, Calc simply used the system’s language setting when importing an HTML in order to decide whether a cell is a nubmer or a text, and if it’s a number, what numbering format to apply. Although you could change the system language in the global options dialog, that was not always convenient especially when all you need to do is quickly import an HTML table with different language conventions, and you don’t want to bother changing the global language setting (and changing it back after the import is done). That’s where this enhancement will come in.

What can you expect with this change?

Well, you’ll get the following dialog

when you are importing an HTML table, to either select the system’s language (Automatic), or a custom language from the list of available languages. Just remember that, if you don’t know or don’t care about what language to choose, select Automatic and move on. If you do care, then select Custom and pick a language from the list.

Special number detection (especially the dates)

There is an additional option below, to toggle automatic number format detection. With this option checked, Calc will try to automatically detect special number formats, such as dates, scientific notations etc. If unchecked, Calc will only detect the simplest numbers i.e. numbers that only consist of digits, decimal separators and thousands separators (and a sign if there is one), while all the other numbers are imported as text. By default, this option is NOT checked, which means Calc will only detect the simple numbers.

BTW, I’m thinking of adding a similar option to the csv import, where automatic date conversion has been quite problematic in the past and driving lots of users crazy. Adding this option also to the csv import would IMO make sense.

Well, this is it. It’s actually a pretty minor enhancement, but I hope someone will find it useful. :-)

Custom sort in DataPilot

Just checked in this piece to the master branch of the Go-OO repository, to support sorting of DataPilot’s field members using custom sort lists. I’ve extended the popup window I wrote for the hide field members functionality to provide this additional sorting functionality. The result is the following popup window:

In the upper half of the window I’ve added a menu-like control, with the custom sort lists being provided in the submenu. The UI is fully functional, but still a bit rough around the edges. The custom sort list submenu, in particular, may need some additional work to handle a large set of custom sort lists, a sort list that is very long, or stuff like that. But as long as your sort list is in modest size, it should work just fine.

This feature didn’t make it in to 3.1 since we are in a stabilization phase for 3.1. But as soon as we branch master for the stable 3.1 branch, I will enable this feature in the default build in the master branch.

Traversing to precedents and dependents

Just checked this piece into the master branch of the go-oo repository. It allows traversing to precedents and dependents of a cell by Ctrl-[ and Ctrl-] key strokes, respectively. It is similar in concept to the existing Detective functionality, but while Detective graphically shows the precedents and dependents of a current cell, this new functionality physically moves the cursor to the precedent and dependent cells. Similar functionality already existed in Excel, so this is one of those interoperability features and, for some spreadsheet users, the ability to jump to precedents/dependents is apparently very important for their productivity.

No feature is complete without screenshots. So, here they are. :-)

In the first example, the cell currently selected contains references to three cells and one cell range in its formula expression. I’m showing the precedent traces here just to show their relationship visually.

When you hit Ctrl-[, it highlights all its reference cells (a.k.a precedents) and moves the cursor to the first precedent.

Once the cells are highlighted, you can easily navigate through the highlighted cells by hitting the Enter or Tab key.

There is one caveat. When the expression includes references to cells outside of the current sheet, the ones that are not on the current sheet are ignored. The only exception to this rule is when the first reference points to a cell in another sheet, in which case it jumps to that external-sheet cell while the rest of the references are ignored even if they are on the current sheet. If that reference is in another document (i.e. external reference), it opens that document then sets the cursor to the referenced cell position provided that the document is available at specified location in the file system.

Traversing to dependents also works in a similar fashion. Consider the following example

where current cell is referenced by multiple other cells. Again, I’m showing the dependency traces to display their relationship graphically. When you hit Ctrl-], it highlights all its dependent cells and moves the cursor to the first dependent cell.

Similar to the precedent jump, there is a caveat; when the cell is referenced by cells on multiple different sheets including the current sheet, only those cells on the current sheet are highlighted and the rest are ignored. But unlike the precedent jump, there is no exception to this rule.

That’s it folks! I hope you find this new functionality useful. And as always, please report back any problems you may encounter so that I can fix them. Thank you very much, ladies and gentlemen. :-)

Some minor enhancement with DataPilot

Just checked into the master branch of go-oo repository (oh BTW we just switched our repository from svn on gnome.org to git on freedesktop.org) is a minor enhancement in DataPilot, to allow the users to filter results by field members directly from the field buttons in the table output. The following screenshot will tell you the change I just made:

The new field buttons also provide visual feedback on fields that are filtered; the same way the autofilter buttons already do. This way it’s visually obvious to the users which fields are currently filtered.

I call this a “minor” enhancement, simply because it doesn’t provide a new functionality per se; the same filtering functionality was already present but hidden deep beneath multiple layers of dialogs. You needed to go from the datapilot layout dialog (the main dialog), open up the field dialog for the field you want to filter results by, and then finally the field options dialog to get to the Hide items list in that dialog. This enhancement will push that functionality up-front and make it more accessible to the users, along with some visual feedback.

Anyway, this one made it into 3.1, so testing and feedback is greatly appreciated. :-)

Coloring tabs in Calc

I just checked in to go-oo trunk the latest patch from Daniel Watson that enables coloring of sheet tabs in Calc. With his latest patch, the tab color dialog looks more compact and polished, which I definitely prefer over the previous version.

In case it’s not obvious how to use this feature, here are some screenshots to show how.

In the context menu on the sheet tab, there is now a new entry called Tab Color.

Select that menu to bring up the color palette dialog.

Select a color and click OK. The sheet tab now has a sliver of color at the bottom. When you switch to another sheet, you’ll notice that the tab you just colored is now filled entirely with that color. And when you color all of your tabs, you’ll get something like this:

Pretty cool huh?

BTW Daniel did this almost entirely on his own, with just a teeny bit of help from Eric Bachard and myself. He is very thorough and the quality of his code is pretty high. He maintains the spec for this feature right here. He is also anxious to get this feature upstreamed, and judging by the kind of reaction he is getting from the upstream Calc team, I’m pretty sure this feature is on a fast track to upstream acceptance. ;-) Great job Daniel!

Oh, one last thing. Please try out this feature and test it out.

It’s text when it’s quoted, stupid!

Warning: WP_Syntax::substituteToken(): Argument #1 ($match) must be passed by reference, value given in /homepages/7/d105059906/htdocs/wp.kohei.us/wp-content/plugins/wp-syntax/wp-syntax.php on line 383

Ok. This is another one I just checked into ooo-build trunk. It’s actually a very minor enhancement, but some people may find this useful.

What I did is to modify Calc’s csv import filter to provide the users an option to always import quoted fields (or cells) as text. Here is a screenshot of the dialog with this new option:

Hopefully what it does is self-evident. If not, here is an example.

Let’s say you have a csv file of the following content:

"0-0" "1-0" "2-0" "3-0" "4-0" "5-0"
"0-1" "1-1" "2-1" "3-1" "4-1" "5-1"

When you import this file into Calc, here is what you get with the current version:

As you can see, most of the fields got converted to dates. You get the same results even if those individual cells are not quoted, because Calc’s current csv import filter doesn’t make any distinction between quoted and unquoted fields in terms of data types.

When you import this file using the new option, you get this instead:

Now, all quoted fields are imported as text, with no surprise conversion. Plain and simple. :-)