Inserting current date and time in one step

current-date-time
Here is another simple feature that may come in handy.

With the change I just checked into ooo-build master, you can now insert the current date and time with just one keystroke. By default, Ctrl+; (semicolon) is bound to the current date, while Ctrl+Shift+; is bound to the current time. These key bindings are configurable in case you don’t like the defaults.

Two more enhancements are in

Today, I’d like to talk about two minor enhancements I just checked into ooo-build master. They are not really earth-shattering per se, but are still worth mentioning and may be interesting to some users.

Insert new sheet tab

insert-sheet-shot

Here is the first enhancement. In Calc, you’ll see a new tab at the right end of the sheet tabs, to allow quick insertion of new sheets. Each time you click this tab, a new sheet gets inserted to the right end. The sheet names are automatically assigned.

Previously, inserting a new sheet had to be done by opening the Insert sheet dialog, selecting the position of the new sheet, how many new sheets to insert, and so on. But if you always append a single sheet at the right end and don’t care to name the new sheet (or name it after the sheet is inserted), this enhancement will save you a few clicks. Implementing this was actually not that hard since I was able to re-use the existing code for most of its functionality. I personally wanted to give it a little more visual appeal, but that will be a future project.

Anyway, I hope some of you will find this useful.

English function names in non-English locale

The second enhancement is related to cell functions. If you use a localized version of OOo, you probably know that the function names are localized. But there have been quite a few requests to support English function names even if the UI is localized. This is where this enhancement comes in.

First, there is now an additional check box in the Formula options page:
english-func-option
By default, the check box is off, which means the localized function names are used. Checking this check box will swap localized function names with the English ones across the board. You can of course uncheck it to go back to the localized function names.

For example, in the French locale, the function that sums a cell range is called SOMME, but when the English function name option is enabled, this becomes SUM, as you can see in the following screenshot:
english-func-displayed

This change takes effect in all of the following areas:

  • formula input and display,
  • function wizard, and
  • formula tips.
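To illustrate the idea of swapping localized names for English ones, here is a minimal Python sketch. It is not how Calc implements the option; the two-entry French table and the helper name to_english are made up for illustration, and the sketch naively treats any identifier followed by an opening parenthesis as a function name.

```python
import re

# Hypothetical, deliberately tiny French-to-English name table (illustration only).
FRENCH_TO_ENGLISH = {"SOMME": "SUM", "MOYENNE": "AVERAGE"}

# Assumption: a function name is a run of letters, digits or dots followed by "(".
_FUNC_NAME = re.compile(r"[A-Za-z][A-Za-z0-9.]*(?=\()")


def to_english(formula, table=FRENCH_TO_ENGLISH):
    """Replace known localized function names in a formula string."""
    def swap(match):
        name = match.group(0)
        # Unknown names are left untouched.
        return table.get(name.upper(), name)

    return _FUNC_NAME.sub(swap, formula)


print(to_english("=SOMME(A1:A3)+MOYENNE(B1:B3)"))  # prints =SUM(A1:A3)+AVERAGE(B1:B3)
```

Cell references such as A1 are not followed by a parenthesis, so they pass through unchanged.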

As always, please test this thoroughly, and report any bugs. Thanks!

Distributed text justification

What’s new?

Here is something I’ve been working on for the past few weeks. Since I just checked in the first version of this feature into ooo-build master, it’s probably a good time for me to talk about this.

This feature introduces a new justification option for cell text known as “distributed justification”, where the left and right edges of the text are aligned with the left and right edges of the bounding box by adjusting the space between characters (inter-character spacing), rather than the space between words (inter-word spacing), across the entire width of the bounding box. This type of distributed text justification makes little sense for Latin-based languages such as English, French and German, but makes a big difference for Asian languages such as Japanese. The reason the normal justification doesn’t work for Asian languages is that, in those languages, you don’t put spaces at word boundaries, and the normal justification relies on the presence of spaces at word boundaries. This is where the distributed justification comes into play.

This distributed justification method is commonly known as 均等割り付け in Japanese, and is said to be one of the blockers when attempting to migrate users away from Excel to Calc.

Horizontal justification

First and foremost, I’d like to cover the horizontal justification. The following screenshot shows the difference between the three horizontal alignment modes:

calc-text-hor-align

As you can see, in the normal left-aligned text, the right edges of the lines are not aligned. When the text is justified, the right edges of the lines are now aligned by adjusting the inter-character spacing, except for the last line, which remains left-aligned. When the text is distributed, even the right edge of the last line becomes aligned with the right edge of the bounding box by equally distributing the characters on that line.
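The difference between the three modes can be sketched in a few lines of Python. This is only a toy model under loud assumptions: every character has the same width, lines are already broken, and the function names distribute_line and layout_paragraph are invented here; the real layout code works with actual glyph widths.

```python
def distribute_line(text, box_width, char_width=1.0):
    """Return the left x position of each character so the line spans box_width.

    The extra space is shared equally between adjacent characters
    (inter-character spacing), so no spaces in the text are needed.
    """
    n = len(text)
    if n < 2:
        return [0.0] * n  # nothing to spread out
    extra = max(box_width - n * char_width, 0.0)
    gap = extra / (n - 1)
    return [i * (char_width + gap) for i in range(n)]


def layout_paragraph(lines, box_width, mode):
    """mode is 'left', 'justified' or 'distributed'."""
    result = []
    for index, line in enumerate(lines):
        is_last = index == len(lines) - 1
        # Justified leaves the last line alone; distributed stretches it too.
        stretch = mode == "distributed" or (mode == "justified" and not is_last)
        if stretch:
            result.append(distribute_line(line, box_width))
        else:
            result.append([float(i) for i in range(len(line))])
    return result


print(layout_paragraph(["abc", "ab"], 5, "justified"))
print(layout_paragraph(["abc", "ab"], 5, "distributed"))
```

The only difference between the last two calls is the final line: in distributed mode its second character is pushed out so that its right edge meets the right edge of the box.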

To expose this, I added a new justification type, Distributed, to the existing Cell Formatting dialog.

calc-text-align-dlg

For the vertical alignment setting, I’ve added two new options Justified and Distributed, to support justification in the vertical direction.

Justifying Asian text mixed with Latin text

While working on this feature, I decided to also tweak the normal justification algorithm to make it work slightly better for Asian text mixed with Latin text such as English. As I mentioned earlier, distributed justification is not really ideal for Latin text. But with society becoming more and more global, we are seeing more and more Asian text intermixed with Latin text, and vice versa. And correctly justifying text with mixed script types requires using different justification methods for the respective script types. After a bit of trial and error, I think I got it right. You can see the result in the following screenshot:

mixed-script-justification

The English portion of the text is justified by inter-word justification, whereas the Japanese portion is justified by inter-character justification. The spaces between the English and Japanese text portions are also slightly adjusted in this scheme.

Vertical justification

Now, let’s move on to the vertical justification. When you justify a text in the vertical direction, that is, in the direction perpendicular to the direction of text flow, the spacing between the lines gets adjusted so that the top and bottom lines get aligned with their respective edges of the bounding box, like so:
vertically-justified
The top cell shows text with default justification, while the bottom cell shows text with vertical justification.

The Cell Format dialog itself provides both Justified and Distributed options for the vertical justification setting, but they do exactly the same thing for horizontally-flowing text. For vertically-flowing text, on the other hand, they do different things, but more on that in the next section.

Justifying vertically flowing text

Now, you can also justify text even when the text is flowing vertically. There are three ways you can make the text flow vertically. You can either

  1. rotate 90 degrees to the right (bottom-to-top),
  2. rotate 90 degrees to the left (top-to-bottom), or
  3. switch to Asian layout mode, which flows text in the top-to-bottom, right-to-left direction.

In these modes, the Justified and Distributed vertical justification options do have different effects. The following screenshot demonstrates different vertical alignment settings in three different vertical flow modes.

vertially-flowing-paragraph

As an added bonus…

The code responsible for the text layout, the code where I made my modification to support this feature, is actually shared between Calc, Draw and Impress. Calc uses it to render complex cell text, while Draw and Impress use it for their text box objects. This means that, any improvement I make in this area will automatically be made available for all three applications. All that needs to be done is to simply adjust the UI in each app and add hooks in their respective import/export filters. Whether or not I’ll work on that during this cycle is another question. Having said that, I’d like to eventually get that done, and I’d like to do it sooner rather than later. But we’ll see how that goes.

But even without making the extra code change in the Draw/Impress code, my change so far was enough to fix this bug which I didn’t even know existed. :-)

Lastly…

As of this writing, I’m not entirely done with this feature yet. I still have to cover some corner cases, and I still need to fix some bugs which I unfortunately discovered while taking screenshots for this post. So, stay tuned for further fine-tuning!

Git on Windows

I guess I don’t really have to tell the world about this, since if you type the title of this blog post in Google it will come back as the top hit. But it’s still worth mentioning msysgit, a pretty darn good git client on Windows. It’s small, it’s efficient, and it’s git. :-) You could of course use git in cygwin, but git in cygwin feels a little “heavy” and by no means small, since you have to get the whole cygwin environment to even use git. So, if you don’t already have cygwin, and want to use git on Windows, msysgit is a pretty good choice. It comes with a minimal bash shell, and while I’m happy to see ssh included with its shell, I was a little disappointed that they left out rsync. But that’s just one minor downside.

For me, msysgit is my git client of choice on Windows, especially in a virtual machine setting where the disk space is tight. On a build machine, though, I still use git in cygwin since I already have to use cygwin to build OOo.

Allergic reaction to Bananas?

Today I went to see my dentist to get my routine teeth cleaning done. In their office, I was asked to fill out a medical history form since my current form was 4 years old. On this form, you are asked to answer questions such as “have you ever had heart attack?”, “are you taking any medications?”, that sort of stuff. Nothing unusual, right? However, one question caught my eye, and I couldn’t believe what I was asked to answer.

Do you have any allergic reaction to Bananas?

Yes, the word Bananas was capitalized for some reason. I asked my dentist right away for clarification (while trying to hold back my laughter), but she was not exactly sure what the question was supposed to mean. She even said she couldn’t believe that question was even on that form! ;-)

But the story didn’t end there. Later, she asked another dentist for her opinion. While they had a pretty lengthy discussion going back and forth, she too was not able to come up with a reasonable explanation for the significance of the question.

Does anyone out there with enough medical knowledge know why they need to ask that question, and how it is relevant to dentistry?

P.S. A quick google search has come up with this explanation.

Setting a breakpoint where an exception is thrown

Caolan told me today that when debugging with gdb, you can actually set a breakpoint right before an exception is thrown.

You can do

gdb ./soffice.bin
(gdb) catch throw
(gdb) run

and gdb breaks at every location where an exception is raised. Or, you can set a normal breakpoint, run catch throw and cont, and gdb will break at the next exception throw event. This technique helps when an exception gets caught somewhere at a higher level in the call stack and you are trying to find out where exactly it is thrown. Such a task, without this technique, would be very time-consuming, tedious, boring, and at times frustrating, especially when you’ve spent hours and still don’t have the location of the thrown exception.

Similarly, you can also break where an exception is caught, with the catch catch command, or you can catch a whole set of other events with this construct.

The only drawback of this catch event construct is that it breaks at every single exception raised or caught, and inside OOo’s codebase the number of those can be quite substantial in some places. Nonetheless, this is a very useful technique to add to your debugging arsenal.

DBF import performance

dbf-import-perf
Here is another performance win! Importing dbf files into Calc is now quicker by 80%. You will probably notice the difference especially when importing a large dbf file. The test document I used had roughly 24000 rows, and importing that took 57 seconds on my machine. Having 24000 rows in a database file (or even in a spreadsheet file) is very common by today’s standard, so this wasn’t good at all.

I have done quite a bit of performance work over the years, but this one was somewhat difficult to tackle. The bottlenecks were scattered all over the place, and different areas required different solutions. Roughly speaking, the following are the areas I tackled to reduce the total import time for dbf files (module name in parentheses):

  • speedup in parsing of dbf file content (connectivity)
  • disabled property change notification during dbf import (dbaccess)
  • more efficient string interning and unicode conversion (sal)
  • reduction in column array re-allocation during import (sc)
  • removal of unnecessary column and row size adjustments post-import (sc)

With all of this, the file that originally took 57 seconds to load now loads in 12 seconds on the same hardware, which roughly translates to an 80% reduction of the total import time!
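For the curious, the arithmetic behind the rounded 80% figure is easy to check. The tiny Python snippet below uses only the two timings quoted above; the helper name percent_reduction is just for illustration.

```python
def percent_reduction(before, after):
    """Percentage of the original time that was saved."""
    return (before - after) / before * 100.0


# DBF import: 57 seconds before, 12 seconds after.
print(round(percent_reduction(57, 12), 1))  # prints 78.9
print(57 / 12)                              # prints 4.75
```

So the import is a little under 80% shorter, or 4.75 times faster.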

This in itself is pretty impressive; however, I was hoping to get it at least under 10 seconds since Excel can load the same file in less than 5 seconds on the same hardware, even through Wine emulation (!). But that’s probably for a future project. For now, I’m content with what I’ve done.

Updates on various stuff

Ok. Here are some updates on some of the stuff I’ve been doing lately. I picked the ones that are particularly worth mentioning.

Saving documents

There are two changes related to the document-saving functionality that I’d like to mention. The first one is the new icon in the document modified status window. As I blogged before, I had made a minor polish to the existing document modified status window, to show the status graphically instead of simply showing ‘*’ when the document is modified. The only problem was that the icon I used to fill that space was pretty lame and ugly. But thanks to jimmac, we now have a much better icon to show the modified status (see below).
doc-modified

The second thing is with the save icon itself. It has been known to us that some users want the ability to always save the document even when the document is not considered “modified”, while others want the save action disabled when the document is not “modified”. I quote the term modified here because even when the content of the document has not changed, some peripheral data may have changed, such as the zoom level, cursor position, active sheet and so on. This peripheral data (which we call the “view data”) is still stored with the document, but changes to it do not set the document modified status. So, if you wanted to save your document with the cursor at a particular location, a certain sheet activated and the zoom level set to a certain level, you had to make a fake change to the content to be able to save the document with the view data change. The solution we had employed previously was to keep the save action always enabled, but only in Calc, where the request for this behavior was greatest. However, some users still found it confusing that only Calc enabled save all the time while the rest of the applications didn’t. Also, a lot of users used the save icon itself to check whether their document had been modified or not, even in Calc.

So, I’ve decided to make it a configuration option. That way we can keep both camps happy. :-) Here is the new check box to toggle this behavior:
always-save-option

Anyway, I hope some of you guys will find this useful, or at least will not find it annoying.

Performance improvement

pagenation-perf-chart
Another thing worth mentioning is the improvement I made to Calc’s pagination performance. Pagination refers to the action of calculating appropriate positions to set page borders over the entire sheet based on the current page size, row/column sizes, presence of manual page breaks and several other factors. I had previously worked on optimizing this when we increased Calc’s row limit to 1 million rows (as I also mentioned during my talk in Orvieto), but apparently that optimization still had massive room for improvement; the test document I had took 7 minutes to perform pagination during printing! Granted, the document had 98 pages to print, but I bet that no one wants to wait that long to print even if the document has that many pages.
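To give a feel for what pagination has to compute, here is a stripped-down Python sketch that handles only the row direction. It is an illustration under simplifying assumptions, not Calc’s code: rows are never split, only row heights, one page height and a set of manual breaks are considered, and the function name page_break_rows is made up.

```python
def page_break_rows(row_heights, page_height, manual_breaks=()):
    """Return the row indices that start a new page (row 0 is never listed)."""
    manual = set(manual_breaks)
    breaks = []
    used = 0.0  # height already consumed on the current page
    for row, height in enumerate(row_heights):
        if row > 0 and (row in manual or used + height > page_height):
            breaks.append(row)
            used = 0.0
        used += height
    return breaks


print(page_break_rows([10] * 10, 30))                     # prints [3, 6, 9]
print(page_break_rows([10] * 10, 30, manual_breaks=[2]))  # prints [2, 5, 8]
```

Note how a single manual break shifts every automatic break after it, which is one reason a change in one place can force the rest of the sheet to be re-paginated.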

Long story short, I have reduced the duration from 7 minutes to roughly 35 seconds. Though I’m very happy with the result, it required a large amount of refactoring to get to that point, and when a large amount of code changes, the chance of introducing regressions unfortunately goes up. So, please pay special attention to Calc’s pagination behavior and its handling of row heights, and if you notice any problems, I’d like to hear from you, preferably with a test document or two.

DataPilot field popup window

Last but not least, I’d like to mention this one. The DataPilot field popup window has been in the works for quite some time since 3.1. I have blogged about the initial version and the 2nd incarnation. Now the 3rd incarnation is on the horizon. As they say, a picture is worth a thousand words. So without further ado, let’s take a look at the screenshot:
dp-popup-window
This version has a “toggle all” check box to quickly turn all field members on and off, a “select only current” button to select only the currently highlighted member, and an “unselect only current” button to select all but the current member. Not visible in this screenshot is the support for the Gnome accessibility framework, which is also new in this version.
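The three controls boil down to simple operations on the checked state of the members. A minimal Python sketch, assuming the state is kept as a mapping from member name to a boolean (the function names and the sample member names are invented for illustration):

```python
def toggle_all(members, checked):
    """Check or uncheck every member at once."""
    return {name: checked for name in members}


def select_only_current(members, current):
    """Check the current member and uncheck everything else."""
    return {name: name == current for name in members}


def unselect_only_current(members, current):
    """Check everything except the current member."""
    return {name: name != current for name in members}


state = {"East": True, "West": False, "North": True}
print(select_only_current(state, "West"))
print(unselect_only_current(state, "West"))
```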

So…

These are the highlights of some of the stuff I’ve been doing recently. There are more things on the horizon, so stay tuned.

Automatic decimal place adjustment by column width

Adjusting decimals by column width

Here is what I’m working on at the moment. I’m working on changing Calc’s behavior so that when a value is entered into a cell, and the cell width is not wide enough to show all its significant digits, it will truncate it to fit the available column width when the number format of that cell is General.

Let me demonstrate this using the value of PI entered into a cell. I have made the column wide enough to show all available significant digits of the PI value. This is what it looks like first:
auto-decimal-1

Then I’ve decided that the column is too wide for my liking, and dragged the column border to make it narrower:
auto-decimal-2
Notice that the displayed value now has fewer digits to fit the new column width. Now, I have decided to make the column even narrower. See what happens when I do that:
auto-decimal-3
The cell now only displays “3.14”. But as I said, this automatic decimal place adjustment takes place only when the cell’s number format is General. If the number format specifies some fixed decimal places for that cell, Calc won’t adjust decimals automatically, and gladly displays “###” when the value doesn’t fit the current column width.
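The behavior shown in the screenshots can be approximated with a short Python sketch. Several things here are assumptions rather than facts about Calc: the column width is measured in characters instead of pixels, digits are dropped by rounding via Python’s string formatting, falling back to “###” when even the integer part does not fit is a guess, and the name fit_general is hypothetical.

```python
import math


def fit_general(value, max_chars, max_decimals=15):
    """Return the longest decimal rendering of value that fits in max_chars."""
    for decimals in range(max_decimals, -1, -1):
        text = f"{value:.{decimals}f}"
        if "." in text:
            # General format does not show trailing zeros.
            text = text.rstrip("0").rstrip(".")
        if len(text) <= max_chars:
            return text
    # Assumption: not even the integer part fits.
    return "###"


print(fit_general(math.pi, 8))  # prints 3.141593
print(fit_general(math.pi, 4))  # prints 3.14
```

Narrowing the available width from 8 to 4 characters reproduces the step from the second screenshot to the third.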

Default decimal places

Some of you may notice that, using the current version of Calc, a cell with the value of PI only shows 3.14, or typing any number into a cell only shows up to 2 decimal places unless you manually specify decimal places for that cell. That’s because Calc by default only shows 2 decimal places for cells with General number format. You can change that by increasing or decreasing the default number of decimal places in the Options dialog (in the Calculate page). However, that behavior is a bit confusing, especially when you type in a number such as 3.01234 and the cell only displays 3.01 even though there is enough space to show the whole value. That’s another thing I’m working on to change.

The new Calculate page now has an additional check box at the bottom. You can check or uncheck this check box to either limit the number of decimal places for cells with General number format, or leave it unlimited.
auto-decimal-option
What the default behavior should be is still under discussion, but I’m pretty sure that we will agree on leaving it unlimited by default.

Thank you Google, once again!

I just received a nice gift of a free T-shirt from Google for participating in the 2009 Google Summer of Code event as a mentor.
google_soc
Thank you Google, for this good-looking T-shirt! I still have the one I got 2 years ago, you know. :-)

It came by FedEx, and the funny thing is, when the carrier left my T-shirt at the front door of my house, I received an email notification before I even noticed anyone leaving the package at the door. I understand that the world is becoming more and more “connected”, but I never expected the world to be “connected” this much!

Anyway, I know that the Google Summer of Code is all about the students writing cool code and all, and the mentors are there to enable them to their full potential. But it’s nice to receive some recognition for the hard-working mentors as well.