Setting break point where an exception is thrown

Caolan told me today that when debugging with gdb, you can actually set a break point right before an exception is thrown.

You can do

gdb ./soffice.bin
(gdb) catch throw
(gdb) run

and gdb breaks at every location where an exception is raised. Or, you can set a normal break point, run catch throw and cont, and gdb will break at the next exception throw event. This technique helps when an exception gets caught somewhere at higher level in the call stack and you are trying to find out where exactly it is thrown. Such task, without this technique, would be very time-consuming, tedious, boring, and at times frustrating especially when you’ve spent hours and still don’t have the location of the thrown exception.

Similarly, you can also break where an exception is caught, with catch catch command, or you can catch a whole set of other events with this construct.

The only drawback with this catch event construct is that, it breaks at every single exception raised or caught, which, inside OOo’s codebase can be quite substantial in some places. Nonetheless, this is a very useful technique to add to your debugging arsenal.

DBF import performance

dbf-import-perfHere is another performance win! Importing dbf files into Calc is now quicker by 80%. You will probably notice the difference especially when importing a large dbf file. The test document I used had roughly 24000 rows, and importing that took 57 seconds on my machine. Having 24000 rows in a database file (or even in a spreadsheet file) is very common by today’s standard, so this wasn’t good at all.

I had done quite a bit of performance work over the years, but this one was somewhat difficult to tackle. The bottlenecks were fragmented all over the place which required different solutions to different areas. Roughly speaking, the following are the areas I tackled to reduce the total import time for dbf files (module name in parentheses):

  • speedup in parsing of dbf file content (connectivity)
  • disabled property change notification during dbf import (dbaccess)
  • more efficient string interning and unicode conversion (sal)
  • reduction in column array re-allocation during import (sc)
  • removal of unnecessary column and row size adjustments post-import (sc)

With all of this, the file that originally took 57 seconds to load now loads in 12 seconds on the same hardware, which roughly translates to 80% reduction of the total import time!

This itself is pretty impressive; however, I was hoping to get it at least under 10 seconds since Excel can load the same file less than 5 seconds on the same hardware, even through wine emulation (!). But that’s probably for a future project. For now, I’m content with what I’ve done.

Updates on various stuff

Ok. Here is some updates on some of the stuff I’ve been doing lately. I picked the ones that are particularly worth mentioning.

Saving documents

There are two changes related to the document-saving functionality that I’d like to mention. The first one is the new icon in the document modified status window. As I blogged before, I had made a minor polish to the existing document modified status window, to show the status graphically instead of simply showing ‘*’ when the document is modified. The only problem was that the icon I used to fill that space was pretty lame and ugly. But thanks to jimmac, we now have a much better icon to show the modified status (see below).
doc-modified

The second thing is with the save icon itself. It has been known to us that some users want the ability to always save the document even when the document is not considered “modified”, while others want the save action disabled when the document is not “modified”. I quote the term modified here because even when the content of the document has not changed, some peripheral data may have changed, such as the zoom level, cursor position, active sheet and so on and so forth. These peripheral data (that we call the “view data”) are still stored with the document, but changes in these data do not set a document modified status. So, if you wanted to save your document with the cursor at a particular location, a certain sheet activated and the zoom level set to a certain level, you had to make a fake change to the content to be able to save the document with the view data change. The solution we had employed previously was to always enable this only for Calc, where the request for this behavior was greatest. However, some users still found it confusing that only Calc enables the save all the time while the rest of the applications didn’t. Also, a lot of users used the save icon itself to check whether their document has been modified or not even in Calc.

So, I’ve decided to make it a configuration option. That way we can keep both camps happy. :-) Here is the new check box to toggle this behavior:
always-save-option

Anyway, I hope some of you guys will find this useful, or at least will not find it annoying.

Performance improvement

pagenation-perf-chart
Another stuff worth mentioning is the improvement I made on Calc’s pagination performance. Pagination refers to the action of calculating appropriate positions to set page borders over the entire sheet based on the current page size, row/column sizes, presence of manual page breaks and several other factors. I had previously worked on optimizing this when we increased Calc’s row limit to 1 million rows (as I also mentioned during my talk in Orvieto), but apparently that optimization still had massive room for improvement; the test document I had took 7 minutes to perform pagination during printing! Granted, the document had 98 pages to print, but I bet that no one wants to wait that long to print even if the document has that many pages.

Long story short, I have reduced the duration from 7 minutes to roughly 35 seconds. Though I’m very happy with the result, it required a large amount of refactoring to get to that point, and when a large amount of code changes, the chance of introducing regressions unfortunately goes up. So, please pay special attention to Calc’s pagination behavior and its handling of row heights, and if you notice any problems, I’d like to hear from you, preferably with a test document or two.

DataPilot field popup window

Last but not least, I’d like to mention this one. The DataPilot field popup window has been in the works for quite some time since 3.1. I have blogged about the initial version and the 2nd incarnation. Now the 3rd incarnation is on the horizon. As they say, a picture is worth a thousand words. So without further ado, let’s take a look at the screenshot:
dp-popup-window
This version has a “toggle all” check box to quickly turn on and off all field members, “select only current” button to only select currently selected member, and “unselect only current” button to select all but the current member. Also not visible on this screenshot is the support for Gnome accessibility framework, which is also new in this version.

So…

These are the highlights of some of the stuff I’ve been doing recently. There are more things on the horizon, so stay tuned.