Speedier export of rich text cells

Here is another performance improvement that just landed on master.

Background

It was brought to our attention that the performance of saving documents to ODF spreadsheet format had been degrading quite noticeably. This was especially true when the document contained lots of what we call rich text cells. Rich text cells are those cells that contain text with mixed format spans, or text that consists of multiple lines. These cells are handled differently from simple strings internally, and have slightly more overhead than the simple string counterparts. Because of this, saving a document full of such texts was always slower than saving one with just numbers and simple strings.

However, even with this unavoidable overhead, the performance of saving rich text cells was clearly going in the wrong direction. Therefore it was time to act.

Long story short, after many days of code reading and writing, I brought it to a state where I can share some numbers.

Measuring export performance

I measured the performance of exporting rich text cells in the following steps.

  1. Create a new spreadsheet document.
  2. Type in cell A1 3 lines of ‘libreoffice’. Here, you can hit Ctrl-Enter to move to the next line within the same cell.
  3. Copy A1, select A1:N1000 and paste, to replicate the content of A1 to all cells in the range.
  4. Save the document as ODF spreadsheet document, and measure its duration.

Results

I performed the above measurement with 3.5, 3.6, 4.0, 4.1, and the latest master (slated to become 4.2) builds, and these are the numbers.
edit-text-export-ods-perf

It is clear from this chart that the performance started to suffer first in version 3.6, then gradually worsened over 4.0 and 4.1. The good news is that we have managed to bring the number back down in the master build, even lower than that of 3.5 which I used as the point of reference. Not just slightly lower, but much, much lower.

I don’t know about you, but I’m quite happy with this result.