Benchmark results on mdds multi_type_vector

In this post, I’m going to share the results of some benchmark testing I have done on multi_type_vector, which is included in the mdds library. The benchmark was done to measure the impact of a change I made recently to improve the performance of block searches, which affects a major part of the library’s functionality.

Background

One of the data structures included in mdds, called multi_type_vector, stores values of different types in a single logical vector. LibreOffice Calc is one primary user of this structure: Calc uses it as its cell value store, and each instance of this value store represents a single column.

Internally, multi_type_vector creates multiple element blocks which are in turn stored in its parent array (primary array) as block structures. This primary array maps a logical position of a value to the actual block structure that stores it. Up to version 1.5.0, this mapping process involved a linear search that always started from the first block of the primary array. This was because each block structure, though it stored the size of its element block, did not store its logical position. The only way to find the element block that intersects the logical position of a value was therefore to scan from the first block and keep accumulating the sizes of the encountered blocks. The following diagram depicts the structure of multi_type_vector’s internal store as of 1.5.0:

The reason for not storing the logical positions of the blocks was to avoid having to update them whenever the blocks get shifted by value insertions, which happen quite frequently when editing spreadsheet documents.
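
To make this concrete, below is a rough sketch of the pre-1.6 layout and the resulting linear lookup. All the names here are hypothetical and heavily simplified; they do not match the actual mdds code:

#include <cstddef>
#include <vector>

// Hypothetical, simplified model of the 1.5.0 layout; the real mdds code
// uses different names and a type-erased element block store.
struct block
{
    std::size_t size; // number of elements in this block
    // note: no logical position is stored prior to 1.6
};

// Linear lookup: scan from block 0, accumulating sizes, until the block
// containing logical position 'pos' is found.
std::size_t find_block_linear(const std::vector<block>& blocks, std::size_t pos)
{
    std::size_t start = 0; // logical start position of the current block
    for (std::size_t i = 0; i < blocks.size(); ++i)
    {
        if (pos < start + blocks[i].size)
            return i;
        start += blocks[i].size;
    }
    return blocks.size(); // pos is out of range
}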

Of course, sometimes one has to perform repeated searches to access element values spread across multiple element blocks, in which case starting every single search from the first block, or block 0, can be prohibitively expensive, especially when the vector is heavily fragmented.

To alleviate this, multi_type_vector provides the concept of position hints, which allows the caller to start the search from block N where N > 0. Most of multi_type_vector’s methods return a position hint which can be used for the next search operation. A position hint object stores the position of the block that was last accessed or modified by the call. This allows the caller to chain all necessary search operations in such a way that the primary array is scanned no more than once for the entire sequence of operations. It was largely inspired by std::map’s insert method, which provides a very similar mechanism. The only prerequisite is that access to the elements occurs in perfect ascending order. For the most part, this approach worked quite well.
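
For illustration, here is a minimal sketch of the position-hint pattern using the mdds 1.x API (the element block function trait shown is the stock one shipped with mdds):

#include <mdds/multi_type_vector.hpp>
#include <mdds/multi_type_vector_trait.hpp>

#include <cstddef>

using mtv_type = mdds::multi_type_vector<mdds::mtv::element_block_func>;

int main()
{
    mtv_type db(1000); // starts as one empty block of size 1000

    // Each set() call returns an iterator to the block it touched, which
    // serves as the position hint for the next call.  The accessed
    // positions must be in ascending order for the hints to be valid.
    mtv_type::iterator hint = db.begin();
    for (std::size_t pos = 0; pos < db.size(); pos += 2)
        hint = db.set(hint, pos, 1.0);

    return 0;
}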

The downside of this is that there are times when you need to access multiple element positions and you cannot always arrange your access pattern to take advantage of the position hints. This is especially the case in the multi-threaded formula cell execution routine that Calc introduced some versions ago. This has motivated us to switch to an alternative lookup algorithm, and binary search was the obvious replacement.

Binary search

Binary search is an algorithm well suited to finding a target value in an array where the values are stored in sorted order. Compared to linear search, binary search performs much faster except for very small arrays. People often confuse this with a binary search tree, but binary search as an algorithm is not limited to tree structures; it can be used on arrays as well, as long as the stored values are sorted.

While it’s not very hard to implement binary search manually, the C++ standard library already provides several binary search implementations such as std::lower_bound and std::upper_bound.
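
As an example, once each block carries its logical start position, the lookup can be expressed with std::upper_bound. This is a sketch using the simplified, hypothetical block layout from earlier, not the actual mdds code:

#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Simplified 1.6-style block record: the logical start position is now
// stored alongside the size.
struct block
{
    std::size_t position; // logical position of the block's first element
    std::size_t size;
};

// Binary lookup: find the first block that starts after 'pos'; the block
// containing 'pos' is the one immediately before it.  Assumes a non-empty
// block array whose first block starts at position 0.
std::size_t find_block_binary(const std::vector<block>& blocks, std::size_t pos)
{
    auto it = std::upper_bound(
        blocks.begin(), blocks.end(), pos,
        [](std::size_t p, const block& b) { return p < b.position; });

    return std::distance(blocks.begin(), it) - 1;
}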

Switch from linear search to binary search

The challenge in switching from linear search to binary search was to refactor multi_type_vector’s implementation to store the logical positions of the element blocks and update them in real time as the vector gets modified. The good news is that, as of this writing, all necessary changes have been done, and the current master branch fully implements binary-search-based block position lookup in all of its operations.

Benchmarks

To get a better idea of how this change will affect the performance profile of multi_type_vector, I ran some benchmarks using both mdds version 1.5.0 – the latest stable release that still uses linear search – and mdds version 1.5.99 – the current development branch which will eventually become the stable 1.6.0 release. The benchmark tested the following three scenarios:

  1. set() that modifies the block layout of the primary array. This test sets a new value to an empty vector at positions that monotonically increase by 2, until it reaches the end of the vector.
  2. set() that updates the value of the last logical element of the vector. The update happens without modifying the block layout of the primary array. Like the first test, this one also measures the performance of the block position lookup, but since the block count does not change, it is expected that the block position lookup comprises the bulk of its operation.
  3. insert() that inserts a new element block at the logical mid-point of the vector and shifts all the elements that occur below the point of insertion. The primary array of the vector is made to be already heavily fragmented prior to the insertion. This test involves both block position lookup and shifting of the element blocks. Since the new multi_type_vector implementation updates the positions of element blocks whose logical positions have changed, this test is designed to measure the cost of this extra operation, which was not performed in 1.5.0.

In each of these scenarios, the code executed the target method N times, where N was specified to be 10,000, 50,000, or 100,000. Each test was run twice, once with position hints and once without them. Each individual run was then repeated five times and the average duration was computed, as sketched below. In this post, I will only include the results for N = 100,000 in the interest of space.
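
The measurement loop roughly follows the shape below. This is a simplified sketch for illustration only; the actual benchmark code lives in the perf-test repository mentioned below:

#include <chrono>

// Run a scenario once per call, repeat it five times, and average the
// wall-clock durations.  'Func' stands in for any of the three tests.
template<typename Func>
double measure_average_seconds(Func run_scenario, int repeats = 5)
{
    double total = 0.0;
    for (int i = 0; i < repeats; ++i)
    {
        auto start = std::chrono::steady_clock::now();
        run_scenario(); // e.g. 100,000 calls to set() or insert()
        auto end = std::chrono::steady_clock::now();
        total += std::chrono::duration<double>(end - start).count();
    }
    return total / repeats;
}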

All binaries used in this benchmark were built with a release configuration; that is, on Linux the binaries were built with gcc using the -O3 -DNDEBUG flags, and on Windows with MSVC (Visual Studio 2017) using the /MD /O2 /Ob2 /DNDEBUG flags.

All of the source code used in this benchmark is available in the mdds perf-test repository hosted on GitLab.

The benchmarks were performed on machines running either Linux (Ubuntu 18.04 LTS) or Windows, with a variety of CPUs with varying numbers of native threads. The following table summarizes all test environments used in this benchmark:

It is very important to note that, because of the disparity in OS environments, compilers and compiler flags, one should NOT compare the absolute values of the timing data to draw any conclusions about the relative performance of the CPUs.

Results

Scenario 1: set value at monotonically increasing positions

This scenario tests a set of operations that consists of first seeking the position of a block that intersects with the logical position, then setting a new value to that block, which causes the block to split and a new value block to be inserted at the point of the split. The test repeats this process 100,000 times, and in each iteration the block search distance progressively increases as the total number of blocks grows. In Calc’s context, scenarios like this are very common, especially during file load.

Without further ado, here are the results:

You can easily see that the binary search (1.5.99) achieves nearly the same performance as the linear search with position hints in 1.5.0. Although not very visible in these figures due to the scale of the y-axes, position hints are still beneficial and provide a small but consistent timing reduction in 1.5.99.

Scenario 2: set at last position

The nature of what this scenario tests is very similar to that of the previous scenario, but the cost of the block position lookup is much more emphasized while the cost of the block creation is eliminated. Although the average durations in 1.5.0 without position hints are consistently higher than their equivalent values from the previous scenario across all environments, the overall trends do remain similar.

Scenario 3: insert and shift

This last scenario was included primarily to test the cost of updating the stored block positions after the blocks get shifted, as well as to quantify how much of an increase this overhead would cause relative to 1.5.0. In terms of Calc use cases, this operation roughly corresponds to inserting new rows and shifting existing non-empty rows downward after the insertion.

Without further ado, here are the results:

These results indicate that, when compared to the average performance of 1.5.0 with position hints, the same operation can be 4 to 6 times more expensive in 1.5.99. Without position hints, the new implementation is still more expensive, but to a much lesser degree. Since the scenario tested here is largely bottlenecked by the block position updates, use of position hints seems to provide only marginal benefit.

Adding parallelism

Faced with this dilemma of increased overhead, I did some research to see if there was a way to reduce it. The suspect code in question is in fact a very simple loop, and all it does is add a constant value to a known number of blocks:

template<typename _CellBlockFunc, typename _EventFunc>
void multi_type_vector<_CellBlockFunc, _EventFunc>::adjust_block_positions(size_type block_index, size_type delta)
{
    size_type n = m_blocks.size();

    if (block_index >= n)
        return;

    // Shift the stored logical positions of all blocks from block_index on.
    for (; block_index < n; ++block_index)
        m_blocks[block_index].m_position += delta;
}

Since the individual block positions can be updated entirely independently of each other, I decided it would be worthwhile to experiment with two parallelization techniques: loop unrolling and OpenMP. I found these two attractive for this particular case because they both require only minimal code changes.

Adding support for OpenMP was rather easy, since all one has to do is add a #pragma line immediately above the loop to be parallelized, and pass an appropriate OpenMP flag to the compiler when building the code.
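
In its simplest form, the change looks like this; a generic illustration rather than the actual mdds code:

#include <cstdint>
#include <vector>

// One added pragma parallelizes the loop.  Build with -fopenmp (GCC/Clang)
// or /openmp (MSVC); without the flag the pragma is simply ignored and the
// loop runs serially.
void add_delta(std::vector<int64_t>& values, int64_t delta)
{
    int64_t n = static_cast<int64_t>(values.size());
    #pragma omp parallel for
    for (int64_t i = 0; i < n; ++i)
        values[i] += delta;
}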

Adding support for loop unrolling took a little fiddling around, but eventually I was able to make the necessary change without breaking any existing unit test cases. After some quick experimentation, I settled on updating 8 elements per iteration.

After these changes were done, the above original code turned into this:

template<typename _CellBlockFunc, typename _EventFunc>
void multi_type_vector<_CellBlockFunc, _EventFunc>::adjust_block_positions(int64_t start_block_index, size_type delta)
{
    int64_t n = m_blocks.size();
 
    if (start_block_index >= n)
        return;
 
#ifdef MDDS_LOOP_UNROLLING
    // Ensure that the section length is divisible by 8.
    int64_t len = n - start_block_index;
    int64_t rem = len % 8;
    len -= rem;
    len += start_block_index;
    #pragma omp parallel for
    for (int64_t i = start_block_index; i < len; i += 8)
    {
        m_blocks[i].m_position += delta;
        m_blocks[i+1].m_position += delta;
        m_blocks[i+2].m_position += delta;
        m_blocks[i+3].m_position += delta;
        m_blocks[i+4].m_position += delta;
        m_blocks[i+5].m_position += delta;
        m_blocks[i+6].m_position += delta;
        m_blocks[i+7].m_position += delta;
    }
 
    rem += len;
    for (int64_t i = len; i < rem; ++i)
        m_blocks[i].m_position += delta;
#else
    #pragma omp parallel for
    for (int64_t i = start_block_index; i < n; ++i)
        m_blocks[i].m_position += delta;
#endif
}

I have made the loop-unrolling variant of this method a compile-time option and kept the original method intact to allow ongoing comparison. The OpenMP part didn’t need any special pre-processing since it can be turned on and off via a compiler flag with no impact on the code itself. I did need to switch the loop counter from the original size_type (which is a typedef of size_t) to int64_t so that the code could be built with OpenMP enabled on Windows using MSVC; apparently, the Microsoft Visual C++ compiler requires the loop counter to be a signed integer for the code to even build with OpenMP enabled.

With these changes in, I wrote a separate test code just to benchmark the insert-and-shift scenario with all permutations of loop-unrolling and OpenMP. The number of threads to use for OpenMP was not specified during the test, which would cause OpenMP to automatically use all available native threads.

With all of this out of the way, let’s look at the results:

Here, LU and OMP stand for loop unrolling and OpenMP, respectively. The results from each machine consist of four groups, each having two timing values, one for 1.5.0 and one for 1.5.99. Since 1.5.0 uses neither loop unrolling nor OpenMP, its results show no variance between the groups, as expected. The numbers for 1.5.99 are generally much higher than those of 1.5.0, but the use of OpenMP brings the numbers down considerably. Although how much OpenMP reduces the average duration varies from machine to machine, the number of available native threads likely plays some role. The reduction by OpenMP on the Core i5 6300U (which comes with 4 native threads) is approximately 30%, the number on the Ryzen 7 1700X (with 16 native threads) is about 70%, and the number on the Core i7 4790 (with 8 native threads) is about 50%. The relationship between the native thread count and the rate of reduction somewhat follows a linear trend, though the numbers on the Xeon E5-2697 v4, which comes with 32 native threads, deviate from this trend.

The effect of loop unrolling, on the other hand, is visible only to a much lesser degree; in all but two cases it resulted in a reduction of 1 to 7 percent. The only exceptions are the Ryzen 7 without OpenMP, which showed an increase of nearly 16%, and the Xeon E5630 with OpenMP, which showed a slight increase of 0.1%.

The 16% increase with the Ryzen 7 environment may well be an outlier, since the other test in the same environment (with OpenMP enabled) did result in a reduction of 7% – the highest of all tested groups.

Interpreting the results

Hopefully the results presented in this post are interesting and provide insight into the nature of the change in multi_type_vector in the upcoming 1.6.0 release. But what does this all mean, especially in the context of LibreOffice Calc? These are my personal thoughts.

  • From my own observation of numerous bug reports and performance issues from various users of Calc, I can confidently say that the vast majority of cases involve reading and updating cell values without shifting of cells, either during file load or during execution of features that involve massive amounts of cell I/O. Since those cases are primarily bottlenecked by the block position search, the new implementation will bring a massive win, especially in places where use of position hints was not practical. That being said, where the code already uses position hints with the old implementation, block search performance will likely see no noticeable improvement after switching to the new implementation.
  • While the increased overhead in block shifting, which is associated with insertion or deletion of rows in Calc, is certainly a concern, it may not be a huge issue in day-to-day usage of Calc. It is worth pointing out that what the benchmark measures is repeated insertions and shifting of highly fragmented blocks, which translates to repeated insertions or deletions of rows in a Calc document where the column values consist of uniformly alternating types. In normal Calc usage, it is more likely that the user would insert or delete rows as one discrete operation, rather than as a series of thousands of repeated row insertions or deletions. I am highly optimistic that Calc can absorb this extra overhead without its users noticing.
  • Even if Calc encounters a very unlikely situation where this increased overhead becomes visible at the UI level, enabling OpenMP, assuming that’s practical, would help lessen the impact of this overhead. The benefit of OpenMP becomes more elevated as the number of native CPU threads becomes higher.

What’s next?

I may invest some time looking into potential use of GPU offloading to see if that would further speed up the block position update operations. The benefit of loop unrolling was not as great as I had hoped, but this may be highly CPU and compiler dependent. I will likely continue to dig deeper into this and keep on experimenting.

LibreOffice Development Talk at Triangle C++ Developer’s Group

It was a pleasure to have been given an opportunity to talk about LibreOffice development the other day at the Triangle C++ Developer’s Group. Looking back, what we went through was a mixture of hardships, accomplishments, and learning experiences intertwined in a unique fashion. It was great to be able to talk about it, and hopefully it was entertaining enough for those of you who decided to show up to my talk.

Here is a link to the slides I used during my talk.

Thanks again, everyone!

Edit: Here is a PDF version of my slides for those of you who don’t have a program that can open odp files.

Orcus 0.11.0

I’m very pleased to announce that version 0.11.0 of the orcus library is officially out in the wild! You can download the latest source package from the project’s home page.

Lots of changes went into this release, but the two that I would highlight most are the inclusion of JSON and YAML parsers and their associated tools and interfaces. This release adds two new command-line tools: orcus-json and orcus-yaml. The orcus-json tool optionally handles JSON references to external files when the --resolve-refs option is given, though currently it only supports resolving external files that are on the local file system, and only when the paths are relative to the referencing file.

I’ve also written API documentation for the JSON interface in case someone wants to give it a try. Though the documentation on orcus is always a work in progress, I’d like to spend more time to bring it to a more complete state.

On the import filter front, Markus Mohrhard has been making improvements to the ODS import filter, especially in the area of styles import. Oh, BTW, he is also proposing to mentor a GSoC project on this front under the LibreOffice project. So if you are interested, go and take a look!

That’s all I have at the moment. Thank you, ladies and gentlemen.

LibreOffice mini-Conference 2016 in Osaka

Night view in Osaka, overlooking the Metropolitan Expressway.

Keynote

First off, let me just say that it was such an honor and pleasure to have had the opportunity to present a keynote at the LibreOffice mini-Conference in Osaka. It was a bit surreal to be given such an opportunity almost one year after my involvement with LibreOffice as a paid full-time engineer ended, but I’m grateful that I can still give some tales that some people find interesting. I must admit that I haven’t been that active since I left Collabora in terms of the number of git commits to the LibreOffice core repository, but that doesn’t mean that my passion for that project has faded. In reality it is far from it.

There were a lot of topics I could potentially have covered in my keynote, but I chose to talk about the 5-year history of the project, simply because I felt that we all deserved to give ourselves a lot of praise for the numerous great things we’ve achieved in these five years, which not many of us do simply because we are all very humble beings and always too eager to keep moving forward. I felt that, sometimes, we do need to stop for a moment, look back and reflect on what we’ve done, and enjoy the fruits of our labors.

Osaka

Though I had visited Kyoto once before, this was actually my first time in Osaka. Access from Kansai International Airport (KIX) into the city was pretty straightforward. The venue was located on the 23rd floor of Grand Front Osaka North Building Tower B (right outside the north entrance of JR Osaka Station), on the premises of GMO DigiRock, who kindly sponsored the space for the event.

Osaka Station north entrance.

Conference

The conference took place on Saturday, January 9th, 2016. The conference program consisted of my keynote, followed by four regular-length talks (30 minutes each), five lightning talks (5 minutes each), and round-table discussions at the end. Topics of the talks included: potential use of LibreOffice in high school IT textbooks, real-world experiences of large-scale migration from MS Office to LibreOffice, LibreOffice API how-tos, and using LibreOffice with NVDA, the open source screen reader.

After the round-table discussions, we had a social event with beer and pizza before we concluded the event. Overall, 48 participants showed up for the conference.

Conference venue.

Videos of the conference talks have been made available on YouTube thanks to the efforts of the LibreOffice Japanese Language Team.

Slides for my keynote are available here.

Hackfest

We also organized a hackfest on the following day at JUSO Coworking. More than 20 people showed up for the hackfest to work on things like translating the UI strings to Japanese, authoring event-related articles, and of course hacking on LibreOffice. I myself worked on implementing simple event callbacks in the mdds library, which, by the way, was just completed and merged to the master branch today.

Many folks hard at work during hackfest.

Conclusion

It was great to see so many faces, new and old, many of whom traveled long distances to attend the conference. I was fortunate enough to be able to travel all the way from North Carolina across the Pacific, and it was well worth the hassle of jet lag.

Last but not least, be sure to check out the article (in Japanese) Naruhiko Ogasawara has written up on the conference. The article goes in-depth with my keynote, and is very well written.

Other Pictures

I’ve taken quite a bit of pictures of the conference as well as of the city of Osaka in general. Jump over to this Facebook album I made of this event if you are interested.

Last day

Today is my last day with Collabora, and also my last day as a full-time engineer working on the LibreOffice (and formerly OpenOffice.org) code base. It’s been 8 long years of adventure. Lots of things happened, and we’ve achieved a great many things. I’m certainly very proud of having been a part of it.

From this point on, I’ll participate in the project purely as a volunteer. I have not yet figured out what I want to do nor how much I can do, and figuring that out will probably be my first task as a new volunteer contributor.

Thank you for being patient with me in the last 8 years. You guys have been great, and, even though I’ll have much less time to devote to LibreOffice going forward, I still hope to see you guys around from time to time!

Seattle LibreFest

Today I’d like to talk about the LibreOffice Hackfest (LibreFest) that we held in Seattle on October 26th. This happens to be the very first hackfest event that I have participated in outside of those held at the annual LibreOffice conferences, and the first one ever in the United States. Quite frankly, I didn’t really know what to expect going into this event. But despite that, I’m pleased to say that the event went quite well, with 32 participants joining in total, which was much more than we had anticipated.

The hackfest took place inside the Communications Building at the University of Washington, located in downtown Seattle. We borrowed a small classroom to host the event, and later brought in extra chairs to accommodate everyone.

hackfest-1

Four of us were there from the LibreOffice project – Robinson Tryon, Norbert Thiebaud, Bjoern Michaelsen and myself – though Bjoern had to leave early to catch his flight. Some of us came to the venue around 9 AM to set things up, and people started showing up around 9:30. Once the event officially started at 10 AM, we split into two tracks: the hackfest track, where people worked on building LibreOffice from the git repository and making changes, and the QA track, where people tested LibreOffice to report bugs. Robinson assisted those in the QA track, and the rest of us helped those in the hackfest track.

We spent much of the morning setting people up and getting their builds going, which was quite a challenge in and of itself. We eventually got everyone building one way or another, and the availability of a virtual machine environment was quite helpful for some of the participants. Others opted to use their own machines to build it on.

hackfest-2

Some participants came late and joined in the afternoon session, while others only joined the morning session and had to leave in the afternoon. About half of us stayed until late evening. Overall, it was great to see so much interest in our project, and I was pleased to see that many decided to stay late to get things done.

hackfest-3

Overall, we had a very successful hackfest event. I would like to thank Robinson for working hard to organize this hackfest, and Lee Fisher, who was very helpful in organizing the event, especially in handling matters on the Seattle side.

Two things I’ve learned from this event are: 1) access to a very fast virtual build environment can be quite helpful, and 2) Slackware is still very much alive! With regard to the first, I feel that we should put more emphasis on having the participants use virtual machines to build LibreOffice at future hackfest events, and have mentors adequately trained to set them up. With regard to the popularity of Slackware, well, we need to encourage more participation from Slackware users and have them share tips on building LibreOffice on Slackware in our wiki.

I hope those who came to the event learned something worthwhile (I certainly did), and I hope to see them again in the LibreOffice project!

OpenCL test documents for Calc

opencl-doc-shot

Some of you have asked me previously whether or not we can share any test documents to demonstrate Calc’s new OpenCL-based formula engine. Thanks to AMD, we can now make available 3 test documents that showcase the performance of the new engine, and how it compares to Calc’s existing engine as well as Excel’s.

Download                                  Platform
OpenCL-test-documents-Excel-64-bit.zip    Excel (64-bit)
OpenCL-test-documents.zip                 Calc (Windows, 32-bit)
                                          Calc (Linux, 32-bit)
                                          Calc (Linux, 64-bit)
                                          Excel (32-bit)

These files are intentionally in Excel format so that they can be used both in Calc and Excel. They also contain a VBA script that automates the execution of formula cell recalculation and measures the recalculation time with a single button click.

All you have to do is to open one of these files, click “Recalculate” and wait for it to finish. It should give you the number that represents the duration of the recalculation in milliseconds.

Note that the 64-bit version of Excel requires different VBA syntax for calling a native function in a DLL, which is why we have a separate set of documents just for that version. You should not use those documents unless you want to test specifically in the 64-bit version of Excel; use the other set for everything else.

On Linux, you need to use a reasonably recent build from the master branch in order for the VBA macro to be able to call the native DLL function. If you decide to run them on Linux, make sure your build is recent enough to contain this commit.

Once again, huge thanks to AMD for allowing us to share these documents with everyone!

Slides for my talk at LibreOffice conference in Bern

I’d like to share the slides I used for my talk at LibreOffice Conference 2014 in Bern, Switzerland.

slides preview

During my talk, I hinted that the number of unit tests for Calc has dramatically increased during the 4.2 bug fix cycle alone. Since I did not have the opportunity to count the actual number of unit test cases to include in my slides, let me give you the numbers now.

          ucalc   filters   subsequent-filters   subsequent-export   total
4.1          65        10                   49                   9     133
4.2         107        13                   54                  15     189
master      176        15                   67                  34     292

unit-test-count

The numbers represent the number of top-level test functions in each test category. Since we sometimes add assertions to an existing test case rather than adding a new function when testing a new bug fix, these numbers are a somewhat conservative representation of how many test cases we’ve accumulated for Calc. Even then, it is clear from this data set that the number has spiked since the branch-off of the 4.2 stable branch.

Now, I’ll be the first to admit that the 4.2 releases were quite rough in terms of Calc due to the huge refactoring done in the cell storage structure. That said, I’m quite confident that as long as we diligently add tests for the fixes we do, we can recover from this sooner rather than later, and eventually come out stronger than ever before.

Update on border lines

Just a quick update to my last post on getting Calc’s border line situation sorted out.

As of the last post, the border lines were in pretty good shape as far as printing to paper goes, but they were still less than satisfactory when rendered on screen. Lines looked generally fatter than they should, and the dashed lines were unevenly positioned. I had some ideas that I wanted to try out in order to make the border lines look prettier on screen. So I went ahead and spent a few extra days to give that a try, and I’m happy to report that the effort paid off.

To recap, this is what the border lines looked like as of last Friday.

screen-calc-after

and this is what they look like now:

screen-calc-followup

The lines are skinnier, which in my opinion makes them look slicker, and the dashed lines are now evenly spaced and look much better.

The art of drawing border lines

I spent this past week investigating a collection of various problems surrounding how Calc draws cell borders. The problem is very hard to define and can become very subjective depending on whom you talk to. Having said that, if you have ever imported an Excel document that makes elaborate use of cell borders into Calc, you may well have seen that the borders were printed somewhat differently than you would have expected.

Here is an example. This is a very small test Excel document that I made that contains all cell border types that Excel supports. When you open this document in Excel and print it on paper, here is what you get.

excel-print

When you open this document in Calc and print it, you probably get something like this:

calc-print-before

You’ll immediately notice that some of the lines (hair, dashed and double lines, to be precise) are not printed at all! Not only that, the thin, medium and thick lines are a little skinnier than Excel’s, the dotted line is barely visible, the medium dashed line looks a lot different, and the rest of the dashed lines have all become solid lines.

Therefore, it was time for action.

Results

I’ll spare you the details, but the good news is that after spending a week in various parts of the code base, I’ve been able to fix most of the major issues. Here is what Calc prints now using the build from the latest master branch:

calc-print-after

There are still some minor discrepancies from Excel’s borders, such as the double line being a bit too thin and the dotted line not being as dense as Excel’s. But I consider this a step in the right direction. The dashed and medium dashed lines look much better to my eye, and the thicknesses of these lines are more comparable to Excel’s.

The dash-dot and dash-dot-dot lines still become solid lines since we don’t yet support those line types, but that can be worked on at a later time.

So, this is all good, right?

Not quite. One of the reasons why the cell borders became such a big issue was that we previously focused too much on getting them to display correctly on screen. Unfortunately, the resolution of a typical PC monitor is not high enough to accurately depict the content of your document, so what you see on screen is a pixelized approximation of the actual content. When printing to paper, on the other hand, the content gets depicted much more accurately, simply because you get a much higher resolution when printing.

I’ll give you a side-by-side comparison of how the content of the same document gets displayed in Excel (2010), Calc 4.2 (before my change), and Calc master (with my change) all at 100% zoom level.

First up is Excel:

screen-excel

The lines all look correct, unsurprisingly. One thing to note is that, on screen, Excel approximates a hairline with a very thin, densely dotted line to differentiate it from a thin line, both of which are one pixel high. But make no mistake; a hairline by definition is a solid line. This is just a trick Excel employs in order to make the hairline look thinner than its thin line counterpart.

Then comes Calc as of 4.2 (before my change):

screen-calc-before

The hairline became a finely-dashed line both on display and in internal representation. Aside from that, both dashed and medium dashed lines look a bit too far apart. Also, the double line looks very much single. In terms of the line thicknesses, however, they do look very much comparable to Excel’s. Let me also remind you that Excel’s dash-dot and dash-dot-dot lines currently become solid lines in Calc because we don’t support these line types yet.

Now here is what Calc displays after my change:

screen-calc-after

The hairline is a solid line, since we don’t use the same hairline trick that Excel uses. The dotted and dashed lines look much denser and, in my opinion, look better. The double line is now really double. The line thicknesses, however, are a bit off even though they are internally more comparable to Excel’s (as you saw in the printout above). This is due to the loss of precision during rasterization of the border lines, and for some reason they get fatter. We previously tried to “fix” this by making the lines thinner internally, but that was the wrong approach, since it also made the lines thinner when printed, which was not a good thing. So, for now, this is a compromise we’ll have to live with.

But is there really nothing we can do about this? Well, we could try to apply some correction to make the lines look thinner on screen, and on screen only. I have some ideas how we may be able to achieve that, and I might give that a try during my next visit.

That, and we should also support those missing dash-dot, and dash-dot-dot line types at some point.