mdds 0.7.1 released

Ok, I’m actually a bit late with this announcement, as 10 days have already passed since the actual release of 0.7.1. Anyhow, I hereby announce that version 0.7.1 of Multi-Dimensional Data Structure (mdds) is out, containing several critical bug fixes over the previous 0.7.0 release. You can download the source package from here:

http://multidimalgorithm.googlecode.com/files/mdds_0.7.1.tar.bz2

0.7.1 fixes several bugs in the set_empty() method of multi_type_vector. In the previous versions, the set_empty() method would fail to merge two adjacent empty blocks into a single block, which violated the basic requirement of multi_type_vector that it never allows two adjacent blocks of identical type. This caused other parts of multi_type_vector to fail as a result.
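
To illustrate the invariant, here is a minimal sketch of the kind of sequence affected, assuming the 0.7-era multi_type_vector API (the element_block_func trait and the block_size() accessor); with the merge bug, the adjacent empty blocks could remain separate:

#include <mdds/multi_type_vector.hpp>
#include <mdds/multi_type_vector_trait.hpp>
#include <cassert>

typedef mdds::multi_type_vector<mdds::mtv::element_block_func> mtv_type;

int main()
{
    mtv_type db(10);    // starts as one empty block of size 10.
    db.set(3, 1.2);     // splits it into empty / numeric / empty blocks.
    db.set_empty(3, 3); // empties the numeric block; the now-adjacent
                        // empty blocks must merge back into one.
    assert(db.block_size() == 1);
    return 0;
}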

There are no API-incompatible changes since version 0.7.0. If you make heavy use of multi_type_vector and are still on an older version, I highly recommend updating to 0.7.1.

Open Source Conference 2013 in Tokyo

I had the pleasure of visiting the Open Source Conference (OSC) 2013 in Tokyo, which took place at Meisei University on the outskirts of Tokyo. These OSCs are organized frequently throughout the year, hosted in various cities across Japan.

Background

Normally I don’t travel to Japan just to visit OSC, mainly because of the distance; living on the East Coast of the United States, flying to Japan is a big hassle, not to mention the cost. Despite this, I wanted to visit this particular OSC for two reasons.

  1. I had received an email from someone at the Japan OSS Promotion Forum telling me that I had been nominated for the 2013 OSS Contributor’s Award, and asking whether I could participate in the award ceremony, which would take place during the conference.
  2. The LibreOffice Japanese team had organized a separate track just for LibreOffice-related talks, and I wanted to meet face-to-face the people involved in our project in Japan in various capacities, and learn the latest on what’s going on in the Japanese community.

There was one difficulty, however. Because I had only one week to arrange the travel (I got the email just a week before the scheduled ceremony date), I could not guarantee my arrival until the very last minute. Luckily everything went smoothly, and I was able to book my flight and reserve my hotel despite the short notice.

Meisei University

Tama Monorail that takes you to the venue.

This is actually my second time at this event. My first visit was in 2010, when I was planning a trip to Tokyo for a different, work-related meeting. I learned that OSC Tokyo 2010 was scheduled only one day after that meeting was due to end, so I extended my stay in Tokyo by one more day to visit OSC. OSC 2010 was also held at Meisei University, so at least I didn’t have to research how to get to the conference venue this time.

The easiest way to get to the venue is to take the Keio Line from Shinjuku station to Takahata Fudou station, then transfer to the Tama Monorail and get off at Chu-O Daigaku / Meisei Daigaku station. The university is only a 5 to 10 minute walk from there.

Meisei University main campus.

Once on campus, there were signs all around pointing you to the building where the conference was held. Outside the venue, the campus was pretty quiet, and I didn’t see very many students.

Booths

No conference is complete without booths. Various projects set up booths to greet visitors, distribute fliers and CDs/DVDs, and inform them of what’s new in their projects. Volunteers from the LibreOffice Japanese team manned our booth throughout the conference. We distributed version 4.0 feature fliers, installer CDs, T-shirts, stickers and flags.

LibreOffice booth and volunteers from the Japanese team.

LibreOffice booth, Shinji Enoki (Enoki-san) answering questions.

Also present was the openSUSE project booth. Fuminobu Takeyama was single-handedly manning it when I dropped by on Friday. He is a volunteer in the openSUSE project who also maintains several packages for Japanese locales. We briefly talked about some issues with the Japanese input method in LibreOffice, and how some folks work around them by forcing the GTK VCL backend even when LibreOffice is launched in a KDE environment (the input method code in the GTK VCL backend is more reliable than that in the KDE VCL). He said he very much hopes to someday find time to look into the LibreOffice code to solve the various Japanese-related issues that are still outstanding in the latest release.

A bunch of Geekos piled up at the openSUSE booth. Sitting next to them is Japan’s Firefox mascot, Foku-Suke.

Fuminobu Takeyama looking friendly at the openSUSE booth.

OSS Contributor’s Award

Me receiving a Contributor’s Award

The ceremony for the OSS Contributor’s Awards was held on Friday evening. The OSS Contributor’s Awards are given to

“those who have created or managed an influential development project and to developers who have played an important role in a global project or those who have contributed to the promotion of related activities.” (quoted from this slide)

The candidates are nominated publicly, and the winners are selected by the Awards Committee. They select four winners and nine incentive award winners each year, and I was fortunate enough to have been selected as one of the four award winners this year.

My short talk after the ceremony, outlining my current activities etc.

The ceremony was held in a separate, moderately sized lecture room right next to the booth area, and was very well attended. Of the four winners, two of us were present to receive the awards: Tetsuo Handa and myself. We each gave a brief 10-minute talk afterward, outlining our current activities and our future plans.

Handa-san is a well-known Linux kernel hacker who leads the development of a kernel security module known as TOMOYO Linux. We briefly chatted after the ceremony, and he hinted that he may get a chance to hack on LibreOffice in the future (and I encouraged him!). So, let’s keep his name in the back of our minds, and hope we can see him in our project someday. ;-)

You can find two press articles on this here and here. The official announcement from the OSS Forum is here.

LibreOffice mini-Conference

I spent the second day of the conference mostly in the LibreOffice mini-Conference track. According to Naruhiko-san, this was our first ever track dedicated to LibreOffice held in Japan (and hopefully it won’t be the last). We were able to rent a pretty large lecture room for the whole day to host this mini-Conference. Despite its large size, the room was only moderately attended.

The first talk was by Miyoshi Ohmori, on the company-wide migration from OpenOffice.org to LibreOffice at NTT Comware. He shared the challenges he faced during the migration and the ways he solved them.

Miyoshi Ohmori talks about migration from OpenOffice.org to LibreOffice.

Next up was a talk by Shinji Enoki covering new features in LibreOffice 4.0. He covered all aspects of the new features in 4.0, from Firefox Personas support to Calc’s import filter performance improvements, and everything in between. His talk was followed by Naruhiko Ogasawara, who shared the experience of his trip to the 2nd LibreOffice Conference in Berlin: how he decided to join the LibreOffice community, submit a paper for the conference, and eventually travel there. During his talk, Ogasawara-san played the video message from Italo that was created specifically for the Japanese audience.

Shinji Enoki talks about what’s new in LibreOffice 4.0.

Naruhiko Ogasawara talks about his trip to the 2nd LibreOffice Conference in Berlin.

Italo’s video message with Japanese caption.

If you thought Enoki-san and Ogasawara-san looked familiar, it is because they came to the Berlin conference to co-present a talk on the topic of non-English locale communities. The slides for their talk at the Berlin conference can be found here. Enoki-san later traveled to Prague with me and the rest of the SUSE’ers to meet with Petr Mladek and learn more about the current QA activities (Petr couldn’t make it to Berlin due to illness). Anyway, back to the mini-Conf…

The last talk before the lunch break was by Masaki Tamakoshi. He presented an extension that adds AutoCAD-like functionality to Draw, making Draw easier and more familiar to use for former (or current) AutoCAD users. He also talked about how to convert AutoCAD’s proprietary dwg files so they can be loaded into Draw, and how to create playable animation files from Impress slides using external tools.

Masaki Tamakoshi talks about an extension that adds AutoCAD-like functionality to Draw.

Jun Meguro discusses how to make professional use of Draw.

After the lunch break, Jun Meguro kickstarted the afternoon session with his talk on making effective use of Draw to create professional posters. His organization – the City of Aizuwakamatsu – was in fact one of the first organizations in Japan to adopt OpenOffice.org on a large scale when such a move was still uncommon, and it instantly became the poster child of OpenOffice.org adoption. They later moved on to LibreOffice, and Meguro-san continues to contribute to the LibreOffice project as a member of the Japanese language team.

In his talk, he emphasized the usefulness of Draw – an application that may not have received the attention and praise it deserves – and showed how it can be used to create professional posters and fliers without purchasing expensive proprietary alternatives. He also hinted that, these days, they can send ODF documents to other local government offices without first converting them to MS Office or PDF formats. This was first revealed when he accidentally sent a native Draw document (odg) without converting it to PDF, and later received a phone call from the recipient to discuss the details of the drawing! Although this is an isolated incident, an anecdote like this suggests that the actual rate of ODF adoption may well be higher than we expected.

Masahisa Kamataki talks about cloud services that support ODF.

In the next talk, Masahisa Kamataki talked about combining FLOSS office suites such as LibreOffice with non-FLOSS but free-as-in-beer cloud services such as SkyDrive and Google Drive to reduce operating costs. He mentioned that all of this is made possible thanks to the international standard ODF, which many major cloud services also support these days. He also demonstrated the level of ODF compatibility between these cloud services.

Ikuya Awashiro conducts his presentation via Impress Remote.

Next up was Ikuya Awashiro, who talked about the specifics of the LibreOffice Japanese localization effort. As the coordinator of the Japanese translation of LibreOffice UI strings, he knows the ins and outs of LibreOffice translation, which he covered extensively in his talk. He also covered the detailed history of translation in this code base, dating back to the old OpenOffice.org days, and what he learned not to do in order to successfully coordinate the current community-based translation effort in our project.

I should also mention that, of all the presenters we had during this track (including myself), he was the only presenter who used the Impress Remote feature!

Makoto Takizawa talks about ODF interoperability between various ODF producers, with live demos using SkyDrive and Calligra.

Makoto Takizawa concluded the afternoon session with his ODF PlugFest talk which also happened to be the very last talk in the whole LibreOffice track.

He started off with the basics of ODF, including its standardization history, then went on to discuss various ODF-supporting applications and how each of them fares in interoperability tests. He noted that, although in theory the use of ODF ensures seamless interoperability between supporting applications, in reality there are still some nasty corner cases where different ODF producers interpret ODF differently.

Toward the end of his talk, he performed a live ODF spreadsheet scenario test using Calligra, Gnumeric, SkyDrive and LibreOffice, to test in real life the level of ODF conformance in these spreadsheet applications. In this particular scenario, Calligra, Gnumeric and SkyDrive actually scored higher than LibreOffice. He concluded his talk by pointing out the importance of the ODF user community assessing the conformance level of each ODF-supporting application, and actively giving feedback to the developer community to improve ODF interoperability between the supporting applications.

Lastly, while I was not officially on the list of speakers in this track, I managed to squeeze my talk into the lunch break to briefly cover various random development topics. Please refer to my earlier post to get a hold of the slide for my talk. Unfortunately I had to cut it short to give people enough time to eat lunch, but that sort of worked out since I didn’t have much time to prepare the talk to begin with! ;-)

All in all, I believe this was quite a successful LibreOffice track. We were able to see each other face-to-face, which is not easy to do given how geographically dispersed we are. That is true even for those inside Japan, and more so for me. It was unfortunate that Takeshi Abe couldn’t make it to this event. Perhaps we should plan another conference during OSC Okinawa so that we get to see him again.

LibreOffice Japanese language team photo. We all gathered from various parts of Japan (and one from even outside of it).

LibreOffice 4.0 launch party photo.

Conclusion

This was actually my very first time participating in OSC Japan as a speaker and mingling with so many people from various sectors of the Japanese market. I spoke with quite a lot of people in various capacities during the conference, and I was pleasantly surprised by the level of interest they have in LibreOffice. Various local governments are aggressively considering a switch to LibreOffice, with Aizuwakamatsu City and JA Fukuoka leading the way. Though the uptake of LibreOffice among Japanese corporations is still slow, Sumitomo Electric has recently announced its adoption of LibreOffice, so others who are still hesitant to switch may eventually follow suit. I also chatted with someone from a local school district who is working very hard to realize a district-wide adoption of LibreOffice, which suggests that people in the education sector also see value in adopting it.

On the other side of the fence, however, we have yet to attract a healthy dose of developers to LibreOffice from the Japanese developer community. My impression is that Japan has a sizable Linux kernel developer community, and in fact many of the participants at OSC Tokyo were kernel hackers. So, whatever reason they may have for not participating in LibreOffice development, it’s not for lack of talent and expertise; they are there, contributing to other projects. At the same time, I also saw lots of interest in hacking on LibreOffice from various people. So, the interest is there; what they need is a means and a justification to work on LibreOffice.

Chatting with Ogawa-san from Ashisuto, which provides paid support for LibreOffice, made it apparent that we are not very far from seeing companies emerge that are eager to find developers to work on LibreOffice. It is therefore my hope that, as LibreOffice adoption among users increases, the interest among support vendors in participating in LibreOffice development will grow proportionally. My own impressions from OSC Tokyo fill me with optimism in this regard.

mdds 0.7.0 released

I’m once again very happy to announce that version 0.7.0 of Multi-Dimensional Data Structure (mdds) has been released and is available at the link below:

http://multidimalgorithm.googlecode.com/files/mdds_0.7.0.tar.bz2

All changes that went into this version since 0.6.1 are related to multi_type_vector. The highlights of the changes are:

  1. setter methods (set, set_empty, insert, and insert_empty) now return an iterator that references the block where the values are set or inserted,
  2. each of the above-referenced methods now has a variant that takes a position hint iterator for faster insertion (see the sketch after this list), and
  3. several critical bug fixes.
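
Here is a minimal sketch of how the first two changes can be used together, assuming the 0.7-era multi_type_vector API (the element_block_func trait is the one shipped with this release):

#include <mdds/multi_type_vector.hpp>
#include <mdds/multi_type_vector_trait.hpp>
#include <cstddef>

typedef mdds::multi_type_vector<mdds::mtv::element_block_func> mtv_type;

int main()
{
    mtv_type db(1000); // one empty block of 1000 elements.

    // set() now returns an iterator referencing the block that received
    // the value...
    mtv_type::iterator pos = db.set(0, 1.1);

    // ...which can be passed back in as a position hint.  When assigning
    // to monotonically increasing positions, this avoids re-scanning the
    // block array from the top on every call.
    for (std::size_t i = 1; i < 1000; ++i)
        pos = db.set(pos, i, 1.1 * i);

    return 0;
}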

There are no API-incompatible changes since 0.6.1. If you currently use 0.6.1 and make use of multi_type_vector, you should upgrade to 0.7.0, as it contains several important bug fixes.

Slides for my talks at OSC Tokyo 2013

I’ve just uploaded the slides for the two short talks I did at Open Source Conference (OSC) Tokyo.

The first one is the brief talk I did after the OSS Contributor Award ceremony on Friday.

Lightning Talk

This is saved as a hybrid PDF; you can view it in your regular PDF viewer (such as Evince or Adobe Reader), or you can open it in Impress and edit it as a normal Impress document. Use this one in case you need it as a pure Impress document.

And the second one is for the talk I did during the LibreOffice mini-Conference on Saturday.

LibreOffice Kaihatsu Q&A

Like the first one, this one is also a hybrid PDF. The regular odp version is available here.

I will write more about OSC Tokyo and especially about the LibreOffice mini-Conference in a separate blog. Stay tuned.

Speedy Bugzilla search for developers

I stumbled across this today entirely by accident, and I felt I had to write about it. What I’m about to tell you may or may not help you, but if you are interested, read on.

First, a disclaimer: I mainly use freedesktop.org’s Bugzilla, and I will base my story on it. If you use another Bugzilla installation, this may or may not apply to you, since each site can add its own customizations.

Named tags in Bugzilla

First, I need to explain what named tags in Bugzilla are, and how they work.

You may not have noticed this, but in Bugzilla you can add your own custom tags to each bug report. To add a new tag to a bug report, scroll down to the bottom of the report to find the footer area, which shows up on pretty much every page in Bugzilla. It should look like this:

bugzilla-footer

If you haven’t added any named tags to a bug report before, you won’t see any pre-selectable named tags in the list box on the left. Just type a new tag in the tag box in the center, and click Commit to add it to the bug. If you are already in a bug report, the ID of that bug should be pre-filled in the bug number box (as in the screenshot). For now, I’ll add a new tag named "come back later" and click Commit.

bugzilla-tag-come-back-later

You’ll see a page like this when you commit. When you click the link labeled "come back later", you’ll get a list of all bugs that have that tag. Pretty easy, isn’t it?

Note that named tags are yours and yours only. They are visible only to you, and usable by nobody but you; they are entirely private. When you add a tag to a bug report, no notification is sent to whoever is watching that particular bug (unlike CC or whiteboard changes). Someone else may use the same tag in his or her account, and that won’t interfere with yours. But since named tags are associated with your account, you need to have an account, or else named tags won’t be available.

Bugzilla Add-on for Firefox

Now, most of us spend the majority of our time inside a web browser. If you are a developer, you may spend more time inside your editor than in your browser, but you’ll probably still spend a fair amount of time in the web browser. And if you are like me, you use Firefox as your browser of choice. If you don’t use Firefox, that’s fine. I’m all about choice. But what I’m about to tell you probably won’t apply to you.

Firefox has an add-on for Bugzilla that allows you to query Bugzilla directly from the search box at the top-right corner. Installing the extension is also very easy. When you visit your Bugzilla of choice (which in my case is the freedesktop one), click the icon on the left side of the search box, then click "Add Bugzilla" to add it to the list of your search engines.

bugzilla-add-to-firefox

You can add multiple Bugzillas to this list; just follow the same steps in each Bugzilla you wish to add. For instance, GNOME’s Bugzilla gives you an "Add GNOME Bugzilla" entry in the same menu.

Get to your bugs fast!

Once you’ve set all this up, getting to the list of bug reports associated with a named tag is very quick. All you have to do is select Bugzilla as your search engine, type tag:"named tag" in Firefox’s search box, and hit Enter; you’ll jump right to the list. If your tag includes a space, you’ll need to surround it with double quotes.

bugzilla-search-box-tag

Very easy, and very fast. It doesn’t get any better than that.

mdds 0.6.1 released

I’m once again very happy to announce that version 0.6.1 of Multi-Dimensional Data Structure (mdds) has been released and is available at the link below:

http://multidimalgorithm.googlecode.com/files/mdds_0.6.1.tar.bz2

This is purely a bug fix release, and contains no new functionality since 0.6.0.

This release fixes a bug in the iterator implementation of flat_segment_tree. Prior to this release, the iterator would treat the position immediately before the end position as the end position, which resulted in incorrectly skipping the last data position during iteration.
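
For context, here is a minimal sketch of the kind of iteration affected by the bug, assuming the 0.6-era flat_segment_tree API:

#include <mdds/flat_segment_tree.hpp>
#include <iostream>

typedef mdds::flat_segment_tree<long, int> fst_type;

int main()
{
    // Tree covering keys [0, 100) with a default value of 0.
    fst_type db(0, 100, 0);
    db.insert_front(10, 20, 5); // value 5 over the segment [10, 20).

    // Walk all leaf nodes; each entry is a (key, value) pair.  Prior to
    // 0.6.1, the last data position would incorrectly be skipped here.
    fst_type::const_iterator it = db.begin(), it_end = db.end();
    for (; it != it_end; ++it)
        std::cout << it->first << " -> " << it->second << std::endl;

    return 0;
}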

It also contains fixes for various build errors and compiler warnings.

Many thanks to David Tardon, Stephan Bergmann, Tomáš Chvátal, and Markus Mohrhard for having submitted patches since the release of 0.6.0.

Orcus integration into LibreOffice

Last week was SUSE Hack Week, during which we SUSE engineers were encouraged to be creative and work on whatever project we had been dying to work on.

Given this opportunity, I decided to try integrating my orcus library project into LibreOffice proper to see how much improvement we could make in the performance of loading spreadsheet documents.

I’ll leave the detailed description and goals of the orcus project for another blog post, but in short, orcus is an independent library designed to process spreadsheet documents, and it is also designed to be usable from any application that wishes to use it to load documents. It’s still a work in progress, and is not even of alpha quality. So, I intentionally don’t release orcus library packages on an official basis.

Integration work

The main difficulty in integrating orcus into LibreOffice proper was dealing with the very intricate loading process that LibreOffice uses for all existing filters. It first goes through an elaborate type detection process, which loads the content of the file into memory so that the type detection code can parse it. Once the correct type is determined, LibreOffice then instantiates the correct frame loader and starts the actual loading process. I’ve explained all of this in detail in this blog post of mine.

Orcus, on the other hand, only needs a file path, and it does the rest, pushing data to the callback functions provided by the client code as it parses the file. It was this difference in the overall loading process that made the integration of orcus into LibreOffice all the more challenging. And even though the hack week itself lasted only one week, I had spent months prior to it studying the type detection code and the other auxiliary code that makes up the whole file loading process, in order to come up with an elegant way to add a hook for orcus.

Long story short, I was able to come up with a way to hook in orcus such that LibreOffice relinquishes all of its file loading to the orcus library, and only handles callbacks. To make this work, I first packaged orcus into an installable rpm package using the openSUSE build service, installed that package locally, then added a --with-system-orcus configure option to allow LibreOffice to find the library. The entire change needed to add the hook is condensed into this commit.

Using CSV filter as an experiment

As an initial experiment, I replaced the current csv import filter with one from orcus, just to see how this overall process works. The results are very encouraging.

With a very large csv file I created via this python script:

#!/usr/bin/env python
# Python 2 script (note xrange): writes 65536 rows of 100 numeric
# columns each, followed by a literal "end" column (~50MB of CSV).
 
import sys
 
for i in xrange(0, 65536):
    for j in xrange(1, 101):
        val = i * 1.0 / j
        sys.stdout.write("%g,"%val)
    sys.stdout.write("end\n")

the current filter spends roughly 27 seconds loading this file, which is not too bad given the sheer size of the file (~50MB). The orcus filter, on the other hand, spends only 11 seconds loading the same file.

However, the orcus filter code path still skips a number of steps that would need to be performed if it were used in the production build, such as

  • drawing progress bar in the status bar area,
  • calculating row heights for rows that include multi-line cell contents, and
  • probably something else I’m forgetting to mention here.

Given that some of these can be quite expensive, the above numbers may not be fully comparable. Despite that, these initial numbers show great promise for the performance improvement that may result from using the orcus library.

Future work

First of all, we will not switch to the orcus csv filter anytime soon. Although I’d like to see that happen at some point in the future, there are still lots of missing pieces in the orcus csv filter that prevent us from using it in the production build. My plan with orcus is therefore limited to the addition of new filters; my immediate plan is to develop new XML import and export filters using orcus, and integrate them into LibreOffice. This should also provide a stepping stone for any additional filters that may come later, as well as for replacing some of the existing filters as the need arises.

That’s all for now. Thanks for reading!

mdds::multi_type_matrix performance consideration

In my previous post, I explained the basic concept of multi_type_vector – one of the two new data structures added to mdds in the 0.6.0 release. In this post, I’d like to explain a bit more about multi_type_matrix – the other new structure added in the aforementioned release. It is also important to note that the addition of multi_type_matrix deprecates mixed_type_matrix, which is now subject to removal in a future release.

Basics

In short, multi_type_matrix is a matrix data structure designed to allow storage of four different element types: numeric value (double), boolean value (bool), empty value, and string value. The string value type can be either std::string or one provided by the user. Internally, multi_type_matrix is just a wrapper around multi_type_vector, which does most of the hard work. All multi_type_matrix does is translate logical element positions in 2-dimensional space into one-dimensional positions and pass them on to the vector. Using multi_type_vector gives it many advantages over the previous matrix class mixed_type_matrix, both in terms of ease of use and performance.
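
As an illustration of that translation step, here is a hypothetical sketch (not the actual mdds code); the column-major layout is inferred from the way the array assignment examples later in this post wrap values into subsequent columns:

#include <cstddef>

// Map a (row, column) position to a linear position in the backing
// multi_type_vector, assuming column-major storage where each column
// occupies a contiguous run of row_size elements.
inline std::size_t to_linear(std::size_t row, std::size_t col, std::size_t row_size)
{
    return col * row_size + row;
}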

One benefit of using multi_type_vector as the backend storage is that we no longer have to differentiate between densely-populated and sparsely-populated matrix density types. With mixed_type_matrix, the user had to manually specify which backend type to use when creating an instance, and once created, it wasn’t possible to switch from one to the other without copying it wholesale. With multi_type_matrix, on the other hand, the user no longer has to specify a density type, since the new storage performs well at either density extreme.

Another benefit is the reduced storage cost and improved memory access latency, especially when accessing a sequence of element values at once. This is inherent in the use of multi_type_vector, which I explained in detail in my previous post. I will expand on the storage cost of multi_type_matrix in the next section.

Storage cost

The new multi_type_matrix structure generally provides better storage efficiency in most average cases. I’ll illustrate this using the two opposite extremes of the density spectrum.

First, let’s assume we have a 5-by-5 matrix that’s fully populated with numeric values. The following picture illustrates how the element values of such a numeric matrix are stored.

In mixed_type_matrix with its filled-storage backend, the element values are either 1) stored in heap-allocated element objects whose pointers are stored in a separate array (middle right), or 2) stored directly in a one-dimensional array (lower right). Matrices initialized with empty elements employ the first variant, whereas those initialized with zero elements employ the second. The rationale behind these two different storage schemes was the assumption that, in a matrix initialized with empty elements, most elements would likely remain empty throughout its lifetime, whereas a matrix initialized with zero elements would likely have numeric values assigned to most of its elements for subsequent computations.

Also, each element in mixed_type_matrix stores its type as an enum value. Let’s assume that the size of a pointer is 8 bytes (the world is moving toward 64-bit systems these days), that of a double is also 8 bytes, and that of an enum is 4 bytes. The total storage cost of a 5-by-5 matrix is then 8 x 25 + (8 + 4) x 25 = 500 bytes for an empty-initialized matrix, and (8 + 4) x 25 = 300 bytes for a zero-initialized matrix.

In contrast, multi_type_matrix (upper right) stores the same data in a single array of doubles, whose memory address is stored in a separate block array. This block array also stores the type of each block (an int) and its size (a size_t). Since we only have one numeric block, it stores just one int value, one size_t value, and one pointer value for the whole block. With that, the total storage cost of the 5-by-5 matrix is 8 x 25 + 4 + 8 + 8 = 220 bytes. Suffice it to say that this is less than half the storage cost of the empty-initialized mixed_type_matrix, and roughly 26% less than that of the zero-initialized one.
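
Here is a hypothetical sketch of that per-block bookkeeping (not the actual mdds layout; alignment padding is ignored, just as it is in the byte counts above):

#include <cstddef>

struct block
{
    int type;         // element type of the block: 4 bytes assumed.
    std::size_t size; // number of elements in the block: 8 bytes.
    void* data;       // pointer to the element data array: 8 bytes.
};

// Fully numeric 5-by-5 matrix: 25 doubles (8 x 25 = 200 bytes) plus one
// block header (4 + 8 + 8 = 20 bytes) = 220 bytes, as computed above.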

Now let’s take a look at the other end of the density spectrum. Say we have a very sparsely-populated 5-by-5 matrix where only the top-left and bottom-right elements are non-empty, as the following illustration shows:

In mixed_type_matrix with its sparse-storage backend (lower right), the element values are stored in heap-allocated element objects, which are in turn stored in nested balanced-binary trees. The space requirement of the sparse-storage backend varies depending on how the elements are spread out, but in this particular example it takes one 5-node tree, one 2-node tree, four single-node trees, and six element instances. Let’s assume that each node in each of these trees stores 3 pointers (to the left node, to the right node, and to the value), which makes up 24 bytes of storage per node. Multiplying that by the 11 nodes makes 24 x 11 = 264 bytes of storage. With each element instance requiring 12 bytes of storage, the total storage cost comes to 24 x 11 + 12 x 6 = 336 bytes.

In multi_type_matrix (upper right), the primary array stores three element blocks, each of which takes up 20 bytes of storage (one pointer, one size_t and one int). Combine that with one 2-element data array (16 bytes) and one 4-element data array (32 bytes), and the total storage comes to 20 x 3 + 8 x (2 + 4) = 108 bytes. This clearly shows that, even in this extremely sparse case, multi_type_matrix provides better storage efficiency than mixed_type_matrix.

I hope these two examples are evidence enough that multi_type_matrix provides reasonable storage efficiency for both densely and sparsely populated matrices. The fact that one storage scheme can handle either extreme also gives us more flexibility: even when a matrix object starts out sparsely populated and later becomes completely filled, there is no need to manually switch the storage structure, as was necessary with mixed_type_matrix.

Run-time performance

Better storage efficiency with multi_type_matrix over mixed_type_matrix is one thing, but what’s equally important is how well it performs at run time. Unfortunately, the actual run-time performance largely depends on how the container is used: it should provide good overall performance when used in ways that take advantage of its structure, but it may perform poorly when used incorrectly.

In this section, I will provide performance comparisons between multi_type_matrix and mixed_type_matrix in several different scenarios, along with the actual source code used to measure their performance. All performance comparisons are expressed as the total elapsed time in seconds required to perform each task. All elapsed times were measured in CPU time, and all benchmark code was compiled on openSUSE 12.1 64-bit using gcc 4.6.2 with the -Os compiler flag.

For the sake of brevity and consistency, the following typedef’s are used throughout the performance test code.

typedef mdds::mixed_type_matrix<std::string, bool>            mixed_mx_type;
typedef mdds::multi_type_matrix<mdds::mtm::std_string_trait>  multi_mx_type;

Instantiation

The first scenario is the instantiation of matrix objects. In this test, six instantiation scenarios are measured. In each scenario, a matrix object of 20000 rows by 8000 columns is instantiated, and the time it takes for the object to become fully instantiated is measured.

The first three scenarios instantiate a matrix object with all elements initialized to zero. The first scenario instantiates mixed_type_matrix with the filled storage backend.

mixed_mx_type mx(20000, 8000, mdds::matrix_density_filled_zero);

Internally, this allocates a one-dimensional array and fills it with zero element instances.

The second case is just like the first one, the only difference being that it uses the sparse storage backend.

mixed_mx_type mx(20000, 8000, mdds::matrix_density_sparse_zero);

With the sparse storage backend, all this does is allocate a single element instance to use as the zero value, and set the internal size to the specified size. No storage is allocated for any other elements at this point. Thus, instantiating a mixed_type_matrix with sparse storage is a fairly cheap, constant-time process.

The third scenario instantiates multi_type_matrix with all elements initialized to zero.

multi_mx_type mx(20000, 8000, 0.0);

This internally allocates one numeric block containing a one-dimensional array of length 20000 x 8000 = 160 million, and fills it with 0.0 values. This process is very similar to that of the first scenario except that, unlike the first one, the array stores only the element values, without the extra individual element types.

The next three scenarios instantiate a matrix object with all empty elements. Other than that, they are identical to the first three.

The first scenario is mixed_type_matrix with filled storage.

mixed_mx_type mx(20000, 8000, mdds::matrix_density_filled_empty);

Unlike its zero-element counterpart, this version allocates one empty element instance and a one-dimensional array filled with identical pointer values, all pointing to that single empty element instance.

The second one is mixed_type_matrix with sparse storage.

mixed_mx_type mx(20000, 8000, mdds::matrix_density_sparse_empty);

And the third one is multi_type_matrix initialized with all empty elements.

multi_mx_type mx(20000, 8000);

This too is very similar to the initialization with all zero elements, except that it creates one empty element block, which has no memory allocated for a data array. As such, this process is cheaper than its zero-element counterpart because there is no overhead of creating an extra data array.

Here are the results:

The most expensive one turns out to be the zero-initialized mixed_type_matrix, which allocates an array of 160 million zero element objects upon construction. What follows is a tie between the empty-initialized mixed_type_matrix and the zero-initialized multi_type_matrix; both allocate an array of 160 million primitive values (one of pointer values and one of double values). The sparse mixed_type_matrix variants are very cheap to instantiate since all they need to do is set their internal size, without any additional storage allocation. The empty multi_type_matrix is also cheap for the same reason. The last three types can be instantiated in constant time regardless of the logical size of the matrix.

Assigning values to elements

The next test is assigning numeric values to the elements inside a matrix. For the remainder of the tests, I will only measure the zero-initialized mixed_type_matrix, since the empty-initialized one is not optimized for being filled with a large number of non-empty elements.

We measure six different scenarios in this test. One is for mixed_type_matrix, and the rest are all for multi_type_matrix, as multi_type_matrix supports several different ways to assign values. In contrast, mixed_type_matrix only supports one way to assign values.

The first scenario involves assigning values to elements in mixed_type_matrix. Values are assigned individually inside nested for loops.

size_t row_size = 10000, col_size = 1000;
 
mixed_mx_type mx(row_size, col_size, mdds::matrix_density_filled_zero);
 
double val = 0.0;
for (size_t row = 0; row < row_size; ++row)
{
    for (size_t col = 0; col < col_size; ++col)
    {
        mx.set(row, col, val);
        val += 0.00001; // different value for each element
    }
}

The second scenario is almost identical to the first one, except that it’s multi_type_matrix initialized with empty elements.

size_t row_size = 10000, col_size = 1000;
 
multi_mx_type mx(row_size, col_size);
 
double val = 0.0;
for (size_t row = 0; row < row_size; ++row)
{
    for (size_t col = 0; col < col_size; ++col)
    {
        mx.set(row, col, val);
        val += 0.00001; // different value for each element
    }
}

Because the matrix is initialized with just one empty block and no data array allocated, the very first value assignment allocates a data array for just one element, and each subsequent assignment grows the data array by one element at a time. Each value assignment therefore runs the risk of the data array getting reallocated, since it internally relies on std::vector’s capacity growth policy, which in most STL implementations consists of doubling the capacity on every reallocation.

The third scenario is identical to the previous one. The only difference is that the matrix is initialized with zero elements.

size_t row_size = 10000, col_size = 1000;
 
multi_mx_type mx(row_size, col_size, 0.0);
 
double val = 0.0;
for (size_t row = 0; row < row_size; ++row)
{
    for (size_t col = 0; col < col_size; ++col)
    {
        mx.set(row, col, val);
        val += 0.00001; // different value for each element
    }
}

But this seemingly subtle difference makes a huge difference. Because the matrix is already initialized with a data array of the full matrix size, none of the subsequent assignments reallocate the array. This cuts the repetitive reallocation overhead significantly.

The next case involves multi_type_matrix initialized with empty elements. The values are first stored into an extra array, then the whole array gets assigned to the matrix in one call.

size_t row_size = 10000, col_size = 1000;
 
multi_mx_type mx(row_size, col_size);
 
// Prepare a value array first.
std::vector<double> vals;
vals.reserve(row_size*col_size);
double val = 0.0;
for (size_t row = 0; row < row_size; ++row)
{
    for (size_t col = 0; col < col_size; ++col)
    {
        vals.push_back(val);
        val += 0.00001;
    }
}
 
// Assign the whole element values in one step.
mx.set(0, 0, vals.begin(), vals.end());

An operation like this is something that mixed_type_matrix doesn’t support. What the set() method on the last line does is assign values to all elements in the matrix in one single call; it starts from the top-left (0,0) element position and keeps wrapping values into the subsequent columns until it reaches the last element in the last column.

Generally speaking, with multi_type_matrix, assigning a large number of values in this fashion is significantly faster than assigning them individually, and even with the overhead of the initial data array creation, it is normally faster than individual value assignments. In this test, we measure the time it takes to set values with and without the initial data array creation.

The last scenario is identical to the previous one; the only difference is that the initial element values are zero instead of empty.

size_t row_size = 10000, col_size = 1000;
 
multi_mx_type mx(row_size, col_size, 0.0);
 
// Prepare a value array first.
std::vector<double> vals;
vals.reserve(row_size*col_size);
double val = 0.0;
for (size_t row = 0; row < row_size; ++row)
{
    for (size_t col = 0; col < col_size; ++col)
    {
        vals.push_back(val);
        val += 0.00001;
    }
}
 
// Assign the whole element values in one step.
mx.set(0, 0, vals.begin(), vals.end());

The only significant thing this code does differently from the previous one is that it assigns values to an existing numeric data array, whereas the code in the previous scenario allocates a new array before assigning values. In practice, this difference should not matter much performance-wise.

Now, let’s take a look at the results.

The top orange bar is the only result from mixed_type_matrix, and the rest of the blue bars are from multi_type_matrix, using different assignment techniques.

The top three bars are the results of the individual value assignments inside loops (hence the label "loop"). The first thing that jumps out of this chart is that individually assigning values to an empty-initialized multi_type_matrix is prohibitively expensive, so such a feat should be attempted with extra caution (if you really have to do it). When the matrix is initialized with zero elements, however, it performs reasonably, though it’s still slightly slower than the mixed_type_matrix case.

The bottom four bars are the results of the array assignments to multi_type_matrix: one initialized with empty elements and one initialized with zero elements, each measured both with and without the initial data array creation. The difference between the two initialization cases is very minor and would be barely noticeable in real life.

The performance of an array assignment is roughly on par with that of mixed_type_matrix’s individual value assignment if you include the cost of the extra array creation. But if you take away that overhead, that is, if the data array is already present and doesn’t need to be created prior to the assignment, the array assignment becomes nearly 3 times faster than mixed_type_matrix’s individual value assignment.

Adding all numeric elements

The next benchmark test consists of fetching all numeric values from a matrix and adding them together. This requires accessing the stored elements inside the matrix after it has been fully populated.

With mixed_type_matrix, the following two ways of accessing element values are tested: 1) via individual get_numeric() calls, and 2) via const_iterator. With multi_type_matrix, the tested access methods are: 1) via individual get_numeric() calls, and 2) via the walk() method, which walks all element blocks sequentially and calls back a caller-provided function object for each element block.

In each of the above scenarios, two different element distributions are tested: one consisting of all numeric elements (homogeneous matrix), and one consisting of a mixture of numeric and empty elements (heterogeneous matrix). In the tests with heterogeneous matrices, one out of every three columns is left empty while the remaining columns are filled with numeric elements. The size of the matrix object is fixed at 10000 rows by 1000 columns in every scenario.

The first case involves populating a mixed_type_matrix instance with all numeric elements (homogeneous matrix), then reading all values to calculate their sum.

size_t row_size = 10000, col_size = 1000;
 
mixed_mx_type mx(row_size, col_size, mdds::matrix_density_filled_zero);
 
// Populate the matrix with all numeric values.
double val = 0.0;
for (size_t row = 0; row < row_size; ++row)
{
    for (size_t col = 0; col < col_size; ++col)
    {
        mx.set(row, col, val);
        val += 0.00001;
    }
}
 
// Sum all numeric values.
double sum = 0.0;
for (size_t row = 0; row < row_size; ++row)
    for (size_t col = 0; col < col_size; ++col)
        sum += mx.get_numeric(row, col);

The test only measures the second set of nested for loops, where the values are read and added. The first block, where the matrix is populated, is excluded from the measurement.

In the heterogeneous matrix variant, only the first block is different:

// Populate the matrix with numeric and empty values.
double val = 0.0;
for (size_t row = 0; row < row_size; ++row)
{
    for (size_t col = 0; col < col_size; ++col)
    {
        if ((col % 3) == 0)
        {
            mx.set_empty(row, col);
        }
        else
        {
            mx.set(row, col, val);
            val += 0.00001;
        }
    }
}

while the second block remains intact. Note that the get_numeric() method returns 0.0 when the element type is empty (this is true for both mixed_type_matrix and multi_type_matrix), so calling this method on empty elements has no effect on the total sum of all numeric values.

When measuring the performance of element access via iterator, the second block is replaced with the following code:

// Sum all numeric values via iterator.
double sum = 0.0;
mixed_mx_type::const_iterator it = mx.begin(), it_end = mx.end();
for (; it != it_end; ++it)
{
    if (it->m_type == mdds::element_numeric)
        sum += it->m_numeric;
}

Four separate tests are performed with multi_type_matrix. The first variant consists of a homogeneous matrix with all numeric values, where the element values are read and added via a manual loop.

size_t row_size = 10000, col_size = 1000;
 
multi_mx_type mx(row_size, col_size, 0.0);
 
// Populate the matrix with all numeric values.
double val = 0.0;
for (size_t row = 0; row < row_size; ++row)
{
    for (size_t col = 0; col < col_size; ++col)
    {
        mx.set(row, col, val);
        val += 0.00001;
    }
}
 
// Sum all numeric values.
double sum = 0.0;
for (size_t row = 0; row < row_size; ++row)
    for (size_t col = 0; col < col_size; ++col)
        sum += mx.get_numeric(row, col);

This code is identical to the very first scenario with mixed_type_matrix, the only difference being that it uses multi_type_matrix initialized with zero elements.

In the heterogeneous matrix variant, the first block is replaced with the following:

multi_mx_type mx(row_size, col_size); // initialize with empty elements.
double val = 0.0;
vector<double> vals;
vals.reserve(row_size);
for (size_t col = 0; col < col_size; ++col)
{
    if ((col % 3) == 0)
        // Leave this column empty.
        continue;
 
    vals.clear();
    for (size_t row = 0; row < row_size; ++row)
    {
        vals.push_back(val);
        val += 0.00001;
    }
 
    mx.set(0, col, vals.begin(), vals.end());
}

which essentially fills the matrix with numeric values except for every third column, which is left empty. It’s important to note that, because a heterogeneous multi_type_matrix instance consists of multiple element blocks, making every third column empty creates over 300 element blocks in a matrix of 1000 columns. This severely affects the performance of element block lookup, especially for elements that are not positioned in the first few blocks.

The walk() method was added to multi_type_matrix precisely to alleviate this sort of poor lookup performance in heavily partitioned matrices. It allows the caller to walk through all element blocks sequentially, thereby removing the need to restart the block lookup on every element access. The last tested scenario measures the performance of this walk() method by replacing the second block with:

sum_all_values func;
mx.walk(func);

where the sum_all_values function object is defined as:

class sum_all_values : public std::unary_function<multi_mx_type::element_block_node_type, void>
{
    double m_sum;
public:
    sum_all_values() : m_sum(0.0) {}
 
    void operator() (const multi_mx_type::element_block_node_type& blk)
    {
        if (!blk.data)
            // Skip the empty blocks.
            return;
 
        if (mdds::mtv::get_block_type(*blk.data) != mdds::mtv::element_type_numeric)
            // Block is not of numeric type.  Skip it.
            return;
 
        using mdds::mtv::numeric_element_block;
        // Access individual elements in this block, and add them up.
        numeric_element_block::const_iterator it = numeric_element_block::begin(*blk.data);
        numeric_element_block::const_iterator it_end = numeric_element_block::end(*blk.data);
        for (; it != it_end; ++it)
            m_sum += *it;
    }
 
    double get() const { return m_sum; }
};

Without further ado, here are the results:

It is somewhat surprising that mixed_type_matrix shows poorer performance with iterator access than with access via get_numeric(). There is no noticeable difference between the homogeneous and heterogeneous matrix scenarios with mixed_type_matrix, which makes sense given how mixed_type_matrix stores its element values.

On the multi_type_matrix front, element access via individual get_numeric() calls turns out to be very slow, which is expected. This poor performance is especially visible with the heterogeneous matrix consisting of over 300 element blocks. Access via the walk() method, on the other hand, shows much better performance, and is in fact the fastest of all tested scenarios. Access via walk() is faster with the heterogeneous matrix, which is likely because the empty element blocks are skipped, reducing the total number of element values to read.

Counting all numeric elements

In this test, we measure the time it takes to count the total number of numeric elements stored in a matrix. As with the previous test, we use both homogeneous and heterogeneous 10000-by-1000 matrix objects initialized in exactly the same manner. In this test, however, we don’t measure the individual element access performance of multi_type_matrix, since we all know by now that doing so would result in very poor performance.

With mixed_type_matrix, we measure counting both via individual element access and via iterators. I will not show the code that initializes the element values here, since it remains unchanged from the previous test. The code that does the counting is as follows:

// Count all numeric elements.
long count = 0;
for (size_t row = 0; row < row_size; ++row)
{
    for (size_t col = 0; col < col_size; ++col)
    {
        if (mx.get_type(row, col) == mdds::element_numeric)
            ++count;
    }
}

It is pretty straightforward and hopefully needs no explanation. Likewise, the code that does the counting via iterator is as follows:

// Count all numeric elements via iterator.
long count = 0;
mixed_mx_type::const_iterator it = mx.begin(), it_end = mx.end();
for (; it != it_end; ++it)
{
    if (it->m_type == mdds::element_numeric)
        ++count;
}

Again, pretty straightforward code.

Now, testing this scenario with multi_type_matrix is interesting because it can take advantage of multi_type_matrix’s block-based element storage. Because the elements are partitioned into multiple blocks, and each block stores its size separately from its data array, we can simply tally the sizes of all numeric element blocks to calculate the total count, without even touching the individual elements stored in the blocks. This algorithm scales with the number of element blocks, which is far fewer than the number of elements in most average use cases.

With that in mind, the code to count numeric elements becomes:

count_all_values func;
mx.walk(func);

where the count_all_values function object is defined as:

class count_all_values : public std::unary_function<multi_mx_type::element_block_node_type, void>
{
    long m_count;
public:
    count_all_values() : m_count(0) {}
    void operator() (const multi_mx_type::element_block_node_type& blk)
    {
        if (!blk.data)
            // Empty block.
            return;
 
        if (mdds::mtv::get_block_type(*blk.data) != mdds::mtv::element_type_numeric)
            // Block is not numeric.
            return;
 
        m_count += blk.size; // Just use the separate block size.
    }
 
    long get() const { return m_count; }
};

With mixed_type_matrix, you are forced to visit every element in order to count elements of a certain type, regardless of which type you are counting. That algorithm scales with the number of elements, a much worse proposition than scaling with the number of element blocks.

Now that the code has been presented, let’s move on to the results:

The performance of mixed_type_matrix, in both the manual loop and iterator cases, is comparable to that of the previous test. What’s remarkable is the performance of multi_type_matrix via its walk() method; the numbers are so small that they don’t even register in the chart! As I mentioned previously, the storage structure of multi_type_matrix turns the problem of counting elements into one of counting element blocks, thereby significantly reducing the scale factor with respect to the number of elements in most average use cases.

Initializing matrix with identical values

Here is another scenario where you can take advantage of multi_type_matrix over mixed_type_matrix. Say you want to instantiate a new matrix and assign 12.3 to all of its elements. With mixed_type_matrix, the only way to achieve that is to assign the value to each element in a loop after the matrix has been constructed. So you would write code like this:

size_t row_size = 10000, col_size = 2000;
mixed_mx_type mx(row_size, col_size, mdds::matrix_density_filled_zero);
 
for (size_t row = 0; row < row_size; ++row)
    for (size_t col = 0; col < col_size; ++col)
        mx.set(row, col, 12.3);

With multi_type_matrix, you can achieve the same result by simply passing an initial value to the constructor, and that value gets assigned to all its elements upon construction. So, instead of assigning it to every element individually, you can simply write:

multi_mx_type mx(row_size, col_size, 12.3);

Just for the sake of comparison, I’ll add two more cases for multi_type_matrix. The first one involves instantiating the matrix with a numeric block of zeros, then individually assigning the value to the elements afterward, like so:

multi_mx_type mx(row_size, col_size, 0.0);
 
for (size_t row = 0; row < row_size; ++row)
    for (size_t col = 0; col < col_size; ++col)
        mx.set(row, col, 12.3);

which is algorithmically similar to the mixed_type_matrix case.

Now, the second one involves instantiating the matrix with all empty elements, creating an array of the same element count initialized with the desired value, then assigning that array to the matrix in one go.

multi_mx_type mx(row_size, col_size);
 
vector<double> vals(row_size*col_size, 12.3);
mx.set(0, 0, vals.begin(), vals.end());

The results are:

The performance of assigning the initial value to individual elements is comparable between mixed_type_matrix and multi_type_matrix, though it is also the slowest of all the cases. Creating an array of initial values and assigning it to the matrix takes less than half the time of individual assignment, even with the overhead of creating the extra array upfront. Passing an initial value to the constructor is the fastest of all; it takes only roughly 1/8th of the time required for individual assignment, and 1/3rd of the array assignment.

Conclusion

I hope I have presented enough evidence to convince you that multi_type_matrix offers better overall performance than mixed_type_matrix in a wide variety of use cases. Its structure is much simpler than that of mixed_type_matrix in that it uses only one element storage backend, as opposed to three in mixed_type_matrix. This greatly improves not only the cost of maintenance but also the predictability of the container’s behavior from the user’s point of view. The fact that you don’t have to clone the matrix just to transfer it into another storage backend should also make this new matrix container a lot simpler to use.

Having said this, you should also be aware that, in order to take full advantage of multi_type_matrix and achieve good run-time performance, you need to do the following (a combined sketch follows this list):

  • limit single-value assignments and prefer value array assignment,
  • construct the matrix with a proper initial value, which determines the type of its initial element block and in turn affects the performance of subsequent value assignments, and
  • use the walk() method when iterating through all elements in the matrix.
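
Putting these tips together, a rough sketch of the preferred usage pattern (reusing the types and the function object from the earlier examples) looks like this:

// Construct with a numeric initial value, so the matrix starts out
// with a single numeric element block.
multi_mx_type mx(row_size, col_size, 0.0);

// Prefer a single bulk array assignment over many single-value set() calls.
std::vector<double> vals(row_size * col_size, 12.3);
mx.set(0, 0, vals.begin(), vals.end());

// Iterate via walk() instead of element-by-element access.
count_all_values func;
mx.walk(func);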

That’s all, ladies and gentlemen.

mdds::multi_type_vector explained

Share Button

As my previous post just mentioned, mdds 0.6.0 has finally been released, and it contains two new data structures: multi_type_vector and multi_type_matrix. I’d like to explain a little more about multi_type_vector in this post because, of all the data structures I’ve added to mdds over the course of the project’s life, I firmly believe this one deserves some explanation.

What motivated multi_type_vector

The initial idea for this structure came from a discussion I had with Michael Meeks over two years ago in Nuremberg, Germany. Back then, he was dumping his idea on me about how to optimize cell storage in LibreOffice Calc: instead of wrapping cell values in cell objects allocated on the heap and storing those objects in a column array, we would store the raw cell values directly in an array, without the cell object wrappers. This way, if you have a column filled with numbers from the top down, those values are guaranteed to be placed in a contiguous region of memory, and are more likely to be in the same memory page unless their total size exceeds the page size. By contrast, if you store cell values wrapped inside heap-allocated cell objects, those values are most likely scattered all around the memory space and located in many different memory pages.

Now, one of the most common operations that typical spreadsheet users do is to operate on numbers in cells. It could be summing up their totals, calculating their average, determining their minimum and maximum values and so on and so forth. To make these operations happen, the program first needs to fetch all the cell values before it can work on them.

Assume that these values are stored inside cell objects which are located in hundreds of memory pages. The mere action of fetching the cell values alone requires loading all of these memory pages, which causes the CPU to fetch them from main memory in order to access them. Worse, if some of those pages have been swapped out of physical memory into virtual memory, touching them causes page faults, which degrade performance even further since those pages must first be swapped back in from disk. In contrast, if the values are all located in a single memory page (or just a handful of pages instead of hundreds), the CPU needs to fetch a page only once or a few times, depending on the size of the data being fetched.

Moreover, most CPUs these days come equipped with caches that keep recently-fetched data in order to speed up subsequent access to it. Because of this, keeping all your data in the same page reduces the chance of the CPU having to fetch it from main memory (or, in the worst case, from virtual memory), which is much slower than fetching it from the caches.

Let’s visualize this idea for a moment. The current cell storage looks like this:

As you can see, cells are scattered in different pages. To access them all, you need to load all of these pages that contain the requested cell objects.

Compare that with the following illustration:

where all requested cell values are stored in a single array that’s located in a single page. I hope it’s obvious by now which one actually fetches data faster.

Calc currently employs the former storage model, and our hope is to make it more efficient both space- and time-wise by switching to the latter.

Applying this to the design

One difficulty with applying this concept to column storage is that a column in a typical spreadsheet application allows you to store values of different types. A column containing a bunch of test scores may have at the top a title cell storing the text “Score”. Likewise, those test scores may be followed by an empty cell, followed by a bunch of formula cells containing expressions that sum, average, or count the test scores. Since one array can only hold values of identical type, this requires us to use a separate array for each segment of identical cell type.

With that, the column storage structure becomes somewhat like this:

An empty cell segment doesn’t store any value array, but it does store its size which is necessary to calculate the logical position of the next non-empty element.

This is the basic design of the multi_type_vector structure. It stores values of each identical type in a single, secondary value array while the primary column array stores the memory locations of all secondary value arrays. It’s important to point out that, while I used the spreadsheet use case as an example to explain the basic idea of the structure, the structure itself can be used in other, much broader use cases, and is not specific to spreadsheet applications.
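
To make this concrete, here is a minimal sketch using the default multi_type_vector type (its exact declaration is covered later in this post) that models the test-score column described above, with a comment showing how I would expect the blocks to be laid out:

typedef mdds::multi_type_vector<mdds::mtv::element_block_func> mtv_type;

mtv_type col(7); // 7 logical elements; initially a single empty block.
col.set(0, std::string("Score")); // title cell at the top.
col.set(1, 89.0);                 // test scores...
col.set(2, 75.0);
col.set(3, 92.0);
// position 4 is left empty.
col.set(5, 85.5);

// The column is now internally partitioned into blocks of identical
// element type: [string:1][numeric:3][empty:1][numeric:1][empty:1]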

In the next section, I will talk about the challenges I faced while implementing this structure. But first, one terminology note: from now on I will use the term “element block” (or simply “block”) to refer to what I have called a “secondary value array” up to this point. I use this name in my implementation code too, so using it here makes it easier for me to explain things.

The challenges

The basic design of multi_type_vector is not that complicated, and was not very challenging to understand and implement. What was more challenging was handling cases where a value, or a series of values, is inserted over a block or blocks of different types. There are a variety of ways to insert new values into this container, and sometimes the new values overlap existing blocks of different types, or overlap part of an existing block of the same type and part of a block of a different type, and so on and so forth. Because the basic design of the container requires that the type of every element block differ from its neighbors’, some data insertions may force the container to re-organize its element block structure. This posed quite a challenge, since multi_type_vector supports the following methods of modification:

  • set a single value to overwrite an existing one if any (set() method, 2-parameter variant),
  • set a sequence of values to overwrite existing values if any (set() method, 3-parameter variant),
  • insert a sequence of values and shift those existing values that occur below the insertion position (insert() method),
  • set a segment of existing values empty (set_empty() method), and
  • insert a sequence of empty values and shift those existing values that occur below the insertion position (insert_empty() method),

and each of these scenarios requires a different strategy for element block re-organization. The non-overwriting insertion scenarios (insert() and insert_empty()) were somewhat easier to handle than the overwriting ones (set() and set_empty()), as the latter required more branching and significantly more code to cover all cases.
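
To give one concrete example of such an overlap, here is a hedged sketch of an overwriting insertion that spans a numeric block and an empty block, forcing the two to be merged into one larger numeric block (reusing the mtv_type typedef from the sketch above):

mtv_type db(6);  // starts as a single empty block of size 6.
db.set(0, 1.0);
db.set(1, 2.0);  // blocks are now [numeric:2][empty:4].

std::vector<double> vals(3, 5.0);
db.set(1, vals.begin(), vals.end()); // overwrite positions 1 through 3.

// Position 1 was numeric while positions 2-3 were empty; the container
// must re-organize its blocks, ending up with [numeric:4][empty:2].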

This challenge was further exacerbated by the additional requirement to support a “managed” element block that stores pointers to objects whose life cycle is managed by the block. I decided to add this for convenience, to allow transitioning the current cell storage model into the new storage model in several phases rather than in one big-bang change. During the transition phase, we will likely convert the number and string cells into raw value element blocks, while keeping more complex cell structures such as formula cells wrapped in their current form. This means that during the transition we will have element blocks storing pointers to heap-allocated formula cell objects scattered across the memory space. Eventually these formula objects will need to be stored in contiguous memory as well, but that will have to wait until after the transition phase.
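
To give a rough idea of what that transitional storage might look like, here is a purely hypothetical sketch using the managed_element_block template described later in this post; both element_type_formula and the formula_cell class are made-up names:

const mdds::mtv::element_t element_type_formula = mdds::mtv::element_type_user_start;

// A managed block stores formula_cell* pointers and deletes the
// pointed-to objects when the block itself is destroyed.
typedef mdds::mtv::managed_element_block<element_type_formula, formula_cell> formula_cell_block;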

Supported data types

Template containers are supposed to work with any custom types, and multi_type_vector is no exception. But unlike most standard template containers, which normally have one primary data type (plus perhaps a key type for associative containers), multi_type_vector allows storage of an unspecified number of data types.

By default, multi_type_vector supports the following data types: bool, short, unsigned short, int, unsigned int, long, unsigned long, double, and std::string. If these are all the types you need, you won’t have to do anything extra; just instantiate the template like so:

typedef mdds::multi_type_vector<mdds::mtv::element_block_func> mtv_type;

and start using it like so:

mtv_type data(10); // set initial size to 10.
 
// insert values.
data.set(0, 1.1);
data.set(1, true);
data.set(3, std::string("Foo"));
...

But if you need to store other types of data, you’ll need to do a little more work. Let’s say you have this class type:

class my_custom_foo
{
    ....
};

and you want to store instances of this class in multi_type_vector. I’ll skip the actual definition of this class, but let’s assume that the basics, such as the default and copy constructors and the equality operator, are all implemented and working properly.

First, you need to define a unique numeric ID for your custom type. Each element type must be associated with a numeric ID. The IDs for standard data types are defined as follows:

namespace mdds { namespace mtv {
 
typedef int element_t;
 
const element_t element_type_empty = -1;
 
const element_t element_type_numeric = 0;
const element_t element_type_string  = 1;
const element_t element_type_short   = 2;
const element_t element_type_ushort  = 3;
const element_t element_type_int     = 4;
const element_t element_type_uint    = 5;
const element_t element_type_long    = 6;
const element_t element_type_ulong   = 7;
const element_t element_type_boolean = 8;
 
}}

These are the values returned by the get_type() method. When you add a new data type, you’ll need to define a new ID for it. Here is how to do it the safe way:

const mdds::mtv::element_t element_my_custom_type = mdds::mtv::element_type_user_start;

The value of element_type_user_start defines the starting number for all custom type IDs; the IDs for the standard types all come before this value. If you only need one custom type ID, using that value as-is is sufficient. If you need another ID, just add 1 to it and use that for the second type. As long as each ID is unique, it doesn’t really matter what their actual values are.
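
For instance, a second custom type would simply take the next slot up (a hypothetical example):

const mdds::mtv::element_t element_my_custom_type2 = mdds::mtv::element_type_user_start + 1;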

Next, you need to choose a block type. There are three block types to choose from:

  • mdds::mtv::default_element_block
  • mdds::mtv::managed_element_block
  • mdds::mtv::noncopyable_managed_element_block

The last two are relevant only when you need a managing pointer element block to store heap objects, as discussed earlier. For now, let’s just use the default element block for your custom type.

typedef mdds::mtv::default_element_block<element_my_custom_type, my_custom_foo> my_custom_element_block;

With this in place, define all the necessary callback functions for this custom type via the following pre-processor macro:

MDDS_MTV_DEFINE_ELEMENT_CALLBACKS(my_custom_foo, element_my_custom_type, my_custom_foo(), my_custom_element_block)

Note that these callback functions are called from within multi_type_vector via unqualified calls, so it’s essential that they live in the same namespace as the custom data type, in order to satisfy C++’s argument-dependent lookup rules.
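
For example, if my_custom_foo lives in its own namespace, the macro invocation should sit inside that same namespace. Here is a sketch of the overall arrangement, with my_lib being a made-up namespace name:

namespace my_lib {

class my_custom_foo { /* ... */ };

const mdds::mtv::element_t element_my_custom_type = mdds::mtv::element_type_user_start;
typedef mdds::mtv::default_element_block<element_my_custom_type, my_custom_foo> my_custom_element_block;

// The callback functions this macro defines are found via ADL,
// so they must live in the same namespace as my_custom_foo.
MDDS_MTV_DEFINE_ELEMENT_CALLBACKS(my_custom_foo, element_my_custom_type, my_custom_foo(), my_custom_element_block)

}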

So far so good. The last step is to define a structure of element block functions. This is also boilerplate, and for a single custom type, you can define something like this:

struct my_element_block_func
{
    static mdds::mtv::base_element_block* create_new_block(
        mdds::mtv::element_t type, size_t init_size)
    {
        switch (type)
        {
            case element_my_custom_type:
                return my_custom_element_block::create_block(init_size);
            default:
                return mdds::mtv::element_block_func::create_new_block(type, init_size);
        }
    }
 
    static mdds::mtv::base_element_block* clone_block(const mdds::mtv::base_element_block& block)
    {
        switch (mdds::mtv::get_block_type(block))
        {
            case element_my_custom_type:
                return my_custom_element_block::clone_block(block);
            default:
                return mdds::mtv::element_block_func::clone_block(block);
        }
    }
 
    static void delete_block(mdds::mtv::base_element_block* p)
    {
        if (!p)
            return;
 
        switch (mdds::mtv::get_block_type(*p))
        {
            case element_my_custom_type:
                my_custom_element_block::delete_block(p);
            break;
            default:
                mdds::mtv::element_block_func::delete_block(p);
        }
    }
 
    static void resize_block(mdds::mtv::base_element_block& block, size_t new_size)
    {
        switch (mdds::mtv::get_block_type(block))
        {
            case element_my_custom_type:
                my_custom_element_block::resize_block(block, new_size);
            break;
            default:
                mdds::mtv::element_block_func::resize_block(block, new_size);
        }
    }
 
    static void print_block(const mdds::mtv::base_element_block& block)
    {
        switch (mdds::mtv::get_block_type(block))
        {
            case element_my_custom_type:
                my_custom_element_block::print_block(block);
            break;
            default:
                mdds::mtv::element_block_func::print_block(block);
        }
    }
 
    static void erase(mdds::mtv::base_element_block& block, size_t pos)
    {
        switch (mdds::mtv::get_block_type(block))
        {
            case element_my_custom_type:
                my_custom_element_block::erase_block(block, pos);
            break;
            default:
                mdds::mtv::element_block_func::erase(block, pos);
        }
    }
 
    static void erase(mdds::mtv::base_element_block& block, size_t pos, size_t size)
    {
        switch (mdds::mtv::get_block_type(block))
        {
            case element_my_custom_type:
                my_custom_element_block::erase_block(block, pos, size);
            break;
            default:
                mdds::mtv::element_block_func::erase(block, pos, size);
        }
    }
 
    static void append_values_from_block(
        mdds::mtv::base_element_block& dest, const mdds::mtv::base_element_block& src)
    {
        switch (mdds::mtv::get_block_type(dest))
        {
            case element_my_custom_type:
                my_custom_element_block::append_values_from_block(dest, src);
            break;
            default:
                mdds::mtv::element_block_func::append_values_from_block(dest, src);
        }
    }
 
    static void append_values_from_block(
        mdds::mtv::base_element_block& dest, const mdds::mtv::base_element_block& src,
        size_t begin_pos, size_t len)
    {
        switch (mdds::mtv::get_block_type(dest))
        {
            case element_my_custom_type:
                my_custom_element_block::append_values_from_block(dest, src, begin_pos, len);
            break;
            default:
                mdds::mtv::element_block_func::append_values_from_block(dest, src, begin_pos, len);
        }
    }
 
    static void assign_values_from_block(
        mdds::mtv::base_element_block& dest, const mdds::mtv::base_element_block& src,
        size_t begin_pos, size_t len)
    {
        switch (mdds::mtv::get_block_type(dest))
        {
            case element_my_custom_type:
                my_custom_element_block::assign_values_from_block(dest, src, begin_pos, len);
            break;
            default:
                mdds::mtv::element_block_func::assign_values_from_block(dest, src, begin_pos, len);
        }
    }
 
    static bool equal_block(
        const mdds::mtv::base_element_block& left, const mdds::mtv::base_element_block& right)
    {
        if (mdds::mtv::get_block_type(left) == element_my_custom_type)
        {
            if (mdds::mtv::get_block_type(right) != element_my_custom_type)
                return false;
 
            return my_custom_element_block::get(left) == my_custom_element_block::get(right);
        }
        else if (mdds::mtv::get_block_type(right) == element_my_custom_type)
            return false;
 
        return mdds::mtv::element_block_func::equal_block(left, right);
    }
 
    static void overwrite_values(mdds::mtv::base_element_block& block, size_t pos, size_t len)
    {
        switch (mdds::mtv::get_block_type(block))
        {
            case element_my_custom_type:
                // Do nothing.  The client code manages the life cycle of these cells.
            break;
            default:
                mdds::mtv::element_block_func::overwrite_values(block, pos, len);
        }
    }
};

This is quite a bit of code, I know. I should definitely work on making this simpler to use, with a lot less typing, in future versions of mdds. Anyway, with this in place, we can finally define the multi_type_vector type:

typedef mdds::multi_type_vector<my_element_block_func> my_mtv_type;

With all these bits in place, you can finally start using this container:

my_mtv_type data(10);
my_custom_foo foo;
data.set(0, foo);  // Insert a custom data element.
data.set(1, 12.3); // You can still use the standard data types.
...

That’s all I will say about custom data types for now. I hope this gives you a glimpse of how this container works in general.

Future work

Since this is the very first incarnation of multi_type_vector, I have no doubt it still has a lot of issues to be worked out. One immediate issue that comes to mind is the performance of element position lookup. Given the logical position of an element, the container first has to locate the element block that stores it, but this lookup always starts from the first element block. So, if you look up millions of element positions in a loop, the overall lookup can be quite slow, since each lookup starts over from the first block. Speeding up this operation is certainly a task to be worked on in the near future. Meanwhile, users of this container can resort to the iterators to iterate through the element blocks and their member elements.
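
As a sketch of that workaround, here is how one might sum all numeric values by iterating over the element blocks directly, avoiding the repeated position lookups. This assumes the block node exposes type, size and data members, and that the numeric_element_block type provides static begin()/end() accessors over its underlying array (mtv_type as defined earlier):

double sum_all_numeric(const mtv_type& db)
{
    double total = 0.0;
    mtv_type::const_iterator it = db.begin(), it_end = db.end();
    for (; it != it_end; ++it)
    {
        if (it->type != mdds::mtv::element_type_numeric)
            // Skip the empty and non-numeric blocks.
            continue;

        // Iterate through the values stored in this numeric block.
        mdds::mtv::numeric_element_block::const_iterator
            v = mdds::mtv::numeric_element_block::begin(*it->data),
            v_end = mdds::mtv::numeric_element_block::end(*it->data);
        for (; v != v_end; ++v)
            total += *v;
    }
    return total;
}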

Another issue is the verbosity of the element block function structure required for custom element blocks. This could be addressed by providing templatized structures parametrized on the number of custom data types. This one is probably easier to solve, and I should look into it soon.

That’s all for now, ladies and gentlemen!

mdds 0.6.0 released

Share Button

I’m once again very happy to announce that version 0.6.0 of Multi-Dimensional Data Structure (mdds) is released and is available at the link below:

http://multidimalgorithm.googlecode.com/files/mdds_0.6.0.tar.bz2

This release comes almost 9 months after the release of the last stable version 0.5.4, and contains:

  • MSVC project files to allow managing source files and compile test programs in Visual Studio on Windows,
  • improved performance of the size() method of mixed_type_matrix (patch from Markus Mohrhard), and
  • two new data structures: multi_type_vector and multi_type_matrix.

Also, starting with this release, mixed_type_matrix is deprecated, and is subject to removal in a future release.

Other than that one change to mixed_type_matrix, there are no changes to any of the other existing data structures. So, if you are still on 0.5.4, don’t use mixed_type_matrix, and don’t need the new data structures added in 0.6.0, there is no reason to upgrade to 0.6.0.