New sheet protection dialog

I’ve just finished designing a new dialog for Calc’s sheet protection functionality to allow optional sheet protection options. This was actually my first time designing a dialog from scratch instead of modifying an existing one, so I had to dig around and figure out how to add a dialog. It turns out that it is actually very simple once you know what to do. After several hours of creative designing process, I’ve come up with something I can show to people. So here it is:

sheet protection dialog screenshot

One thing to note: obviously this dialog is inspired by the similar functionality offered by Excel, and Excel provides many more options for sheet protection than just the two I’m showing here. The reason I only have two at the moment is because I’ve only implemented support for those two options in Calc core. When we support more options in the core, we can easily add them to the dialog.

This work is on-going in scsheetprotection02 CWS. Aside from the new dialog and sheet protection options, this CWS contains my other work on the binary Excel export encryption as well as sheet and document password interoperability between Excel and Calc. I’m trying to wrap this up, so hopefully I can come up with something that people can try out soon.

Excel workbook and worksheet protection

I didn’t mean to keep this a secret, but I’ve been working on two tasks since December to further enhance Calc’s interoperability with Excel. The first one is to support correct import of an Excel document that includes password-protected worksheets, and the second one is to support exporting of a Calc document into an encrypted Excel format with password protection.

Workbook and worksheet protection with password

Currently, when Calc imports an Excel document with password-protected sheet, it retains the protected state but discards the password. What I’m trying to do is to retain the password if the sheet is password-protected and also export the sheet password when that document is exported back into Excel.

The difficulty of retaining a sheet password from Excel all lies in finding out the correct password hashing function that Excel uses, then the rest will be easy. This is because when you type in your password to protect a worksheet, Excel doesn’t store the actual password text, but immediately hashes it and stores that hash value in the document instead of the original text (BTW, Calc does the same). So, if you don’t know the correct hashing algorithm that generates an identical hash value given the same password, all bets are off.

As I covered in my previous blog post, however, this has already been taken care of. On top of that, Excel uses the same hashing function for both worksheet and workbook protections, so we can take care of both types of protections with just this one hashing function mentioned in my previous blog.

There is one important thing to note here. This worksheet-level protection is primarily for preventing accidental editing of protected worksheet during run-time, and this still does not prevent the content of such “protected” worksheet from being modified by directly altering the content of the file itself. This is because the content of the protected worksheet is not encrypted inside the file, so this level of password “protection” can be easily circumvented.

Workbook-level protection with password

Now, Excel also offers a workbook-level protection with password that actually involves encryption of document’s content, and that’s the second task I’ve been working on. There are two instances where Excel encrypts the workbook content:

  1. when you protect workbook’s structure (you can either protect the workbook structure, windows or both for workbook-level protection) or
  2. when you set a password from the Save As dialog (click on Tools near the upper-right corner and select General Options).

Currently, Calc can open an encrypted Excel document, but cannot save it back with the content still encrypted. So, when you open such a document and save it, it loses its content encryption (though a warning dialog is displayed before it writes the content back to the file unencrypted). What I’m trying to do is to allow the user to save the content of an already encrypted document back encrypted, or save a new Calc document to an encrypted Excel document with password protection.

This work is still on-going, but I’m pretty confident that the hardest part is already over. What makes this task really hard is the fact that Excel doesn’t encrypt the entire file stream, but encrypts only the record contents and leaves the record headers unencrypted. Also, certain records, and even parts of certain records, are not encrypted. If the entire file stream was encrypted, it would be relatively simpler to put an encryption layer on top of the normal record stream. But because some parts of the stream are not encrypted, it is necessary to check every single record that we write to make sure that we are encrypting only the right parts of the stream, or else Excel will not load the document at all. But like I said, the worst is over. I can now correctly generate an encrypted Excel file with various content types: even those documents with form controls, charts, cell comments, autoshapes, and OLE objects.

One interesting trivia. The encryption algorithm that Excel uses to encrypt a document requires a password, but Excel allows document encryption without a password. So, what does Excel do if no password is given? It uses the default password VelvetSweatshop. :-)

Anyway, both of these tasks are on-going in the scsheetprotection01 CWS. The original IZ issues are i60305 and i84766. There is still some work ahead before I can get this CWS integrated, but I’m making progress. Thank you, ladies and gentlemen.

Calc Solver release (minor update)

Only 3 days after the last release, I’m releasing a minor update of the Calc Solver extension. This release includes the following new additions:

  • Russian translation (contributed from Konstantin Lavrov)
  • German translation (ported from the ooo-build builtin version)
  • Menu integration to make it appear in Tools – Solver… below the Goal Seek menu entry (also contributed from Konstantin Lavrov)

I am very grateful to the impressive (and fast!) work that Konstantin did on the Russian translation as well as the menu integration improvement. Thank you, Konstantin! :-)

Calc Solver release

I just released a new version of Calc Solver after more than a year since the last release. A lot of effort has gone into this release mostly to re-package it as a true UNO extension, and also to make it available for the Windows version of OO.o beginning with this release.

The UI has been localized for French and Japanese, thanks to Laurent Godard and Kazunari Hirano. Laurent also helped me on various UNO related issues, so I would really like to acknowledge his help. Thank you Laurent. :-) The system language should be automatically picked up and the appropriate translation texts should be displayed for English, French and Japanese. If this doesn’t work for you, please let me know.

The ride was quite bumpy, however, to get Solver to build on Windows. Since this was my first attempt to build anything non-trivial on Windows, I had to spend a few days (and nights) studying how the MSVC compiler works so that I could build a DLL. There was also an issue with multi-thread vs single-thread libraries, so I had to manually select the default libraries to be all single-threaded for the Solver as well as the lpsolve code. Not to mention I didn’t know how to set up a build environment since GNU make in cygwin didn’t work too reliably due to file path separator and the driver letter issues. In the end, I came up with a custom DOS batch script with everything hardcoded to semi-automate the build process, but that’s far from being elegant. I’m just wondering if there is any better way to set up a build environment on Windows… Question: what do Windows developers use these days to build C++ projects?

On the Linux side, the Solver extension installed and worked fine, but I’ve experienced a major problem with installing it in the Extension Manager UI. But as long as it’s installed from the command line using unopkg, it works fine. I still haven’t figured out why installing with the Extension Manager caused a problem but installing with unopkg didn’t.

Oh I almost forgot. If you use Go-oo version of OO.o (aka ooo-build) or any variant of it with my Solver already included, you don’t need to install this extension. It’s already there in Tools – Solver.

Anyway, enough talk. Enjoy! :-) And please report me any problems you may experience.

History of Calc Solver

I’ve been trying to avoid writing a piece like this simply because if I wrote one, there would be a lot of bitterness involved, and I don’t like to put a blame on anybody. But when I saw a lot of confusion over the state of my Calc Solver in Barcelona (not the least of which is Louis’ announcement on Calc Solver for 3.0 during his keynote speech, which was truly a bad surprise for me), as well as this statement by Stefan Taxhet (st) in the issue page outlining Sun’s intention to duplicate the entire work I did (for free), I felt like it was time for me to explain what actually happened surrounding my effort to write an Optimization Solver for Calc.

Background (Why I started this)

Back when the OO.o project just started, the Calc team was requesting for two features from the community. One was advanced statistical analysis add-on, and the other was linear and non-linear optimization solver. Back then, I had already submitted more than a few bug-fix patches most of which were accepted and integrated (aside from this 2 and half year old patch which would probably never be integrated and I had already lost my hopes for) and started feeling comfortable with Calc’s code base. As the next step, I wanted to do something bigger, like a full-fledged feature, to try to prove to those Sun guys that they can count on hobbyist hackers writing a significant piece of code.

There was already some effort on writing a Solver for Calc well before I started working on my own (back in 2004). One was written in OO.o Basic (initiated by Jim Thompson), and the other was written in Java as a UNO component (Arun Gaikwad). Unfortunately Jim could not continue his work due to unavailability of spare time because of his other responsibilities (here is his last email post to sc-dev mailing list). Arun’s work was promising, but he insisted on integrating his work under GPL, which was not possible at the time because OO.o was licensed under SISSL/LGPL. The consensus back then was to convince Arun to submit his work under LGPL as an external component, which would legally permit OO.o to include it without compromising the license under which OO.o was distributed. But the effort to convince Arun failed, and we were back to square one.

During that time, I had already started my own effort on writing a Solver. My intention was to write a Solver component that would allow a clean integration into OO.o by writing one from scratch. But since I didn’t have any idea on how to write a solver, my effort started from buying books (I bought three) and studying on the subject of operations research. I didn’t want to announce it at that time simply because I didn’t even know whether I could go anywhere with this, so I worked on this in my spare time, studying the books, writing and testing the algorithms in Python (with the intention of re-writing them in C++ later), and investigating UNO’s AWT dialog framework because I didn’t want the kludge of adding a Basic layer to my component just for the dialogs.

After months of hacking, finally, the breakthrough happened. Since I felt comfortable that I could go somewhere with this effort of mine, I made an announcement on the sc-dev mailing list. Here is my first announcement on the sc-dev mailing list, and the later update. After that, I was making constant status updates in the Solver issue page as well as the sc-dev mailing list so that the world (and more importantly the Hamburg team) knows that I was working on this, to avoid duplication of effort. Everything was going reasonably well.

Google Summer of Code (GSOC) 2005

That year was the first year of Google Summer of Code. I didn’t think much about that event because I was not a student, and I didn’t think I would be involved in any way. But to my surprise, Sun decided to post the Solver task as one of the student projects, without even contacting me, despite the fact that they must have known my effort since I diligently made my update known to the public, both on the mailing list as well as the issue page. And the description of the task sounded as if the task was never touched by anyone. I was confused. I thought to myself, they could have at least contacted me how much work was already done before submitting that task to Google as if it was never touched. Because of that, I was forced to upload my interim code to the issue page to make it clear that part of the work that was planned for GSOC was already finished.

The very next day I uploaded my code, I received privately an angry email from someone who wanted to work on this task, but felt unfair because I uploaded my code to the issue page the very next day of the GSOC announcement. He was telling me how he didn’t appreciate what I did because he had just wasted 2 hours of his time writing a proposal. I felt humiliated. It was not I who decided to register that task as GSOC task, and I was not even contacted! And I am the one getting this angry email even after I put so much effort writing that code I wrote all in my spare time? Who is being unfair now!?

So, I forwarded that email to Erwin Tenhumburg, Louis Suarez Potts and Niklas Nebel, expressing my dissatisfaction on how they decided to register this task with GSOC even though the work was already on-going and they should have at least contacted me on the status of the effort. Only Erwin replied to my email. After exchanging several emails between Erwin and myself, they decided to put me as co-mentor of this GSOC task, and put Niklas as the Sun contact in case I need help from Sun. This is how I ended up mentoring the project for GSOC 2005. My memory is vague, but I believe I was the only mentor who was not a Sun employee but a free-lance hobbyist contributor that year.

But mentoring a student, totally on my own time was not very easy. I did my best, but unfortunately the project was not successful. In the end, there was no single line of code contributed from the student. Again, I don’t mean to put a blame on anyone, and if I had to pick one, I would certainly pick myself to take the blame. But it simply didn’t work out.

Post GSOC and Joining ooo-build

After the GSOC, I went back to what I was doing before the GSOC, which was to continue to improve the simplex algorithm for linear programming. At that time, there was one avid user of my Solver from Belgium (Ludvic Ansiaux), who was testing it for every version I released. He helped me sort out lots of bugs in my optimizer algorithm and some UI issues, so I have to give him a huge thank you for what he did. To this day, I have never received any help on testing as big as his.

But my progress was slowing down, and I was starting to realize the difficulty of writing a good Solver component, especially the backend optimizer code. I was simply losing my motivation because of lack of progress. But I forced myself to go on, simply because I didn’t want all of what had done to be wasted.

Then around January of 2006, Michael Meeks contacted me, telling me that he was interested in integrating my Solver code into ooo-build – the build used by Linux distributors (such as Debian, Ubuntu, SuSE) to ease the pain of integrating new code and distro-specific code into OO.o because of the upstream’s lack of interest in integrating patches. I was simply delighted, and didn’t think twice about joining them and making my Solver officially a part of ooo-build. It was also when the Solver code was morphed into a full fledged module named ‘scsolver’, after Michael reorganized my source tree to fit the OO.o build system. My excitement was back once again, and I was very delighted to see my Solver distributed in the default OO.o build that comes with openSuSE 10.1.

That’s also when I got my first software engineer position at SlickEdit after years of non-IT work in environmental science field. Unfortunately, however, this also meant that more time was spent on my new job, not because the new job required more hours but because my previous job didn’t constrain me as much in terms of work hours (in other words, I had more free time then). But I managed to go on improving the linear optimizer. In the end, however, with Michael’s suggestion, I decided to ditch my own linear optimizer in favor of using lp_solve maintained by Kjell Eikland and Peter Notebaert, and focus my effort on writing non-linear optimizer instead. That was actually the right decision because lp_solve is simply a much better optimizer than the one I wrote myself.

At that time, other Linux distributors such as Debian and Ubuntu started including my Solver, suffice it to say that brought more excitement to me.

GSOC 2006

Jody Goldberg decided to register this Solver project as a GSOC task for the year 2006, and he and I signed up as the co-mentors of this task. But unfortunately it didn’t attract any students that year, and as a consequence, there was no single line of code contributed. There was someone else from the Novell side assigned to work on this, and he did work on the code awhile as far as I can tell. But in the end there was no code either.

Then CWS was created

In September of 2006, Michael told me that it was now time to upstream the Solver code. He created a CWS for this (called scsolver02), and I began working toward putting my code into CWS. This was not an easy task, however. For one thing, I had to make the Solver related code a configure-time option (with --enable-scsolver option), which meant that I needed to find a way to make building of lpsolve and scsolver modules optional. Now, for someone who was not very familiar with OO.o’s quite convoluted, unorthodox build system, this seemed like a huge mountain of task (and it was). But I managed to find a way by looking at other optional modules and trying to make sense of it and do something similar for the lpsolve and scsolver modules.

Besides the above configure issue, I also needed to find a way to make the Solver menu item in the Tools menu optional, which as we discovered was not as easy as it seemed. But I had to solve these two issues because, obviously, without them being solved, there would be no chance of integration. So I investigated further.

I believe there was a little confusion even among Sun engineers with regard to how to implement a conditional menu item. I initially asked Eike on the sc-dev mailing list, and he referred me to the framework-dev mailing list instead. So I asked the question there, which resulted in this thread. In short, the discussion started from how to conditionally add a menu item, to how to make my Solver component a pure UNO component to make it truly an optional feature. But there was one problem: I was using the internal string resource manager API to localize the strings, but there was no UNO equivalent of string resource manager (an effort only recently included). I wasn’t really sure in which direction the discussion was going, and I don’t really remember if there was any consensus as to what the next step would be (as far as scsolver02 cws integration is concerned). But the discussion ended, and I was somewhat left confused, unsure of what to do next.

The specification woes

In addition to the above task, Michael Meeks told me (perhaps against his belief) to write a specification which he said was sadly something I needed to write in order to get my code accepted upstream. I knew how much he was against this whole specification process. Although I was not in favor of the specification process myself, I was not that much against it at the time. That said, I had no doubt in my mind that writing a specification for this Solver feature would require a fair amount of time. In the end, I accepted the task, somewhat unfavorably, in order to get things moving.

Please note that I was doing all of this completely on my own time, and no one was paying me to do any of this. I had to make a lot of sacrifice with my precious family time to allocate some time to work on this, but I was still left with only 30 minutes up to an hour of time available each day. Suffice it to say that my progress was very very slow, and since there was no coding involved, my motivation was all time low. But I kept going, believing that someday all of this would end.

Then the specification template really brought me tons of agony. Apparently there was some sort of Basic macro embedded in the template itself, to control some list boxes present in the template. It was designed to run a custom Basic macro every time the value of a list box changes. However, there was some hard-coded file path in the Basic code that prevented any of this from running properly (someone told me later that the code would only run properly inside Hamburg), so every time I changed the value of a list box, I’d get a run-time error message. Not to mention I couldn’t author the dialog description part of my specification because of that. I found myself debugging the basic code, wondering why on earth I had to debug a code when I was supposed to write up a documentation for the feature! My level of frustration was rising rapidly.

After three days of struggling with the misbehaving basic code, I asked for help, on the spec-dev mailing list. Then I soon found out that this was not a new problem and others were experiencing it as well. Two days had passed since I posted my request for help and I was still not getting any reply, so I asked again if anybody was there to comment. Still no reply from Sun. My frustration was at its peak because I was really trying to get the spec completed so that I could integrate the CWS and go back to the fun of coding again. This spec writing (debugging?) non-sense dragged me on for weeks.

Then my patience had run out, and I finally exploded. At that time, Michael was also running a piece on how the spec process was broken, in a very humorous way. So I emailed him and dumped my frustration with this whole specification process. I feel sorry that he had to listen to all my rant, but he was kind enough to listen to all of what I had to say, then after some private email talk between him and someone over at Sun, he eventually initiated this infamous specification thread. I don’t want to go into the details of the discussion we had on that thread (it’s all public for those who care), but it’s probably sufficient to say that nothing really has changed, except for the fact that now we are allowed to write a specification on the wiki, instead of using the Writer template. What I was quietly hoping to see out of this discussion was that we could finally abolish this painful specification process, but that unfortunately didn’t happen. (In retrospect, I was probably a bit too naive to have expected that from Sun. I should have known better.)

So, I was back to having to write a specification. This time on the wiki, but the same process nonetheless. After that thread had settled, I already told Michael that, after finishing the integration of the CWS, I did not want to go through the pain of upstreaming again, and that I would like to continue my work in ooo-build only. But deep down, I even had little motivation left to continue at that point, and I was more or less thinking about what I was going to say as my last word to the project, though part of myself was still wanting to stay with the project (after all, I loved hacking the OO.o code). I was torn. For the record, I tried to write a specification even under that kind of mental state, but my fingers would refuse to move, and I found myself sitting in front of my computer not doing a thing.

After joining Novell and licensing change

Long story short, I joined Novell in March 2007. I had already decided to leave the project before that, but Novell decided to pick me up. When Novell asked me whether I would be willing to change the license of the Solver code to LGPL only, I simply agreed. The change in licensing made perfect sense since the entire code was owned by myself (~99%), with a small fraction contributed from Novell and Debian, under LGPL.

But that did not mean that I lost interest in contributing the code upstream. Quite the opposite. Since it was made clear back when we were having a licensing discussion with Arun that LGPL code would be perfectly acceptable for upstream integration, I didn’t even expect that changing the license to pure LGPL would make much difference. As I said in my last comment in the issue page, my hope was to bring this functionality to all users of OO.o, but under a slightly different, more liberal license to make it fair to all the work and support those Novell guys have given me to make all of this possible. I truly believed what we were asking was a small, small favor, and I believed that we deserve at least that much after all we had done to the project.

Then Louis Suarez Potts announced Sun’s intension to develop a Solver for 3.0. I was there, witnessing his announcement on Calc Solver for 3.0 first-hand. Again, since it was never communicated with me, I was not quite sure what that announcement meant, until I saw this comment from Stefan Taxhet in the issue page.

So what happens next?

I simply don’t know, but one thing is clear. If Sun insists on rewriting all the work I’ve already done just to ensure that they own all the code in OO.o, even though it is legally permissible to integrate my code under a pure LGPL license as an external component, then perhaps I need to re-think my relationship with the project. Because that would be a clear sign that the goal of this project is no longer to work with community developers (i.e. those who contribute code, not talk) and create a vibrant open-source project where contributors feel they are making a difference, but to take advantage of free labor to further the corporate goal of Sun Microsystems, by protecting vigorously Sun’s total ownership of the code base as well as the development process in their entirety.

But I’d love to be proven wrong. I’d love to be proven that Sun still are willing to work with us, to make OO.o truly a wonderful product as well as a project attractive to prospective code contributors. But there is nothing I, as a single insignificant mere mortal can do to influence the behemoth that is Sun. It’s impossible to make an even slightest change in how the project is run, even after countless hours of coding and more than 10,000 lines of code generation (which I received no compensation for and involved quite a lot of personal sacrifice). In the end, I made no difference at all. Sad, truly sad.

Current on-going work

Here is a rough summary of the current status of my pending OO.o work (besides the data pilot work I’ve been doing).

Patches

I usually prefer sending patches when the changes are narrowly focused, or otherwise too small to warrant creating a full CWS. The following are the patches I have sent upstream and still pending for integration.

  • i79808 – This patch suppresses displaying of data validation input message when the message string is empty. So far I have not received any response from upstream. The main motivation of this patch is to improve Excel compatibility, by emulating Excel’s behavior on data validation. Besides it makes no sense to display an empty input message box when there is no message to show (IMHO)!
  • i80448 – This patch allows Excel style hyperlink format to be used in addition to Calc’s own hyperlink format. It’s targeted for 2.4 inclusion (thanks Eike!).
  • i80981 – This patch considerably speeds up loading of a spreadsheet document (Calc or Excel) containing a large number of built-in SEARCH function instances. With my test document, the load time without this patch was roughly 40 seconds, but this patch reduces that down to a few seconds. It’s targeted for 2.4 inclusion (again, thanks to Eike).
  • i81154 – This patch renames the sheet name in hyperlink when importing an Excel document, the same way Calc renames sheet names when they contain a special character that Calc cannot handle. Without this patch, the sheet name within a hyperlink would be left unmodified, thus following that hyperlink would fail when the name of the destination sheet contains a special character. It’s targeted for 2.4 inclusion (hooray Daniel!).

I think these are all I can think of at the moment. There may have been more, but they are probably either already integrated, need more work, block on ODF file format change, or something along those lines.

CWS’es

When the required code change grows to a certain size, I typically create a CWS to make the upstreaming process less painful for the Hamburg folks since reviewing a patch is not an easy task when the size of the patch is big. These are the CWS’es that I have created that are still pending for integration.

  • celltrans02 (completed, pending QA) – Localize the keywords used in the built-in CELL and INFO functions for Hungarian locale. This is for Excel compatibility since Excel localizes said keywords for who knows why. The keywords are already localized for French locale, and that work is already integrated via celltrans01 CWS.
  • autofilter01 (work in progress) – This work is an attempt to add a new multi-selection filter type similar to Microsoft Excel 2007. The internal data structure change is complete, the ODF file format change has been proposed and approved, and the ODF filter code change has been completed so that the new filter is now loaded and saved properly. Work in adding a new autofilter drop-down box for this feature is still in progress.
  • scsheetprotection01 (work in progress) – This work will add sheet protection options so that the user can place some constraints when protecting a sheet. So far, loading the sheet protection options from and saving them to a binary Excel file is complete, and the code change to prohibit selection of protected and/or unprotected cells is complete. What remains to be done is to add a new dialog to allow specifying desired sheet protection options. This effort is primarily for Excel compatibility, but I’m sure some users will appreciate this. :-)

All of these listed efforts are for the most part inspired by our customer requests as well as requests from the community. One exception is the autofilter extension to add a multi-selection filter, since that one is my own pet peeve. And of course, all of this work will all be contributed to the upstream branch under JCA.

Drilling down data in Data Pilot

Ok. Here is the feature I’ve been working on while I was in Barcelona, and I think it’s got to a point where I can show some progress to the world.

This feature is about supporting a drill-down of a data field cell in Calc’s Data Pilot feature. Excel already has an equivalent feature in its Pivot Table functionality, and what I’m doing initially came out of a request from current and former Excel users who rely on that feature.

Here is what this feature does. Let’s say you have a data pilot box somewhere in your Calc document like the one below.

data pilot view

You then decide to double-click on one cell within the data field to see the original data rows that comprise that cell. Calc then inserts a new sheet, containing the appropriate subset of the original data table (see below).

drill down data on new sheet

At this point, it works pretty reliably for me, but that’s not to say that it’s ready to be integrated into the main branch. There are still a few things to take care of.

  • The drill down sheet is not decorated with auto formatting. Excel formats the drill down sheet nicely, but currently it’s not implemented. But I have to wonder whether that is really essential (I think not).
  • Currently this feature works only on internal data source i.e. data source within the same document. But Calc’s Data Pilot supports two other data source backends (registered database and data pilot UNO component), and I need to add support for them too.

Once that is done, I’ll be happy to see this piece integrated upstream (under JCA, of course), since this feature is also requested in Issue 57030. While I’m in the data pilot code, it may be a good idea to review other data pilot related issues (if there is any) and tackle them at the same time. If you know of any hot data pilot related issues, please let me know.

Importing Excel 2007 files

The Excel 2007 filter for Calc is still on its way, but perhaps now is a good time to show the progress of this new Excel 2007 import filter work by Daniel Rentz and myself.

Here is a screenshot of a file created by Excel 2007 (left), and one for the same file opened in Calc (right).

Excel 2007 screenshot Excel 2007 file opened in Calc

There are still a lot to be done, however. Formula import is still to be completed, which blocks other features that rely on the formula parser. Charts, text boxes, and other graphic objects are still not imported yet. There is also a performance issue of a large xlsx file import, which needs to be addressed at some point.

But all in all, things are coming along very nicely.

Extending Calc’s autofilter

As I work on implementing the OOXML import filter for Calc, I notice quite a few features that are in Excel 2007 but are not in Calc. While not all of them deserve special attention, one particular feature has caught my eye, which is the ability to filter rows based on a set of multiple string values instead of just one.

To see the benefit of this feature, let’s take a look at the current autofilter implementation in Calc as of OO.o 2.1.

Current autofilter implementation (2.1)

As you can see, if you want to filter by the cell content, you can only specify one value. If you want to specify “show either Bruce or David”, you will have to use the Standard filter and use this regular expression ^(Bruce|David)$ to accomplish the effect. Alternatively, if the filtering criteria involves only one column field, you could use two filter conditions, each one specifying textual equality to one text value, and connect them with OR, but this still will not work if more than one fields are involved because Calc doesn’t allow nested AND/OR’s between filter conditions.

Excel 2007 does this quite nicely. It allows a user to specify multiple text values in a single filtering criteria by presenting a list of check boxes like this:

Multi-string autofilter in Excel 2007

This allows a user to quickly filter his/her data, without resorting to something more complex, like regular expressions.

I believe our OO.o users will benefit enormously if we implement something similar in Calc, and I’ve already started some work toward implementing this. But to implement this feature in Calc requires a change in the ODF file format specification. The ODF spec, as of version 1.1, does not allow a clean storage of multiple text values in a single filter condition. An effort is on-going, however, to change the ODF spec in order to accommodate this feature, so there is hope.