Hacking Calc – The First Step

I’m happy to announce that the article I submitted to the OpenOffice.org Developer Article Contest has been selected as June’s winner! You can read my article here. Though this article attempts to cover the hard topic of hacking the core part of OO.o’s code base, I tried to keep it somewhat of an easy read.

I hope this article will help lower the initial barrier to entry, because hacking OO.o, though it admittedly comes with a steep learning curve, is quite fun once you set your foot in the door. So, even if you are a little intimidated by the sheer size of the code base, come and try with your hand a little. You might actually like it. :-)

Interested in writing your own article for the contest? Cool. Go read the guideline and submit your own!

Recent activities

So, I have several things going on at the moment:

  • I have just registered for my final exam for the class I’ve been taking since December. This class has certainly been an eye-opener, especially for someone like myself who’d been coding in high-level languages with only a vague understanding of low level architectural stuff.
  • I’ve started looking at lp_solve for possible integration into Calc Solver, and solicited for some interests. This Solver task, admittedly, has become much, much, more than what one person can handle. So, strategic change is necessary to keep things moving. I’ve also updated the wiki page to add more content.
  • Created a new wiki page for Statistical Data Analysis Tool project, to hopefully mobilize the near-death-experienced-yet-still-important project for Calc. There is no content on this page yet as it is still pretty much a skeleton. I wonder if this project will ever gain enough momentum to get it going…

Current status

First off, my coding activity on solver is progressing very slow these days. Mostly because, besides my day job as a software engineer, I’ve been taking a class at a nearby college since January, and its class work (homework and programming assignments etc.) is taking up the bulk of my free time. This class finishes in May, so hopefully I will have more time to spend for the solver project once this is all done.

Also, for those who haven’t noticed, there is now a Windows binary available thanks to Kami’s contribution, so to those of you who have been waiting for a Windows version, now is your chance. :-) It’s available at the usual place.

Ok, so…, here is the update on my solver. I’m pretty much done with my revised simplex code, and now I’m shifting my focus to improving the (upper-and-lower) bounded revised simplex code. After a couple of test runs I’ve realized that the bounded simplex’s initial solution search is still pretty weak, so, that code needs to be improved first before anything else.

As an aside, here is an interesting page on Calc’s performance optimization bits by Michael and his gang.

New LU factorization in matrix inverse algorithm

One of the things I wanted to do as a remedy to my simplex algorithm (refer to my last blog entry) was to switch to another algorithm for LU factorization which is used in my matrix inverse algorithm. The original LU factorization algorithm obviously was not robust enough to inverse an initial matrix generated in solving Markus’ model. This time, instead of blindly looking for a new algorithm to use all over the net, I decided to dive into Calc’s code to see what algorithm Calc uses internally for matrix inversion since Calc was able to inverse the very matrix my own algorithm was not.

Of course, I can’t just copy and paste Calc’s matrix inverse code, but luckily there was a reference within the code from which it was derived, and I just happened to have a copy of that very same reference which one of my friends was kind enough to lend to me a few weeks ago. Lucky me!

Anyway, now my solver can at least solves Markus’ model, though my solution differs from the Excel Solver’s solution that he included because of simplex’s positive variable limitation (his Excel solution included negative numbers). Adding support for negative numbers shouldn’t be too complex, but requires a bit more work.

The new snapshot (revision 101) is up which includes this new LU factorization algorithm.

Studying…

I have been busy studying data structure for the past week to partly refresh and partly deepen my understanding of the subject. The obvious side-effect of this is that I haven’t been able to spend much time coding. But hopefully this study effort will pay off.

Also, I’m firmly convinced that I need to spend significant amount of time studying on matrix computation essentials in order to be able to develop reasonably-fast matrix calculation algorithms (I got the book by the way). The algorithms I use for linear programming relies heavily on matrix operations, so this effort is absolutely important. And yes, I am aware that there are libraries out there that already do fast matrix computations, but using them without having some idea of what they may be doing internally bothers me quite a bit.

Meanwhile, my patch for fixing Calc’s erf/erfc functions (i55735) is proposed for 2.0.2. Jody helped me on minor performance improvement.