About Kohei Yoshida

LibreOffice hacker, spreadsheet nerd.

New document status image in the status bar

I’ve just checked in the new icon set for the document status indicator from Paulo José. Here is a side-by-side screenshot of what the new icons look like.

statusbar-new-status-icon

The above is what it looks like when the document is unmodified. It’s intentionally a bit faded, with a translucency effect. The one below is when the document is modified. The new images look very refined and are more in line with the application icon that we use for LibreOffice. Good work Paulo! :-)

Now, he has created another icon to show immediately after the document is saved, before it becomes the faded icon again after a few seconds. But that effect has yet to be implemented. If you are interested in taking on this task, drop us a note. It’s listed on the Easy Hacks page.

And let’s not forget to say that 3.4 will have these two brand-new icons.

mdds 0.5.2 released

I’m happy to announce that version 0.5.2 of Multi-Dimensional Data Structure (mdds) is available for download from the link below.

http://multidimalgorithm.googlecode.com/files/mdds_0.5.2.tar.bz2

This is a bug fix release. I would like to thank David Tardon for fixing several important bugs as well as implementing some new APIs for flat_segment_tree. In fact, the majority of changes between 0.5.1 and 0.5.2 are from David.

Here is the run-down of the major changes since 0.5.1:

  • flat_segment_tree
    • fixed a crash on assignment by properly implementing the
      assignment operator (operator=()).
    • fixed several bugs in shift_right():
      • shifting of all existing nodes was not handled properly.
      • leaf nodes were not properly linked under certain conditions.
      • shifting with skip node option was not properly skipping the
        node at insertion position when the insertion position was at
        the leftmost node.
    • implemented min_key(), max_key(), default_value(), clear() and
      swap().
    • fixed a bug in operator==() where two different containers were
      incorrectly evaluated to be equal.
    • added quickcheck test code.

There are no API-incompatible changes since 0.5.1, so if you are currently using mdds 0.5.1, your code should compile with 0.5.2 without any modifications.


Named range as data source in DataPilot table

I hinted in my previous post that you can now use a named range as the data source of a DataPilot table, but at that point you couldn’t create a new DataPilot table with a named range as the source.

Well, now you can.

I tried to come up with a clever way to add this functionality, but ended up with just another radio button in the existing source selection dialog (the dialog that pops up when you select Data – DataPilot – Start without an existing DataPilot table).

Here is a screenshot of the new dialog as evidence:
calc-dp-named-range-source

This functionality is currently available on the master branch of LibreOffice. For those of you who can build LibreOffice directly from the repository, go check it out!

For those of you who would rather wait for a released version, this will be available in 3.4 – the next minor release. Refer to this page for a more detailed release plan of the upcoming versions of LibreOffice.

Extracting a sub project into a new repository (and how mso-dumper got its new home).

Background

Just a short while ago I worked on extracting our mso-dumper project from LibreOffice’s build repository, into a brand new repository created just for this. The new repository was to be located in libreoffice/contrib/mso-dumper.

Originally, this project started out just as a simple sub directory of a much larger parent repository. But because it grew so much, and because its scope is not entirely in line with that of the parent repository, I decided it was best to move this project into a repository of its own. Now, it’s easy to transfer a subset of files from one repository to another if you don’t mind losing its history, but I wanted to preserve the history of those files even after the transition.

It turns out that there is a way to do this with git. Kendy suggested that I look into git filter-branch, so I did. After a few hours of research and trial and error (and some bash script writing which was later thrown away), I’ve come to realize that all of this can be achieved in the following simple steps.

Steps

First, clone the whole build repository which contains the sub project to be extracted:

git clone path/to/libo/build mso-dumper-temp

Once done, cd into that cloned repository, and run

git filter-branch --subdirectory-filter scratch/mso-dumper/ -- --all

which will remove all files from the git history except for those under the scratch/mso-dumper directory, and re-locate those files under that directory into the top-level directory. You may also want to run

git remote rm origin

to prevent accidental pushing of this to the remote origin during these steps. Anyway, once the filtering is done, remove all tags by

git tag | xargs git tag -d

And that’s all. Now, you have only the files you want to keep, they are sitting happily at the top level like they should, all of their commit records are preserved, and you don’t have any old tags you don’t need for the new repository.

This is not over yet. At this point, this git repo still stores the objects of the removed files. In fact, the size of the .git directory of this new repo was more than twice the size of the .git directory of the original build repo! To completely prune this unnecessary info in order to shrink the size of the repository, run

git clone file:///path/to/mso-dumper-temp mso-dumper

to further clone this into another repo locally and strip all the unnecessary blobs. Note that I used the file:///… style file path, as opposed to the usual /path/to/foo style file path. When using the file:///… style path to clone a local repo, git will not clone the objects of the removed files, thereby reducing the size of the objects significantly (and the clone is faster too). With the regular /path/to/foo style path, git will hard-link all the object files, so the size will stay the same.

After the second cloning, the size of my .git directory shrank from 280 MB to 384 KB! So it does make a big difference. Now all that’s left to do is to push this repository to the new remote location. Easy huh? :-)
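The steps above can be tried end to end on a throwaway repository before touching the real one. The following is only a sketch under stated assumptions: the temporary directory, the file names (dump.txt, notes.txt), the tag name and the commit messages are all made up for the demo, and only the scratch/mso-dumper path and the git commands themselves come from the steps described above. FILTER_BRANCH_SQUELCH_WARNING is set merely to silence the warning that newer git versions print for filter-branch.

```shell
# Sketch only: every path and file name below is made up for the demo.
set -eu

# Fixed identity so the demo commits work without any git configuration.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
# Newer git versions print a long filter-branch warning; this silences it.
export FILTER_BRANCH_SQUELCH_WARNING=1

work=$(mktemp -d)

# A tiny stand-in for the parent "build" repository.
git init -q "$work/build"
cd "$work/build"
mkdir -p scratch/mso-dumper other
echo "dumper v1" > scratch/mso-dumper/dump.txt
echo "unrelated" > other/notes.txt
git add .
git commit -q -m "initial import"
echo "more" >> other/notes.txt
git commit -q -a -m "unrelated change"
echo "dumper v2" > scratch/mso-dumper/dump.txt
git commit -q -a -m "update dumper"
git tag old-tag

# Clone, detach from origin, filter down to the sub project, drop the tags.
git clone -q "$work/build" "$work/mso-dumper-temp"
cd "$work/mso-dumper-temp"
git remote rm origin
git filter-branch --subdirectory-filter scratch/mso-dumper/ -- --all >/dev/null
git tag | xargs git tag -d >/dev/null

# The second clone through file:// leaves the unreachable objects behind.
git clone -q "file://$work/mso-dumper-temp" "$work/mso-dumper"

commits=$(git -C "$work/mso-dumper" rev-list --count HEAD)
echo "commits kept: $commits"
ls "$work/mso-dumper"
```

In this demo the final clone should contain only dump.txt at the top level, along with the commits that touched scratch/mso-dumper; the commit that only changed the unrelated directory is dropped from the history.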

But there was a gotcha….

There was one caveat, however. This method apparently does not preserve the whole history of the relocated files if the parent sub-directory had been renamed. The mso-dumper directory was renamed from its original name sc-xlsutil in order to accommodate the ppt dumper that Thorsten wrote. Unfortunately git filter-branch --subdirectory-filter did not preserve the history before the directory rename occurred, but that was just a minor issue, and something I was not too concerned about for this particular transition.
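One way to see ahead of time how much history would be lost is to compare the number of commits that touch the current directory name with the number that touch either the current or the old name. The sketch below does this on a throwaway repository; the file name and commit messages are hypothetical, and only the two directory names are taken from the situation described above. It is a rough check rather than a fix, since it does nothing to recover the older commits.

```shell
# Sketch only: builds a throwaway repo that mimics the directory rename.
set -eu

# Fixed identity so the demo commits work without any git configuration.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

mkdir sc-xlsutil
echo "v1" > sc-xlsutil/dump.txt
git add .
git commit -q -m "add dumper under the old name"

git mv sc-xlsutil mso-dumper
git commit -q -m "rename directory"

echo "v2" > mso-dumper/dump.txt
git commit -q -a -m "change after the rename"

# Commits that touch the new name only; roughly what the filter keeps.
new_only=$(git rev-list --count HEAD -- mso-dumper)
# Commits that touch either name, i.e. the full history of the project.
both=$(git rev-list --count HEAD -- mso-dumper sc-xlsutil)

echo "commits touching mso-dumper: $new_only"
echo "commits touching either name: $both"
```

If the second number is larger than the first, the difference is the part of the history that sits before the rename.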

FOSDEM 2011 slide & latest updates

I’ve just uploaded the slides for my talk during FOSDEM 2011 here. It was very nice to be able to talk about our somewhat ambitious plan to bring LibreOffice Calc to the next level. Also, I regret that I haven’t been able to blog about what’s been going on lately; lots of time has been spent on writing and reviewing code, fixing bugs and integrating patches, and sadly little time is left for writing blog posts.

Having said all that, let me talk about a few things that are new on the master branch (since I’m already in the writing mode).

The first one is the new move/copy sheet dialog

new-copy-move-sheet-dialog

which is based on the design suggestion from Christoph Noack and coded by Joost Eekhoorn. The idea is to provide a quick way to rename a copied sheet, and also to make the layout more ergonomic and more in line with modern HIG. There are still some minor issues that we have yet to work out, but this is a step in the right direction.

The second one is related to DataPilot. In fact, two new enhancements have landed on master with regard to DataPilot.

The first enhancement is the support for an unlimited number of fields. Previously, DataPilot could only support up to 8 fields in each dimension (page, column, row and data). But now you can define as many fields in each dimension as you desire, provided that you have enough memory and CPU cycles to handle the extra load.

calc-dp-unlimited-fields

The second DataPilot enhancement is the support for a named range as the data source. Now, you can use a named range as the data source of a DataPilot table, instead of a raw range reference. This has the advantage that, when your source range grows, you can simply update the named range and refresh the DataPilot table.

calc-dp-named-range-source

However, I have not yet added a way to create a new DataPilot table with a named range as the data source. I will work on that sometime soon, hopefully in time for our 3.4 release.

Other than that, I’ve fixed quite a number of bugs and added performance enhancements particularly with regard to external reference handling. Still, there are lots of other tasks I need to do on master before we hit the 3.4 release. Stay tuned for more updates.

New LibreOffice build eye-candy

This is cool.

When you build LibreOffice straight from the master repository, and you build it in the GNOME environment, you’ll get a nice little systray thingie with up-to-date build status information.

libo-build-zenity

And this is what you get when your build happens to fail.

libo-build-zenity-failed

When you are lucky enough to have a successful build, here is what you see.

libo-build-zenity-success

I don’t know who added this, but it sure is a nice one. :-)

Update: this is the result of the fine work done by Luke Dixon.

Working with a branch using git-new-workdir

Introduction

The git package contains a script named git-new-workdir, which allows you to work in a branch in a separate directory on the file system. This differs from cloning a repository in two ways: git-new-workdir doesn’t duplicate the git history from the original repository but shares it instead, and when you commit something to the branch, that commit goes directly into the history of the original repository without an explicit push. On top of that, creating a new branch work directory happens almost instantly. It’s fast, and it’s efficient. It’s an absolute time saver for those of us who work on many branches at any given moment, and it doesn’t bloat the disk space.

As wonderful as this script can be, not all distros package this script with their git package. If your distro doesn’t package it, you can always download the source packages of git and find the script there, under the contrib directory. Also, if you have the build repository of libreoffice cloned, you can find it in bin/git-new-workdir too.
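If the script is not at hand, the sharing behavior described above can still be tried out with git worktree add, which ships with git 2.5 and later. To be clear, this is a different tool from git-new-workdir, swapped in here only because it needs no extra script; the sketch below uses it to show that a commit made in the separate branch directory is immediately visible from the original repository. The repository layout, file name and commit messages are made up for the demo.

```shell
# Sketch only: uses "git worktree add" (git >= 2.5) as a stand-in for
# git-new-workdir; all paths and names here are made up for the demo.
set -eu

# Fixed identity so the demo commits work without any git configuration.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# The "original" repository with one commit and a branch to work on.
git init -q "$work/main"
cd "$work/main"
echo "one" > file.txt
git add file.txt
git commit -q -m "first commit"
git branch libreoffice-3-3

# Check the branch out into its own directory, sharing main's history.
git worktree add "$work/libreoffice-3-3" libreoffice-3-3 >/dev/null 2>&1

# Commit in the branch directory; no push is involved.
cd "$work/libreoffice-3-3"
echo "two" >> file.txt
git commit -q -a -m "fix on the branch"

# The original repository sees the new commit on the branch right away.
subject=$(git -C "$work/main" log -1 --format=%s libreoffice-3-3)
echo "latest commit on libreoffice-3-3 as seen from main: $subject"
```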

Now, I’m going to talk about how I make use of this script to work on the 3.3 branch of LibreOffice.

Creating a branch work directory

If you’ve followed this page to build the master branch of libreoffice, then you should have in your clone of the build repository a directory named clone. Under this directory are your local clones of the 19 repositories comprising the whole libreoffice source tree. If you are like me, you have followed the above page and built your libreoffice build in the rawbuild directory.

The next step is to create a separate directory just for the 3.3 branch, which is named libreoffice-3-3, and set things up so that you can build it normally as you did in the rawbuild. I’ve written the following bash script (named create-branch-build.sh) to do this in one single step.

#!/usr/bin/env bash

# Location of the git-new-workdir script; adjust to match your system.
GIT_NEW_WORKDIR=~/bin/git-new-workdir
# Name of the sub-directory that holds the individual repositories.
REPOS=clone

print_help() {
    echo "Usage: $1 [bootstrap dir] [dest dir] [branch name]"
}

die() {
    echo "$1"
    exit 1
}

BOOTSTRAP_DIR="$1"
DEST_DIR="$2"
BRANCH="$3"

if [ "$BOOTSTRAP_DIR" = "" ]; then
    echo "bootstrap repo is missing."
    print_help "$0"
    exit 1
fi

if [ "$DEST_DIR" = "" ]; then
    echo "destination directory is missing."
    print_help "$0"
    exit 1
fi

if [ "$BRANCH" = "" ]; then
    echo "branch name is missing."
    print_help "$0"
    exit 1
fi

if [ -e "$DEST_DIR/$BRANCH" ]; then
    die "$DEST_DIR/$BRANCH already exists."
fi

# Clone bootstrap first.
"$GIT_NEW_WORKDIR" "$BOOTSTRAP_DIR" "$DEST_DIR/$BRANCH" "$BRANCH" || die "failed to clone bootstrap repo."

# Next, check out the branch in each repository.
echo "creating directory $DEST_DIR/$BRANCH/$REPOS"
mkdir -p "$DEST_DIR/$BRANCH/$REPOS" || die "failed to create $DEST_DIR/$BRANCH/$REPOS"
for repo_path in "$BOOTSTRAP_DIR/$REPOS"/*; do
    if [ ! -d "$repo_path" ]; then
        # we only care about directories.
        continue
    fi
    repo=$(basename "$repo_path")
    echo "===== $repo ====="
    "$GIT_NEW_WORKDIR" "$repo_path" "$DEST_DIR/$BRANCH/$REPOS/$repo" "$BRANCH"
done

# Set symbolic links to the root directory.
cd "$DEST_DIR/$BRANCH" || die "failed to enter $DEST_DIR/$BRANCH"
for repo_path in "$REPOS"/*; do
    if [ ! -d "$repo_path" ]; then
        # skip if not directory.
        continue
    fi
    # -t is a GNU ln option: create the links in the given directory.
    ln -s -t . "$repo_path"/*
done

The only thing you need to do before running this script is to set the GIT_NEW_WORKDIR variable to point to the location of the git-new-workdir script on your file system.

With this script in place, you can simply

cd ..  # move out of the build directory
create-branch-build.sh ./build/clone . libreoffice-3-3

and you now have a new directory named libreoffice-3-3 (same as the branch name), where all modules and top-level files are properly symlinked to their original locations, while the actual repo branches are under the clone directory (the REPOS variable in the script). All you have left to do is to start building. :-)
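To make the resulting layout a little more concrete, here is a minimal sketch of what the symlinking step does, run against an empty stand-in tree instead of real repositories. The repository and module names (calc/sc, writer/sw) and the stray file are only illustrative, and the sketch uses plain ln -s rather than the GNU-only -t option so that it also runs where -t is unavailable.

```shell
# Sketch only: mimics the symlinking step on a fake tree; the repository
# and module names below are illustrative, not taken from a real build.
set -eu

root=$(mktemp -d)

# Two pretend repositories under clone/, each holding one module,
# plus a stray file that the loop is expected to skip.
mkdir -p "$root/clone/calc/sc" "$root/clone/writer/sw"
touch "$root/clone/stray-file.txt"

cd "$root"
for repo_path in clone/*; do
    # Only directories are treated as repositories.
    [ -d "$repo_path" ] || continue
    for entry in "$repo_path"/*; do
        # Link each module into the top-level directory.
        ln -s "$entry" .
    done
done

ls -l "$root"
```

After this runs, the top-level directory holds one symlink per module, each pointing back into its repository under clone/.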

Note that there is no need to manually create a local branch named libreoffice-3-3 that tracks the remote libreoffice-3-3 branch in the original repository before running this script; git-new-workdir takes care of that for you provided that the remote branch of the same name exists.

Updating the branch work directory

In general, when you are in a branch work directory (I call it this because it sounds about right), updating the branch from the branch in the remote repo consists of two steps. First, fetch the latest history in the original repository with git fetch. Then, move back to the branch work directory and run git pull -r.

But doing this manually in all the 19 repositories can be very tedious. So I wrote another script (named g.sh) to ease this pain a little.

#!/usr/bin/env bash

REPOS=clone

die() {
    echo "$1"
    exit 1
}

if [ ! -d "$REPOS" ]; then
    die "$REPOS directory not found in cwd."
fi

echo "===== main repository ====="
# Quote "$@" so that arguments containing spaces are passed through intact.
git "$@"

for repo_path in "$REPOS"/*; do
    if [ ! -d "$repo_path" ]; then
        # Not a directory.  Skip it.
        continue
    fi
    echo "===== $(basename "$repo_path") ====="
    # Run in a subshell so that the cd does not affect the loop.
    (cd "$repo_path" && git "$@")
done

With this, updating the branch build directory takes a single command:

g.sh pull -r

That’s all there is to it.

A few more words…

As with any method in life, this method has limitations. If you build libreoffice the old-fashioned way, by applying patches on top of the raw source tree, this method doesn’t help you; you would still need to clone the repo, and manually switch to the branch in the cloned repo.

But if you build, hack and debug in rawbuild almost exclusively (like me), then this method will help you save time and disk space. You can also adopt this method for any feature branches, as long as all the 19 repos (20 if you count l10n repo) have the same branch name. So, it’s worth a look! :-)

Thank you, ladies and gentlemen.

P.S. I’ve updated the scripts to adapt to the new bootstrap-based build scheme.

Japanese language mailing lists now available

Florian (whose blog I can’t find at the moment so I’ll link his twitter account) was kind enough to set up three mailing lists dedicated to the Japanese language speakers in the LibreOffice project. So, those of you who have been patiently waiting for this moment, feel free to subscribe to them. I’ll see you guys there. :-)

Meanwhile, Yosuke Kato has made a similar announcement about the new mailing lists here (in Japanese).

Key binding compatibility options (take 2)

This post is a sequel to this previous post, so refer to that post for the details of what I’ve been working on.

Anyway, I have settled on the following Compatibility option page:

calc-compat-option

which should be just adequate for what it needs to do without being too annoying.

Also, for the sake of documenting its behavior, the following chart shows what actions are associated with what key bindings for the two key binding types (Default and OpenOffice.org legacy):

Key Binding    Default           OpenOffice.org legacy
Backspace      delete contents   delete
Delete         delete            delete contents
Ctrl-D         fill down         data select
Shift-Ctrl-D   data select       -

where the actions are

  • delete contents – launch the delete contents dialog.
  • delete – immediately delete the cell content, without the dialog.
  • fill down – fill cell content downward within selection.
  • data select – launch the selection list dialog.

Note that all the other key bindings are left untouched. Also, the list of key bindings that can get reset by this functionality may grow in future releases.

mdds 0.3.1

I’m happy to announce the release of version 0.3.1 of the Multi-Dimensional Data Structure (mdds). This is a bug fix release, and contains no major changes from the previous version (0.3.0). The highlights of this release are:

  • use of the autoconf tools, which allows you to run ./configure && sudo make install to install the library, and
  • removal of the requirement for C++0x support, by using equivalent features from the boost library, which mdds already depends on.

When using this library without C++0x support, however, you need to define the MDDS_HASH_CONTAINER_BOOST compiler macro in order for the mdds library to use boost’s hash containers instead of the ones from C++0x. Similarly, you can define the MDDS_HASH_CONTAINER_STLPORT macro to force mdds to use stlport’s hash containers instead.

I will briefly explain the incompatible support of hash containers across various libraries. Originally, the stlport library supported two hash containers, hash_map and hash_set, in the std namespace, which can be used as hashed replacements of std::map and std::set, respectively. In C++0x, however, these two containers have been renamed to unordered_map and unordered_set, which are still in the std namespace. Boost also provides unordered_map and unordered_set, but they are in the boost namespace. The changes in this release should hopefully be useful when dealing with this incompatible hash container situation across libraries.

This release contains patches from David Tardon and Phillip Thomas, who fixed various bits of the Makefile script. Phillip also helped me fix the rpm spec file. Thanks a lot!
