What goes on when loading a file.

I recently had an opportunity to spend some time reading and analyzing what actually takes place when you do a mundane thing like opening a file. If you are a user, you don’t think much about opening a document. You select the file, click Open, and you expect that file to open. If you are a coder, however, and especially if you are a coder who has spent some time either looking through or trying to debug this code, I bet you consider it one of the most horrifying places to work in, even in this code base. It certainly is for me.

Anyway, since I’m a diagram-oriented person, I’ve decided to sketch a very rough diagram of what happens when you open a file, from the moment we receive a dispatch request with the URL of the document, to the point where we pass that call to the appropriate filter code. Here is the result.

[Diagram: file-load-process-diagram]

Now, this is a cleaned-up version. The actual code contains many more branch points and quite a few “temporary” hacks (here the term “temporary” is used very loosely), which will undoubtedly confuse you even more. But I believe this diagram gives a very rough overview of how we determine the format type of the document, how the “right” filter gets picked (“right” about 95% of the time), and where to look in case something doesn’t work as expected. Hopefully.
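For readers who prefer code to diagrams, here is a minimal sketch of the general idea of "detect the type first, then pick the filter". It is not the actual code and does not follow the diagram branch by branch. The two-stage scheme (guess from the file name, then confirm against the first bytes of the content), the type names, the filter names, and every function name below are assumptions made up for illustration. Only the two byte signatures are real: ZIP packages start with `PK\x03\x04` and OLE2 compound files start with `\xd0\xcf\x11\xe0`.

```python
from typing import NamedTuple, Optional


class FormatType(NamedTuple):
    """A hypothetical registry entry tying a format type to its filter."""
    name: str
    extensions: tuple
    magic: bytes        # leading bytes expected in the file; empty if none
    filter_name: str


# Hypothetical registry; a real one would be far larger.
KNOWN_TYPES = [
    FormatType("zip_package", (".ods", ".odt"), b"PK\x03\x04", "zip_package_filter"),
    FormatType("ole2_binary", (".xls", ".doc"), b"\xd0\xcf\x11\xe0", "ole2_binary_filter"),
]
# Used when nothing else matches.
FALLBACK = FormatType("plain_text", (".txt", ".csv"), b"", "plain_text_filter")


def guess_by_name(url: str) -> Optional[FormatType]:
    """Cheap first pass: look only at the extension in the URL."""
    lowered = url.lower()
    for candidate in KNOWN_TYPES:
        if lowered.endswith(candidate.extensions):
            return candidate
    return None


def detect_type(url: str, header: bytes) -> FormatType:
    """Guess from the name, then let the content confirm or overrule it."""
    guess = guess_by_name(url)
    if guess is not None and header.startswith(guess.magic):
        return guess
    # The name was missing or misleading: fall back to content alone.
    for candidate in KNOWN_TYPES:
        if header.startswith(candidate.magic):
            return candidate
    return FALLBACK


def pick_filter(url: str, header: bytes) -> str:
    """Map the detected type to the name of the filter that should load it."""
    return detect_type(url, header).filter_name
```

In this sketch the content check has the final say, so a ZIP-based file that was renamed to `.xls` still ends up with `zip_package_filter`. Whether the name or the content should win is a design choice, and it is exactly the kind of decision that produces the extra branch points mentioned above.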

4 thoughts on “What goes on when loading a file.”

  1. Surely the loading times are not so good in LibreOffice. For example, opening a simple Calc document takes too much time on an Intel dual-core 2.6 GHz machine compared to Excel and Gnumeric.

    1. Sure. But that problem is not related to this post. This post is about how we detect file format types.

  2. So, do you see room for improvement?
    Anything that could be made into an Easy Hack?

    1. Room for improvement? Absolutely. Easy Hack? Probably not. This code frightens even the seasoned veterans. I would advise newcomers to stay away from this code.
