Wikimedia projects, including the flagship English Wikipedia, have so far been accessible only to people with internet access. Kiwix is opening that up with its offline reader.

I have blogged before about Kiwix - this article is an effort to show other people how to set it up for themselves.

Kiwix is a cross-platform reader of zim files. Zim is an open, standardised file format to store Wiki content efficiently for offline usage. It is compressed (LZMA), with fast resolution of inter-article links. It is simple (one file), and optimised to run on really small devices like phones.

Readers available

There are two supported platforms for the reader, Windows and Linux. Both are graphical client programs that look similar to a web browser, with a search box at the top of a display area. These readers need a zim file to read - there is a standard File -> Open dialog box available at the top left to let you choose one.

The first time you open a zim file, the reader will offer to index it, which lets you use the search bar. Indexing can take quite a long time, but the results are saved on disk, so you do not need to do it again. On Linux, they are stored in ~/.www.kiwix.org/kiwix in case you are interested. Both readers below can be downloaded from http://sourceforge.net/projects/kiwix/files/ - which I refer to as Sourceforge below.

Windows

To install the Windows reader, download kiwix-0.9-alpha1-win.zip from Sourceforge (see above). Unzip that file under your “Program Files” directory, and create a launcher on your desktop pointing to “Program Files/kiwix/kiwix.exe”.

Linux

On Sourceforge you will also find a deb for the very latest Ubuntu Linux, 10.04 (Lucid Lynx), called kiwix-0.9-alpha2.deb. If you are running 10.04, download this file and install it with

sudo dpkg -i kiwix-0.9-alpha2.deb

It will undoubtedly complain that some or all of libicu42, libxapian15, xapian-tools, xulrunner-1.9, zlib1g, libbz2-1.0, liblzma1 and libmicrohttpd5 are missing. That is fixable by running this :-

sudo aptitude install libicu42 libxapian15 xapian-tools xulrunner-1.9 zlib1g libbz2-1.0 liblzma1 libmicrohttpd5

This takes care of those dependencies and finishes installing kiwix for you. All these packages are in the lucid repositories.
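If you would rather see which of those packages are actually missing before installing anything, something like the following may help. It is only a sketch, assuming a Debian-style system where dpkg-query is available; the missing_packages function name is my own, made up for illustration, and is not part of kiwix.

```shell
#!/bin/sh
# Print each named package that is not fully installed.
# missing_packages is a hypothetical helper for this sketch only.
missing_packages() {
    for pkg in "$@"; do
        # dpkg-query prints "install ok installed" for a fully installed package
        status=$(dpkg-query -W -f='${Status}' "$pkg" 2>/dev/null) || status=""
        if [ "$status" != "install ok installed" ]; then
            echo "$pkg"
        fi
    done
}

# The dependency list from the dpkg complaint above
missing_packages libicu42 libxapian15 xapian-tools xulrunner-1.9 \
    zlib1g libbz2-1.0 liblzma1 libmicrohttpd5
```

Whatever it prints is what the aptitude command still has to fetch. If it prints nothing, the dependencies are already in place.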

If you are running something other than Ubuntu 10.04, you have slightly more work ahead of you. I suggest that you download the source package, kiwix-0.9-alpha2-src.tar.bz2, and get the required development packages for your distribution. You will need libxapian-dev (for search), libbz2-dev, libicu-dev, liblzma-dev, xulrunner-dev, zlib1g-dev, libmicrohttpd-dev in addition to the standard development tools. If you build an RPM or other finished package, let me know and I will get the files up on sourceforge if possible.

Under Linux the reader is called kiwix. It appears in Applications under Education, and you can use the File -> Open menu to find a zim file to read.

Zim files

Finally, you will need a zim file as a target for your reader. Currently, kiwix has these available at http://tmp.kiwix.org/zim/0.9/ - no doubt they will move to a more permanent place later.

The files are big. Available now (April 2010) is an excellent collection of English articles, selected for school use and painstakingly checked for vandalism (thanks Martin Walker!!), as wikipedia_en_wp1_0.7_30000+_05_2009_beta3.zim. This is 30,000 articles from May 2009 and is a wonderful resource.

There are other languages available, but they have not had the same scrutiny as the English archive above. Since the associated wikipedias are smaller, these are a snapshot of the entire wikipedia - Arabic, Hebrew, Italian and Persian.

Wikimedia has other projects besides Wikipedia, such as Wikibooks, Wikiversity and Wiktionary, that all make excellent candidates for zim files.

Server

The Linux installations have another mode - a web server, serving the zim of your choice over HTTP. That is how I set it up at Kwena Malapo school. It means that on a school network only one installation is necessary, and all other computers on the network can reach it as if it were a website.

First, you will need to build the index. I keep the zim files in /opt/kiwix as they are so huge. I put the index files in the same directory, so to build the index for /opt/kiwix/wikipedia_en_wp1_0.7_30000+_05_2009_beta3.zim I run the following :-

export ZIM=/opt/kiwix/wikipedia_en_wp1_0.7_30000+_05_2009_beta3.zim
kiwix-index $ZIM ${ZIM%zim}idx
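In case the ${ZIM%zim}idx part looks cryptic: it is ordinary shell parameter expansion, which strips a trailing "zim" from the value of ZIM and then appends "idx". Here is a small sketch of just that step; the idx_path_for function name is hypothetical, chosen for illustration.

```shell
#!/bin/sh
# Derive the index path from a zim path by swapping the trailing "zim" for "idx".
# idx_path_for is a hypothetical helper for this sketch only.
idx_path_for() {
    zim_path="$1"
    # ${var%pattern} removes the shortest suffix matching the pattern
    printf '%s\n' "${zim_path%zim}idx"
}

idx_path_for /opt/kiwix/wikipedia_en_wp1_0.7_30000+_05_2009_beta3.zim
# prints /opt/kiwix/wikipedia_en_wp1_0.7_30000+_05_2009_beta3.idx
```

Note that if the path does not end in "zim", nothing is stripped and "idx" is simply tacked onto the end.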

Then you can run the server as follows :-

export ZIM=/opt/kiwix/wikipedia_en_wp1_0.7_30000+_05_2009_beta3.zim
kiwix-serve --index=${ZIM%zim}idx --port=8080 --daemon $ZIM

This instructs kiwix to serve the same English Wikipedia, with its index, on port 8080 of this computer. You can then access it as http://192.168.0.254:8080/ - replace the IP address with the IP address of this computer.
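If you serve more than one zim file, the index-then-serve steps above can be wrapped in a small script. This is a rough sketch rather than anything shipped with kiwix: the serve_zim and run function names and the KIWIX_DRY_RUN switch are my own inventions, and the only kiwix options used are the ones shown above.

```shell
#!/bin/sh
# Sketch: build the index if it is missing, then start kiwix-serve.
# KIWIX_DRY_RUN is a hypothetical switch for this sketch: when set to 1,
# the commands are printed instead of being executed.
run() {
    if [ "${KIWIX_DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# serve_zim ZIMFILE [PORT] - the port defaults to 8080
serve_zim() {
    zim="$1"
    port="${2:-8080}"
    idx="${zim%zim}idx"
    # Indexing is slow, so only do it when no index exists yet
    if [ ! -e "$idx" ]; then
        run kiwix-index "$zim" "$idx" || return 1
    fi
    run kiwix-serve --index="$idx" --port="$port" --daemon "$zim"
}

# Dry run: show what would be executed for the English selection
KIWIX_DRY_RUN=1 serve_zim /opt/kiwix/wikipedia_en_wp1_0.7_30000+_05_2009_beta3.zim 8080
```

Drop the KIWIX_DRY_RUN=1 prefix to run the commands for real. Each zim file needs its own port if you want to serve several at once.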

Kiwix is open-source, and uses freely-available toolkits for compression, search and rendering. The zim format is optimised for use on small devices with low computing power, so you can expect to have a port onto your favourite smartphone OS in the future. And, yes, it handles right-to-left Hebrew and Arabic writing, thanks to its xulrunner Mozilla heritage.