Sunday, December 20, 2009

Manage Microsoft compiled html help files in Linux

If you’ve worked in Windows long enough, you have come across the .chm file format. This format is essentially a compressed collection of HTML files with an included index for easy viewing. The problem is that you can’t view these files in Linux without the help of another tool, and sometimes you want to be able to view them in another format. Fortunately, the Linux development community has solved that problem by creating various tools to manage .chm files.
In this tutorial you are going to learn how to view .chm files and convert them to both html and pdf documents. This tutorial will be using the Ubuntu distribution, but this task can be tackled with just about any distribution available.
View chm files
Before we get into converting these files, let’s see how to simply view them in Linux. Luckily, there is a single, user-friendly tool for this task: Xchm. Xchm does one thing, which is to view compiled HTML documents. It displays the contents tree (if one exists) and lets you navigate the document, change the fonts, and search the text.
To install this tool issue the following command:
sudo apt-get install xchm
Once installed, you can start the tool from the Office sub-menu of the Applications menu. When you start the tool up you should instantly notice how simple the interface is.
In Figure 1 Xchm is opened with a Microsoft TechNet document. As you can see, you can navigate around using any link included in the document as well as with the contents tree in the left pane.
But what if you want to edit that document or convert it to a pdf document? Simple – install two more applications.
Editing chm documents
In order to edit these documents, you first need to convert them to HTML documents. This conversion is done with the extract_chmLib tool, which is part of CHMLIB. To install it, issue the command:
sudo apt-get install libchm-bin
After this tool is installed you can make the conversions with a command like this:
extract_chmLib file.chm output_dir
Here file.chm is the chm file you want to convert and output_dir is the name of the directory you want to extract its contents into. The output_dir should not already exist, because the command will create it for you.
When the command is issued, all of the contents of the chm file will be extracted into the output_dir. From there you can go into the newly created directory and edit to your heart’s content.
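If you extract chm files often, the checks described above can be wrapped in a small shell function. The following is only a sketch: the function name extract_chm is made up for illustration, and the only real tool it calls is extract_chmLib from the libchm-bin package installed earlier. It refuses to run when the chm file is missing or when the output directory already exists.

```shell
#!/bin/sh
# Hypothetical helper (the name extract_chm is an assumption, not part of CHMLIB).
# Extracts a .chm file only when the target directory does not exist yet.
extract_chm() {
    chm_file=$1
    out_dir=$2

    # Make sure the input file is really there.
    if [ ! -f "$chm_file" ]; then
        echo "error: $chm_file not found" >&2
        return 1
    fi

    # The output directory should not exist; extract_chmLib creates it.
    if [ -e "$out_dir" ]; then
        echo "error: $out_dir already exists" >&2
        return 1
    fi

    extract_chmLib "$chm_file" "$out_dir"
}

# Example usage (file and directory names are placeholders):
# extract_chm file.chm output_dir
```

Guarding against an existing directory keeps the extracted files from being mixed into a directory you already use for something else.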
Converting to pdf
The next task is to convert the file to a pdf document. This is done with the help of the htmldoc application. First, install this with the command:
sudo apt-get install htmldoc
The htmldoc tool is a graphical tool that allows you to add as many HTML documents as you want and convert them all into a single PDF file. The user interface (see Figure 2) is very simple to use.
   1. Click the Add Files button.
   2. Navigate to the directory containing your html files.
   3. Select the file(s) you want to add.
   4. Check Web Page as the document type.
   5. Click on the Output tab.
   6. Check File and then give the file a name in the output path.
   7. Check PDF as the output format.
   8. Click the Generate button.
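The steps above can also be run without the graphical interface, since htmldoc accepts the same choices as command-line options. The sketch below assumes the HTML files have already been extracted; the function name html_to_pdf and the file names are placeholders chosen for illustration. The --webpage option corresponds to the Web Page document type, -t pdf selects PDF output, and -f names the output file.

```shell
#!/bin/sh
# Hypothetical helper (the name html_to_pdf is an assumption, not part of htmldoc).
# Converts one or more HTML files into a single PDF file.
html_to_pdf() {
    out_pdf=$1

    # Require an output name plus at least one input file.
    if [ "$#" -lt 2 ]; then
        echo "usage: html_to_pdf output.pdf file.html [file.html ...]" >&2
        return 1
    fi
    shift

    # --webpage : treat the input as plain web pages
    # -t pdf    : generate PDF output
    # -f FILE   : write the result to FILE
    htmldoc --webpage -t pdf -f "$out_pdf" "$@"
}

# Example usage (paths are placeholders):
# html_to_pdf manual.pdf output_dir/index.html output_dir/chapter1.html
```

The files are added to the PDF in the order they are listed on the command line, so list them in reading order.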
You can also explore the various options in the other tabs, but generating a basic PDF from your HTML documents is fairly straightforward.
Source: www.ghacks.net