dpkg:xlhtml 

Info 

A program for converting Microsoft Excel files (.xls) to HTML

Description 

The xlhtml program takes an Excel 95 or 97 file as input and converts it to highly optimized HTML. The output goes to standard output, so it can be redirected to a file, piped to filters, or used as a gateway on the internet.

Version 0.5.1-2 

Installation 

% debfoster xlhtml
Need to get 51.8kB of archives.
After unpacking 156kB of additional disk space will be used.

Quick Help 

$ xlhtml --help

xlhtml  converts excel files (.xls) to Html.
Copyright (c) 1999-2001, Charles Wyble. Released under GPL.
Usage: xlhtml [-xp:# -xc:#-# -xr:#-# -bc###### -bi???????? -tc######] <FILE>
        -a:  aggressive html optimization
        -asc ascii output for -dp & -x? options
        -csv comma separated value output for -dp & -x? options
        -xml XML output
        -bc: Set default background color - default white
        -bi: Set background image path
        -c:  Center justify tables
        -dp: Dumps page count and max rows & colums per page
        -v:  Prints program version number
        -fw: Suppress formula warnings
        -m:  No encoding for multibyte
        -nc: No Colors - black & white
        -nh: No Html Headers
        -tc: Set default text color - default black
        -te: Trims empty rows & columns at the edges of a worksheet
        -xc: Columns (separated by a dash) for extraction (zero based)
        -xp: Page extracted (zero based)
        -xr: Rows (separated by a dash) to be extracted (zero based)

csv extraction 

$ xlhtml -dp -csv Survey_RawData.xls
There are 2 pages total.
Page:0 Name:Survey Question MaxRow:42 MaxCol:9
Page:1 Name:Question #1 MaxRow:60 MaxCol:12

$ xlhtml -csv -xp:1 Survey_RawData.xls | tee Survey_RawData.csv
"Survey ID","Don't do anything","Ask parents","Ask friends","Ask family doctor","Go to emergency, clinic","Search internet","Call Telehealth","Data Quality Checksum","Gender","Age Range","International Student","Program/Course of Study"
1,3,1,2,4,6,5,7,** 28,"M","r2","N","Business Management"
[...]
60,1,1,2,5,1,2,1,** 13,"M","r3","Not any more","Social Work"
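
When saving the CSV, it is safer to write to a file other than the input workbook, since redirecting or tee-ing back onto the .xls can truncate it before xlhtml has read it. A minimal sketch of a wrapper that derives a separate output name is shown below. The function name xls_page_to_csv, the ".pageN.csv" naming scheme, and the XLHTML override variable are all hypothetical and not part of xlhtml.

```shell
# Hypothetical helper: write page $2 of workbook $1 to "<name>.page<N>.csv".
# XLHTML may be set to another command (e.g. for testing); default is xlhtml.
xls_page_to_csv() {
    in=$1
    page=$2
    # Strip a trailing ".xls" and append a page-specific CSV suffix,
    # so the output name can never equal the input name.
    out="${in%.xls}.page${page}.csv"
    # Only the -csv and -xp flags documented in the help text are used.
    "${XLHTML:-xlhtml}" -csv -xp:"$page" "$in" > "$out" && echo "$out"
}
```

With this loaded, "xls_page_to_csv Survey_RawData.xls 1" would be expected to create Survey_RawData.page1.csv and print that name.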

documented on: 2006.03.14

Trying History 

Dumps page info 
$ xlhtml -dp test.xls
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML Transitional//EN"
"http://www.w3.org/TR/REC-html40/loose.dtd">
<HTML>
<HEAD>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<meta name="GENERATOR" content="xlhtml">
<TITLE>test.xls</TITLE>
</HEAD>

<BODY TEXT="#000000" BGCOLOR="#FFFFFF"><br>
<p>There are 3 pages total.</p>
<p>Page:0 Name:Sheet1 MaxRow:7 MaxCol:2</p>
<p>Page:1 Name:Sheet2 MaxRow:-1 MaxCol:-1</p>
<p>Page:2 Name:Sheet3 MaxRow:-1 MaxCol:-1</p>
</BODY></HTML>

$ xlhtml -dp -asc test.xls
There are 3 pages total.
Page:0 Name:Sheet1 MaxRow:7 MaxCol:2
Page:1 Name:Sheet2 MaxRow:-1 MaxCol:-1
Page:2 Name:Sheet3 MaxRow:-1 MaxCol:-1

$ xlhtml -dp -csv test.xls
There are 3 pages total.
Page:0 Name:Sheet1 MaxRow:7 MaxCol:2
Page:1 Name:Sheet2 MaxRow:-1 MaxCol:-1
Page:2 Name:Sheet3 MaxRow:-1 MaxCol:-1
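
The plain-text page listing is easy to post-process. The sketch below reads "xlhtml -dp -asc" output on stdin and keeps only the pages that appear to hold data. It assumes that MaxRow:-1 / MaxCol:-1 marks an empty sheet, which the output above suggests but the help text does not state. The function name list_pages and the "|"-separated output format are made up for this example, and sheet names containing "|" would confuse it.

```shell
# Hypothetical helper: list non-empty pages from `xlhtml -dp -asc` output.
# Reads stdin, prints one line per page as: index|name|maxrow|maxcol
list_pages() {
    # Pick apart the "Page:N Name:... MaxRow:R MaxCol:C" lines; other
    # lines (such as "There are 3 pages total.") are dropped by -n/p.
    sed -n 's/^Page:\([0-9][0-9]*\) Name:\(.*\) MaxRow:\(-\{0,1\}[0-9][0-9]*\) MaxCol:\(-\{0,1\}[0-9][0-9]*\)$/\1|\2|\3|\4/p' |
    # Assumption: a negative MaxRow or MaxCol means the sheet is empty.
    awk -F'|' '$3 >= 0 && $4 >= 0'
}
```

Usage would be along the lines of "xlhtml -dp -asc test.xls | list_pages".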

Page contents 
$ xlhtml -a -nc -nh test.xls
<CENTER><H1>Sheet1</H1></CENTER><br><FONT FACE="Arial" SIZE="4"><TABLE BORDER="1" CELLSPACING="2"> [...] </TABLE></FONT><HR><FONT SIZE="-1"><I>Spreadsheet's Author:&nbsp;tong</I></FONT><br><FONT SIZE="-1"><I>Last Updated with Excel 97</I></FONT><br>&nbsp;<br><hr><FONT SIZE="-1">Created with <a href="http://chicago.sf.net/xlhtml">xlhtml 0.5.1</a></FONT><br>

Options used:
-a:  aggressive html optimization
-nc: No Colors - black & white
-nh: No Html Headers

NB:

  • The '-nh' switch also removes the background color.
  • The difference between using '-a' and not using it is in the row tags:
      • without '-a': '<tr valign="bottom">'
      • with '-a': '<tr>'

documented on: 2006.01.25

history, asc extraction 

NB: I now think the following asc extraction failed not for the reason given below but because of the wrong page number (1). It should be "0".

$ xlhtml -dp -asc test.xls
There are 3 pages total.
Page:0 Name:Sheet1 MaxRow:7 MaxCol:2
Page:1 Name:Sheet2 MaxRow:-1 MaxCol:-1
Page:2 Name:Sheet3 MaxRow:-1 MaxCol:-1

$ xlhtml -xp:1 -asc test.xls

Oh, no, the ascii output is not very useful unless you specify the full range, as in "-xp:# -xc:#-# -xr:#-#". However, there is no way of knowing the range from xlhtml itself unless you open and view the file with other tools.
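
That said, the "-dp" listing shown earlier does report MaxRow and MaxCol for each page, so the full range may be derivable without other tools. The sketch below turns one "-dp" line into extraction flags. It assumes that the -xc and -xr ranges are inclusive and that MaxRow and MaxCol are valid zero-based upper bounds; neither is confirmed by the help text. The function name range_flags is hypothetical.

```shell
# Hypothetical helper: build -xp/-xc/-xr flags from `xlhtml -dp -asc` lines.
# Reads stdin, prints one flag set per non-empty page.
range_flags() {
    # Pages with MaxRow:-1 or MaxCol:-1 do not match (no minus sign is
    # allowed in the pattern), so they produce no output.
    # Assumption: rows 0..MaxRow and columns 0..MaxCol cover the whole page.
    sed -n 's/^Page:\([0-9][0-9]*\) Name:.* MaxRow:\([0-9][0-9]*\) MaxCol:\([0-9][0-9]*\)$/-xp:\1 -xc:0-\3 -xr:0-\2/p'
}
```

For the test.xls above this would print "-xp:0 -xc:0-2 -xr:0-7", which could then be tried as "xlhtml -asc -xp:0 -xc:0-2 -xr:0-7 test.xls".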