searched in ctan on convert
searched in davecentral on 'latex', no good hit. http://www.davecentral.com/
documented on: 2000.10.22 Sun 21:50:51
Overview of rtf2latex source package http://packages.qa.debian.org/r/rtf2latex.html http://packages.qa.debian.org/rtf2latex
rtf2LaTeX is a filter built on Paul DuBois' RTF reader that converts RTF (Microsoft's Rich Text Format) into LaTeX.
rtf2LaTeX expends a good deal of effort in an attempt to make the resulting LaTeX maintainable and modifiable.
Usage:
rtf2LaTeX [options] [RTF-file]
Your options are:
  -c       No character formatting stuff
  -C file  Use another translation-file for characters above 128
  -d       Use WORD formats within special WORD styles like heading, footnote text, ...
  -H       Use LaTeX header and footer, not as default WORD header
  -h       Help message
  -L file  Use another translation-file for special WORD styles like heading, footer, footnote text, ...
  -n       Use \hfil\break instead of \\ for making a new line
  -p       No paragraph formatting stuff
  -r       No left or right skip
  -s       No tab stops
  -t       No formatting in tables
  -T f     Decrease-factor for the cell-width (default:0.7)
  -u       Change underline to italic
  -v[#]    Turn on verbose messages; the higher #, the more messages
  -V       Print the version number
last update: July 28, 1998
(written in C). The Word document must first be saved to disk in RTF format. I'd like to hear of experiences with these. (They only handle RTF versions up to WinWord 2, not yet WinWord 6; there are great differences between these rtf versions and I definitely have problems with WinWord6-rtf with any rtf-converter.)
First, Paul Dubois <dubois@primate.wisc.edu> wrote an RTF reader and converter to plain text or troff. The version is dated April 1991.
a) Based on this reader, Robert Lupton <rhl@astro.princeton.edu> wrote the rtf2TeX converter. Last revision date: May 1992
He comments on this as follows (README-file):
citation begin>> This is a first attempt at an RTF to TeX converter. The parts that handle fonts and such like seem to work pretty well, although they could be improved, but the table handling is a problem. I had a good deal of trouble trying to figure out what particular rtf control codes were supposed to do; this makes it hard to convert them into TeX. I have tried to produce good TeX, but this is not easy due to the sloppy way that many RTF writers generate redundant font and other changes. Many things are not handled at all, more due to my lacking motivation than to their intrinsic difficulty. For example, I don't support double columns, but it would be easy enough to do (I'd generate a control sequence to do it, and add the TeX code required to the TeX_defs file. I even have the TeX somewhere...). <<citation end
So far the beginning of the README file. Later he commented:
citation begin>> Most (all?) RTF is hopelessly unstructured (the equivalent of \bf \it Hello \rm World \bf He \it \rm said. )
and the code that I wrote tries valiantly to convert this to something sensible, in this case {\it Hello\/} World {\bf He} said.
It is this attempt to make the output TeX usable that makes the code complicated... I did not try to convert equations or tables as I could find no adequate description of either; I don't think that it would be very hard. <<citation end
My comment is: Tables yes, but formulas will be difficult. Can be found on all CTAN sites, dir …/support/rtf2tex
b) Based on these two, Erwin Wechtl wrote the rtf2LaTeX converter. Last revision date: Aug. 1993
He comments on this as follows (README-file):
citation begin>> rtf2LaTeX is a filter built on Paul DuBois' RTF reader that converts RTF (Microsoft's Rich Text Format) into LaTeX. rtf2LaTeX expends a good deal of effort in an attempt to make the resulting LaTeX maintainable and modifiable. <<citation end
Section: non-free/tex
Description: TeX/LaTeX to HTML converter LaTeX is popular for specifying complex printed documents. TtH translates Plain TeX or LaTeX documents into HTML. It quickly produces web documents that are compact, editable and fast viewing. TtH translates most equations instead of converting them into images. This HTML preserves much format when imported by MS Word.
TtH needs teTeX to generate auxiliary files for cross references and content tables. Complex equations and graphics require gs and netpbm to convert PostScript output from teTeX to images.
Tag: use::editing, works-with::text:tex
for hevea 1.08-4: Extra Features -> Date and time file:///usr/share/doc/hevea/html/manual032.html#toc98
Date and time support is not enabled by default, for portability and simplicity reasons.
However, the HEVEA source distribution includes a simple (sh) shell script xxdate.exe that activates date and time support. The hevea command should be invoked as:
# hevea -exec xxdate.exe ...
This will execute the script xxdate.exe, whose output is then read by HEVEA. As a consequence, standard LATEX counters year, month, day and time are defined and LATEX command \today works properly. Additionally the following counters and commands are defined:
Counter weekday, Hour, hour, minute…
To change the date format from European to US, make the following modification:
$ diff -wu1 /usr/share/hevea/latexcommon.hva~ /usr/share/hevea/latexcommon.hva
--- /usr/share/hevea/latexcommon.hva~ 2006-05-16 19:53:17.000000000 +0200
+++ /usr/share/hevea/latexcommon.hva 2006-07-19 16:33:47.000000000 +0200
@@ -538,3 +538,4 @@
 %%%%%%%% Format for \today, e.g 31st July, 1980
-\newcommand\today{\english@day\english@month,~\theyear}
+%\newcommand\today{\english@day\english@month,~\theyear}
+\newcommand\today{\english@month{} \theday,~\theyear}
 %%%%%%%%%%%%%% Defined counter printing functions
texf=cdict
latex2html $texf
latex2html -split 0 -nonavigation -dir ~/tmp $texf
cp -v ~/tmp/$texf.* .
dirdir $texf*
latex2html -h
  -split num
      Stop making separate files at this depth (say "-split 0" for one huge HTML file).
  -(no)navigation
      Put a navigation panel at the top of each page (default).
  -(no)show_section_numbers
      When this is set true, the section numbers are shown. The section numbers should then match those that would have been produced by LaTeX. The correct section numbers are obtained from the $FILE.aux file generated by LaTeX. Hiding the section numbers encourages use of particular sections as standalone documents. In this case the cross reference to a section is shown using the default symbol rather than the section number.
latex2html-2002-2-1.tar.gz from CVS, 24-Feb-2004 http://saftsack.fs.uni-bayreuth.de/~latex2ht/user/latex2html-2002-2-1.tar.gz
$ configure
[...]
Note: Will install...
... executables to : /usr/local/bin
... shared library items to : /usr/local/share/lib/latex2html
... unshared library items to : /usr/local/lib/latex2html

$ make install
Info: Running /usr/bin/mktexlsr to rebuild ls-R database...
mktexlsr: Updating /usr/share/texmf/ls-R...
mktexlsr: Updating /var/lib/texmf/ls-R...
mktexlsr: Done.
Comes with tetex-latex-1.0.7-57
Supported styles are in /usr/share/latex2html/styles/*.perl.
cd /usr/share/latex2html/styles
ls -C | sed 's/\.perl/ /g'

CJK          czech      irish         polski
TEMPLATE     danish     italian       portuges
afrikaan     dutch      j-article     psfrag
alltt        english    j-book        psfrag.jp1.4
american     enumerate  j-report      report
amsart       epsbox     japanese      rgb.txt
amsbook      epsfig     jarticle      romanian
amsfonts     esperant   jbook         scottish
amsmath      estonian   jreport       seminar
amssymb      finnish    jsarticle     slides
amstex       floatfig   jsbook        slovak
article      floatflt   jslides       slovene
ascmac       frames     justify       spanish
austrian     francais   latexsym      supertabular
babel        french     letter        texdefs
babelbst     galician   longtable     texnames
bahasa       german     lsorbian      textcomp
book         germanb    lyx           turkish
brazil       graphics   magyar        usorbian
breton       graphicx   makeidx       verbatim
catalan      harvard    more_amsmath  verbatimfiles
changebar    havard     multicol      webtex
chemsym      heqn       natbib        welsh
color        hthtml     nharvard      wrapfig
colordvi     html       norsk         xspace
crayola.txt  htmllist   nynorsk       xy
croatian     inputenc   polish
If commented-out TeX lines contain commands, latex2html wrongly picks those commands up.
Yes, able to use comment package.
latex textest.tex
This is TeX, Version 3.14159 (Web2C 7.3.1)
(textest.tex
LaTeX2e <2000/06/01>
Babel <v3.7h> and hyphenation patterns for american, french, german, ngerman, i
talian, nohyphenation, loaded.
(/usr/share/texmf/tex/latex/base/article.cls
Document Class: article 2000/05/19 v1.4b Standard LaTeX document class
(/usr/share/texmf/tex/latex/base/size11.clo))
(/lfs/cache/fromCD/tex/cmd-gnrl.tex) (textest.aux)
[...]

latex2html -verbosity 0 -split 0 -nonavigation -noinfo -noaddress -dir /tmp textest.tex
texexpand V2002 (Revision 1.11)
texexpand: include cmd-gnrl.tex failed. Reinserting command
Loading /usr/share/latex2html/styles/texdefs.perl...
!! |
ln -s /lfs/cache/fromCD/tex/cmd-gnrl.tex
latex2html -verbosity 0 -split 0 -nonavigation -noinfo -noaddress -dir /tmp textest.tex
texexpand V2002 (Revision 1.11)
Loading /usr/share/latex2html/styles/texdefs.perl...
-- ok now, have to ln to here
The following is OK with latex and latex2rtf, but not OK with latex2html:
\newcommand\rfSkip\smallskip
* Could not find argument for command \newcommand **
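One possible workaround, not tested in these notes, is to rewrite the unbraced definition into the braced form before running latex2html. The sketch below is a minimal preprocessing step in Python. It assumes that the braced form is accepted (the braced \newcommand further down is at least parsed), and the function name brace_newcommands is made up for illustration.

```python
import re

# Matches the unbraced form  \newcommand\name\body  (also \renewcommand),
# where both the name and the body are single control sequences.
UNBRACED = re.compile(r"\\(newcommand|renewcommand)\s*(\\[A-Za-z@]+)\s*(\\[A-Za-z@]+)")


def brace_newcommands(tex: str) -> str:
    """Rewrite \\newcommand\\a\\b as \\newcommand{\\a}{\\b}; leave braced forms alone."""
    return UNBRACED.sub(
        lambda m: "\\%s{%s}{%s}" % (m.group(1), m.group(2), m.group(3)),
        tex,
    )


if __name__ == "__main__":
    print(brace_newcommands(r"\newcommand\rfSkip\smallskip"))
```

Definitions whose body is more than a single control sequence are left untouched, so this only covers the simple case shown above.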
\newcommand{\lih}{\item }     % item list highlight
\renewcommand{\lih}{\item }   % item list highlight

The above won't work (no item entries are generated). It only works when the \renewcommand is commented out. During the process:

Processing macros ...,,,,,,,,,,,,,++..............
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Translating ...0/0.
*** redefining \lih *** ..;.,,........................,................................;...
Doing section links ....
Unknown commands: item
Done.
Note for --prefix: The final directory structure depends on the name of the prefix:
if prefix contains the string "latex2html" or "l2h": Then binaries go into $prefix/bin, while the rest goes into $prefix
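The prefix rule can be written down as a small Python sketch. Only the first branch comes from the installation note itself. The fallback layout is an assumption copied from the configure output shown earlier for /usr/local, and install_dirs is a made-up helper name, not part of latex2html.

```python
def install_dirs(prefix: str) -> dict:
    """Guess where 'make install' puts files for a given --prefix."""
    prefix = prefix.rstrip("/")
    if "latex2html" in prefix or "l2h" in prefix:
        # Rule from the note: binaries in $prefix/bin, the rest in $prefix.
        return {"bin": prefix + "/bin", "shared": prefix, "unshared": prefix}
    # Assumption: generic layout as printed by configure for /usr/local.
    return {
        "bin": prefix + "/bin",
        "shared": prefix + "/share/lib/latex2html",
        "unshared": prefix + "/lib/latex2html",
    }


if __name__ == "__main__":
    for p in ("/usr/local", "/opt/latex2html"):
        print(p, install_dirs(p))
```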
After configure has completed, you may check the cfgcache.pm file if everything is ok. It contains all the information gathered from your system and there should be no need to change anything.
Building
Run "make". The distribution files (extension .pin) are turned into the locally adapted scripts, using the configuration from cfgcache.pm. If you need to change things, then re-run the configuration step with the appropriate options (preferred) or edit the cfgcache.pm file and run "make" again.
Testing
Do a plausibility check: "make check". The perl scripts are checked for syntax correctness. For compiling a small test document, type "make test".
Newsgroups: comp.text.tex Date: 2000/03/19
> The date is formatted in latex2html.bat (for windows version 99.2beta6)
Problem is I run it under Linux!!

> To change the date format look for the lines:
>   # Author address
>   @address_data = &address_data('ISO');
>
> Get rid of 'ISO' on the second line to read
>   # Author address
>   @address_data = &address_data();
>
> This change gives the American date format mm/dd/yyyy.
> To change this find the subroutine:
>   sub get_date {
>
> and at the end change the line
>
>   } else { sprintf("%d/%d/%d", $m+1, $d, 1900+$y); } to
>   } else { sprintf("%d/%d/%d", $d, $m+1, 1900+$y); }
>
> You can also change the / in this line if you wish.
That sounds good but the problem is I don't have these subroutines in my latex2html.config file, and I don't know in which file they are on this system.

> There's an interesting discussion of this issue at
> http://www.xray.mpe.mpg.de/mailing-lists/latex2html/1997-08/msg00100.html
> and you'll probably be able to use it to come up with a better solution.

Same thing there, it doesn't specify where to find these subroutines.
I solved the problem (maybe not in the best way) by using a regular expression on the variable that gives the date in yyyy-mm-dd in my latex2html.config:

$address_data[1]=~s/(.*)-(.*)-(.*)/$3-$2-$1/;
$ADDRESS = "<I>Dernière modification : $address_data[1]</I>";

and it does just what I want! Thanks anyway,
Jean-Marc Lecarpentier
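For reference, the same field swap can be tried outside latex2html. This is a small Python sketch of the substitution used in the post, not latex2html code. The function name iso_to_day_first is made up for illustration.

```python
import re


def iso_to_day_first(date: str) -> str:
    """Swap the first and third dash-separated fields: yyyy-mm-dd -> dd-mm-yyyy."""
    # Same pattern as the Perl substitution s/(.*)-(.*)-(.*)/$3-$2-$1/
    return re.sub(r"(.*)-(.*)-(.*)", r"\3-\2-\1", date)


if __name__ == "__main__":
    print(iso_to_day_first("2000-03-19"))
```

A string without two dashes is returned unchanged, which matches how the Perl substitution behaves when the pattern does not match.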
>and it does just what i want!
Wonderful! I copied your code into my Windows .latex2html-init file and it worked well. It's a lot better than altering config or bat files and is easily changeable for any situation
Steve Mayer
It won't work for latex2html 2002. The following did the trick.
$ diff -wu /usr/share/latex2html/styles/english.perl~ /usr/share/latex2html/styles/english.perl
--- /usr/share/latex2html/styles/english.perl~ 2003-02-18 22:13:01.000000000 -0500
+++ /usr/share/latex2html/styles/english.perl 2004-01-28 21:07:52.000000000 -0500
@@ -62,7 +62,7 @@
 sub english_today {
     local($today) = &get_date();
-    $today =~ s|(\d+)/0?(\d+)/|$2 $Month[$1] |;
+    $today =~ s|(\d+)/0?(\d+)/|$Month[$1] $2, |;
     join('',$today,$_[0]);
 }
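The effect of the changed substitution can be checked with a small Python sketch. It assumes that get_date returns month/day/year and that the Month table is indexed from 1, which is what the patch implies but is not confirmed here. MONTHS and us_today are illustrative names, not part of latex2html.

```python
import re

# Assumption: index 1 is January, mirroring $Month[$1] in english.perl.
MONTHS = ["", "January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"]


def us_today(date: str) -> str:
    """Turn 'm/d/yyyy' into 'Month d, yyyy', like the patched english_today."""
    return re.sub(
        r"(\d+)/0?(\d+)/",
        lambda m: "%s %s, " % (MONTHS[int(m.group(1))], m.group(2)),
        date,
        count=1,
    )


if __name__ == "__main__":
    print(us_today("1/28/2004"))
```

The 0? in the pattern drops a leading zero from the day, so 07/04/2004 comes out with the day written as 4.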
The basic idea of Hyperlatex is to make it possible to write a document that will look like a flawless LaTeX document when printed and like a handwritten HTML document when viewed with an HTML browser. In this it completely follows the philosophy of latexinfo (and texinfo). Like latexinfo, it defines its own input format—the Hyperlatex markup language—and provides two converters to turn a document written in Hyperlatex markup into a DVI file or a set of HTML documents.
Obviously, this approach has the disadvantage that you have to learn a "new" language to generate HTML files. However, the mental effort for this is quite limited. The Hyperlatex markup language is simply a well-defined subset of LaTeX that has been extended with commands to create hyperlinks, to control the conversion to HTML, and to add concepts of HTML such as horizontal rules and embedded images. Furthermore, you can use Hyperlatex perfectly well without knowing anything about HTML markup.
The fact that Hyperlatex defines only a restricted subset of LaTeX does not mean that you have to restrict yourself in what you can do in the printed copy. Hyperlatex provides many commands that allow you to include arbitrary LaTeX commands (including commands from any package that you'd like to use) which will be processed to create your printed output, but which will be ignored in the HTML document.
If you would rather have a tool that takes any Latex-file and translates it into HTML, you are probably happier with the Latex2html converter.
Otfried Cheong, February 25, 2002
latex2rtf foo                convert foo.tex to foo.rtf
latex2rtf <foo >foo.RTF      convert foo to foo.RTF
latex2rtf -M12 foo           replace equations with bitmaps
latex2rtf -i russian foo     assume russian tex conventions
latex2rtf -C raw foo         retain font encoding in rtf file
latex2rtf -d4 foo            lots of debugging information
Latex to Rtf Converter
LaTeX to RTF convertor written in GNU C, text mode (no GUI), aimed at platform independence
Usage: latex2rtf [options] input[.tex]
Options:
  -a auxfile       use LaTeX auxfile rather than input.aux
  -b bblfile       use BibTex bblfile rather than input.bbl)
  -C codepage      latex encoding charset (latin1, cp850, raw, etc.)
  -d level         debugging output (level is 0-6)
  -F               use LaTeX to convert all figures to bitmaps
  -D dpi           number of dots per inch for bitmaps
  -h               display help
  -i language      idiom or language (e.g., german, french)
  -l               use latin1 encoding (default)
  -M #             math equation handling
       -M1         displayed equations to RTF
       -M2         inline equations to RTF
       -M3         inline and displayed equations to RTF (default)
       -M4         displayed equations to bitmap
       -M6         inline equations to RTF and displayed equations to bitmaps
       -M8         inline equations to bitmap
       -M12        inline and displayed equations to bitmaps
       -M16        insert Word comment field that the original equation text
  -o outputfile    file for RTF output
  -p               option to avoid bug in Word for some equations
  -P /path/to/cfg  directory containing .cfg files
  -S               use ';' to separate args in RTF fields
  -se#             scale factor for bitmap equations
  -sf#             scale factor for bitmap figures
  -T /path/to/tmp  temporary directory
  -v               version information
  -V               version information
  -W               include warnings in RTF
  -Z #             add # of '}'s at end of rtf file (# is 0-9)
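The -M values listed in the help text look additive (3 = 1 + 2, 6 = 2 + 4, 12 = 4 + 8), which suggests they are bit flags. The Python sketch below decodes a value under that assumption. It is an inference from the listing, not taken from the latex2rtf sources, and describe_math_option is a made-up name.

```python
# Single-bit meanings, copied from the -M1, -M2, -M4, -M8 and -M16 help lines.
EQUATION_FLAGS = {
    1: "displayed equations to RTF",
    2: "inline equations to RTF",
    4: "displayed equations to bitmap",
    8: "inline equations to bitmap",
    16: "insert Word comment field with the original equation text",
}


def describe_math_option(value: int) -> list:
    """List the single-bit behaviours that make up an -M value."""
    return [text for bit, text in EQUATION_FLAGS.items() if value & bit]


if __name__ == "__main__":
    for m in (3, 6, 12):
        print("-M%d:" % m, "; ".join(describe_math_option(m)))
```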
In /usr/local/share/latex2rtf/latex2rtf.html
v1.9.15, 2004-02-08
550 kb
LaTeX2RTF understands most of the commands introduced with LaTeX2e. It supports both the old 2.09 version of `\documentstyle[options]{format#}' and the newer `\documentclass[options]{format}'.
It is not necessary to specify the `-C' option if you use `\usepackage{isolatin1}' or `\documentstyle[isolatin1]{...}'. LaTeX2RTF automagically detects these packages/style options and switches to processing of ISO-Latin1 codes.
Many languages from the Babel package are supported. However, the support is limited to translate various words usually emitted by LaTeX during processing. For example, this ensures that the LaTeX2RTF will provide the correct translation of the word "Chapter" in the converted document.
Cross references include everything that you might expect and then some: bibliographic citations, equation references, table references, figure references, and section references. Section, equation, table and figure references are implemented by placing RTF bookmarks around the equation number (or table number or figure number).
There are four separate levels of equation translation based on the -M switch. Each equation is now converted either to an EQ field or to a bitmap.
The table code is currently barely working. It needs to be rewritten.
There is now rudimentary support for `\includegraphics'. Three file types will be inserted into the RTF file without needing conversion: .pict, .jpeg, and .png files. EPS files are converted to PNG using `convert' from the ImageMagick package.
If there is no `\pagestyle' command, the RTF output is generated as with plain pagestyle, i.e. each page gets its page number centered at the bottom.
You must turn this off with the \pagestyle{empty} command in the LaTeX file if you don't want page numbers. The headings and myheadings styles are silently ignored for now. The twosided option to the \documentstyle or \documentclass produces the corresponding RTF tokens. Note that these features require RTF Version 1.4.
Hyperlatex support is largely broken at the moment, but continues to improve.
Warning line=5 Unknown style option comment ignored{
Error! line=6 \begin{comment} found before \begin{document}. Giving up. Sorry
Warning line=6 Mismatched '{' in RTF file, Conversion may cause problems.
Warning line=6 Try translating with 'latex2rtf -Z1 (null)'}

$ latex2rtf -Z1 < textest.tex > /tmp/textest.rtf
Warning line=5 Unknown style option comment ignored{
Error! line=6 \begin{comment} found before \begin{document}. Giving up. Sorry
Warning line=6 Mismatched '{' in RTF file, Conversion may cause problems.}
}
make
make check    -- expect warnings but no errors
cd test       -- view generated rtf files
% install-info --entry=latex2rtf --info-dir=/usr/local/info /usr/local/info/latex2rtf.info

Note, the info entry in Emacs is still wrong. In the info node I get:
  Miscellaneous latex2rtf
but it is not clickable. However, "info latex2rtf" shows the right result.

% install-info --info-dir=/usr/local/info /usr/local/info/latex2rtf.info
install-info: warning: no info dir entry in `/usr/local/info/latex2rtf.info'
install TeXInfo file latex2rtf.info
make install-info
install-info: warning: no info dir entry in `doc/latex2rtf.info'
install-info --entry=latex2rtf --info-dir=/usr/local/info doc/latex2rtf.info

% make install-info
mkdir -p /usr/local/info
cp doc/latex2rtf.info /usr/local/bin
install-info --info-dir=/usr/local/info doc/latex2rtf.info
install-info: warning: no info dir entry in `doc/latex2rtf.info'

% cp doc/latex2rtf.info /usr/local/info/
% install-info --info-dir=/usr/local/info/ latex2rtf.info
install-info: No such file or directory for latex2rtf.info

% install-info /usr/local/info/latex2rtf.info
install-info: No dir file specified; try --help for more information.

% install-info /usr/local/info/latex2rtf.info /usr/local/info
install-info: warning: no info dir entry in `/usr/local/info/latex2rtf.info'

% install-info /usr/local/info/latex2rtf.info /usr/local/info --entry=latex2rtf
/usr/local/info: Is a directory

% install-info --entry=latex2rtf /usr/local/info/latex2rtf.info /usr/local/info
/usr/local/info: Is a directory

% install-info --entry=latex2rtf --info-dir=/usr/local/info /usr/local/info/latex2rtf.info
-- no complaints any more.
ltx2rtf-5-0.zip 1094 Kb Version 5.0, revision: 2.36, date: 27 November 2000- 22:08:40
mainly for windows user, w/ all sorts of .exe and .bat files.
ltx2rtf.zip 22-Dec-1998 07:00 1.0M As of 2002.07.04 Thu.
$ md ltx2rtf # ! no root dir in the zip file!
For linux, an executable is readily provided (compiled with RedHat 6.1), and a Makefile is provided.
Read carefully the text file: ltx2rtf.doc
Make sure that the configuration files
  - direct.cfg,
  - fonts.cfg,
  - ignore.cfg,
  - config.cfg
are in the correct directory:
  . either the same directory as the executable,
  . or the LIBDIR directory stated in the Makefile
  . or the directory stated by the environment variable RTFPATH.
./ltx2rtf srsm-nt.tex
Copy ltx2rtf and its .cfg files to /opt/bin.

$ ltx2rtf tongsun.tex
Segmentation fault

$ export RTFPATH=/opt/bin
$ ltx2rtf tongsun.tex
Segmentation fault
ltx2rtf DOES NOT expand macros. Nor does it consider \usepackage specifications. However, the exp-macr routine (in F77) performs expansion of simple macros defined (with \def or \newcommand or \renewcommand) inside the given LaTeX source.
exp-macr original.tex expanded.tex
expands all macros defined in "original.tex" to "expanded.tex", which is likely to be fit for submitting to ltx2rtf. Besides, exp-macr also expands the \input commands found in the given original source. This means that specific macro definitions can be kept in a separate file that is included in the whole document.
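Since exp-macr itself could not be located, the kind of expansion it is described as doing can be sketched in Python. This is not exp-macr and makes no claim to match its behaviour. It is a minimal illustration that only handles argument-less \newcommand and \renewcommand definitions with brace-free bodies, and expand_simple_macros is a made-up name.

```python
import re

# \newcommand{\name}{body} or \renewcommand{\name}{body}, body without braces.
DEFINITION = re.compile(r"\\(?:re)?newcommand\{\\([A-Za-z]+)\}\{([^{}]*)\}")


def expand_simple_macros(source: str) -> str:
    """Remove simple macro definitions and substitute their bodies at each use."""
    macros = {}

    def record(match):
        macros[match.group(1)] = match.group(2)  # a later definition wins
        return ""

    body = DEFINITION.sub(record, source)
    for name, replacement in macros.items():
        # (?![A-Za-z]) keeps \foo from matching inside \foobar.
        body = re.sub(r"\\" + name + r"(?![A-Za-z])",
                      lambda _m, r=replacement: r, body)
    return body


if __name__ == "__main__":
    print(expand_simple_macros("\\newcommand{\\proj}{Project X}\nSee \\proj.\n"))
```

Macros with arguments, \def, nested definitions and \input are all out of scope for this sketch.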
exp-macr is not included in the gcc-g77 RH7.2 rpm package. Search on the Internet for "exp-macr" failed, as of 2002.07.04 Thu. All links are broken.
"http://www.tug.org/texlive/" does not release its CD content any more(?).
Generally acceptable
Some bullets spread over 2 lines, but others are OK. Putting the bullet and its content on the same line solved the problem.
the width of Job experience table is not good, but the width of the top table is ok.
delete the table width parameter. Error: Missing p-arg in array arg.
& %\multicolumn{2}{p{4.5in}}{ Led a web-based ... % \newline }
Removing '&' and uncommenting '%' will cause 'segmentation fault'.
\begin{center} % put inside center environment
\begin{tabular*}{0.75\textwidth}%
...
\end{tabular*}
\end{center}
Output taken from LaTeX name: ttt.rtf
Begin of environment: TABULAR*
/opt/bin/ltx2rtf: ERROR: Illegal dimension unit '' at line 0
Program aborted
\begin{center} % put inside center environment
\begin{tabular*}{5in}%
{@{\extracolsep{\fill}}cccr}
label 1 & label 2 & label 3 & label 4 \\
\hline % put a line under headers
item 1 & item 2 & item 3 & item 4 \\
...
\end{tabular*}
\end{center}
Output taken from LaTeX name: ttt.rtf
Begin of environment: TABULAR*
WARNING: command: extracolsep at line 12 not found - ignored
WARNING: command: fill at line 12 not found - ignored
WARNING: & not in tabular environment, ignored
WARNING: & not in tabular environment, ignored
WARNING: & not in tabular environment, ignored
WARNING: command: hline at line 14 not found - ignored
WARNING: & not in tabular environment, ignored
WARNING: & not in tabular environment, ignored
WARNING: & not in tabular environment, ignored
Segmentation fault
Newsgroups: comp.text.tex
> editors want it in Word.
Dear Kris: If you have access to a copy of Word (version 6 or above) I would recommend the `tex2doc' Word macros, which do a very good job of producing a Word document, and even handle bibliographical references (but not maths). They are available at:
HTH, Robert.
*Tags*: ps to text, ps2text
pstotext ~/dl/mustH_b/wp/latex/learn/epslatex.ps > ~/tmp/ps/txt/epslatex.txt
pstotext is a program that works with Ghostscript to extract plain text from PostScript and PDF files.
pstotext works by sending a library, followed by the PostScript file, to the Ghostscript interpreter. The library intercepts the text rendering operators and sends information about the text back to pstotext. This information includes character metrics and encoding vectors, so in most situations we're able to reconstruct the plain text (converted to ISO Latin 1 encoding), with correct word breaks and good guesses about line breaks. It even works for rotated text!
http://www.research.compaq.com/SRC/virtualpaper/pstotext.html http://www.research.compaq.com/SRC/virtualpaper/cgi-bin/nph-download.tcl/pstotext.tar.Z?object=pstotext
The '-output' doesn't work. Have to redirect output instead.
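When pstotext is driven from a script, the same redirect can be done without a shell. The Python sketch below is a generic helper that captures a command's standard output into a file. run_to_file is a made-up name, and the pstotext call in the comment assumes the program is on PATH and that the file names are placeholders.

```python
import subprocess


def run_to_file(command: list, out_path: str) -> int:
    """Run command and write its stdout to out_path, like 'command > out_path'."""
    with open(out_path, "wb") as out:
        return subprocess.run(command, stdout=out).returncode


# Example call (assumes pstotext is installed):
#   run_to_file(["pstotext", "epslatex.ps"], "epslatex.txt")
```

The return value is the exit status of the command, so a non-zero result signals that the conversion failed even though the output file was created.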
Test on KDD2000PostWkshp.ps and the result is great
Test on www10_sarwar.pdf and the result is garbage
Usage: pstotext [option|file]...
Options:
  -cork            assume Cork encoding for dvips output
  -landscape       rotate 270 degrees
  -landscapeOther  rotate 90 degrees
  -portrait        don't rotate (default)
  -bboxes          output one word per line with bounding box
  -debug           show Ghostscript output and error messages
  -gs "command"    Ghostscript command
  -                read from stdin (default if no files specified)
  -output file     output results to "file" (default is stdout)
make
cp pstotext /opt/bin/
cp pstotext.1 /opt/man/man1
chmod 711 /opt/bin/pstotext
chmod 444 /opt/man/man1/pstotext.1
documented on: 2001.06.03 Sun 18:12:23
> I want to extract just the plain text from a Latex document. I was > wondering whether anyone knew a tool or > program that could do this for me?
detex, despite its name, also works for LaTeX (mostly).
Sven
documented on: 2000.10.22 Sun 23:40:48
>Are there any tools to take a LaTeX file, and convert it in a human >readable Text document? I have used deTeX but it seems to ignore/skip >some fields such as "verbatim", which it simply does not print in the >text file.
One possibility is to latex it to a dvi file, and then use dvi2tty to convert this into a text file. This isn't a great method, but I don't think that any method for doing this is perfect.
Faheem Mitha.
I use tth (html) and then htmstrip (DOS-Program) to convert tex files to formatted text files. They need not much work and htmstrip does handle tables pretty well :-)
htmstrip: http://www.erols.com/waynesof
I still need to test hevea which should also produce good results from tex.
Peter
Another possibility is to use Hevea's "text" mode. As always, it won't do a great job on math, but that is to be expected. Hevea is available from
pdftohtml converts Portable Document Format files to HTML. This release converts text and links. Bold and italic face are preserved, but high level HTML structures (like lists or tables) are not yet generated. Images are ignored in the current version (but you can extract them from the pdf file using pdfimages, distributed with xpdf).