2008-01-12

Converting LaTeX to HTML, etc. with TeX4ht

This is going to be a bit of an inane post and I apologize in advance. The essence is as follows: I want to convert LaTeX into HTML, etc. using TeX4ht on Ubuntu. The problem is that right after installing TeX4ht, htlatex fails with a complaint that it can't find or open tex4ht.env. The following commands should get everything working:


$ sudo apt-get install texlive tetex-extras tex4ht dvipng
$ sudo texhash
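
As a quick sanity check after the install, something like the following can list which of the tools used in this post are still missing from the PATH. This is only a sketch, assuming a POSIX shell; missing_tools is a made-up helper name, not something that ships with TeX4ht, and the tool list is simply the set of commands that show up later in this post.

```shell
#!/bin/sh
# Sketch: print each given command name that is NOT found on PATH.
# missing_tools is a hypothetical helper, not part of TeX4ht or TeX Live.
missing_tools() {
    for tool in "$@"; do
        # command -v prints nothing and returns non-zero when the tool is absent
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# The commands that appear in this post; no output means all were found.
missing_tools htlatex tex4ht t4ht dvipng
```

An empty result only says the programs are installed; it says nothing about whether the TeX file database is up to date.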


I'm repeating the gory details below for informational purposes.

The other night, I posted about my experiences with converting LaTeX into HTML, etc. I was concerned mostly with TeX4ht, which seems to be a general-purpose conversion utility from LaTeX to a number of other formats (including ODF, SGML, and MS Word). At first, I had trouble getting the thing to work, but I think I have the problem worked out now. This post includes all the details about how I got TeX4ht to work.

I did this test on a machine running Ubuntu Server 7.10 i386, but it should work on the desktop versions as well. First, I installed texlive and tetex-extras. The texlive package installs the base LaTeX system; tetex-extras, which is actually optional, installs some useful goodies, namely the RevTeX package used by the APS journals.


$ sudo apt-get install texlive tetex-extras


Now that the base system is in, I generated the following LaTeX document:


\documentclass[10pt]{article}


%opening
\title{The best document, EVAR!}
\author{Joshua Ryan Smith, Ph.D.}

\begin{document}

\maketitle

\begin{abstract}
In this paper, we reveal the best document ever. It is the coolest.
\end{abstract}

\section{Introduction}
For years, researchers have searched for the absolute best document ever. Here, we demonstrate that this document is the best ever.

\section{Equation}
This is the section for equations. Bask in the sheer awesomeness of this one:

\begin{equation} \label{eq:00}
y = mx+b
\end{equation}

\section{Conclusions}
Recall Eq. \ref{eq:00}. That is the best, isn't it?


\end{document}


This document contains an equation and a cross-reference, which are some of the most useful features of LaTeX. I ran pdflatex on this document, which produced a reasonable PDF. So far, so good.
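
Because the document uses \ref, LaTeX needs a second pass before the cross-reference resolves; on the first pass it warns that labels may have changed. A minimal sketch of automating that check, assuming a POSIX shell and that the log file sits next to the source (needs_rerun is a hypothetical helper name; the message it looks for is the standard LaTeX warning text):

```shell
# Sketch: succeed (exit 0) when a LaTeX log asks for another pass.
# needs_rerun is a hypothetical helper, not part of any TeX distribution.
needs_rerun() {
    grep -q 'Rerun to get cross-references right' "$1"
}

# Usage sketch (assumes test.tex is in the current directory and pdflatex is installed):
#   pdflatex test.tex
#   if needs_rerun test.log; then pdflatex test.tex; fi
```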

Next, I installed TeX4ht:


$ sudo apt-get install tex4ht


Here is where the weirdness starts. When I tried to convert the LaTeX source into an HTML document, I got an error:


$ htlatex test.tex
This is pdfTeXk, Version 3.141592-1.40.3 (Web2C 7.5.6)
%&-line parsing enabled.
entering extended mode
LaTeX2e <2005/12/01>
Babel and hyphenation patterns for english, usenglishmax, dumylang, noh
yphenation, croatian, bulgarian, russian, ukrainian, czech, slovak, danish, dut
ch, finnish, finnish, french, basque, french, german, ngerman, german, ngerman,
greek, monogreek, ancientgreek, ibycus, hungarian, hungarian, italian, italian
, latin, latin, mongolian, mongolian, norsk, norsk, coptic, esperanto, estonian
, icelandic, indonesian, interlingua, romanian, serbian, slovenian, turkish, up
persorbian, welsh, polish, polish, portuguese, portuguese, spanish, catalan, ga
lician, spanish, catalan, galician, swedish, swedish, loaded.
(./test.tex (/usr/share/texmf-texlive/tex/latex/base/article.cls
Document Class: article 2005/09/16 v1.4f Standard LaTeX document class
(/usr/share/texmf-texlive/tex/latex/base/size10.clo))
(/usr/share/texmf/tex/generic/tex4ht/tex4ht.sty)
(/usr/share/texmf/tex/generic/tex4ht/tex4ht.4ht
::::::::::::::::::::::::::::::::::::::::::
TeX4ht info is available in the log file
::::::::::::::::::::::::::::::::::::::::::
) (/usr/share/texmf/tex/generic/tex4ht/tex4ht.sty
--- needs --- tex4ht test ---
(./test.tmp)
l.1418 --- TeX4ht warning --- No file test.xref ---
(/usr/share/texmf/tex/generic/tex4ht/html4.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4-math.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4-math.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4-math.4ht)
(/usr/share/texmf/tex/generic/tex4ht/latex.4ht
(/usr/share/texmf/tex/generic/tex4ht/html4.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4-math.4ht))
(/usr/share/texmf/tex/generic/tex4ht/fontmath.4ht
(/usr/share/texmf/tex/generic/tex4ht/html4.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4-math.4ht))
(/usr/share/texmf/tex/generic/tex4ht/article.4ht
(/usr/share/texmf/tex/generic/tex4ht/html4.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4-math.4ht))
(/usr/share/texmf/tex/generic/tex4ht/html4.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4-math.4ht))
No file test.aux.
[1] [2] [3]

LaTeX Warning: Reference `eq:00' on page 4 undefined on input line 27.

[4] (./test.aux)

LaTeX Warning: There were undefined references.


LaTeX Warning: Label(s) may have changed. Rerun to get cross-references right.

)
Output written on test.dvi (4 pages, 11748 bytes).
Transcript written on test.log.
This is pdfTeXk, Version 3.141592-1.40.3 (Web2C 7.5.6)
%&-line parsing enabled.
entering extended mode
LaTeX2e <2005/12/01>
Babel and hyphenation patterns for english, usenglishmax, dumylang, noh
yphenation, croatian, bulgarian, russian, ukrainian, czech, slovak, danish, dut
ch, finnish, finnish, french, basque, french, german, ngerman, german, ngerman,
greek, monogreek, ancientgreek, ibycus, hungarian, hungarian, italian, italian
, latin, latin, mongolian, mongolian, norsk, norsk, coptic, esperanto, estonian
, icelandic, indonesian, interlingua, romanian, serbian, slovenian, turkish, up
persorbian, welsh, polish, polish, portuguese, portuguese, spanish, catalan, ga
lician, spanish, catalan, galician, swedish, swedish, loaded.
(./test.tex (/usr/share/texmf-texlive/tex/latex/base/article.cls
Document Class: article 2005/09/16 v1.4f Standard LaTeX document class
(/usr/share/texmf-texlive/tex/latex/base/size10.clo))
(/usr/share/texmf/tex/generic/tex4ht/tex4ht.sty)
(/usr/share/texmf/tex/generic/tex4ht/tex4ht.4ht
::::::::::::::::::::::::::::::::::::::::::
TeX4ht info is available in the log file
::::::::::::::::::::::::::::::::::::::::::
) (/usr/share/texmf/tex/generic/tex4ht/tex4ht.sty
--- needs --- tex4ht test ---
(./test.tmp) (./test.xref) (/usr/share/texmf/tex/generic/tex4ht/html4.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4-math.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4-math.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4-math.4ht)
(/usr/share/texmf/tex/generic/tex4ht/latex.4ht
(/usr/share/texmf/tex/generic/tex4ht/html4.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4-math.4ht))
(/usr/share/texmf/tex/generic/tex4ht/fontmath.4ht
(/usr/share/texmf/tex/generic/tex4ht/html4.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4-math.4ht))
(/usr/share/texmf/tex/generic/tex4ht/article.4ht
(/usr/share/texmf/tex/generic/tex4ht/html4.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4-math.4ht))
(/usr/share/texmf/tex/generic/tex4ht/html4.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4-math.4ht)) (./test.aux) [1] [2]
[3] [4] (./test.aux) )
Output written on test.dvi (4 pages, 11804 bytes).
Transcript written on test.log.
This is pdfTeXk, Version 3.141592-1.40.3 (Web2C 7.5.6)
%&-line parsing enabled.
entering extended mode
LaTeX2e <2005/12/01>
Babel and hyphenation patterns for english, usenglishmax, dumylang, noh
yphenation, croatian, bulgarian, russian, ukrainian, czech, slovak, danish, dut
ch, finnish, finnish, french, basque, french, german, ngerman, german, ngerman,
greek, monogreek, ancientgreek, ibycus, hungarian, hungarian, italian, italian
, latin, latin, mongolian, mongolian, norsk, norsk, coptic, esperanto, estonian
, icelandic, indonesian, interlingua, romanian, serbian, slovenian, turkish, up
persorbian, welsh, polish, polish, portuguese, portuguese, spanish, catalan, ga
lician, spanish, catalan, galician, swedish, swedish, loaded.
(./test.tex (/usr/share/texmf-texlive/tex/latex/base/article.cls
Document Class: article 2005/09/16 v1.4f Standard LaTeX document class
(/usr/share/texmf-texlive/tex/latex/base/size10.clo))
(/usr/share/texmf/tex/generic/tex4ht/tex4ht.sty)
(/usr/share/texmf/tex/generic/tex4ht/tex4ht.4ht
::::::::::::::::::::::::::::::::::::::::::
TeX4ht info is available in the log file
::::::::::::::::::::::::::::::::::::::::::
) (/usr/share/texmf/tex/generic/tex4ht/tex4ht.sty
--- needs --- tex4ht test ---
(./test.tmp) (./test.xref) (/usr/share/texmf/tex/generic/tex4ht/html4.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4-math.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4-math.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4-math.4ht)
(/usr/share/texmf/tex/generic/tex4ht/latex.4ht
(/usr/share/texmf/tex/generic/tex4ht/html4.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4-math.4ht))
(/usr/share/texmf/tex/generic/tex4ht/fontmath.4ht
(/usr/share/texmf/tex/generic/tex4ht/html4.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4-math.4ht))
(/usr/share/texmf/tex/generic/tex4ht/article.4ht
(/usr/share/texmf/tex/generic/tex4ht/html4.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4-math.4ht))
(/usr/share/texmf/tex/generic/tex4ht/html4.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4-math.4ht)) (./test.aux) [1] [2]
[3] [4] (./test.aux) )
Output written on test.dvi (4 pages, 11804 bytes).
Transcript written on test.log.
----------------------------
tex4ht.c (2007-04-21-21:07 kpathsea)
tex4ht -f/test.tex
-i/usr/share/texmf/tex4ht/ht-fonts/
--- warning --- Can't find/open file `tex4ht.env | .tex4ht'
--- error --- Illegal storage address
----------------------------
t4ht.c (2007-01-05-03:17 kpathsea)
t4ht -f/test.tex
--- warning --- Can't find/open file `tex4ht.env | .tex4ht'
--- warning --- Can't find/open file `test.lg'


This tex4ht.env warning, followed by the "Illegal storage address" error, was exactly what was happening last night. Yesterday at this point during the install, I scoured the internets looking for a solution and came upon several suggestions, from manually patching my TeX4ht install to generating some symlinks. None of that turned out to be necessary here. One just has to rebuild the TeX file database:


$ sudo texhash
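
Since the interesting lines are buried in pages of LaTeX chatter, it can help to save the output of a run and filter it. The sketch below assumes a POSIX shell with grep; tex4ht_problems is a hypothetical helper name, and the patterns are just the "--- error ---" and "--- warning ---" markers that tex4ht and t4ht print in the output shown in this post.

```shell
# Sketch: print the lines of a saved htlatex run that tex4ht/t4ht flag as trouble.
# tex4ht_problems is a hypothetical helper; the markers are copied from the run above.
tex4ht_problems() {
    # -i also catches the capitalized "--- Warning ---" variant;
    # "--" stops grep from reading the pattern as an option.
    grep -iE -- '--- (error|warning) ---' "$1"
}

# Usage sketch (htlatex.out is an arbitrary file name):
#   htlatex test.tex > htlatex.out 2>&1
#   tex4ht_problems htlatex.out
```

The function exits non-zero when nothing matches, so it can also serve as a crude pass/fail check in a script.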


Now I try to htlatex the document:


$ htlatex test.tex
This is pdfTeXk, Version 3.141592-1.40.3 (Web2C 7.5.6)
%&-line parsing enabled.
entering extended mode
LaTeX2e <2005/12/01>
Babel and hyphenation patterns for english, usenglishmax, dumylang, noh
yphenation, croatian, bulgarian, russian, ukrainian, czech, slovak, danish, dut
ch, finnish, finnish, french, basque, french, german, ngerman, german, ngerman,
greek, monogreek, ancientgreek, ibycus, hungarian, hungarian, italian, italian
, latin, latin, mongolian, mongolian, norsk, norsk, coptic, esperanto, estonian
, icelandic, indonesian, interlingua, romanian, serbian, slovenian, turkish, up
persorbian, welsh, polish, polish, portuguese, portuguese, spanish, catalan, ga
lician, spanish, catalan, galician, swedish, swedish, loaded.
(./test.tex (/usr/share/texmf-texlive/tex/latex/base/article.cls
Document Class: article 2005/09/16 v1.4f Standard LaTeX document class
(/usr/share/texmf-texlive/tex/latex/base/size10.clo))
(/usr/share/texmf/tex/generic/tex4ht/tex4ht.sty)
(/usr/share/texmf/tex/generic/tex4ht/tex4ht.4ht
::::::::::::::::::::::::::::::::::::::::::
TeX4ht info is available in the log file
::::::::::::::::::::::::::::::::::::::::::
) (/usr/share/texmf/tex/generic/tex4ht/tex4ht.sty
--- needs --- tex4ht test ---
(./test.tmp)
l.1418 --- TeX4ht warning --- No file test.xref ---
(/usr/share/texmf/tex/generic/tex4ht/html4.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4-math.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4-math.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4-math.4ht)
(/usr/share/texmf/tex/generic/tex4ht/latex.4ht
(/usr/share/texmf/tex/generic/tex4ht/html4.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4-math.4ht))
(/usr/share/texmf/tex/generic/tex4ht/fontmath.4ht
(/usr/share/texmf/tex/generic/tex4ht/html4.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4-math.4ht))
(/usr/share/texmf/tex/generic/tex4ht/article.4ht
(/usr/share/texmf/tex/generic/tex4ht/html4.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4-math.4ht))
(/usr/share/texmf/tex/generic/tex4ht/html4.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4-math.4ht))
No file test.aux.
[1] [2] [3]

LaTeX Warning: Reference `eq:00' on page 4 undefined on input line 27.

[4] (./test.aux)

LaTeX Warning: There were undefined references.


LaTeX Warning: Label(s) may have changed. Rerun to get cross-references right.

)
Output written on test.dvi (4 pages, 11748 bytes).
Transcript written on test.log.
This is pdfTeXk, Version 3.141592-1.40.3 (Web2C 7.5.6)
%&-line parsing enabled.
entering extended mode
LaTeX2e <2005/12/01>
Babel and hyphenation patterns for english, usenglishmax, dumylang, noh
yphenation, croatian, bulgarian, russian, ukrainian, czech, slovak, danish, dut
ch, finnish, finnish, french, basque, french, german, ngerman, german, ngerman,
greek, monogreek, ancientgreek, ibycus, hungarian, hungarian, italian, italian
, latin, latin, mongolian, mongolian, norsk, norsk, coptic, esperanto, estonian
, icelandic, indonesian, interlingua, romanian, serbian, slovenian, turkish, up
persorbian, welsh, polish, polish, portuguese, portuguese, spanish, catalan, ga
lician, spanish, catalan, galician, swedish, swedish, loaded.
(./test.tex (/usr/share/texmf-texlive/tex/latex/base/article.cls
Document Class: article 2005/09/16 v1.4f Standard LaTeX document class
(/usr/share/texmf-texlive/tex/latex/base/size10.clo))
(/usr/share/texmf/tex/generic/tex4ht/tex4ht.sty)
(/usr/share/texmf/tex/generic/tex4ht/tex4ht.4ht
::::::::::::::::::::::::::::::::::::::::::
TeX4ht info is available in the log file
::::::::::::::::::::::::::::::::::::::::::
) (/usr/share/texmf/tex/generic/tex4ht/tex4ht.sty
--- needs --- tex4ht test ---
(./test.tmp) (./test.xref) (/usr/share/texmf/tex/generic/tex4ht/html4.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4-math.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4-math.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4-math.4ht)
(/usr/share/texmf/tex/generic/tex4ht/latex.4ht
(/usr/share/texmf/tex/generic/tex4ht/html4.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4-math.4ht))
(/usr/share/texmf/tex/generic/tex4ht/fontmath.4ht
(/usr/share/texmf/tex/generic/tex4ht/html4.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4-math.4ht))
(/usr/share/texmf/tex/generic/tex4ht/article.4ht
(/usr/share/texmf/tex/generic/tex4ht/html4.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4-math.4ht))
(/usr/share/texmf/tex/generic/tex4ht/html4.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4-math.4ht)) (./test.aux) [1] [2]
[3] [4] (./test.aux) )
Output written on test.dvi (4 pages, 11804 bytes).
Transcript written on test.log.
This is pdfTeXk, Version 3.141592-1.40.3 (Web2C 7.5.6)
%&-line parsing enabled.
entering extended mode
LaTeX2e <2005/12/01>
Babel and hyphenation patterns for english, usenglishmax, dumylang, noh
yphenation, croatian, bulgarian, russian, ukrainian, czech, slovak, danish, dut
ch, finnish, finnish, french, basque, french, german, ngerman, german, ngerman,
greek, monogreek, ancientgreek, ibycus, hungarian, hungarian, italian, italian
, latin, latin, mongolian, mongolian, norsk, norsk, coptic, esperanto, estonian
, icelandic, indonesian, interlingua, romanian, serbian, slovenian, turkish, up
persorbian, welsh, polish, polish, portuguese, portuguese, spanish, catalan, ga
lician, spanish, catalan, galician, swedish, swedish, loaded.
(./test.tex (/usr/share/texmf-texlive/tex/latex/base/article.cls
Document Class: article 2005/09/16 v1.4f Standard LaTeX document class
(/usr/share/texmf-texlive/tex/latex/base/size10.clo))
(/usr/share/texmf/tex/generic/tex4ht/tex4ht.sty)
(/usr/share/texmf/tex/generic/tex4ht/tex4ht.4ht
::::::::::::::::::::::::::::::::::::::::::
TeX4ht info is available in the log file
::::::::::::::::::::::::::::::::::::::::::
) (/usr/share/texmf/tex/generic/tex4ht/tex4ht.sty
--- needs --- tex4ht test ---
(./test.tmp) (./test.xref) (/usr/share/texmf/tex/generic/tex4ht/html4.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4-math.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4-math.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4-math.4ht)
(/usr/share/texmf/tex/generic/tex4ht/latex.4ht
(/usr/share/texmf/tex/generic/tex4ht/html4.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4-math.4ht))
(/usr/share/texmf/tex/generic/tex4ht/fontmath.4ht
(/usr/share/texmf/tex/generic/tex4ht/html4.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4-math.4ht))
(/usr/share/texmf/tex/generic/tex4ht/article.4ht
(/usr/share/texmf/tex/generic/tex4ht/html4.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4-math.4ht))
(/usr/share/texmf/tex/generic/tex4ht/html4.4ht)
(/usr/share/texmf/tex/generic/tex4ht/html4-math.4ht)) (./test.aux) [1] [2]
[3] [4] (./test.aux) )
Output written on test.dvi (4 pages, 11804 bytes).
Transcript written on test.log.
----------------------------
tex4ht.c (2007-04-21-21:07 kpathsea)
tex4ht -f/test.tex
-i/usr/share/texmf/tex4ht/ht-fonts/
(/usr/share/texmf/tex4ht/tex4ht.env)
(/usr/share/texmf/tex4ht/ht-fonts/iso8859/1/charset/unicode.4hf)
(/usr/share/texmf-texlive/fonts/tfm/public/cm/cmbx9.tfm)
(/usr/share/texmf/tex4ht/ht-fonts/alias/cm/cmbx.htf)
Searching `lm-rep-cmrm.htf' for `cmbx9.htf'
(/usr/share/texmf/tex4ht/ht-fonts/unicode/lm/lm-rep-cmrm.htf)
(/usr/share/texmf-texlive/fonts/tfm/public/cm/cmr9.tfm)
(/usr/share/texmf/tex4ht/ht-fonts/alias/lm/lm-rep-cmrm/cmr.htf)
Searching `lm-rep-cmrm.htf' for `cmr9.htf'
(/usr/share/texmf/tex4ht/ht-fonts/unicode/lm/lm-rep-cmrm.htf)
(/usr/share/texmf-texlive/fonts/tfm/public/cm/cmr12.tfm)
(/usr/share/texmf/tex4ht/ht-fonts/alias/lm/lm-rep-cmrm/cmr.htf)
Searching `lm-rep-cmrm.htf' for `cmr12.htf'
(/usr/share/texmf/tex4ht/ht-fonts/unicode/lm/lm-rep-cmrm.htf)
(/usr/share/texmf-texlive/fonts/tfm/public/cm/cmr17.tfm)
(/usr/share/texmf/tex4ht/ht-fonts/alias/lm/lm-rep-cmrm/cmr.htf)
Searching `lm-rep-cmrm.htf' for `cmr17.htf'
(/usr/share/texmf/tex4ht/ht-fonts/unicode/lm/lm-rep-cmrm.htf)
(/usr/share/texmf-texlive/fonts/tfm/public/cm/cmmi10.tfm)
(/usr/share/texmf/tex4ht/ht-fonts/unicode/cm/cmmi.htf)
(/usr/share/texmf-texlive/fonts/tfm/public/cm/cmr10.tfm)
(/usr/share/texmf/tex4ht/ht-fonts/alias/lm/lm-rep-cmrm/cmr.htf)
Searching `lm-rep-cmrm.htf' for `cmr10.htf'
(/usr/share/texmf/tex4ht/ht-fonts/unicode/lm/lm-rep-cmrm.htf)
[1 file test.html
file test.css
file test.tmp
] [2] [3] [4]
Execute script `test.lg'
----------------------------
t4ht.c (2007-01-05-03:17 kpathsea)
t4ht -f/test.tex
(/usr/share/texmf/tex4ht/tex4ht.env)
Entering test.lg
System call: dvipng -T tight -x 1400 -D 72 -bg Transparent -pp 1:1 test.idv -o test0x.png
sh: dvipng: not found
--- Warning --- System return: 32512
Entering test.css
Entering test.tmp


This time I get just one warning: dvipng isn't on my system. Despite the warning, htlatex builds a document. Installing dvipng does the trick:


$ sudo apt-get install dvipng


Now building the document does the right thing. Beautiful, isn't it? For some reason, Konqueror doesn't seem to render the page properly, but Firefox does.

Clearly, the example document in this post is a simple one. I haven't tested how TeX4ht reacts to RevTeX or hyperref, but I expect that some tweaking would get them working. I am very excited that I can continue to generate documents in LaTeX and convert them to other formats.
