Documentation: Auxiliaries / Txt2HTML

  txt2html -- Text to HTML converter
  http://www.aigeek.com/txt2html/

  SAMPLE INPUT
  ============

  +----------------------------------------------------------------
  |   txt2html Sample Conversion
  |
  |   I used the following command to convert this document:
  |
  |    txt2html -tf --mail -H '^ *--[\w\s]+-- *$' -a sample.foot sample.txt > sample.html
  |
  |   ======================================================================
  |
  |   From bozo@clown.wustl.edu
  |   Return-Path: <bozo@clown.wustl.edu>
  |   Message-Id: <9405102200.AA04736@clown.wustl.edu>
  |   Content-Length: 1070
  |   From: bozo@clown.wustl.edu (Bozo the Clown)
  |   To: seth@aigeek.com (Seth Golub)
  |   Subject: Re: txt2html
  |   Date: Fri, 6 May 94 10:01:10 -0500
  |
  |   Bozo wrote:
  |   BtC> Can you post an example text file with its html'ed output?
  |   BtC> That would provide a much better first glance at what it does
  |   BtC> without having to look through and see what the perl code does.
  |
  |   Good idea.  I'll write something up.
  |
  |          -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
  |
  |   The header lines were kept separate because they looked like mail
  |   headers and I have mailmode on.  The same thing applies to Bozo's
  |   quoted text.  Mailmode doesn't screw things up very often, but since
  |   most people are usually converting non-mail, it's off by default.
  |
  |   Paragraphs are handled ok.  In fact, this one is here just to
  |   demonstrate that.
  |
  |   THIS LINE IS VERY IMPORTANT!
  |   (Ok, it wasn't *that* important)
  |
  |
  |   EXAMPLE HEADER
  |   ==============
  |
  |   Since this is the first header noticed (all caps, underlined with an
  |   "="), it will be a level 1 header.  It gets an anchor named
  |   "section-1".
  |
  |   Another example
  |   ===============
  |   This is the second type of header (not all caps, underlined with "=").
  |   It gets an anchor named "section-1.1".
  |
  |   Yet another example
  |   ===================
  |
  |   This header was in the same style, so it was assigned the same header
  |   tag.  Note the anchor names in the HTML. (You probably can't see them
  |   in your current document view.)  Its anchor is named "section-1.2".
  |   Get the picture?
  |
  |
  |
  |                       -- This is a custom header --
  |
  |   You can define your own custom header patterns if you know what your
  |   documents look like.
  |
  |
  |
  |   Features of txt2html
  |   ====================
  |
  |    * Handles different kinds of lists
  |      1. Bulleted
  |      2. Numbered
  |         - You can nest them as far as you want.
  |         - It's pretty decent about figuring out which level of list it
  |           is supposed to be on.
  |           - You don't need to change bullet markers to start a new list.
  |      3. Lettered
  |         A. Finally handles lettered lists
  |         B. Upper and lower case both work
  |            a) Here's an example
  |            b) I've been meaning to add this for some time.
  |         C. Of course, HTML can't specify how ordered lists should be
  |            indicated, so it may be a numbered list in some
  |            browsers. (Ok, most browsers)
  |    * Doesn't screw up mail-ish things
  |    * Spots preformatted text sometimes
  |
  |                    It just needs to have enough whitespace in the line.
  |           Surrounding blank lines aren't necessary.  If it sees enough
  |           whitespace in a line, it preformats it.  How much is enough?
  |           Set it yourself at command line if you want.
  |
  |    * You can append a file automatically to all converted files.  This
  |      is handy for adding signatures to your documents.
  |
  |    * Deals with paragraphs decently.
  |
  |      o looks for short lines in the middle of paragraphs and keeps them
  |        short with the use of breaks (<BR>).  How short the lines need to
  |        be is configurable.
  |      o Unhyphenates split words that are in the middle of para-
  |        graphs.  Let me know if trailing punctuation isn't handled "prop-
  |        erly".  It should be.
  |
  |    * Puts anchors at all headers and, if you're using the mail header
  |      features, at the beginning of each mail message.  The anchor names
  |      for headings are based on guessed section numbers.
  |
  |    * Groks Mosaic-style "formatted text" headers (like the one below)
  |
  |    * Can hyperlink things according to a dictionary file.
  |      The sample dictionary handles URLs like
  |      http://www.aigeek.com/ and also shows how to do simpler
  |      things such as linking the word txt2html the first time it appeared.
  |
  |   Example of short lines
  |   ----------------------
  |
  |   We're the knights of the round table
  |   We dance whene'er we're able
  |   We do routines and chorus scenes
  |   With footwork impeccable.
  |   We dine well here in Camelot
  |   We eat ham and jam and spam a lot.
  |
  |   ----------------------------------------
  |
  |   The signature is everything from the end of this sentence to the
  |   </BODY> tag.
  |
  +----------------------------------------------------------------
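
  The custom heading pattern passed with -H in the sample command can be
  tried out in isolation.  The following is a minimal sketch in Python,
  not part of txt2html itself: it assumes that Python's "re" module treats
  this particular pattern the same way Perl does, and the helper name
  is_custom_heading is hypothetical.

```python
import re

# Pattern copied verbatim from the -H option of the sample command.
CUSTOM_HEADING = re.compile(r"^ *--[\w\s]+-- *$")


def is_custom_heading(line: str) -> bool:
    """Return True if the line matches the custom heading pattern."""
    return CUSTOM_HEADING.match(line) is not None


# Lines taken from the sample input above.
samples = [
    "                    -- This is a custom header --",
    "----------------------------------------",
    "       -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-",
    "EXAMPLE HEADER",
]

for line in samples:
    print(is_custom_heading(line), repr(line.strip()))
```

  Only the first sample line should match.  The rule of dashes does not,
  because the pattern requires at least one word or whitespace character
  between the two "--" markers.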

  OPTIONS
  =======

  Usage: txt2html.pl [options]

  where options are:
     [-v         ] | [--version                       ]
     [-h         ] | [--help                          ]
     [-t <title> ] | [--title <title>                 ]
     [-tf/+tf    ] | [--titlefirst / --notitlefirst   ]
     [-dt <doct> ] | [--doctype <doctype>             ]
     [+dt        ] | [--nodoctype                     ]
     [-l <file>  ] | [--link <dictfile>               ]
     [+l         ] | [--nolink                        ]
     [-H <regexp>] | [--heading <regexp>              ]
     [-EH/+EH    ] | [--explicit-headings / --noexplicit-headings ]
     [-ab <file> ] | [--append_body <file>            ]
     [+ab        ] | [--noappend_body                 ]
     [-ah <file> ] | [--append_head <file>            ]
     [+ah        ] | [--noappend_head                 ]
     [-pp <file> ] | [--prepend_body <file>           ]
     [+pp        ] | [--noprepend_body                ]
     [-ec/+ec    ] | [--escapechars / --noescapechars ]
     [-e/+e      ] | [--extract / --noextract         ]
     [-c <n>     ] | [--caps <n>                      ]
     [-ct <tag>  ] | [--capstag <tag>                 ]
     [-m/+m      ] | [--mail     / --nomail           ]
     [-u/+u      ] | [--unhyphen / --nounhyphen       ]
     [-ul <n>    ] | [--ulength <n>                   ]
     [-uo <n>    ] | [--uoffset <n>                   ]
     [-tw <n>    ] | [--tabwidth <n>                  ]
     [-iw <n>    ] | [--indent <n>                    ]
     [-s <n>     ] | [--shortline <n>                 ]
     [-p <n>     ] | [--prewhite <n>                  ]
     [-pb <n>    ] | [--prebegin <n>                  ]
     [-pe <n>    ] | [--preend <n>                    ]
     [-r <n>     ] | [--hrule <n>                     ]
     [-LO/+LO    ] | [--linkonly / --nolinkonly       ]
     [-db <n>    ] | [--debug <n>                     ]

  Each option is explained in more detail in the comments near the
  beginning of the script.
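
  When txt2html is driven from another program, passing the options as an
  argument list avoids shell quoting problems with the heading regexp.
  The following is a minimal sketch in Python under several assumptions:
  the function name build_txt2html_command and its parameters are
  hypothetical, the executable is assumed to be reachable as "txt2html",
  and the input file is assumed to be given as a trailing argument, as in
  the sample command.  Only long options from the table above are used.
  The sample command's "-a sample.foot" does not appear in that table, so
  the sketch substitutes --append_body, which may not be equivalent.

```python
import shlex


def build_txt2html_command(source, heading_regex=None, append_body=None,
                           mail=False, title_first=False,
                           program="txt2html"):
    """Build an argument list for txt2html without invoking it."""
    argv = [program]
    if title_first:
        argv.append("--titlefirst")
    if mail:
        argv.append("--mail")
    if heading_regex is not None:
        # Passed as one list element, so no shell quoting is needed.
        argv += ["--heading", heading_regex]
    if append_body is not None:
        argv += ["--append_body", append_body]
    argv.append(source)
    return argv


command = build_txt2html_command(
    "sample.txt",
    heading_regex=r"^ *--[\w\s]+-- *$",
    append_body="sample.foot",
    mail=True,
    title_first=True,
)

# Show a shell-quoted form of the command for inspection only.
print(shlex.join(command))
```

  The list could then be handed to subprocess.run with stdout redirected
  to the output file.  That step is left out here because it requires
  txt2html to be installed.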