Supreme Law SCAN Programs:
                 Managing a Large E-Mail Archive


Introduction

     The e-mail archive now available at the Supreme Law website
was generated by software that Paul Andrew Mitchell developed
using the Microsoft QBASIC interpreter.

     SCAN was developed to break a large archive file of e-mail,
written by Eudora Pro, into one separate DOS file per message.


Special Problems

     The e-mail archive written by Eudora Pro contained isolated
line feed characters, i.e. line feeds which were NOT preceded by
carriage return characters.  Each isolated line feed had to be
replaced with a carriage return, but only where the replacement
would not leave two carriage returns in sequence.
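
     The replacement rule above can be sketched in Python (not
the QBASIC of RANDOM.BAS).  This is a minimal illustration, not
the author's code.  The function name is hypothetical, and the
text does not say what happens when a replacement WOULD leave two
carriage returns in sequence; this sketch assumes the line feed
is then left unchanged.

```python
CR = 0x0D  # carriage return
LF = 0x0A  # line feed


def replace_isolated_line_feeds(data: bytes) -> bytes:
    """Replace isolated LF bytes with CR, avoiding two CRs in sequence.

    Assumption: when the replacement would create adjacent CRs,
    the LF is kept as-is (the source text does not specify this case).
    """
    out = bytearray()
    for i, byte in enumerate(data):
        isolated = byte == LF and (i == 0 or data[i - 1] != CR)
        if isolated:
            prev_is_cr = bool(out) and out[-1] == CR
            next_is_cr = i + 1 < len(data) and data[i + 1] == CR
            if not prev_is_cr and not next_is_cr:
                out.append(CR)
                continue
        out.append(byte)
    return bytes(out)
```

     A line feed that already follows a carriage return is not
isolated, so ordinary CR/LF line endings pass through unchanged.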

     The RANDOM program was developed to read Eudora Pro mailbox
files, one DOS sector at a time (512 bytes), using the random
access logic available in the Microsoft QBASIC interpreter.

     Put simply, random access reads through the archive file one
record (sector) at a time, until no more sectors are available.
This method bypasses the software end-of-file (EOF) character,
which signals an end-of-file condition during sequential INPUT
statements.
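
     The same idea, reading fixed 512-byte records until the file
is exhausted, can be sketched in Python.  This is an analogy, not
a translation of RANDOM.BAS: the function name is hypothetical,
and Python's binary mode stands in for QBASIC random access.

```python
from typing import Iterator

SECTOR_SIZE = 512  # one DOS sector, as stated in the text


def read_sectors(path: str, sector_size: int = SECTOR_SIZE) -> Iterator[bytes]:
    """Yield successive fixed-size records until no more data remains."""
    # Binary mode: no byte value is interpreted as an end-of-file marker.
    with open(path, "rb") as archive:
        while True:
            record = archive.read(sector_size)
            if not record:
                break
            # The final record may be shorter than sector_size.
            yield record
```

     Because every byte is returned as data, an end-of-file
character embedded in the archive does not stop the read early.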

     The RANDOM program also replaces other unwanted non-printing
characters.  See the QBASIC code in RANDOM.BAS for details.


SCAN Program Logic

     After the  RANDOM program  has replaced unwanted characters,
the SCAN program reads the RANDOM output file, and writes one DOS
text file  per e-mail message.  Each e-mail message is identified
with "From " in its first line of text.
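
     Splitting on the "From " line can be sketched as follows.
This is a hypothetical Python illustration, not SCAN.BAS itself.
It assumes that "From " appears at the very start of the line,
and that any text before the first such line is discarded.

```python
from typing import Iterable, List


def split_messages(lines: Iterable[str]) -> List[List[str]]:
    """Group lines into messages, starting a new one at each "From " line."""
    messages: List[List[str]] = []
    current = None
    for line in lines:
        if line.startswith("From "):
            # A new message begins here.
            current = [line]
            messages.append(current)
        elif current is not None:
            current.append(line)
        # Lines before the first "From " line are dropped (assumption).
    return messages
```

     Note that a "From:" header line does not match, because the
test requires a space, not a colon, after the word "From".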

     The SCAN  program  creates  sequential  sub-directories,  in
order to prevent the poor performance which results from having a
single DOS  directory with  an unusually  large number  of  files
(e.g. 5,000 or more).

     SCAN allocates at most 100 message files per DOS directory,
then creates  another directory  to store  the next 100 messages,
and continues  in this  fashion  until  all  messages  have  been
processed.   For redundancy,  the name  of each  new directory is
embedded in the name of each file stored in it, as follows:

    Message "msg00154.htm" is stored in directory "box001"

    Message "msg02154.htm" is stored in directory "box021"

    and so on ....
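
     The naming scheme can be inferred from the two examples
above: the directory number is the message number divided by 100,
discarding the remainder.  A minimal Python sketch follows; the
function name is hypothetical, and whether SCAN numbers messages
from 0 or from 1 is not stated here.

```python
from typing import Tuple

MESSAGES_PER_DIRECTORY = 100  # limit stated in the text


def message_location(number: int) -> Tuple[str, str]:
    """Return (directory name, file name) for a message number."""
    directory = "box%03d" % (number // MESSAGES_PER_DIRECTORY)
    filename = "msg%05d.htm" % number
    return directory, filename
```

     For example, message_location(154) gives "box001" and
"msg00154.htm", matching the first example above.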

     SCAN writes  Hyper-Text Markup  Language ("HTML")  which  is
compatible with Netscape and Internet Explorer.  A prolog section
of code  begins each  message file, and an epilog section of code
ends each  message file.  These sections of code are found in the
DOS files "prolog.htm" and "epilog.htm", respectively.
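
     Assembling one message file from these pieces can be
sketched in Python.  The function name is hypothetical, and the
sketch only concatenates the three parts; any other processing
that SCAN applies to the message text is not shown.

```python
from pathlib import Path


def build_message_page(body: str, prolog_path: Path, epilog_path: Path) -> str:
    """Return the prolog text, then the message body, then the epilog text."""
    prolog = prolog_path.read_text()
    epilog = epilog_path.read_text()
    return prolog + body + epilog
```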

     SCAN also eliminates most of the routing data which is found
in many e-mail messages.
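
     The text does not say which lines SCAN treats as routing
data.  The Python sketch below is therefore only an illustration
under a loud assumption: that "Received:" and "Return-Path:"
header lines, with their indented continuation lines, are the
ones dropped.  The names used are hypothetical.

```python
from typing import Iterable, List

# ASSUMPTION: these prefixes are illustrative, not taken from SCAN.BAS.
ROUTING_PREFIXES = ("received:", "return-path:")


def strip_routing_headers(lines: Iterable[str]) -> List[str]:
    """Drop assumed routing headers from the header block of one message."""
    kept: List[str] = []
    in_headers = True   # headers end at the first blank line
    dropping = False    # True while inside a header being removed
    for line in lines:
        if in_headers:
            if line.strip() == "":
                in_headers = False
                dropping = False
            elif line[0] in " \t":
                # Continuation line: shares the fate of its header.
                if dropping:
                    continue
            else:
                dropping = line.lower().startswith(ROUTING_PREFIXES)
                if dropping:
                    continue
        kept.append(line)
    return kept
```

     Lines in the message body are never removed, even if they
happen to begin with one of the assumed prefixes.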

     The MAKEIN ("make index") program simply converts the INDEX
file, written by the SCAN program, into HTML.  MAKEIN therefore
expects the INDEX file to have the record layout written by the
SCAN program.

     Any competent  BASIC programmer  will be familiar with these
programming details.


Downloading Files

     Internet browsers can access the necessary files at the
following URLs:

     RANDOM.BAS   http://supremelaw.org/sls/email/random.bas

     SCAN.BAS     http://supremelaw.org/sls/email/scan.bas

     MAKEIN.BAS   http://supremelaw.org/sls/email/makein.bas

     PROLOG.HTM   http://supremelaw.org/sls/email/prolog.htm

     EPILOG.HTM   http://supremelaw.org/sls/email/epilog.htm


     Use the VIEW/SOURCE option in your browser to preview each
file after it is downloaded, and then use SAVE AS to store it in
a local directory on your computer.


                             #  #  #
      

