 
    
    
   
   
    gawk
This appendix provides instructions for installing gawk on the
various platforms that are supported by the developers.  The primary
developers support Unix (and one day, GNU), while the other ports were
contributed.  The file `ACKNOWLEDGMENT' in the gawk
distribution lists the electronic mail addresses of the people who did
the respective ports, and they are also provided in
section Reporting Problems and Bugs.
gawk Distribution
This section first describes how to get the gawk
distribution, how to extract it, and then what is in the various files and
subdirectories.
gawk Distribution
There are three ways you can get GNU software.
Order gawk directly from the Free Software Foundation.
Software distributions are available for Unix, MS-DOS, and VMS, on
tape, CD-ROM, or floppies (MS-DOS only).  The address is:
Free Software Foundation
59 Temple Place--Suite 330
Boston, MA 02111-1307 USA
Phone: +1-617-542-5942
Fax (including Japan): +1-617-542-2652
E-mail: gnu@prep.ai.mit.edu
Ordering from the FSF directly contributes to the support of the foundation and to the production of more free software.
Get gawk by using anonymous ftp to the Internet host
ftp.gnu.ai.mit.edu, in the directory `/pub/gnu'.
Here is a list of alternate ftp sites from which you can obtain GNU
software.  When a site is listed as "site:directory" the
directory indicates the directory where GNU software is kept.
You should use a site that is geographically close to you.
cair-archive.kaist.ac.kr:/pub/gnu
ftp.cs.titech.ac.jp
ftp.nectec.or.th:/pub/mirrors/gnu
utsun.s.u-tokyo.ac.jp:/ftpsync/prep
archie.au:/gnu (archie.oz or archie.oz.au for ACSnet)
ftp.sun.ac.za:/pub/gnu
ftp.technion.ac.il:/pub/unsupported/gnu
archive.eu.net
ftp.denet.dk
ftp.eunet.ch
ftp.funet.fi:/pub/gnu
ftp.ieunet.ie:pub/gnu
ftp.informatik.rwth-aachen.de:/pub/gnu
ftp.informatik.tu-muenchen.de
ftp.luth.se:/pub/unix/gnu
ftp.mcc.ac.uk
ftp.stacken.kth.se
ftp.sunet.se:/pub/gnu
ftp.univ-lyon1.fr:pub/gnu
ftp.win.tue.nl:/pub/gnu
irisa.irisa.fr:/pub/gnu
isy.liu.se
nic.switch.ch:/mirror/gnu
src.doc.ic.ac.uk:/gnu
unix.hensa.ac.uk:/pub/uunet/systems/gnu
ftp.inf.utfsm.cl:/pub/gnu
ftp.unicamp.br:/pub/gnu
ftp.cs.ubc.ca:/mirror2/gnu
col.hp.com:/mirrors/gnu
f.ms.uky.edu:/pub3/gnu
ftp.cc.gatech.edu:/pub/gnu
ftp.cs.columbia.edu:/archives/gnu/prep
ftp.digex.net:/pub/gnu
ftp.hawaii.edu:/mirrors/gnu
ftp.kpc.com:/pub/mirror/gnu
ftp.uu.net:/systems/gnu
gatekeeper.dec.com:/pub/GNU
jaguar.utah.edu:/gnustuff
labrea.stanford.edu
mrcnext.cso.uiuc.edu:/pub/gnu
vixen.cso.uiuc.edu:/gnu
wuarchive.wustl.edu:/systems/gnu
gawk is distributed as a tar file compressed with the
GNU Zip program, gzip.
Once you have the distribution (for example,
`gawk-3.0.0.tar.gz'), first use gzip to expand the
file, and then use tar to extract it.  You can use the following
pipeline to produce the gawk distribution:
# Under System V, add 'o' to the tar flags
gzip -d -c gawk-3.0.0.tar.gz | tar -xvpf -
This will create a directory named `gawk-3.0.0' in the current directory.
The distribution file name is of the form
`gawk-V.R.n.tar.gz'.
The V represents the major version of gawk,
the R represents the current release of version V, and
the n represents a patch level, meaning that minor bugs have
been fixed in the release.  The current patch level is 0, but when
retrieving distributions, you should get the version with the highest
version, release, and patch level.  (Note that release levels greater than
or equal to 90 denote "beta," or non-production software; you may not wish
to retrieve such a version unless you don't mind experimenting.)
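The naming scheme can be checked mechanically.  The following is a minimal
shell sketch, not part of the gawk distribution; the function name
`classify_gawk_dist' is hypothetical.  It splits a file name of the form
`gawk-V.R.n.tar.gz' into its three parts and labels release levels of 90
or higher as beta.

```shell
# classify_gawk_dist FILE
# Hypothetical helper: split `gawk-V.R.n.tar.gz' into version, release,
# and patch level, and report whether the release level marks a beta.
classify_gawk_dist() {
    name=${1##*/}                 # drop any leading directory
    case $name in
        gawk-*.*.*.tar.gz) ;;
        *) echo "unrecognized: $name" >&2; return 1 ;;
    esac
    vrn=${name#gawk-}             # V.R.n.tar.gz
    vrn=${vrn%.tar.gz}            # V.R.n
    version=${vrn%%.*}            # V
    rest=${vrn#*.}                # R.n
    release=${rest%%.*}           # R
    patch=${rest#*.}              # n
    # Every part must be a non-empty string of digits.
    for part in "$version" "$release" "$patch"; do
        case $part in
            ''|*[!0-9]*) echo "unrecognized: $name" >&2; return 1 ;;
        esac
    done
    if [ "$release" -ge 90 ]; then
        kind=beta
    else
        kind=production
    fi
    echo "version=$version release=$release patch=$patch ($kind)"
}
```

For example, `classify_gawk_dist gawk-3.0.0.tar.gz' reports version 3,
release 0, patch 0, and labels the file as production software.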
If you are not on a Unix system, you will need to make other arrangements
for getting and extracting the gawk distribution.  You should consult
a local expert.
gawk Distribution
The gawk distribution has a number of C source files,
documentation files,
subdirectories and files related to the configuration process
(see section Compiling and Installing gawk on Unix),
and several subdirectories related to different, non-Unix,
operating systems.
gawk source code.
gawk under Unix, and the
rest for the various hardware and software combinations.
gawk has been ported, and which
have successfully run the test suite.
gawk since the last release or patch.
gawk's performance.
Most of these depend on the hardware or operating system software, and
are not limits in gawk itself.
awk is
incorrect, and how gawk handles the problem.
troff source for a manual page describing gawk.
This is distributed for the convenience of Unix users.
makeinfo to produce an Info file.
troff source for a manual page describing the igawk
program presented in
section An Easy Way to Use Library Functions.
gawk
for various Unix systems.  They are explained in detail in
section Compiling and Installing gawk on Unix.
configure uses to generate a `Makefile'.
As part of the process of building gawk, the library functions from
section A Library of awk Functions,
and the igawk program from
section An Easy Way to Use Library Functions,
are extracted into ready-to-use files.
They are installed as part of the installation process.
gawk on an Amiga.
See section Installing gawk on an Amiga, for details.
gawk on an Atari ST.
See section Installing gawk on the Atari ST, for details.
gawk under MS-DOS and OS/2.
See section MS-DOS and OS/2 Installation and Compilation, for details.
gawk under VMS.
See section How to Compile and Install gawk on VMS, for details.
gawk.  You can use `make check' from the top level gawk
directory to run your version of gawk against the test suite.
If gawk successfully passes `make check' then you can
be confident of a successful port.
Compiling and Installing gawk on Unix
Usually, you can compile and install gawk by typing only two
commands.  However, if you use an unusual system, you may need
to configure gawk for your system yourself.
gawk for Unix
After you have extracted the gawk distribution, cd
to `gawk-3.0.0'.  Like most GNU software,
gawk is configured
automatically for your Unix system by running the configure program.
This program is a Bourne shell script that was generated automatically using
GNU autoconf.
(The autoconf software is
described fully in
Autoconf--Generating Automatic Configuration Scripts,
which is available from the Free Software Foundation.)
To configure gawk, simply run configure:
sh ./configure
This produces a `Makefile' and `config.h' tailored to your system.
The `config.h' file describes various facts about your system.
You may wish to edit the `Makefile' to
change the CFLAGS variable, which controls
the command line options that are passed to the C compiler (such as
optimization levels, or compiling for debugging).
Alternatively, you can add your own values for most make
variables, such as CC and CFLAGS, on the command line when
running configure:
CC=cc CFLAGS=-g sh ./configure
See the file `INSTALL' in the gawk distribution for
all the details.
After you have run configure, and possibly edited the `Makefile',
type:
make
and shortly thereafter, you should have an executable version of gawk.
That's all there is to it!
(If these steps do not work, please send in a bug report;
see section Reporting Problems and Bugs.)
(This section is of interest only if you know something about using the C language and the Unix operating system.)
The source code for gawk generally attempts to adhere to formal
standards wherever possible.  This means that gawk uses library
routines that are specified by the ANSI C standard and by the POSIX
operating system interface standard.  When using an ANSI C compiler,
function prototypes are used to help improve the compile-time checking.
Many Unix systems do not support all of either the ANSI or the
POSIX standards.  The `missing' subdirectory in the gawk
distribution contains replacement versions of those subroutines that are
most likely to be missing.
The `config.h' file that is created by the configure program
contains definitions that describe features of the particular operating
system where you are attempting to compile gawk.  This file describes
three things: which header files are available, so that
they can be correctly included;
which (supposedly) standard functions are actually available in your C
libraries; and
other miscellaneous facts about your
variant of Unix.  For example, there may not be an st_blksize
element in the stat structure.  In this case `HAVE_ST_BLKSIZE'
would be undefined.
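To see what configure decided for a particular feature, you can search the
generated `config.h'.  The sketch below is an illustration only: the
function name `config_has' is hypothetical, and it assumes the usual
autoconf layout, in which a detected feature appears as an active
`#define' line and an undetected one is left as a commented-out `#undef'.

```shell
# config_has SYMBOL [FILE]
# Succeeds if FILE (default: config.h) contains an active
# "#define SYMBOL" line.  Commented-out "#undef" lines do not match.
config_has() {
    grep -Eq "^[[:space:]]*#[[:space:]]*define[[:space:]]+$1([[:space:]]|\$)" \
        "${2:-config.h}"
}

# Usage, run from the top-level source directory after configure:
#   if config_has HAVE_ST_BLKSIZE; then
#       echo "st_blksize is available"
#   else
#       echo "st_blksize was not detected"
#   fi
```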
It is possible for your C compiler to lie to configure. It may
do so by not exiting with an error when a library function is not
available.  To get around this, you can edit the file `custom.h'.
Use an `#ifdef' that is appropriate for your system, and either
#define any constants that configure should have defined but
didn't, or #undef any constants that configure defined and
should not have.  `custom.h' is automatically included by
`config.h'.
It is also possible that the configure program generated by
autoconf
will not work on your system in some other fashion.  If you do have a problem,
the file
`configure.in' is the input for autoconf.  You may be able to
change this file, and generate a new version of configure that will
work on your system.  See section Reporting Problems and Bugs, for
information on how to report problems in configuring gawk.  The same
mechanism may be used to send in updates to `configure.in' and/or
`custom.h'.
How to Compile and Install gawk on VMS
This section describes how to compile and install gawk under VMS.
Compiling gawk on VMS
To compile gawk under VMS, there is a DCL command procedure that
will issue all the necessary CC and LINK commands, and there is
also a `Makefile' for use with the MMS utility.  From the source
directory, use either
$ @[.VMS]VMSBUILD.COM
or
$ MMS/DESCRIPTION=[.VMS]DESCRIP.MMS GAWK
Depending upon which C compiler you are using, follow one of the sets of instructions in this table:
CC/OPTIMIZE=NOLINE, which is essential for Version 3.0.
gawk has been tested under VAX/VMS 5.5-1 using VAX C V3.2,
GNU C 1.40 and 2.3.  It should work without modifications for VMS V4.6 and up.
Installing gawk on VMS
To install gawk, all you need is a "foreign" command, which is
a DCL symbol whose value begins with a dollar sign. For example:
$ GAWK :== $disk1:[gnubin]GAWK
(Substitute the actual location of gawk.exe for
`$disk1:[gnubin]'.) The symbol should be placed in the
`login.com' of any user who wishes to run gawk,
so that it will be defined every time the user logs on.
Alternatively, the symbol may be placed in the system-wide
`sylogin.com' procedure, which will allow all users
to run gawk.
Optionally, the help entry can be loaded into a VMS help library:
$ LIBRARY/HELP SYS$HELP:HELPLIB [.VMS]GAWK.HLP
(You may want to substitute a site-specific help library rather than the standard VMS library `HELPLIB'.) After loading the help text,
$ HELP GAWK
will provide information about both the gawk implementation and the
awk programming language.
The logical name `AWK_LIBRARY' can designate a default location
for awk program files.  For the `-f' option, if the specified
filename has no device or directory path information in it, gawk
will look in the current directory first, then in the directory specified
by the translation of `AWK_LIBRARY' if the file was not found.
If after searching in both directories, the file still is not found,
then gawk appends the suffix `.awk' to the filename and the
file search will be re-tried.  If `AWK_LIBRARY' is not defined, that
portion of the file search will fail benignly.
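The lookup order described above can be modeled in a few lines of shell.
This is only a sketch of the described behavior, not the VMS code itself:
the function name `find_awk_program' is hypothetical, a plain directory
argument stands in for the translation of `AWK_LIBRARY', and the name is
assumed to carry no device or directory information.

```shell
# find_awk_program NAME LIBDIR
# Model of the search order:
#   1. NAME in the current directory
#   2. NAME in LIBDIR (standing in for AWK_LIBRARY)
#   3. the same two places again, with `.awk' appended to NAME
# Prints the first match; fails if nothing is found.
find_awk_program() {
    for candidate in "$1" "$1.awk"; do
        for dir in . "$2"; do
            [ -n "$dir" ] || continue      # library not defined: skip it
            if [ -f "$dir/$candidate" ]; then
                echo "$dir/$candidate"
                return 0
            fi
        done
    done
    return 1
}
```

Passing an empty second argument models an undefined `AWK_LIBRARY': that
part of the search is simply skipped.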
Running gawk on VMS
Command line parsing and quoting conventions are significantly different
on VMS, so examples in this book or from other sources often need
changes.  The changes are minor, though, and all awk programs
should run correctly.
Here are a couple of trivial tests:
$ gawk -- "BEGIN {print ""Hello, World!""}"
$ gawk -"W" version
! could also be -"W version" or "-W version"
Note that upper-case and mixed-case text must be quoted.
The VMS port of gawk includes a DCL-style interface in addition
to the original shell-style interface (see the help entry for details).
One side-effect of dual command line parsing is that if there is only a
single parameter (as in the quoted string program above), the command
becomes ambiguous.  To work around this, the normally optional `--'
flag is required to force Unix style rather than DCL parsing.  If any
other dash-type options (or multiple parameters such as data files to be
processed) are present, there is no ambiguity and `--' can be omitted.
The default search path when looking for awk program files specified
by the `-f' option is "SYS$DISK:[],AWK_LIBRARY:".  The logical
name `AWKPATH' can be used to override this default.  The format
of `AWKPATH' is a comma-separated list of directory specifications.
When defining it, the value should be quoted so that it retains a single
translation, and not a multi-translation RMS searchlist.
gawk on VMS POSIX
Ignore the instructions above, although `vms/gawk.hlp' should still
be made available in a help library.  Make sure that the configure
script is executable; use `chmod +x'
on it if necessary.  Then execute the following commands:
$ POSIX
psx> CC=vms/posix-cc.sh configure
psx> CC=c89 make gawk
The first command will construct files `config.h' and `Makefile'
out of templates.  The second command will compile and link gawk.
Ignore the warning
"Could not find lib m in lib list"; it is harmless, caused by the
explicit use of `-lm' as a linker option which is not needed
under VMS POSIX.  Under V1.1 (but not V1.0) a problem with the yacc
skeleton `/etc/yyparse.c' will cause a compiler warning for
`awktab.c', followed by a linker warning about compilation warnings
in the resulting object module.  These warnings can be ignored.
Once built, gawk will work like any other shell utility.  Unlike
the normal VMS port of gawk, no special command line manipulation is
needed in the VMS POSIX environment.
If you have received a binary distribution prepared by the DOS
maintainers, then gawk and the necessary support files will appear
under the `gnu' directory, with executables in `gnu/bin',
libraries in `gnu/lib/awk', and manual pages under `gnu/man'.
This is designed for easy installation to a `/gnu' directory on your
drive, but the files can be installed anywhere provided AWKPATH is
set properly.  Regardless of the installation directory, the first line of
`igawk.cmd' and `igawk.bat' (in `gnu/bin') may need to be
edited.
The binary distribution will contain a separate file describing the
contents. In particular, it may include more than one version of the
gawk executable. OS/2 binary distributions may have a 
different arrangement, but installation is similar.
The OS/2 and MS-DOS versions of gawk search for program files as
described in section The AWKPATH Environment Variable.
However, semicolons (rather than colons) separate elements
in the AWKPATH variable. If AWKPATH is not set or is empty,
then the default search path is ".;c:/lib/awk;c:/gnu/lib/awk".
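To check how a semicolon-separated search path breaks down into
directories, a small helper run from an sh-like shell can print one
element per line.  This is a sketch only; the function name
`split_search_path' is hypothetical.

```shell
# split_search_path PATH [SEPARATOR]
# Print each element of PATH on its own line.  SEPARATOR defaults to
# ";", the separator used on MS-DOS and OS/2.  The body runs in a
# subshell so that the caller's IFS and globbing settings are untouched.
split_search_path() (
    IFS=${2:-";"}
    set -f                        # do not expand wildcards in elements
    for element in $1; do
        printf '%s\n' "$element"
    done
)

# Usage: show AWKPATH, falling back to the default search path.
#   split_search_path "${AWKPATH:-.;c:/lib/awk;c:/gnu/lib/awk}"
```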
An sh-like shell (as opposed to command.com under MS-DOS 
or cmd.exe under OS/2) may be useful for awk programming.
Ian Stewartson has written an excellent shell for MS-DOS and OS/2, and a
ksh clone and GNU Bash are available for OS/2. The file
`README_d/README.pc' in the gawk distribution contains
information on these shells. Users of Stewartson's shell on DOS should
examine its documentation on handling of command-lines. In particular,
the setting for gawk in the shell configuration may need to be
changed, and the ignoretype option may also be of interest.
gawk can be compiled for MS-DOS and OS/2 using the GNU development tools
from DJ Delorie (DJGPP, MS-DOS-only) or Eberhard Mattes (EMX, MS-DOS and OS/2).
Microsoft C can be used to build 16-bit versions for MS-DOS and OS/2.  The file
`README_d/README.pc' in the gawk distribution contains additional
notes, and `pc/Makefile' contains important notes on compilation options.
To build gawk, copy the files in the `pc' directory to the
directory with the rest of the gawk sources. The `Makefile'
contains a configuration section with comments, and may need to be
edited in order to work with your make utility.
The `Makefile' contains a number of targets for building various MS-DOS
and OS/2 versions. A list of targets will be printed if the make
command is given without a target. As an example, to build gawk
using the DJGPP tools, enter `make djgpp'.
Using make to run the standard tests and to install gawk
requires additional Unix-like tools, including sh, sed, and
cp. In order to run the tests, the `test/*.ok' files may need to
be converted so that they have the usual DOS-style end-of-line markers. Most
of the tests will work properly with Stewartson's shell along with the
companion utilities or appropriate GNU utilities.  However, some editing of
`test/Makefile' is required. It is recommended that the file
`pc/Makefile.tst' be copied to `test/Makefile' as a
replacement. Details can be found in `README_d/README.pc'.
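One possible way to convert the `test/*.ok' files is sketched below.  It
is not taken from the distribution; the function name `to_dos_eol' is
hypothetical, and it assumes an sh-like shell with awk and mv
available.  `README_d/README.pc' remains the authoritative reference.

```shell
# to_dos_eol FILE...
# Rewrite each FILE in place with CR-LF line endings.  Lines that
# already end in CR are not given a second one.
to_dos_eol() {
    for file in "$@"; do
        awk '{ sub(/\r$/, ""); printf "%s\r\n", $0 }' "$file" > "$file.tmp" &&
            mv "$file.tmp" "$file"
    done
}

# Usage, from the top-level gawk directory:
#   to_dos_eol test/*.ok
```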
Installing gawk on the Atari ST
There are no substantial differences when installing gawk on
various Atari models.  Compiled gawk executables do not require
a large amount of memory with most awk programs and should run on all
Motorola processor-based models (referred to below as ST, even though
that is not strictly accurate).
In order to use gawk, you need to have a shell, either text or
graphics, that does not map all the characters of a command line to
upper-case.  Maintaining case distinction in option flags is very
important (see section Command Line Options).
These days this is the default, and it may only be a problem for some
very old machines.  If your system does not preserve the case of option
flags, you will need to upgrade your tools.  Support for I/O
redirection is necessary to make it easy to import awk programs
from other environments.  Pipes are nice to have, but not vital.
Compiling gawk on the Atari ST
A proper compilation of gawk sources when sizeof(int)
differs from sizeof(void *) requires an ANSI C compiler. An initial
port was done with gcc.  You may actually prefer executables
where ints are four bytes wide, but the other variant works as well.
You may need quite a bit of memory when trying to recompile the gawk
sources, as some source files (`regex.c' in particular) are quite
big.  If you run out of memory compiling such a file, try reducing the
optimization level for this particular file; this may help.
With a reasonable shell (Bash will do), and in particular if you run
Linux, MiNT or a similar operating system, you have a pretty good
chance that the configure utility will succeed.  Otherwise
sample versions of `config.h' and `Makefile.st' are given in the
`atari' subdirectory and can be edited and copied to the
corresponding files in the main source directory.  Even if
configure produced something, it might be advisable to compare
its results with the sample versions and possibly make adjustments.
Some gawk source code fragments depend on a preprocessor define
`atarist'.  This basically assumes the TOS environment with gcc.
Modify these sections as appropriate if they are not right for your
environment.  Also see the remarks about AWKPATH and envsep in
section Running gawk on the Atari ST.
As shipped, the sample `config.h' claims that the system
function is missing from the libraries, which is not true, and an
alternative implementation of this function is provided in
`atari/system.c'.  Depending upon your particular combination of
shell and operating system, you may wish to change the file to indicate
that system is available.
Running gawk on the Atari ST
An executable version of gawk should be placed, as usual,
anywhere in your PATH where your shell can find it.
While executing, gawk creates a number of temporary files.  When
using gcc libraries for TOS, gawk looks for either of
the environment variables TEMP or TMPDIR, in that order.
If either one is found, its value is assumed to be a directory for
temporary files.  This directory must exist, and if you can spare the
memory, it is a good idea to put it on a RAM drive.  If neither
TEMP nor TMPDIR is found, then gawk uses the
current directory for its temporary files.
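The same lookup order can be written as a one-line shell function, which
may help when checking where temporary files will land.  This is a sketch
of the described behavior; the function name `pick_temp_dir' is
hypothetical, and treating an empty variable like an unset one is an
assumption.

```shell
# pick_temp_dir
# Model of the lookup order: TEMP first, then TMPDIR, then the
# current directory.  Empty values are treated as unset (assumption).
pick_temp_dir() {
    echo "${TEMP:-${TMPDIR:-.}}"
}
```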
The ST version of gawk searches for its program files as described in
section The AWKPATH Environment Variable.
The default value for the AWKPATH variable is taken from
DEFPATH defined in `Makefile'. The sample gcc/TOS
`Makefile' for the ST in the distribution sets DEFPATH to
".,c:\lib\awk,c:\gnu\lib\awk".  The search path can be
modified by explicitly setting AWKPATH to whatever you wish.
Note that colons cannot be used on the ST to separate elements in the
AWKPATH variable, since they have another, reserved, meaning.
Instead, you must use a comma to separate elements in the path.  When
recompiling, the separating character can be modified by initializing
the envsep variable in `atari/gawkmisc.atr' to another
value.
Although awk allows great flexibility in doing I/O redirections
from within a program, this facility should be used with care on the ST
running under TOS.  In some circumstances the OS routines for file
handle pool processing lose track of certain events, causing the
computer to crash, and requiring a reboot.  Often a warm reboot is
sufficient.  Fortunately, this happens infrequently, and in rather
esoteric situations.  In particular, avoid having one part of an
awk program using print statements explicitly redirected
to "/dev/stdout", while other print statements use the
default standard output, and a calling shell has redirected standard
output to a file.
When gawk is compiled with the ST version of gcc and its
usual libraries, it will accept both `/' and `\' as path separators.
While this is convenient, it should be remembered that this removes one,
technically valid, character (`/') from your file names, and that
it may create problems for external programs, called via the system
function, which may not support this convention.  Whenever it is possible
that a file created by gawk will be used by some other program,
use only backslashes.  Also remember that in awk, backslashes in
strings have to be doubled in order to get literal backslashes
(see section Escape Sequences).
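As a brief illustration of the doubling rule, the following command prints
a path with single backslashes.  It is shown with the generic awk
command from an sh-like shell; the path itself is only an example.

```shell
# Each literal backslash is written twice inside the awk string constant.
awk 'BEGIN { print "c:\\lib\\awk" }'
# prints: c:\lib\awk
```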
Installing gawk on an Amiga
You can install gawk on an Amiga system using a Unix emulation
environment available via anonymous ftp from
wuarchive.wustl.edu in the directory `pub/aminet/dev/gcc'.
This includes a shell based on pdksh.  The primary component of
this environment is a Unix emulation library, `ixemul.lib'.
A more complete distribution for the Amiga is available on the FreshFish CD-ROM from:
Amiga Library Services
610 North Alma School Road, Suite 18
Chandler, AZ 85224 USA
Phone: +1-602-491-0048
FAX: +1-602-491-0048
E-mail: orders@amigalib.com
Once you have the distribution, you can configure gawk simply by
running configure:
configure -v m68k-cbm-amigados
Then run make, and you should be all set!
(If these steps do not work, please send in a bug report;
see section Reporting Problems and Bugs.)
If you have problems with gawk or think that you have found a bug,
please report it to the developers; we cannot promise to do anything,
but we might well want to fix it.
Before reporting a bug, make sure you have actually found a real bug. Carefully reread the documentation and see if it really says you can do what you're trying to do. If it's not clear whether you should be able to do something or not, report that too; it's a bug in the documentation!
Before reporting a bug or trying to fix it yourself, try to isolate it
to the smallest possible awk program and input data file that
reproduces the problem.  Then send us the program and data file,
some idea of what kind of Unix system you're using, and the exact results
gawk gave you.  Also say what you expected to occur; this will help
us decide whether the problem was really in the documentation.
Once you have a precise problem, there are two e-mail addresses you can send mail to.
Please include the
version number of gawk you are using.  You can get this information
with the command `gawk --version'.
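The items requested above can be gathered into one plain-text report with
a short shell function.  This is a hypothetical convenience, not an
official reporting tool: the name `make_bug_report', the section
headings, and the `AWK' variable used to pick the interpreter are all
assumptions of this sketch.

```shell
# make_bug_report PROGRAM DATAFILE
# Print a report containing the version, the system type, the program,
# the input data, and the actual output.  The interpreter is taken from
# the AWK variable (default: gawk).
make_bug_report() {
    awk_cmd=${AWK:-gawk}
    echo "== gawk version =="
    "$awk_cmd" --version 2>&1 | sed 1q
    echo "== system =="
    uname -a
    echo "== program ($1) =="
    cat "$1"
    echo "== input data ($2) =="
    cat "$2"
    echo "== actual output =="
    "$awk_cmd" -f "$1" "$2" 2>&1
    echo "== expected output =="
    echo "(describe what you expected here)"
}

# Usage:
#   make_bug_report bug.awk bug.data > report.txt
```

Edit the last section by hand before mailing the report, since only you
know what result you expected.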
You should send a carbon copy of your mail to Arnold Robbins, who can
be reached at `arnold@gnu.ai.mit.edu'.
Important! Do not try to report bugs in gawk by
posting to the Usenet/Internet newsgroup comp.lang.awk.
While the gawk developers do occasionally read this newsgroup,
there is no guarantee that we will see your posting.  The steps described
above are the official, recognized ways for reporting bugs.
Non-bug suggestions are always welcome as well. If you have questions about things that are unclear in the documentation or are just obscure features, ask Arnold Robbins; he will try to help you out, although he may not have the time to fix the problem. You can send him electronic mail at the Internet address above.
If you find bugs in one of the non-Unix ports of gawk, please send
an electronic mail message to the person who maintains that port.  They
are listed below, and also in the `README' file in the gawk
distribution.  Information in the README file should be considered
authoritative if it conflicts with this book.
The people maintaining the non-Unix ports of gawk are:
If your bug is also reproducible under Unix, please send copies of your report to the general GNU bug list, as well as to Arnold Robbins, at the addresses listed above.
awk Implementations
There are two other freely available awk implementations.
This section briefly describes where to get them.
awk
awk freely available.  You can get it via anonymous ftp
to the host netlib.att.com.  Change directory to
`/netlib/research'. Use "binary" or "image" mode, and
retrieve `awk.bundle.Z'.
This is a shell archive that has been compressed with the compress
utility. It can be uncompressed with either uncompress or the
GNU gunzip utility.
This version requires an ANSI C compiler; GCC (the GNU C compiler)
works quite nicely.
mawk
awk,
called mawk.  It is available under the GPL
(see section GNU GENERAL PUBLIC LICENSE),
just as gawk is.
You can get it via anonymous ftp to the host
oxy.edu.  Change directory to `/public'. Use "binary"
or "image" mode, and retrieve `mawk1.2.1.tar.gz' (or the latest
version that is there).
gunzip may be used to decompress this file. Installation
is similar to gawk's
(see section Compiling and Installing gawk on Unix).