awk Language
This book describes the GNU implementation of awk, which follows
the POSIX specification. Many awk users are only familiar
with the original awk implementation in Version 7 Unix.
(This implementation was the basis for awk in Berkeley Unix,
through 4.3-Reno. The 4.4 release of Berkeley Unix uses gawk 2.15.2
for its version of awk.) This chapter briefly describes the
evolution of the awk language, with cross-references to other parts
of the book where you can find more information.
The awk language evolved considerably between the release of
Version 7 Unix (1978) and the new version first made generally available in
System V Release 3.1 (1987). This section summarizes the changes, with
cross-references to further details.
awk Statements Versus Lines).
The return statement
(see section User-defined Functions).
The delete statement (see section The delete Statement).
The do-while statement
(see section The do-while Statement).
The built-in functions atan2, cos, sin, rand, and
srand (see section Numeric Built-in Functions).
The built-in functions gsub, sub, and match
(see section Built-in Functions for String Manipulation).
The built-in functions close and system
(see section Built-in Functions for Input/Output).
The ARGC, ARGV, FNR, RLENGTH, RSTART,
and SUBSEP built-in variables (see section Built-in Variables).
awk
programs (see section Operator Precedence (How Operators Nest)).
FS
(see section Specifying How Fields are Separated), and as the
third argument to the split function
(see section Built-in Functions for String Manipulation).
awk to
recognize `\r', `\b', and `\f', but this is not
something you can rely on.)
getline function
(see section Explicit Input with getline).
BEGIN and END rules
(see section The BEGIN and END Special Patterns).
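As a rough illustration of several items in this list, the following sketch uses a regexp as the value of FS, the match function with RSTART and RLENGTH, and gsub. The sample input and the shell function name svr31_demo are invented for this example; it assumes any POSIX-compatible awk is installed as awk.

```shell
# Illustrative only: svr31_demo and its sample input are hypothetical.
svr31_demo() {
    printf 'alpha1beta22gamma\n' | awk '
        BEGIN { FS = "[0-9]+" }          # a regexp as the field separator
        {
            print NF                     # fields are alpha, beta, gamma
            if (match($0, /[0-9]+/))     # match sets RSTART and RLENGTH
                print RSTART, RLENGTH
            n = gsub(/[0-9]+/, "-")      # gsub returns the number of replacements
            print n, $0
        }'
}
svr31_demo
```

With this input the program should print `3`, then `6 1`, then `2 alpha-beta-gamma`. A Version 7 awk would not be expected to accept this program, since match, gsub, and regexp field separators all appear in the list above.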
The System V Release 4 version of Unix awk added these features
(some of which originated in gawk):
The ENVIRON variable (see section Built-in Variables).
srand built-in function
(see section Numeric Built-in Functions).
The toupper and tolower built-in string functions
for case translation
(see section Built-in Functions for String Manipulation).
printf function
(see section Format-Control Letters).
The ability to pass the field width and precision dynamically ("%*.*d")
in the argument list of the printf function
(see section Format-Control Letters).
Regexp constants such as /foo/ as expressions, where
they are equivalent to using the matching operator, as in `$0 ~ /foo/'
(see section Using Regular Expression Constants).
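The following sketch exercises a few of these additions together: a regexp constant used as an expression, toupper and tolower, a dynamic width and precision in printf, and ENVIRON. The function name svr4_demo, the sample input, and the environment variable DEMO_NAME are all made up for the example; it assumes an awk that supports these features.

```shell
# Illustrative only: svr4_demo, its input, and DEMO_NAME are hypothetical.
svr4_demo() {
    printf 'foo bar\nbaz\n' | DEMO_NAME=world awk '
        {
            if (/foo/) hits++            # regexp constant used as an expression
            if ($0 ~ /foo/) same++       # the equivalent explicit match
        }
        END {
            print hits, same
            print toupper("mixed Case"), tolower("MiXeD")
            printf "[%*.*d]\n", 6, 3, 42 # width 6 and precision 3 come from the arguments
            print ENVIRON["DEMO_NAME"]
        }'
}
svr4_demo
```

The two counters should agree (`1 1`), which is the equivalence described above. The printf line should produce `[   042]`: the precision of 3 pads the number with a leading zero, and the width of 6 pads the result with spaces.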
awk
The POSIX Command Language and Utilities standard for awk
introduced the following changes into the language:
The use of CONVFMT for controlling the conversion of numbers
to strings (see section Conversion of Strings and Numbers).
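A small sketch may make the role of CONVFMT clearer. It assumes a POSIX-compatible awk; the function name convfmt_demo and the values are invented for the example. OFMT, which is not discussed in this section, is the separate variable that print uses for numeric output, and it is set here only to show the contrast.

```shell
# Illustrative only: convfmt_demo and its values are hypothetical.
convfmt_demo() {
    awk 'BEGIN {
        x = 3.14159265
        CONVFMT = "%.2f"
        s = x ""        # concatenation converts the number using CONVFMT
        OFMT = "%.4f"
        print s         # s is already a string, so OFMT does not apply
        print x         # print formats a number using OFMT
    }'
}
convfmt_demo
```

This should print `3.14` followed by `3.1416`. Integer values are converted as integers regardless of CONVFMT, so the sketch deliberately uses a non-integer value.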
The following common extensions are not permitted by the POSIX standard:
\x escape sequences are not recognized
(see section Escape Sequences).
The synonym func for the keyword function is not
recognized (see section Function Definition Syntax).
FS to be a single tab character
(see section Specifying How Fields are Separated).
The fflush built-in function is not supported
(see section Built-in Functions for Input/Output).
awk
Brian Kernighan, one of the original designers of Unix awk,
has made his version available via anonymous ftp
(see section Other Freely Available awk Implementations).
This section describes extensions in his version of awk that are
not in POSIX awk.
The fflush built-in function for flushing buffered output
(see section Built-in Functions for Input/Output).
Extensions in gawk Not in POSIX awk
The GNU implementation, gawk, adds a number of features.
This section lists them in the order they were added to gawk.
They can all be disabled with either the `--traditional' or
`--posix' options
(see section Command Line Options).
Version 2.10 of gawk introduced these features:
The AWKPATH environment variable for specifying a search path for
the `-f' command line option
(see section Command Line Options).
The IGNORECASE variable and its effects
(see section Case-sensitivity in Matching).
gawk).
Version 2.13 of gawk introduced these features:
The FIELDWIDTHS variable and its effects
(see section Reading Fixed-width Data).
The systime and strftime built-in functions for obtaining
and printing time stamps
(see section Functions for Dealing with Time Stamps).
Version 2.14 of gawk introduced these features:
The next file statement for skipping to the next data file
(see section The nextfile Statement).
Version 2.15 of gawk introduced these features:
The ARGIND variable, which tracks the movement of FILENAME
through ARGV (see section Built-in Variables).
The ERRNO variable, which contains the system error message when
getline returns -1, or when close fails
(see section Built-in Variables).
gawk).
Version 3.0 of gawk introduced these features:
The next file statement became nextfile
(see section The nextfile Statement).
awk
(see section Major Changes between V7 and SVR3.1).
The ability for FS to be the null string, and for the third
argument to split to be the null string
(see section Making Each Character a Separate Field).
The ability for RS to be a regexp
(see section How Input is Split into Records).
The RT variable
(see section How Input is Split into Records).
The gensub function for more powerful text manipulation
(see section Built-in Functions for String Manipulation).
The strftime function acquired a default time format,
allowing it to be called with no arguments
(see section Functions for Dealing with Time Stamps).
The behavior of IGNORECASE changed; it now applies to string comparison as well
as regexp operations
(see section Case-sensitivity in Matching).
The fflush function from the
Bell Labs research version of awk
(see section Command Line Options; also
see section Built-in Functions for Input/Output).
gawk for Unix).
gawk on an Amiga).
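To give a feel for some of the version 3.0 features, here is a minimal sketch that sets RS to a regexp, reads the matched separator text from RT, splits each record into single characters with a null FS, and calls gensub. It requires gawk itself; the function name gawk3_demo and the sample data are invented for the example, and the call is guarded so that nothing runs where gawk is not installed.

```shell
# Illustrative only: gawk3_demo and its input are hypothetical. Requires gawk.
gawk3_demo() {
    printf 'ab1cd22ef' | gawk '
        BEGIN { RS = "[0-9]+"; FS = "" }      # regexp RS; null FS gives one character per field
        { printf "%s|%s|%d\n", $0, RT, NF }   # RT holds the text that matched RS
        END { print gensub(/(a+)(b+)/, "\\2\\1", "g", "aabb-ab") }'
}
if command -v gawk >/dev/null 2>&1; then gawk3_demo; fi
```

Each of the three records should report two fields, and RT should be empty for the last record because no separator follows it. Running the same program with `--traditional' or `--posix' is expected to behave differently, since those options disable the extensions.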