The Evolution of the awk Language
This book describes the GNU implementation of awk, which follows
the POSIX specification. Many awk users are only familiar
with the original awk implementation in Version 7 Unix.
(This implementation was the basis for awk in Berkeley Unix,
through 4.3-Reno. The 4.4 release of Berkeley Unix uses gawk 2.15.2
for its version of awk.) This chapter briefly describes the
evolution of the awk language, with cross-references to other parts
of the book where you can find more information.
Major Changes between V7 and SVR3.1
The awk language evolved considerably between the release of
Version 7 Unix (1978) and the new version first made generally available in
System V Release 3.1 (1987). This section summarizes the changes, with
cross-references to further details.
- The requirement for `;' to separate rules on a line (see section awk Statements Versus Lines).
- User-defined functions, and the return statement (see section User-defined Functions).
- The delete statement (see section The delete Statement).
- The do-while statement (see section The do-while Statement).
- The built-in functions atan2, cos, sin, rand, and srand (see section Numeric Built-in Functions).
- The built-in functions gsub, sub, and match (see section Built-in Functions for String Manipulation).
- The built-in functions close and system (see section Built-in Functions for Input/Output).
- The ARGC, ARGV, FNR, RLENGTH, RSTART, and SUBSEP built-in variables (see section Built-in Variables).
- The conditional expression using the ternary operator `?:' (see section Conditional Expressions).
- The exponentiation operator `^' (see section Arithmetic Operators) and its assignment operator form `^=' (see section Assignment Expressions).
- C-compatible operator precedence, which breaks some old awk programs (see section Operator Precedence (How Operators Nest)).
- Regexps as the value of FS (see section Specifying How Fields are Separated), and as the third argument to the split function (see section Built-in Functions for String Manipulation).
- Dynamic regexps as operands of the `~' and `!~' operators (see section How to Use Regular Expressions).
- The escape sequences `\b', `\f', and `\r' (see section Escape Sequences). (Some vendors have updated their old versions of awk to recognize `\r', `\b', and `\f', but this is not something you can rely on.)
- Redirection of input for the getline function (see section Explicit Input with getline).
- Multiple BEGIN and END rules (see section The BEGIN and END Special Patterns).
- Multi-dimensional arrays (see section Multi-dimensional Arrays).
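Several of the features listed above can be seen together in one short program. The following is a minimal sketch rather than an example from the original text; it assumes a POSIX-compatible awk is available on the PATH, and the names new_awk_demo and max are made up for the illustration.

```shell
# Hypothetical demo; assumes a POSIX-compatible awk on the PATH.
new_awk_demo() {
  awk '
    # User-defined function using return and the ternary operator.
    function max(a, b) { return (a > b) ? a : b }

    # First BEGIN rule.
    BEGIN { printf "max=%d\n", max(3, 7) }

    # Second BEGIN rule; BEGIN rules run in the order they appear.
    BEGIN {
      s = "a-b-c"
      n = gsub(/-/, "+", s)    # gsub returns the number of substitutions made
      printf "%s %d\n", s, n
      print 2 ^ 10             # exponentiation operator
    }
  '
}

new_awk_demo
```

A program like this would be rejected by a Version 7 awk, since that version has no user-defined functions, no gsub, and no `?:' or `^' operators.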
Changes between SVR3.1 and SVR4
The System V Release 4 version of Unix awk added these features
(some of which originated in gawk):
- The ENVIRON variable (see section Built-in Variables).
- Multiple `-f' options on the command line (see section Command Line Options).
- The `-v' option for assigning variables before program execution begins (see section Command Line Options).
- The `--' option for terminating command line options.
- The `\a', `\v', and `\x' escape sequences (see section Escape Sequences).
- A defined return value for the srand built-in function (see section Numeric Built-in Functions).
- The toupper and tolower built-in string functions for case translation (see section Built-in Functions for String Manipulation).
- A cleaner specification for the `%c' format-control letter in the printf function (see section Format-Control Letters).
- The ability to dynamically pass the field width and precision ("%*.*d") in the argument list of the printf function (see section Format-Control Letters).
- The use of regexp constants such as /foo/ as expressions, where they are equivalent to using the matching operator, as in `$0 ~ /foo/' (see section Using Regular Expression Constants).
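As an illustration of three of these additions (the `-v' option, ENVIRON, and toupper), here is a small sketch. It is not from the original text; it assumes a POSIX-compatible awk on the PATH, and the function name svr4_demo, the awk variable name, and the GREETING environment variable are hypothetical.

```shell
# Hypothetical demo; assumes a POSIX-compatible awk on the PATH.
svr4_demo() {
  # GREETING is placed in the environment of this one awk invocation only.
  GREETING=hello awk -v name=world '
    BEGIN {
      # -v assigns name before the BEGIN rule runs;
      # ENVIRON gives access to the environment; toupper translates case.
      print toupper(ENVIRON["GREETING"]) ", " name
    }
  '
}

svr4_demo   # prints "HELLO, world"
```

Without `-v', a command-line assignment such as name=world is only processed when awk reaches it among the file arguments, which is too late for a BEGIN rule.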
Changes between SVR4 and POSIX awk
The POSIX Command Language and Utilities standard for awk
introduced the following changes into the language:
- The use of `-W' for implementation-specific options.
- The use of CONVFMT for controlling the conversion of numbers to strings (see section Conversion of Strings and Numbers).
- The concept of a numeric string, and tighter comparison rules to go with it (see section Variable Typing and Comparison Expressions).
- More complete documentation of many of the previously undocumented features of the language.
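The roles of CONVFMT and of numeric strings can be sketched as follows. This is an illustrative example, not part of the original text; it assumes a POSIX-compatible awk on the PATH, and the function name posix_demo and the sample input are made up.

```shell
# Hypothetical demo; assumes a POSIX-compatible awk on the PATH.
posix_demo() {
  echo "10 9" | awk '{
    CONVFMT = "%.2f"    # used when a number is converted to a string
    OFMT = "%.4f"       # used when print outputs a number
    x = 3.14159
    s = x ""            # concatenation forces a conversion via CONVFMT
    print s             # 3.14
    print x             # 3.1416
    # Both fields look like numbers, so they are compared numerically.
    print ($1 > $2)     # 1
  }'
}

posix_demo
```

The last line shows the numeric-string rule: compared as plain strings, "10" would sort before "9", but because both values come from input and look numeric, the comparison is numeric.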
The following common extensions are not permitted by the POSIX standard:
- \x escape sequences are not recognized (see section Escape Sequences).
- The synonym func for the keyword function is not recognized (see section Function Definition Syntax).
- The operators `**' and `**=' cannot be used in place of `^' and `^=' (see section Arithmetic Operators, and also see section Assignment Expressions).
- Specifying `-Ft' on the command line does not set the value of FS to be a single tab character (see section Specifying How Fields are Separated).
- The fflush built-in function is not supported (see section Built-in Functions for Input/Output).
Extensions in the AT&T Bell Laboratories awk
Brian Kernighan, one of the original designers of Unix awk,
has made his version available via anonymous ftp
(see section Other Freely Available awk Implementations).
This section describes extensions in his version of awk that are
not in POSIX awk.
- The `-mf=NNN' and `-mr=NNN' command line options to set the maximum number of fields, and the maximum record size, respectively (see section Command Line Options).
- The fflush built-in function for flushing buffered output (see section Built-in Functions for Input/Output).
Extensions in gawk Not in POSIX awk
The GNU implementation, gawk, adds a number of features.
This section lists them in the order they were added to gawk.
They can all be disabled with either the `--traditional' or
`--posix' option
(see section Command Line Options).
Version 2.10 of gawk introduced these features:
- The AWKPATH environment variable for specifying a path search for the `-f' command line option (see section Command Line Options).
- The IGNORECASE variable and its effects (see section Case-sensitivity in Matching).
- The `/dev/stdin', `/dev/stdout', `/dev/stderr', and `/dev/fd/n' file name interpretation (see section Special File Names in gawk).
Version 2.13 of gawk introduced these features:
- The FIELDWIDTHS variable and its effects (see section Reading Fixed-width Data).
- The systime and strftime built-in functions for obtaining and printing time stamps (see section Functions for Dealing with Time Stamps).
- The `-W lint' option to provide source code and run time error and portability checking (see section Command Line Options).
- The `-W compat' option to turn off these extensions (see section Command Line Options).
- The `-W posix' option for full POSIX compliance (see section Command Line Options).
Version 2.14 of gawk introduced these features:
- The next file statement for skipping to the next data file (see section The nextfile Statement).
Version 2.15 of gawk introduced these features:
- The ARGIND variable, which tracks the movement of FILENAME through ARGV (see section Built-in Variables).
- The ERRNO variable, which contains the system error message when getline returns -1, or when close fails (see section Built-in Variables).
- The ability to use GNU-style long named options that start with `--' (see section Command Line Options).
- The `--source' option for mixing command line and library file source code (see section Command Line Options).
- The `/dev/pid', `/dev/ppid', `/dev/pgrpid', and `/dev/user' file name interpretation (see section Special File Names in gawk).
Version 3.0 of gawk introduced these features:
- The next file statement became nextfile (see section The nextfile Statement).
- The `--lint-old' option to warn about constructs that are not available in the original Version 7 Unix version of awk (see section Major Changes between V7 and SVR3.1).
- The `--traditional' option was added as a better name for `--compat' (see section Command Line Options).
- The ability for FS to be a null string, and for the third argument to split to be the null string (see section Making Each Character a Separate Field).
- The ability for RS to be a regexp (see section How Input is Split into Records).
- The RT variable (see section How Input is Split into Records).
- The gensub function for more powerful text manipulation (see section Built-in Functions for String Manipulation).
- The strftime function acquired a default time format, allowing it to be called with no arguments (see section Functions for Dealing with Time Stamps).
- Full support for both POSIX and GNU regexps (see section Regular Expressions).
- The `--re-interval' option to provide interval expressions in regexps (see section Regular Expression Operators).
- IGNORECASE changed, now applying to string comparison as well as regexp operations (see section Case-sensitivity in Matching).
- The `-m' option and the fflush function from the Bell Labs research version of awk (see section Command Line Options; also see section Built-in Functions for Input/Output).
- The use of GNU Autoconf to control the configuration process (see section Compiling gawk for Unix).
- Amiga support (see section Installing gawk on an Amiga).
