Running Awk
There are two ways to run awk: with an explicit program, or with
one or more program files. Here are templates for both of them; items
enclosed in `[...]' in these templates are optional.
Besides traditional one-letter POSIX-style options, gawk also
supports GNU long options.
awk [options] -f progfile [--] file ...
awk [options] [--] 'program' file ...
It is possible to invoke awk with an empty program:
$ awk '' datafile1 datafile2
Doing so makes little sense though; awk will simply exit
silently when given an empty program (d.c.). If `--lint' has
been specified on the command line, gawk will issue a
warning that the program is empty.
Command Line Options
Options begin with a dash, and consist of a single character. GNU style long options consist of two dashes and a keyword. The keyword can be abbreviated, as long as the abbreviation allows the option to be uniquely identified. If the option takes an argument, then the keyword is either immediately followed by an equals sign (`=') and the argument's value, or the keyword and the argument's value are separated by whitespace. For brevity, the discussion below only refers to the traditional short options; however, the long and short options are interchangeable in all contexts.
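The three spellings of a long option described above can be sketched as follows. This is only an illustration: it assumes that gawk is installed under the name `gawk', and the sample record and shell variable names are invented for the example.

```shell
# Sample record (made up for this sketch).
line='alice:x:1000'

if command -v gawk >/dev/null 2>&1; then
    # Traditional short option.
    short=$(printf '%s\n' "$line" | gawk -F: '{ print $1 }')
    # Long option, value attached with an equals sign.
    long_eq=$(printf '%s\n' "$line" | gawk --field-separator=: '{ print $1 }')
    # Long option, value given as a separate argument.
    long_ws=$(printf '%s\n' "$line" | gawk --field-separator : '{ print $1 }')
    # Abbreviated keyword that still identifies the option uniquely.
    long_abbrev=$(printf '%s\n' "$line" | gawk --field-sep=: '{ print $1 }')
else
    # Long options are a gawk feature; skip the demonstration without gawk.
    short=skipped; long_eq=skipped; long_ws=skipped; long_abbrev=skipped
fi

echo "$short $long_eq $long_ws $long_abbrev"
```

All four invocations should select the same first field.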
Each long option for gawk has a corresponding
POSIX-style option. The options and their meanings are as follows:
-F fs
--field-separator fs
    Sets the FS variable to fs (see section Specifying How Fields are Separated).
-f source-file
--file source-file
    Indicates that the awk program is to be found in source-file instead of in the first non-option argument.
-v var=val
--assign var=val
    Sets the variable var to the value val before execution of the program begins. Such variable values are available inside the BEGIN rule (see section Other Command Line Arguments). The `-v' option can only set one variable, but you can use it more than once, setting another variable each time, like this: `awk -v foo=1 -v bar=2 ...'.
-mf=NNN
-mr=NNN
    Set various memory limits to the value NNN. The `f' flag sets the maximum number of fields, and the `r' flag sets the maximum record size. These two flags and the `-m' option are from the Bell Labs research version of Unix awk. They are provided for compatibility, but otherwise ignored by gawk, since gawk has no predefined limits.
-W gawk-opt
    Following the POSIX standard, options that are implementation specific are supplied as arguments to the `-W' option. With gawk, these arguments may be separated by commas, or quoted and separated by whitespace. Case is ignored when processing these options. These options also have corresponding GNU style long options. See below.
--
    Signals the end of the command line options. The following arguments are not treated as options even if they begin with `-'. This interpretation of `--' follows the POSIX argument parsing conventions. This is useful if you have file names that start with `-', or in shell scripts, if you have file names that will be specified by the user which could start with `-'.
The following gawk-specific options are available:
-W traditional
-W compat
--traditional
--compat
    Specifies compatibility mode, in which the GNU extensions to the awk language are disabled, so that gawk behaves just like the Bell Labs research version of Unix awk. `--traditional' is the preferred form of this option. See section Extensions in gawk Not in POSIX awk, which summarizes the extensions. Also see section Downward Compatibility and Debugging.
-W copyleft
-W copyright
--copyleft
--copyright
    Print the short version of the General Public License. This option may disappear in a future version of gawk.
-W help
-W usage
--help
--usage
    Print a "usage" message summarizing the short and long style options that gawk accepts, and then exit.
-W lint
--lint
    Warn about constructs that are dubious or non-portable to other awk implementations. Some warnings are issued when gawk first reads your program. Others are issued at run-time, as your program executes.
-W lint-old
--lint-old
    Warn about constructs that are not available in the original Version 7 Unix version of awk (see section Major Changes between V7 and SVR3.1).
-W posix
--posix
    Operate in strict POSIX mode. This disables all gawk extensions (just like `--traditional'), and adds the following additional restrictions:
    - \x escape sequences are not recognized (see section Escape Sequences).
    - The synonym func for the keyword function is not recognized (see section Function Definition Syntax).
    - The operators `**' and `**=' cannot be used in place of `^' and `^=' (see section Arithmetic Operators, and also see section Assignment Expressions).
    - Specifying `-Ft' on the command line does not set the value of FS to be a single tab character (see section Specifying How Fields are Separated).
    - The fflush built-in function is not supported (see section Built-in Functions for Input/Output).
    If you supply both `--traditional' and `--posix' on the command line, `--posix' takes precedence. gawk will also issue a warning if both options are supplied.
-W re-interval
--re-interval
    Allow interval expressions (see section Regular Expression Operators) in regexps. Because interval expressions were traditionally not available in awk, gawk does not provide them by default. This prevents old awk programs from breaking.
-W source program-text
--source program-text
    Program source code is taken from the program-text. This option allows you to mix source code in files with source code that you enter on the command line. This is particularly useful when you have library functions that you wish to use from your command line programs (see section The AWKPATH Environment Variable).
-W version
--version
    Prints version information for this particular copy of gawk. This allows you to determine if your copy of gawk is up to date with respect to whatever the Free Software Foundation is currently distributing. It is also useful for bug reports (see section Reporting Problems and Bugs).
Any other options are flagged as invalid with a warning message, but are otherwise ignored.
In compatibility mode, as a special case, if the value of fs supplied
to the `-F' option is `t', then FS is set to the tab
character ("\t"). This is only true for `--traditional', and not
for `--posix'
(see section Specifying How Fields are Separated).
The `-f' option may be used more than once on the command line.
If it is, awk reads its program source from all of the named files, as
if they had been concatenated together into one big file. This is
useful for creating libraries of awk functions. Useful functions
can be written once, and then retrieved from a standard place, instead
of having to be included into each individual program.
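As a minimal sketch of this library idea, the commands below keep a function in one file and the rule that uses it in another. The file names `lib.awk' and `main.awk' and the function `double' are hypothetical, chosen only for the illustration.

```shell
# Work in a scratch directory so nothing is left behind.
dir=$(mktemp -d)

# A tiny "library" file (hypothetical name) defining one function.
cat > "$dir/lib.awk" <<'EOF'
function double(x) { return 2 * x }
EOF

# The main program, which calls the library function.
cat > "$dir/main.awk" <<'EOF'
{ print double($1) }
EOF

# awk treats the two files as if they had been concatenated.
result=$(printf '21\n' | awk -f "$dir/lib.awk" -f "$dir/main.awk")
echo "$result"

rm -rf "$dir"
```

With the single input line `21', this should print `42'.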
You can type in a program at the terminal and still use library functions,
by specifying `-f /dev/tty'. awk will read a file from the terminal
to use as part of the awk program. After typing your program,
type Control-d (the end-of-file character) to terminate it.
(You may also use `-f -' to read program source from the standard
input, but then you will not be able to also use the standard input as a
source of data.)
Because it is clumsy using the standard awk mechanisms to mix source
file and command line awk programs, gawk provides the
`--source' option. This does not require you to pre-empt the standard
input for your source code, and allows you to easily mix command line
and library source code
(see section The AWKPATH Environment Variable).
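The mixing of library and command line source might look like the sketch below. It assumes gawk is installed under the name `gawk'; the library file and the function `double' are invented for the example.

```shell
dir=$(mktemp -d)

# Hypothetical library file with one function.
cat > "$dir/lib.awk" <<'EOF'
function double(x) { return 2 * x }
EOF

if command -v gawk >/dev/null 2>&1; then
    # The function comes from the file, the rule from the command line,
    # and the standard input stays free for data.
    result=$(printf '21\n' | gawk -f "$dir/lib.awk" --source '{ print double($1) }')
else
    # `--source' is specific to gawk; skip the demonstration without it.
    result="gawk not installed"
fi
echo "$result"

rm -rf "$dir"
```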
If no `-f' or `--source' option is specified, then gawk
will use the first non-option command line argument as the text of the
program source code.
If the environment variable POSIXLY_CORRECT exists,
then gawk will behave in strict POSIX mode, exactly as if
you had supplied the `--posix' command line option.
Many GNU programs look for this environment variable to turn on
strict POSIX mode. If you supply `--lint' on the command line,
and gawk turns on POSIX mode because of POSIXLY_CORRECT,
then it will print a warning message indicating that POSIX
mode is in effect.
You would typically set this variable in your shell's startup file. For a Bourne compatible shell (such as Bash), you would add these lines to the `.profile' file in your home directory.
POSIXLY_CORRECT=true
export POSIXLY_CORRECT
For a csh compatible shell,
you would add this line to the `.login' file in your home directory.
setenv POSIXLY_CORRECT true
Other Command Line Arguments
Any additional arguments on the command line are normally treated as
input files to be processed in the order specified. However, an
argument that has the form var=value assigns
the value value to the variable var; it does not specify a
file at all.
All these arguments are made available to your awk program in the
ARGV array (see section Built-in Variables). Command line options
and the program text (if present) are omitted from ARGV.
All other arguments, including variable assignments, are
included. As each element of ARGV is processed, gawk
sets the variable ARGIND to the index in ARGV of the
current element.
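A small sketch of what ends up in ARGV is shown here. The argument names are made up, `data.txt' does not need to exist because a program with only a BEGIN rule reads no input, and ARGC is the standard awk variable that holds the number of elements in ARGV.

```shell
# The -v option and the program text are left out of ARGV;
# the assignment b=2 and the file name are kept.
result=$(awk -v a=1 'BEGIN { for (i = 1; i < ARGC; i++) print i ": " ARGV[i] }' b=2 data.txt)
echo "$result"
```

This should list `b=2' as element 1 and `data.txt' as element 2.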
The distinction between file name arguments and variable-assignment
arguments is made when awk is about to open the next input file.
At that point in execution, it checks the "file name" to see whether
it is really a variable assignment; if so, awk sets the variable
instead of reading a file.
Therefore, the variables actually receive the given values after all
previously specified files have been read. In particular, the values of
variables assigned in this fashion are not available inside a
BEGIN rule
(see section The BEGIN and END Special Patterns),
since such rules are run before awk begins scanning the argument list.
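The difference in timing can be seen in a sketch like the following, which sets the same variable once with `-v' and once as a command line argument. The variable name x and the one-line data file are invented for the example.

```shell
data=$(mktemp)
printf 'one line\n' > "$data"

# Assigned with -v: already set when the BEGIN rule runs.
with_v=$(awk -v x=1 'BEGIN { print "BEGIN sees [" x "]" }')

# Assigned as an argument: still empty in BEGIN, set before the file is read.
as_arg=$(awk 'BEGIN { print "BEGIN sees [" x "]" }
{ print "main rule sees [" x "]" }' x=1 "$data")

echo "$with_v"
echo "$as_arg"

rm -f "$data"
```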
The variable values given on the command line are processed for escape sequences (d.c.) (see section Escape Sequences).
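For instance, in the sketch below the shell passes the two characters backslash and `t' to awk unchanged, and awk turns them into a single tab while performing the assignment. The variable name `msg' and the data file are hypothetical.

```shell
data=$(mktemp)
printf 'x\n' > "$data"

# The single quotes keep the shell from touching the backslash.
result=$(awk '{ print msg }' 'msg=a\tb' "$data")
echo "$result"

rm -f "$data"
```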
In some earlier implementations of awk, when a variable assignment
occurred before any file names, the assignment would happen before
the BEGIN rule was executed. awk's behavior was thus
inconsistent; some command line assignments were available inside the
BEGIN rule, while others were not. However,
some applications came to depend
upon this "feature." When awk was changed to be more consistent,
the `-v' option was added to accommodate applications that depended
upon the old behavior.
The variable assignment feature is most useful for assigning to variables
such as RS, OFS, and ORS, which control input and
output formats, before scanning the data files. It is also useful for
controlling state if multiple passes are needed over a data file. For
example:
awk 'pass == 1 { pass 1 stuff }
pass == 2 { pass 2 stuff }' pass=1 mydata pass=2 mydata
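To make the two-pass template concrete, here is one possible filling-in, offered only as a sketch: the first pass totals the first field, and the second pass prints each value with its share of that total. The data values are invented.

```shell
data=$(mktemp)
printf '10\n30\n60\n' > "$data"

# The same file is named twice, with `pass' reassigned in between.
result=$(awk 'pass == 1 { total += $1 }
pass == 2 { printf "%d %d%%\n", $1, 100 * $1 / total }' pass=1 "$data" pass=2 "$data")
echo "$result"

rm -f "$data"
```

Since the values add up to 100, each line should show the value followed by the same number as a percentage.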
Given the variable assignment feature, the `-F' option for setting
the value of FS is not
strictly necessary. It remains for historical compatibility.
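The equivalence can be checked with a sketch such as this one, where the sample record is made up. The assignment form works here because FS is set before the first record of the file is read.

```shell
data=$(mktemp)
printf 'alice:x:1000\n' > "$data"

# Field separator set with the -F option.
with_F=$(awk -F: '{ print $3 }' "$data")

# Field separator set by a command line variable assignment.
with_assign=$(awk '{ print $3 }' FS=: "$data")

echo "$with_F $with_assign"

rm -f "$data"
```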
The AWKPATH Environment Variable
The previous section described how awk program files can be named
on the command line with the `-f' option. In most awk
implementations, you must supply a precise path name for each program
file, unless the file is in the current directory.
But in gawk, if the file name supplied to the `-f' option
does not contain a `/', then gawk searches a list of
directories (called the search path), one by one, looking for a
file with the specified name.
The search path is a string consisting of directory names
separated by colons. gawk gets its search path from the
AWKPATH environment variable. If that variable does not exist,
gawk uses a default path, which is
`.:/usr/local/share/awk'. (Programs written for use by
system administrators should use an AWKPATH variable that
does not include the current directory, `.'.)
The search path feature is particularly useful for building up libraries
of useful awk functions. The library files can be placed in a
standard directory that is in the default path, and then specified on
the command line with a short file name. Otherwise, the full file name
would have to be typed for each file.
By using both the `--source' and `-f' options, your command line
awk programs can use facilities in awk library files.
See section A Library of awk Functions.
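A sketch of the search path in use follows. It assumes gawk is installed under the name `gawk'; the scratch directory stands in for a real library directory, and `lib.awk' and `double' are hypothetical names.

```shell
dir=$(mktemp -d)

# Hypothetical library file placed in the directory named by AWKPATH.
cat > "$dir/lib.awk" <<'EOF'
function double(x) { return 2 * x }
EOF

if command -v gawk >/dev/null 2>&1; then
    # `lib.awk' contains no `/', so gawk looks for it along AWKPATH.
    result=$(AWKPATH="$dir" gawk -f lib.awk --source 'BEGIN { print double(21) }')
else
    # Path searching is specific to gawk; skip the demonstration without it.
    result="gawk not installed"
fi
echo "$result"

rm -rf "$dir"
```

Setting AWKPATH only for the one command, as done here, leaves the rest of the shell session unaffected.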
Path searching is not done if gawk is in compatibility mode.
This is true for both `--traditional' and `--posix'.
See section Command Line Options.
Note: if you want files in the current directory to be found, you must include the current directory in the path, either by including `.' explicitly in the path, or by writing a null entry in the path. (A null entry is indicated by starting or ending the path with a colon, or by placing two colons next to each other (`::').) If the current directory is not included in the path, then files cannot be found in the current directory. This path search mechanism is identical to the shell's.
Starting with version 3.0, if AWKPATH is not defined in the
environment, gawk will place its default search path into
ENVIRON["AWKPATH"]. This makes it easy to determine
the actual search path gawk will use.
Obsolete Options and/or Features
This section describes features and/or command line options from
previous releases of gawk that are either not available in the
current version, or that are still supported but deprecated (meaning that
they will not be in the next release).
For version 3.0 of gawk, there are no command line options
or other deprecated features from the previous version of gawk.
This section
is thus essentially a placeholder,
in case some option becomes obsolete in a future version of gawk.
Undocumented Options and Features
This section intentionally left blank.
Known Bugs in gawk
- The `-F' option for changing the value of FS (see section Command Line Options) is not necessary given the command line variable assignment feature; it remains only for backwards compatibility.
- If your system actually has support for `/dev/fd' and the associated `/dev/stdin', `/dev/stdout', and `/dev/stderr' files, you may get different output from gawk than you would get on a system without those files. When gawk interprets these files internally, it synchronizes output to the standard output with output to `/dev/stdout', while on a system with those files, the output is actually to different open files (see section Special File Names in gawk).
- Syntactically invalid single character programs tend to overflow the parse stack, generating a rather unhelpful message. Such programs are surprisingly difficult to diagnose in the completely general case, and the effort to do so really is not worth it.
- The word "GNU" is incorrectly capitalized in at least one file in the source code.
