
The GNU Awk User's Guide

This file documents awk, a program that you can use to select particular records in a file and perform operations upon them.

This is Edition 1.0.6 of Effective AWK Programming, for the 3.0.6 version of the GNU implementation of AWK.

Preface  What this Info file is about; brief history and acknowledgements.
1. Introduction  What is the awk language; using this Info file.
2. Getting Started with awk  A basic introduction to using awk. How to run an awk program. Command line syntax.
3. Useful One Line Programs  Short, sample awk programs.
4. Regular Expressions  All about matching things using regular expressions.
5. Reading Input Files  How to read files and manipulate fields.
6. Printing Output  How to print using awk. Describes the print and printf statements. Also describes redirection of output.
7. Expressions  Expressions are the basic building blocks of statements.
8. Patterns and Actions  Overviews of patterns and actions.
9. Control Statements in Actions  The various control statements are described in detail.
10. Built-in Variables  
11. Arrays in awk  The description and use of arrays. Also includes array-oriented control statements.
12. Built-in Functions  The built-in functions are summarized here.
13. User-defined Functions  User-defined functions are described in detail.
14. Running awk  How to run gawk.
15. A Library of awk Functions  
16. Practical awk Programs  Many awk programs with complete explanations.
17. The Evolution of the awk Language  The evolution of the awk language.
A. gawk Summary  gawk Options and Language Summary.
B. Installing gawk  Installing gawk under various operating systems.
C. Implementation Notes  Something about the implementation of gawk.
D. Glossary  An explanation of some unfamiliar terms.
GNU GENERAL PUBLIC LICENSE  Your right to copy and distribute gawk.
Index  Concept and Variable Index.

History of awk and gawk  The history of gawk and awk.
The GNU Project and This Book  Brief history of the GNU project and this Info file.
Acknowledgements  
1.1 Using This Book  Using this Info file. Includes sample input files that you can use.
1.2 Typographical Conventions  
1.3 Data Files for the Examples  Sample data files for use in the awk programs illustrated in this Info file.
2.1 A Rose By Any Other Name  What name to use to find awk.
2.2 How to Run awk Programs  How to run gawk programs; includes command line syntax.
2.2.1 One-shot Throw-away awk Programs  Running a short throw-away awk program.
2.2.2 Running awk without Input Files  Using no input files (input from terminal instead).
2.2.3 Running Long Programs  Putting permanent awk programs in files.
2.2.4 Executable awk Programs  Making self-contained awk programs.
2.2.5 Comments in awk Programs  Adding documentation to gawk programs.
2.3 A Very Simple Example  A very simple example.
2.4 An Example with Two Rules  A less simple one-line example with two rules.
2.5 A More Complex Example  A more complex example.
2.6 awk Statements Versus Lines  Subdividing or combining statements into lines.
2.7 Other Features of awk  
2.8 When to Use awk  When to use gawk and when to use other things.
4.1 How to Use Regular Expressions  
4.2 Escape Sequences  How to write non-printing characters.
4.3 Regular Expression Operators  
4.4 Additional Regexp Operators Only in gawk  Operators specific to GNU software.
4.5 Case-sensitivity in Matching  How to do case-insensitive matching.
4.6 How Much Text Matches?  How much text matches.
4.7 Using Dynamic Regexps  
5.1 How Input is Split into Records  Controlling how data is split into records.
5.2 Examining Fields  An introduction to fields.
5.3 Non-constant Field Numbers  
5.4 Changing the Contents of a Field  
5.5 Specifying How Fields are Separated  The field separator and how to change it.
5.5.1 The Basics of Field Separating  How fields are split with single characters or simple strings.
5.5.2 Using Regular Expressions to Separate Fields  Using regexps as the field separator.
5.5.3 Making Each Character a Separate Field  Making each character a separate field.
5.5.4 Setting FS from the Command Line  Setting FS from the command line.
5.5.5 Field Splitting Summary  Some final points and a summary table.
5.6 Reading Fixed-width Data  Reading constant width data.
5.7 Multiple-Line Records  Reading multi-line records.
5.8 Explicit Input with getline  Reading files under explicit program control using the getline function.
5.8.1 Introduction to getline  Introduction to the getline function.
5.8.2 Using getline with No Arguments  Using getline with no arguments.
5.8.3 Using getline Into a Variable  Using getline into a variable.
5.8.4 Using getline from a File  Using getline from a file.
5.8.5 Using getline Into a Variable from a File  Using getline into a variable from a file.
5.8.6 Using getline from a Pipe  Using getline from a pipe.
5.8.7 Using getline Into a Variable from a Pipe  Using getline into a variable from a pipe.
5.8.8 Summary of getline Variants  Summary of getline variants.
6.1 The print Statement  The print statement.
6.2 Examples of print Statements  Simple examples of print statements.
6.3 Output Separators  The output separators and how to change them.
6.4 Controlling Numeric Output with print  Controlling numeric output with print.
6.5 Using printf Statements for Fancier Printing  The printf statement.
6.5.1 Introduction to the printf Statement  Syntax of the printf statement.
6.5.2 Format-Control Letters  Format-control letters.
6.5.3 Modifiers for printf Formats  Format-specification modifiers.
6.5.4 Examples Using printf  Several examples.
6.6 Redirecting Output of print and printf  How to redirect output to multiple files and pipes.
6.7 Special File Names in gawk  File name interpretation in gawk. gawk allows access to inherited file descriptors.
6.8 Closing Input and Output Files and Pipes  
7.1 Constant Expressions  String, numeric, and regexp constants.
7.1.1 Numeric and String Constants  Numeric and string constants.
7.1.2 Regular Expression Constants  Regular Expression constants.
7.2 Using Regular Expression Constants  When and how to use a regexp constant.
7.3 Variables  Variables give names to values for later use.
7.3.1 Using Variables in a Program  Using variables in your programs.
7.3.2 Assigning Variables on the Command Line  Setting variables on the command line and a summary of command line syntax. This is an advanced method of input.
7.4 Conversion of Strings and Numbers  The conversion of strings to numbers and vice versa.
7.5 Arithmetic Operators  Arithmetic operations (`+', `-', etc.)
7.6 String Concatenation  Concatenating strings.
7.7 Assignment Expressions  Changing the value of a variable or a field.
7.8 Increment and Decrement Operators  Incrementing the numeric value of a variable.
7.9 True and False in awk  What is "true" and what is "false".
7.10 Variable Typing and Comparison Expressions  How variables acquire types, and how this affects comparison of numbers and strings with `<', etc.
7.11 Boolean Expressions  Combining comparison expressions using boolean operators `||' ("or"), `&&' ("and") and `!' ("not").
7.12 Conditional Expressions  Conditional expressions select between two subexpressions under control of a third subexpression.
7.13 Function Calls  A function call is an expression.
7.14 Operator Precedence (How Operators Nest)  How various operators nest.
8.1 Pattern Elements  What goes into a pattern.
8.1.1 Kinds of Patterns  A list of all kinds of patterns.
8.1.2 Regular Expressions as Patterns  Using regexps as patterns.
8.1.3 Expressions as Patterns  Any expression can be used as a pattern.
8.1.4 Specifying Record Ranges with Patterns  Pairs of patterns specify record ranges.
8.1.5 The BEGIN and END Special Patterns  Specifying initialization and cleanup rules.
8.1.5.1 Startup and Cleanup Actions  How and why to use BEGIN/END rules.
8.1.5.2 Input/Output from BEGIN and END Rules  I/O issues in BEGIN/END rules.
8.1.6 The Empty Pattern  The empty pattern, which matches every record.
8.2 Overview of Actions  What goes into an action.
9.1 The if-else Statement  Conditionally execute some awk statements.
9.2 The while Statement  Loop until some condition is satisfied.
9.3 The do-while Statement  Do specified action while looping until some condition is satisfied.
9.4 The for Statement  Another looping statement, that provides initialization and increment clauses.
9.5 The break Statement  Immediately exit the innermost enclosing loop.
9.6 The continue Statement  Skip to the end of the innermost enclosing loop.
9.7 The next Statement  Stop processing the current input record.
9.8 The nextfile Statement  Stop processing the current file.
9.9 The exit Statement  Stop execution of awk.
10.1 Built-in Variables that Control awk  Built-in variables that you change to control awk.
10.2 Built-in Variables that Convey Information  Built-in variables where awk gives you information.
10.3 Using ARGC and ARGV  Ways to use ARGC and ARGV.
11.1 Introduction to Arrays  
11.2 Referring to an Array Element  How to examine one element of an array.
11.3 Assigning Array Elements  How to change an element of an array.
11.4 Basic Array Example  Basic example of an array.
11.5 Scanning All Elements of an Array  A variation of the for statement. It loops through the indices of an array's existing elements.
11.6 The delete Statement  The delete statement removes an element from an array.
11.7 Using Numbers to Subscript Arrays  How to use numbers as subscripts in awk.
11.8 Using Uninitialized Variables as Subscripts  Using uninitialized variables as subscripts.
11.9 Multi-dimensional Arrays  Emulating multi-dimensional arrays in awk.
11.10 Scanning Multi-dimensional Arrays  Scanning multi-dimensional arrays.
12.1 Calling Built-in Functions  How to call built-in functions.
12.2 Numeric Built-in Functions  Functions that work with numbers, including int, sin, and rand.
12.3 Built-in Functions for String Manipulation  Functions for string manipulation, such as split, match, and sprintf.
12.4 Built-in Functions for Input/Output  Functions for files and shell commands.
12.5 Functions for Dealing with Time Stamps  Functions for dealing with time stamps.
13.1 Function Definition Syntax  How to write definitions and what they mean.
13.2 Function Definition Examples  An example function definition and what it does.
13.3 Calling User-defined Functions  Things to watch out for.
13.4 The return Statement  Specifying the value a function returns.
14.1 Command Line Options  Command line options and their meanings.
14.2 Other Command Line Arguments  Input file names and variable assignments.
14.3 The AWKPATH Environment Variable  Searching directories for awk programs.
14.4 Obsolete Options and/or Features  Obsolete options and/or features.
14.5 Undocumented Options and Features  
14.6 Known Bugs in gawk  
15.1 Simulating gawk-specific Features  What to do if you don't have gawk.
15.2 Implementing nextfile as a Function  Two implementations of a nextfile function.
15.3 Assertions  A function for assertions in awk programs.
15.4 Rounding Numbers  A function for rounding if sprintf does not do it correctly.
15.5 Translating Between Characters and Numbers  Functions for using characters as numbers and vice versa.
15.6 Merging an Array Into a String  A function to join an array into a string.
15.7 Turning Dates Into Timestamps  A function to turn a date into a timestamp.
15.8 Managing the Time of Day  A function to get formatted times.
15.9 Noting Data File Boundaries  A function for handling data file transitions.
15.10 Processing Command Line Options  A function for processing command line arguments.
15.11 Reading the User Database  Functions for getting user information.
15.12 Reading the Group Database  Functions for getting group information.
15.13 Naming Library Function Global Variables  How to best name private global variables in library functions.
16.1 Re-inventing Wheels for Fun and Profit  Clones of common utilities.
16.1.1 Cutting Out Fields and Columns  The cut utility.
16.1.2 Searching for Regular Expressions in Files  The egrep utility.
16.1.3 Printing Out User Information  The id utility.
16.1.4 Splitting a Large File Into Pieces  The split utility.
16.1.5 Duplicating Output Into Multiple Files  The tee utility.
16.1.6 Printing Non-duplicated Lines of Text  The uniq utility.
16.1.7 Counting Things  The wc utility.
16.2 A Grab Bag of awk Programs  Some interesting awk programs.
16.2.1 Finding Duplicated Words in a Document  Finding duplicated words in a document.
16.2.2 An Alarm Clock Program  An alarm clock.
16.2.3 Transliterating Characters  A program similar to the tr utility.
16.2.4 Printing Mailing Labels  Printing mailing labels.
16.2.5 Generating Word Usage Counts  A program to produce a word usage count.
16.2.6 Removing Duplicates from Unsorted Text  Eliminating duplicate entries from a history file.
16.2.7 Extracting Programs from Texinfo Source Files  Pulling out programs from Texinfo source files.
16.2.8 A Simple Stream Editor  
16.2.9 An Easy Way to Use Library Functions  A wrapper for awk that includes files.
17.1 Major Changes between V7 and SVR3.1  The major changes between V7 and System V Release 3.1.
17.2 Changes between SVR3.1 and SVR4  Minor changes between System V Releases 3.1 and 4.
17.3 Changes between SVR4 and POSIX awk  New features from the POSIX standard.
17.4 Extensions in the Bell Laboratories awk  New features from the Bell Laboratories version of awk.
17.5 Extensions in gawk Not in POSIX awk  The extensions in gawk not in POSIX awk.
A.1 Command Line Options Summary  Recapitulation of the command line.
A.2 Language Summary  A terse review of the language.
A.3 Variables and Fields  Variables, fields, and arrays.
A.3.1 Fields  Input field splitting.
A.3.2 Built-in Variables  awk's built-in variables.
A.3.3 Arrays  Using arrays.
A.3.4 Data Types  Values in awk are numbers or strings.
A.4 Patterns  Patterns and Actions, and their component parts.
A.4.1 Pattern Summary  Quick overview of patterns.
A.4.2 Regular Expressions  Quick overview of regular expressions.
A.5 Actions  Quick overview of actions.
A.5.1 Operators  awk operators.
A.5.2 Control Statements  The control statements.
A.5.3 I/O Statements  The I/O statements.
A.5.4 printf Summary  A summary of printf.
A.5.5 Special File Names  Special file names interpreted internally.
A.5.6 Built-in Functions  Built-in numeric and string functions.
A.5.7 Time Functions  Built-in time functions.
A.5.8 String Constants  Escape sequences in strings.
A.6 User-defined Functions  Defining and calling functions.
A.7 Historical Features  Some undocumented but supported "features".
B.1 The gawk Distribution  What is in the gawk distribution.
B.1.1 Getting the gawk Distribution  How to get the distribution.
B.1.2 Extracting the Distribution  How to extract the distribution.
B.1.3 Contents of the gawk Distribution  What is in the distribution.
B.2 Compiling and Installing gawk on Unix  Installing gawk under various versions of Unix.
B.2.1 Compiling gawk for Unix  Compiling gawk under Unix.
B.2.2 The Configuration Process  How it's all supposed to work.
B.3 How to Compile and Install gawk on VMS  Installing gawk on VMS.
B.3.1 Compiling gawk on VMS  How to compile gawk under VMS.
B.3.2 Installing gawk on VMS  How to install gawk under VMS.
B.3.3 Running gawk on VMS  How to run gawk under VMS.
B.3.4 Building and Using gawk on VMS POSIX  Alternate instructions for VMS POSIX.
B.4 MS-DOS and OS/2 Installation and Compilation  Installing and compiling gawk on MS-DOS and OS/2.
B.5 Installing gawk on the Atari ST  
B.5.1 Compiling gawk on the Atari ST  Compiling gawk on the Atari ST.
B.5.2 Running gawk on the Atari ST  Running gawk on the Atari ST.
B.6 Installing gawk on an Amiga  
B.7 Reporting Problems and Bugs  
B.8 Other Freely Available awk Implementations  Other freely available awk implementations.
C.1 Downward Compatibility and Debugging  How to disable certain gawk extensions.
C.2 Making Additions to gawk  Making additions to gawk.
C.2.1 Adding New Features  Adding code to the main body of gawk.
C.2.2 Porting gawk to a New Operating System  Porting gawk to a new operating system.
C.3 Probable Future Extensions  New features that may be implemented one day.
C.4 Suggestions for Improvements  Suggestions for improvements by volunteers.

To Miriam, for making me complete.
To Chana, for the joy you bring us.
To Rivka, for the exponential increase.
To Nachum, for the added dimension.
To Malka, for the new beginning.



This document was generated by root l2-hrz on May 9, 2001 using texi2html.