2.2 Basic Rules

The following rules are used throughout this specification to describe basic parsing constructs. The US-ASCII character set is defined by [18].

OCTET	=	<any 8-bit character>
CHAR	=	<any US-ASCII character (octets 0 - 127)>
UPALPHA	=	<any US-ASCII uppercase letter "A".."Z">
LOALPHA	=	<any US-ASCII lowercase letter "a".."z">
ALPHA	=	UPALPHA | LOALPHA
DIGIT	=	<any US-ASCII digit "0".."9">
CTL	=	<any US-ASCII control character
		(octets 0 - 31) and DEL (127)>
CR	=	<US-ASCII CR, carriage return (13)>
LF	=	<US-ASCII LF, linefeed (10)>
SP	=	<US-ASCII SP, space (32)>
HTAB	=	<US-ASCII HT, horizontal-tab (9)>
<">	=	<US-ASCII double-quote mark>
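
The character classes above map directly onto octet-range checks. The following is a minimal sketch in Python, not part of the specification; the function and constant names are hypothetical, and only the octet ranges come from the rules themselves.

```python
# Octet values for the named single characters in the rules above.
# The names CR, LF, SP, HTAB mirror the grammar; DQUOTE is a hypothetical
# name for the <"> rule.
CR, LF, SP, HTAB, DQUOTE = 13, 10, 32, 9, 34


def is_char(octet: int) -> bool:
    """CHAR: any US-ASCII character (octets 0 - 127)."""
    return 0 <= octet <= 127


def is_upalpha(octet: int) -> bool:
    """UPALPHA: "A".."Z"."""
    return ord("A") <= octet <= ord("Z")


def is_loalpha(octet: int) -> bool:
    """LOALPHA: "a".."z"."""
    return ord("a") <= octet <= ord("z")


def is_alpha(octet: int) -> bool:
    """ALPHA: UPALPHA | LOALPHA."""
    return is_upalpha(octet) or is_loalpha(octet)


def is_digit(octet: int) -> bool:
    """DIGIT: "0".."9"."""
    return ord("0") <= octet <= ord("9")


def is_ctl(octet: int) -> bool:
    """CTL: octets 0 - 31 and DEL (127)."""
    return 0 <= octet <= 31 or octet == 127
```

Note that CR, LF and HTAB are themselves CTLs under these definitions, while SP is not.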

HTTP/1.0 defines the octet sequence CR LF as the end-of-line marker for all protocol elements except the Entity-Body (see Appendix C for tolerant applications). The end-of-line marker for an Entity-Body is defined by its associated media type, as described in Section 8.1.

CRLF	=	CR LF
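
As an illustration, a strict reader could split the non-body part of a message on the CRLF pair alone. This is a sketch under that assumption; `split_protocol_lines` is a hypothetical helper, and it deliberately does not implement the tolerant behaviour mentioned above.

```python
from typing import List

CRLF = b"\r\n"


def split_protocol_lines(head: bytes) -> List[bytes]:
    """Split on the CR LF pair only; a bare CR or LF stays inside its line."""
    lines = head.split(CRLF)
    # A block that ends with CRLF yields one trailing empty element; drop it.
    if lines and lines[-1] == b"":
        lines.pop()
    return lines
```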

HTTP/1.0 headers can be folded onto multiple lines if each continuation line begins with linear whitespace (LWS). All linear whitespace, including folding, has the same semantics as SP.

LWS	=	[CRLF] 1*( SP | HTAB )
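
Because every LWS is equivalent to SP, a parser can normalise a field value before examining it. The sketch below is one possible approach, not a prescribed one: it replaces each LWS occurrence with a single space using a regular expression that mirrors the rule. The name `unfold` is hypothetical.

```python
import re

# LWS = [CRLF] 1*( SP | HTAB ): an optional CRLF followed by one or more
# spaces or horizontal tabs.
_LWS = re.compile(r"(?:\r\n)?[ \t]+")


def unfold(value: str) -> str:
    """Replace each LWS occurrence in a header field value with one SP."""
    return _LWS.sub(" ", value)
```

For example, `unfold("text/html,\r\n\ttext/plain")` yields `"text/html, text/plain"`.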

Many HTTP/1.0 header field values consist of words separated by LWS or special characters. These special characters must be in a quoted string to be used within a parameter value.

word	=	token | quoted-string
token	=	1*<any CHAR except CTLs or tspecials>
tspecials	=	"(" | ")" | "<" | ">" | "@"
	|	"," | ";" | ":" | "\" | <">
	|	"/" | "[" | "]" | "?" | "="
	|	SP | HTAB
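
The token rule can be checked one character at a time. The following sketch assumes the value is already available as a Python string; `TSPECIALS` and `is_token` are hypothetical names, and the set contains exactly the characters listed in the tspecials rule above.

```python
# The tspecials listed above: ( ) < > @ , ; : \ " / [ ] ? = SP HTAB
TSPECIALS = set('()<>@,;:\\"/[]?= \t')


def is_token(s: str) -> bool:
    """token = 1*<any CHAR except CTLs or tspecials>"""
    if not s:
        return False  # 1* requires at least one character
    for ch in s:
        octet = ord(ch)
        if octet > 127:  # not a CHAR
            return False
        if octet <= 31 or octet == 127:  # CTL
            return False
        if ch in TSPECIALS:
            return False
    return True
```

Under this check `"text"` is a token, while `"text/html"` is not, because "/" is a tspecial.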

A string of text is parsed as a single word if it is quoted using double-quote marks or angle brackets.

quoted-string	=	( <"> *(qdtext) <"> )
	|	( "<" *(qatext) ">" )
qdtext	=	<any CHAR except <"> and CTLs,
		but including LWS>
qatext	=	<any CHAR except "<", ">", and CTLs,
		but including LWS>
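
A scanner for the two quoted forms might look like the sketch below. It is illustrative only and makes one simplifying assumption: header folding has already been removed, so the only LWS characters left inside the string are SP and HTAB. The function name and its return convention are hypothetical.

```python
from typing import Tuple


def parse_quoted_string(s: str, pos: int = 0) -> Tuple[str, int]:
    """Parse a quoted-string starting at s[pos].

    Returns (content, index just past the closing delimiter).
    Raises ValueError if the input does not match the rule.
    Assumes folding has already been replaced by SP.
    """
    if pos >= len(s):
        raise ValueError("no quoted-string at end of input")
    opener = s[pos]
    if opener == '"':
        closer, forbidden = '"', '"'      # qdtext excludes <">
    elif opener == "<":
        closer, forbidden = ">", "<>"     # qatext excludes "<" and ">"
    else:
        raise ValueError("expected '\"' or '<'")

    i = pos + 1
    while i < len(s):
        ch = s[i]
        if ch == closer:
            return s[pos + 1:i], i + 1
        octet = ord(ch)
        is_ctl = octet <= 31 or octet == 127
        if octet > 127 or ch in forbidden or (is_ctl and ch != "\t"):
            raise ValueError("invalid character at index %d" % i)
        i += 1
    raise ValueError("unterminated quoted-string")
```

For example, `parse_quoted_string('"hello world"; q=1')` returns `("hello world", 13)`, leaving the caller positioned at the ";".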

The text rule is used only for descriptive field contents. Words of *text may contain characters from character sets other than US-ASCII only when they are encoded according to the rules of RFC 1522 [13].

text	=	<any OCTET except CTLs,
		but including LWS>
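
The text rule differs from token and quoted-string in that it admits any OCTET, including values above 127. The sketch below checks only the grammar, not the RFC 1522 encoding requirement stated in the prose; `is_text` is a hypothetical name, and a CRLF is accepted only where it begins a fold, that is, where SP or HTAB follows it.

```python
import re

# A CRLF counts as part of LWS only when SP or HTAB follows it.
_FOLD = re.compile(rb"\r\n(?=[ \t])")


def is_text(value: bytes) -> bool:
    """text = <any OCTET except CTLs, but including LWS>"""
    unfolded = _FOLD.sub(b"", value)
    # After removing fold CRLFs, HTAB (9) is the only CTL still permitted.
    return all(o == 9 or (o > 31 and o != 127) for o in unfolded)
```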

T. Berners-Lee, R. T. Fielding, H. Frystyk Nielsen - 12 MAR 95
