HTTP Clients
This document describes how to implement an HTTP client, from deciding which method and resource to request, to processing the response and overcoming errors.
Select the HTTP version
Absent prior knowledge, use HTTP/1.1. If this is too high for the HTTP server, it will produce an appropriate error, and the request can be retried.
Select a method
An HTTP request asks the server for information or asks it to change its state. The exact kind of request is specified by selecting an HTTP method:
- GET: Request a representation and metadata of a resource
- HEAD: Request only metadata about a resource
- POST: Have a resource on the server act on or process a client-provided payload
- PUT: Store a resource on the server at a specific URI
- DELETE: Delete a specific resource on the server
- CONNECT: Open a bi-directional communication channel
- OPTIONS: Request a document about communication options, for cases where HEAD is insufficient
- TRACE: Sends back the request message, used for debugging proxies/gateways
Select the request URI
All requests in HTTP contain some sort of request URI which identifies a resource on the server to be acted on. Resource URIs are gathered from several places, typically:
- Another hypermedia document
- A server-provided pattern for generating URIs (such as an HTML form or a URI Template)
- An external source pointing to an entry point, typically the homepage
- A bookmark saved during a previous use
Splitting the request-URI
For historical reasons, HTTP requires the request URI be split up into several component parts to generate the request.
For background, consult the Host header.
Two lines are used to form the effective Request URI: the request-line (the first line in the request), and the Host header.
HTTP allows requests with an absolute-form URI in the request-line; however, few servers support this, and it is typically only used with proxies. That is, a request in absolute-form typically asks the recipient to make a proxied request, rather than an origin request the server can answer authoritatively.
Most requests will be made with the origin-form:
origin-form = absolute-path [ "?" query ]
For example:
GET /request-URI HTTP/1.1
Host: www.example.org
It is not possible to explicitly send the URI scheme (for example, http or https) using the origin-form; however, the server is usually able to detect the scheme being used, for example from whether the connection uses TLS. If your networked application needs to support responding to arbitrary URI schemes, consider supporting the absolute-form of requests in your server.
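As a minimal sketch of the split described above, the following Python uses the standard library's urlsplit to derive the origin-form request-target and the Host header value from an absolute URI. The function names split_request_uri and request_head are hypothetical, and the sketch assumes an http or https URI with an authority component.

```python
from urllib.parse import urlsplit


def split_request_uri(uri):
    """Split an absolute URI into (request-target, Host header value)."""
    parts = urlsplit(uri)
    # origin-form = absolute-path [ "?" query ]; an empty path is sent as "/".
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    # Host carries the authority without any userinfo.
    # The fragment is never sent to the server.
    host = parts.netloc.rpartition("@")[2]
    return target, host


def request_head(method, uri):
    """Format the request-line and Host header for an origin-form request."""
    target, host = split_request_uri(uri)
    return f"{method} {target} HTTP/1.1\r\nHost: {host}\r\n\r\n"
```

Calling request_head("GET", "http://www.example.org/request-URI") produces the two lines of the example above, followed by the blank line that ends the header block.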
Host
The very first header should be the Host header, since it is related to the request-URI and the request-line.
Optional Date header
The Date header is optional for request headers. It can be sent if it would convey useful information to the server.
Connection options
Connection header
Main article: the Connection header.
This is a hop-by-hop header: a header describing the connection details, instead of the message itself. The recipient must consume this header, and any headers named by it, rather than forwarding them.
List acceptable responses
Clients may specify the kinds of responses they'd prefer to receive, along a variety of dimensions:
Accept-Charset
Main article: Accept-Charset.
The client can list the kinds of character sets it supports, so that the server may send a variation the client is known to understand.
This is typically omitted by Web browsers; most servers will use UTF-8 which is suitable for virtually all uses.
Accept-Encoding
Main article: Accept-Encoding.
The client can list the kinds of content-codings it supports, for example, compressed streams.
Most clients will want to support at least gzip and deflate, and send Accept-Encoding: gzip, deflate.
Accept-Language
Main article: Accept-Language.
The client can list the kinds of natural languages the user would prefer to receive.
TE
The client can list the kinds of Transfer-Encodings it allows (besides "chunked").
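The preference headers above share a common shape: a comma-separated list in which each item may carry a weight (a "q" value) between 0 and 1. The following is a small sketch of formatting such a list. The helper name format_preferences is hypothetical, and the weights shown in the usage note are illustrative only.

```python
def format_preferences(preferences):
    """Join (value, weight) pairs into a header value with q-values."""
    items = []
    for value, weight in preferences:
        if not 0 <= weight <= 1:
            raise ValueError("weight must be between 0 and 1")
        if weight == 1:
            # A weight of 1 is the default and can be omitted.
            items.append(value)
        else:
            # q-values carry at most three decimal places.
            q = f"{weight:.3f}".rstrip("0").rstrip(".")
            items.append(f"{value};q={q}")
    return ", ".join(items)
```

For example, format_preferences([("en-GB", 1), ("en", 0.8)]) could be used as the value of an Accept-Language header.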
Authenticating
If the server requires authentication, the client may add headers to authenticate itself to the server. Authentication may also be performed at the transport layer, for example with TLS certificates or Unix domain sockets.
Proxy Authorization
Intermediate proxies may require their own authentication. Many of the HTTP headers used to authenticate to destination servers have variants for authenticating to proxies; the proxy accepts and removes such a header before forwarding the request to the destination server.
Page navigation data
Referer
The Referer header (sic, a misspelling of "Referrer" that happens to save a byte over the wire) specifies the resource where the user-agent found and followed the request-URI for the current request.
Cookie
If the client chooses to, it may send state information to the server in a Cookie header, if the server has provided any in a Set-Cookie header of a previous response.
The formatting of the Cookie header differs from that of other HTTP headers, so it must be treated as a special case in code.
Be very careful when choosing to relay cookies: this makes applications susceptible to ambient authority (confused deputy) attacks if not carefully designed.
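A rough sketch of building the Cookie header follows. It assumes a hypothetical in-memory store, jars, that maps a host name to the name/value pairs that host previously set, and it only releases cookies to the exact host that set them. Real cookie stores also match on domain, path, and the Secure attribute, which this sketch leaves out.

```python
def cookie_header(jars, host):
    """Return the Cookie header value for a host, or None if nothing is stored."""
    jar = jars.get(host.lower(), {})
    if not jar:
        return None
    # Pairs are separated by "; " rather than the "," used by list-valued headers.
    return "; ".join(f"{name}={value}" for name, value in jar.items())
```

Keying the store by host is one way to avoid relaying a cookie to a server that never issued it.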
User agent information
User-Agent
The User-Agent header describes for the server the software the client is running, for logging, analytics, or other purposes.
Sometimes servers use this header to determine what behavior the client supports. In these cases, you may need to modify the header to cajole the server into doing the right thing, but avoid doing this unless absolutely necessary.
From
The From header field contains an Internet email address for a human user who controls the requesting user agent.
The header is not intended for typical usage. Use it when running an automated script or bot, so server owners can contact the owner in the event of misbehavior.
Related headers
Authorization/authentication headers and cookies can also be used to identify the person making the request; see those respective sections for more information.
Conditional Request Headers
In some cases, the client may want the server to only conditionally evaluate the request, for example, modify a resource only if it is unchanged on the server, or download the resource only if it has changed since the last download. In these cases, use conditional headers.
If-Match
If-None-Match
If-Modified-Since
If-Unmodified-Since
If-Range
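The two use cases described above map onto these headers roughly as sketched below. The function conditional_headers and its parameters are hypothetical. It assumes the client kept the ETag and Last-Modified values from an earlier response, and it prefers the entity tag when both are available for an update.

```python
def conditional_headers(etag=None, last_modified=None, for_update=False):
    """Build conditional request headers from previously stored validators."""
    headers = {}
    if for_update:
        # Modify the resource only if it is unchanged on the server.
        if etag:
            headers["If-Match"] = etag
        elif last_modified:
            headers["If-Unmodified-Since"] = last_modified
    else:
        # Download the resource only if it has changed since the last download.
        if etag:
            headers["If-None-Match"] = etag
        if last_modified:
            headers["If-Modified-Since"] = last_modified
    return headers
```

A GET sent with the download variant typically yields 304 (Not Modified) when the stored copy is still current, and a PUT sent with the update variant typically yields 412 (Precondition Failed) when the resource has changed.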
Encode the payload if any
Requests can have an attached document called the request message body. The meaning of this document varies with the definition of the method and the resource saved on the server. If a document is submitted, several headers control how to read the body and how to interpret it.
The specific meaning varies with the request method:
- GET: Undefined
- HEAD: Undefined
- POST: Read by the server, frequently used to create a new resource
- PUT: The document to be saved to the server
- DELETE: Undefined
- CONNECT: Undefined
- OPTIONS: Undefined
- TRACE: Not allowed
The request body is typically only used for the PUT and POST methods.
Content-Type
Select a media type that the server will likely understand. The Content-Type header specifies the media type of the document, and therefore what sort of data can be extracted out of it.
Content-Encoding
The client can additionally encode the document before attaching it into the message body using a content coding; for example, compressing or encrypting the document. The Content-Encoding header specifies how the server will need to decode the body in order to arrive at the final document.
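To illustrate how Content-Type and Content-Encoding work together, here is a minimal sketch that serializes a document as JSON and then compresses it with gzip. The choice of JSON and gzip, and the name encode_payload, are assumptions made for the example. Whether a given server accepts compressed request bodies is not something the client can assume.

```python
import gzip
import json


def encode_payload(document):
    """Serialize a document as JSON, gzip it, and describe it with headers."""
    raw = json.dumps(document).encode("utf-8")
    body = gzip.compress(raw)
    headers = {
        # Media type of the document before any content coding was applied.
        "Content-Type": "application/json",
        # The coding the server must remove to recover the document.
        "Content-Encoding": "gzip",
        # Length of the bytes actually sent, that is, after compression.
        "Content-Length": str(len(body)),
    }
    return headers, body
```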
Content-Language
If the document is known to have a specific language, it can be conveyed in the Content-Language header.
Content-Location
If the document was previously retrieved from a particular URI, that URI can be provided in the Content-Location header.
Set Expect header
If the client has reason to believe the upload may be too large for the server to accept, it can send the request headers with an Expect header and wait for the server to confirm before sending the body:
PUT /somewhere/fun HTTP/1.1
Host: origin.example.com
Content-Type: video/h264
Content-Length: 1234567890987
Expect: 100-continue
␍␊
Clients that send this must attach a request-body.
Clients should send the request body after a brief period of time without a response from the server (about a second), in the event of misimplementation or network problems.
If the client receives 417 Expectation Failed instead of 100 Continue, the server (or request path) does not support 100 Continue, and the request should be retried without using the 100-continue feature (remake the request attaching the entire request-body to the request).
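The decision logic above can be sketched as a small function. The name next_step, the returned labels, and the exact timeout constant are hypothetical. The sketch assumes the caller passes the status code of the first response received so far (or None if nothing has arrived) and how long it has been waiting.

```python
EXPECT_TIMEOUT_SECONDS = 1.0  # "about a second", per the text above


def next_step(status, waited_seconds):
    """Decide what to do after sending headers with Expect: 100-continue."""
    if status is None:
        # No response yet: keep waiting briefly, then send the body anyway.
        if waited_seconds >= EXPECT_TIMEOUT_SECONDS:
            return "send-body"
        return "wait"
    if status == 100:
        return "send-body"
    if status == 417:
        # Remake the request with the whole body and no Expect header.
        return "retry-without-expect"
    # Any other status is a final response; the body should not be sent.
    return "final-response"
```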
Look up the request in the cache
If a similar request has been made before, the response might be served from a cache.
Several headers determine whether a response is cacheable:
- Age
- Cache-Control
- Expires
- Pragma
- Warning
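As a deliberately simplified sketch of how two of these headers interact, the following checks whether a stored response is still fresh using only Cache-Control and Age. The function names and the resident_seconds parameter (how long the response has sat in the local cache) are hypothetical. The sketch ignores Expires, Pragma, heuristic freshness, and quoted directive values that contain commas.

```python
def parse_cache_control(value):
    """Parse a Cache-Control value into a dict of lower-cased directives."""
    directives = {}
    for item in value.split(","):
        name, _, arg = item.strip().partition("=")
        if name:
            directives[name.lower()] = arg.strip('"') or None
    return directives


def is_fresh(headers, resident_seconds=0):
    """Return True if a stored response may be reused without revalidation."""
    directives = parse_cache_control(headers.get("Cache-Control", ""))
    if "no-store" in directives or "no-cache" in directives:
        return False
    max_age = directives.get("max-age")
    if max_age is None or not max_age.isdigit():
        # Without an explicit lifetime this sketch treats the response as stale.
        return False
    # Age reported by upstream caches plus the time spent in the local cache.
    current_age = int(headers.get("Age", "0")) + resident_seconds
    return current_age < int(max_age)
```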
Parsing the Response
At this point we wait for the response from the server.
Parse zero or more 1xx responses
HTTP/1.1 and above define the 1xx status code class: informational (interim) responses sent before the final response, which can convey information to the client before the final response has been generated or decided on.
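One way to handle this is to read response heads in a loop until a non-1xx status arrives. The sketch below assumes stream is a binary file-like object positioned at the start of a response (for example, the result of socket.makefile("rb")). The names read_head and read_final_head are hypothetical. It treats 101 (Switching Protocols) as final, since the connection stops speaking HTTP/1.1 after it, and it does no validation of malformed input.

```python
def read_head(stream):
    """Read one status line and header block; return (status, headers)."""
    status_line = stream.readline().decode("iso-8859-1").rstrip("\r\n")
    _version, status, *_reason = status_line.split(" ", 2)
    headers = []
    while True:
        line = stream.readline().decode("iso-8859-1").rstrip("\r\n")
        if not line:
            break  # a blank line ends the header block
        name, _, value = line.partition(":")
        headers.append((name.strip(), value.strip()))
    return int(status), headers


def read_final_head(stream):
    """Collect zero or more 1xx responses, then return the final one."""
    interim = []
    while True:
        status, headers = read_head(stream)
        if 100 <= status < 200 and status != 101:
            interim.append((status, headers))
            continue
        return status, headers, interim
```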
Handle status code
How to handle the response is determined by the status code. In most cases, the server will respond 200 (OK), indicating the response may be used as expected. Other responses may necessitate reporting an error, retrying after some time, or changing and re-issuing the request to the server.
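A coarse sketch of this dispatch is shown below. The function name, the returned labels, and the particular grouping of status codes are assumptions made for illustration. A real client would usually also consult headers such as Location and Retry-After.

```python
def classify_status(status):
    """Map a final status code to a coarse client action."""
    if 200 <= status < 300:
        return "use-response"
    if status == 304:
        return "use-cached-copy"
    if status in (301, 302, 303, 307, 308):
        return "follow-redirect"
    if status in (408, 429, 503):
        # Conditions that are often temporary; retry after some time.
        return "retry-later"
    if 400 <= status < 500:
        # The request itself needs to change before it is re-issued.
        return "fix-request-or-report"
    if 500 <= status < 600:
        return "report-server-error"
    return "unknown"
```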
Compute the Caching Key for response
A caching key can be computed to determine if another future request can be served from a cache, instead of the origin server.
The key is derived by an algorithm that takes into account the effective request-URI (see above) and the request headers named in the Vary header of the response.
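A minimal sketch of such a key follows. The function cache_key and the tuple layout are hypothetical, and the header value normalization is simplified to trimming whitespace. It assumes request_headers is a dict of the headers that were sent, and vary is the raw value of the response's Vary header.

```python
def cache_key(method, effective_uri, request_headers, vary=""):
    """Build a hashable cache key, or None if the response must not be reused."""
    names = sorted({n.strip().lower() for n in vary.split(",") if n.strip()})
    if "*" in names:
        # "Vary: *" means no later request can be assumed to match.
        return None
    lowered = {k.lower(): v.strip() for k, v in request_headers.items()}
    # Record the value each selecting header had on the original request.
    selected = tuple((name, lowered.get(name, "")) for name in names)
    return (method.upper(), effective_uri, selected)
```

Two requests for the same URI that differ in a header named by Vary then produce different keys, so a cache keeps one stored response per variant.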
Parse Set-Cookie header
The Set-Cookie header follows a nonstandard syntax: multiple Set-Cookie headers cannot be combined into a single comma-separated header the way other repeated headers can. Even if you don't intend to store cookies, note that the Set-Cookie header requires special processing.
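The sketch below parses a single Set-Cookie value by splitting on semicolons only. The function name parse_set_cookie is hypothetical, and the sketch performs none of the validation a real cookie store needs. The Expires attribute in the usage note shows why comma-based splitting fails: its date contains a comma.

```python
def parse_set_cookie(value):
    """Split one Set-Cookie header value into (name, value, attributes)."""
    pair, *attrs = value.split(";")
    name, _, cookie_value = pair.strip().partition("=")
    attributes = {}
    for attr in attrs:
        key, _, attr_value = attr.strip().partition("=")
        if key:
            # Attribute names are case-insensitive; flags have an empty value.
            attributes[key.lower()] = attr_value
    return name, cookie_value, attributes
```

For example, parse_set_cookie("sid=abc123; Expires=Wed, 21 Oct 2015 07:28:00 GMT; Path=/; Secure") keeps the full date intact under the "expires" key.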
Store Session Data
The contents of the Set-Cookie header can be stored for subsequent requests.
Determine if there is a response-body
If there's a response-body, decode Content-Encoding
If one or more encodings have been applied to a representation, the sender that applied the encodings MUST generate a Content-Encoding header field that lists the content codings in the order in which they were applied. To recover the representation, the client removes the codings in the reverse of the listed order.
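Because the codings are listed in the order they were applied, a client undoes them from last to first. The following sketch handles gzip and deflate with the Python standard library. The name decode_body and the DECODERS table are hypothetical, and the sketch assumes "deflate" bodies use the zlib wrapper, which some servers omit in practice.

```python
import gzip
import zlib

# Content codings this sketch knows how to remove.
DECODERS = {
    "gzip": gzip.decompress,
    "x-gzip": gzip.decompress,
    "deflate": zlib.decompress,
    "identity": lambda data: data,
}


def decode_body(body, content_encoding):
    """Undo the listed content codings, last applied first."""
    codings = [c.strip().lower() for c in content_encoding.split(",") if c.strip()]
    for coding in reversed(codings):
        if coding not in DECODERS:
            raise ValueError(f"unsupported content coding: {coding}")
        body = DECODERS[coding](body)
    return body
```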
Read language of response
HTTP allows the sender to identify the primary language of the entity body (request body or response body) using the Content-Language header.
Determine Content-Type
The Content-Type header determines which media type will be used to decode the entity-body.
Generally for every media type, there is a single specification that specifies how to parse and use the document.
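As a small sketch, the media type and its charset parameter can be extracted with the standard library's email.message.Message, which parses this header syntax. The helper names are hypothetical, and falling back to UTF-8 when no charset is given is an assumption of the example, not a rule that applies to every media type.

```python
from email.message import Message


def parse_content_type(value):
    """Return (media type, charset or None) from a Content-Type value."""
    msg = Message()
    msg["Content-Type"] = value
    # Both results are lower-cased by the email package.
    return msg.get_content_type(), msg.get_content_charset()


def decode_text(body, content_type, default_charset="utf-8"):
    """Decode a textual body using the charset named in Content-Type."""
    _media_type, charset = parse_content_type(content_type)
    return body.decode(charset or default_charset)
```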
Read response body
The presence of a message body in a request is signaled by a Content-Length or Transfer-Encoding header field.
The number of bytes to read depends on the request method, the response code, and headers present:
- If the request is a HEAD request, there is no response body. Response length information, if present, describes how long the response to an otherwise identical GET request would have been.
- If the status code is 1xx, 204 (No Content), or 304 (Not Modified), there is no response body.
- A successful (2xx) response to a CONNECT request turns the connection into a tunnel immediately after the response headers; there is no concept of a response body.
- If the outermost applied Transfer-Encoding (if any) is "chunked", the response body ends when the chunked parsing does.
- If there is any other Transfer-Encoding, the message can only be ended when the server closes the connection.
- If there is a single Content-Length header (or multiple identical values), then read that many bytes.
- If this is an HTTP/1.0 request, or if the response specifies Connection: close, read until the server closes the connection. At this step, there is no way to distinguish a connection error from the end of the document.
- Otherwise, raise an error that there is no reliable way to determine when the response ends.
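The rules above can be sketched as a single decision function. The name body_framing, its return labels, and the header representation (a dict mapping lower-cased header names to lists of values) are assumptions of this example. It also treats 1xx, 204, and 304 responses as having no body, as HTTP/1.1 specifies.

```python
def header_tokens(headers, name):
    """Flatten repeated, comma-separated header values into lower-cased tokens."""
    return [t.strip().lower()
            for value in headers.get(name, [])
            for t in value.split(",") if t.strip()]


def body_framing(method, status, headers, http_version="1.1"):
    """Return (kind, length), where kind says how the response body ends."""
    if method == "HEAD" or 100 <= status < 200 or status in (204, 304):
        return ("none", None)
    if method == "CONNECT" and 200 <= status < 300:
        return ("tunnel", None)
    codings = header_tokens(headers, "transfer-encoding")
    if codings:
        # The last listed coding is the outermost one.
        return ("chunked", None) if codings[-1] == "chunked" else ("until-close", None)
    lengths = set(header_tokens(headers, "content-length"))
    if len(lengths) > 1:
        raise ValueError("conflicting Content-Length values")
    if lengths:
        (length,) = lengths
        if not length.isdigit():
            raise ValueError("invalid Content-Length")
        return ("length", int(length))
    if http_version == "1.0" or "close" in header_tokens(headers, "connection"):
        return ("until-close", None)
    raise ValueError("no reliable way to determine when the response ends")
```

The caller would then read nothing, hand the socket over as a tunnel, run a chunked decoder, read exactly length bytes, or read until the connection closes, depending on the returned kind.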