Python request module is a simple and elegant Python HTTP library. It provides methods for accessing Web resources via HTTP. In the following article, we will use the HTTP GET method in the Request module. This method requests data from the server and the Exception handling comes in handy when the response is not successful. Here, we will go through such situations. We will use Python’s try and except functionality to explore the exceptions that arise from the Requests module.
- url: Returns the URL of the response
- raise_for_status(): If an error occur, this method returns a HTTPError object
- request: Returns the request object that requested this response
- status_code: Returns a number that indicates the status (200 is OK, 404 is Not Found)
Successful Connection Request
The first thing to know is that the response code is 200 if the request is successful.
Python3
Output:
200
Exception Handling for HTTP Errors
Here, we tried the following URL sequence and then passed this variable to the Python requests module using raised_for_status(). If the try part is successful, we will get the response code 200, if the page that we requested doesn’t exist. This is an HTTP error, which was handled by the Request module’s exception HTTPError and you probably got the error 404.
Python3
import
requests
try
:
r
=
requests.get(url, timeout
=
1
)
r.raise_for_status()
except
requests.exceptions.HTTPError as errh:
print
(
"HTTP Error"
)
print
(errh.args[
0
])
print
(r)
Output:
HTTP Error 404 Client Error: Not Found for url: https://www.amazon.com/nothing_here <Response [404]>
General Exception Handling
You could also use a general exception from the Request module. That is requests.exceptions.RequestException.
Python3
try
:
r
=
requests.get(url, timeout
=
1
)
r.raise_for_status()
except
requests.exceptions.RequestException as errex:
print
(
"Exception request"
)
Output:
Exception request
Now, you may have noticed that there is an argument ‘timeout’ passed into the Request module. We could prescribe a time limit for the requested connection to respond. If this has not happened, we could catch that using the exception requests.exceptions.ReadTimeout. To demonstrate this let us find a website that responds successfully.
Python3
import
requests
try
:
r
=
requests.get(url, timeout
=
1
)
r.raise_for_status()
except
requests.exceptions.ReadTimeout as errrt:
print
(
"Time out"
)
print
(r)
Output:
<Response [200]>
If we change timeout = 0.01, the same code would return, because the request could not possibly be that fast.
Time out <Response [200]>
Exception Handling for Missing Schema
Another common error is that we might not specify HTTPS or HTTP in the URL. For example, We cause use requests.exceptions.MissingSchema to catch this exception.
Python3
url
=
"www.google.com"
try
:
r
=
requests.get(url, timeout
=
1
)
r.raise_for_status()
except
requests.exceptions.MissingSchema as errmiss:
print
(
"Missing schema: include http or https"
)
except
requests.exceptions.ReadTimeout as errrt:
print
(
"Time out"
)
Output:
Missing scheme: include http or https
Exception Handling for Connection Error
Let us say that there is a site that doesn’t exist. Here, the error will occur even when you can’t make a connection because of the lack of an internet connection
Python3
try
:
r
=
requests.get(url, timeout
=
1
, verify
=
True
)
r.raise_for_status()
except
requests.exceptions.HTTPError as errh:
print
(
"HTTP Error"
)
print
(errh.args[
0
])
except
requests.exceptions.ReadTimeout as errrt:
print
(
"Time out"
)
except
requests.exceptions.ConnectionError as conerr:
print
(
"Connection error"
)
Output:
Connection error
Putting Everything Together
Here, We put together everything we tried so far the idea is that the exceptions are handled according to the specificity.
For example, url = “https://www.gle.com”, When this code is run for this URL will produce an Exception request. Whereas, In the absence of connection requests.exceptions.ConnectionError will print the Connection Error, and when the connection is not made the general exception is handled by requests.exceptions.RequestException.
Python3
try
:
r
=
requests.get(url, timeout
=
1
, verify
=
True
)
r.raise_for_status()
except
requests.exceptions.HTTPError as errh:
print
(
"HTTP Error"
)
print
(errh.args[
0
])
except
requests.exceptions.ReadTimeout as errrt:
print
(
"Time out"
)
except
requests.exceptions.ConnectionError as conerr:
print
(
"Connection error"
)
except
requests.exceptions.RequestException as errex:
print
(
"Exception request"
)
Output:
Note: The output may change according to requests.
Time out
Содержание
- Developer Interface¶
- Main Interface¶
- Exceptions¶
- Request Sessions¶
- Lower-Level Classes¶
- Lower-Lower-Level Classes¶
- Authentication¶
- Encodings¶
- Cookies¶
- Status Code Lookup¶
- Migrating to 1.x¶
- API Changes¶
- Licensing¶
- Migrating to 2.x¶
- API Changes¶
- Behavioural Changes¶
- Stay Informed
Developer Interface¶
This part of the documentation covers all the interfaces of Requests. For parts where Requests depends on external libraries, we document the most important right here and provide links to the canonical documentation.
Main Interface¶
All of Requests’ functionality can be accessed by these 7 methods. They all return an instance of the Response object.
Constructs and sends a Request .
Sends a HEAD request.
Parameters: |
|
---|---|
Returns: |
Parameters: |
|
---|---|
Returns: |
requests. get ( url, params=None, **kwargs ) [source] ¶
Sends a GET request.
Parameters: |
|
---|---|
Returns: |
requests. post ( url, data=None, json=None, **kwargs ) [source] ¶
Sends a POST request.
Parameters: |
|
---|---|
Returns: |
requests. put ( url, data=None, **kwargs ) [source] ¶
Sends a PUT request.
Parameters: |
|
---|---|
Returns: |
requests. patch ( url, data=None, **kwargs ) [source] ¶
Sends a PATCH request.
Parameters: |
|
---|---|
Returns: |
requests. delete ( url, **kwargs ) [source] ¶
Sends a DELETE request.
Exceptions¶
There was an ambiguous exception that occurred while handling your request.
exception requests. ConnectionError ( *args, **kwargs ) [source] ¶
A Connection error occurred.
exception requests. HTTPError ( *args, **kwargs ) [source] ¶
An HTTP error occurred.
exception requests. URLRequired ( *args, **kwargs ) [source] ¶
A valid URL is required to make a request.
exception requests. TooManyRedirects ( *args, **kwargs ) [source] ¶
Too many redirects.
exception requests. ConnectTimeout ( *args, **kwargs ) [source] ¶
The request timed out while trying to connect to the remote server.
Requests that produced this error are safe to retry.
exception requests. ReadTimeout ( *args, **kwargs ) [source] ¶
The server did not send any data in the allotted amount of time.
The request timed out.
Catching this error will catch both ConnectTimeout and ReadTimeout errors.
Request Sessions¶
A Requests session.
Provides cookie persistence, connection-pooling, and configuration.
Or as a context manager:
Default Authentication tuple or object to attach to Request .
SSL client certificate default, if String, path to ssl client cert file (.pem). If Tuple, (‘cert’, ‘key’) pair.
Closes all adapters and as such the session
A CookieJar containing all currently outstanding cookies set on this session. By default it is a RequestsCookieJar , but may be any other cookielib.CookieJar compatible object.
Sends a DELETE request. Returns Response object.
Parameters: |
|
---|---|
Returns: |
Parameters: |
|
---|---|
Return type: |
get ( url, **kwargs ) [source] ¶
Sends a GET request. Returns Response object.
Parameters: |
|
---|---|
Return type: |
get_adapter ( url ) [source] ¶
Returns the appropriate connection adapter for the given URL.
Return type: | requests.adapters.BaseAdapter |
---|
get_redirect_target ( resp ) ¶
Receives a Response. Returns a redirect URI or None
Sends a HEAD request. Returns Response object.
Parameters: |
|
---|---|
Return type: |
headers = None¶
A case-insensitive dictionary of headers to be sent on each Request sent from this Session .
Maximum number of redirects allowed. If the request exceeds this limit, a TooManyRedirects exception is raised. This defaults to requests.models.DEFAULT_REDIRECT_LIMIT, which is 30.
Check the environment and merge it with some settings.
Return type: | dict |
---|
mount ( prefix, adapter ) [source] ¶
Registers a connection adapter to a prefix.
Adapters are sorted in descending order by prefix length.
Sends a OPTIONS request. Returns Response object.
Parameters: |
|
---|---|
Return type: |
params = None¶
Dictionary of querystring data to attach to each Request . The dictionary values may be lists for representing multivalued query parameters.
Sends a PATCH request. Returns Response object.
Parameters: |
|
---|---|
Return type: |
post ( url, data=None, json=None, **kwargs ) [source] ¶
Sends a POST request. Returns Response object.
Parameters: |
|
---|---|
Return type: |
prepare_request ( request ) [source] ¶
Constructs a PreparedRequest for transmission and returns it. The PreparedRequest has settings merged from the Request instance and those of the Session .
Parameters: | request – Request instance to prepare with this session’s settings. |
---|---|
Return type: | requests.PreparedRequest |
proxies = None¶
Dictionary mapping protocol or protocol and host to the URL of the proxy (e.g. <‘http’: ‘foo.bar:3128’, ‘http://host.name’: ‘foo.bar:4012’>) to be used on each Request .
Sends a PUT request. Returns Response object.
Parameters: |
|
---|---|
Return type: |
rebuild_auth ( prepared_request, response ) ¶
When being redirected we may want to strip authentication from the request to avoid leaking credentials. This method intelligently removes and reapplies authentication where possible to avoid credential loss.
rebuild_method ( prepared_request, response ) ¶
When being redirected we may want to change the method of the request based on certain specs or browser behavior.
rebuild_proxies ( prepared_request, proxies ) ¶
This method re-evaluates the proxy configuration by considering the environment variables. If we are redirected to a URL covered by NO_PROXY, we strip the proxy configuration. Otherwise, we set missing proxy keys for this URL (in case they were stripped by a previous redirect).
This method also replaces the Proxy-Authorization header where necessary.
Return type: | dict |
---|
request ( method, url, params=None, data=None, headers=None, cookies=None, files=None, auth=None, timeout=None, allow_redirects=True, proxies=None, hooks=None, stream=None, verify=None, cert=None, json=None ) [source] ¶
Constructs a Request , prepares it and sends it. Returns Response object.
Parameters: |
|
---|---|
Return type: |
resolve_redirects ( resp, req, stream=False, timeout=None, verify=True, cert=None, proxies=None, yield_requests=False, **adapter_kwargs ) ¶
Receives a Response. Returns a generator of Responses or Requests.
Send a given PreparedRequest.
Return type: | requests.Response |
---|
should_strip_auth ( old_url, new_url ) ¶
Decide whether Authorization header should be removed when redirecting
Stream response content default.
Trust environment settings for proxy configuration, default authentication and similar.
SSL Verification default.
Lower-Level Classes¶
A user-created Request object.
Used to prepare a PreparedRequest , which is sent to the server.
Parameters: |
|
---|
Deregister a previously registered hook. Returns True if the hook existed, False if not.
Constructs a PreparedRequest for transmission and returns it.
Properly register a hook.
class requests. Response [source] ¶
The Response object, which contains a server’s response to an HTTP request.
The apparent encoding, provided by the chardet library.
Releases the connection back to the pool. Once this method has been called the underlying raw object must not be accessed again.
Note: Should not normally need to be called explicitly.
Content of the response, in bytes.
A CookieJar of Cookies the server sent back.
The amount of time elapsed between sending the request and the arrival of the response (as a timedelta). This property specifically measures the time taken between sending the first byte of the request and finishing parsing the headers. It is therefore unaffected by consuming the response content or the value of the stream keyword argument.
Encoding to decode with when accessing r.text.
Case-insensitive Dictionary of Response Headers. For example, headers[‘content-encoding’] will return the value of a ‘Content-Encoding’ response header.
A list of Response objects from the history of the Request. Any redirect responses will end up here. The list is sorted from the oldest to the most recent request.
True if this Response one of the permanent versions of redirect.
True if this Response is a well-formed HTTP redirect that could have been processed automatically (by Session.resolve_redirects ).
iter_content ( chunk_size=1, decode_unicode=False ) [source] ¶
Iterates over the response data. When stream=True is set on the request, this avoids reading the content at once into memory for large responses. The chunk size is the number of bytes it should read into memory. This is not necessarily the length of each item returned as decoding can take place.
chunk_size must be of type int or None. A value of None will function differently depending on the value of stream . stream=True will read data as it arrives in whatever size the chunks are received. If stream=False, data is returned as a single chunk.
If decode_unicode is True, content will be decoded using the best available encoding based on the response.
iter_lines ( chunk_size=512, decode_unicode=False, delimiter=None ) [source] ¶
Iterates over the response data, one line at a time. When stream=True is set on the request, this avoids reading the content at once into memory for large responses.
This method is not reentrant safe.
Returns the json-encoded content of a response, if any.
Parameters: | **kwargs – Optional arguments that json.loads takes. |
---|---|
Raises: | ValueError – If the response body does not contain valid json. |
links ¶
Returns the parsed header links of the response, if any.
Returns a PreparedRequest for the next request in a redirect chain, if there is one.
Returns True if status_code is less than 400, False if not.
This attribute checks if the status code of the response is between 400 and 600 to see if there was a client error or a server error. If the status code is between 200 and 400, this will return True. This is not a check to see if the response code is 200 OK .
Raises stored HTTPError , if one occurred.
Textual reason of responded HTTP Status, e.g. “Not Found” or “OK”.
The PreparedRequest object to which this is a response.
Integer Code of responded HTTP Status, e.g. 404 or 200.
Content of the response, in unicode.
If Response.encoding is None, encoding will be guessed using chardet .
The encoding of the response content is determined based solely on HTTP headers, following RFC 2616 to the letter. If you can take advantage of non-HTTP knowledge to make a better guess at the encoding, you should set r.encoding appropriately before accessing this property.
Final URL location of Response.
Lower-Lower-Level Classes¶
The fully mutable PreparedRequest object, containing the exact bytes that will be sent to the server.
Generated from either a Request object or manually.
request body to send to the server.
deregister_hook ( event, hook ) ¶
Deregister a previously registered hook. Returns True if the hook existed, False if not.
dictionary of HTTP headers.
dictionary of callback hooks, for internal usage.
HTTP verb to send to the server.
Build the path URL to use.
Prepares the entire request with the given parameters.
Prepares the given HTTP auth data.
Prepares the given HTTP body data.
Prepare Content-Length header based on request method and body
Prepares the given HTTP cookie data.
This function eventually generates a Cookie header from the given cookies using cookielib. Due to cookielib’s design, the header will not be regenerated if it already exists, meaning this function can only be called once for the life of the PreparedRequest object. Any subsequent calls to prepare_cookies will have no actual effect, unless the “Cookie” header is removed beforehand.
Prepares the given HTTP headers.
Prepares the given hooks.
Prepares the given HTTP method.
Prepares the given HTTP URL.
Properly register a hook.
HTTP URL to send the request to.
class requests.adapters. BaseAdapter [source] ¶
The Base Transport Adapter
Cleans up adapter specific items.
Sends PreparedRequest object. Returns Response object.
Parameters: |
|
---|
class requests.adapters. HTTPAdapter ( pool_connections=10, pool_maxsize=10, max_retries=0, pool_block=False ) [source]¶
The built-in HTTP Adapter for urllib3.
Provides a general-case interface for Requests sessions to contact HTTP and HTTPS urls by implementing the Transport Adapter interface. This class will usually be created by the Session class under the covers.
Parameters: |
|
---|
Add any headers needed by the connection. As of v2.0 this does nothing by default, but is left for overriding by users that subclass the HTTPAdapter .
This should not be called from user code, and is only exposed for use when subclassing the HTTPAdapter .
Parameters: |
|
---|
build_response ( req, resp ) [source]¶
Builds a Response object from a urllib3 response. This should not be called from user code, and is only exposed for use when subclassing the HTTPAdapter
Parameters: |
|
---|---|
Return type: |
cert_verify ( conn, url, verify, cert ) [source] ¶
Verify a SSL certificate. This method should not be called from user code, and is only exposed for use when subclassing the HTTPAdapter .
Parameters: |
|
---|
close ( ) [source]¶
Disposes of any internal state.
Currently, this closes the PoolManager and any active ProxyManager, which closes any pooled connections.
Returns a urllib3 connection for the given URL. This should not be called from user code, and is only exposed for use when subclassing the HTTPAdapter .
Parameters: |
|
---|---|
Return type: |
init_poolmanager ( connections, maxsize, block=False, **pool_kwargs ) [source] ¶
Initializes a urllib3 PoolManager.
This method should not be called from user code, and is only exposed for use when subclassing the HTTPAdapter .
Parameters: |
|
---|
proxy_headers ( proxy ) [source]¶
Returns a dictionary of the headers to add to any request sent through a proxy. This works with urllib3 magic to ensure that they are correctly sent to the proxy, rather than in a tunnelled request if CONNECT is being used.
This should not be called from user code, and is only exposed for use when subclassing the HTTPAdapter .
Parameters: | proxy – The url of the proxy being used for this request. |
---|---|
Return type: | dict |
proxy_manager_for ( proxy, **proxy_kwargs ) [source] ¶
Return urllib3 ProxyManager for the given proxy.
This method should not be called from user code, and is only exposed for use when subclassing the HTTPAdapter .
Parameters: |
|
---|---|
Returns: |
request_url ( request, proxies ) [source] ¶
Obtain the url to use when making the final request.
If the message is being sent through a HTTP proxy, the full URL has to be used. Otherwise, we should only use the path portion of the URL.
This should not be called from user code, and is only exposed for use when subclassing the HTTPAdapter .
Parameters: |
|
---|---|
Return type: |
send ( request, stream=False, timeout=None, verify=True, cert=None, proxies=None ) [source] ¶
Sends PreparedRequest object. Returns Response object.
Authentication¶
Base class that all auth implementations derive from
class requests.auth. HTTPBasicAuth ( username, password ) [source] ¶
Attaches HTTP Basic Authentication to the given Request object.
class requests.auth. HTTPDigestAuth ( username, password ) [source] ¶
Attaches HTTP Digest Authentication to the given Request object.
Encodings¶
Returns encodings from given content string.
Parameters: |
|
---|---|
Return type: |
Parameters: | content – bytestring to extract encodings from. |
---|
requests.utils. get_encoding_from_headers ( headers ) [source] ¶
Returns encodings from given HTTP Header Dict.
Parameters: | headers – dictionary to extract encoding from. |
---|---|
Return type: | str |
requests.utils. get_unicode_from_response ( r ) [source] ¶
Returns the requested content back in unicode.
Parameters: | r – Response object to get unicode content from. |
---|
- charset from content-type
- fall back and replace all unicode characters
Return type: | str |
---|
Cookies¶
Returns a key/value dictionary from a CookieJar.
Parameters: | cj – CookieJar object to extract cookies from. |
---|---|
Return type: | dict |
requests.utils. add_dict_to_cookiejar ( cj, cookie_dict ) [source] ¶
Returns a CookieJar from a key/value dictionary.
Parameters: |
|
---|---|
Return type: |
requests.cookies. cookiejar_from_dict ( cookie_dict, cookiejar=None, overwrite=True ) [source] ¶
Returns a CookieJar from a key/value dictionary.
Parameters: |
|
---|---|
Return type: |
class requests.cookies. RequestsCookieJar ( policy=None ) [source] ¶
Compatibility class; is a cookielib.CookieJar, but exposes a dict interface.
This is the CookieJar we create by default for requests and sessions that don’t specify one, since some clients may expect response.cookies and session.cookies to support dict operations.
Requests does not use the dict interface internally; it’s just for compatibility with external client code. All requests code should work out of the box with externally provided instances of CookieJar , e.g. LWPCookieJar and FileCookieJar .
Unlike a regular CookieJar, this class is pickleable.
dictionary operations that are normally O(1) may be O(n).
Add correct Cookie: header to request (urllib.request.Request object).
The Cookie2 header is also added unless policy.hide_cookie2 is true.
Clear some cookies.
Invoking this method without arguments will clear all cookies. If given a single argument, only cookies belonging to that domain will be removed. If given two arguments, cookies belonging to the specified path within that domain are removed. If given three arguments, then the cookie with the specified name, path and domain is removed.
Raises KeyError if no matching cookie exists.
Discard all expired cookies.
You probably don’t need to call this method: expired cookies are never sent back to the server (provided you’re using DefaultCookiePolicy), this method is called by CookieJar itself every so often, and the .save() method won’t save expired cookies anyway (unless you ask otherwise by passing a true ignore_expires argument).
Discard all session cookies.
Note that the .save() method won’t save session cookies anyway, unless you ask otherwise by passing a true ignore_discard argument.
Return a copy of this RequestsCookieJar.
extract_cookies ( response, request ) ¶
Extract cookies from response, where allowable given the request.
Dict-like get() that also supports optional domain and path args in order to resolve naming collisions from using one cookie jar over multiple domains.
operation is O(n), not O(1).
Takes as an argument an optional domain and path and returns a plain old Python dict of name-value pairs of cookies that meet the requirements.
Return type: | dict |
---|
get_policy ( ) [source] ¶
Return the CookiePolicy instance used.
Dict-like items() that returns a list of name-value tuples from the jar. Allows client-code to call dict(RequestsCookieJar) and get a vanilla python dict of key value pairs.
keys() and values().
Dict-like iteritems() that returns an iterator of name-value tuples from the jar.
iterkeys() and itervalues().
Dict-like iterkeys() that returns an iterator of names of cookies from the jar.
itervalues() and iteritems().
Dict-like itervalues() that returns an iterator of values of cookies from the jar.
iterkeys() and iteritems().
Dict-like keys() that returns a list of names of cookies from the jar.
values() and items().
Utility method to list all the domains in the jar.
Utility method to list all the paths in the jar.
make_cookies ( response, request ) ¶
Return sequence of Cookie objects extracted from response object.
Returns True if there are multiple domains in the jar. Returns False otherwise.
Return type: | bool |
---|
pop ( k [ , d ] ) → v, remove specified key and return the corresponding value.¶
If key is not found, d is returned if given, otherwise KeyError is raised.
popitem ( ) → (k, v), remove and return some (key, value) pair¶
as a 2-tuple; but raise KeyError if D is empty.
Dict-like set() that also supports optional domain and path args in order to resolve naming collisions from using one cookie jar over multiple domains.
Set a cookie, without checking whether or not it should be set.
set_cookie_if_ok ( cookie, request ) ¶
Set a cookie if policy says it’s OK to do so.
setdefault ( k [ , d ] ) → D.get(k,d), also set D[k]=d if k not in D¶ update ( other ) [source] ¶
Updates this jar with cookies from another CookieJar or dict-like
Dict-like values() that returns a list of values of cookies from the jar.
There are two cookies that meet the criteria specified in the cookie jar. Use .get and .set and include domain and path args in order to be more specific.
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
Status Code Lookup¶
The codes object defines a mapping from common names for HTTP statuses to their numerical codes, accessible either as attributes or as dictionary items.
Some codes have multiple names, and both upper- and lower-case versions of the names are allowed. For example, codes.ok , codes.OK , and codes.okay all correspond to the HTTP status code 200.
- 100: continue
- 101: switching_protocols
- 102: processing
- 103: checkpoint
- 122: uri_too_long , request_uri_too_long
- 200: ok , okay , all_ok , all_okay , all_good , o/ , ✓
- 201: created
- 202: accepted
- 203: non_authoritative_info , non_authoritative_information
- 204: no_content
- 205: reset_content , reset
- 206: partial_content , partial
- 207: multi_status , multiple_status , multi_stati , multiple_stati
- 208: already_reported
- 226: im_used
- 300: multiple_choices
- 301: moved_permanently , moved , o-
- 302: found
- 303: see_other , other
- 304: not_modified
- 305: use_proxy
- 306: switch_proxy
- 307: temporary_redirect , temporary_moved , temporary
- 308: permanent_redirect , resume_incomplete , resume
- 400: bad_request , bad
- 401: unauthorized
- 402: payment_required , payment
- 403: forbidden
- 404: not_found , -o-
- 405: method_not_allowed , not_allowed
- 406: not_acceptable
- 407: proxy_authentication_required , proxy_auth , proxy_authentication
- 408: request_timeout , timeout
- 409: conflict
- 410: gone
- 411: length_required
- 412: precondition_failed , precondition
- 413: request_entity_too_large
- 414: request_uri_too_large
- 415: unsupported_media_type , unsupported_media , media_type
- 416: requested_range_not_satisfiable , requested_range , range_not_satisfiable
- 417: expectation_failed
- 418: im_a_teapot , teapot , i_am_a_teapot
- 421: misdirected_request
- 422: unprocessable_entity , unprocessable
- 423: locked
- 424: failed_dependency , dependency
- 425: unordered_collection , unordered
- 426: upgrade_required , upgrade
- 428: precondition_required , precondition
- 429: too_many_requests , too_many
- 431: header_fields_too_large , fields_too_large
- 444: no_response , none
- 449: retry_with , retry
- 450: blocked_by_windows_parental_controls , parental_controls
- 451: unavailable_for_legal_reasons , legal_reasons
- 499: client_closed_request
- 500: internal_server_error , server_error , /o , ✗
- 501: not_implemented
- 502: bad_gateway
- 503: service_unavailable , unavailable
- 504: gateway_timeout
- 505: http_version_not_supported , http_version
- 506: variant_also_negotiates
- 507: insufficient_storage
- 509: bandwidth_limit_exceeded , bandwidth
- 510: not_extended
- 511: network_authentication_required , network_auth , network_authentication
Migrating to 1.x¶
This section details the main differences between 0.x and 1.x and is meant to ease the pain of upgrading.
API Changes¶
Response.json is now a callable and not a property of a response.
The Session API has changed. Sessions objects no longer take parameters. Session is also now capitalized, but it can still be instantiated with a lowercase session for backwards compatibility.
All request hooks have been removed except ‘response’.
Authentication helpers have been broken out into separate modules. See requests-oauthlib and requests-kerberos.
The parameter for streaming requests was changed from prefetch to stream and the logic was inverted. In addition, stream is now required for raw response reading.
The config parameter to the requests method has been removed. Some of these options are now configured on a Session such as keep-alive and maximum number of redirects. The verbosity option should be handled by configuring logging.
Licensing¶
One key difference that has nothing to do with the API is a change in the license from the ISC license to the Apache 2.0 license. The Apache 2.0 license ensures that contributions to Requests are also covered by the Apache 2.0 license.
Migrating to 2.x¶
Compared with the 1.0 release, there were relatively few backwards incompatible changes, but there are still a few issues to be aware of with this major release.
For more details on the changes in this release including new APIs, links to the relevant GitHub issues and some of the bug fixes, read Cory’s blog on the subject.
API Changes¶
There were a couple changes to how Requests handles exceptions. RequestException is now a subclass of IOError rather than RuntimeError as that more accurately categorizes the type of error. In addition, an invalid URL escape sequence now raises a subclass of RequestException rather than a ValueError .
Lastly, httplib.IncompleteRead exceptions caused by incorrect chunked encoding will now raise a Requests ChunkedEncodingError instead.
The proxy API has changed slightly. The scheme for a proxy URL is now required.
Behavioural Changes¶
- Keys in the headers dictionary are now native strings on all Python versions, i.e. bytestrings on Python 2 and unicode on Python 3. If the keys are not native strings (unicode on Python 2 or bytestrings on Python 3) they will be converted to the native string type assuming UTF-8 encoding.
- Values in the headers dictionary should always be strings. This has been the project’s position since before 1.0 but a recent change (since version 2.11.0) enforces this more strictly. It’s advised to avoid passing header values as unicode when possible.
Requests is an elegant and simple HTTP library for Python, built for human beings. You are currently looking at the documentation of the development release.
Stay Informed
Receive updates on new releases and upcoming projects.
Источник
Source code for requests.exceptions
# -*- coding: utf-8 -*- """ requests.exceptions ~~~~~~~~~~~~~~~~~~~ This module contains the set of Requests' exceptions. """ from urllib3.exceptions import HTTPError as BaseHTTPError[docs]class RequestException(IOError): """There was an ambiguous exception that occurred while handling your request. """ def __init__(self, *args, **kwargs): """Initialize RequestException with `request` and `response` objects.""" response = kwargs.pop('response', None) self.response = response self.request = kwargs.pop('request', None) if (response is not None and not self.request and hasattr(response, 'request')): self.request = self.response.request super(RequestException, self).__init__(*args, **kwargs)
[docs]class HTTPError(RequestException): """An HTTP error occurred."""
[docs]class ConnectionError(RequestException): """A Connection error occurred."""
class ProxyError(ConnectionError): """A proxy error occurred.""" class SSLError(ConnectionError): """An SSL error occurred."""[docs]class Timeout(RequestException): """The request timed out. Catching this error will catch both :exc:`~requests.exceptions.ConnectTimeout` and :exc:`~requests.exceptions.ReadTimeout` errors. """
[docs]class ConnectTimeout(ConnectionError, Timeout): """The request timed out while trying to connect to the remote server. Requests that produced this error are safe to retry. """
[docs]class ReadTimeout(Timeout): """The server did not send any data in the allotted amount of time."""
[docs]class URLRequired(RequestException): """A valid URL is required to make a request."""
[docs]class TooManyRedirects(RequestException): """Too many redirects."""
class MissingSchema(RequestException, ValueError): """The URL schema (e.g. http or https) is missing.""" class InvalidSchema(RequestException, ValueError): """See defaults.py for valid schemas.""" class InvalidURL(RequestException, ValueError): """The URL provided was somehow invalid.""" class InvalidHeader(RequestException, ValueError): """The header value provided was somehow invalid.""" class InvalidProxyURL(InvalidURL): """The proxy URL provided is invalid.""" class ChunkedEncodingError(RequestException): """The server declared chunked encoding but sent an invalid chunk.""" class ContentDecodingError(RequestException, BaseHTTPError): """Failed to decode response content""" class StreamConsumedError(RequestException, TypeError): """The content for this response was already consumed""" class RetryError(RequestException): """Custom retries logic failed""" class UnrewindableBodyError(RequestException): """Requests encountered an error when trying to rewind a body""" # Warnings class RequestsWarning(Warning): """Base warning for Requests.""" pass class FileModeWarning(RequestsWarning, DeprecationWarning): """A file was opened in text mode, but Requests determined its binary length.""" pass class RequestsDependencyWarning(RequestsWarning): """An imported dependency doesn't match the expected version range.""" pass
Watch Now This tutorial has a related video course created by the Real Python team. Watch it together with the written tutorial to deepen your understanding: Making HTTP Requests With Python
The requests
library is the de facto standard for making HTTP requests in Python. It abstracts the complexities of making requests behind a beautiful, simple API so that you can focus on interacting with services and consuming data in your application.
Throughout this article, you’ll see some of the most useful features that requests
has to offer as well as how to customize and optimize those features for different situations you may come across. You’ll also learn how to use requests
in an efficient way as well as how to prevent requests to external services from slowing down your application.
In this tutorial, you’ll learn how to:
- Make requests using the most common HTTP methods
- Customize your requests’ headers and data, using the query string and message body
- Inspect data from your requests and responses
- Make authenticated requests
- Configure your requests to help prevent your application from backing up or slowing down
Though I’ve tried to include as much information as you need to understand the features and examples included in this article, I do assume a very basic general knowledge of HTTP. That said, you still may be able to follow along fine anyway.
Now that that is out of the way, let’s dive in and see how you can use requests
in your application!
Getting Started With requests
Let’s begin by installing the requests
library. To do so, run the following command:
If you prefer to use Pipenv for managing Python packages, you can run the following:
$ pipenv install requests
Once requests
is installed, you can use it in your application. Importing requests
looks like this:
Now that you’re all set up, it’s time to begin your journey through requests
. Your first goal will be learning how to make a GET
request.
The GET Request
HTTP methods such as GET
and POST
, determine which action you’re trying to perform when making an HTTP request. Besides GET
and POST
, there are several other common methods that you’ll use later in this tutorial.
One of the most common HTTP methods is GET
. The GET
method indicates that you’re trying to get or retrieve data from a specified resource. To make a GET
request, invoke requests.get()
.
To test this out, you can make a GET
request to GitHub’s Root REST API by calling get()
with the following URL:
>>>
>>> requests.get('https://api.github.com')
<Response [200]>
Congratulations! You’ve made your first request. Let’s dive a little deeper into the response of that request.
The Response
A Response
is a powerful object for inspecting the results of the request. Let’s make that same request again, but this time store the return value in a variable so that you can get a closer look at its attributes and behaviors:
>>>
>>> response = requests.get('https://api.github.com')
In this example, you’ve captured the return value of get()
, which is an instance of Response
, and stored it in a variable called response
. You can now use response
to see a lot of information about the results of your GET
request.
Status Codes
The first bit of information that you can gather from Response
is the status code. A status code informs you of the status of the request.
For example, a 200 OK
status means that your request was successful, whereas a 404 NOT FOUND
status means that the resource you were looking for was not found. There are many other possible status codes as well to give you specific insights into what happened with your request.
By accessing .status_code
, you can see the status code that the server returned:
>>>
>>> response.status_code
200
.status_code
returned a 200
, which means your request was successful and the server responded with the data you were requesting.
Sometimes, you might want to use this information to make decisions in your code:
if response.status_code == 200:
print('Success!')
elif response.status_code == 404:
print('Not Found.')
With this logic, if the server returns a 200
status code, your program will print Success!
. If the result is a 404
, your program will print Not Found
.
requests
goes one step further in simplifying this process for you. If you use a Response
instance in a conditional expression, it will evaluate to True
if the status code was between 200
and 400
, and False
otherwise.
Therefore, you can simplify the last example by rewriting the if
statement:
if response:
print('Success!')
else:
print('An error has occurred.')
Keep in mind that this method is not verifying that the status code is equal to 200
. The reason for this is that other status codes within the 200
to 400
range, such as 204 NO CONTENT
and 304 NOT MODIFIED
, are also considered successful in the sense that they provide some workable response.
For example, the 204
tells you that the response was successful, but there’s no content to return in the message body.
So, make sure you use this convenient shorthand only if you want to know if the request was generally successful and then, if necessary, handle the response appropriately based on the status code.
Let’s say you don’t want to check the response’s status code in an if
statement. Instead, you want to raise an exception if the request was unsuccessful. You can do this using .raise_for_status()
:
import requests
from requests.exceptions import HTTPError
for url in ['https://api.github.com', 'https://api.github.com/invalid']:
try:
response = requests.get(url)
# If the response was successful, no Exception will be raised
response.raise_for_status()
except HTTPError as http_err:
print(f'HTTP error occurred: {http_err}') # Python 3.6
except Exception as err:
print(f'Other error occurred: {err}') # Python 3.6
else:
print('Success!')
If you invoke .raise_for_status()
, an HTTPError
will be raised for certain status codes. If the status code indicates a successful request, the program will proceed without that exception being raised.
Now, you know a lot about how to deal with the status code of the response you got back from the server. However, when you make a GET
request, you rarely only care about the status code of the response. Usually, you want to see more. Next, you’ll see how to view the actual data that the server sent back in the body of the response.
Content
The response of a GET
request often has some valuable information, known as a payload, in the message body. Using the attributes and methods of Response
, you can view the payload in a variety of different formats.
To see the response’s content in bytes
, you use .content
:
>>>
>>> response = requests.get('https://api.github.com')
>>> response.content
b'{"current_user_url":"https://api.github.com/user","current_user_authorizations_html_url":"https://github.com/settings/connections/applications{/client_id}","authorizations_url":"https://api.github.com/authorizations","code_search_url":"https://api.github.com/search/code?q={query}{&page,per_page,sort,order}","commit_search_url":"https://api.github.com/search/commits?q={query}{&page,per_page,sort,order}","emails_url":"https://api.github.com/user/emails","emojis_url":"https://api.github.com/emojis","events_url":"https://api.github.com/events","feeds_url":"https://api.github.com/feeds","followers_url":"https://api.github.com/user/followers","following_url":"https://api.github.com/user/following{/target}","gists_url":"https://api.github.com/gists{/gist_id}","hub_url":"https://api.github.com/hub","issue_search_url":"https://api.github.com/search/issues?q={query}{&page,per_page,sort,order}","issues_url":"https://api.github.com/issues","keys_url":"https://api.github.com/user/keys","notifications_url":"https://api.github.com/notifications","organization_repositories_url":"https://api.github.com/orgs/{org}/repos{?type,page,per_page,sort}","organization_url":"https://api.github.com/orgs/{org}","public_gists_url":"https://api.github.com/gists/public","rate_limit_url":"https://api.github.com/rate_limit","repository_url":"https://api.github.com/repos/{owner}/{repo}","repository_search_url":"https://api.github.com/search/repositories?q={query}{&page,per_page,sort,order}","current_user_repositories_url":"https://api.github.com/user/repos{?type,page,per_page,sort}","starred_url":"https://api.github.com/user/starred{/owner}{/repo}","starred_gists_url":"https://api.github.com/gists/starred","team_url":"https://api.github.com/teams","user_url":"https://api.github.com/users/{user}","user_organizations_url":"https://api.github.com/user/orgs","user_repositories_url":"https://api.github.com/users/{user}/repos{?type,page,per_page,sort}","user_search_url":"https://api.github.com/search/users?q={query}{&page,per_page,sort,order}"}'
While .content
gives you access to the raw bytes of the response payload, you will often want to convert them into a string using a character encoding such as UTF-8. response
will do that for you when you access .text
:
>>>
>>> response.text
'{"current_user_url":"https://api.github.com/user","current_user_authorizations_html_url":"https://github.com/settings/connections/applications{/client_id}","authorizations_url":"https://api.github.com/authorizations","code_search_url":"https://api.github.com/search/code?q={query}{&page,per_page,sort,order}","commit_search_url":"https://api.github.com/search/commits?q={query}{&page,per_page,sort,order}","emails_url":"https://api.github.com/user/emails","emojis_url":"https://api.github.com/emojis","events_url":"https://api.github.com/events","feeds_url":"https://api.github.com/feeds","followers_url":"https://api.github.com/user/followers","following_url":"https://api.github.com/user/following{/target}","gists_url":"https://api.github.com/gists{/gist_id}","hub_url":"https://api.github.com/hub","issue_search_url":"https://api.github.com/search/issues?q={query}{&page,per_page,sort,order}","issues_url":"https://api.github.com/issues","keys_url":"https://api.github.com/user/keys","notifications_url":"https://api.github.com/notifications","organization_repositories_url":"https://api.github.com/orgs/{org}/repos{?type,page,per_page,sort}","organization_url":"https://api.github.com/orgs/{org}","public_gists_url":"https://api.github.com/gists/public","rate_limit_url":"https://api.github.com/rate_limit","repository_url":"https://api.github.com/repos/{owner}/{repo}","repository_search_url":"https://api.github.com/search/repositories?q={query}{&page,per_page,sort,order}","current_user_repositories_url":"https://api.github.com/user/repos{?type,page,per_page,sort}","starred_url":"https://api.github.com/user/starred{/owner}{/repo}","starred_gists_url":"https://api.github.com/gists/starred","team_url":"https://api.github.com/teams","user_url":"https://api.github.com/users/{user}","user_organizations_url":"https://api.github.com/user/orgs","user_repositories_url":"https://api.github.com/users/{user}/repos{?type,page,per_page,sort}","user_search_url":"https://api.github.com/search/users?q={query}{&page,per_page,sort,order}"}'
Because the decoding of bytes
to a str
requires an encoding scheme, requests
will try to guess the encoding based on the response’s headers if you do not specify one. You can provide an explicit encoding by setting .encoding
before accessing .text
:
>>>
>>> response.encoding = 'utf-8' # Optional: requests infers this internally
>>> response.text
'{"current_user_url":"https://api.github.com/user","current_user_authorizations_html_url":"https://github.com/settings/connections/applications{/client_id}","authorizations_url":"https://api.github.com/authorizations","code_search_url":"https://api.github.com/search/code?q={query}{&page,per_page,sort,order}","commit_search_url":"https://api.github.com/search/commits?q={query}{&page,per_page,sort,order}","emails_url":"https://api.github.com/user/emails","emojis_url":"https://api.github.com/emojis","events_url":"https://api.github.com/events","feeds_url":"https://api.github.com/feeds","followers_url":"https://api.github.com/user/followers","following_url":"https://api.github.com/user/following{/target}","gists_url":"https://api.github.com/gists{/gist_id}","hub_url":"https://api.github.com/hub","issue_search_url":"https://api.github.com/search/issues?q={query}{&page,per_page,sort,order}","issues_url":"https://api.github.com/issues","keys_url":"https://api.github.com/user/keys","notifications_url":"https://api.github.com/notifications","organization_repositories_url":"https://api.github.com/orgs/{org}/repos{?type,page,per_page,sort}","organization_url":"https://api.github.com/orgs/{org}","public_gists_url":"https://api.github.com/gists/public","rate_limit_url":"https://api.github.com/rate_limit","repository_url":"https://api.github.com/repos/{owner}/{repo}","repository_search_url":"https://api.github.com/search/repositories?q={query}{&page,per_page,sort,order}","current_user_repositories_url":"https://api.github.com/user/repos{?type,page,per_page,sort}","starred_url":"https://api.github.com/user/starred{/owner}{/repo}","starred_gists_url":"https://api.github.com/gists/starred","team_url":"https://api.github.com/teams","user_url":"https://api.github.com/users/{user}","user_organizations_url":"https://api.github.com/user/orgs","user_repositories_url":"https://api.github.com/users/{user}/repos{?type,page,per_page,sort}","user_search_url":"https://api.github.com/search/users?q={query}{&page,per_page,sort,order}"}'
If you take a look at the response, you’ll see that it is actually serialized JSON content. To get a dictionary, you could take the str
you retrieved from .text
and deserialize it using json.loads()
. However, a simpler way to accomplish this task is to use .json()
:
>>>
>>> response.json()
{'current_user_url': 'https://api.github.com/user', 'current_user_authorizations_html_url': 'https://github.com/settings/connections/applications{/client_id}', 'authorizations_url': 'https://api.github.com/authorizations', 'code_search_url': 'https://api.github.com/search/code?q={query}{&page,per_page,sort,order}', 'commit_search_url': 'https://api.github.com/search/commits?q={query}{&page,per_page,sort,order}', 'emails_url': 'https://api.github.com/user/emails', 'emojis_url': 'https://api.github.com/emojis', 'events_url': 'https://api.github.com/events', 'feeds_url': 'https://api.github.com/feeds', 'followers_url': 'https://api.github.com/user/followers', 'following_url': 'https://api.github.com/user/following{/target}', 'gists_url': 'https://api.github.com/gists{/gist_id}', 'hub_url': 'https://api.github.com/hub', 'issue_search_url': 'https://api.github.com/search/issues?q={query}{&page,per_page,sort,order}', 'issues_url': 'https://api.github.com/issues', 'keys_url': 'https://api.github.com/user/keys', 'notifications_url': 'https://api.github.com/notifications', 'organization_repositories_url': 'https://api.github.com/orgs/{org}/repos{?type,page,per_page,sort}', 'organization_url': 'https://api.github.com/orgs/{org}', 'public_gists_url': 'https://api.github.com/gists/public', 'rate_limit_url': 'https://api.github.com/rate_limit', 'repository_url': 'https://api.github.com/repos/{owner}/{repo}', 'repository_search_url': 'https://api.github.com/search/repositories?q={query}{&page,per_page,sort,order}', 'current_user_repositories_url': 'https://api.github.com/user/repos{?type,page,per_page,sort}', 'starred_url': 'https://api.github.com/user/starred{/owner}{/repo}', 'starred_gists_url': 'https://api.github.com/gists/starred', 'team_url': 'https://api.github.com/teams', 'user_url': 'https://api.github.com/users/{user}', 'user_organizations_url': 'https://api.github.com/user/orgs', 'user_repositories_url': 'https://api.github.com/users/{user}/repos{?type,page,per_page,sort}', 'user_search_url': 'https://api.github.com/search/users?q={query}{&page,per_page,sort,order}'}
The type
of the return value of .json()
is a dictionary, so you can access values in the object by key.
You can do a lot with status codes and message bodies. But, if you need more information, like metadata about the response itself, you’ll need to look at the response’s headers.
Query String Parameters
One common way to customize a GET
request is to pass values through query string parameters in the URL. To do this using get()
, you pass data to params
. For example, you can use GitHub’s Search API to look for the requests
library:
import requests
# Search GitHub's repositories for requests
response = requests.get(
'https://api.github.com/search/repositories',
params={'q': 'requests+language:python'},
)
# Inspect some attributes of the `requests` repository
json_response = response.json()
repository = json_response['items'][0]
print(f'Repository name: {repository["name"]}') # Python 3.6+
print(f'Repository description: {repository["description"]}') # Python 3.6+
By passing the dictionary {'q': 'requests+language:python'}
to the params
parameter of .get()
, you are able to modify the results that come back from the Search API.
You can pass params
to get()
in the form of a dictionary, as you have just done, or as a list of tuples:
>>>
>>> requests.get(
... 'https://api.github.com/search/repositories',
... params=[('q', 'requests+language:python')],
... )
<Response [200]>
You can even pass the values as bytes
:
>>>
>>> requests.get(
... 'https://api.github.com/search/repositories',
... params=b'q=requests+language:python',
... )
<Response [200]>
Query strings are useful for parameterizing GET
requests. You can also customize your requests by adding or modifying the headers you send.
Other HTTP Methods
Aside from GET
, other popular HTTP methods include POST
, PUT
, DELETE
, HEAD
, PATCH
, and OPTIONS
. requests
provides a method, with a similar signature to get()
, for each of these HTTP methods:
>>>
>>> requests.post('https://httpbin.org/post', data={'key':'value'})
>>> requests.put('https://httpbin.org/put', data={'key':'value'})
>>> requests.delete('https://httpbin.org/delete')
>>> requests.head('https://httpbin.org/get')
>>> requests.patch('https://httpbin.org/patch', data={'key':'value'})
>>> requests.options('https://httpbin.org/get')
Each function call makes a request to the httpbin
service using the corresponding HTTP method. For each method, you can inspect their responses in the same way you did before:
>>>
>>> response = requests.head('https://httpbin.org/get')
>>> response.headers['Content-Type']
'application/json'
>>> response = requests.delete('https://httpbin.org/delete')
>>> json_response = response.json()
>>> json_response['args']
{}
Headers, response bodies, status codes, and more are returned in the Response
for each method. Next you’ll take a closer look at the POST
, PUT
, and PATCH
methods and learn how they differ from the other request types.
The Message Body
According to the HTTP specification, POST
, PUT
, and the less common PATCH
requests pass their data through the message body rather than through parameters in the query string. Using requests
, you’ll pass the payload to the corresponding function’s data
parameter.
data
takes a dictionary, a list of tuples, bytes, or a file-like object. You’ll want to adapt the data you send in the body of your request to the specific needs of the service you’re interacting with.
For example, if your request’s content type is application/x-www-form-urlencoded
, you can send the form data as a dictionary:
>>>
>>> requests.post('https://httpbin.org/post', data={'key':'value'})
<Response [200]>
You can also send that same data as a list of tuples:
>>>
>>> requests.post('https://httpbin.org/post', data=[('key', 'value')])
<Response [200]>
If, however, you need to send JSON data, you can use the json
parameter. When you pass JSON data via json
, requests
will serialize your data and add the correct Content-Type
header for you.
httpbin.org is a great resource created by the author of requests
, Kenneth Reitz. It’s a service that accepts test requests and responds with data about the requests. For instance, you can use it to inspect a basic POST
request:
>>>
>>> response = requests.post('https://httpbin.org/post', json={'key':'value'})
>>> json_response = response.json()
>>> json_response['data']
'{"key": "value"}'
>>> json_response['headers']['Content-Type']
'application/json'
You can see from the response that the server received your request data and headers as you sent them. requests
also provides this information to you in the form of a PreparedRequest
.
Inspecting Your Request
When you make a request, the requests
library prepares the request before actually sending it to the destination server. Request preparation includes things like validating headers and serializing JSON content.
You can view the PreparedRequest
by accessing .request
:
>>>
>>> response = requests.post('https://httpbin.org/post', json={'key':'value'})
>>> response.request.headers['Content-Type']
'application/json'
>>> response.request.url
'https://httpbin.org/post'
>>> response.request.body
b'{"key": "value"}'
Inspecting the PreparedRequest
gives you access to all kinds of information about the request being made such as payload, URL, headers, authentication, and more.
So far, you’ve made a lot of different kinds of requests, but they’ve all had one thing in common: they’re unauthenticated requests to public APIs. Many services you may come across will want you to authenticate in some way.
Authentication
Authentication helps a service understand who you are. Typically, you provide your credentials to a server by passing data through the Authorization
header or a custom header defined by the service. All the request functions you’ve seen to this point provide a parameter called auth
, which allows you to pass your credentials.
One example of an API that requires authentication is GitHub’s Authenticated User API. This endpoint provides information about the authenticated user’s profile. To make a request to the Authenticated User API, you can pass your GitHub username and password in a tuple to get()
:
>>>
>>> from getpass import getpass
>>> requests.get('https://api.github.com/user', auth=('username', getpass()))
<Response [200]>
The request succeeded if the credentials you passed in the tuple to auth
are valid. If you try to make this request with no credentials, you’ll see that the status code is 401 Unauthorized
:
>>>
>>> requests.get('https://api.github.com/user')
<Response [401]>
When you pass your username and password in a tuple to the auth
parameter, requests
is applying the credentials using HTTP’s Basic access authentication scheme under the hood.
Therefore, you could make the same request by passing explicit Basic authentication credentials using HTTPBasicAuth
:
>>>
>>> from requests.auth import HTTPBasicAuth
>>> from getpass import getpass
>>> requests.get(
... 'https://api.github.com/user',
... auth=HTTPBasicAuth('username', getpass())
... )
<Response [200]>
Though you don’t need to be explicit for Basic authentication, you may want to authenticate using another method. requests
provides other methods of authentication out of the box such as HTTPDigestAuth
and HTTPProxyAuth
.
You can even supply your own authentication mechanism. To do so, you must first create a subclass of AuthBase
. Then, you implement __call__()
:
import requests
from requests.auth import AuthBase
class TokenAuth(AuthBase):
"""Implements a custom authentication scheme."""
def __init__(self, token):
self.token = token
def __call__(self, r):
"""Attach an API token to a custom auth header."""
r.headers['X-TokenAuth'] = f'{self.token}' # Python 3.6+
return r
requests.get('https://httpbin.org/get', auth=TokenAuth('12345abcde-token'))
Here, your custom TokenAuth
mechanism receives a token, then includes that token in the X-TokenAuth
header of your request.
Bad authentication mechanisms can lead to security vulnerabilities, so unless a service requires a custom authentication mechanism for some reason, you’ll always want to use a tried-and-true auth scheme like Basic or OAuth.
While you’re thinking about security, let’s consider dealing with SSL Certificates using requests
.
SSL Certificate Verification
Any time the data you are trying to send or receive is sensitive, security is important. The way that you communicate with secure sites over HTTP is by establishing an encrypted connection using SSL, which means that verifying the target server’s SSL Certificate is critical.
The good news is that requests
does this for you by default. However, there are some cases where you might want to change this behavior.
If you want to disable SSL Certificate verification, you pass False
to the verify
parameter of the request function:
>>>
>>> requests.get('https://api.github.com', verify=False)
InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
InsecureRequestWarning)
<Response [200]>
requests
even warns you when you’re making an insecure request to help you keep your data safe!
Performance
When using requests
, especially in a production application environment, it’s important to consider performance implications. Features like timeout control, sessions, and retry limits can help you keep your application running smoothly.
Timeouts
When you make an inline request to an external service, your system will need to wait upon the response before moving on. If your application waits too long for that response, requests to your service could back up, your user experience could suffer, or your background jobs could hang.
By default, requests
will wait indefinitely on the response, so you should almost always specify a timeout duration to prevent these things from happening. To set the request’s timeout, use the timeout
parameter. timeout
can be an integer or float representing the number of seconds to wait on a response before timing out:
>>>
>>> requests.get('https://api.github.com', timeout=1)
<Response [200]>
>>> requests.get('https://api.github.com', timeout=3.05)
<Response [200]>
In the first request, the request will timeout after 1 second. In the second request, the request will timeout after 3.05 seconds.
You can also pass a tuple to timeout
with the first element being a connect timeout (the time it allows for the client to establish a connection to the server), and the second being a read timeout (the time it will wait on a response once your client has established a connection):
>>>
>>> requests.get('https://api.github.com', timeout=(2, 5))
<Response [200]>
If the request establishes a connection within 2 seconds and receives data within 5 seconds of the connection being established, then the response will be returned as it was before. If the request times out, then the function will raise a Timeout
exception:
import requests
from requests.exceptions import Timeout
try:
response = requests.get('https://api.github.com', timeout=1)
except Timeout:
print('The request timed out')
else:
print('The request did not time out')
Your program can catch the Timeout
exception and respond accordingly.
The Session Object
Until now, you’ve been dealing with high level requests
APIs such as get()
and post()
. These functions are abstractions of what’s going on when you make your requests. They hide implementation details such as how connections are managed so that you don’t have to worry about them.
Underneath those abstractions is a class called Session
. If you need to fine-tune your control over how requests are being made or improve the performance of your requests, you may need to use a Session
instance directly.
Sessions are used to persist parameters across requests. For example, if you want to use the same authentication across multiple requests, you could use a session:
import requests
from getpass import getpass
# By using a context manager, you can ensure the resources used by
# the session will be released after use
with requests.Session() as session:
session.auth = ('username', getpass())
# Instead of requests.get(), you'll use session.get()
response = session.get('https://api.github.com/user')
# You can inspect the response just like you did before
print(response.headers)
print(response.json())
Each time you make a request with session
, once it has been initialized with authentication credentials, the credentials will be persisted.
The primary performance optimization of sessions comes in the form of persistent connections. When your app makes a connection to a server using a Session
, it keeps that connection around in a connection pool. When your app wants to connect to the same server again, it will reuse a connection from the pool rather than establishing a new one.
Max Retries
When a request fails, you may want your application to retry the same request. However, requests
will not do this for you by default. To apply this functionality, you need to implement a custom Transport Adapter.
Transport Adapters let you define a set of configurations per service you’re interacting with. For example, let’s say you want all requests to https://api.github.com
to retry three times before finally raising a ConnectionError
. You would build a Transport Adapter, set its max_retries
parameter, and mount it to an existing Session
:
import requests
from requests.adapters import HTTPAdapter
from requests.exceptions import ConnectionError
github_adapter = HTTPAdapter(max_retries=3)
session = requests.Session()
# Use `github_adapter` for all requests to endpoints that start with this URL
session.mount('https://api.github.com', github_adapter)
try:
session.get('https://api.github.com')
except ConnectionError as ce:
print(ce)
When you mount the HTTPAdapter
, github_adapter
, to session
, session
will adhere to its configuration for each request to https://api.github.com.
Timeouts, Transport Adapters, and sessions are for keeping your code efficient and your application resilient.
Conclusion
You’ve come a long way in learning about Python’s powerful requests
library.
You’re now able to:
- Make requests using a variety of different HTTP methods such as
GET
,POST
, andPUT
- Customize your requests by modifying headers, authentication, query strings, and message bodies
- Inspect the data you send to the server and the data the server sends back to you
- Work with SSL Certificate verification
- Use
requests
effectively usingmax_retries
,timeout
, Sessions, and Transport Adapters
Because you learned how to use requests
, you’re equipped to explore the wide world of web services and build awesome applications using the fascinating data they provide.
Watch Now This tutorial has a related video course created by the Real Python team. Watch it together with the written tutorial to deepen your understanding: Making HTTP Requests With Python
24 Дек. 2015, Python, 342319 просмотров,
Стандартная библиотека Python имеет ряд готовых модулей по работе с HTTP.
- urllib
- httplib
Если уж совсем хочется хардкора, то можно и сразу с socket поработать. Но у всех этих модулей есть один большой недостаток — неудобство работы.
Во-первых, большое обилие классов и функций. Во-вторых, код получается вовсе не pythonic. Многие программисты любят Python за его элегантность и простоту, поэтому и был создан модуль, призванный решать проблему существующих и имя ему requests или HTTP For Humans. На момент написания данной заметки, последняя версия библиотеки — 2.9.1. С момента выхода Python версии 3.5 я дал себе негласное обещание писать новый код только на Py >= 3.5. Пора бы уже полностью перебираться на 3-ю ветку змеюки, поэтому в моих примерах print отныне является функцией, а не оператором
Что же умеет requests?
Для начала хочется показать как выглядит код работы с http, используя модули из стандартной библиотеки Python и код при работе с requests. В качестве мишени для стрельбы http запросами будет использоваться очень удобный сервис httpbin.org
>>> import urllib.request
>>> response = urllib.request.urlopen('https://httpbin.org/get')
>>> print(response.read())
b'{n "args": {}, n "headers": {n "Accept-Encoding": "identity", n "Host": "httpbin.org", n "User-Agent": "Python-urllib/3.5"n }, n "origin": "95.56.82.136", n "url": "https://httpbin.org/get"n}n'
>>> print(response.getheader('Server'))
nginx
>>> print(response.getcode())
200
>>>
Кстати, urllib.request это надстройка над «низкоуровневой» библиотекой httplib о которой я писал выше.
>>> import requests
>>> response = requests.get('https://httpbin.org/get')
>>> print(response.content)
b'{n "args": {}, n "headers": {n "Accept": "*/*", n "Accept-Encoding": "gzip, deflate", n "Host": "httpbin.org", n "User-Agent": "python-requests/2.9.1"n }, n "origin": "95.56.82.136", n "url": "https://httpbin.org/get"n}n'
>>> response.json()
{'headers': {'Accept-Encoding': 'gzip, deflate', 'User-Agent': 'python-requests/2.9.1', 'Host': 'httpbin.org', 'Accept': '*/*'}, 'args': {}, 'origin': '95.56.82.136', 'url': 'https://httpbin.org/get'}
>>> response.headers
{'Connection': 'keep-alive', 'Content-Type': 'application/json', 'Server': 'nginx', 'Access-Control-Allow-Credentials': 'true', 'Access-Control-Allow-Origin': '*', 'Content-Length': '237', 'Date': 'Wed, 23 Dec 2015 17:56:46 GMT'}
>>> response.headers.get('Server')
'nginx'
В простых методах запросов значительных отличий у них не имеется. Но давайте взглянем на работы с Basic Auth:
>>> import urllib.request
>>> password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
>>> top_level_url = 'https://httpbin.org/basic-auth/user/passwd'
>>> password_mgr.add_password(None, top_level_url, 'user', 'passwd')
>>> handler = urllib.request.HTTPBasicAuthHandler(password_mgr)
>>> opener = urllib.request.build_opener(handler)
>>> response = opener.open(top_level_url)
>>> response.getcode()
200
>>> response.read()
b'{n "authenticated": true, n "user": "user"n}n'
>>> import requests
>>> response = requests.get('https://httpbin.org/basic-auth/user/passwd', auth=('user', 'passwd'))
>>> print(response.content)
b'{n "authenticated": true, n "user": "user"n}n'
>>> print(response.json())
{'user': 'user', 'authenticated': True}
А теперь чувствуется разница между pythonic и non-pythonic? Я думаю разница на лицо. И несмотря на тот факт, что requests ничто иное как обёртка над urllib3, а последняя является надстройкой над стандартными средствами Python, удобство написания кода в большинстве случаев является приоритетом номер один.
В requests имеется:
- Множество методов http аутентификации
- Сессии с куками
- Полноценная поддержка SSL
- Различные методы-плюшки вроде .json(), которые вернут данные в нужном формате
- Проксирование
- Грамотная и логичная работа с исключениями
О последнем пункте мне бы хотелось поговорить чуточку подробнее.
Обработка исключений в requests
При работе с внешними сервисами никогда не стоит полагаться на их отказоустойчивость. Всё упадёт рано или поздно, поэтому нам, программистам, необходимо быть всегда к этому готовыми, желательно заранее и в спокойной обстановке.
Итак, как у requests дела обстоят с различными факапами в момент сетевых соединений? Для начала определим ряд проблем, которые могут возникнуть:
- Хост недоступен. Обычно такого рода ошибка происходит из-за проблем конфигурирования DNS. (DNS lookup failure)
- «Вылет» соединения по таймауту
- Ошибки HTTP. Подробнее о HTTP кодах можно посмотреть здесь.
- Ошибки SSL соединений (обычно при наличии проблем с SSL сертификатом: просрочен, не является доверенным и т.д.)
Базовым классом-исключением в requests является RequestException. От него наследуются все остальные
- HTTPError
- ConnectionError
- Timeout
- SSLError
- ProxyError
И так далее. Полный список всех исключений можно посмотреть в requests.exceptions.
Timeout
В requests имеется 2 вида таймаут-исключений:
- ConnectTimeout — таймаут на соединения
- ReadTimeout — таймаут на чтение
>>> import requests
>>> try:
... response = requests.get('https://httpbin.org/user-agent', timeout=(0.00001, 10))
... except requests.exceptions.ConnectTimeout:
... print('Oops. Connection timeout occured!')
...
Oops. Connection timeout occured!
>>> try:
... response = requests.get('https://httpbin.org/user-agent', timeout=(10, 0.0001))
... except requests.exceptions.ReadTimeout:
... print('Oops. Read timeout occured')
... except requests.exceptions.ConnectTimeout:
... print('Oops. Connection timeout occured!')
...
Oops. Read timeout occured
ConnectionError
>>> import requests
>>> try:
... response = requests.get('http://urldoesnotexistforsure.bom')
... except requests.exceptions.ConnectionError:
... print('Seems like dns lookup failed..')
...
Seems like dns lookup failed..
HTTPError
>>> import requests
>>> try:
... response = requests.get('https://httpbin.org/status/500')
... response.raise_for_status()
... except requests.exceptions.HTTPError as err:
... print('Oops. HTTP Error occured')
... print('Response is: {content}'.format(content=err.response.content))
...
Oops. HTTP Error occured
Response is: b''
Я перечислил основные виды исключений, которые покрывают, пожалуй, 90% всех проблем, возникающих при работе с http. Главное помнить, что если мы действительно намерены отловить что-то и обработать, то это необходимо явно запрограммировать, если же нам неважен тип конкретного исключения, то можно отлавливать общий базовый класс RequestException и действовать уже от конкретного случая, например, залоггировать исключение и выкинуть его дальше наверх. Кстати, о логгировании я напишу отдельный подробный пост.
У блога появился свой Telegram канал, где я стараюсь делиться интересными находками из сети на тему разработки программного обеспечения. Велком, как говорится
Полезные «плюшки»
- httpbin.org очень полезный сервис для тестирования http клиентов, в частности удобен для тестирования нестандартного поведения сервиса
- httpie консольный http клиент (замена curl) написанный на Python
- responses mock библиотека для работы с requests
- HTTPretty mock библиотека для работы с http модулями
💌 Присоединяйтесь к рассылке
Понравился контент? Пожалуйста, подпишись на рассылку.