Requests is a simple and elegant Python HTTP library. It provides methods for accessing web resources via HTTP. In this article, we use the HTTP GET method from the Requests module. This method requests data from a server, and exception handling comes in handy when the response is not successful. We will use Python's try and except statements to explore the exceptions raised by the Requests module. A response object exposes, among others, the following attributes and methods:
- url: Returns the URL of the response
- raise_for_status(): If the response has an error status code (4xx or 5xx), this method raises an HTTPError exception
- request: Returns the request object that requested this response
- status_code: Returns a number that indicates the status (200 is OK, 404 is Not Found)
Successful Connection Request
The first thing to know is that the response code is 200 if the request is successful.
Python3
Output:
200
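A minimal sketch of this check, assuming any reachable URL; the helper names fetch_status and is_success are hypothetical and not part of Requests:

```python
import requests


def fetch_status(url, timeout=5):
    """Send a GET request and return the numeric HTTP status code."""
    response = requests.get(url, timeout=timeout)
    return response.status_code


def is_success(status_code):
    """Return True for status codes in the 2xx range."""
    return 200 <= status_code < 300
```

Calling fetch_status on a page that exists should return 200, which is_success then reports as a success.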
Exception Handling for HTTP Errors
Here, we request a URL that does not exist on the server and then call raise_for_status() on the response. If the request succeeds, the response code is 200 and nothing is raised. If the requested page doesn't exist, the server answers with an HTTP error, most likely 404, and raise_for_status() raises requests.exceptions.HTTPError, which we catch below.
Python3
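import requests

url = "https://www.amazon.com/nothing_here"

try:
    r = requests.get(url, timeout=1)
    r.raise_for_status()
except requests.exceptions.HTTPError as errh:
    # raise_for_status() raised because the status code was 4xx or 5xx
    print("HTTP Error")
    print(errh.args[0])

print(r)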
Output:
HTTP Error
404 Client Error: Not Found for url: https://www.amazon.com/nothing_here
<Response [404]>
General Exception Handling
You could also catch the general exception from the Requests module, requests.exceptions.RequestException. It is the base class of all the other Requests exceptions, so it catches any of them.
Python3
try:
    r = requests.get(url, timeout=1)
    r.raise_for_status()
except requests.exceptions.RequestException as errex:
    print("Exception request")
Output:
Exception request
You may have noticed the timeout argument passed to requests.get(). It sets a time limit, in seconds, for the server to respond. If the server does not respond in time, we can catch that with the exception requests.exceptions.ReadTimeout. To demonstrate this, let us use a website that responds successfully.
Python3
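import requests

# url should point to a site that responds successfully
try:
    r = requests.get(url, timeout=1)
    r.raise_for_status()
    print(r)
except requests.exceptions.ReadTimeout as errrt:
    print("Time out")
Output:
<Response [200]>
If we change the timeout to 0.01, the request cannot possibly complete that fast, so the same code prints:
Time out
If the limit expires while the connection is still being established, Requests raises ConnectTimeout instead; catching requests.exceptions.Timeout covers both cases.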
Exception Handling for Missing Schema
Another common error is forgetting to specify http:// or https:// in the URL, for example www.google.com. We can use requests.exceptions.MissingSchema to catch this exception.
Python3
url = "www.google.com"

try:
    r = requests.get(url, timeout=1)
    r.raise_for_status()
except requests.exceptions.MissingSchema as errmiss:
    print("Missing schema: include http or https")
except requests.exceptions.ReadTimeout as errrt:
    print("Time out")
Output:
Missing schema: include http or https
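MissingSchema is raised while the request is being prepared, before anything is sent. As a rough sketch of validating a URL up front (check_url is a hypothetical helper, not a Requests function):

```python
import requests


def check_url(url):
    """Return 'ok' if Requests can prepare the URL, else a short reason."""
    try:
        # Preparing a request validates the URL without sending anything.
        requests.Request("GET", url).prepare()
    except requests.exceptions.MissingSchema:
        return "missing schema"
    except requests.exceptions.InvalidURL:
        return "invalid url"
    return "ok"
```

With this helper, check_url("www.google.com") reports the missing schema, while the same address prefixed with https:// passes.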
Exception Handling for Connection Error
Let us say that the site does not exist. A ConnectionError is raised whenever no connection can be made, which also happens when there is no internet connection.
Python3
try:
    r = requests.get(url, timeout=1, verify=True)
    r.raise_for_status()
except requests.exceptions.HTTPError as errh:
    print("HTTP Error")
    print(errh.args[0])
except requests.exceptions.ReadTimeout as errrt:
    print("Time out")
except requests.exceptions.ConnectionError as conerr:
    print("Connection error")
Output:
Connection error
Putting Everything Together
Here, we put together everything we tried so far. The idea is that the exceptions are handled in order of specificity, from the most specific to the most general.
For example, running this code with url = "https://www.gle.com" may produce Exception request. In the absence of a connection, requests.exceptions.ConnectionError prints Connection error, and any error not matched by a more specific handler is caught by the general requests.exceptions.RequestException.
Python3
try:
    r = requests.get(url, timeout=1, verify=True)
    r.raise_for_status()
except requests.exceptions.HTTPError as errh:
    print("HTTP Error")
    print(errh.args[0])
except requests.exceptions.ReadTimeout as errrt:
    print("Time out")
except requests.exceptions.ConnectionError as conerr:
    print("Connection error")
except requests.exceptions.RequestException as errex:
    print("Exception request")
Output:
Note: The output may change according to requests.
Time out
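To see how the order of the handlers decides which message is printed, the chain above can be exercised without any network access. This is only a sketch; classify is a hypothetical helper that raises a given exception and reports which handler catches it:

```python
import requests


def classify(exc):
    """Return the label that the handler chain above would print for exc."""
    try:
        raise exc
    except requests.exceptions.HTTPError:
        return "HTTP Error"
    except requests.exceptions.ReadTimeout:
        return "Time out"
    except requests.exceptions.ConnectionError:
        return "Connection error"
    except requests.exceptions.RequestException:
        return "Exception request"
```

Note that a ConnectTimeout is a subclass of ConnectionError, so with this ordering it is reported as a connection error rather than a time out.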
A Guide to Working with HTTP in Python. The requests Library
The Python standard library has a number of ready-made modules for working with HTTP.
If you really want something hardcore, you can work with socket directly. But all these modules share one big drawback: they are inconvenient to work with.
First, there is a large abundance of classes and functions. Second, the resulting code is not pythonic at all. Many programmers love Python for its elegance and simplicity, which is why a module was created to solve this problem; its name is requests, or HTTP for Humans. At the time of writing, the latest version of the library is 2.9.1. Since the release of Python 3.5, I have made myself an unspoken promise to write new code only for Py >= 3.5. It is time to move fully to the third branch of the snake, so in my examples print is from now on a function, not a statement 🙂
What can requests do?
To begin with, I want to show what HTTP code looks like with modules from the Python standard library and with requests. The target for our HTTP requests will be the very convenient service httpbin.org.
By the way, urllib.request is a layer on top of the "low-level" httplib library that I wrote about above.
For simple request methods there are no significant differences between them. But let's take a look at working with Basic Auth:
Now can you feel the difference between pythonic and non-pythonic? I think the difference is obvious. And despite the fact that requests is nothing more than a wrapper around urllib3, which in turn is built on top of Python's standard facilities, the convenience of writing code is in most cases priority number one.
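The Basic Auth comparison can be sketched roughly as follows. The credentials user/pass, the httpbin.org endpoint and both helper names are assumptions made for illustration; the requests are only built here, not sent:

```python
import base64
import urllib.request

import requests

URL = "https://httpbin.org/basic-auth/user/pass"  # assumed test endpoint


def build_with_urllib(url, user, password):
    """Standard library: the Authorization header is assembled by hand."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req = urllib.request.Request(url)
    req.add_header("Authorization", f"Basic {token}")
    return req


def build_with_requests(url, user, password):
    """requests: passing an auth tuple is enough."""
    return requests.Request("GET", url, auth=(user, password)).prepare()
```

In everyday code the requests variant shrinks further to requests.get(URL, auth=("user", "pass")).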
requests offers:
- Many HTTP authentication methods
- Sessions with cookies
- Full SSL support
- Various convenience methods such as .json() that return data in the required format
- Proxying
- Sound and logical exception handling
I would like to talk about the last point in a little more detail.
Exception handling in requests
When working with external services, you should never rely on their fault tolerance. Everything goes down sooner or later, so we programmers need to always be ready for it, preferably in advance and in a calm setting.
So, how does requests deal with the various failures that occur during network connections? First, let us define a number of problems that can arise:
- The host is unreachable. This kind of error usually occurs because of DNS configuration problems (DNS lookup failure).
- The connection times out
- HTTP errors. More details on HTTP codes can be found here.
- SSL connection errors (usually caused by problems with the SSL certificate: expired, not trusted, and so on)
The base exception class in requests is RequestException. All the others inherit from it:
- HTTPError
- ConnectionError
- Timeout
- SSLError
- ProxyError
And so on. The full list of exceptions can be found in requests.exceptions.
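The inheritance described above can be checked directly against the library. A small sketch, where the helper names are hypothetical:

```python
import requests.exceptions as rex

SPECIFIC = [rex.HTTPError, rex.ConnectionError, rex.Timeout,
            rex.SSLError, rex.ProxyError]


def all_inherit_from_base():
    """True if every listed exception derives from RequestException."""
    return all(issubclass(cls, rex.RequestException) for cls in SPECIFIC)


def parents_within_requests(cls):
    """Names of the Requests exception classes that cls inherits from."""
    return [base.__name__ for base in cls.__mro__[1:]
            if base.__module__ == rex.__name__]
```

For example, parents_within_requests(rex.ConnectTimeout) shows that a connect timeout is both a ConnectionError and a Timeout.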
Timeout
requests has 2 kinds of timeout exceptions:
- ConnectTimeout: the connection timed out
- ReadTimeout: the read timed out
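The two kinds of timeout map onto the two-element form of the timeout argument, (connect, read). A sketch under the assumption that separate limits are wanted; both function names and the default values are illustrative:

```python
import requests
from requests.exceptions import ConnectTimeout, ReadTimeout, Timeout


def get_with_timeouts(url, connect=3.05, read=10):
    """GET with separate connect and read limits, in seconds."""
    return requests.get(url, timeout=(connect, read))


def timeout_kind(exc):
    """Tell which phase of the request timed out, or None if exc is no timeout."""
    if isinstance(exc, ConnectTimeout):
        return "connect"
    if isinstance(exc, ReadTimeout):
        return "read"
    if isinstance(exc, Timeout):
        return "unspecified"
    return None
```

A caller would wrap get_with_timeouts in try/except Timeout and pass the caught exception to timeout_kind to decide how to react.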
ConnectionError
HTTPError
I have listed the main kinds of exceptions, which cover perhaps 90% of all problems that arise when working with HTTP. The main thing to remember is that if we really intend to catch and handle something, we must program that explicitly; if the specific exception type does not matter to us, we can catch the common base class RequestException and act on the concrete case, for example, log the exception and re-raise it further up. By the way, I will write a separate detailed post about logging.
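Catching the base class, logging and re-raising could look like the following sketch. The logger name and the call_logged wrapper are assumptions, not part of requests:

```python
import logging

import requests

logger = logging.getLogger("http_client")  # hypothetical logger name


def call_logged(func, *args, **kwargs):
    """Run func; log any requests exception and re-raise it unchanged."""
    try:
        return func(*args, **kwargs)
    except requests.exceptions.RequestException as exc:
        logger.error("request failed: %r", exc)
        raise
```

The caller still sees the original exception type, so more specific handling further up the stack keeps working.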
Useful goodies
- httpbin.org: a very useful service for testing HTTP clients, particularly handy for testing non-standard service behavior
- httpie: a command-line HTTP client (a curl replacement) written in Python
- responses: a mock library for working with requests
- HTTPretty: a mock library for working with HTTP modules
Requests — Working with Errors
This chapter discusses how to deal with errors that arise when working with the Requests HTTP library. It is always good practice to handle errors for all possible cases.
Error Exception
The requests module raises the following types of exceptions −
ConnectionError − Raised when there is a connection error, for example a network failure or a DNS error.
Response.raise_for_status() − Based on the status code, e.g. 401 or 404, this method raises HTTPError for the requested URL.
HTTPError − Raised for an invalid HTTP response to the request made.
Timeout − Raised when the request to the URL times out.
TooManyRedirects − Raised when the request exceeds the maximum number of redirections.
Example
Here is an example of errors shown for timeout −
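One way to reproduce a timeout without depending on an external site is a deliberately slow local server. Everything below (SlowHandler, timed_get, run_demo, the delays) is a hypothetical setup for illustration:

```python
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class SlowHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that waits before answering."""

    def do_GET(self):
        time.sleep(0.5)  # longer than the short client timeout used below
        try:
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"late")
        except OSError:
            pass  # the client has already given up

    def log_message(self, *args):
        pass  # keep the demo quiet


def timed_get(url, timeout):
    """Return a label describing how the request ended."""
    with requests.Session() as session:
        session.trust_env = False  # ignore proxy settings for the local demo
        try:
            response = session.get(url, timeout=timeout)
            response.raise_for_status()
        except requests.exceptions.Timeout:
            return "timeout"
        except requests.exceptions.RequestException:
            return "other error"
        return "ok"


def run_demo():
    """Query the slow server with a short and then a generous timeout."""
    server = HTTPServer(("127.0.0.1", 0), SlowHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = f"http://127.0.0.1:{server.server_port}/"
    try:
        return timed_get(url, timeout=0.1), timed_get(url, timeout=2)
    finally:
        server.shutdown()
        server.server_close()
```

With a 0.1 second limit the request should end in a Timeout, while the 2 second limit leaves the server enough time to answer.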
Developer Interface
This part of the documentation covers all the interfaces of Requests. For parts where Requests depends on external libraries, we document the most important right here and provide links to the canonical documentation.
Main Interface
All of Requests' functionality can be accessed by these 7 methods. They all return an instance of the Response object.
requests.request(method, url, **kwargs)
Constructs and sends a Request.
requests.head(url, **kwargs)
Sends a HEAD request.
requests.get(url, params=None, **kwargs)
Sends a GET request.
requests.post(url, data=None, json=None, **kwargs)
Sends a POST request.
requests.put(url, data=None, **kwargs)
Sends a PUT request.
requests.patch(url, data=None, **kwargs)
Sends a PATCH request.
requests.delete(url, **kwargs)
Sends a DELETE request.
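As a small illustration of the params argument accepted by requests.get, the final URL can be inspected by preparing a request without sending it. The helper name build_get_url and the httpbin.org address are assumptions:

```python
import requests


def build_get_url(base_url, params):
    """Return the final URL that a GET with these query parameters would use."""
    prepared = requests.Request("GET", base_url, params=params).prepare()
    return prepared.url
```

The same dictionary passed as requests.get(base_url, params=params) results in a request to that URL.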
Exceptions
exception requests.RequestException(*args, **kwargs)
There was an ambiguous exception that occurred while handling your request.
exception requests.ConnectionError(*args, **kwargs)
A Connection error occurred.
exception requests.HTTPError(*args, **kwargs)
An HTTP error occurred.
exception requests.URLRequired(*args, **kwargs)
A valid URL is required to make a request.
exception requests.TooManyRedirects(*args, **kwargs)
Too many redirects.
exception requests.ConnectTimeout(*args, **kwargs)
The request timed out while trying to connect to the remote server.
Requests that produced this error are safe to retry.
exception requests.ReadTimeout(*args, **kwargs)
The server did not send any data in the allotted amount of time.
exception requests.Timeout(*args, **kwargs)
The request timed out.
Catching this error will catch both ConnectTimeout and ReadTimeout errors.
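Because a ConnectTimeout is described as safe to retry, a caller might wrap the request in a small retry loop. This is a sketch only; retry_on_connect_timeout is a hypothetical helper and send stands for any zero-argument callable that performs the request:

```python
import requests


def retry_on_connect_timeout(send, attempts=3):
    """Call send() until it succeeds or the attempts are used up.

    Only ConnectTimeout triggers a retry; any other exception propagates
    immediately, and the last ConnectTimeout is re-raised.
    """
    for attempt in range(1, attempts + 1):
        try:
            return send()
        except requests.exceptions.ConnectTimeout:
            if attempt == attempts:
                raise
```

A typical call would pass a lambda such as lambda: requests.get(url, timeout=3), where url is whatever address is being fetched.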
Request Sessions
class requests.Session
A Requests session.
Provides cookie persistence, connection-pooling, and configuration.
A session can be used directly or as a context manager.
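A short sketch of a session used as a context manager, with a header configured once on the session. The helper name and the httpbin.org URL are assumptions, and the request is only prepared, not sent:

```python
import requests


def prepared_user_agent(agent):
    """Return the User-Agent that a session configured with agent would send."""
    with requests.Session() as session:
        # Headers set on the session are merged into every request it prepares.
        session.headers.update({"User-Agent": agent})
        request = requests.Request("GET", "https://httpbin.org/get")
        return session.prepare_request(request).headers["User-Agent"]
```

Leaving the with block closes the session and its adapters.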
auth = None
Default Authentication tuple or object to attach to Request.
cert = None
SSL client certificate default, if String, path to ssl client cert file (.pem). If Tuple, ('cert', 'key') pair.
close()
Closes all adapters and as such the session
cookies = None
A CookieJar containing all currently outstanding cookies set on this session. By default it is a RequestsCookieJar, but may be any other cookielib.CookieJar compatible object.
delete(url, **kwargs)
Sends a DELETE request. Returns Response object.
get(url, **kwargs)
Sends a GET request. Returns Response object.
get_adapter(url)
Returns the appropriate connection adapter for the given URL.
Return type: requests.adapters.BaseAdapter
get_redirect_target(resp)
Receives a Response. Returns a redirect URI or None
head(url, **kwargs)
Sends a HEAD request. Returns Response object.
headers = None
A case-insensitive dictionary of headers to be sent on each Request sent from this Session.
max_redirects = None
Maximum number of redirects allowed. If the request exceeds this limit, a TooManyRedirects exception is raised. This defaults to requests.models.DEFAULT_REDIRECT_LIMIT, which is 30.
merge_environment_settings(url, proxies, stream, verify, cert)
Check the environment and merge it with some settings.
Return type: dict
mount(prefix, adapter)
Registers a connection adapter to a prefix.
Adapters are sorted in descending order by prefix length.
options(url, **kwargs)
Sends an OPTIONS request. Returns Response object.
params = None
Dictionary of querystring data to attach to each Request. The dictionary values may be lists for representing multivalued query parameters.
patch(url, data=None, **kwargs)
Sends a PATCH request. Returns Response object.
post(url, data=None, json=None, **kwargs)
Sends a POST request. Returns Response object.
prepare_request(request)
Constructs a PreparedRequest for transmission and returns it. The PreparedRequest has settings merged from the Request instance and those of the Session.
Parameters: request – Request instance to prepare with this session's settings.
Return type: requests.PreparedRequest
proxies = None
Dictionary mapping protocol or protocol and host to the URL of the proxy (e.g. {'http': 'foo.bar:3128', 'http://host.name': 'foo.bar:4012'}) to be used on each Request.
put(url, data=None, **kwargs)
Sends a PUT request. Returns Response object.
rebuild_auth(prepared_request, response)
When being redirected we may want to strip authentication from the request to avoid leaking credentials. This method intelligently removes and reapplies authentication where possible to avoid credential loss.
rebuild_method(prepared_request, response)
When being redirected we may want to change the method of the request based on certain specs or browser behavior.
rebuild_proxies(prepared_request, proxies)
This method re-evaluates the proxy configuration by considering the environment variables. If we are redirected to a URL covered by NO_PROXY, we strip the proxy configuration. Otherwise, we set missing proxy keys for this URL (in case they were stripped by a previous redirect).
This method also replaces the Proxy-Authorization header where necessary.
Return type: dict
request(method, url, params=None, data=None, headers=None, cookies=None, files=None, auth=None, timeout=None, allow_redirects=True, proxies=None, hooks=None, stream=None, verify=None, cert=None, json=None)
Constructs a Request, prepares it and sends it. Returns Response object.
resolve_redirects(resp, req, stream=False, timeout=None, verify=True, cert=None, proxies=None, yield_requests=False, **adapter_kwargs)
Receives a Response. Returns a generator of Responses or Requests.
send(request, **kwargs)
Send a given PreparedRequest.
Return type: requests.Response
should_strip_auth(old_url, new_url)
Decide whether Authorization header should be removed when redirecting
stream = None
Stream response content default.
trust_env = None
Trust environment settings for proxy configuration, default authentication and similar.
verify = None
SSL Verification default.
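Lower-Level Classes
class requests.Request(method=None, url=None, headers=None, files=None, data=None, params=None, auth=None, cookies=None, hooks=None, json=None)
A user-created Request object.
Used to prepare a PreparedRequest, which is sent to the server.
deregister_hook(event, hook)
Deregister a previously registered hook. Returns True if the hook existed, False if not.
prepare()
Constructs a PreparedRequest for transmission and returns it.
register_hook(event, hook)
Properly register a hook.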
class requests.Response
The Response object, which contains a server's response to an HTTP request.
apparent_encoding
The apparent encoding, provided by the chardet library.
close()
Releases the connection back to the pool. Once this method has been called the underlying raw object must not be accessed again.
Note: Should not normally need to be called explicitly.
content
Content of the response, in bytes.
cookies = None
A CookieJar of Cookies the server sent back.
elapsed = None
The amount of time elapsed between sending the request and the arrival of the response (as a timedelta). This property specifically measures the time taken between sending the first byte of the request and finishing parsing the headers. It is therefore unaffected by consuming the response content or the value of the stream keyword argument.
encoding = None
Encoding to decode with when accessing r.text.
headers = None
Case-insensitive Dictionary of Response Headers. For example, headers['content-encoding'] will return the value of a 'Content-Encoding' response header.
history = None
A list of Response objects from the history of the Request. Any redirect responses will end up here. The list is sorted from the oldest to the most recent request.
is_permanent_redirect
True if this Response is one of the permanent versions of redirect.
is_redirect
True if this Response is a well-formed HTTP redirect that could have been processed automatically (by Session.resolve_redirects).
iter_content(chunk_size=1, decode_unicode=False)
Iterates over the response data. When stream=True is set on the request, this avoids reading the content at once into memory for large responses. The chunk size is the number of bytes it should read into memory. This is not necessarily the length of each item returned as decoding can take place.
chunk_size must be of type int or None. A value of None will function differently depending on the value of stream. stream=True will read data as it arrives in whatever size the chunks are received. If stream=False, data is returned as a single chunk.
If decode_unicode is True, content will be decoded using the best available encoding based on the response.
iter_lines(chunk_size=512, decode_unicode=False, delimiter=None)
Iterates over the response data, one line at a time. When stream=True is set on the request, this avoids reading the content at once into memory for large responses.
Note: This method is not reentrant safe.
json(**kwargs)
Returns the json-encoded content of a response, if any.
Parameters: **kwargs – Optional arguments that json.loads takes.
Raises: ValueError – If the response body does not contain valid json.
links
Returns the parsed header links of the response, if any.
next
Returns a PreparedRequest for the next request in a redirect chain, if there is one.
ok
Returns True if status_code is less than 400, False if not.
This attribute checks if the status code of the response is between 400 and 600 to see if there was a client error or a server error. If the status code is between 200 and 400, this will return True. This is not a check to see if the response code is 200 OK.
raise_for_status()
Raises stored HTTPError, if one occurred.
reason = None
Textual reason of responded HTTP Status, e.g. "Not Found" or "OK".
request = None
The PreparedRequest object to which this is a response.
status_code = None
Integer Code of responded HTTP Status, e.g. 404 or 200.
text
Content of the response, in unicode.
If Response.encoding is None, encoding will be guessed using chardet.
The encoding of the response content is determined based solely on HTTP headers, following RFC 2616 to the letter. If you can take advantage of non-HTTP knowledge to make a better guess at the encoding, you should set r.encoding appropriately before accessing this property.
url = None
Final URL location of Response.
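The relationship between status_code, ok and raise_for_status() can be tried out on a Response built by hand. Normally a Response comes from a request; constructing one directly, as well as the helper names and the example.com URL, are assumptions made purely for illustration:

```python
import requests


def make_response(status_code, url="https://example.com/"):
    """Build a bare Response by hand (illustration only)."""
    response = requests.Response()
    response.status_code = status_code
    response.url = url
    return response


def error_message(response):
    """Return the HTTPError text for an error status, or None otherwise."""
    try:
        response.raise_for_status()
    except requests.exceptions.HTTPError as err:
        return str(err)
    return None
```

A 301 response still counts as ok, since ok only distinguishes codes below 400 from error codes.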
Lower-Lower-Level Classes
class requests.PreparedRequest
The fully mutable PreparedRequest object, containing the exact bytes that will be sent to the server.
Generated from either a Request object or manually.
body = None
request body to send to the server.
deregister_hook(event, hook)
Deregister a previously registered hook. Returns True if the hook existed, False if not.
headers = None
dictionary of HTTP headers.
hooks = None
dictionary of callback hooks, for internal usage.
method = None
HTTP verb to send to the server.
path_url
Build the path URL to use.
prepare(method=None, url=None, headers=None, files=None, data=None, params=None, auth=None, cookies=None, hooks=None, json=None)
Prepares the entire request with the given parameters.
prepare_auth(auth, url='')
Prepares the given HTTP auth data.
prepare_body(data, files, json=None)
Prepares the given HTTP body data.
prepare_content_length(body)
Prepare Content-Length header based on request method and body
prepare_cookies(cookies)
Prepares the given HTTP cookie data.
This function eventually generates a Cookie header from the given cookies using cookielib. Due to cookielib's design, the header will not be regenerated if it already exists, meaning this function can only be called once for the life of the PreparedRequest object. Any subsequent calls to prepare_cookies will have no actual effect, unless the "Cookie" header is removed beforehand.
prepare_headers(headers)
Prepares the given HTTP headers.
prepare_hooks(hooks)
Prepares the given hooks.
prepare_method(method)
Prepares the given HTTP method.
prepare_url(url, params)
Prepares the given HTTP URL.
register_hook(event, hook)
Properly register a hook.
url = None
HTTP URL to send the request to.
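To see what a PreparedRequest contains, a Request can be prepared and inspected before anything is sent. A minimal sketch; prepare_json_post is a hypothetical helper and the httpbin.org URL is only an example:

```python
import json

import requests


def prepare_json_post(url, payload):
    """Prepare (but do not send) a POST request with a JSON body."""
    return requests.Request("POST", url, json=payload).prepare()
```

The prepared object exposes method, url, headers and body, and could afterwards be handed to Session.send().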
class requests.adapters.BaseAdapter
The Base Transport Adapter
close()
Cleans up adapter specific items.
send(request, stream=False, timeout=None, verify=True, cert=None, proxies=None)
Sends PreparedRequest object. Returns Response object.
class requests.adapters.HTTPAdapter(pool_connections=10, pool_maxsize=10, max_retries=0, pool_block=False)
The built-in HTTP Adapter for urllib3.
Provides a general-case interface for Requests sessions to contact HTTP and HTTPS urls by implementing the Transport Adapter interface. This class will usually be created by the Session class under the covers.
add_headers(request, **kwargs)
Add any headers needed by the connection. As of v2.0 this does nothing by default, but is left for overriding by users that subclass the HTTPAdapter.
This should not be called from user code, and is only exposed for use when subclassing the HTTPAdapter.
build_response(req, resp)
Builds a Response object from a urllib3 response. This should not be called from user code, and is only exposed for use when subclassing the HTTPAdapter.
cert_verify(conn, url, verify, cert)
Verify a SSL certificate. This method should not be called from user code, and is only exposed for use when subclassing the HTTPAdapter.
close()
Disposes of any internal state.
Currently, this closes the PoolManager and any active ProxyManager, which closes any pooled connections.
get_connection(url, proxies=None)
Returns a urllib3 connection for the given URL. This should not be called from user code, and is only exposed for use when subclassing the HTTPAdapter.
init_poolmanager(connections, maxsize, block=False, **pool_kwargs)
Initializes a urllib3 PoolManager.
This method should not be called from user code, and is only exposed for use when subclassing the HTTPAdapter.
proxy_headers(proxy)
Returns a dictionary of the headers to add to any request sent through a proxy. This works with urllib3 magic to ensure that they are correctly sent to the proxy, rather than in a tunnelled request if CONNECT is being used.
This should not be called from user code, and is only exposed for use when subclassing the HTTPAdapter.
Parameters: proxy – The url of the proxy being used for this request.
Return type: dict
proxy_manager_for(proxy, **proxy_kwargs)
Return urllib3 ProxyManager for the given proxy.
This method should not be called from user code, and is only exposed for use when subclassing the HTTPAdapter.
request_url(request, proxies)
Obtain the url to use when making the final request.
If the message is being sent through a HTTP proxy, the full URL has to be used. Otherwise, we should only use the path portion of the URL.
This should not be called from user code, and is only exposed for use when subclassing the HTTPAdapter.
send(request, stream=False, timeout=None, verify=True, cert=None, proxies=None)
Sends PreparedRequest object. Returns Response object.
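A common reason to create an HTTPAdapter by hand is to set max_retries and mount the adapter on a session. The following is a sketch; session_with_retries is a hypothetical helper and the retry count is arbitrary:

```python
import requests
from requests.adapters import HTTPAdapter


def session_with_retries(retries=3, prefix="https://"):
    """Return a session whose requests to prefix use a retrying adapter."""
    session = requests.Session()
    adapter = HTTPAdapter(max_retries=retries)
    # Replaces the default adapter registered for this prefix.
    session.mount(prefix, adapter)
    return session, adapter
```

Every URL starting with the mounted prefix is then served by that adapter, which Session.get_adapter() can confirm.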
Authentication
class requests.auth.AuthBase
Base class that all auth implementations derive from
class requests.auth.HTTPBasicAuth(username, password)
Attaches HTTP Basic Authentication to the given Request object.
class requests.auth.HTTPDigestAuth(username, password)
Attaches HTTP Digest Authentication to the given Request object.
Encodings
requests.utils.get_encodings_from_content(content)
Returns encodings from given content string.
Parameters: content – bytestring to extract encodings from.
requests.utils.get_encoding_from_headers(headers)
Returns encodings from given HTTP Header Dict.
Parameters: headers – dictionary to extract encoding from.
Return type: str
requests.utils.get_unicode_from_response(r)
Returns the requested content back in unicode.
Parameters: r – Response object to get unicode content from.
Tried:
- charset from content-type
- fall back and replace all unicode characters
Return type: str
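As a brief illustration of get_encoding_from_headers, the helper below (declared_encoding is a hypothetical name) wraps a single Content-Type value in a header dictionary:

```python
from requests.utils import get_encoding_from_headers


def declared_encoding(content_type):
    """Return the encoding Requests derives from a Content-Type value."""
    return get_encoding_from_headers({"content-type": content_type})
```

An explicit charset parameter is returned as given, while a text type without a charset falls back to ISO-8859-1.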
Cookies
requests.utils.dict_from_cookiejar(cj)
Returns a key/value dictionary from a CookieJar.
Parameters: cj – CookieJar object to extract cookies from.
Return type: dict
requests.utils.add_dict_to_cookiejar(cj, cookie_dict)
Returns a CookieJar from a key/value dictionary.
requests.cookies.cookiejar_from_dict(cookie_dict, cookiejar=None, overwrite=True)
Returns a CookieJar from a key/value dictionary.
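The three cookie helpers above can be combined into a round trip from dictionary to jar and back. A sketch with made-up cookie names; round_trip is a hypothetical helper:

```python
from requests.cookies import cookiejar_from_dict
from requests.utils import add_dict_to_cookiejar, dict_from_cookiejar


def round_trip(cookie_dict, extra):
    """Build a jar from cookie_dict, add extra, and convert back to a dict."""
    jar = cookiejar_from_dict(cookie_dict)
    jar = add_dict_to_cookiejar(jar, extra)
    return dict_from_cookiejar(jar)
```

The resulting dictionary contains the cookies from both inputs.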
class requests.cookies.RequestsCookieJar(policy=None)
Compatibility class; is a cookielib.CookieJar, but exposes a dict interface.
This is the CookieJar we create by default for requests and sessions that don't specify one, since some clients may expect response.cookies and session.cookies to support dict operations.
Requests does not use the dict interface internally; it's just for compatibility with external client code. All requests code should work out of the box with externally provided instances of CookieJar, e.g. LWPCookieJar and FileCookieJar.
Unlike a regular CookieJar, this class is pickleable.
Warning: dictionary operations that are normally O(1) may be O(n).
add_cookie_header(request)
Add correct Cookie: header to request (urllib.request.Request object).
The Cookie2 header is also added unless policy.hide_cookie2 is true.
clear(domain=None, path=None, name=None)
Clear some cookies.
Invoking this method without arguments will clear all cookies. If given a single argument, only cookies belonging to that domain will be removed. If given two arguments, cookies belonging to the specified path within that domain are removed. If given three arguments, then the cookie with the specified name, path and domain is removed.
Raises KeyError if no matching cookie exists.
clear_expired_cookies()
Discard all expired cookies.
You probably don't need to call this method: expired cookies are never sent back to the server (provided you're using DefaultCookiePolicy), this method is called by CookieJar itself every so often, and the .save() method won't save expired cookies anyway (unless you ask otherwise by passing a true ignore_expires argument).
clear_session_cookies()
Discard all session cookies.
Note that the .save() method won't save session cookies anyway, unless you ask otherwise by passing a true ignore_discard argument.
copy()
Return a copy of this RequestsCookieJar.
extract_cookies(response, request)
Extract cookies from response, where allowable given the request.
get(name, default=None, domain=None, path=None)
Dict-like get() that also supports optional domain and path args in order to resolve naming collisions from using one cookie jar over multiple domains.
Warning: operation is O(n), not O(1).
get_dict(domain=None, path=None)
Takes as an argument an optional domain and path and returns a plain old Python dict of name-value pairs of cookies that meet the requirements.
Return type: dict
get_policy()
Return the CookiePolicy instance used.
items()
Dict-like items() that returns a list of name-value tuples from the jar. Allows client-code to call dict(RequestsCookieJar) and get a vanilla python dict of key value pairs.
See also: keys() and values().
iteritems()
Dict-like iteritems() that returns an iterator of name-value tuples from the jar.
See also: iterkeys() and itervalues().
iterkeys()
Dict-like iterkeys() that returns an iterator of names of cookies from the jar.
See also: itervalues() and iteritems().
itervalues()
Dict-like itervalues() that returns an iterator of values of cookies from the jar.
See also: iterkeys() and iteritems().
keys()
Dict-like keys() that returns a list of names of cookies from the jar.
See also: values() and items().
list_domains()
Utility method to list all the domains in the jar.
list_paths()
Utility method to list all the paths in the jar.
make_cookies(response, request)
Return sequence of Cookie objects extracted from response object.
multiple_domains()
Returns True if there are multiple domains in the jar. Returns False otherwise.
Return type: bool
pop(k[, d]) → v, remove specified key and return the corresponding value.
If key is not found, d is returned if given, otherwise KeyError is raised.
popitem() → (k, v), remove and return some (key, value) pair
as a 2-tuple; but raise KeyError if D is empty.
set(name, value, **kwargs)
Dict-like set() that also supports optional domain and path args in order to resolve naming collisions from using one cookie jar over multiple domains.
set_cookie(cookie, *args, **kwargs)
Set a cookie, without checking whether or not it should be set.
set_cookie_if_ok(cookie, request)
Set a cookie if policy says it's OK to do so.
setdefault(k[, d]) → D.get(k,d), also set D[k]=d if k not in D
update(other)
Updates this jar with cookies from another CookieJar or dict-like
values()
Dict-like values() that returns a list of values of cookies from the jar.
exception requests.cookies.CookieConflictError
There are two cookies that meet the criteria specified in the cookie jar. Use .get and .set and include domain and path args in order to be more specific.
with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
Status Code Lookup
The codes object defines a mapping from common names for HTTP statuses to their numerical codes, accessible either as attributes or as dictionary items.
Some codes have multiple names, and both upper- and lower-case versions of the names are allowed. For example, codes.ok, codes.OK, and codes.okay all correspond to the HTTP status code 200.
- 100: continue
- 101: switching_protocols
- 102: processing
- 103: checkpoint
- 122: uri_too_long , request_uri_too_long
- 200: ok , okay , all_ok , all_okay , all_good , \o/ , ✓
- 201: created
- 202: accepted
- 203: non_authoritative_info , non_authoritative_information
- 204: no_content
- 205: reset_content , reset
- 206: partial_content , partial
- 207: multi_status , multiple_status , multi_stati , multiple_stati
- 208: already_reported
- 226: im_used
- 300: multiple_choices
- 301: moved_permanently , moved , \o-
- 302: found
- 303: see_other , other
- 304: not_modified
- 305: use_proxy
- 306: switch_proxy
- 307: temporary_redirect , temporary_moved , temporary
- 308: permanent_redirect , resume_incomplete , resume
- 400: bad_request , bad
- 401: unauthorized
- 402: payment_required , payment
- 403: forbidden
- 404: not_found , -o-
- 405: method_not_allowed , not_allowed
- 406: not_acceptable
- 407: proxy_authentication_required , proxy_auth , proxy_authentication
- 408: request_timeout , timeout
- 409: conflict
- 410: gone
- 411: length_required
- 412: precondition_failed , precondition
- 413: request_entity_too_large
- 414: request_uri_too_large
- 415: unsupported_media_type , unsupported_media , media_type
- 416: requested_range_not_satisfiable , requested_range , range_not_satisfiable
- 417: expectation_failed
- 418: im_a_teapot , teapot , i_am_a_teapot
- 421: misdirected_request
- 422: unprocessable_entity , unprocessable
- 423: locked
- 424: failed_dependency , dependency
- 425: unordered_collection , unordered
- 426: upgrade_required , upgrade
- 428: precondition_required , precondition
- 429: too_many_requests , too_many
- 431: header_fields_too_large , fields_too_large
- 444: no_response , none
- 449: retry_with , retry
- 450: blocked_by_windows_parental_controls , parental_controls
- 451: unavailable_for_legal_reasons , legal_reasons
- 499: client_closed_request
- 500: internal_server_error , server_error , /o , ✗
- 501: not_implemented
- 502: bad_gateway
- 503: service_unavailable , unavailable
- 504: gateway_timeout
- 505: http_version_not_supported , http_version
- 506: variant_also_negotiates
- 507: insufficient_storage
- 509: bandwidth_limit_exceeded , bandwidth
- 510: not_extended
- 511: network_authentication_required , network_auth , network_authentication
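As a brief illustration of the lookup described above, the sketch below reads a few codes both as attributes and as dictionary items. It assumes the requests package is installed; the `describe` helper is a hypothetical name introduced here and is not part of Requests.

```python
import requests

# Attribute access: several aliases map to the same numeric code.
assert requests.codes.ok == requests.codes.OK == requests.codes.okay == 200

# Dictionary-style access works with the same names.
not_found = requests.codes["not_found"]


def describe(status_code):
    """Hypothetical helper: classify a status code using requests.codes."""
    if status_code == requests.codes.ok:
        return "success"
    if status_code == requests.codes.not_found:
        return "missing"
    if status_code >= requests.codes.internal_server_error:
        return "server error"
    return "other"


print(not_found)
print(describe(503))
```

Comparing against named codes rather than bare integers tends to make the intent of a status check easier to read.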
Migrating to 1.x
This section details the main differences between 0.x and 1.x and is meant to ease the pain of upgrading.
API Changes
Response.json is now a callable and not a property of a response.
The Session API has changed. Session objects no longer take parameters. Session is also now capitalized, but it can still be instantiated with a lowercase session for backwards compatibility.
All request hooks have been removed except ‘response’.
Authentication helpers have been broken out into separate modules. See requests-oauthlib and requests-kerberos.
The parameter for streaming requests was changed from prefetch to stream and the logic was inverted. In addition, stream is now required for raw response reading.
The config parameter to the requests method has been removed. Some of these options are now configured on a Session such as keep-alive and maximum number of redirects. The verbosity option should be handled by configuring logging.
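The changes above can be seen together in a short sketch. It assumes requests 1.x or later; the `fetch_json` helper and the httpbin.org URL in the final comment are illustrative assumptions, not part of the migration notes.

```python
import requests


def fetch_json(url, session=None):
    """Hypothetical helper: GET a URL and decode the JSON body."""
    # Session takes no constructor parameters; options are set as attributes.
    own_session = session is None
    if own_session:
        session = requests.Session()
        session.max_redirects = 5  # formerly part of the removed config parameter
    try:
        # stream=True defers the body download (it replaces the old prefetch flag).
        response = session.get(url, stream=True, timeout=10)
        response.raise_for_status()
        # json is a method now, so it must be called.
        return response.json()
    finally:
        if own_session:
            session.close()


# Example call (needs network access):
# print(fetch_json("https://httpbin.org/get")["url"])
```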
Licensing
One key difference that has nothing to do with the API is a change in the license from the ISC license to the Apache 2.0 license. The Apache 2.0 license ensures that contributions to Requests are also covered by the Apache 2.0 license.
Migrating to 2.x
Compared with the 1.0 release, there were relatively few backwards incompatible changes, but there are still a few issues to be aware of with this major release.
For more details on the changes in this release including new APIs, links to the relevant GitHub issues and some of the bug fixes, read Cory’s blog on the subject.
API Changes
There were a couple of changes to how Requests handles exceptions. RequestException is now a subclass of IOError rather than RuntimeError, as that more accurately categorizes the type of error. In addition, an invalid URL escape sequence now raises a subclass of RequestException rather than a ValueError.
Lastly, httplib.IncompleteRead exceptions caused by incorrect chunked encoding will now raise a Requests ChunkedEncodingError instead.
The proxy API has changed slightly. The scheme for a proxy URL is now required.
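A quick way to see these exception changes is to inspect the class hierarchy. The sketch below assumes requests 2.x on Python 3 (where IOError is an alias of OSError); the proxy addresses are made-up placeholders, not real proxies.

```python
from requests.exceptions import (
    ChunkedEncodingError,
    InvalidURL,
    RequestException,
)

# RequestException derives from IOError, so `except IOError` also catches it.
print(issubclass(RequestException, IOError))

# URL problems and chunked-encoding failures are RequestException subclasses.
print(issubclass(InvalidURL, RequestException))
print(issubclass(ChunkedEncodingError, RequestException))

# Proxy URLs must include a scheme (placeholder addresses).
proxies = {
    "http": "http://10.10.1.10:3128",
    "https": "http://10.10.1.10:1080",
}
# requests.get("http://example.org", proxies=proxies)  # needs a reachable proxy
```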
Behavioural Changes
- Keys in the headers dictionary are now native strings on all Python versions, i.e. bytestrings on Python 2 and unicode on Python 3. If the keys are not native strings (unicode on Python 2 or bytestrings on Python 3) they will be converted to the native string type assuming UTF-8 encoding.
- Values in the headers dictionary should always be strings. This has been the project’s position since before 1.0 but a recent change (since version 2.11.0) enforces this more strictly. It’s advised to avoid passing header values as unicode when possible.
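To illustrate the stricter header handling, the following sketch prepares a request offline without sending it. It assumes requests 2.11.0 or later, where a non-string header value is expected to be rejected with InvalidHeader; the header name X-Retry-Count and the `build_request` helper are invented for the example.

```python
import requests
from requests.exceptions import InvalidHeader


def build_request(retry_count):
    """Hypothetical helper: prepare (but do not send) a request."""
    headers = {"X-Retry-Count": retry_count}  # made-up header name
    request = requests.Request("GET", "https://httpbin.org/get", headers=headers)
    return request.prepare()


# A string value is accepted.
prepared = build_request(str(3))
print(prepared.headers["X-Retry-Count"])

# An integer value is rejected while the request is being prepared.
try:
    build_request(3)
except InvalidHeader as err:
    print("Rejected header value:", err)
```

Converting values with str() at the call site keeps the behaviour the same across Requests versions.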
24 Dec. 2015, Python, 342315 views
The Python standard library ships with a number of ready-made modules for working with HTTP.
- urllib
- httplib
If you really want to go hardcore, you can work with socket directly. But all of these modules share one big drawback: they are awkward to use.
First, there is an abundance of classes and functions. Second, the resulting code is not pythonic at all. Many programmers love Python for its elegance and simplicity, which is why a module was created to solve this problem. Its name is requests, or HTTP for Humans. At the time of writing, the latest version of the library is 2.9.1. Since the release of Python 3.5 I have made an unspoken promise to myself to write new code only for Py >= 3.5. It is high time to move fully to the 3.x branch of the snake, so in my examples print is from now on a function, not a statement.
What can requests do?
To begin with, I want to show what HTTP code looks like with the modules from the Python standard library and what it looks like with requests. As the target for firing HTTP requests at, we will use the very handy service httpbin.org.
>>> import urllib.request
>>> response = urllib.request.urlopen('https://httpbin.org/get')
>>> print(response.read())
b'{\n "args": {}, \n "headers": {\n "Accept-Encoding": "identity", \n "Host": "httpbin.org", \n "User-Agent": "Python-urllib/3.5"\n }, \n "origin": "95.56.82.136", \n "url": "https://httpbin.org/get"\n}\n'
>>> print(response.getheader('Server'))
nginx
>>> print(response.getcode())
200
>>>
By the way, urllib.request is a layer on top of the "low-level" httplib library that I mentioned above.
>>> import requests
>>> response = requests.get('https://httpbin.org/get')
>>> print(response.content)
b'{\n "args": {}, \n "headers": {\n "Accept": "*/*", \n "Accept-Encoding": "gzip, deflate", \n "Host": "httpbin.org", \n "User-Agent": "python-requests/2.9.1"\n }, \n "origin": "95.56.82.136", \n "url": "https://httpbin.org/get"\n}\n'
>>> response.json()
{'headers': {'Accept-Encoding': 'gzip, deflate', 'User-Agent': 'python-requests/2.9.1', 'Host': 'httpbin.org', 'Accept': '*/*'}, 'args': {}, 'origin': '95.56.82.136', 'url': 'https://httpbin.org/get'}
>>> response.headers
{'Connection': 'keep-alive', 'Content-Type': 'application/json', 'Server': 'nginx', 'Access-Control-Allow-Credentials': 'true', 'Access-Control-Allow-Origin': '*', 'Content-Length': '237', 'Date': 'Wed, 23 Dec 2015 17:56:46 GMT'}
>>> response.headers.get('Server')
'nginx'
For simple request methods there is no significant difference between the two. But let's look at working with Basic Auth:
>>> import urllib.request
>>> password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
>>> top_level_url = 'https://httpbin.org/basic-auth/user/passwd'
>>> password_mgr.add_password(None, top_level_url, 'user', 'passwd')
>>> handler = urllib.request.HTTPBasicAuthHandler(password_mgr)
>>> opener = urllib.request.build_opener(handler)
>>> response = opener.open(top_level_url)
>>> response.getcode()
200
>>> response.read()
b'{\n "authenticated": true, \n "user": "user"\n}\n'
>>> import requests
>>> response = requests.get('https://httpbin.org/basic-auth/user/passwd', auth=('user', 'passwd'))
>>> print(response.content)
b'{\n "authenticated": true, \n "user": "user"\n}\n'
>>> print(response.json())
{'user': 'user', 'authenticated': True}
Now can you feel the difference between pythonic and non-pythonic? I think the difference is obvious. And even though requests is nothing more than a wrapper around urllib3, which in turn is built on top of Python's standard facilities, convenience of writing code is in most cases priority number one.
requests offers:
- Many HTTP authentication methods
- Sessions with cookies
- Full SSL support
- Various convenience methods such as .json(), which return data in the required format
- Proxying
- Sound and logical exception handling
I would like to talk about the last point in a bit more detail.
Exception handling in requests
When working with external services, never rely on their fault tolerance. Everything goes down sooner or later, so we programmers need to be ready for that at all times, preferably in advance and in a calm setting.
So, how does requests cope with the various failures that happen during network connections? First, let's define a set of problems that can arise:
- The host is unreachable. This kind of error usually comes from DNS configuration problems (DNS lookup failure).
- The connection times out
- HTTP errors. See the HTTP status codes for details.
- SSL connection errors (usually caused by a problem with the SSL certificate: expired, untrusted, and so on)
The base exception class in requests is RequestException. All the others inherit from it:
- HTTPError
- ConnectionError
- Timeout
- SSLError
- ProxyError
And so on. The full list of exceptions can be found in requests.exceptions.
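The hierarchy listed above determines the order in which checks should appear: specific classes first, the base class last. The following sketch assumes a recent requests 2.x release; `classify` is a hypothetical helper written only for this illustration.

```python
from requests import exceptions as exc


def classify(error):
    """Hypothetical helper: map a requests exception to a short label."""
    # Check subclasses before their parents, mirroring except-clause order.
    if isinstance(error, exc.Timeout):
        return "timeout"
    if isinstance(error, exc.SSLError):
        return "ssl"
    if isinstance(error, exc.ProxyError):
        return "proxy"
    if isinstance(error, exc.ConnectionError):
        return "connection"
    if isinstance(error, exc.HTTPError):
        return "http"
    if isinstance(error, exc.RequestException):
        return "other requests error"
    return "not a requests error"


# Every exception named above inherits from RequestException.
for cls in (exc.HTTPError, exc.ConnectionError, exc.Timeout,
            exc.SSLError, exc.ProxyError):
    print(cls.__name__, issubclass(cls, exc.RequestException))
```

Because SSLError and ProxyError are themselves subclasses of ConnectionError, testing for ConnectionError first would hide them, which is why the more specific checks come earlier.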
Timeout
requests has two kinds of timeout exceptions:
- ConnectTimeout: the timeout for establishing a connection
- ReadTimeout: the timeout for reading the response
>>> import requests
>>> try:
...     # timeout=(connect, read) in seconds; the connect limit is tiny on purpose
...     response = requests.get('https://httpbin.org/user-agent', timeout=(0.00001, 10))
... except requests.exceptions.ConnectTimeout:
...     print('Oops. Connection timeout occurred!')
...
Oops. Connection timeout occurred!
>>> try:
...     # here the read limit is tiny, so the connection succeeds but reading fails
...     response = requests.get('https://httpbin.org/user-agent', timeout=(10, 0.0001))
... except requests.exceptions.ReadTimeout:
...     print('Oops. Read timeout occurred')
... except requests.exceptions.ConnectTimeout:
...     print('Oops. Connection timeout occurred!')
...
Oops. Read timeout occurred
ConnectionError
>>> import requests
>>> try:
...     response = requests.get('http://urldoesnotexistforsure.bom')
... except requests.exceptions.ConnectionError:
...     print('Seems like dns lookup failed..')
...
Seems like dns lookup failed..
HTTPError
>>> import requests
>>> try:
...     response = requests.get('https://httpbin.org/status/500')
...     # raise_for_status() raises HTTPError for 4xx and 5xx responses
...     response.raise_for_status()
... except requests.exceptions.HTTPError as err:
...     print('Oops. HTTP Error occurred')
...     print('Response is: {content}'.format(content=err.response.content))
...
Oops. HTTP Error occurred
Response is: b''
I have listed the main kinds of exceptions, which cover perhaps 90% of all problems that come up when working with HTTP. The main thing to remember is that if we really intend to catch something and handle it, we have to program that explicitly. If the specific exception type does not matter to us, we can catch the common base class RequestException and act depending on the case, for example log the exception and re-raise it. By the way, I will write a separate detailed post about logging.
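The advice above (handle explicitly what matters, otherwise catch the base class, log, and re-raise) might be sketched as follows. The `fetch` helper, its `session` parameter, and the timeout values are assumptions made for this example, not something prescribed by requests.

```python
import logging

import requests

logger = logging.getLogger(__name__)


def fetch(url, session=None, timeout=(3.05, 10)):
    """Hypothetical helper: GET a URL, logging and re-raising any failure."""
    http = session if session is not None else requests
    try:
        response = http.get(url, timeout=timeout)
        response.raise_for_status()
        return response
    except requests.exceptions.HTTPError as err:
        # Handled explicitly because the status code is worth logging.
        logger.error("HTTP error %s for %s", err.response.status_code, url)
        raise
    except requests.exceptions.RequestException:
        # Any other requests failure: log with traceback and pass it upward.
        logger.exception("Request to %s failed", url)
        raise
```

Re-raising with a bare `raise` keeps the original traceback, so the caller can still decide how to react to the specific exception type.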
Useful goodies
- httpbin.org: a very useful service for testing HTTP clients, especially handy for testing non-standard service behavior
- httpie: a command-line HTTP client (a curl replacement) written in Python
- responses: a mock library for working with requests
- HTTPretty: a mock library for working with HTTP modules