Contents
- How to fix error in python requests?
- 1 Answer 1
- python requests http response 500 (site can be reached in browser)
- 3 Answers 3
- But Wait! There’s More!
- Jupyter Notebook 500 : Internal Server Error
- 33 Answers 33
- Catching a 500 server error in Flask
- 6 Answers 6
- HOWTO Fetch Internet Resources Using The urllib Package¶
- Introduction¶
- Fetching URLs¶
- Data¶
- Headers¶
- Handling Exceptions¶
- URLError¶
- HTTPError¶
- Error Codes¶
- Wrapping it Up¶
- Number 1¶
- Number 2¶
- info and geturl¶
- Openers and Handlers¶
- Basic Authentication¶
- Proxies¶
- Sockets and Layers¶
- Footnotes¶
How to fix error in python requests?
I am using an API that receives a PDF file and does some analysis, but I always receive Response 500.
I initially tested it using Postman and the request goes through, returning response 200 with the corresponding JSON information. The SSL security should be turned off.
However, when I try to make the request via Python, I always get Response 500.
Python code written by me:
Python code, produced by the Postman:
I have masked the API link as <> for confidentiality.
Response by Postman:
Response by Python:
UPDATE:
I tried a GET request and it works fine: I receive the JSON response from it. I guess the problem is in posting the PDF file. Are there any other options for posting a file to an API?
Postman Response RAW:
CORRECT REQUEST
So, eventually — the correct code is the following:
1 Answer 1
A 500 error indicates an internal server error, not an error with your script.
If you’re receiving a 500 error (as opposed to a 400 error, which indicates a bad request), then theoretically your script is fine and it’s the server-side code that needs to be adjusted.
In practice, it could still be due to a bad request though.
If you’re the one running the API, then you can check the error logs and debug the code line-by-line to figure out why the server is throwing an error.
In this case though, it sounds like it’s a third-party API, correct? If so, I recommend looking through their documentation to find a working example or contacting them if you think it’s an issue on their end (which is unlikely but possible).
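The snippets from the question are not preserved here, so the following is only a sketch of one common way to upload a file with requests: passing it through the `files` parameter, which produces a multipart/form-data body. The endpoint URL, the form field name `file`, and the helper names are assumptions for illustration; the real API may expect something different.

```python
import io

import requests

API_URL = "https://api.example.invalid/analyze"  # hypothetical endpoint


def build_pdf_upload(pdf_bytes, filename="document.pdf"):
    """Prepare (but do not send) a multipart/form-data POST carrying a PDF."""
    # "file" is an assumed form field name; check the API documentation.
    files = {"file": (filename, io.BytesIO(pdf_bytes), "application/pdf")}
    return requests.Request("POST", API_URL, files=files).prepare()


def upload_pdf(pdf_bytes):
    """Send the upload and raise an exception for 4xx/5xx responses."""
    prepared = build_pdf_upload(pdf_bytes)
    with requests.Session() as session:
        # verify=False mirrors "SSL security turned off" in the question;
        # it disables certificate checks, so use it only when you must.
        response = session.send(prepared, verify=False, timeout=30)
    response.raise_for_status()
    return response.json()
```

Inspecting the prepared request (its headers and body) before sending makes it easier to compare against what Postman sends.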
python requests http response 500 (site can be reached in browser)
I am trying to figure out what I’m doing wrong here, but I keep getting lost.
In Python 2.7, I'm running the following code:
If I open this URL in a browser, it responds properly. I dug around and found a similar question about the urllib library (500 error with urllib.request.urlopen), but I am not able to adapt it; besides, I would like to use requests here.
I might be hitting a missing proxy setting here, as suggested for example in (Perl File::Fetch Failed HTTP response: 500 Internal Server Error), but can someone explain to me what the proper workaround is?
3 Answers 3
One thing that is different with the browser request is the User-Agent; however you can alter it using requests like this:
Some web applications will also check the Origin and/or the Referer headers (for example for AJAX requests); you can set these in a similar fashion to User-Agent .
Remember, you are setting these headers to basically bypass checks so please be a good netizen and don’t abuse people’s resources.
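As a sketch of the idea, browser-like headers can be set once on a session and extended per request. The header values and the helper names below are illustrative assumptions; copy the real values from your own browser.

```python
import requests

# Illustrative values only; take real ones from your browser's Network tab.
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0",
    "Accept-Language": "en-US,en;q=0.9",
}


def prepare_get(url, extra_headers=None):
    """Build a GET request with browser-like headers without sending it."""
    session = requests.Session()
    session.headers.update(BROWSER_HEADERS)
    request = requests.Request("GET", url, headers=extra_headers or {})
    # prepare_request merges session-level and per-request headers.
    return session.prepare_request(request)


def fetch(url):
    """Send the request with the browser-like headers."""
    return requests.get(url, headers=BROWSER_HEADERS, timeout=10)
```

Calling `prepare_get(...)` and printing `.headers` shows exactly what would go over the wire, which is handy when comparing against the browser.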
The User-Agent, and also other header elements, could be causing your problem.
When I came across this error I watched a regular request made by a browser using Wireshark, and it turned out there were things other than just the User-Agent in the header which the server expected to be there.
After emulating the header sent by the browser in python requests, the server stopped throwing errors.
But Wait! There’s More!
The above answers did help me on the path to resolution, but I had to find still more things to add to my headers so that certain sites would let me in using python requests. Learning how to use Wireshark (suggested above) was a good new skill for me, but I found an easier way.
If you open your browser's developer tools (in Chrome, right-click and then click Inspect), go to the Network tab, select one of the entries under Name on the left, and expand Request Headers under Headers, you'll get a complete list of what your system is sending to the server. I started adding the elements I thought were most likely needed, one at a time, testing until my errors went away. Then I reduced that set to the smallest one that worked. In my case, my headers already contained User-Agent to deal with other code issues, and I only needed to add the Accept-Language key to deal with a few other sites.
I hope this process helps others to find ways to eliminate undesirable python requests return codes where possible.
Jupyter Notebook 500 : Internal Server Error
I want to learn how to use Jupyter Notebook. So far, I have managed to download and install it (using pip), but I’m having trouble opening it.
I am opening it by typing:
in my terminal. It opens in my browser, with the URL:
and I just get a big:
message. Could someone point me in the right direction of what’s going wrong please?
The full error message in my terminal:
When attempting to update ipython as advised, the following error message was produced:
33 Answers 33
Try upgrading jupyter hub first:
If you are inside a conda environment, run the following command instead.
After trying all the solutions on this page without success, a variation of @kruger answer is what worked for me, simply this:
pip install --upgrade nbconvert
Was having a similar problem. Fixed it after upgrading ipython with this command
sudo pip install --upgrade "ipython[all]"
Note: make sure to type ipython with double quotes and [all]
I solved this by upgrading the nbconvert package
I had the same problem and it was a bit painful until I managed to fix it. The magic line that worked for me was
I also encountered this problem. The root cause in my case was that I already had Jinja2 installed with root permissions (having used sudo pip install before I knew better).
My solution was to uninstall Jinja2 with sudo pip uninstall (which was required because it was installed with root permissions), and re-run pip install jupyter to reinstall it with regular user permissions.
While using sudo to install works here, it makes the problem worse in the longer term because all its packages are installed with root permissions, leading to further problems like this in future with other packages. It’s kind of like kicking that can down the road.
Many won’t care of course, as long as it works. But for those that do I thought I’d mention.
There’s no way to know for sure what the offending package is, but it’s likely to be one of those in the stack trace. I noticed Jinja2 as one I vaguely remembered from my early days in Python so I started there and it worked.
Catching a 500 server error in Flask
I love Flask’s error catching. It’s beautifully simple:
It works like a charm. But it doesn't work for the 500 error code. I want to catch Python errors when something goes wrong and an exception is raised in the code. Is that possible?
I should note that if I explicitly call return abort(500) in a view then the 500 errorhandler does work. So this is explicitly for when the Python code fails.
Is this possible?
6 Answers 6
What you have described is, by default, how Flask works. My assumption is that you are running in debug mode, and therefore exceptions are being shown to you in the debug screen. Make sure debug mode is off, then try again. Here is a comment directly from the code itself:
Default exception handling that kicks in when an exception occurs that is not caught. In debug mode the exception will be re-raised immediately, otherwise it is logged and the handler for a 500 internal server error is used. If no such handler exists, a default 500 internal server error message is displayed.
It works fine in my side:
Flask will not set the error code for you, so make sure to also provide the HTTP status code when returning a response.
Here is my code snippet:
My solution to this was to turn on the propagation of exceptions, by modifying the config dictionary:
The issue is that within the code, not all exceptions are HTTPException instances, but Flask catches these by default and returns a generic 500 error response (which may or may not include the original error message, as described by @Mark Hildreth). Thus, using @app.errorhandler(500) will not catch those errors, since this happens before Flask returns the generic 500 error.
You would need a generic errorhandler(Exception), which works similarly to except Exception: in Python and captures everything. A good solution is provided in the Flask documentation from the Pallets Projects:
You can also return JSON if you’d like and also include the original error message if you’re in debug mode. E.g.
This code catches the 500 status code and retrieves the exception:
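The code from these answers is not preserved here. As a rough sketch of the generic handler idea discussed above, one can register a handler for `Exception`, let HTTP errors pass through, and answer everything else with a JSON body and an explicit 500 status. The `/boom` route and the payload keys are hypothetical.

```python
from flask import Flask, jsonify
from werkzeug.exceptions import HTTPException

app = Flask(__name__)


@app.errorhandler(Exception)
def handle_exception(error):
    # Let HTTP errors (404, 405, ...) pass through unchanged.
    if isinstance(error, HTTPException):
        return error
    # Anything else is an unhandled Python exception: answer with 500.
    payload = {"error": "Internal Server Error"}
    if app.debug:
        # Only expose the original message while debugging.
        payload["detail"] = str(error)
    # Flask does not set the status code for you, so return it explicitly.
    return jsonify(payload), 500


@app.route("/boom")
def boom():
    # Hypothetical view that fails, to exercise the handler.
    raise RuntimeError("something went wrong")
```

With debug mode off, a request to `/boom` should produce the JSON payload with status 500, while unknown URLs still return the usual 404.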
Site design / logo © 2023 Stack Exchange Inc; user contributions licensed under CC BY-SA . rev 2023.1.14.43159
HOWTO Fetch Internet Resources Using The urllib Package¶
There is a French translation of an earlier revision of this HOWTO, available at urllib2 — Le Manuel manquant.
Introduction¶
You may also find useful the following article on fetching web resources with Python:
A tutorial on Basic Authentication, with examples in Python.
urllib.request is a Python module for fetching URLs (Uniform Resource Locators). It offers a very simple interface, in the form of the urlopen function. This is capable of fetching URLs using a variety of different protocols. It also offers a slightly more complex interface for handling common situations — like basic authentication, cookies, proxies and so on. These are provided by objects called handlers and openers.
urllib.request supports fetching URLs for many "URL schemes" (identified by the string before the ":" in the URL; for example, "ftp" is the URL scheme of "ftp://python.org/") using their associated network protocols (e.g. FTP, HTTP). This tutorial focuses on the most common case, HTTP.
For straightforward situations urlopen is very easy to use. But as soon as you encounter errors or non-trivial cases when opening HTTP URLs, you will need some understanding of the HyperText Transfer Protocol. The most comprehensive and authoritative reference to HTTP is RFC 2616. This is a technical document and not intended to be easy to read. This HOWTO aims to illustrate using urllib, with enough detail about HTTP to help you through. It is not intended to replace the urllib.request docs, but is supplementary to them.
Fetching URLs¶
The simplest way to use urllib.request is as follows:
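The original example is missing at this point. A minimal sketch of the basic pattern, assuming a hypothetical helper named `fetch`, could be:

```python
import urllib.request


def fetch(url, max_bytes=300):
    """Open url and return up to max_bytes of the response body."""
    # The response object is file-like and works as a context manager.
    with urllib.request.urlopen(url) as response:
        return response.read(max_bytes)


# Example call (needs network access; the URL is only an illustration):
# html = fetch("http://python.org/")
```

The same function works for other schemes that urllib supports, such as `file:` or `data:` URLs.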
If you wish to retrieve a resource via URL and store it in a temporary location, you can do so via the urlretrieve() function:
Many uses of urllib will be that simple (note that instead of an 'http:' URL we could have used a URL starting with 'ftp:', 'file:', etc.). However, it's the purpose of this tutorial to explain the more complicated cases, concentrating on HTTP.
HTTP is based on requests and responses — the client makes requests and servers send responses. urllib.request mirrors this with a Request object which represents the HTTP request you are making. In its simplest form you create a Request object that specifies the URL you want to fetch. Calling urlopen with this Request object returns a response object for the URL requested. This response is a file-like object, which means you can for example call .read() on the response:
Note that urllib.request makes use of the same Request interface to handle all URL schemes. For example, you can make an FTP request like so:
In the case of HTTP, there are two extra things that Request objects allow you to do: First, you can pass data to be sent to the server. Second, you can pass extra information ("metadata") about the data or about the request itself to the server; this information is sent as HTTP "headers". Let's look at each of these in turn.
Data¶
Sometimes you want to send data to a URL (often the URL will refer to a CGI (Common Gateway Interface) script or other web application). With HTTP, this is often done using what’s known as a POST request. This is often what your browser does when you submit a HTML form that you filled in on the web. Not all POSTs have to come from forms: you can use a POST to transmit arbitrary data to your own application. In the common case of HTML forms, the data needs to be encoded in a standard way, and then passed to the Request object as the data argument. The encoding is done using a function from the urllib.parse library.
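As a sketch of the encoding step described above (the URL and form fields are made up for illustration), the values are URL-encoded, converted to bytes, and handed to `Request` as `data`:

```python
import urllib.parse
import urllib.request


def build_post_request(url, values):
    """Encode form values and attach them as the POST body."""
    # urlencode produces a str; the data argument must be bytes.
    data = urllib.parse.urlencode(values).encode("ascii")
    return urllib.request.Request(url, data=data)


# Hypothetical endpoint and form fields, for illustration only.
req = build_post_request(
    "http://www.example.com/cgi-bin/register.cgi",
    {"name": "Jane Doe", "language": "Python"},
)
# Passing req to urllib.request.urlopen() would send the POST.
```

Because `data` is present, urllib will issue a POST rather than a GET for this request.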
Note that other encodings are sometimes required (e.g. for file upload from HTML forms — see HTML Specification, Form Submission for more details).
If you do not pass the data argument, urllib uses a GET request. One way in which GET and POST requests differ is that POST requests often have "side-effects": they change the state of the system in some way (for example by placing an order with the website for a hundredweight of tinned spam to be delivered to your door). Though the HTTP standard makes it clear that POSTs are intended to always cause side-effects, and GET requests never to cause side-effects, nothing prevents a GET request from having side-effects, nor a POST request from having no side-effects. Data can also be passed in an HTTP GET request by encoding it in the URL itself.
This is done as follows:
Notice that the full URL is created by adding a ? to the URL, followed by the encoded values.
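The example itself is missing here. A small sketch of building such a URL, with an invented base URL and field names, might be:

```python
import urllib.parse
import urllib.request


def build_get_request(base_url, values):
    """Append URL-encoded values to base_url and wrap it in a Request."""
    query = urllib.parse.urlencode(values)
    full_url = base_url + "?" + query
    # No data argument is given, so this will be sent as a GET.
    return urllib.request.Request(full_url)


# Hypothetical URL and parameters, for illustration only.
get_req = build_get_request(
    "http://www.example.com/example.cgi",
    {"name": "Somebody Here", "language": "Python"},
)
```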
Headers¶
We'll discuss here one particular HTTP header, to illustrate how to add headers to your HTTP request.
Some websites [1] dislike being browsed by programs, or send different versions to different browsers [2]. By default urllib identifies itself as Python-urllib/x.y (where x and y are the major and minor version numbers of the Python release, e.g. Python-urllib/2.5 ), which may confuse the site, or just plain not work. The way a browser identifies itself is through the User-Agent header [3]. When you create a Request object you can pass a dictionary of headers in. The following example makes the same request as above, but identifies itself as a version of Internet Explorer [4].
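The example is not preserved in this copy. As a sketch, a headers dictionary can be passed when constructing the `Request`; the User-Agent string below is an invented placeholder rather than a real browser signature.

```python
import urllib.request

# Placeholder value; substitute the identification string you need.
USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0"


def build_request(url, user_agent=USER_AGENT):
    """Create a Request that identifies itself with a custom User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


request = build_request("http://www.example.com/")
# urllib stores header names capitalized, e.g. "User-agent".
```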
The response also has two useful methods. See the section on info and geturl which comes after we have a look at what happens when things go wrong.
Handling Exceptions¶
urlopen raises URLError when it cannot handle a response (though as usual with Python APIs, built-in exceptions such as ValueError , TypeError etc. may also be raised).
HTTPError is the subclass of URLError raised in the specific case of HTTP URLs.
The exception classes are exported from the urllib.error module.
URLError¶
Often, URLError is raised because there is no network connection (no route to the specified server), or the specified server doesn't exist. In this case, the exception raised will have a 'reason' attribute that describes the failure; in Python 3 it is a message string or another exception instance (for example, an OSError carrying an error code and a text error message).
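A minimal sketch of catching URLError and reporting its reason follows; the function name and the returned message format are assumptions for illustration.

```python
import urllib.error
import urllib.request


def try_open(url):
    """Return the body on success, or a short message when urlopen fails."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return response.read()
    except urllib.error.URLError as e:
        # e.reason explains why the resource could not be opened.
        return "failed: {}".format(e.reason)
```

For a host that does not exist, the message would typically contain the underlying name-resolution error.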
HTTPError¶
Every HTTP response from the server contains a numeric "status code". Sometimes the status code indicates that the server is unable to fulfil the request. The default handlers will handle some of these responses for you (for example, if the response is a "redirection" that requests the client fetch the document from a different URL, urllib will handle that for you). For those it can't handle, urlopen will raise an HTTPError. Typical errors include '404' (page not found), '403' (request forbidden), and '401' (authentication required).
See section 10 of RFC 2616 for a reference on all the HTTP error codes.
The HTTPError instance raised will have an integer 'code' attribute, which corresponds to the error sent by the server.
Error Codes¶
Because the default handlers handle redirects (codes in the 300 range), and codes in the 100–299 range indicate success, you will usually only see error codes in the 400–599 range.
http.server.BaseHTTPRequestHandler.responses is a useful dictionary that shows all the response codes used by RFC 2616.
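As a small illustration of how that dictionary can be used, each status code maps to a short phrase and a longer explanation; the helper name here is hypothetical.

```python
from http.server import BaseHTTPRequestHandler


def describe_status(code):
    """Return the short phrase for an HTTP status code, if it is known."""
    # Each entry is a (short message, long message) pair.
    short, _long = BaseHTTPRequestHandler.responses.get(code, ("Unknown", ""))
    return short
```

This is convenient when turning a numeric `code` from an HTTPError into something readable for a log message.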
When an error is raised, the server responds by returning an HTTP error code and an error page. You can use the HTTPError instance as a response for the page returned. This means that as well as the code attribute, it also has read, geturl, and info methods, as returned by the urllib.response module:
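The example is missing here. As a sketch, the error page can be read straight from the exception; `describe_http_error` and `fetch_or_describe` are hypothetical helper names.

```python
import urllib.error
import urllib.request


def describe_http_error(error):
    """Collect the status code, final URL and error page from an HTTPError."""
    return {
        "code": error.code,
        "url": error.geturl(),
        "body": error.read(),  # the error page sent by the server
    }


def fetch_or_describe(url):
    """Return the body, or a description of the HTTP error page."""
    try:
        with urllib.request.urlopen(url) as response:
            return response.read()
    except urllib.error.HTTPError as e:
        return describe_http_error(e)
```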
Wrapping it Up¶
So if you want to be prepared for HTTPError or URLError there are two basic approaches. I prefer the second approach.
Number 1¶
The except HTTPError must come first, otherwise except URLError will also catch an HTTPError .
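The code for this approach is not preserved here. A sketch with two separate except clauses, using a hypothetical function name and message format, could look like this:

```python
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen


def fetch_v1(url):
    """Fetch url, handling HTTPError and URLError in separate clauses."""
    req = Request(url)
    try:
        response = urlopen(req)
    except HTTPError as e:
        # Must come first: HTTPError is a subclass of URLError.
        return "server error, code {}".format(e.code)
    except URLError as e:
        return "unreachable: {}".format(e.reason)
    else:
        # Everything is fine.
        with response:
            return response.read()
```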
Number 2¶
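The code for the second approach is also missing. One way to sketch it is to catch only URLError and inspect the exception afterwards. The function name and messages are hypothetical, and checking for `code` first is a deliberate choice here, because in Python 3 an HTTPError carries both `code` and `reason`.

```python
from urllib.error import URLError
from urllib.request import Request, urlopen


def fetch_v2(url):
    """Fetch url with a single except clause that inspects the error."""
    req = Request(url)
    try:
        response = urlopen(req)
    except URLError as e:
        if hasattr(e, "code"):
            # Only HTTPError instances have a status code.
            return "server error, code {}".format(e.code)
        return "unreachable: {}".format(e.reason)
    else:
        # Everything is fine.
        with response:
            return response.read()
```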
info and geturl¶
The response returned by urlopen (or the HTTPError instance) has two useful methods, info() and geturl(), and is defined in the module urllib.response.
geturl — this returns the real URL of the page fetched. This is useful because urlopen (or the opener object used) may have followed a redirect. The URL of the page fetched may not be the same as the URL requested.
info — this returns a dictionary-like object that describes the page fetched, particularly the headers sent by the server. It is currently an http.client.HTTPMessage instance.
Typical headers include 'Content-length', 'Content-type', and so on. See the Quick Reference to HTTP Headers for a useful listing of HTTP headers with brief explanations of their meaning and use.
Openers and Handlers¶
When you fetch a URL you use an opener (an instance of the perhaps confusingly-named urllib.request.OpenerDirector ). Normally we have been using the default opener — via urlopen — but you can create custom openers. Openers use handlers. All the “heavy lifting” is done by the handlers. Each handler knows how to open URLs for a particular URL scheme (http, ftp, etc.), or how to handle an aspect of URL opening, for example HTTP redirections or HTTP cookies.
You will want to create openers if you want to fetch URLs with specific handlers installed, for example to get an opener that handles cookies, or to get an opener that does not handle redirections.
To create an opener, instantiate an OpenerDirector , and then call .add_handler(some_handler_instance) repeatedly.
Alternatively, you can use build_opener , which is a convenience function for creating opener objects with a single function call. build_opener adds several handlers by default, but provides a quick way to add more and/or override the default handlers.
Other sorts of handlers you might want can handle proxies, authentication, and other common but slightly specialised situations.
install_opener can be used to make an opener object the (global) default opener. This means that calls to urlopen will use the opener you have installed.
Opener objects have an open method, which can be called directly to fetch urls in the same way as the urlopen function: there’s no need to call install_opener , except as a convenience.
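As a sketch of the opener workflow described above, the following builds an opener with one extra handler and uses its `open` method directly. The choice of `HTTPCookieProcessor` is just an example of a handler you might add.

```python
import urllib.request


def make_opener():
    """Build an opener with the default handlers plus cookie handling."""
    cookie_handler = urllib.request.HTTPCookieProcessor()
    # build_opener adds the usual default handlers automatically.
    return urllib.request.build_opener(cookie_handler)


opener = make_opener()
# opener.open(url) fetches a URL just like urlopen(url) would.
# Optionally make it the global default used by urlopen:
# urllib.request.install_opener(opener)
```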
Basic Authentication¶
To illustrate creating and installing a handler we will use the HTTPBasicAuthHandler . For a more detailed discussion of this subject – including an explanation of how Basic Authentication works — see the Basic Authentication Tutorial.
When authentication is required, the server sends a header (as well as the 401 error code) requesting authentication. This specifies the authentication scheme and a 'realm'. The header looks like: WWW-Authenticate: SCHEME realm="REALM" .
The client should then retry the request with the appropriate name and password for the realm included as a header in the request. This is 'basic authentication'. In order to simplify this process we can create an instance of HTTPBasicAuthHandler and an opener to use this handler.
The HTTPBasicAuthHandler uses an object called a password manager to handle the mapping of URLs and realms to passwords and usernames. If you know what the realm is (from the authentication header sent by the server), then you can use a HTTPPasswordMgr . Frequently one doesn’t care what the realm is. In that case, it is convenient to use HTTPPasswordMgrWithDefaultRealm . This allows you to specify a default username and password for a URL. This will be supplied in the absence of you providing an alternative combination for a specific realm. We indicate this by providing None as the realm argument to the add_password method.
The top-level URL is the first URL that requires authentication. URLs “deeper” than the URL you pass to .add_password() will also match.
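The example this section refers to is missing in this copy. A sketch of the setup, with an invented URL and credentials, might look like this:

```python
import urllib.request


def make_auth_opener(top_level_url, username, password):
    """Return an opener (and its password manager) for basic authentication."""
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    # None as the realm makes these credentials the default for any realm.
    password_mgr.add_password(None, top_level_url, username, password)
    handler = urllib.request.HTTPBasicAuthHandler(password_mgr)
    return urllib.request.build_opener(handler), password_mgr


# Hypothetical URL and credentials, for illustration only.
auth_opener, password_mgr = make_auth_opener(
    "http://example.com/foo/", "user", "secret"
)
# auth_opener.open("http://example.com/foo/bar.html") would now retry
# with these credentials if the server answers 401.
```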
In the above example we only supplied our HTTPBasicAuthHandler to build_opener . By default openers have the handlers for normal situations – ProxyHandler (if a proxy setting such as an http_proxy environment variable is set), UnknownHandler , HTTPHandler , HTTPDefaultErrorHandler , HTTPRedirectHandler , FTPHandler , FileHandler , DataHandler , HTTPErrorProcessor .
top_level_url is in fact either a full URL (including the 'http:' scheme component and the hostname and optionally the port number), e.g. "http://example.com/", or an "authority" (i.e. the hostname, optionally including the port number), e.g. "example.com" or "example.com:8080" (the latter example includes a port number). The authority, if present, must NOT contain the "userinfo" component; for example, "joe:password@example.com" is not correct.
Proxies¶
urllib will auto-detect your proxy settings and use those. This is through the ProxyHandler, which is part of the normal handler chain when a proxy setting is detected. Normally that's a good thing, but there are occasions when it may not be helpful [5]. One way to turn it off is to set up our own ProxyHandler, with no proxies defined. This is done using similar steps to setting up a Basic Authentication handler:
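The steps are not shown in this copy. As a sketch, an empty proxies dictionary is passed to `ProxyHandler`; the variable names are illustrative.

```python
import urllib.request

# An empty dict means "use no proxies", overriding auto-detection.
proxy_support = urllib.request.ProxyHandler({})
direct_opener = urllib.request.build_opener(proxy_support)

# Use direct_opener.open(url) for individual requests, or install it
# globally so that urlopen bypasses the proxy as well:
# urllib.request.install_opener(direct_opener)
```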
Currently urllib.request does not support fetching of https locations through a proxy. However, this can be enabled by extending urllib.request as shown in the recipe [6].
HTTP_PROXY will be ignored if a variable REQUEST_METHOD is set; see the documentation on getproxies() .
Sockets and Layers¶
The Python support for fetching resources from the web is layered. urllib uses the http.client library, which in turn uses the socket library.
As of Python 2.3 you can specify how long a socket should wait for a response before timing out. This can be useful in applications which have to fetch web pages. By default the socket module has no timeout and can hang. In current Python 3 versions, urlopen accepts a timeout argument for individual calls; you can also set the default timeout globally for all sockets using socket.setdefaulttimeout().
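A sketch of the global approach follows. The helper name is hypothetical, and restoring the previous value afterwards is an extra precaution added here so that other code in the process is not affected.

```python
import socket
import urllib.request


def fetch_with_global_timeout(url, seconds=10.0):
    """Fetch url while a process-wide default socket timeout is in effect."""
    previous = socket.getdefaulttimeout()
    socket.setdefaulttimeout(seconds)  # applies to sockets created from now on
    try:
        with urllib.request.urlopen(url) as response:
            return response.read()
    finally:
        # Put the old default back so unrelated code keeps its behaviour.
        socket.setdefaulttimeout(previous)
```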
This document was reviewed and revised by John Lee.
1. Fixed None
2. timeout=0.001: without this parameter the error is the same
Added 3 minutes later
Error log:
C:\Users\230\AppData\Local\Programs\Python\Python36-32\python.exe C:/Work/phyton/main.py
* Running on http://0.0.0.0:5000/ (Press CTRL+C to quit)
[2018-04-02 21:02:32,454] ERROR in app: Exception on /post [GET]
Traceback (most recent call last):
  File "C:\Users\230\AppData\Local\Programs\Python\Python36-32\lib\site-packages\urllib3\connectionpool.py", line 387, in _make_request
    six.raise_from(e, None)
  File "<string>", line 2, in raise_from
  File "C:\Users\230\AppData\Local\Programs\Python\Python36-32\lib\site-packages\urllib3\connectionpool.py", line 383, in _make_request
    httplib_response = conn.getresponse()
  File "C:\Users\230\AppData\Local\Programs\Python\Python36-32\lib\http\client.py", line 1331, in getresponse
    response.begin()
  File "C:\Users\230\AppData\Local\Programs\Python\Python36-32\lib\http\client.py", line 297, in begin
    version, status, reason = self._read_status()
  File "C:\Users\230\AppData\Local\Programs\Python\Python36-32\lib\http\client.py", line 258, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "C:\Users\230\AppData\Local\Programs\Python\Python36-32\lib\socket.py", line 586, in readinto
    return self._sock.recv_into(b)
socket.timeout: timed out
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "C:\Users\230\AppData\Local\Programs\Python\Python36-32\lib\site-packages\requests\adapters.py", line 440, in send
    timeout=timeout
  File "C:\Users\230\AppData\Local\Programs\Python\Python36-32\lib\site-packages\urllib3\connectionpool.py", line 639, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "C:\Users\230\AppData\Local\Programs\Python\Python36-32\lib\site-packages\urllib3\util\retry.py", line 357, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "C:\Users\230\AppData\Local\Programs\Python\Python36-32\lib\site-packages\urllib3\packages\six.py", line 686, in reraise
    raise value
  File "C:\Users\230\AppData\Local\Programs\Python\Python36-32\lib\site-packages\urllib3\connectionpool.py", line 601, in urlopen
    chunked=chunked)
  File "C:\Users\230\AppData\Local\Programs\Python\Python36-32\lib\site-packages\urllib3\connectionpool.py", line 389, in _make_request
    self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
  File "C:\Users\230\AppData\Local\Programs\Python\Python36-32\lib\site-packages\urllib3\connectionpool.py", line 309, in _raise_timeout
    raise ReadTimeoutError(self, url, "Read timed out. (read timeout=%s)" % timeout_value)
urllib3.exceptions.ReadTimeoutError: HTTPConnectionPool(host='localhost', port=5000): Read timed out. (read timeout=1)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "C:\Users\230\AppData\Local\Programs\Python\Python36-32\lib\site-packages\flask\app.py", line 1982, in wsgi_app
    response = self.full_dispatch_request()
  File "C:\Users\230\AppData\Local\Programs\Python\Python36-32\lib\site-packages\flask\app.py", line 1614, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "C:\Users\230\AppData\Local\Programs\Python\Python36-32\lib\site-packages\flask\app.py", line 1517, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "C:\Users\230\AppData\Local\Programs\Python\Python36-32\lib\site-packages\flask\_compat.py", line 33, in reraise
    raise value
  File "C:\Users\230\AppData\Local\Programs\Python\Python36-32\lib\site-packages\flask\app.py", line 1612, in full_dispatch_request
    rv = self.dispatch_request()
  File "C:\Users\230\AppData\Local\Programs\Python\Python36-32\lib\site-packages\flask\app.py", line 1598, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "C:/Work/phyton/main.py", line 148, in post_to_node
    response = requests.post(base_url, data=payload, timeout=1)
  File "C:\Users\230\AppData\Local\Programs\Python\Python36-32\lib\site-packages\requests\api.py", line 112, in post
    return request('post', url, data=data, json=json, **kwargs)
  File "C:\Users\230\AppData\Local\Programs\Python\Python36-32\lib\site-packages\requests\api.py", line 58, in request
    return session.request(method=method, url=url, **kwargs)
  File "C:\Users\230\AppData\Local\Programs\Python\Python36-32\lib\site-packages\requests\sessions.py", line 508, in request
    resp = self.send(prep, **send_kwargs)
  File "C:\Users\230\AppData\Local\Programs\Python\Python36-32\lib\site-packages\requests\sessions.py", line 618, in send
    r = adapter.send(request, **kwargs)
  File "C:\Users\230\AppData\Local\Programs\Python\Python36-32\lib\site-packages\requests\adapters.py", line 521, in send
    raise ReadTimeout(e, request=request)
requests.exceptions.ReadTimeout: HTTPConnectionPool(host='localhost', port=5000): Read timed out. (read timeout=1)
127.0.0.1 - - [02/Apr/2018 21:02:32] "GET /post HTTP/1.1" 500 -
127.0.0.1 - - [02/Apr/2018 21:02:32] "POST /node HTTP/1.1" 200 -
Added 2 minutes later
If the timeout is removed, the page keeps trying to load forever with a blank screen.
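In the log above, the 500 comes from an unhandled requests.exceptions.ReadTimeout raised inside the Flask view. The log also suggests, though this is only a guess, that the view posts to the same development server that is serving it, which could explain why the call never completes in time. Whatever the cause, the timeout can be caught so that the view returns a controlled answer. The sketch below is an assumption-laden illustration: the function signature, the 504 status, and the injectable `post` argument (there only to make the function testable without a network) are not from the original code.

```python
import requests


def post_to_node(url, payload, timeout=1.0, post=requests.post):
    """Return (status, body); map a timeout to 504 instead of crashing."""
    try:
        response = post(url, data=payload, timeout=timeout)
    except requests.exceptions.Timeout:
        # Covers both connect and read timeouts.
        return 504, "upstream request timed out"
    return response.status_code, response.text
```

Inside a Flask view, the returned pair could be turned into a response with the body first and the status code second.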
This document covers some of Requests' more advanced features.
Session Objects¶
The Session object allows you to persist certain parameters across
requests. It also persists cookies across all requests made from the
Session instance, and will use urllib3's connection pooling. So if
you're making several requests to the same host, the underlying TCP
connection will be reused, which can result in a significant performance
increase (see HTTP persistent connection).
A Session object has all the methods of the main Requests API.
Let’s persist some cookies across requests:
import requests

s = requests.Session()

s.get('https://httpbin.org/cookies/set/sessioncookie/123456789')
r = s.get('https://httpbin.org/cookies')

print(r.text)
# '{"cookies": {"sessioncookie": "123456789"}}'
Sessions can also be used to provide default data to the request methods. This
is done by providing data to the properties on a Session object:
import requests

s = requests.Session()
s.auth = ('user', 'pass')
s.headers.update({'x-test': 'true'})

# both 'x-test' and 'x-test2' are sent
s.get('https://httpbin.org/headers', headers={'x-test2': 'true'})
Any dictionaries that you pass to a request method will be merged with the
session-level values that are set. The method-level parameters override session
parameters.
Note, however, that method-level parameters will not be persisted across
requests, even if using a session. This example will only send the cookies
with the first request, but not the second:
import requests

s = requests.Session()

r = s.get('https://httpbin.org/cookies', cookies={'from-my': 'browser'})
print(r.text)
# '{"cookies": {"from-my": "browser"}}'

r = s.get('https://httpbin.org/cookies')
print(r.text)
# '{"cookies": {}}'
If you want to manually add cookies to your session, use the
Cookie utility functions to manipulate Session.cookies.
Sessions can also be used as context managers:
import requests

with requests.Session() as s:
    s.get('https://httpbin.org/cookies/set/sessioncookie/123456789')
This will make sure the session is closed as soon as the with block is
exited, even if unhandled exceptions occurred.
Remove a Value From a Dict Parameter
Sometimes you'll want to omit session-level keys from a dict parameter. To
do this, you simply set that key's value to None in the method-level
parameter. It will automatically be omitted.
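To make this concrete, here is a small sketch that prepares a request without sending it. The header names and the helper function are invented for illustration.

```python
import requests


def prepare_without_header(url):
    """Prepare a GET that drops one session-level header for this call only."""
    session = requests.Session()
    session.headers.update({"x-test": "true", "x-keep": "yes"})
    # None at the method level omits the session-level key for this request.
    request = requests.Request("GET", url, headers={"x-test": None})
    return session.prepare_request(request), session


prepared, session = prepare_without_header("https://httpbin.org/headers")
# prepared.headers has no 'x-test'; session.headers still contains it.
```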
All values that are contained within a session are directly available to you.
See the Session API Docs to learn more.
Request and Response Objects¶
Whenever a call is made to requests.get() and friends, you are doing two
major things. First, you are constructing a Request object which will be
sent off to a server to request or query some resource. Second, a Response
object is generated once Requests gets a response back from the server.
The Response object contains all of the information returned by the server and
also contains the Request object you created originally. Here is a simple
request to get some very important information from Wikipedia's servers:
>>> r = requests.get('https://en.wikipedia.org/wiki/Monty_Python')
If we want to access the headers the server sent back to us, we do this:
>>> r.headers
{'content-length': '56170', 'x-content-type-options': 'nosniff',
 'x-cache': 'HIT from cp1006.eqiad.wmnet, MISS from cp1010.eqiad.wmnet',
 'content-encoding': 'gzip', 'age': '3080', 'content-language': 'en',
 'vary': 'Accept-Encoding,Cookie', 'server': 'Apache',
 'last-modified': 'Wed, 13 Jun 2012 01:33:50 GMT', 'connection': 'close',
 'cache-control': 'private, s-maxage=0, max-age=0, must-revalidate',
 'date': 'Thu, 14 Jun 2012 12:59:39 GMT',
 'content-type': 'text/html; charset=UTF-8',
 'x-cache-lookup': 'HIT from cp1006.eqiad.wmnet:3128, MISS from cp1010.eqiad.wmnet:80'}
However, if we want to get the headers we sent the server, we simply access the
request, and then the request’s headers:
>>> r.request.headers
{'Accept-Encoding': 'identity, deflate, compress, gzip',
 'Accept': '*/*', 'User-Agent': 'python-requests/1.2.0'}
Prepared Requests¶
Whenever you receive a Response object from an API call or a Session call,
the request attribute is actually the PreparedRequest that was used. In some
cases you may wish to do some extra work to the body or headers (or anything
else really) before sending a request. The simple recipe for this is the
following:
from requests import Request, Session s = Session() req = Request('POST', url, data=data, headers=headers) prepped = req.prepare() # do something with prepped.body prepped.body = 'No, I want exactly this as the body.' # do something with prepped.headers del prepped.headers['Content-Type'] resp = s.send(prepped, stream=stream, verify=verify, proxies=proxies, cert=cert, timeout=timeout ) print(resp.status_code)
Since you are not doing anything special with the Request object, you
prepare it immediately and modify the PreparedRequest object. You then
send that with the other parameters you would have sent to requests.* or
Session.*.
However, the above code will lose some of the advantages of having a Requests
Session object. In particular, Session-level state such as cookies will
not get applied to your request. To get a PreparedRequest with that state
applied, replace the call to Request.prepare() with a call to
Session.prepare_request(), like this:
from requests import Request, Session

s = Session()
req = Request('GET', url, data=data, headers=headers)

prepped = s.prepare_request(req)

# do something with prepped.body
prepped.body = 'Seriously, send exactly these bytes.'

# do something with prepped.headers
prepped.headers['Keep-Dead'] = 'parrot'

resp = s.send(prepped,
    stream=stream,
    verify=verify,
    proxies=proxies,
    cert=cert,
    timeout=timeout
)

print(resp.status_code)
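The difference between the two preparation paths can be checked without sending anything. The following is a minimal sketch: the URL, the header names, and the cookie value are made up for illustration, and the block only compares what Request.prepare() and Session.prepare_request() produce for the same Request.

```python
from requests import Request, Session

# Hypothetical session state: one default header and one cookie.
s = Session()
s.headers["X-Session-Header"] = "from-session"
s.cookies.set("sid", "abc123")

# Hypothetical URL; no network traffic happens in this sketch.
req = Request(
    "GET",
    "https://example.org/resource",
    headers={"X-Request-Header": "from-request"},
)

bare = req.prepare()             # knows nothing about the session
merged = s.prepare_request(req)  # session headers and cookies are applied

print("bare has session header:  ", "X-Session-Header" in bare.headers)
print("merged has session header:", "X-Session-Header" in merged.headers)
print("merged Cookie header:     ", merged.headers.get("Cookie"))
```

Inspecting the prepared object like this is also a convenient way to debug what would go on the wire before calling Session.send().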
When you are using the prepared request flow, keep in mind that it does not take into account the environment.
This can cause problems if you are using environment variables to change the behaviour of requests.
For example: Self-signed SSL certificates specified in REQUESTS_CA_BUNDLE will not be taken into account.
As a result an SSL: CERTIFICATE_VERIFY_FAILED is thrown.
You can get around this behaviour by explicitly merging the environment settings into your session:
from requests import Request, Session

s = Session()
req = Request('GET', url)

prepped = s.prepare_request(req)

# Merge environment settings into session
settings = s.merge_environment_settings(prepped.url, {}, None, None, None)
resp = s.send(prepped, **settings)

print(resp.status_code)
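To see what merge_environment_settings() actually returns, the sketch below sets REQUESTS_CA_BUNDLE inside the process and prints the merged settings. The bundle path and the URL are hypothetical, nothing is sent, and changing os.environ in code is done here only to keep the example self-contained.

```python
import os

from requests import Request, Session

# Hypothetical CA bundle path, set here only for demonstration.
os.environ["REQUESTS_CA_BUNDLE"] = "/hypothetical/ca-bundle.pem"

s = Session()
prepped = s.prepare_request(Request("GET", "https://example.org/"))

# Arguments are (url, proxies, stream, verify, cert); None means "not set
# on this call", so environment and session defaults fill the gaps.
settings = s.merge_environment_settings(prepped.url, {}, None, None, None)

print(sorted(settings))    # names of the keyword arguments for Session.send()
print(settings["verify"])  # the path taken from the environment variable
```

Passing the resulting dictionary as **settings to Session.send() is what makes the prepared request flow honour the environment.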
SSL Cert Verification¶
Requests verifies SSL certificates for HTTPS requests, just like a web browser.
By default, SSL verification is enabled, and Requests will throw an SSLError if
it’s unable to verify the certificate:
>>> requests.get('https://requestb.in')
requests.exceptions.SSLError: hostname 'requestb.in' doesn't match either of '*.herokuapp.com', 'herokuapp.com'
I don’t have SSL set up on this domain, so it throws an exception. Excellent. GitHub does though:
>>> requests.get('https://github.com')
<Response [200]>
You can pass verify the path to a CA_BUNDLE file or directory with certificates of trusted CAs:
>>> requests.get('https://github.com', verify='/path/to/certfile')
or persistent:
s = requests.Session()
s.verify = '/path/to/certfile'
Note
If verify is set to a path to a directory, the directory must have been
processed using the c_rehash utility supplied with OpenSSL.
This list of trusted CAs can also be specified through the REQUESTS_CA_BUNDLE
environment variable.
Requests can also ignore verifying the SSL certificate if you set verify to False:
>>> requests.get('https://kennethreitz.org', verify=False)
<Response [200]>
By default, verify is set to True. The verify option only applies to host certificates.
Client Side Certificates¶
You can also specify a local cert to use as client side certificate, as a single
file (containing the private key and the certificate) or as a tuple of both
files’ paths:
>>> requests.get('https://kennethreitz.org', cert=('/path/client.cert', '/path/client.key'))
<Response [200]>
or persistent:
s = requests.Session()
s.cert = '/path/client.cert'
If you specify a wrong path or an invalid cert, you’ll get an SSLError:
>>> requests.get('https://kennethreitz.org', cert='/wrong_path/client.pem')
SSLError: [Errno 336265225] _ssl.c:347: error:140B0009:SSL routines:SSL_CTX_use_PrivateKey_file:PEM lib
Warning
The private key to your local certificate must be unencrypted.
Currently, Requests does not support using encrypted keys.
CA Certificates¶
Requests uses certificates from the package certifi. This allows for users
to update their trusted certificates without changing the version of Requests.
Before version 2.16, Requests bundled a set of root CAs that it trusted,
sourced from the Mozilla trust store. The certificates were only updated
once for each Requests version. When certifi was not installed, this led to
extremely out-of-date certificate bundles when using significantly older
versions of Requests.
For the sake of security we recommend upgrading certifi frequently!
Body Content Workflow¶
By default, when you make a request, the body of the response is downloaded
immediately. You can override this behaviour and defer downloading the response
body until you access the Response.content attribute with the stream parameter:
tarball_url = 'https://github.com/requests/requests/tarball/master'
r = requests.get(tarball_url, stream=True)
At this point only the response headers have been downloaded and the connection
remains open, hence allowing us to make content retrieval conditional:
if int(r.headers['content-length']) < TOO_LONG:
    content = r.content
    ...
You can further control the workflow by use of the Response.iter_content()
and Response.iter_lines() methods.
Alternatively, you can read the undecoded body from the underlying
urllib3.HTTPResponse at Response.raw.
If you set stream to True when making a request, Requests cannot
release the connection back to the pool unless you consume all the data or call
Response.close. This can lead to inefficiency with connections. If you find
yourself partially reading response bodies (or not reading them at all) while
using stream=True, you should make the request within a with statement to
ensure it’s always closed:
with requests.get('https://httpbin.org/get', stream=True) as r:
    # Do things with the response here.
    pass
Keep-Alive¶
Excellent news — thanks to urllib3, keep-alive is 100% automatic within a session!
Any requests that you make within a session will automatically reuse the appropriate
connection!
Note that connections are only released back to the pool for reuse once all body
data has been read; be sure to either set stream to False or read the
content property of the Response object.
Streaming Uploads¶
Requests supports streaming uploads, which allow you to send large streams or
files without reading them into memory. To stream and upload, simply provide a
file-like object for your body:
with open('massive-body', 'rb') as f:
    requests.post('http://some.url/streamed', data=f)
Warning
It is strongly recommended that you open files in binary mode. This is because
Requests may attempt to provide the Content-Length header for you, and if it
does this value will be set to the number of bytes in the file. Errors may occur
if you open the file in text mode.
Chunk-Encoded Requests¶
Requests also supports Chunked transfer encoding for outgoing and incoming requests.
To send a chunk-encoded request, simply provide a generator (or any iterator without
a length) for your body:
def gen():
    # yield bytes so the chunks can be written to the socket as-is
    yield b'hi'
    yield b'there'

requests.post('http://some.url/chunked', data=gen())
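How Requests chooses between a streamed upload with a known length and a chunk-encoded upload can be observed on the prepared request, without contacting a server. This is a small sketch: the URL is a placeholder, and an in-memory io.BytesIO stands in for a real file opened in binary mode.

```python
import io

from requests import Request

url = "http://some.url/upload"  # placeholder; nothing is sent in this sketch


def gen():
    # An iterator with no length: Requests cannot know the total size.
    yield b"hi"
    yield b"there"


# File-like object: the size can be measured, so Content-Length is set.
streamed = Request("POST", url, data=io.BytesIO(b"x" * 1024)).prepare()

# Generator: no size available, so the body is sent chunk-encoded.
chunked = Request("POST", url, data=gen()).prepare()

print("file-like ->", streamed.headers.get("Content-Length"))
print("generator ->", chunked.headers.get("Transfer-Encoding"))
```

In both cases the body object is kept as-is on the prepared request and is only read when the request is actually sent.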
For chunked encoded responses, it’s best to iterate over the data using
Response.iter_content(). In an ideal situation you’ll have set stream=True on
the request, in which case you can iterate chunk-by-chunk by calling
iter_content with a chunk_size parameter of None. If you want to set a maximum
size of the chunk, you can set a chunk_size parameter to any integer.
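The effect of chunk_size can be tried offline by building a Response by hand and giving it an in-memory stream. This is only a sketch for experimentation: real responses are constructed by Requests itself, and the io.BytesIO object here is an assumed stand-in for the network stream normally found at Response.raw.

```python
import io

from requests.models import Response

# Hand-built response; in real code this object comes from requests.get().
resp = Response()
resp.status_code = 200
resp.raw = io.BytesIO(b"0123456789")  # stand-in for the network stream

# Read the body in pieces of at most four bytes.
chunks = list(resp.iter_content(chunk_size=4))
print(chunks)  # [b'0123', b'4567', b'89']
```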
POST Multiple Multipart-Encoded Files¶
You can send multiple files in one request. For example, suppose you want to
upload image files to an HTML form with a multiple file field ‘images’:
<input type="file" name="images" multiple="true" required="true"/>
To do that, just set files to a list of tuples of (form_field_name, file_info):
>>> url = 'https://httpbin.org/post'
>>> multiple_files = [
...     ('images', ('foo.png', open('foo.png', 'rb'), 'image/png')),
...     ('images', ('bar.png', open('bar.png', 'rb'), 'image/png'))]
>>> r = requests.post(url, files=multiple_files)
>>> r.text
{
  ...
  'files': {'images': ' ....'}
  'Content-Type': 'multipart/form-data; boundary=3131623adb2043caaeb5538cc7aa0b3a',
  ...
}
Warning
It is strongly recommended that you open files in binary mode. This is because
Requests may attempt to provide the Content-Length header for you, and if it
does this value will be set to the number of bytes in the file. Errors may occur
if you open the file in text mode.
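When a multipart upload is rejected by a server, it often helps to look at the encoded request before sending it. The sketch below prepares the same kind of request offline; the file contents are fake bytes held in io.BytesIO objects rather than real image files.

```python
import io

from requests import Request

# In-memory stand-ins for open('foo.png', 'rb') and open('bar.png', 'rb').
multiple_files = [
    ("images", ("foo.png", io.BytesIO(b"fake png bytes 1"), "image/png")),
    ("images", ("bar.png", io.BytesIO(b"fake png bytes 2"), "image/png")),
]

prepped = Request("POST", "https://httpbin.org/post", files=multiple_files).prepare()

content_type = prepped.headers["Content-Type"]
part_count = prepped.body.count(b'name="images"')

print(content_type.split(";")[0])         # multipart/form-data
print(part_count)                         # one part per tuple in the list
print(prepped.headers["Content-Length"])  # length of the encoded body
```

Note that Requests generates the boundary and the Content-Type header itself; setting Content-Type by hand on such a request would leave the server without the boundary it needs to parse the body.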
Event Hooks¶
Requests has a hook system that you can use to manipulate portions of
the request process, or signal event handling.
Available hooks:
response: The response generated from a Request.
You can assign a hook function on a per-request basis by passing a
{hook_name: callback_function} dictionary to the hooks request parameter:
hooks={'response': print_url}
That callback_function will receive the Response object as its first
argument.
def print_url(r, *args, **kwargs):
    print(r.url)
If an error occurs while executing your callback, a warning is given.
If the callback function returns a value, it is assumed that it is to
replace the data that was passed in. If the function doesn’t return
anything, nothing else is affected.
def record_hook(r, *args, **kwargs):
    r.hook_called = True
    return r
Let’s print the URL of a response at runtime:
>>> requests.get('https://httpbin.org/', hooks={'response': print_url})
https://httpbin.org/
<Response [200]>
You can add multiple hooks to a single request. Let’s call two hooks at once:
>>> r = requests.get('https://httpbin.org/', hooks={'response': [print_url, record_hook]})
>>> r.hook_called
True
You can also add hooks to a Session instance. Any hooks you add will then
be called on every request made to the session. For example:
>>> s = requests.Session()
>>> s.hooks['response'].append(print_url)
>>> s.get('https://httpbin.org/')
https://httpbin.org/
<Response [200]>
A Session can have multiple hooks, which will be called in the order
they are added.
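The return-value rule for hooks can be tried without a server. The sketch below calls dispatch_hook from requests.hooks directly, which is the helper Requests normally invokes for you after a response arrives; the hand-built Response and its URL are assumptions made purely for the demonstration.

```python
from requests.hooks import dispatch_hook
from requests.models import Response

seen = []


def log_url(r, *args, **kwargs):
    # Returns None, so the response is passed on unchanged.
    seen.append(r.url)


def record_hook(r, *args, **kwargs):
    # Returns a value, which replaces the data handed to later hooks.
    r.hook_called = True
    return r


resp = Response()
resp.url = "https://example.org/"  # hypothetical URL

# Hooks run in list order, mirroring hooks={'response': [...]}.
result = dispatch_hook("response", {"response": [log_url, record_hook]}, resp)

print(seen, result.hook_called, result is resp)
```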
Custom Authentication¶
Requests allows you to specify your own authentication mechanism.
Any callable which is passed as the auth argument to a request method will
have the opportunity to modify the request before it is dispatched.
Authentication implementations are subclasses of AuthBase, and are easy to
define. Requests provides two common authentication scheme implementations in
requests.auth: HTTPBasicAuth and HTTPDigestAuth.
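What an auth object does to a request can be inspected on the prepared request, before anything is sent. The following is a rough sketch: HeaderTokenAuth, the X-Example-Token header, the credentials, and the URL are all hypothetical, while HTTPBasicAuth and AuthBase are the classes named above.

```python
from requests import Request
from requests.auth import AuthBase, HTTPBasicAuth


class HeaderTokenAuth(AuthBase):
    """Hypothetical scheme: attach a token in a custom header."""

    def __init__(self, token):
        self.token = token

    def __call__(self, r):
        # Modify the outgoing request and hand it back.
        r.headers["X-Example-Token"] = self.token
        return r


url = "https://example.org/admin"  # hypothetical; nothing is sent

basic = Request("GET", url, auth=HTTPBasicAuth("user", "pass")).prepare()
custom = Request("GET", url, auth=HeaderTokenAuth("s3cret")).prepare()

print(basic.headers["Authorization"])     # Basic dXNlcjpwYXNz
print(custom.headers["X-Example-Token"])  # s3cret
```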
Let’s pretend that we have a web service that will only respond if the
X-Pizza header is set to a password value. Unlikely, but just go with it.
from requests.auth import AuthBase

class PizzaAuth(AuthBase):
    """Attaches HTTP Pizza Authentication to the given Request object."""
    def __init__(self, username):
        # setup any auth-related data here
        self.username = username

    def __call__(self, r):
        # modify and return the request
        r.headers['X-Pizza'] = self.username
        return r
Then, we can make a request using our Pizza Auth:
>>> requests.get('http://pizzabin.org/admin', auth=PizzaAuth('kenneth'))
<Response [200]>
Streaming Requests¶
With Response.iter_lines() you can easily iterate over streaming APIs such as
the Twitter Streaming API. Simply set stream to True and iterate over the
response with iter_lines:
import json
import requests

r = requests.get('https://httpbin.org/stream/20', stream=True)

for line in r.iter_lines():

    # filter out keep-alive new lines
    if line:
        decoded_line = line.decode('utf-8')
        print(json.loads(decoded_line))
When using decode_unicode=True with Response.iter_lines() or
Response.iter_content(), you’ll want to provide a fallback encoding in the
event the server doesn’t provide one:
r = requests.get('https://httpbin.org/stream/20', stream=True)

if r.encoding is None:
    r.encoding = 'utf-8'

for line in r.iter_lines(decode_unicode=True):
    if line:
        print(json.loads(line))
Warning
iter_lines is not reentrant safe.
Calling this method multiple times can cause some of the received data
to be lost. In case you need to call it from multiple places, use
the resulting iterator object instead:
lines = r.iter_lines()
# Save the first line for later or just skip it

first_line = next(lines)

for line in lines:
    print(line)
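The single-iterator pattern can be tried offline. In this sketch a hand-built Response reads from an in-memory stream, which is an assumption made for the demonstration; with a real streaming request the iterator is used the same way.

```python
import io

from requests.models import Response

# Hand-built response with an in-memory stand-in for the network stream.
resp = Response()
resp.status_code = 200
resp.raw = io.BytesIO(b"header\nrow-1\nrow-2\n")

lines = resp.iter_lines()  # create the iterator once
first_line = next(lines)   # take the first line separately
rest = list(lines)         # keep consuming from the same iterator

print(first_line)  # b'header'
print(rest)        # [b'row-1', b'row-2']
```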
Proxies¶
If you need to use a proxy, you can configure individual requests with the
proxies argument to any request method:
import requests

proxies = {
  'http': 'http://10.10.1.10:3128',
  'https': 'http://10.10.1.10:1080',
}

requests.get('http://example.org', proxies=proxies)
You can also configure proxies by setting the environment variables
HTTP_PROXY and HTTPS_PROXY.
$ export HTTP_PROXY="http://10.10.1.10:3128"
$ export HTTPS_PROXY="http://10.10.1.10:1080"
$ python
>>> import requests
>>> requests.get('http://example.org')
To use HTTP Basic Auth with your proxy, use the http://user:password@host/ syntax:
proxies = {'http': 'http://user:pass@10.10.1.10:3128/'}
To give a proxy for a specific scheme and host, use the
scheme://hostname form for the key. This will match for
any request to the given scheme and exact hostname.
proxies = {'http://10.20.1.128': 'http://10.10.1.10:5323'}
Note that proxy URLs must include the scheme.
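Which entry of a proxies dictionary applies to a given URL can be checked with the select_proxy helper in requests.utils, without making a request. This is a small sketch that reuses the addresses from the examples above; the status path is made up.

```python
from requests.utils import select_proxy

proxies = {
    "http": "http://10.10.1.10:3128",
    "http://10.20.1.128": "http://10.10.1.10:5323",
}

# A scheme://hostname key wins over the plain scheme key.
specific = select_proxy("http://10.20.1.128/status", proxies)

# Other http:// hosts fall back to the scheme-wide entry.
generic = select_proxy("http://example.org/", proxies)

# No entry covers https://, so no proxy is selected.
unmatched = select_proxy("https://example.org/", proxies)

print(specific)   # http://10.10.1.10:5323
print(generic)    # http://10.10.1.10:3128
print(unmatched)  # None
```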
SOCKS¶
New in version 2.10.0.
In addition to basic HTTP proxies, Requests also supports proxies using the
SOCKS protocol. This is an optional feature that requires that additional
third-party libraries be installed before use.
You can get the dependencies for this feature from pip:
$ pip install requests[socks]
Once you’ve installed those dependencies, using a SOCKS proxy is just as easy
as using a HTTP one:
proxies = {
    'http': 'socks5://user:pass@host:port',
    'https': 'socks5://user:pass@host:port'
}
Using the scheme socks5 causes the DNS resolution to happen on the client, rather than on the proxy server. This is in line with curl, which uses the scheme to decide whether to do the DNS resolution on the client or proxy. If you want to resolve the domains on the proxy server, use socks5h as the scheme.
Compliance¶
Requests is intended to be compliant with all relevant specifications and
RFCs where that compliance will not cause difficulties for users. This
attention to the specification can lead to some behaviour that may seem
unusual to those not familiar with the relevant specification.
Encodings¶
When you receive a response, Requests makes a guess at the encoding to
use for decoding the response when you access the Response.text attribute.
Requests will first check for an encoding in the HTTP header, and if none is
present, will use a character-detection library (chardet, or charset_normalizer
in newer releases) to attempt to guess the encoding.
The only time Requests will not do this is if no explicit charset is present
in the HTTP headers and the Content-Type header contains text. In this
situation, RFC 2616 specifies that the default charset must be ISO-8859-1.
Requests follows the specification in this case. If you require a different
encoding, you can manually set the Response.encoding property, or use the raw
Response.content.
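The ISO-8859-1 fallback and the effect of overriding Response.encoding can be reproduced offline. The sketch below uses get_encoding_from_headers from requests.utils and a hand-built Response whose body is UTF-8 text; both the header values and the body are made up for illustration.

```python
import io

from requests.models import Response
from requests.utils import get_encoding_from_headers

# Explicit charset in the header: it is used as given.
explicit = get_encoding_from_headers({"content-type": "text/html; charset=utf-8"})

# A text type without a charset: the ISO-8859-1 default applies.
fallback = get_encoding_from_headers({"content-type": "text/html"})

# Hand-built response whose body is really UTF-8.
resp = Response()
resp.status_code = 200
resp.raw = io.BytesIO("café".encode("utf-8"))

resp.encoding = fallback
wrong = resp.text  # decoded with the fallback, so the accent is garbled

resp.encoding = "utf-8"
right = resp.text  # decoded correctly after the manual override

print(explicit, fallback)
print(repr(wrong), repr(right))
```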
HTTP Verbs¶
Requests provides access to almost the full range of HTTP verbs: GET, OPTIONS,
HEAD, POST, PUT, PATCH and DELETE. The following provides detailed examples of
using these various verbs in Requests, using the GitHub API.
We will begin with the verb most commonly used: GET. HTTP GET is an idempotent
method that returns a resource from a given URL. As a result, it is the verb
you ought to use when attempting to retrieve data from a web location. An
example usage would be attempting to get information about a specific commit
from GitHub. Suppose we wanted commit a050faf on Requests. We would get it
like so:
>>> import requests
>>> r = requests.get('https://api.github.com/repos/requests/requests/git/commits/a050faf084662f3a352dd1a941f2c7c9f886d4ad')
We should confirm that GitHub responded correctly. If it has, we want to work
out what type of content it is. Do this like so:
>>> if r.status_code == requests.codes.ok:
...     print(r.headers['content-type'])
...
application/json; charset=utf-8
So, GitHub returns JSON. That’s great, we can use the r.json
method to parse it into Python objects.
>>> commit_data = r.json()
>>> print(commit_data.keys())
[u'committer', u'author', u'url', u'tree', u'sha', u'parents', u'message']
>>> print(commit_data[u'committer'])
{u'date': u'2012-05-10T11:10:50-07:00', u'email': u'me@kennethreitz.com', u'name': u'Kenneth Reitz'}
>>> print(commit_data[u'message'])
makin' history
So far, so simple. Well, let’s investigate the GitHub API a little bit. Now,
we could look at the documentation, but we might have a little more fun if we
use Requests instead. We can take advantage of the Requests OPTIONS verb to
see what kinds of HTTP methods are supported on the URL we just used.
>>> verbs = requests.options(r.url)
>>> verbs.status_code
500
Uh, what? That’s unhelpful! Turns out GitHub, like many API providers, doesn’t
actually implement the OPTIONS method. This is an annoying oversight, but it’s
OK, we can just use the boring documentation. If GitHub had correctly
implemented OPTIONS, however, they should return the allowed methods in the
headers, e.g.
>>> verbs = requests.options('http://a-good-website.com/api/cats')
>>> print(verbs.headers['allow'])
GET,HEAD,POST,OPTIONS
Turning to the documentation, we see that the only other method allowed for
commits is POST, which creates a new commit. As we’re using the Requests repo,
we should probably avoid making ham-handed POSTS to it. Instead, let’s play
with the Issues feature of GitHub.
This documentation was added in response to Issue #482. Given that
this issue already exists, we will use it as an example. Let’s start by getting it.
>>> import json
>>> r = requests.get('https://api.github.com/repos/requests/requests/issues/482')
>>> r.status_code
200
>>> issue = json.loads(r.text)
>>> print(issue[u'title'])
Feature any http verb in docs
>>> print(issue[u'comments'])
3
Cool, we have three comments. Let’s take a look at the last of them.
>>> r = requests.get(r.url + u'/comments')
>>> r.status_code
200
>>> comments = r.json()
>>> print(comments[0].keys())
[u'body', u'url', u'created_at', u'updated_at', u'user', u'id']
>>> print(comments[2][u'body'])
Probably in the "advanced" section
Well, that seems like a silly place. Let’s post a comment telling the poster
that he’s silly. Who is the poster, anyway?
>>> print(comments[2][u'user'][u'login'])
kennethreitz
OK, so let’s tell this Kenneth guy that we think this example should go in the
quickstart guide instead. According to the GitHub API doc, the way to do this
is to POST to the thread. Let’s do it.
>>> body = json.dumps({u"body": u"Sounds great! I'll get right on it!"})
>>> url = u"https://api.github.com/repos/requests/requests/issues/482/comments"
>>> r = requests.post(url=url, data=body)
>>> r.status_code
404
Huh, that’s weird. We probably need to authenticate. That’ll be a pain, right?
Wrong. Requests makes it easy to use many forms of authentication, including
the very common Basic Auth.
>>> from requests.auth import HTTPBasicAuth
>>> auth = HTTPBasicAuth('fake@example.com', 'not_a_real_password')
>>> r = requests.post(url=url, data=body, auth=auth)
>>> r.status_code
201
>>> content = r.json()
>>> print(content[u'body'])
Sounds great! I'll get right on it!
Brilliant. Oh, wait, no! I meant to add that it would take me a while, because
I had to go feed my cat. If only I could edit this comment! Happily, GitHub
allows us to use another HTTP verb, PATCH, to edit this comment. Let’s do
that.
>>> print(content[u"id"])
5804413
>>> body = json.dumps({u"body": u"Sounds great! I'll get right on it once I feed my cat."})
>>> url = u"https://api.github.com/repos/requests/requests/issues/comments/5804413"
>>> r = requests.patch(url=url, data=body, auth=auth)
>>> r.status_code
200
Excellent. Now, just to torture this Kenneth guy, I’ve decided to let him
sweat and not tell him that I’m working on this. That means I want to delete
this comment. GitHub lets us delete comments using the incredibly aptly named
DELETE method. Let’s get rid of it.
>>> r = requests.delete(url=url, auth=auth)
>>> r.status_code
204
>>> r.headers['status']
'204 No Content'
Excellent. All gone. The last thing I want to know is how much of my ratelimit
I’ve used. Let’s find out. GitHub sends that information in the headers, so
rather than download the whole page I’ll send a HEAD request to get the
headers.
>>> r = requests.head(url=url, auth=auth)
>>> print(r.headers)
...
'x-ratelimit-remaining': '4995'
'x-ratelimit-limit': '5000'
...
Excellent. Time to write a Python program that abuses the GitHub API in all
kinds of exciting ways, 4995 more times.
Custom Verbs¶
From time to time you may be working with a server that, for whatever reason,
allows use or even requires use of HTTP verbs not covered above. One example of
this would be the MKCOL method some WEBDAV servers use. Do not fret, these can
still be used with Requests. These make use of the built-in .request
method. For example:
>>> r = requests.request('MKCOL', url, data=data)
>>> r.status_code
200 # Assuming your call was correct
Utilising this, you can make use of any method verb that your server allows.
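Whether a non-standard verb survives into the outgoing request can be confirmed on a prepared request. This is a tiny offline sketch; the WebDAV-style URL is hypothetical and nothing is sent.

```python
from requests import Request

# Hypothetical WebDAV collection URL; the request is only prepared.
prepped = Request("mkcol", "https://example.org/dav/new-folder/").prepare()

# The method is passed through as given, upper-cased by Requests.
print(prepped.method)  # MKCOL
```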
Transport Adapters¶
As of v1.0.0, Requests has moved to a modular internal design. Part of the
reason this was done was to implement Transport Adapters, originally
described here. Transport Adapters provide a mechanism to define interaction
methods for an HTTP service. In particular, they allow you to apply per-service
configuration.
Requests ships with a single Transport Adapter, the HTTPAdapter. This adapter
provides the default Requests interaction with HTTP and HTTPS using the
powerful urllib3 library. Whenever a Requests Session is initialized, one of
these is attached to the Session object for HTTP, and one for HTTPS.
Requests enables users to create and use their own Transport Adapters that
provide specific functionality. Once created, a Transport Adapter can be
mounted to a Session object, along with an indication of which web services
it should apply to.
>>> s = requests.Session()
>>> s.mount('https://github.com/', MyAdapter())
The mount call registers a specific instance of a Transport Adapter to a
prefix. Once mounted, any HTTP request made using that session whose URL starts
with the given prefix will use the given Transport Adapter.
Many of the details of implementing a Transport Adapter are beyond the scope of
this documentation, but take a look at the next example for a simple SSL
use-case. For more than that, you might look at subclassing the BaseAdapter.
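As a rough illustration of subclassing BaseAdapter, the sketch below mounts an adapter that answers every request locally with a canned body, which can be handy in tests. CannedAdapter, the service.invalid host, and the reply text are all hypothetical; a production adapter would have to do considerably more (errors, headers, streaming, connection handling).

```python
import io

import requests
from requests.adapters import BaseAdapter
from requests.models import Response


class CannedAdapter(BaseAdapter):
    """Hypothetical adapter that never touches the network."""

    def __init__(self, body):
        super().__init__()
        self.body = body

    def send(self, request, stream=False, timeout=None, verify=True,
             cert=None, proxies=None):
        # Build a minimal Response for the prepared request we were given.
        resp = Response()
        resp.status_code = 200
        resp.url = request.url
        resp.request = request
        resp.headers["Content-Type"] = "text/plain; charset=utf-8"
        resp.encoding = "utf-8"
        resp.raw = io.BytesIO(self.body)
        return resp

    def close(self):
        pass


s = requests.Session()
s.mount("https://service.invalid/", CannedAdapter(b"canned reply"))

# The URL starts with the mounted prefix, so CannedAdapter handles it.
r = s.get("https://service.invalid/status")
print(r.status_code, r.text)
```

Because the mounted prefix is longer than the default 'https://' prefix, the session picks this adapter for matching URLs and keeps using HTTPAdapter for everything else.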
Example: Specific SSL Version¶
The Requests team has made a specific choice to use whatever SSL version is
default in the underlying library (urllib3). Normally this is fine, but from
time to time, you might find yourself needing to connect to a service-endpoint
that uses a version that isn’t compatible with the default.
You can use Transport Adapters for this by taking most of the existing
implementation of HTTPAdapter, and adding a parameter ssl_version that gets
passed-through to urllib3. We’ll make a Transport Adapter that instructs the
library to use SSLv3:
import ssl
from urllib3.poolmanager import PoolManager

from requests.adapters import HTTPAdapter


class Ssl3HttpAdapter(HTTPAdapter):
    """"Transport adapter" that allows us to use SSLv3."""

    def init_poolmanager(self, connections, maxsize, block=False):
        self.poolmanager = PoolManager(
            num_pools=connections, maxsize=maxsize,
            block=block, ssl_version=ssl.PROTOCOL_SSLv3)
Blocking Or Non-Blocking?¶
With the default Transport Adapter in place, Requests does not provide any kind
of non-blocking IO. The Response.content property will block until the entire
response has been downloaded. If
you require more granularity, the streaming features of the library (see
Streaming Requests) allow you to retrieve smaller quantities of the
response at a time. However, these calls will still block.
If you are concerned about the use of blocking IO, there are lots of projects
out there that combine Requests with one of Python’s asynchronicity frameworks.
Some excellent examples are requests-threads, grequests, and requests-futures.
Timeouts¶
Most requests to external servers should have a timeout attached, in case the
server is not responding in a timely manner. By default, requests do not time
out unless a timeout value is set explicitly. Without a timeout, your code may
hang for minutes or more.
The connect timeout is the number of seconds Requests will wait for your
client to establish a connection to a remote machine (corresponding to the
connect() call on the socket). It’s a good practice to set connect timeouts
to slightly larger than a multiple of 3, which is the default TCP packet
retransmission window.
Once your client has connected to the server and sent the HTTP request, the
read timeout is the number of seconds the client will wait for the server
to send a response. (Specifically, it’s the number of seconds that the client
will wait between bytes sent from the server. In 99.9% of cases, this is the
time before the server sends the first byte).
If you specify a single value for the timeout, like this:
r = requests.get('https://github.com', timeout=5)
The timeout value will be applied to both the connect and the read
timeouts. Specify a tuple if you would like to set the values separately:
r = requests.get('https://github.com', timeout=(3.05, 27))
If the remote server is very slow, you can tell Requests to wait forever for
a response, by passing None as a timeout value and then retrieving a cup of
coffee.
r = requests.get('https://github.com', timeout=None)
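The two timeouts surface as different exceptions, ConnectTimeout and ReadTimeout, so they can be handled separately. The sketch below provokes a read timeout against a local socket that accepts connections at the TCP level but never answers; the loopback server and the short 0.5 second read timeout are assumptions chosen to keep the demonstration quick, and trust_env is switched off so proxy environment variables do not interfere.

```python
import socket

import requests

# A listening socket that never replies: connecting works, reading stalls.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
server.listen(1)
port = server.getsockname()[1]

session = requests.Session()
session.trust_env = False  # ignore proxy settings from the environment

try:
    # (connect timeout, read timeout) in seconds
    session.get(f"http://127.0.0.1:{port}/", timeout=(3.05, 0.5))
    outcome = "response received"
except requests.exceptions.ConnectTimeout:
    outcome = "connect timeout"
except requests.exceptions.ReadTimeout:
    outcome = "read timeout"
finally:
    session.close()
    server.close()

print(outcome)
```

Both exception classes derive from requests.exceptions.Timeout, so a single except clause on that base class is enough when the distinction does not matter.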
In this post we’ll review one of the most widely used Python modules for interacting with web-based services such as REST APIs, the Python requests module. If you were ever wondering what magic is going on behind the scenes when running one of the thousands of Ansible networking modules, or many of the Python-based SDKs that are available from vendors, there’s a good chance the underlying operations are being performed by requests. The Python requests module is a utility that emulates the operations of a web browser using code. It enables programs to interact with a web-based service across the network, while abstracting and handling the lower-level details of opening up a TCP connection to the remote system. Like a web browser, the requests module allows you to programmatically:
- Initiate HTTP requests such as GET, PUT, POST, PATCH, and DELETE
- Set HTTP headers to be used in the outgoing request
- Store and access the web server content in various forms (HTML, XML, JSON, etc.)
- Store and access cookies
- Utilize either HTTP or HTTPS
Retrieving Data
The most basic example of using requests is simply retrieving the contents of a web page using an HTTP GET:
import requests
response = requests.get('https://google.com')
The resulting response object will contain the actual HTML code that would be seen by a browser in its text attribute, which can be accessed by typing response.text.
>>> response.text
'<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en"><head><meta content="Search the world's information, including webpages, images, videos and more. Google has many special features to help you find exactly what you're looking for." name="description"><meta content="noodp" name="robots"><meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
-- output omitted for brevity --
There are a lot of great utilities for parsing HTML, but in most cases we will not be doing that when working with networking vendor APIs. In the majority of cases, the data will come back structured as XML or JSON.
response = requests.get('https://nautobot.demo.networktocode.com/api')
>>> response.content
b'{"circuits":"https://nautobot.demo.networktocode.com/api/circuits/","dcim":"https://nautobot.demo.networktocode.com/api/dcim/","extras":"https://nautobot.demo.networktocode.com/api/extras/","graphql":"https://nautobot.demo.networktocode.com/api/graphql/","ipam":"https://nautobot.demo.networktocode.com/api/ipam/","plugins":"https://nautobot.demo.networktocode.com/api/plugins/","status":"https://nautobot.demo.networktocode.com/api/status/","tenancy":"https://nautobot.demo.networktocode.com/api/tenancy/","users":"https://nautobot.demo.networktocode.com/api/users/","virtualization":"https://nautobot.demo.networktocode.com/api/virtualization/"}'
Notice that the above output is in bytes format. This is indicated by the lowercase “b” in front of the response text. We could convert this into a string using response.content.decode() and then use the Python json module to load it into a Python dictionary. However, because JSON is one of the most common data formats, the requests module has a convenience method that will automatically convert the response from bytes to a Python dictionary. Simply call response.json():
>>> response.json()
{'circuits': 'https://nautobot.demo.networktocode.com/api/circuits/', 'dcim': 'https://nautobot.demo.networktocode.com/api/dcim/', 'extras': 'https://nautobot.demo.networktocode.com/api/extras/', 'graphql': 'https://nautobot.demo.networktocode.com/api/graphql/', 'ipam': 'https://nautobot.demo.networktocode.com/api/ipam/', 'plugins': 'https://nautobot.demo.networktocode.com/api/plugins/', 'status': 'https://nautobot.demo.networktocode.com/api/status/', 'tenancy': 'https://nautobot.demo.networktocode.com/api/tenancy/', 'users': 'https://nautobot.demo.networktocode.com/api/users/', 'virtualization': 'https://nautobot.demo.networktocode.com/api/virtualization/'}
>>> type(response.json())
<class 'dict'>
In some cases, we will have to specify the desired data format by setting the Accept header. For example:
headers = {'Accept': 'application/json'}
response = requests.get('https://nautobot.demo.networktocode.com/api', headers=headers)
In this example, we are informing the API that we would like the data to come back formatted as JSON. If the API provides the content as XML, we would specify the header as {'Accept': 'application/xml'}. The appropriate content type to request should be spelled out in the vendor API documentation. Many APIs use a default, so you may not need to specify the header. Nautobot happens to use a default of application/json, so it isn’t necessary to set the header. If you do not set the Accept header, you can find out the type of returned content by examining the Content-Type header in the response:
>>> response.headers['Content-Type']
'application/json'
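The manual route (decode the bytes, then parse) and the response.json() shortcut can be compared offline. In this sketch the response is built by hand with a made-up JSON body and URL; in real code it would come from requests.get(), and the Accept header would travel with the prepared request shown here.

```python
import io
import json

from requests import Request
from requests.models import Response

# The Accept header ends up on the prepared request (nothing is sent here).
prepped = Request(
    "GET",
    "https://example.org/api",  # hypothetical URL
    headers={"Accept": "application/json"},
).prepare()

# Hand-built response with a made-up JSON body.
resp = Response()
resp.status_code = 200
resp.headers["Content-Type"] = "application/json"
resp.raw = io.BytesIO(b'{"status": "https://example.org/api/status/"}')

# Check the declared content type before parsing.
if resp.headers["Content-Type"].startswith("application/json"):
    manual = json.loads(resp.content.decode())  # bytes -> str -> dict
    parsed = resp.json()                        # same result in one step

print(prepped.headers["Accept"])
print(type(parsed), manual == parsed)
```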
Although we are using Nautobot for many of the examples of requests module usage, there is a very useful SDK called pynautobot that can handle a lot of the heavy lifting for you, so definitely check that out!
Authentication
Most APIs are protected by an authentication mechanism which can vary from product to product. The API documentation is your best resource in determining the method of authentication in use. We’ll review a few of the more common methods with examples below.
API Key
With API key authentication you typically must first access an administrative portal and generate an API key. Think of the API key the same way as you would your administrative userid/password. In some cases it will provide read/write administrative access to the entire system, so you want to protect it as such. This means don’t store it in the code or in a git repository where it can be seen in clear text. Commonly the API keys are stored as environment variables and imported at run time, or are imported from password vaults such as HashiCorp Vault or Ansible Vault.
Once an API key is generated, it will need to be included in some way with all requests. Next we’ll describe a few common methods for including the API key in requests and provide example code.
One method that is used across a wide variety of APIs is to include the API key as a token in the Authorization
header. A few examples of this are in the authentication methods for Nautobot and Cisco Webex. The two examples below are very similar, with the main difference being that Nautobot uses Token {token}
in the Authorization
header whereas Cisco Webex uses Bearer {token}
in the Authorization
header. Implementation of this is not standardized, so the API documentation should indicate what the format of the header should be.
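Because only the scheme word differs between these two APIs, the header can be built by one small function. This is a sketch with a hypothetical helper name, build_auth_header, and a placeholder token value; the correct scheme for any given API still has to come from its documentation.

```python
def build_auth_header(token, scheme="Bearer"):
    """Build an Authorization header such as 'Bearer <token>' or 'Token <token>'."""
    return {"Authorization": f"{scheme} {token}"}


# Placeholder token, for illustration only
nautobot_headers = build_auth_header("example-token", scheme="Token")
webex_headers = build_auth_header("example-token")
print(nautobot_headers)
print(webex_headers)
```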
Nautobot API
First, it is necessary to generate an API key from the Nautobot GUI. Sign into Nautobot and select your username in the upper right-hand corner, and then view your Profile. From the Profile view, select API Tokens and click the button to add a token. The token will then need to be specified in the Authorization
header in all requests as shown below.
import requests
import os
# Get the API token from an environment variable
token = os.environ.get('NAUTOBOT_TOKEN')
# Add the Authorization header
headers = {'Authorization': f'Token {token}'}
# This is the base URL for all Nautobot API calls
base_url = 'https://nautobot.demo.networktocode.com/api'
# Get the list of devices from Nautobot using the requests module and passing in the authorization header defined above
response = requests.get('https://nautobot.demo.networktocode.com/api/dcim/devices/', headers=headers)
>>> response.json()
{'count': 511, 'next': 'https://nautobot.demo.networktocode.com/api/dcim/devices/?limit=50&offset=50', 'previous': None, 'results': [{'id': 'fd94038c-f09f-4389-a51b-ffa03e798676', 'url': 'https://nautobot.demo.networktocode.com/api/dcim/devices/fd94038c-f09f-4389-a51b-ffa03e798676/', 'name': 'ams01-edge-01', 'device_type': {'id': '774f7008-3a75-46a2-bc75-542205574cee', 'url': 'https://nautobot.demo.networktocode.com/api/dcim/device-types/774f7008-3a75-46a2-bc75-542205574cee/', 'manufacturer': {'id': 'e83e2d58-73e2-468b-8a86-0530dbf3dff9', 'url': 'https://nautobot.demo.networktocode.com/api/dcim/manufacturers/e83e2d58-73e2-468b-8a86-0530dbf3dff9/', 'name': 'Arista', 'slug': 'arista', 'display': 'Arista'}, 'model': 'DCS-7280CR2-60', 'slug': 'dcs-7280cr2-60', 'display': 'Arista DCS-7280CR2-60'}, 'device_role': {'id': 'bea7cc02-e254-4b7d-b871-6438d1aacb76', 'url': 'https://nautobot.demo.networktocode.com/api/dcim/device-roles/bea7cc02-e254-4b7d-b871-6438d1aacb76/'
--- OUTPUT TRUNCATED FOR BREVITY ---
Cisco Webex API
When working with the Webex API, a bot must be created to get an API key. First create a bot in the dashboard https://developer.webex.com/docs/. Upon creating the bot you are provided a token which is good for 100 years. The token should then be included in the Authorization
header in all requests as shown below.
import requests
import os
# Get the API token from an environment variable
token = os.environ.get('WEBEX_TOKEN')
# Add the Authorization header
headers = {'Authorization': f'Bearer {token}'}
# This is the base URL for all Webex API calls
base_url = 'https://webexapis.com'
# Get list of rooms
response = requests.get(f'{base_url}/v1/rooms', headers=headers)
>>> response.json()
{'items': [{'id': 'Y2lzY29zcGFyazovL3VzL1JPT00vNjZlNmZjYTAtMjIxZS0xMWVjLTg2Y2YtMzk0NmQ2YTMzOWVi', 'title': 'nautobot-chatops', 'type': 'group', 'isLocked': False, 'lastActivity': '2021-10-22T19:37:38.091Z', 'creatorId': 'Y2lzY29zcGFyazovL3VzL1BFT1BMRS9iYmRiZDljNC1hMTRkLTQwMTYtYjVjZi1jOGExNzY0MWI1YWQ', 'created': '2021-09-30T18:44:11.242Z', 'ownerId': 'Y2lzY29zcGFyazovL3VzL09SR0FOSVpBVElPTi8zZjE3OTcwNi1mMTFhLTRhYjctYmEzZS01N2E0YTk2YjA4OWY'}, {'id': 'Y2lzY29zcGFyazovL3VzL1JPT00vNzBjZTgwYTAtMjIxMi0xMWVjLWEwMDAtZjcyZTAyM2Q2MDIx', 'title': 'Webex space for Matt', 'type': 'group', 'isLocked': False, 'lastActivity': '2021-09-30T17:18:33.898Z', 'creatorId': 'Y2lzY29zcGFyazovL3VzL1BFT1BMRS9iYmRiZDljNC1hMTRkLTQwMTYtYjVjZi1jOGExNzY0MWI1YWQ', 'created': '2021-09-30T17:18:33.898Z', 'ownerId': 'Y2lzY29zcGFyazovL3VzL09SR0FOSVpBVElPTi8zZjE3OTcwNi1mMTFhLTRhYjctYmEzZS01N2E0YTk2YjA4OWY'}, {'id': 'Y2lzY29zcGFyazovL3VzL1JPT00vOWIwN2FmMjYtYmQ4Ny0zYmYwLWI2YzQtNTdlNmY1OGQwN2E2', 'title': 'Jason Belk', 'type': 'direct', 'isLocked': False, 'lastActivity': '2021-01-26T19:53:01.306Z', 'creatorId': 'Y2lzY29zcGFyazovL3VzL1BFT1BMRS9jNzg2YjVmOC1hZTdjLTQyMzItYjRiNS1jNzQxYTU3MjU4MzQ', 'created': '2020-12-10T17:53:01.202Z'}, {'id': 'Y2lzY29zcGFyazovL3VzL1JPT00vNTYwNzhhNTAtMTNjMi0xMWViLWJiNjctMTNiODIxYWUyMjE1', 'title': 'NTC NSO Projects', 'type': 'group', 'isLocked': False, 'lastActivity': '2021-05-28T17:46:16.727Z', 'creatorId': 'Y2lzY29zcGFyazovL3VzL1BFT1BMR
--- OUTPUT TRUNCATED FOR BREVITY ---
Some APIs require that the API key be provided in a custom header that is included with all requests. The key and the format to use for the value should be spelled out in the API documentation.
Cisco Meraki
Cisco Meraki requires that all requests have an X-Cisco-Meraki-API-Key
header with the API key as the value.
As with the Token in Authorization Header method discussed previously, you must first go to the API dashboard and generate an API key. This is done in the Meraki Dashboard under your profile settings. The key should then be specified in the X-Cisco-Meraki-API-Key
header for all requests.
import requests
import os
# Get the API key from an environment variable
api_key = os.environ.get('MERAKI_API_KEY')
# The base URI for all requests
base_uri = "https://api.meraki.com/api/v0"
# Set the custom header to include the API key
headers = {'X-Cisco-Meraki-API-Key': api_key}
# Get a list of organizations
response = requests.get(f'{base_uri}/organizations', headers=headers)
>>> response.json()
[{'id': '681155', 'name': 'DeLab', 'url': 'https://n392.meraki.com/o/49Gm_c/manage/organization/overview'}, {'id': '575334852396583536', 'name': 'TNF - The Network Factory', 'url': 'https://n22.meraki.com/o/K5Faybw/manage/organization/overview'}, {'id': '573083052582914605', 'name': 'Jacks_test_net', 'url': 'https://n18.meraki.com/o/22Uqhas/manage/organization/overview'}, {'id': '549236', 'name': 'DevNet Sandbox', 'url': 'https://n149.meraki.com/o/-t35Mb/manage/organization/overview'}, {'id': '575334852396583264', 'name': 'My organization', 'url': 'https://n22.meraki.com/o/
--- OUTPUT TRUNCATED FOR BREVITY ---
HTTP Basic Authentication w/ Token
Some APIs require that you first issue an HTTP POST to a login url using HTTP Basic Authentication. A token that must be used on subsequent requests is then issued in the response. This type of authentication does not require going to an administrative portal first to generate the token; the token is automatically generated upon successful login.
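For reference, HTTP Basic Authentication carries the credentials in the Authorization header as the Base64 encoding of username:password. The requests module builds this header for you when it is given an HTTPBasicAuth object, so the sketch below is only meant to show what goes over the wire. It uses the standard library and placeholder credentials, and the helper name basic_auth_header is hypothetical.

```python
import base64


def basic_auth_header(username, password):
    """Return the Authorization header that HTTP Basic Authentication sends."""
    credentials = f"{username}:{password}".encode("utf-8")
    encoded = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {encoded}"}


# Placeholder credentials, for illustration only
print(basic_auth_header("user", "pass"))
```

Base64 is an encoding, not encryption, which is one reason this kind of login should only be sent over HTTPS.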
HTTP Basic Authentication/Token — Cisco DNA Center
The Cisco DNA Center login process requires that a request first be sent to a login URL with HTTP Basic Authentication, and upon successful authentication issues a token in the response. The token must then be sent in an X-Auth-Token
header in subsequent requests.
import requests
from requests.auth import HTTPBasicAuth
import os
username = os.environ.get('DNA_USERNAME')
password = os.environ.get('DNA_PASSWORD')
hostname = 'sandboxdnac2.cisco.com'
# Create an HTTPBasicAuth object that will be passed to requests
auth = HTTPBasicAuth(username, password)
# Define the login URL to get the token
login_url = f"https://{hostname}/dna/system/api/v1/auth/token"
# Issue a login request
response = requests.post(login_url, auth=auth)
# Parse the token from the response if the response was OK
if response.ok:
    token = response.json()['Token']
else:
    print(f'HTTP Error {response.status_code}:{response.reason} occurred')
# Define the X-Auth-Token header to be used in subsequent requests
headers = {'X-Auth-Token': token}
# Define the url for getting network health information from DNA Center
url = f"https://{hostname}/dna/intent/api/v1/network-health"
# Retrieve network health information from DNA Center
response = requests.get(url, headers=headers, auth=auth)
>>> response.json()
{'version': '1.0', 'response': [{'time': '2021-10-22T19:40:00.000+0000', 'healthScore': 100, 'totalCount': 14, 'goodCount': 14, 'unmonCount': 0, 'fairCount': 0, 'badCount': 0, 'entity': None, 'timeinMillis': 1634931600000}], 'measuredBy': 'global', 'latestMeasuredByEntity': None, 'latestHealthScore': 100, 'monitoredDevices': 14, 'monitoredHealthyDevices': 14, 'monitoredUnHealthyDevices': 0, 'unMonitoredDevices': 0, 'healthDistirubution': [{'category': 'Access', 'totalCount': 2, 'healthScore': 100, 'goodPercentage': 100, 'badPercentage': 0, 'fairPercentage': 0, 'unmonPercentage': 0, 'goodCount': 2, 'badCount': 0, 'fairCount': 0, 'unmonCount': 0}, {'category': 'Distribution', 'totalCount': 1, 'healthScore': 100, 'good
--- OUTPUT TRUNCATED FOR BREVITY ---
POST with JSON Payload
With this method of authentication, the user must first issue a POST to a login URL and include a JSON (most common), XML, or other type of payload that contains the user credentials. A token that must be used with subsequent API requests is then returned. In some cases the token is returned as a cookie in the response. When that is the case, a shortcut is to use a requests.session
object. By using a session
object, the token in the cookie can easily be reused on subsequent requests by sourcing the requests from the session
object. This is the strategy used in the Cisco ACI example below.
POST with JSON Payload — Cisco ACI
Cisco ACI requires a JSON payload to be posted to the /aaaLogin
URL endpoint with the username/password included. The response includes a cookie with key APIC-cookie
and a token in the value that can be used on subsequent requests.
import requests
import os
username = os.environ.get('USERNAME')
password = os.environ.get('PASSWORD')
hostname = 'sandboxapicdc.cisco.com'
# Build the JSON payload with userid/password
payload = {"aaaUser": {"attributes": {"name": username, "pwd" : password }}}
# Create a Session object
session = requests.session()
# Specify the login URL
login_url = f'https://{hostname}/api/aaaLogin.json'
# Issue the login request. The cookie will be stored in session.cookies.
response = session.post(login_url, json=payload, verify=False)
# Use the session object to get ACI tenants
if response.ok:
    response = session.get(f'https://{hostname}/api/node/class/fvTenant.json', verify=False)
else:
    print(f"HTTP Error {response.status_code}:{response.reason} occurred.")
>>> response.json()
{'totalCount': '4', 'imdata': [{'fvTenant': {'attributes': {'annotation': '', 'childAction': '', 'descr': '', 'dn': 'uni/tn-common', 'extMngdBy': '', 'lcOwn': 'local', 'modTs': '2021-10-08T15:31:47.480+00:00', 'monPolDn': 'uni/tn-common/monepg-default', 'name': 'common', 'nameAlias': '', 'ownerKey': '', 'ownerTag': '', 'status': '', 'uid': '0', 'userdom': 'all'}}}, {'fvTenant': {'attributes': {'annotation': '', 'childAction': '', 'descr': '', 'dn': 'uni/tn-infra', 'extMngdBy': '', 'lcOwn': 'local', 'modTs': '2021-10-08T15:31:55.077+00:00', 'monPolDn': 'uni/tn-common/monepg-default', 'name': 'infra', 'nameAlias': '', 'ownerKey': '', 'ownerTag': '', 'status': '', 'uid': '0', 'userdom': 'all'}}},
--- OUTPUT TRUNCATED FOR BREVITY ---
Certificate Checking
Note the verify=False
in the above example. This can be used to turn off certificate checking when the device or API you are targeting is using a self-signed or invalid SSL certificate. This will cause a warning similar to the following to be generated:
InsecureRequestWarning: Unverified HTTPS request is being made to host 'sandboxapicdc.cisco.com'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
For a production deployment, the right solution is to install a valid SSL certificate and leave certificate verification enabled rather than using verify=False. However, if you are dealing with lab devices that may never have a valid certificate, then the warning can be disabled using the following snippet:
import urllib3
urllib3.disable_warnings()
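When a lab device's certificate is signed by a private CA, another option is to keep verification on and point verify at that CA's bundle file, since the verify argument of requests accepts either a boolean or a path. The following is a minimal sketch, assuming a hypothetical environment variable LAB_CA_BUNDLE that holds the path; the helper name verify_option is also hypothetical.

```python
import os


def verify_option(env_var="LAB_CA_BUNDLE"):
    """Return a value suitable for the verify argument of requests.

    If the (hypothetical) environment variable points to a CA bundle file,
    return that path; otherwise return True to keep default verification.
    """
    ca_bundle = os.environ.get(env_var)
    return ca_bundle if ca_bundle else True


# With requests installed: session.get(url, verify=verify_option())
print(verify_option())
```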
Handling Errors
It is helpful when working with requests
to understand HTTP status codes and some of the common triggers for them when working with APIs. HTTP status codes indicate the success or failure of a request, and when errors occur, can give a hint toward what the problem might be. Here are some common HTTP status codes that you might see when working with APIs and potential causes:
200 OK: The request was successful
201 Created: Indicates a POST or PUT request was successful
204 No Content: Commonly returned for a successful DELETE request; the response has no body
400 Bad Request: Usually indicates there was a problem with the payload in the case of a POST, PUT, or PATCH request
401 Unauthorized: Invalid or missing credentials
403 Forbidden: An authenticated user does not have permission to the requested resource
404 Not Found: The URL was not recognized
429 Too Many Requests: The API may have rate limiting in effect. Check the API docs to see if there is a limit on number of requests per second or per minute.
500 Internal Server Error: The server encountered an error processing your request. Like a 400, this can also be caused by a bad payload on a POST, PUT or PATCH.
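The standard reason phrases for these codes are available from the Python standard library, which can be handy when building log or error messages. The sketch below uses http.HTTPStatus; the helper name describe_status and the wording of the category labels are illustrative choices.

```python
from http import HTTPStatus


def describe_status(code):
    """Return a short label such as '404 Not Found (client error)'."""
    status = HTTPStatus(code)  # raises ValueError for unknown codes
    if 200 <= code < 300:
        kind = "success"
    elif 400 <= code < 500:
        kind = "client error"
    elif 500 <= code < 600:
        kind = "server error"
    else:
        kind = "other"
    return f"{code} {status.phrase} ({kind})"


for code in (200, 201, 204, 400, 401, 403, 404, 429, 500):
    print(describe_status(code))
```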
When the requests
module receives the above status codes in the response, it returns a response object and populates the status_code
and reason
fields in the response object. If a connectivity error occurs, such as a hostname that is unreachable or unresolvable, requests
will throw an exception. However, requests
will not throw an exception by default for HTTP-based errors such as the 4XX and 5XX errors above. Instead it will return the failure status code and reason in the response. A common strategy in error handling is to use the raise_for_status()
method of the response object to throw an exception for HTTP-based errors as well. Then a Python try/except
block can be used to catch any of the errors and provide a more human-friendly error message to the user, if desired.
Note that HTTP status codes in the 2XX range indicate success, and thus
raise_for_status()
will not raise an exception.
# Example of error for which Requests would throw an exception
import requests
# Define a purposely bad URL
url = 'https://badhostname'
# Implement a try/except block to handle the error
try:
    response = requests.get(url)
    response.raise_for_status()
except requests.exceptions.RequestException as e:
    print(f"Error while connecting to {url}: {e}")
Error while connecting to https://badhostname: HTTPSConnectionPool(host='badhostname', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x108890d60>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known'))
# Example of HTTP error, no exception thrown but we force one to be triggered with raise_for_status()
# Define a purposely bad URL
url = 'https://nautobot.demo.networktocode.com/api/dcim/regions/bogus'
# Get the API token from an environment variable.
token = os.environ.get('NAUTOBOT_TOKEN')
# Add the Authorization header
headers = {'Authorization': f'Token {token}'}
# Implement a try/except block to handle the error
try:
    response = requests.get(url, headers=headers)
    response.raise_for_status()
except requests.exceptions.RequestException as e:
    print(f"Error while connecting to {url}: {e}")
Error while connecting to https://nautobot.demo.networktocode.com/api/dcim/regions/bogus: 404 Client Error: Not Found for url: https://nautobot.demo.networktocode.com/api/dcim/regions/bogus/
CRUD (Create, Read, Update, Delete) API Objects
So far we have mostly discussed retrieving data from an API using HTTP GET requests. When creating/updating objects, HTTP POST, PUT, and PATCH are used. A DELETE request would be used to remove objects from the API.
- POST: Used when creating a new object
- PATCH: Update an attribute of an object
- PUT: Replaces an object with a new one
- DELETE: Delete an object
It should be noted that some APIs support both PUT and PATCH, while some others may support only PUT or only PATCH. The Meraki API that we’ll be using for the following example supports only PUT requests to change objects.
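The difference between PATCH and PUT can be illustrated without any network calls. The sketch below models a stored object as a dictionary; the function names and sample fields are illustrative, and real APIs may apply their own rules (for example, required fields or server-managed attributes).

```python
def apply_patch(stored, changes):
    """PATCH semantics: update only the supplied attributes, keep the rest."""
    return {**stored, **changes}


def apply_put(stored, replacement):
    """PUT semantics: the supplied representation replaces the stored object."""
    return dict(replacement)


region = {"name": "Asia Pacific", "slug": "asia-pac", "description": ""}

patched = apply_patch(region, {"description": "Updated description"})
replaced = apply_put(region, {"name": "Test Region", "slug": "test-region-1"})

print(patched)
print(replaced)
```

Note that the replaced object no longer has a description, because the PUT payload did not include one.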
POST
When using a POST request with an API, you typically must send a payload along with the request in the format required by the API (usually JSON, sometimes XML, very rarely something else). The format needed for the payload should be documented in the API specification. When using JSON
format, you can specify the json
argument when making the call to requests.post
. For example, requests.post(url, headers=headers, json=payload)
. Other types of payloads such as XML
would use the data
argument. For example, requests.post(url, headers=headers, data=payload)
.
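To make the difference concrete: passing the json argument asks requests to serialize the dictionary and set a JSON Content-Type header, which is roughly what you would otherwise do by hand with the data argument. The sketch below builds that manual equivalent with the standard library only; the helper name json_request_kwargs is hypothetical.

```python
import json


def json_request_kwargs(payload, headers=None):
    """Build keyword arguments roughly equivalent to passing json=payload."""
    merged = dict(headers or {})
    # requests sets this header itself when the json argument is used
    merged.setdefault("Content-Type", "application/json")
    return {"data": json.dumps(payload), "headers": merged}


kwargs = json_request_kwargs({"name": "Asia Pacific", "slug": "asia-pac"})
print(kwargs)
# With requests installed: requests.post(url, **kwargs)
```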
Create a Region in Nautobot
With Nautobot, we can determine the required payload by looking at the Swagger docs that are on the system itself at /api/docs/
. Let’s take a look at the Swagger spec to create a Region
in Nautobot.
In the Swagger UI, fields marked with a red * are required; the other fields are optional. Clicking the Try it out button gives us an example payload.
Since the name
and slug
are the only required fields, we can form a payload from the example omitting the other fields if desired. The below code snippet shows how we can create the Region in Nautobot using requests.post
.
import requests
import os
# Get the API token from an environment variable.
token = os.environ.get('NAUTOBOT_TOKEN')
# Add the Authorization header
headers = {'Authorization': f'Token {token}'}
# This is the base URL for all Nautobot API calls
base_url = 'https://nautobot.demo.networktocode.com/api'
# Form the payload for the request, per the API specification
payload = {
    "name": "Asia Pacific",
    "slug": "asia-pac",
}
# Create the region in Nautobot
response = requests.post('https://nautobot.demo.networktocode.com/api/dcim/regions/', headers=headers, json=payload)
>>> response
<Response [201]>
>>> response.reason
'Created'
PATCH
A PATCH
request can be used to update an attribute of an object. For example, in this next snippet we will change the description of the Region we just created in the POST
request. It was omitted in the previous POST
request so it is currently a blank string. Although it is not called out in the Swagger API specification, the PATCH
request for Nautobot requires the id
field to be defined in the payload. The id
can be looked up for our previously created Region by doing a requests.get
on /regions?slug=asia-pac
. The ?slug=asia-pac
at the end of the URL is a query parameter that is used to filter the request for objects having a field matching a specific value. In this case, we filtered the objects for the one with the slug
field set to asia-pac
to grab the ID. In addition, the payload needs to be in the form of a list of dictionaries rather than a single dictionary as is shown in the Swagger example.
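Query parameters do not have to be written into the URL by hand. The sketch below shows how a filter such as slug=asia-pac is encoded, using the standard library's urlencode; the helper name build_url is hypothetical. With requests, the same effect is available through the params argument, as the Webex membership example later in this article does.

```python
from urllib.parse import urlencode


def build_url(base, params):
    """Append URL-encoded query parameters to a base URL."""
    return f"{base}?{urlencode(params)}"


url = build_url(
    "https://nautobot.demo.networktocode.com/api/dcim/regions/",
    {"slug": "asia-pac"},
)
print(url)
```

Encoding matters once values contain spaces or reserved characters, which urlencode escapes for you.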
Update a Region Description in Nautobot
import requests
import os
# Get the API token from an environment variable.
token = os.environ.get('NAUTOBOT_TOKEN')
# Add the Authorization header
headers = {'Authorization': f'Token {token}'}
# This is the base URL for all Nautobot API calls
base_url = 'https://nautobot.demo.networktocode.com/api'
# First we get the region_id from our previously created region
response = requests.get('https://nautobot.demo.networktocode.com/api/dcim/regions/?slug=asia-pac', headers=headers)
>>> response.json()
{'count': 1, 'next': None, 'previous': None, 'results': [{'id': 'be2c22a2-56ce-4d84-8ac9-5a68c6a39d62', 'url': 'https://nautobot.demo.networktocode.com/api/dcim/regions/be2c22a2-56ce-4d84-8ac9-5a68c6a39d62/', 'name': 'Asia Pacific', 'slug': 'asia-pac', 'parent': None, 'description': 'Test region created from the API!', 'site_count': 0, '_depth': 0, 'custom_fields': {}, 'created': '2021-10-22', 'last_updated': '2021-10-22T21:20:07.628690', 'display': 'Asia Pacific'}]}
# Parse the above response for the region identifier
region_id = response.json()['results'][0]['id']
# Form the payload for the request, per the API specification (see preceding paragraph for some nuances!)
payload = [{
    "name": "Asia Pacific",
    "slug": "asia-pac",
    "description": "Test region created from the API!",
    "id": region_id
}]
# Update the region in Nautobot
response = requests.patch('https://nautobot.demo.networktocode.com/api/dcim/regions/', headers=headers, json=payload)
>>> response
<Response [200]>
PUT
A PUT
request is typically used to replace an entire object, including all attributes of the object.
Replace a Region Object in Nautobot
Let’s say we want to replace the entire Region object that we created previously, giving it a completely new name, slug and description. For this we can use a PUT
request, specifying the id
of the previously created Region and providing new values for the name, slug, and description attributes.
import requests
import os
# Get the API token from an environment variable
token = os.environ.get('NAUTOBOT_TOKEN')
# Add the Authorization header
headers = {'Authorization': f'Token {token}'}
# This is the base URL for all Nautobot API calls
base_url = 'https://nautobot.demo.networktocode.com/api'
# First we get the region_id from our previously created region
response = requests.get('https://nautobot.demo.networktocode.com/api/dcim/regions/?slug=asia-pac', headers=headers)
>>> response.json()
{'count': 1, 'next': None, 'previous': None, 'results': [{'id': 'be2c22a2-56ce-4d84-8ac9-5a68c6a39d62', 'url': 'https://nautobot.demo.networktocode.com/api/dcim/regions/be2c22a2-56ce-4d84-8ac9-5a68c6a39d62/', 'name': 'Asia Pacific', 'slug': 'asia-pac', 'parent': None, 'description': 'Test region created from the API!', 'site_count': 0, '_depth': 0, 'custom_fields': {}, 'created': '2021-10-22', 'last_updated': '2021-10-22T21:20:07.628690', 'display': 'Asia Pacific'}]}
# Parse the above response for the region identifier
region_id = response.json()['results'][0]['id']
# Form the payload for the request, per the API specification (see preceding paragraph for some nuances!)
payload = [{
    "name": "Test Region",
    "slug": "test-region-1",
    "description": "Asia Pac region updated with a PUT request!",
    "id": region_id
}]
# Update the region in Nautobot
response = requests.put('https://nautobot.demo.networktocode.com/api/dcim/regions/', headers=headers, json=payload)
>>> response
<Response [200]>
# Search for the region using the new slug
response = requests.get('https://nautobot.demo.networktocode.com/api/dcim/regions/?slug=test-region-1', headers=headers)
# This returns the replaced object, while retaining the same identifier
>>> response.json()
{'count': 1, 'next': None, 'previous': None, 'results': [{'id': 'be2c22a2-56ce-4d84-8ac9-5a68c6a39d62', 'url': 'https://nautobot.demo.networktocode.com/api/dcim/regions/be2c22a2-56ce-4d84-8ac9-5a68c6a39d62/', 'name': 'Test Region', 'slug': 'test-region-1', 'parent': None, 'description': 'Asia Pac region updated with a PUT request!', 'site_count': 0, '_depth': 0, 'custom_fields': {}, 'created': '2021-10-22', 'last_updated': '2021-10-25T17:31:04.003235', 'display': 'Test Region'}]}
Enable an SSID in Meraki
Let’s look at another example of using a PUT
to enable a wireless SSID in the Cisco Meraki dashboard. For this we will use a PUT
request including the appropriate JSON payload to enable SSID 14.
import requests
import os
# Get the API key from an environment variable
api_key = os.environ.get('MERAKI_API_KEY')
# The base URI for all requests
base_uri = "https://api.meraki.com/api/v0"
# Set the custom header to include the API key
headers = {'X-Cisco-Meraki-API-Key': api_key}
net_id = 'DNENT2-mxxxxxdgmail.com'
ssid_number = 14
url = f'{base_uri}/networks/{net_id}/ssids/{ssid_number}'
# Initiate the PUT request to enable an SSID. You must have a reservation in the Always-On DevNet sandbox to gain authorization for this.
response = requests.put(url, headers=headers, json={"enabled": True})
DELETE
An object can be removed by making a DELETE
request to the URI (Uniform Resource Identifier) of an object. The URI is the portion of the URL that refers to the object, for example /dcim/regions/{id}
in the case of the Nautobot Region.
Remove a Region from Nautobot
Let’s go ahead and remove the Region that we previously added. To do that, we’ll send a DELETE
request to the URI of the region. The URI can be seen when doing a GET
request in the url
attribute of the Region object. We can also see in the API specification for DELETE
that the call should be made to /regions/{id}.
# Search for the region using the slug
response = requests.get('https://nautobot.demo.networktocode.com/api/dcim/regions/?slug=test-region-1', headers=headers)
# Parse the URL from the GET request
url = response.json()['results'][0]['url']
>>> url
'https://nautobot.demo.networktocode.com/api/dcim/regions/be2c22a2-56ce-4d84-8ac9-5a68c6a39d62/'
# Delete the Region object
response = requests.delete(url, headers=headers)
# A status code of 204 indicates successful deletion
>>> response
<Response [204]>
Rate Limiting
Some APIs implement a throttling mechanism to prevent the system from being overwhelmed with requests. This is usually implemented as a rate limit of X number of requests per minute. When the rate limit is hit, the API returns a status code 429: Too Many Requests
. To work around this, your code must implement a backoff timer in order to avoid hitting the threshold. Here’s an example working around the Cisco DNA Center rate limit of 5 requests per minute:
import requests
from requests.auth import HTTPBasicAuth
import time
from pprint import pprint
import os
# Pull in credentials from environment variables
username = os.environ.get('USERNAME')
password = os.environ.get('PASSWORD')
hostname = "sandboxdnac2.cisco.com"
headers = {"Content-Type": "application/json"}
# Use Basic Authentication
auth = HTTPBasicAuth(username, password)
# Request URL for the token
login_url = f"https://{hostname}/dna/system/api/v1/auth/token"
# Retrieve the token
resp = requests.post(login_url, headers=headers, auth=auth)
token = resp.json()['Token']
# Add the token to subsequent requests
headers['X-Auth-Token'] = token
url = f"https://{hostname}/dna/intent/api/v1/network-device"
resp = requests.get(url, headers=headers, auth=auth)
count = 0
# Loop over devices and get device by id
# Each time we reach five requests, pause for 60 seconds to avoid the rate limit
for i, device in enumerate(resp.json()['response']):
    count += 1
    device_count = len(resp.json()['response'])
    print(f"REQUEST #{i+1}")
    url = f"https://{hostname}/dna/intent/api/v1/network-device/{device['id']}"
    response = requests.get(url, headers=headers, auth=auth)
    pprint(response.json(), indent=2)
    if count == 5 and (i+1) < device_count:
        print("Sleeping for 60 seconds...")
        time.sleep(60)
        count = 0
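The fixed pause above works when the limit is known in advance. A complementary approach is to react to the 429 response itself and wait before retrying. The sketch below assumes the server may send a Retry-After header giving a number of seconds, which not every API does, and falls back to a fixed wait otherwise. The function name get_with_backoff and its parameters are hypothetical; send is any callable that takes a URL and returns a response-like object.

```python
import time


def get_with_backoff(send, url, max_attempts=5, default_wait=60, sleep=time.sleep):
    """Call send(url) and retry while the response status is 429.

    send is any callable returning an object with status_code and headers,
    for example: lambda u: requests.get(u, headers=headers)
    """
    response = send(url)
    attempts = 1
    while response.status_code == 429 and attempts < max_attempts:
        # Retry-After may be absent or an HTTP date; only plain seconds are handled here
        retry_after = response.headers.get("Retry-After", "")
        wait = int(retry_after) if retry_after.isdigit() else default_wait
        sleep(wait)
        response = send(url)
        attempts += 1
    return response
```

The sleep parameter exists only so that the wait can be replaced in tests. After max_attempts tries the last response is returned as is, so the caller should still check its status code.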
Pagination
Some API calls may set a limit on the number of objects that are returned in a single call. In this case, the API should return paging details in the JSON body including the URL to request the next set of data as well as the previous set. If Previous is empty, we are on the first set of data. If Next is empty, we know we have reached the end of the dataset. Some API implementations follow RFC5988, which includes a Link header in the format:
Link: <https://webexapis.com/v1/people?displayName=Harold&max=10&before&after=Y2lzY29zcGFyazovL3VzL1BFT1BMRS83MTZlOWQxYy1jYTQ0LTRmZWQtOGZjYS05ZGY0YjRmNDE3ZjU>; rel="next"
The above example is from the Webex API, which implements RFC5988. This is described in the API documentation here: https://developer.webex.com/docs/api/basics
Keep in mind that not all implementations use the RFC, however. The API documentation should explain how pagination is handled.
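In an RFC 5988 Link header, each target URL is enclosed in angle brackets and followed by parameters such as rel. The requests module already parses this header into response.links, which the Webex example later in this article uses, so the following is only a simplified sketch of what that parsing involves. The helper name parse_link_header and the example.com URLs are illustrative, and the pattern assumes rel is the first parameter and is quoted.

```python
import re


def parse_link_header(value):
    """Map each rel name to its URL for a header like '<url>; rel="next"'."""
    links = {}
    for url, rel in re.findall(r'<([^>]+)>;\s*rel="([^"]+)"', value):
        links[rel] = url
    return links


header = (
    '<https://api.example.com/items?page=2>; rel="next", '
    '<https://api.example.com/items?page=1>; rel="prev"'
)
print(parse_link_header(header).get("next"))
```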
Handling Pagination in Nautobot
A good example of pagination can be seen when making a GET
request to retrieve all Devices from Nautobot. Nautobot includes a count
, next
, and previous
attribute in responses that are paginated. By default, the API will return a maximum of 50 records. The limit value as well as an offset value are indicated in the next
value of the response. For example: 'next': 'https://nautobot.demo.networktocode.com/api/dcim/devices/?limit=50&offset=50'
. In the URL, the limit indicates the max amount of records, and the offset indicates where the next batch of records begins. The previous
attribute indicates the url for the previous set of records. If previous
is None, it means we are on the first set of records. And if next
is None, it means we are on the last set of records.
In the below snippet, we first retrieve the first set of 50 records and store them in a device_list
variable. We then create a while
loop that iterates until the next
field in the response contains None. The returned results are added to the device_list
at each iteration of the loop. At the end we can see that there are 511 devices, which is the same value as the count
field in the response.
import requests
import os
# Get the API token from an environment variable
token = os.environ.get('NAUTOBOT_TOKEN')
# Add the Authorization header
headers = {'Authorization': f'Token {token}'}
# This is the base URL for all Nautobot API calls
base_url = 'https://nautobot.demo.networktocode.com/api'
# Create the initial request for the first batch of records
response = requests.get(f'{base_url}/dcim/devices', headers=headers)
# Store the initial device list
device_list = [device for device in response.json()['results']]
# Notice that we now have the first 50 devices
>>> len(device_list)
50
# But there are 511 total!
>>> response.json()['count']
511
# Loop until 'next' is None, adding the retrieved devices to device_list on each iteration
if response.json()['next']:
    while response.json()['next']:
        print(f"Retrieving {response.json()['next']}")
        response = requests.get(response.json()['next'], headers=headers)
        for device in response.json()['results']:
            device_list.append(device)
Retrieving https://nautobot.demo.networktocode.com/api/dcim/devices/?limit=50&offset=50
Retrieving https://nautobot.demo.networktocode.com/api/dcim/devices/?limit=50&offset=100
Retrieving https://nautobot.demo.networktocode.com/api/dcim/devices/?limit=50&offset=150
Retrieving https://nautobot.demo.networktocode.com/api/dcim/devices/?limit=50&offset=200
Retrieving https://nautobot.demo.networktocode.com/api/dcim/devices/?limit=50&offset=250
Retrieving https://nautobot.demo.networktocode.com/api/dcim/devices/?limit=50&offset=300
Retrieving https://nautobot.demo.networktocode.com/api/dcim/devices/?limit=50&offset=350
Retrieving https://nautobot.demo.networktocode.com/api/dcim/devices/?limit=50&offset=400
Retrieving https://nautobot.demo.networktocode.com/api/dcim/devices/?limit=50&offset=450
Retrieving https://nautobot.demo.networktocode.com/api/dcim/devices/?limit=50&offset=500
>>> len(device_list)
511
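The same loop can be packaged as a generator so that callers simply iterate over devices without managing the next URL themselves. This is a sketch under the assumption that every page has results and next keys, as in the Nautobot responses above. The name iter_results is hypothetical, and get_json is any callable that takes a URL and returns the decoded JSON body.

```python
def iter_results(get_json, url):
    """Yield every item from a paginated API that returns 'results' and 'next'.

    get_json is any callable that takes a URL and returns the decoded JSON body,
    for example: lambda u: requests.get(u, headers=headers).json()
    """
    while url:
        page = get_json(url)
        yield from page["results"]
        # 'next' is None on the last page, which ends the loop
        url = page.get("next")
```

With requests, usage might look like device_list = list(iter_results(lambda u: requests.get(u, headers=headers).json(), f'{base_url}/dcim/devices')). Each page is decoded once, which avoids parsing the same response body several times.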
Handling Pagination in Cisco Webex
In the code below, we first get the room IDs for the Webex rooms I am a member of. We then retrieve the members of the DevNet Dev Support Questions room using a function that follows the Link URL and displays the content. The while loop ends when the Link header is no longer present, at which point headers.get('Link') returns None.
import requests
import re
import os
api_path = "https://webexapis.com/v1"
# You can retrieve your token here: https://developer.webex.com/docs/api/getting-started
token = os.environ.get('WEBEX_TOKEN')
headers = {"Authorization": f"Bearer {token}"}
# List the rooms, and collect the ID for the DevNet Support Questions room
get_rooms = requests.get(f"{api_path}/rooms", headers=headers)
for room in get_rooms.json()['items']:
    print(room['title'], room['id'])
    if room['title'] == "Sandbot-Support DevNet":
        room_id = room['id']
# This function will follow the Link URLs until there are no more, printing out
# the member display name and next URL at each iteration. Note that I have decreased
# the maximum number of records to 1 so as to force pagination. This should not be
# done in a real implementation.
def get_members(room_id):
    params = {"roomId": room_id, "max": 1}
    # Make the initial request and print the member name
    response = requests.get(f"{api_path}/memberships", headers=headers, params=params)
    print(response.json()['items'][0]['personDisplayName'])
    # Loop until the Link header is empty or not present
    while response.headers.get('Link'):
        # Get the URL from the Link header
        next_url = response.links['next']['url']
        print(f"NEXT: {next_url}")
        # Request the next set of data
        response = requests.get(next_url, headers=headers)
        if response.headers.get('Link'):
            print(response.json()['items'][0]['personDisplayName'])
        else:
            print('No Link header, finished!')
# Execute the function using the Sandbox-Support DevNet RoomID
get_members(room_id)
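The same Link-following loop can also be packaged as a generator, so callers simply iterate over pages. This is a minimal sketch rather than part of the original tutorial: iter_pages and its session parameter are hypothetical names, and it assumes the API announces the next page with a rel="next" Link header, which requests exposes as response.links.

```python
import requests


def iter_pages(url, session=None, **kwargs):
    """Yield each response, following rel="next" Link headers until none remain."""
    if session is None:
        session = requests.Session()
    while url:
        response = session.get(url, **kwargs)
        response.raise_for_status()
        yield response
        # response.links is an empty dict when there is no Link header,
        # so the chained .get() calls end the loop on the last page.
        url = response.links.get("next", {}).get("url")
        # Assumption: the next URL already carries its query string,
        # so the initial params are not sent again.
        kwargs.pop("params", None)
```

With the names from the example above, this could be called as iter_pages(f"{api_path}/memberships", headers=headers, params={"roomId": room_id}), reading response.json()['items'] from every page it yields.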
Wrapping Up
I hope this has been a useful tutorial on using requests to work with vendor REST APIs! While no two API implementations are the same, many of the most common patterns for working with them are covered here. As always, hit us up in the comments or on our public Slack channel with any questions. Thanks for reading!
-Matt
24 Dec. 2015, Python, 342317 views
The Python standard library has a number of ready-made modules for working with HTTP.
- urllib
- httplib (http.client in Python 3)
If you really want something hardcore, you can work with socket directly. But all of these modules share one big drawback: they are inconvenient to use.
First, there is an abundance of classes and functions. Second, the resulting code is not pythonic at all. Many programmers love Python for its elegance and simplicity, which is why a module was created to solve these problems: requests, or HTTP for Humans. At the time of writing, the latest version of the library is 2.9.1. Since the release of Python 3.5 I have made myself an unspoken promise to write new code only for Python >= 3.5. It is high time to move fully to the 3.x branch, so in my examples print is now a function, not a statement.
What can requests do?
To begin with, I want to show what HTTP code looks like with the modules from the Python standard library and what it looks like with requests. The target for our HTTP requests will be the very handy httpbin.org service.
>>> import urllib.request
>>> response = urllib.request.urlopen('https://httpbin.org/get')
>>> print(response.read())
b'{\n "args": {}, \n "headers": {\n "Accept-Encoding": "identity", \n "Host": "httpbin.org", \n "User-Agent": "Python-urllib/3.5"\n }, \n "origin": "95.56.82.136", \n "url": "https://httpbin.org/get"\n}\n'
>>> print(response.getheader('Server'))
nginx
>>> print(response.getcode())
200
>>>
By the way, urllib.request is built on top of the "low-level" httplib library (http.client in Python 3) that I mentioned above.
>>> import requests
>>> response = requests.get('https://httpbin.org/get')
>>> print(response.content)
b'{\n "args": {}, \n "headers": {\n "Accept": "*/*", \n "Accept-Encoding": "gzip, deflate", \n "Host": "httpbin.org", \n "User-Agent": "python-requests/2.9.1"\n }, \n "origin": "95.56.82.136", \n "url": "https://httpbin.org/get"\n}\n'
>>> response.json()
{'headers': {'Accept-Encoding': 'gzip, deflate', 'User-Agent': 'python-requests/2.9.1', 'Host': 'httpbin.org', 'Accept': '*/*'}, 'args': {}, 'origin': '95.56.82.136', 'url': 'https://httpbin.org/get'}
>>> response.headers
{'Connection': 'keep-alive', 'Content-Type': 'application/json', 'Server': 'nginx', 'Access-Control-Allow-Credentials': 'true', 'Access-Control-Allow-Origin': '*', 'Content-Length': '237', 'Date': 'Wed, 23 Dec 2015 17:56:46 GMT'}
>>> response.headers.get('Server')
'nginx'
For simple requests there is no significant difference between the two. But let's look at how each handles Basic Auth:
>>> import urllib.request
>>> password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
>>> top_level_url = 'https://httpbin.org/basic-auth/user/passwd'
>>> password_mgr.add_password(None, top_level_url, 'user', 'passwd')
>>> handler = urllib.request.HTTPBasicAuthHandler(password_mgr)
>>> opener = urllib.request.build_opener(handler)
>>> response = opener.open(top_level_url)
>>> response.getcode()
200
>>> response.read()
b'{\n "authenticated": true, \n "user": "user"\n}\n'
>>> import requests
>>> response = requests.get('https://httpbin.org/basic-auth/user/passwd', auth=('user', 'passwd'))
>>> print(response.content)
b'{\n "authenticated": true, \n "user": "user"\n}\n'
>>> print(response.json())
{'user': 'user', 'authenticated': True}
Can you feel the difference between pythonic and non-pythonic now? I think it is obvious. And although requests is nothing more than a wrapper around urllib3, which in turn is built on top of Python's standard facilities, convenience of writing code is the top priority in most cases.
requests offers:
- Many HTTP authentication methods
- Sessions with cookies
- Full SSL support
- Various convenience methods such as .json(), which return the data in the format you need
- Proxy support
- Sound and logical exception handling
I would like to talk about the last item in a little more detail.
Exception handling in requests
When working with external services, never rely on their fault tolerance. Everything goes down sooner or later, so we programmers need to be ready for it, preferably in advance and in a calm setting.
So, how does requests deal with the various failures that can happen during network connections? Let's first define the problems that may occur:
- The host is unreachable. This kind of error is usually caused by DNS configuration problems (DNS lookup failure).
- The connection is dropped on timeout.
- HTTP errors, that is, responses with 4xx or 5xx status codes.
- SSL connection errors (usually a problem with the SSL certificate: expired, not trusted, and so on).
The base exception class in requests is RequestException. All the others inherit from it:
- HTTPError
- ConnectionError
- Timeout
- SSLError
- ProxyError
And so on. The full list of exceptions can be found in requests.exceptions.
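Because these classes form a hierarchy, the order in which they are checked matters: ConnectTimeout inherits from both ConnectionError and Timeout, and SSLError and ProxyError inherit from ConnectionError, so a check for ConnectionError placed first would hide the more specific cases. The sketch below illustrates this; describe is a hypothetical helper, not part of requests.

```python
from requests import exceptions


def describe(error):
    """Map a requests exception to a short label, checking specific classes first."""
    checks = [
        (exceptions.ConnectTimeout, "connect timeout"),
        (exceptions.ReadTimeout, "read timeout"),
        (exceptions.SSLError, "SSL error"),
        (exceptions.ProxyError, "proxy error"),
        (exceptions.ConnectionError, "connection error"),
        (exceptions.HTTPError, "HTTP error"),
        (exceptions.RequestException, "other requests error"),
    ]
    for exc_type, label in checks:
        if isinstance(error, exc_type):
            return label
    return "not a requests error"
```

The same ordering rule applies to a chain of except clauses: list the most specific exception first and RequestException last.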
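Timeout
requests has two kinds of timeout exceptions:
- ConnectTimeout: the connection could not be established in time
- ReadTimeout: the server did not send data in time
The timeout argument accepts a (connect, read) tuple, which the examples below use to trigger each of the two.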
>>> import requests
>>> try:
...     response = requests.get('https://httpbin.org/user-agent', timeout=(0.00001, 10))
... except requests.exceptions.ConnectTimeout:
...     print('Oops. Connection timeout occurred!')
...
Oops. Connection timeout occurred!
>>> try:
...     response = requests.get('https://httpbin.org/user-agent', timeout=(10, 0.0001))
... except requests.exceptions.ReadTimeout:
...     print('Oops. Read timeout occurred')
... except requests.exceptions.ConnectTimeout:
...     print('Oops. Connection timeout occurred!')
...
Oops. Read timeout occurred
ConnectionError
>>> import requests
>>> try:
...     response = requests.get('http://urldoesnotexistforsure.bom')
... except requests.exceptions.ConnectionError:
...     print('Seems like dns lookup failed..')
...
Seems like dns lookup failed..
HTTPError
>>> import requests
>>> try:
...     response = requests.get('https://httpbin.org/status/500')
...     response.raise_for_status()
... except requests.exceptions.HTTPError as err:
...     print('Oops. HTTP Error occurred')
...     print('Response is: {content}'.format(content=err.response.content))
...
Oops. HTTP Error occurred
Response is: b''
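Note that requests does not raise an exception for a 4xx or 5xx response on its own; HTTPError appears only when raise_for_status() is called. The offline sketch below shows which status codes trigger it. The Response object is built by hand purely to avoid a network call (real code receives it from requests.get), and is_error_status is a hypothetical helper.

```python
import requests


def is_error_status(status_code):
    """Report whether raise_for_status() raises for the given status code."""
    response = requests.Response()  # built by hand only for illustration
    response.status_code = status_code
    try:
        response.raise_for_status()
    except requests.exceptions.HTTPError:
        return True
    return False
```

Redirects and successful responses pass through silently, so a call such as is_error_status(302) reports no error while 404 and 500 do.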
I have listed the main kinds of exceptions, which cover perhaps 90% of all problems that arise when working with HTTP. The key point: if we really intend to catch and handle something specific, we have to code that explicitly. If the concrete exception type does not matter, we can catch the common base class RequestException and act according to the situation, for example log the exception and re-raise it. By the way, I will write a separate, detailed post about logging.
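A minimal sketch of that "log it and re-raise" approach might look as follows. The fetch function, its session parameter, and the timeout values are assumptions chosen for illustration, not something the original post prescribes.

```python
import logging

import requests

logger = logging.getLogger(__name__)


def fetch(url, session=None, timeout=(3.05, 10)):
    """GET a URL; log any requests failure and re-raise it to the caller."""
    if session is None:
        session = requests.Session()
    try:
        response = session.get(url, timeout=timeout)
        response.raise_for_status()
    except requests.exceptions.RequestException:
        # logger.exception records the message together with the traceback.
        logger.exception("Request to %s failed", url)
        raise
    return response
```

Catching only RequestException keeps unrelated bugs, such as a TypeError in your own code, from being silently logged as network problems.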
Useful extras
- httpbin.org: a very handy service for testing HTTP clients, especially for testing non-standard service behavior
- httpie: a command-line HTTP client (a curl replacement) written in Python
- responses: a mocking library for requests
- HTTPretty: a mocking library for HTTP modules