Building a Web API
What is a Web API?
In earlier chapters we discussed the idea of using Python scripts as web clients to retrieve HTML pages and an HTML parser to scrape data from within the page. We then said that there is a better way to make data available to scripts, namely using a format that is designed for machine consumption such as XML or JSON. This chapter discusses some of the design decisions in building such an API and shows some examples of a Bottle application that returns JSON data.
In its simplest form, a web API is a way of making the resources in a web application available in machine readable form. This would enable a script to retrieve a version of the data hosted in an application without having to screen-scrape HTML. A more sophisticated API will allow all of the operations that are supported by the application to be carried out by scripts as well as by humans through the regular web interface.
In this context it is useful to reiterate the idea of a resource on the web. A resource is that thing referred to by a Uniform Resource Locator or URL. We can retrieve a representation of a resource with a GET request. A resource can correspond to any thing that is stored and manipulated by an application: a message, a personal profile, a document, a real-estate listing, the record of a bid in an auction. A resource can also be a collection of other resources, such as all of the listings on an auction site or all of the documents relating to a meeting.
When we design a web application, we should try to think about what the
resources are in the application and give each one a meaningful URL. A
collection will often have a URL ending in a /
like
http://example.org/listings/
while an individual resource would be
'within' the collection like
http://example.org/listings/bathroom-chair1293
. A URL is a unique
identifier for a resource, so should include enough information to name
it uniquely. This might include a distinct numeric code or other
identifying information. Here's an example from The Guardian newspaper
that includes date information as well as the title:
http://www.theguardian.com/technology/askjack/2015/may/05/how-does-a-domain-name-scam-work
- part of this URL is also a collection
(http://www.theguardian.com/technology/askjack
) that contains all
articles in this column, and http://www.theguardian.com/technology
is
all of the technology articles.
Designing URLs is a tricky task and there is no simple rule to follow. There are some useful discussions on the web: Kyle Neath, David Marland.
A web API isn't just about retrieving resources. A complete URI will support creating new resources and updating existing ones as well: all of the operations that are supported by the regular web application. This is often referred to as CRUD - Create, Read, Update, Delete: the operations that might be implemented on resources and collections. With an web API, these operations correspond to the HTTP verbs: GET (read), POST (create or update), PUT (create), DELETE (delete). A well designed API will support the CRUD operations on resources using these different verbs. So to add a new message to a collection of messages on a site we would use a POST request to the URL that refers to the collection of messages. To update (edit) a message, we'd send a POST request to the URL for the message itself. This style of API is part of what is known as REST (Representational State Transfer). REST is an architectural style for building web applications but at its core it supports naming resources with URIs and implementing operations on these via the HTTP verbs.
Implementing a JSON Response
Bottle makes it very easy to return JSON content in a response. If the return type of a handler is a Python dictionary, it will be converted to JSON in the response. Building on the database backed 'likes' application from the earlier chapter on databases, we'll develop a simple example JSON based API. As a reminder, here is the main page of that application that returns a list of the current likes and the form as an HTML page:
def index():
"""Home page"""
db = COMP249Db()
info = dict()
info['title'] = 'Welcome Home!'
# get the list of likes from the database
info['likes'] = get_likes(db)
return template('dblikes.tpl', info)
If we want to be able to get a version of this data as JSON, we can write a separate handler for a different URL and return a dictionary:
@app.get('/likes')
def getlikes():
"""Get a JSON version of the likes data"""
db = COMP249Db()
info = dict()
# get the list of likes from the database
info['likes'] = get_likes(db)
return info
This version just returns the info
dictionary that has one entry for
the key likes
that will be a list of the current entries from the
database. If we send a GET request to this URL we get the response:
{
"likes": [
"Cheese",
"eggs",
"bananas",
"fruit in general"
]
}
Note that I've formatted this to make it easier to read - by default it
will be returned without any newline characters. The result is a JSON
object with a likes
property who's value is a list of strings. It
contains the same information that would be used to populate the HTML
template, but in a form that can easily be read by a script via a JSON
parser.
A Client for our Web API
Now that we have a machine readable result returned from our
application, we can write a Python client to read this data. It is
simple to write a client that sends a request with urllib
, we then
need to parse the JSON result into a native Python data structure. This
can be done with the json module. Here is a function that will
return a list of likes by querying the remote web api:
from urllib.request import urlopen
from urllib.parse import urljoin
import json
BASE_URL = 'http://127.0.0.1:8080/'
def get_likes():
"""Query the web API for all of the current likes
return a list of strings"""
url = urljoin(BASE_URL, '/likes')
with urlopen(url) as fd:
text = fd.read().decode('utf-8')
likes = json.loads(text)
if 'likes' in likes:
return likes['likes']
else:
return []
if __name__=='__main__':
print(get_likes())
This module uses a global variable BASE_URL
that holds the URL of the
web application. We then add to the URL in the get_likes
function to
get the URL of the endpoint we want to access. We then make a request to
the URL using urlopen
. Using the json.loads
function we parse the
string containing the JSON result into a Python structure which will be
a dictionary. We then check whether the property 'likes'
is in the
dictionary and return the value if it is.
Reflecting on what we've implemented here, we have a Python web
application running on our local machine that has a local database and
serves HTML pages containing a form. We can interact with that form and
add 'likes' to the database. When we request the URL
http://127.0.0.1:8080/likes
the web application queries the database,
gets a Python list of likes, inserts them into a Python dictionary and
then returns them. Bottle turns this dictionary into JSON and sends it
back to the client. Our client script receives the JSON response, parses
the JSON into a Python dictionary and extracts the list of likes. This
is a roundabout way of passing data from one script to another but the
big advantage is that these scripts could be running on different hosts
or could be written in different languages and this would work fine.
HTTP is being used as an inter-application protocol to exchange data
between running systems.
The Update Operation
We can now read data in machine readable form using an HTTP request. The
next step is to be able to update the data stored on the server - to add
a new like. To do this we need a POST request, just like submitting the
form on the web page. In fact we already have a handler for this, since
the POST handler for the /likes
URL accepts a url-encoded form
containing the new like and stores it in the database. At one level
then, we don't need to do anything, however, we could make the API a
little more consistent if this request also used JSON in the same format
as returned by the GET request. This would also allow us to set more
than one like in one request. So, let us extend the POST handler to
accept either an encoded form or a JSON object containing a property
'likes'
; if we get a JSON object we add all of the items to the
database.
The first task is to test whether the request contains a url-encoded
form or a JSON payload. We can do this using the Content-Type header in
the request which Bottle makes available as request.content_type
.
@app.post('/likes')
def like():
"""Process like form post request"""
# initialise likes to an empty list
if request.content_type == 'application/json':
return likes_json()
else:
return likes_form()
def likes_json():
"""Handle the /likes POST request from a JSON submission"""
if 'likes' in request.json:
likes = request.json['likes']
else:
likes = []
for like in likes:
store_like(db, like)
# response with an updated list of likes as JSON
return {'likes': get_likes(db)}
In implementing this I've split the two kinds of response into two
handler functions to keep them simple and then called the appropriate
one based on the content type of the request. The form handler is the
same as the original handler but the JSON handler uses the
request.json
property supplied by Bottle which is the parsed JSON
object (Bottle is smart enough to provide this when the content type is
application/json
). The JSON handler also handles multiple likes and
adds them all to the database and returns an updated list of likes
rather than a redirect to the home page. This is more appropriate
behaviour for a request that we will make from a script.
Now let's look at a client for the new API endpoint for adding likes.
This will have to encode the list of likes as a JSON object and send the
request with the appropriate content type. To do this we need to do a
bit more work thank before with urlopen
; we need to make an explicit
Request
object to be able to set the headers. Here is a client
function that will add a new list of likes and return the updated list
from the response:
def add_likes(likes):
"""Add the strings in likes to the remote application
return a list of the new set of likes"""
url = urljoin(BASE_URL, '/likes')
data = {'likes': likes}
params = json.dumps(data).encode('utf8')
req = Request(url, data=params,
headers={'content-type': 'application/json'})
# send request and decode the response
with urlopen(req) as fd:
text = fd.read().decode('utf-8')
if text:
likes = json.loads(text)
if 'likes' in likes:
return likes['likes']
else:
return []
Summary
This chapter has introduced the idea of designing a web based API for providing access to the resources served by a web application. We've shown how to implement some of the basic operations (Read and Update) using Bottle examples and looked at a simple client for the API. There is a lot more to this topic because web APIs can become very complex. For example, we might need to deal with authentication, which is complicated because we can't ask someone to fill in their username and password in a script. We also need to deal with the Create and Delete operations from CRUD.