Watzmann.Blog - Evolution for REST API's

Evolution for REST API's

03 August 2012

Like everything, REST API’s change over time. An important question is how these changes should be incorporated into your API, and how your clients should behave to survive that evolution.

The first reflex of anybody who’s thought about API’s and their evolution is to stick a version number on the API, and use that to signal to clients what capabilities this incarnation of the API has, and maybe even let clients use that to negotiate how they talk to the server. Mark has a very good post explaining why, for the Web, that is not just undesirable, but often not feasible.

If versioning is out, what else can be done to safely evolve REST API’s ? Before we dive into specific examples, it’s useful to recall what our overriding goal is. Since it is much easier to update a server than all the clients that might talk to it, the fundamental aim of careful evolution of REST API’s is:

Old clients must work against new servers

To make this maxim practical, clients need to follow the simple rule:

Ignore any unexpected data in the interaction with the server

In particular, clients can never assume that they have a complete picture of what they will find in a response from the server.

Let’s look at a little toy API to make these ideas more tangible, and to explore how this API can change while adhering to these rules. The API is for a simplistic blogging application that allows posting articles, and retrieveing them. For the sake of simplicity, I will omit all HTTP request and response headers.

A simple REST API

In sticking with good REST practice, the API has a single entrypoint at /api. Issuing a GET /api will result in the response

<api>
  <link rel="articles" href="/api/articles"/>
</api>

The articles collection can be retrieved with a GET /api/articles:

<articles>
  <article href="/api/articles/1">
    <title>Evolution for REST API's</title>
    <content>
      Like everything, ....
    </content>
  </article>
  <article href="/api/articles/2">
    ...
  </article>
  <actions>
    <link rel="create" href="/api/articles"/>
  </actions>
</articles>

Each article consists of a title and some content; the href on each article gives clients the URL from which they can retrieve that article, and serves as a unique identifier for the article.

The actions element in the articles collection tell the client that they can create new articles by issuing POST requests to /api/articles:

<article>
  <title>How to version REST API's</title>
  <content>...</content>
</article>

It’s worth pointing out a subtlety in including a link for the create action: one reason for including that link is to tell clients the URL to which they can POST to create new articles, and keep them from making assumptions about the URL space of the server. A more important reason though is that we use the presence of this link to communicate to the client that it may post new articles. This, following the HATEOS constraint for REST API’s, is the more important reason to include an explicit link: clients should not even assume that they are allowed to create new articles.

Adding information from the server

Readers might want to know when a particular article has been made available. We therefore add a published attribute to the representation of articles that a GET on the articles collection or on an individual article’s URI returns:

<article href="/api/articles/2">
  <title>How to version REST API's</title>
  <content>...</content>
  <published>2012-08-03T13:00</published>
</article>

This does not break old clients, because we told them to ignore things they do not know about. A client that only knows about the previous version of our API will still work fine, it just won’t do anything with the published element.

Allowing more data when creating an article

Some articles might be related to other resources on the web, and we’d want to let authors call them out explicitly in their articles. We therefore change the API to accept articles with some additional data on POST /api/articles:

<article>
  <title>Great REST resources</title>
  <content>...</content>
  <related>
    <link rel="background" href="http://en.wikipedia.org/wiki/Representational_state_transfer"/>
    <link rel="background" href="http://en.wikipedia.org/wiki/HATEOAS"/>
  </related>
</article>

As long as our new API allows posting of articles without any related links, old clients will continue to work.

Blogging API’s everywhere

If our blogging software is so successful that clients must be prepared to deal with both servers that support adding related reosurces, and ones that do not, we need a way to indicate that to those clients that know about related resources. While there are many ways to do that, one that we’ve found works well for Deltacloud is annotating the collections in the toplevel API entrypoint. When a client does a GET /api from a server that supports related resources, we’d send them the following XML back:

<api>
  <link rel="posts" href="/api/posts">
    <feature name="related_resources"/>
  </link>
</api>

Updating articles

Authors want to revise their articles from time to time; we’d make that possible by allowing them to PUT the updated version of an article to its URL. This won’t introduce any problems for old clients, but new clients will need to know whether the particular instance of the API they are talking to supports updating articles. We’d solve that by adding actions to the article itself, so that a GET of an article or the articles collection will return

<article href="/api/posts/42"/>
  <title>...</title>
  ...
  <actions>
    <link rel="update" href="/api/posts/42"/>
  </actions>
</article>

Not only does the update link tell clients that they are talking to a version of the blogging API that supports updates, it also lets us hide complicated business logic that decides whether an article can be updated or not by simply showing or suppressing the update link.

Merging blogs

Because of its spectacular content, our blog has been so successful that we want to turn it from a personal blog into a group blog, supporting multiple authors. That of course calls for adding the name of each author (or their avatar or whatnot) to each post — in other words, we want to make passing in an author mandatory when creating or updating an article. Rather than break old clients by silently slipping in the author requirement, we add a new action to the articles collection:

<articles>
  <post>...</post>
  <actions>
    <link rel="create_with_author" href="/api/articles_with_author"/>
    ...
  </actions>
</articles>

Old clients will ignore that new action; the remaining question is if we can still allow old clients to post new articles. If we can, for example, by defining a default author out-of-band with this API, we’d still show the old create action in the articles collection. If not, we’d take the ability to post away from old clients by not displaying the create action anymore — but we haven’t broken them, since they can still continue to retrieve posts, we’ve merely degraded them to readonly clients.

While this seems like an extreme change, consider that we’ve changed our application so much that existing clients can simply not provide the data we deem necessary for a successful post. It’s much more realistic that we’d find a way to let old clients still post articles using the old create link.

Some consequences for XML

There are two representations that are popular with REST API’s: JSON and XML. The latter poses an additional challenge for the evolution of REST API’s because the use of XML in REST API’s differs subtly from that in many other places. Since clients can never be sure that they know about everything that might be in a server’s response, it is not possible to write down a schema (or RelaxNG grammar) that the client could use to validate server responses, since responses from an updated server would violate that schema, as the simple example of adding a published date to articles above shows.

It’s of course possible to write down RelaxNG grammars for a specific version of the API, but they are tied to that specific version, and must therefore be ignored by clients who want to happily evolve with the server.

Questions ?

I’ve tried to cover all the different scenarios that one encounters when evolving a RESTful API — I’ve left out HTTP specific issues like status codes (must never change) and headers (adding new optional headers is ok) as the Openstack folks have decided for their API Change Guidelines.

I’d be very curious to hear about changes that can not be addressed by one of the mechanisms described above.

Watzmann.Blog Varying amounts of fiber