>>Zach Maier: Alright. Hi, everybody. Welcome
to this session. I can't believe it's standing
room only. This is pretty exciting [laughs]
I wasn't expecting that. So my name is Zach
Maier and I'm the product manager for our
API Infrastructure team. Joining me on stage
later will be Mark Stahl, the tech lead for
the API Infrastructure team, and Yaniv Inbar
and Joey Schorr, two of the engineers on the
API Infrastructure team. So, just like every
other session we've had so far during IO,
you can follow along with live notes and leave
comments and questions for the end at bit.ly/apiwave
and if for some reason you really, really
want to live tweet what I'm saying right now,
you can do it with the #googleapi8 so people
can follow along. Alright, so this session
is a little bit different from the other sessions
we've had at Google IO. It's kind of a sneak
peek at how Google's been building APIs in
the past and what we've noticed and how we're
going to build APIs in the future so you're
going to get to see like the foundation of
our APIs for the past five years and then
some upcoming technology that we're really,
really excited about. So, what we're going
to talk about - this is a 200 level session
so we're talking about a few advanced technologies
and some terms that not everybody might know
like off the top of their head so I'm going
to give you a quick refresher as to Google's
API 101, the underlying technology that all
Google APIs are built on. We'll then go into
- Mark will come up and talk about how we're
making future APIs awesome, things we've noticed
in the past, and how we're going to be fixing
those. Some things that have already been
fixed and some things will be fixed in the
future so you can be on the lookout there
for some power features that you can use as
you're programming Google APIs, and then what
you're really probably all here waiting to
see, Joey's going to come up and show a really
awesome demo of the internal tools we use
to build our own APIs. Kind of a behind the
scenes. Nobody has ever seen this before outside
of Google so it's really exciting stuff and
then of course, questions and comments. If
you have any questions or comments, again,
bit.ly/apiwave. Alright, so Google's API 101.
What are the underlying technologies that
all Google APIs work on? Well, first off,
since we're a web based company all Google
APIs use REST. REST is short for Representational
State Transfer and very simply speaking, it's
clients and servers exchanging resource representations
somehow. It's good for cached and layered
systems which is basically the Internet, right?
So cached and layered systems; Representational
State Transfer. In HTTP, every time you make
a get request to a resource, you are using
a REST based system so in this case
if you're using the G data YouTube API, you
get the resource and it will return like a
list of videos or something like that. The
resource representations that we use in Google
APIs - right now we use AtomPub which means
they're modeled as feeds of entries and more
generally it's useful to think of these as
collections of resources and you'll see why
later. So think of - anytime you get a representation,
it's a collection of a bunch of different
resources and just to make sure we're all
on the same page with what a collection of resources
means: this is sitting on a server right now.
I have a collection and I have a bunch of
resources in there. Now, depending on which
API I'm using, this collection can be a collection
of contacts if it's a contacts API, a collection
of videos if it's a YouTube API, a collection
of documents, or anything else living in the
cloud if it's the Documents List API.
Each resource is a document or a contact or
a video in that case. So, what can I do with
this information that sits on the server?
Well, first off, I can get that information
and bring it to my local client, HTTP Verb
Get, everybody should know what that means.
So once this is on my client, this resource
is on my client, I can of course modify the
resource. Once I modify the resource, I can
use PUT Verb and put that resource right back
into the collection, essentially updating
that resource with any of the changes I made
on the client. So, what else can I do to these
resources? Of course, I can delete a resource
which if I delete a resource, guess what happens?
It gets deleted. I can also post a new resource
into the collection. Now this is inherently
different than every other operation we have
up here so far. This post actually operates
on the collection, not the resource because
since there's no resource living on the server
so far, you have to post into the collection
and then once you post into the collection,
new stuff appears in the collection. One other
thing you can do on the collection level is
you can also get everything in a collection
which in essence lists all of the different
resources so you can pull all of the resources
by calling a GET on the collection. Alright,
so that's basic REST: collections and resources,
the underlying fundamental technology of
how all our APIs work to date. So, a quick
question for everybody. Who's seen this logo
before? Anybody? Anybody? A few people have,
which props to you guys, because you've been
around for a long time then because this is
the very, very first Google Data APIs logo
and it's kind of obvious what it looks like,
right? It looks like an atom and as time has
progressed, we've dropped the little electron
arrows flying around and we have the Google
data cube that we use to represent APIs nowadays.
So what I'm trying to get at with this awesome
history of the logo over the past five years
is that right now Google data is basically
equivalent to Atom. You must understand Atom
to use the APIs. The core of our
APIs is built around the Atom Syndication
Format and the Atom Publishing Protocol and
then over time we've extended those core features
with features like query, batch, and concurrency.
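To make the verb-and-collection model above concrete, here is a sketch of the raw HTTP requests involved (the URLs are invented for illustration, not real Google API endpoints):

```
GET    /feeds/contacts          list every resource in the collection
POST   /feeds/contacts          create: add a new resource to the collection
GET    /feeds/contacts/123      retrieve a single resource
PUT    /feeds/contacts/123      update: write the modified resource back
DELETE /feeds/contacts/123      delete the resource
```

Note that only POST and the list GET operate on the collection itself; the other verbs address an individual resource.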
So far, like this approach that we've taken,
this Atom based approach, has been very, very
successful. We have more than 25 APIs at Google,
as I'm sure you're all well aware, and across
all those APIs we get about 2
billion hits per day which is a pretty impressive
traffic number. If you divided that out, you
can figure out what QPS that is. So, a lot
of you have used most of these APIs before;
Blogger, YouTube, calendar, spreadsheets
- all these APIs run on the foundation so
we have this awesome foundation built up but
we
want to keep it going, right? We don't want
to stop with what we have. We want to make
it better, we want to get more APIs out there,
and we want these APIs that we launch to be
higher quality. So, inside of Google, and
this is the look behind the scenes that I've
guaranteed everybody. We are moving to a brand
new API Infrastructure and the cool thing
about this is we've done it transparently
so if you're using any of these new APIs that
have been new in the past few weeks; the Google
Moderator API, the Google Buzz API, or the
Google Latitude API; you are using this brand
new infrastructure. Alright, so I've given
you the foundation, laid the foundation here,
told you that we're moving to a brand new
infrastructure but I'm just a product manager
so I can talk about all these cool things
but to actually tell you how it's really done
here is the tech lead, Mark Stahl. [applause]
>>Mark Stahl: Hello. My name is Mark Stahl
and I'm the tech lead for Google APIs. I have
been building APIs for Google for about five
years, or we've been building infrastructure
that has allowed teams to build APIs for
about five years. In this part of the talk,
I'm going to explain a little bit about some
of the things we've learned over those five
years; we've been trying to listen; and some
of the rough edges we've noticed and the ways
we're trying to improve APIs going forward
and the new features and stuff that we're
trying to build into our new stack. So, one
of the - I'm going to talk about some of the
rough edges in a couple of areas. First is
just that we're dealing with resource representations
on the wire and so the formats on the wire
are very important. This is what you guys
are actually - these are the resources you're
manipulating. I'll deal a little bit with
REST itself and some of the difficulties that
it provides when trying to do certain types
of operations and we also maintain a set of
libraries that we try to make available to
help you use our APIs and I'll discuss some
of the difficulties we've had and the changes,
again, that we're making going forward to
make these things better. So, first, in the
output formats, one of the things about REST
you'll know is that it's based on transferring
a resource or, the technical way to say it, a
representation of a resource; that is, some
wire document that says this is the state
of everything in that resource. Now, in order
to modify, you saw that you have to transfer
that resource representation twice. You have
to pull down a full copy of the state of the
resource, you modify what you like, and you
put back a full copy; and this means that
just to do as much as changing a single flag
requires you actually changing, transferring
the whole resource representation twice, and
a resource can be pretty much anything. It
can be, you know, five lines of configuration
or it can be 400 pieces of meta data and sometimes
that can get a little bit verbose. Also, we
built on AtomPub right from the beginning
which means we're using Atom Syndication Format
and XML as the core document and we also
- maybe some of you have realized that -
resources on the wire can be a little bit
verbose. We get this problem a lot of times.
So, we've been looking at this problem for
a while and we have been looking for RESTful
ways to solve this problem and we have introduced
now, in the last couple of months, you may
have seen come out, we allow for the possibility
of partial operations. Realistically speaking,
you are only operating on part of a resource
90% of the time, only the part that you
want, so partial operations come in two basic
flavors. There is the partial response which
is when you return this resource to me, only
give me the part of the resource that I really
want, and I'll just give you a quick example.
Here's an example from YouTube and I'm just
going to click through and show you. This
is the full XML resource. This is a search
of YouTube for the number one Google IO video
and this is a lot of feed metadata coming
up here at the top. You can see the actual
entry, this is the beginning of the actual
result. Lots and lots of metadata, lots and
lots of metadata; somewhere in here is the
content that I want. Can anyone see it? I
think it's up here. No, I can't find it. I
know it's in here because this is actually
live and you can see that this is an awful
lot of data to transfer for one resource so
assuming that what you want to do is only
say - you say display just the title and of
course once you have the content field, you
only want to actually display the content
itself which is a link at the moment to Flash.
All you really want is these two fields. So,
partial GET, every operation now supports
this concept of a fields parameter and in
the fields parameter, you specify a mask
and a mask just says give me what exactly
matches this mask and just to make life a
little bit easier, let me show you a live
result. Resolving proxy, that's not good,
is it? Ah, there it is! So this is actually
a live result and you'll see this is - if
you're say working in a mobile environment
and you're trying to get just a few things,
this is a big deal. This is something that
people have really been wanting for a long
time. It's still RESTful and it works just
the way you need it to and you'll see here
this is actually a full document in terms
of its structure. It actually has a feed tag,
it has an entry tag, and it has a content
tag so it still parses the same way except
it's only a subset of what Atom Syndication
Format requires. So, going back - we also
have defined another part of partial which
we call the partial update. In this case,
you use the exact same masking syntax and
you say I'm going to send back the same partial
representation and only update some fraction
of that resource. Now I'm not going to go
into this in detail. This was actually launched
and this was discussed actually at last Google
IO and we launched it like about three months
ago so it's available now on about four APIs
and you can actually go and read all the details.
If you are working in a bandwidth constrained
environment or you really need to save space
on a device, this is the type of feature you're
going to want to exploit so this is something
that's currently available and we'll be rolling
out to all APIs, all Google Data APIs, in the future.
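As a rough sketch of what partial operations look like on the wire (the mask and URL below are illustrative; check each API's documentation for the exact syntax it accepts, and note that partial update may use PATCH or a POST with a method override):

```
# Partial response: only return each entry's title and content
GET /feeds/api/videos?v=2&fields=entry(title,content)

# Partial update: send back only the fields being changed,
# using the same mask syntax to say which ones
PATCH /feeds/api/videos/VIDEO_ID?fields=title
```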
Another issue that some of you may have dealt
with; how many people are programming in JavaScript?
Anybody? Not me actually. I'm programming
in Java but perhaps you [laughs] you're
trying to use XML results in JavaScript so
we realized that we need to be able to offer
alternate formats. XML works great in languages
that have a lot of XML support but JSON
is actually another format that a lot of people
want and in fact formats need to be flexible.
Resources are - it doesn't necessarily mean
AtomPub. A resource can actually be represented
any number of ways and so we're supporting
multiple formats, and by multiple formats
I mean these are native to the architecture.
They are both readable and writeable. Now,
this required us to actually deal with some
architectural issues. When we started building
APIs five years ago, AtomPub was all the hotness
and we built our services actually exported
directly into the Atom Syndication Format
so all our services were tied to Atom. When
JSON developers came to us and said, "We want
JSON." Well, we built JSON; however, has anybody
here worked with our old format JSON? Yeah,
there's a reason and the reason is the JSON
is actually XML coded as JSON and it's not
the most pretty thing. For JSON developers, of
course, it's not a natural structure to have
to put namespaces in your JSON objects. It's
not a pretty sight. Another feature of this
particular hack that we implemented is that
it's only one way. This is a read only API
so if you used the JavaScript client libraries,
you actually could write back but what you
didn't see was under the hood we transcoded
it back to XML and then sent it back to the
server, and that was because our servers were
built around the concept of reading and writing
and parsing Atom's Publishing Protocol and
Atom Syndication Format. So, what we've done
is we've re-architected how we build APIs
from the ground up and we've built and introduced
a generic data concept which you'll see here.
We're using, of course, Google's favorite
structure which is Protocol Buffers, and you'll
see a little bit later exactly how we do this
but what we've also introduced then is just
what everybody else is familiar with is templating
languages but these aren't just simple templating
languages, these are bidirectional templating
languages. So by writing a template, you actually
get both a read and a write format so we've
solved some difficult problems in order to
make it possible for people to really build
APIs in new formats so just to give you
a quick, brief idea of how this really impacts
APIs, this is our new Buzz API. You may have
all seen this here. So, this is the Atom that
comes out of Buzz and it's an awful lot of
Atom here. Buzz is built on the activity streams
specification which means they have an awful
lot of metadata but you can see there's an
awful lot of metadata in order to get one
piece of content here and somewhere in here
- here he is, there it is. This was my content,
I made a Buzz. I'm excited to be speaking
at Google IO. Now, if you look at the same
thing, all you have to do now is specify alt=json
and what you're going to get back is
this somewhat ugly blob but it actually gets
neater - if I can find the right key combination
- ah. So, this is a much neater structure.
This is actually native JSON, it's not XMLized
JSON and you'll see somewhere in here, a lot
easier to read, is my actual content labeled
as actual content so APIs going forward will
now support their own native JSON read write
format. We are no longer constrained by the
syntax requirements of Atom, we can actually
build a format that's natural for you to work
with, and one of the other nice features of
this particular change is that the way we
re-architected our system, this templating
model, doesn't restrict us to just these formats.
We can actually start introducing other formats
just as easily and in fact we hope to introduce
new formats in the future. Whatever the new
hotness is, we're ready to be able to introduce
it into our systems so our APIs will grow
as the Web changes and our APIs will be able
to adapt to how they actually change. So,
I'd like to switch to our second topic, things
we've noticed in APIs. REST is very much a
popular approach, an architectural style and
there's a good reason REST is a popular architectural
style. It's built - REST is based on transferring
these resource documents. It's exactly what
all your HTTP and all your web browsing does,
transfer documents. It's really great for
cacheability. It's really simple for people
to use these type of APIs but the way it works
actually means it can be awkward for certain
type of operations so I built up a small example
here. This example starts with the idea of
Picasa Web Albums. Say you want to rotate a photo
and we're going to rotate this photo in binary
and we're going to do this the very traditional
REST way. First, transfer your resource representation
over the wire and get your JPEG. Rotate your
photo and write your photo back. So you see
here, depending on how big this resource representation
is, I've actually transferred a photo twice
over the wire to do what's a fairly simple
operation. Now, how would we solve this in
other ways? Typically you think, "Oh, I should
just be able to send the server a command
saying rotate this photo." Well, there is
no such command in REST. You can do things
like oh maybe I expose metadata that says
let's give the photo a rotate metadata. This
is a certain hack that we've done. This actually
is how Picasa did it because we were constrained
by REST. Now, of course, you have
to transfer this rotate state to set it back.
There is no way to send an imperative command
and there are certain types of imperative
commands that are even more common. Say, send
email to all attendees of a conference event
or reset this machine. This type of imperative
statements are inherently difficult to do
in REST. You can always hack it. Everything
can be done, everything can be faked but it's
not a natural approach. A natural approach
would be RPC but to switch to RPC, you're
giving up on REST and you are now in a world
where you don't have the benefits so what
we've decided to do at Google is we've decided
that we're going to be introducing a very
lightweight form of RPC, a RESTful approach,
the idea that resources can have extra options
on them instead of the basic three or four
verbs that REST gives, we are actually introducing
a form of RPC we call Custom Verbs, and this
says that a resource can export
something that says, when REST is a difficult
way to approach it, here's a custom verb that
lets me do to that resource exactly what I want.
So, in the Picasa web case, this simplifies
our world greatly. All I do is I send along
a command saying identify my resource, here's
the operation I wish to perform on it, here's
the parameters, and away I go. Now you'll
see that we've really reduced the amount of
information on the wire and we've made it
possible. Within the construct, we still have
a RESTful API as our base but we've given
ourselves a way to stop the struggle between
RPC and REST and solve some of the tricky
problems by allowing more capability to write
APIs when it's appropriate. So, I'll just
show another example real quickly. You probably
all have dealt with - has everybody used our
task list API? That's because we don't have
a task list API but if we did this is how
you would say set a task done. You take a
task, you go to its resource, get it, modify
the done bit, and then you're done and basically
put it back. A custom verb approach then is
just to have a method, mark done, and using
the same resource identifier you can now perform
operations on resources so this is a feature
that we're planning on rolling out on APIs
as necessary in order to make them more powerful.
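A sketch of the difference, using that same hypothetical task list API (which, remember, doesn't actually exist):

```
# Traditional REST: transfer the whole resource twice to flip one bit
GET /tasks/123                  -> full task resource
PUT /tasks/123                  <- full task resource with done set to true

# Custom verb: one small imperative request, no resource transfer
POST /tasks/123/markDone
```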
Another tension that we've noticed in API
communities is that not everybody is buying
into the RESTful approach. There are other
approaches out there. How many people have
dealt with, say, OpenSocial APIs? If you've
dealt with OpenSocial, you know that the OpenSocial
community decided on JSON RPC as the
standard approach to APIs. However, if you
look at what the open social JSON RPC API
is, a large portion of those are actually
the same RESTful commands that we operate
in the RESTful world and so what we've decided
to do is, you know, there's no reason that
we have to sit here and say, "We're only offering
the REST, we're only offering the JSON RPC."
We can actually offer these in parallel and
we can let them be good where each one is
best so APIs going forward, again, here's
your simple RESTful model but there's no
reason that these all can't be mirrored as
JSON RPC models. When you introduce a custom
verb, of course, they actually fit into this
framework fairly easily so they work quite
well. Now, you might ask why we are offering RPC.
REST, of course, as I said, benefits really well
because it's the way the Web works. You benefit
from caching, you benefit from the simplicity.
JSON RPC, well, what are its benefits? Probably
the number one benefit is going to be the
batching mechanisms and we'll actually support
batching that spans multiple services. You
can actually start creating more complex systems
through a common API infrastructure. How many
think this is going to be a really nice ability
to use what's best at the time you use it?
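As an illustration of the batching benefit, a JSON-RPC request body can carry several calls at once, even to different services. The method names and parameters below are invented for the sketch, not the real API surface:

```
[
  {"id": "1", "method": "buzz.activities.insert",
   "params": {"userId": "@me", "activity": {"object": {"content": "Hello"}}}},
  {"id": "2", "method": "calendar.events.list",
   "params": {"calendarId": "@me"}}
]
```

The server can execute these together and return an array of responses matched up by id, saving a round trip per call.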
So, and finally, I'd like to talk a little
bit more about the third issue that we've
noticed in building our APIs is our client
library strategy. One of the biggest problems
we've had in client libraries is keeping them
up to date with all the APIs. Google engineers
are very innovative. They keep introducing
new APIs and new features and the core problem
we've had is the way we architect our client
libraries, all of those libraries actually
have - what do we have? We have classes literally
for every XML element in the output streams
of all these libraries so anytime a service
makes a single change to their API, we have
to do another release and what we find is
that some of these libraries get a lot of
love. Java gets a lot of love because Java
is one of our top programming
languages at Google; .NET - some, some love but some
other languages; Python - well, we're not
sure how much love - you know, it depends
on how much love we get, how much time we
have and it becomes very difficult to keep
libraries on the cutting edge so we realized
that this was an architectural issue. How
we had designed our libraries and how we had
designed our client library strategy put us
in a bind. You weren't able to get libraries
that worked with our APIs or they lagged behind
because we had built them in a way that made
it difficult for us to keep up so we started
to rethink client libraries from the ground
up, and the very first thing we decided to
do is we're going to introduce the concept,
we're going to introduce the idea of discovery
into APIs at Google. So, this discovery, what
we're basically - every API now will support
a discovery document. It's just JSON but it's
simple to read and you use it to describe
these resources. You can describe the URLs,
parameters, whatever so now there is a way
for a library to be built that actually leverages
this information and this is built deep into
the infrastructure of how we build APIs. It
actually means that these - once you publish
an API at Google, the discovery document is
always up to date so I'll give you another
quick example. The discovery is just another
API. Buzz being one of the first APIs built
actually has a discovery document and here's
a - you'll notice here, this is a URL. This
little number here, v0.1, is just
to let you know that discovery is an experimental
API. We're not officially releasing it today
but we're innovating in the open, this is
Google IO and you are welcome to go look at
this API and see it and give us feedback on
it. So, here I'll show you a quick example.
This is the Buzz document and yes, it looks
terrible. Now it looks a lot better and you
can see it's pretty straightforward. From
the top, we have the name of the service,
its version. We have a URL and then we start
describing what resources exist in this API.
Here's a set of photos that exist in the API
and here is a URL template that you can use
to construct access to those photos. I've
already mentioned RPC and here's the RPC mechanism
and this is a set of methods that you'll see
here. The insert method is a method for adding
photos to a system. Now you've got both of
these methods and you can now see how to construct
a RESTful request and how to construct an
RPC request without ever having - so the library
is going to be built around this concept.
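Schematically, a discovery document looks something like this (field names simplified for the sketch; the experimental v0.1 format may differ):

```
{
  "name": "buzz",
  "version": "v1",
  "baseUrl": "https://www.googleapis.com/buzz/v1/",
  "resources": {
    "photos": {
      "methods": {
        "list":   {"httpMethod": "GET",  "pathUrl": "photos/{userId}/{albumId}"},
        "insert": {"httpMethod": "POST", "pathUrl": "photos/{userId}/{albumId}"}
      }
    }
  }
}
```

A client library can expand the URL templates and build both REST and RPC requests from this description without any service-specific code.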
So based on this concept, we are re-architecting
our client libraries. I call them generic
client libraries. I hate the name generic
[laughs] Does anybody have a better name?
Because I hate the name generic but what it really
means is client libraries that are built to
be usable with any API built at Google and
that's really the concept. Several features
that go into this generic concept - one is
the libraries themselves will be able to leverage
this discovery so you no longer have to start
scraping URLs out of documents. You can actually
use the names of resources that are a little
more intelligent to get things out of it.
Another concept: I told you the data model
classes were a big problem. We were modeling
XML but it's actually, there's actually a
lot of better ways to go about it and so for
JavaScript, you use JSON. For Java, we've
actually come up with a mechanism to map plain
old Java objects directly to JSON and we'll
show you that in a few minutes and you can
basically create the Java data model classes
yourselves in a few minutes. We're also rethinking
how we expose some of the advanced features,
making them much easier to use and finally
the client libraries have to work on all our
platforms. It's been a long time since we've
had a Java library for G data that worked
on Android and that's one of the things we're
gonna have. The thing that I most like, and
hopefully you'll like, is that once this library
is realized it will work with any API. So,
I've talked about this as a generic concept.
We've actually been building this. Again,
we're innovating it in the open and we have
a sample to show you and I'm going to invite
- I'm just a tech lead which means I have
to invite one of the software engineers up
to show actual code so I'm inviting Yaniv
Inbar to come up and show an example of the
client library [applause]
>>Yaniv: Thank you, Mark. So as Mark has been
talking about, we're all about innovating
in the open and today we've made available
a Java client library for
all Google APIs. It's technology we're still
working on, we're experimenting with, but
we wanted to give it out to developers like
you so you can try it out
and give us feedback. As Mark said, I'm going
to be demoing an Android application for the
recently announced Buzz API so let's take
a look at how it works.
So, the first thing I did when I set up
my Eclipse project here is I checked out the
project from the open source repository where
this sample is hosted.
The second thing I did is that I started the
emulator and you see this is a G1 device from
2008 and the point I'm making here is that
if you're a developer
that's making an Android application, you
want to reach the maximum number of users
and the best way to do that is to target the
1.5 SDK which represents
virtually all of the Android market. If you
are only targeting, say a device like the
Evo that many of you got today, you are only
going to get say less than
a third of the Android market so you have
to make that trade off between a better SDK
and greater reach for your application and
in this application the
first thing the application does is, using an
intent, it starts the web browser and it shows
an OAuth authentication page. The end user
then looks at the set of
permissions that they are granting the application
to do, they have to approve that and then
they have to grant our application access
to the Buzz API. When
they click "Grant Access", what happens is
our application has defined a
custom URI scheme called buzz-demo, and if the
wireless is working we'll redirect back to
the application. It looks like we're having
some technical difficulties here so I'll retry
that. Alright, let me
just show you the code. So, let me show you
a preview of how JSON data is modeled into our
plain old Java objects. The key here - [talking
in background] The
key here is that there's a content key and
that's mapped to a JSON string. In the JSON data
model and the Java data class, there's a field
called content and
we're using an @Key annotation to tell it,
"Okay, take that field and map it into the
JSON key." The type here is String, pretty
straightforward. If you're
used to JPA on web applications, this is a
very familiar concept where they are using
that for persistence. So let's take a look
at the Buzz activity class.
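The @Key annotation Yaniv is describing comes from the alpha Java client library; as a self-contained sketch of the underlying idea, here is a toy re-implementation (not the real library's code) of such an annotation plus a reflection-based JSON serializer:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;

// Toy version of the @Key idea: mark a field as mapped to a JSON key,
// optionally overriding the key name.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Key {
    String value() default "";
}

class BuzzObject {
    @Key String content;
}

class BuzzActivity {
    @Key String id;
    @Key("object") BuzzObject activityObject; // Java name differs from JSON key
}

public class KeyDemo {
    // Walk the annotated fields with reflection and emit a JSON object.
    static String toJson(Object obj) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        try {
            for (Field f : obj.getClass().getDeclaredFields()) {
                Key key = f.getAnnotation(Key.class);
                if (key == null) continue;       // only serialize @Key fields
                f.setAccessible(true);
                Object value = f.get(obj);
                if (value == null) continue;     // skip unset fields
                if (!first) sb.append(',');
                first = false;
                String name = key.value().isEmpty() ? f.getName() : key.value();
                sb.append('"').append(name).append("\":");
                if (value instanceof String) {
                    sb.append('"').append(value).append('"');
                } else {
                    sb.append(toJson(value));    // recurse into nested objects
                }
            }
        } catch (IllegalAccessException e) {
            throw new RuntimeException(e);
        }
        return sb.append('}').toString();
    }

    public static void main(String[] args) {
        BuzzActivity activity = new BuzzActivity();
        activity.id = "tag:example,2010:1";
        activity.activityObject = new BuzzObject();
        activity.activityObject.content = "Excited to be speaking at Google IO";
        System.out.println(toJson(activity));
    }
}
```

Only the fields you declare get serialized, which is the same trick the sample uses to keep the memory profile low on a G1-class device.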
I hope you can see that. Now, the Buzz activity
is just a container for the Buzz object. It
has an ID field which is represented by a
Java field and it has
a Buzz object field called object. Again,
I'm mapping from Java fields to JSON keys
and you'll notice that a Buzz activity actually
has a lot more fields as
Mark showed you earlier but we're only storing
the ones we care about. This is critical on
mobile devices where you really want to keep
a low memory profile
for your application. So, let's take a look
at what discovery looks like in a concrete
Java application. Here's an example of the
post method. Say I want to
make a new Buzz post. I define a method
called activities.insert and I give it a
set of parameters. In this case, the user
id is @me and that's really
all the library needs to know in order to
make an HTTP request. You execute the request
using the Buzz activity data class to serialize
into JSON and we provide a serialization for
XML and in the future we'll provide it for
other formats. Here's another method, the
delete method. Straightforward, they all look
the same. You define the name of the method
I'm running, activities.delete, and a
set of parameters so let's look at the Buzz
activity feed. Again, this is just a container
for Buzz activities and I'm using a list of
Buzz activity as the Java type of the field
and with the @Key annotation, I'm overriding
the field name and I'm using items as a JSON
key. There's also a list method here. I won't
go into any details, any more details and
finally I'll show you the Buzz Parameters
class. The user id, the scope, and the post
id - those were used in the discovery to construct
the URL path. The alt and prettyprint are
query parameters so you might be saying to
yourself, "Wait a second, this isn't JSON."
No, this is for representing a URL but I'm
using the same @Key annotation approach
so let's take a look at the - if I can get
the emulator working again - if this doesn't
work, I'll go back to Mark. Hopefully it's
directing back to the application using the
custom URI scheme that we defined. Alright,
well, it's not working. Oh, it's working.
Great [laughs] and if this works, yes, it
will show up over here and let's test it and
make sure that we're not just faking this
demo. Here's the profile page. Ah. Let's go
back to emulator and let's make another post
[applause] Yes. Ah, let's just delete that
one. Okay, so that's it. I encourage you to download the sample and play with it. Try it in your own application. Try it on Android, try it in a desktop application or a web application. I'll let Mark tell you where you can download it and where you can give us feedback [applause]
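The data-class idea Yaniv described can be sketched in a few lines of plain Java. This is an illustrative reconstruction, not the library's actual API: an @Key-style annotation maps Java fields to JSON keys (with an optional name override, as with the "object" field), and any fields you don't declare are simply never stored, which is what keeps the memory profile low on mobile.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Field;

// Illustrative sketch of @Key-style JSON mapping; names are hypothetical.
public class KeyMappingSketch {
    @Retention(RetentionPolicy.RUNTIME)
    @interface Key {
        String value() default "";  // optional override of the JSON key
    }

    static class BuzzObject {
        @Key String content;
    }

    static class BuzzActivity {
        @Key String id;
        @Key("object") BuzzObject obj;  // Java name differs from JSON key
    }

    // Serialize only the annotated, non-null fields to a JSON string;
    // undeclared fields in the source data would simply be dropped.
    static String toJson(Object o) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Field f : o.getClass().getDeclaredFields()) {
            Key key = f.getAnnotation(Key.class);
            if (key == null) continue;
            f.setAccessible(true);
            Object v;
            try {
                v = f.get(o);
            } catch (IllegalAccessException e) {
                throw new RuntimeException(e);
            }
            if (v == null) continue;
            if (!first) sb.append(",");
            first = false;
            String name = key.value().isEmpty() ? f.getName() : key.value();
            sb.append("\"").append(name).append("\":")
              .append(v instanceof String ? "\"" + v + "\"" : toJson(v));
        }
        return sb.append("}").toString();
    }

    public static void main(String[] args) {
        BuzzActivity a = new BuzzActivity();
        a.id = "post1";
        a.obj = new BuzzObject();
        a.obj.content = "Hello from the emulator";
        System.out.println(toJson(a));
    }
}
```

The real library layers serialization formats and HTTP request execution on top of this mapping, but the field-to-key annotation is the core of the approach.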
>>Mark: Thank you. It's good we have software engineers for these things. I'd just like to give a quick summary of what we think is better about this
approach. You've seen that there was some
leverage of the discovery which made the API
very easy. The amount of code in that sample,
even though we're using
the Buzz API, there was very little Buzz specific
code that you had to write and there was no
Buzz specific code included in the library
itself and finally,
of course, this Java client library runs on
Android as well as App Engine and desktops.
You are welcome to go check it out. This sample
has been archived for
your pleasure. You can go read it and see
it. The library itself is still in alpha.
We have a pre-release version of it available in our public repository.
Please go try it, give us feedback, and tell
us what it needs and what you think of it.
So, that was a quick summary of the things that we're doing on the future of APIs, and I've mentioned several of the things that we've tried to fix around partial data, formats, and so on. There are lots of changes we're trying to make in how we build APIs going forward.
So, now I'll try and get to what you've probably
really come for. How Google really builds
APIs. So, in
order to make all these changes, we've had
to change how we build APIs from the ground
up and here to show you some of that architecture
and some of those tools that we built, I'm
inviting up Joey Schorr who's another software
engineer and he's going to show you exactly
how Google builds APIs.
>>Joey Schorr: Thanks, Mark. I don't need
that actually [applause] So imagine that I'm
an engineer on a team and I have to build
an API. Traditionally
speaking, I would have to hard code my API
into my front end in whatever format was necessary, as we've seen, mostly Atom. I would then have
to manage all the
necessary common functionality such as authentication,
logging, and other production concerns. As
we've seen, this can be problematic and as
a result we've
built a new powerful API stack that allows
Google engineers to create an API in just
a few short steps. To begin, they start by
implementing their internal
service. To do so, they define a set of abstract
resources using protocol buffers, our internal
serialization and deserialization format,
which is also now
an open source project. Then, once the engineer
has defined his or her internal resources,
their next step is to define the set of collections
and those
operations or verbs that can be performed
on the collections by protocol buffer RPC.
Once the internal service has been launched
and other engineers and
Google Apps can use it, the next step is to
configure our new API stack. The configuration
is a simple JSON data file which maps the
REST paths, RPC methods,
and query parameters to the internal collections,
resources, and verbs that are necessary for
the API. The API stack also adds all the common
functionality
that is needed - authentication, caching,
logging - thereby removing the burden from
the Google engineer and putting it on the
stack itself. Finally, the
engineer will write the output templates and
these represent the bidirectional transformations
between the internal format, protocol buffers,
and the
external format, JSON, Atom, XML, etc. Now
I'm going to show right here during this demo
how we can implement a very simple API. In
this case, a task list
API. So to begin, I start by defining all
the resources I need in protocol buffer format.
To do so, I'll define a task message because
I want tasks in my
tasks list. I'll define the fields necessary
for my resource. In this case, the ID field.
Notice that I have to give the internal protocol
buffer identifier
for the field. I'll probably want a description
of my task and I might want to specify whether
my task has been completed. Now, once I've
defined all the
resources that I need in my API, in this case
just a simple task, my next step is to define
the collections necessary. In this case, I'll
define the tasks
collection. I'll specify that the resource
id is of type string because I used a string
here. I will then specify that the resource
itself is the task, of
course, which I just defined right here above
and then I need to list all the operations
or verbs that I want to make available as
part of my internal
service. To begin, I'm going to want the common REST ones: GET to get a task, LIST to list
all the tasks, and INSERT to add a new task.
I might also want a
custom verb. In this case, one to mark a task
as being completed, mark as done. It will
take in, excuse me, the task id of the tasks
that I want to mark as
done, and might want to return the actual
task itself. Now, once I've defined all my
resources and all my collections, my next
step is to run an internal
code generator which spits out an interface
which I can then implement in order to get
this working as an internal service. I've
already done so, so my next
step is to actually configure the API stack itself.
Now, as I mentioned earlier, our configuration
is just a simple JSON data file. However,
we wanted to make
it even easier for Google engineers to create
an API in just a few steps. To that end, we've
written a web based tool which I'm going to
show to you
externally for the very first time today,
which allows Google engineers to create an
API in under 10 minutes. To begin, I click
"Create New API". I give the
name of the API, in this case, task list.
I give a descriptive title, "My Task List
API". I then have to specify the address of
my internal service. In this
case, running on localhost port 2500. Once it has
found my service, my next step is to specify
the mappings of the internal verbs or operations
that I just
defined to those operations or methods that
will be exposed to the external world. To
begin, I hit "Add Method". I give the RPC
name of my method, "Task List
dot tasks dot list". I also give a REST path.
This is ensuring that it's accessible both
as REST and RPC so this will be "Tasks slash
list". I have to choose
the internal operation that will be called
and you can see the system has introspected on those operations I just defined, and
then I can add additional
methods. In this case, I'm going to add one to mark a task as done. I'll give it "Tasks slash the task id" and "done", and I will choose mark as done. Now, as
we saw, my custom operation required a parameter.
In this case, the task ID. Again, you can
see the system has introspected on the fields
I've defined and
given them to me here. Now, once I hit "Save",
my new API has been created. However, I'm
not quite done yet. In order to truly use
this API externally, I
have to define the template that maps the
internal resource representation to the external
world. So you can see here I have my new API in the list. I
choose "Templates" and I'm going to want to
map my list method to JSON so I choose JSON
and now you can see here the template that
represents the
bidirectional mapping between the method's internal resource representation and its external JSON representation. To begin, I'm going to
want to list the tasks
defined in the list, a loop over all the resources defined in the
entity and then for each of the tasks I'm
going to want to list a
JSON object that represents the task's information
itself. In this case, the ID, the description,
and whether it's been done. Now, once I hit
"Save", I now
have a fully functioning bidirectional JSON
API representing my simple task list and to
that end, I'll actually demonstrate it for
you here. Hold on one moment. I'll just do
it over here. To begin, I'm going to want
to list all the tasks defined in my service
and I'm probably going to want to prettyprint
it to see what it actually looks like. As
you can see, I have two very simple tasks
that I've prepopulated into my service. You
can see the tasks JSON representation here
that I've already defined and the ID and some
of the other fields. You can also, again,
call this via JSON RPC and I'm also again
going to prettyprint it and you can see here
it's the exact same representation with the
exception of the lack of the data envelope.
Now, I might want to mark a task as done so
I will do "Mark Done". I'll give it the task
id, in this case, "My Task", as defined right
here and of course I'll want to prettyprint
it again and you can see now the task has
been returned with the done field set to true
and if I go back and list the tasks, refresh,
you can see now the done field has been marked
to true so you've seen how we can implement
a very simple yet powerful API in under five
minutes during an IO presentation and this
demonstrates the true power of our stack [applause]
Zach?
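For reference, the protocol buffer definitions Joey typed during that demo would look roughly like this. This is a reconstruction from his description, not the exact internal schema; internal Google annotations and the generated-interface step are omitted.

```proto
// A task resource: just the three fields from the demo.
message Task {
  optional string id = 1;           // resource identifier
  optional string description = 2;  // human-readable description
  optional bool done = 3;           // whether the task is completed
}

// Minimal request/response messages, sketched for completeness.
message GetTaskRequest {
  optional string id = 1;
}

message ListTasksRequest {}

message ListTasksResponse {
  repeated Task tasks = 1;
}

message MarkAsDoneRequest {
  optional string task_id = 1;  // id of the task to mark as done
}

// The "tasks" collection: the common REST verbs plus one custom verb.
service Tasks {
  rpc Get(GetTaskRequest) returns (Task);
  rpc List(ListTasksRequest) returns (ListTasksResponse);
  rpc Insert(Task) returns (Task);
  rpc MarkAsDone(MarkAsDoneRequest) returns (Task);
}
```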
>>Zach: Alright, thanks Joey. So, as Joey
just said, three steps and Google engineers
can build an API so I just want to make sure
everybody realized what just happened because
it's so cool that every time I think about
it, uh, I don't know [laughs] So he took what
we had as an internal service, wrote a few
config files, connected it to the new API
stack, and then launched a new API and it
was done in literally five minutes as you
guys watched. So, alright, so that being said,
we are going to conclude our presentation
on that note. Questions and comments, check
out bit.ly/apiwave.
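One detail from Joey's demo worth spelling out: the REST and JSON-RPC responses carry the same payload and differ only in the data envelope. Roughly, with illustrative values:

```
REST response (alt=json): payload wrapped in a "data" envelope

  {"data": {"items": [{"id": "myTask", "description": "My Task", "done": true}]}}

JSON-RPC response: same payload, no "data" envelope

  {"items": [{"id": "myTask", "description": "My Task", "done": true}]}
```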
>> I'll pop over there now.
>>Zach: Alright, I'll invite Mark and Yaniv
back up so I can have them answer the hard
questions for me. Alright, and there's mics
throughout the audience if anybody has any
questions as well. Alright. So, want to know
how Google decides what features to offer
when building a new API? Okay, there's two
things that Google - well, the session's started.
Awesome. There's two things that Google wants
to do when we launch a new API. One is Google
really believes that any data that belongs
to you, belongs to you. Whether it's on our
system or on your system so when we build
APIs, we build them so they expose all the
data, that way if you want to leave Google,
we don't lock you into Google services and
you can get all your data out. Now, that said,
you know, exposing all of the data in a simple
API isn't really the best way. We saw a bunch
of examples of mobile clients. You don't want
to get all your data every time you want to
simply update something so when our engineers
build new APIs, they think about what the
use cases for the APIs are and then give you
methods and fields that will work the best for your API. You guys have anything to add? You've built more APIs than I have so [laughs]
>>Joey: I was just going to mention that we add APIs where we can add value. We've added a lot of APIs to services where an API makes new things possible. A lot of our Apps APIs, for example, are designed to make it possible for, say, enterprise customers to leverage the Apps suite, and basically teams are looking for ways that APIs add value. That's just an example of how we choose APIs. Each product, of course, comes with a slightly different story.
>>Zach: Alright. Well, that was all the moderator
questions so audience, do you have questions?
>> Yeah. How about a tasks API?
>>Zach: So, yes. There will be a tasks API
coming soon. I can say that so I think that
answers your question.
>> We have to take Joey. Joey can probably
do one in ten minutes.
>> So, with this discovery API, would it be directly possible to have just one client library that would, like, auto-generate libraries for each and every Google API that supports discovery?
>>Zach: Yeah. So, I mean, as you can see from
Yaniv's demo, there's very little Buzz specific
code in there and so with that client library,
it could be taken and used with any API that
uses discovery without any updates to the
client library itself which is the awesome
part of this. So any API that supports discovery
going forward will work with, actually the
client library that you need is already written.
It's not in its final form yet but it will
work. Does that answer your question? Cool.
>> I think it's worth mentioning though that
this is still technology we're still developing
so there's no guarantee that the way discovery
works right now in our demo is the final
form for it. We are all talking about innovation
in the opening and we wanted to give you a
chance to give us feedback so that's why we
released it.
>> And if you have feedback, come grab Yaniv
afterwards since he'll be building all this
stuff later [laughs]
>> Why don't we take a live question while
we wait?
>>Zach: Yeah. So, go ahead.
>> I'm a big fan of protocol buffers and I
noticed that you've got that sort of exposed
internally but you're only exposing text-based
external formats. Are there any plans to expose
-
>>Zach: For now, for now.
>> Expose a binary one?
>>Zach: So, the benefit of the new stack is
that we can adapt any format very quickly
so be on the lookout for your favorite format
coming soon.
>>Yaniv: And make your voice heard. If you
need a format, let us know.
>>Zach: Alright. So, some more questions here.
What new Google APIs are coming out in the
near future? Now, if I told you guys when new
APIs were coming out in the near future, that
would ruin the surprise when they actually
launched [laughs] but the second question;
when will Google release a tasks API? I kind
of said soon already so there might be a tasks
API coming out in the near future. Maybe.
I can't make any promises. Alright, the idea of a discovery document sounds similar to WSDL. Mark, what are the distinctions between ours and WSDL?
>>Mark: Well, we are much more similar to - I don't know if you are familiar with it - but there's an alternative to WSDL called WADL. Are you familiar with that? If you look at the discovery document format that we're using right now, it's actually almost a JSON-ified version of WADL. The difference between WSDL and WADL is mainly that WSDL is an RPC-focused document, and it also brings along with it the SOAP protocol. WADL, which is a proposed standard, is an XML document that describes RESTful APIs. Ours actually supports REST APIs and JSON-RPC, so it's a slightly different subset of what we're describing, and we will probably be more compatible with WADL. In fact, we may offer a WADL document at some point. It's one of the things that we're actually considering as a feature, so I hope that answers your question.
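To make the comparison concrete, a discovery document describes an API's methods, REST paths, and parameters as JSON, where WADL would express similar information as XML. A fragment might look something like this; this is illustrative, not the exact discovery format:

```
{
  "buzz": {
    "v1": {
      "methods": {
        "activities.delete": {
          "pathUrl": "activities/{userId}/{scope}/{postId}",
          "rpcName": "buzz.activities.delete",
          "httpMethod": "DELETE",
          "parameters": {
            "userId": {"parameterType": "path", "required": true},
            "scope":  {"parameterType": "path", "required": true},
            "postId": {"parameterType": "path", "required": true}
          }
        }
      }
    }
  }
}
```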
>>Zach: So on that same line, I just want
to reiterate what Mark said and reinforce
that with the fact that Google is really committed
to open standards.
You've seen that throughout all of the presentations
so far in the past two days. Obviously, we
are working with JSON RPC and Atom so as we
do discovery, as it progresses further, you
can expect us to make sure that it's as compatible
with as many different discovery formats as
possible. Alright, so - another audience question.
>> How do you deal with cross-cutting issues like throttling?
>>Zach: So throttling and that type of the
stuff is all taken care of by the API stack
so none of the teams that actually build an
API at Google - it's one of those common features
that we take care of. I don't - do you guys
want to talk about that?
>>Joey: Well, traditionally speaking, that was actually not the case, and one of the benefits of the new stack is that teams are no longer responsible, on an individual team basis, for implementing throttling and logging and other production concerns. That's now part of the new stack, and they get all those benefits merely by being part of the stack, as we showed.
>> So do you have like a central operations
team for the production side that deals with
those issues?
>>Joey: Yeah. They're sitting in the front
row, right here [laughs] and shrinking down into their seats as I say that. Believe me, we've been pretty busy the past few days.
>>Zach: What book would I have people read?
Wow. [laughs] So I can honestly say that I
haven't read any books in the past two months
because we've been getting ready for Google
IO. Do you guys have any books you'd like
to recommend to the audience?
>> The number one thing I read a while ago was just Roy Fielding's thesis, just to understand the concepts behind REST and then try to figure out where that really applies - you know, the concepts behind it are abstract and a little bit deep in terms of thinking about APIs. Other than that, I would actually suggest
you go read APIs. Go look at Twitter, go look
at New York Times, go look at Facebook, go
look at Flickr. This is where - these are
the people who are building the new RESTful
APIs that are out there. This is where you
learn more. The books aren't there yet, or
at least not that I found so that would be
my recommendation.
>> And of course, take a look at all the developer
guides for all the new APIs that have launched
in IO. They are very good resources.
>>Zach: And if you're looking for a fiction
book, "Day Watch" and "Night Watch" are awesome
books so you can check those out. Alright,
so another question from the audience.
>> Question about the @me. How do you decide that the API uses authentication or authorization? And the @me, how do you pass that in a message?
>>Zach: I'm sorry. Can you repeat the question?
>> How do you pass the @me in the message? Like when you are defining the message, how do you decide that it uses authentication?
>>Zach: So @me is just a shortcut for the currently authenticated user, so we are not exposing your users' personal email addresses in URIs that are being sent around the Web.
>> But do you define the message as using @me? Like when you're defining a message, how do you say that it has to use @me as a parameter, let's say?
>>Mark: Let me see if I can answer your question.
Typically, we are addressing a resource. There's
a URL template. One of those parameters that
are very common is the user ID and that user
ID can be an email address or some form of encoded ID, or the @me token, a special token saying insert the user ID that is provided by authentication. It's implicit in the URL
template. Does that answer your question?
>> And that comes in from the stack?
>>Mark: So when you are configuring the stack,
one of the fields you give in the tool, which
we didn't demo, is you can actually specify
that certain methods can only be called with
certain credentials and as a result those
credentials are passed on and in the case
of a user ID, will be filled in by the stack.
>> Okay. Got it.
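Mark's answer can be sketched in a few lines of Java. This is an illustrative reconstruction, not the actual stack code: the stack substitutes path parameters into the URL template, and when a parameter's value is the special @me token, it fills in the authenticated user's ID instead.

```java
import java.util.Map;

// Illustrative sketch of URL-template expansion with the @me token;
// class and method names here are hypothetical.
public class UrlTemplateSketch {
    static String expand(String template, Map<String, String> params,
                         String authenticatedUserId) {
        String result = template;
        for (Map.Entry<String, String> e : params.entrySet()) {
            // "@me" means "the currently authenticated user", so real
            // email addresses never need to appear in shared URLs.
            String value = "@me".equals(e.getValue())
                    ? authenticatedUserId
                    : e.getValue();
            result = result.replace("{" + e.getKey() + "}", value);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(expand("activities/{userId}/{scope}",
                Map.of("userId", "@me", "scope", "@self"),
                "1234567890"));
        // prints activities/1234567890/@self
    }
}
```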
>>Zach: So I think we have time for one more
question before our time is up, so we'll go live.
>> The question I've had is just from designing
different APIs. It's always, as you're building
them trying to document it and I saw you kind
of fill out the API and I didn't know if while
you were building it you had somebody also
in line describing kind of what happens or
- I mean, I've always had the problem like
now I have this API and then I have to go
back and document it and this whole process.
It'd be much better to merge the two, so I didn't know how you guys solved that.
>>Zach: So actually one of the things that we did not show about this new tool is that it auto-generates documentation for the API. If you go out and look at the Latitude API docs, there's a reference guide there, and that entire reference guide was auto-generated with no work by the engineers other than annotating the fields as they were making the API. Obviously, for something more like a developer's guide, where you have to understand the basic concepts, it would be really awesome to auto-generate that too, but the stack's not self-aware yet so it's not going to be able to [laughs]
>>Mark: Just to expand a little bit on that.
If you notice while I was writing the template,
it was a variation of JSON. We support JSDoc-like comments in there, and if you annotate fields with descriptive information, when
you then go to generate the reference guide
for example, it will automatically pull that
information from the template and also from
the configuration information as part of the
same tool I demoed.
>>Zach: Alright. So I think that's the end
of our session. We've run to the max extent
allowed so thank you everybody for coming.
We're glad to have a full room here. Grab
us afterwards if you have any questions [applause]