Highly Scalable APIs: API I/O Abstraction and API Chaining

Owen Rubel

A History of the API

In the early days of API development, APIs were quite simple. They were designed as an interface to a separation of concerns with two sets of functionality: standardized I/O and resource management. This was very convenient when communication was localized within the application: the I/O for an API call didn’t have to be shared with external components, there was limited redundancy across classes, and there was no requirement to send traffic outside the application.

However, as APIs were introduced to the web, the I/O functionality internal to the API gradually extended out into the architecture, and what was once functionality within the API’s separation of concerns soon needed to be shared across that architecture. This complicated how we built APIs for tooling such as proxies, API gateways, facades, and message queues, which now needed to share the request and response with the application instance.

The original attempts at resolving this issue simply moved all I/O functionality out to the architecture. But this created duplicate functionality across the request tooling, the application, and the response tooling. For example, when doing a forward in the application, one had to send the request back out to the external architecture to handle the security and I/O checks. This caused additional overhead: the original thread was abandoned and a new thread had to be created.

Eventually came the API Facade. This allowed us to synchronize data across external tooling but not the application. We still couldn’t build to scale in the application; everything needed to be externalized. And forwards and redirects still created the same issues.

One could duplicate all functionality in all places, but the technology is not the same everywhere, so libraries and services are not guaranteed to work everywhere and cannot be kept in sync.

This fundamental flaw in APIs stems from the fact that the original design of the API binds I/O functionality to business logic. In the case of web APIs, the ‘resource’ has always been data generated by the controllers as they handle the business logic in an MVC framework. And while communication (I/O) is generally handled by the front controller (or near it), when developing APIs we code the I/O checking into every single controller/method call. Generally this is handled through services or libraries, but it is still overly verbose in that it has to be coded into every method through annotations or a service call. It also doesn’t account for I/O concerns shared outside the instance, across the architecture.

To completely resolve this issue, we have to tackle it at the source, in the application, and separate the I/O from the API so that I/O functionality can be shared, I/O data can be shared, and the request can ‘loop back’ through the Interceptor.

API I/O Abstraction: Shared I/O

Nearly every framework and codebase has something known as an ‘Interceptor’: Rails calls them ‘filters’, Node.js has Interceptors in StrongLoop, and in Grails and Spring we have the HandlerInterceptor. An Interceptor is a common pattern that does precisely what it says: it intercepts the request prior to calling the endpoint and the response after the call. Why do we care about this?

Well, an Interceptor allows us to create a communication layer in a framework, and in the case of a web framework it also allows us to create a single-threaded loopback, so ‘forwards’ can simply be redirected back to the original request without having to go back out and come back in via the architecture.

By moving I/O from the API to an Interceptor, we can do our method checking, data checking, and security checks in the preInterceptor call and our formatting in the postInterceptor call. And by keeping all this code in one place and calling it up front, we reject early, reduce processing, reduce the amount of code needed, and make the I/O functionality easier to share. Node.js StrongLoop made use of this to tremendous effect early on but unfortunately only to a limited extent, as it neglects to have a comprehensive Interceptor for security and architectural sharing.
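To make this concrete, below is a minimal sketch of the pattern as a Spring HandlerInterceptor (mentioned above). The class name, the URIs, and the in-memory map standing in for the shared apiObject cache are illustrative assumptions, not the toolkit’s actual implementation:

import java.util.Map;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.springframework.web.servlet.ModelAndView;
import org.springframework.web.servlet.handler.HandlerInterceptorAdapter;

public class ApiIoInterceptor extends HandlerInterceptorAdapter {

    // Stand-in for the cached apiObject definitions described later;
    // these URIs and methods are invented for illustration.
    private static final Map<String, String> EXPECTED_METHOD = Map.of(
            "/test/show", "GET",
            "/test/update", "PUT");

    @Override
    public boolean preHandle(HttpServletRequest request,
                             HttpServletResponse response,
                             Object handler) throws Exception {
        String expected = EXPECTED_METHOD.get(request.getRequestURI());
        // Method/security/data checks happen once, up front; on failure we
        // reject early and the controller (business logic) never runs.
        if (expected == null || !expected.equals(request.getMethod())) {
            response.sendError(HttpServletResponse.SC_METHOD_NOT_ALLOWED);
            return false;
        }
        return true;
    }

    @Override
    public void postHandle(HttpServletRequest request,
                           HttpServletResponse response,
                           Object handler,
                           ModelAndView modelAndView) {
        // Formatting of the outgoing data happens here; in a chain, this is
        // also where the result can loop back to the next preHandle call.
    }
}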

To be able to share the I/O effectively with the architecture, we need something similar to the Facade but applied only to the I/O data on the backend: a JSON object, loadable by all architectural tooling, that holds the data associated with each and every API call. This is what I call the ‘apiObject’.

The API Object: Sharable API Data Object

As a request comes in, it goes through the request tooling in the architecture (i.e. proxy/API gateway) and then into your application; as the response goes out, it goes through the response tooling (i.e. message queue/API gateway, or back to the proxy again). As a result, the checks, security, and processing for handling the request/response related to your APIs all need to be shared across architectural components, making your APIs architectural cross-cutting concerns.

As a result of ‘API abstraction’, in cases of batch processing, forwards, and API chaining, we can now pass this data back from the postInterceptor to the preInterceptor so that everything stays within the same thread and no new processes need to be spawned.

But how do we identify those API calls so that checking, handling, and processing are done the same way across the request tooling, the response tooling, and all instances of the application? We do this with what I call an ‘apiObject’.

The apiObject is a JSON object that serves as a cached definition of your API calls. Creating these gives you reusable definitions which:

* relate to each other

* are reloadable

* maintain all role associations

* maintain a list of expected incoming and outgoing data

* have versioning

* and more

Below are the fields of a typical apiObject for a controller, showing one API URI; a sketch of the full object follows the field list:

1. VALUES: Represents a map of all values used by all APIs for reference within this controller/service; description and mock data are included for use with OPTIONS (which in turn is used for generating API docs based on ROLE)

2. CURRENTSTABLE: Represents the current stable version; if none is set, defaults to 1

3. VERSION: Represents the versions available; multiple versions can be loaded for different ROLES/environments

   1. DEFAULTACTION: Represents the default URI to be called if none is given; not required

   2. DEPRECATED: Represents a deprecation date for older versions and a message to be delivered if called after said date; not required

   3. URI: Represents a map of URI data

      1. METHOD: Represents the expected REST method for the given URI

      2. DESCRIPTION: Represents the description for the URI; mainly used for OPTIONS

      3. ROLES: Represents a list of ROLES that can request the given URI

      4. BATCH: Represents the ROLES that can make batch requests to this URI

      5. REQUEST: Represents a map of data related to the request; each entry in the map is given with a ROLE as key and the list of expected data as value. If a user matches multiple ROLES, the expected data is concatenated

      6. RESPONSE: Represents a map of data related to the response; each entry in the map is given with a ROLE as key and the list of returned data as value. If a user matches multiple ROLES, the returned data is concatenated
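Putting those fields together, here is a sketch of what such an apiObject might look like; the controller, values, and roles are invented for illustration (see the project wiki linked at the end for the actual format):

{
  'VALUES': {
    'id': { 'description': 'Id of the test object', 'mockData': '1' },
    'name': { 'description': 'Name of the test object', 'mockData': 'fred' }
  },
  'CURRENTSTABLE': '1',
  'VERSION': {
    '1': {
      'DEFAULTACTION': 'show',
      'DEPRECATED': null,
      'URI': {
        'show': {
          'METHOD': 'GET',
          'DESCRIPTION': 'Returns the test object',
          'ROLES': ['ROLE_USER', 'ROLE_ADMIN'],
          'BATCH': ['ROLE_ADMIN'],
          'REQUEST': { 'ROLE_USER': ['id'] },
          'RESPONSE': { 'ROLE_USER': ['id', 'name'] }
        }
      }
    }
  }
}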

So with the above apiObject, we can map all URIs and values for all controllers/methods in the application, load them at runtime into a shared cache, and then read them into local variables in each tool/instance. And with each reload, we can push out new copies to all the tools so they all have the same data.
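A rough sketch of that load/reload step, assuming Jackson for parsing and a plain map as the local cache (the file source is an assumption; in practice any shared store could push the definitions instead):

import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public class ApiObjectCache {

    // Local copy of the shared apiObject definitions; every tool/instance
    // holds the same view, refreshed on each reload.
    private static final Map<String, JsonNode> DEFINITIONS = new ConcurrentHashMap<>();

    // Re-read the shared definitions (here from a JSON file on disk; in
    // practice they could be pushed from a central cache on reload).
    public static void reload(String path) throws Exception {
        JsonNode root = new ObjectMapper().readTree(Files.readAllBytes(Paths.get(path)));
        root.fields().forEachRemaining(e -> DEFINITIONS.put(e.getKey(), e.getValue()));
    }

    public static JsonNode get(String controller) {
        return DEFINITIONS.get(controller);
    }
}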

Thus the proxy can now check security the same way the application checks it and the same way the message queue checks it. All data is synced via a common, reloadable object across the architecture.

We can also at this point do ‘logic sharing’. Previously in API architectures, you could shard based upon data, but when it came to logic or functionality, it was tied to the instance (or you had to build a separate instance). Now it is as simple as toggling an instance on or off. One instance can be your batch API server while another can be your regular API server, and if one goes down, all you have to do is toggle the config on a spare instance and reload; suddenly it’s a batch server.
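As a sketch, such a toggle could be as simple as a flag in the instance config; the key names here are hypothetical:

{
  'instanceType': 'batch',
  'apiObjectSource': 'shared-cache',
  'reloadOnToggle': true
}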

API Chaining : Multiple Requests in a Single Thread

One of the greatest benefits of an Interceptor in conjunction with an apiObject is the ability to do API chaining.

API Chaining is an I/O monad in which multiple API calls can be passed using a single request. A monad works by having each call pass its result to the next call in the chain. API chaining builds on this by using the apiObject to allow the client to read relationships, build relatable APIs, and send state for each URI in the chain in one call.

When the API chain is sent, each call is processed and checked, and the data required for the next call in the chain is passed back through the postInterceptor to the preInterceptor until the end of the chain is reached.

As a request can only have one declared method, the API chain is composed entirely of GET API calls with ONE unsafe method call (PUT, POST, DELETE). This unsafe method is considered the ‘master’ method, telling the chain either where to start or where to stop based on what type of chain is declared. For example:

* ‘blank chain’ - GET > GET > GET > GET

* ‘prechain’ - POST > GET > GET

* ‘postchain’ - GET > GET > GET > PUT

A lot of people ask, ‘Why can’t we have the unsafe method called in the middle?’ This is because it would enable a double call to methods, or two unsafe methods in one request. Thus we have the unsafe method as the toggle for the start/end of the chain.

A typical chain call looks something like this:

{
  'testdata': 'testamundo',
  'chain': {
    'combine': 'false',
    'type': 'postchain',
    'order': {
      'test/fred': 'id',
      'test/update': 'return'
    }
  }
}

and is called like so:

curl --verbose --request POST --header "Content-Type: application/json" -d "{'testdata':'testamundo','chain':{'combine':'false','type':'postchain','order':{'test/fred':'id','test/update':'return'}}}" "http://localhost:8080/api_v0.1/test/show/1" --cookie cookies.txt

The API chain is broken out into several pieces:

* data - the first set of fields sent

* chain - the API chain object; all data related to the chain is found here

* combine - a toggle telling us whether to concatenate the results from each call; useful in ‘blank chains’

* type - the type of chain (blankchain, prechain, postchain)

* order - the order in which the URIs are called. URIs are given as key:val, with the ‘val’ being the value sent to the next link in the chain.

You will also notice the final value in the chain is ‘return’. This tells the chain to stop and return; it can be used at any point in the chain and is useful for chain branching. Chain branching is a way to stop your chain and branch another chain from it, as in the hypothetical example below.
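Here the ‘return’ on a middle link stops the first chain early so the client can branch a second chain from the returned data; the URIs are invented for illustration:

{
  'chain': {
    'combine': 'false',
    'type': 'blankchain',
    'order': {
      'test/show': 'id',
      'test/related': 'return',
      'test/detail': 'id'
    }
  }
}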

Conclusion

The advantages of API I/O abstraction and API chaining can be tremendous, turning every API call into a microservice and reducing your processing, your code, and your time to deploy. API chaining alone can speed up your mobile calls by at least 50-75% and gives clients a plethora of opportunities.

Please see additional documentation and guidelines on API abstraction at the following links:

* http://www.slideshare.net/bobdobbes/api-abstraction-api-chaining

* https://github.com/orubel/grails-api-toolkit-docs/wiki/API-Chaining


