Hotwire Tech Blog

Scribes from Hotwire Engineering

Problem:

In a service-oriented architecture, there is often a need to cache data. The Hotwire Search Team built a data store that caches static information about hotels, such as star rating, neighborhood data, and amenities, backed by the in-memory database Redis (ElastiCache). This cache is exposed through an API called Hotel Static Data Service (HSDS). HSDS was sending very large responses to clients that needed only a small subset of that information, which added latency to critical services that call HSDS within the search flow.

Current Architecture:

A number of services are called between the time a user enters a search string and the time the results are rendered. Each service uses some hotel information to build the response it sends back to its caller. The diagram below shows the clients of HSDS that are called on the search request’s path.

Current Architecture

 

Latency is critical for these services. We do not want our customers to experience delays in seeing inventory and engaging with our product. Hotel information is stored in Redis as key-value pairs; each value has over 40 fields, and that number is likely to grow. Depending on the number of hotels requested in a single call, an HSDS response can be very large. Clients often need just a small subset of this static information to execute their business logic. Because different services have different data requirements, HSDS stores the superset of the static hotel data. Spending CPU and memory to parse data that a service does not even need is inefficient.

Solution: 

One optimization opportunity was to trim the response returned by HSDS so that it includes only the fields requested by the caller. HSDS is built in Node.js, and npm offers out-of-the-box modules for many common problems. One such module is express-partial-response, which uses JSON Mask[1] underneath. JSON Mask is an engine that selects specific parts of a JSON object and masks the rest, while keeping the service’s response schema the same. Clients just send a “fields=” query parameter in the URL, naming the fields they want from HSDS. Field names follow an XPath-like syntax that lets clients request nested fields and fields of JSON objects within a collection. If the “fields=” query parameter is not specified, HSDS returns the full response. This gives clients the flexibility to choose the content they want. Below is an example to illustrate this feature:

SampleRequestJSON

If a client is only interested in hotelId and amenity codes of all hotels, the URL query parameter to request those fields will look like this:

SampleURL

This generates a response that has the same schema but contains only the requested fields, which reduces the response size significantly. This leads to lower network transfer time and lower parse time on the client’s side. For the above example, the response looks like this:

SampleResponseJSON
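
The selection behaviour described above can be sketched without any dependencies. The snippet below is a simplified stand-in for what JSON Mask does, not its actual implementation: it handles only comma-separated names, `/` for a nested field, and `(...)` for several sub-fields. The function names (`parseFields`, `applyMask`, `partialResponse`), the sample hotel payload, and the `fields` expression are all assumptions made for illustration; the real HSDS schema may differ.

```javascript
// Parse a fields expression such as "hotels(hotelId,amenities/code)" into a tree:
//   { hotels: { hotelId: null, amenities: { code: null } } }
// A null entry marks a leaf, meaning "keep this value as-is".
function parseFields(expr) {
  let pos = 0;

  function parseList() {
    const tree = {};
    while (pos < expr.length && expr[pos] !== ')') {
      parseItem(tree);
      if (expr[pos] === ',') pos++;
    }
    return tree;
  }

  function parseItem(tree) {
    let name = '';
    while (pos < expr.length && !',/()'.includes(expr[pos])) name += expr[pos++];
    if (expr[pos] === '/') {
      pos++; // "a/b": descend into a single nested field
      const child = {};
      parseItem(child);
      tree[name] = child;
    } else if (expr[pos] === '(') {
      pos++; // "a(b,c)": descend into several nested fields
      tree[name] = parseList();
      pos++; // skip the closing ")"
    } else {
      tree[name] = null;
    }
  }

  return parseList();
}

// Copy only the parts of `value` named in `tree`; arrays are masked element by element.
function applyMask(value, tree) {
  if (tree === null) return value;
  if (Array.isArray(value)) return value.map((item) => applyMask(item, tree));
  if (value === null || typeof value !== 'object') return undefined;
  const out = {};
  for (const key of Object.keys(tree)) {
    if (key in value) {
      const picked = applyMask(value[key], tree[key]);
      if (picked !== undefined) out[key] = picked;
    }
  }
  return out;
}

// Return the full body when no fields expression is given, otherwise the masked body.
function partialResponse(body, fields) {
  if (!fields) return body;
  return applyMask(body, parseFields(fields));
}

// Hypothetical payload; the real HSDS response has many more fields per hotel.
const sampleBody = {
  hotels: [
    { hotelId: 101, starRating: 4, amenities: [{ code: 'WIFI', name: 'Free WiFi' }, { code: 'POOL', name: 'Pool' }] },
    { hotelId: 102, starRating: 3, amenities: [{ code: 'GYM', name: 'Fitness center' }] },
  ],
};

const masked = partialResponse(sampleBody, 'hotels(hotelId,amenities/code)');
console.log(JSON.stringify(masked, null, 2));
```

The masked result keeps the `hotels` array and the `amenities` nesting, so a client can reuse the parser it already has for the full response. In HSDS itself this work is delegated to the express-partial-response module rather than written by hand.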

This solution is largely inspired by Google APIs[2] and LinkedIn APIs[3]. The Search Team adopted it because it works for our use case and is an inexpensive solution that provides many advantages.

Advantages:

  1. Because HSDS is a Node.js implementation, adding this powerful feature is clean and free of convoluted logic: it is handled by an out-of-the-box module available through npm. Very few lines of code and little testing effort were needed to add the feature to the API.
  2. Since this feature doesn’t change the response schema when a partial response is requested, clients do not have to implement a separate parser for the partial response.
  3. Since “fields” is an optional query parameter, clients of HSDS can opt in or out of the partial response feature without requiring any changes or releases on the service side. This has attracted many internal teams at Hotwire to use this API as their source of static hotel data.
  4. Finally, the response size shrinks significantly, especially for clients requesting a very small subset of fields. After configuring the search APIs to use HSDS partial responses, we observed an over 40% reduction in their server-side latencies.
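
To get a rough feel for the payload reduction behind the last point, the serialized size of a full and a trimmed response can be compared. This is only a sketch with generated, hypothetical data: the helper names (`makeHotel`, `pickFields`, `jsonBytes`), the field names, and the hotel count are assumptions, and the byte counts it prints say nothing about the latency figures reported above.

```javascript
// Build a hypothetical hotel record with `fieldCount` fields (names are made up).
function makeHotel(id, fieldCount) {
  const hotel = { hotelId: id };
  for (let i = 1; i < fieldCount; i++) {
    hotel[`field${i}`] = `value-${i}`;
  }
  return hotel;
}

// Keep only the listed top-level fields of a record.
function pickFields(record, names) {
  const out = {};
  for (const name of names) {
    if (name in record) out[name] = record[name];
  }
  return out;
}

// Size in bytes of the JSON-serialized body, before any compression.
function jsonBytes(body) {
  return Buffer.byteLength(JSON.stringify(body), 'utf8');
}

// 200 hotels with 40 fields each, versus the same hotels trimmed to 2 fields.
const generatedHotels = Array.from({ length: 200 }, (_, i) => makeHotel(i + 1, 40));
const fullBody = { hotels: generatedHotels };
const trimmedBody = { hotels: generatedHotels.map((h) => pickFields(h, ['hotelId', 'field1'])) };

const fullBytes = jsonBytes(fullBody);
const trimmedBytes = jsonBytes(trimmedBody);
console.log(`full: ${fullBytes} bytes, trimmed: ${trimmedBytes} bytes`);
```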

Future Enhancements:

  1. Currently, the only way to filter out unnecessary data is to send the names of the wanted fields in the URL’s query string. The feature could be extended to accept a “black list” of field names that specifies unwanted data, which is useful when a large subset of the data is needed. This would allow shorter and prettier URLs for the service.
  2. There is no support for range filters, where specifying a range condition returns all the data within that range (for example: get all hotels that have a star rating from 3 to 5). HSDS could be extended to build and send partial responses based on such complex filters.
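
Neither enhancement exists in HSDS yet, so the following is only one possible shape for them, sketched on hypothetical data. The helper names (`omitFields`, `filterByRange`), the field names, and the idea of applying both to an in-memory list are assumptions for illustration, not a description of a planned design.

```javascript
// Return a copy of `record` without the top-level fields named in `excluded`.
function omitFields(record, excluded) {
  const skip = new Set(excluded);
  const out = {};
  for (const [key, value] of Object.entries(record)) {
    if (!skip.has(key)) out[key] = value;
  }
  return out;
}

// Keep only records whose numeric `field` lies within [min, max], inclusive.
function filterByRange(records, field, min, max) {
  return records.filter(
    (r) => typeof r[field] === 'number' && r[field] >= min && r[field] <= max
  );
}

// Hypothetical records; real HSDS values carry over 40 fields each.
const candidateHotels = [
  { hotelId: 1, starRating: 2.5, neighborhood: 'Downtown', description: 'long text' },
  { hotelId: 2, starRating: 3, neighborhood: 'Airport', description: 'long text' },
  { hotelId: 3, starRating: 4.5, neighborhood: 'Marina', description: 'long text' },
];

// "All hotels with a star rating from 3 to 5, without the description field."
const filteredHotels = filterByRange(candidateHotels, 'starRating', 3, 5)
  .map((hotel) => omitFields(hotel, ['description']));
console.log(JSON.stringify(filteredHotels));
```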

[1] https://github.com/nemtsov/json-mask

[2] http://googlecode.blogspot.com/2010/03/making-apis-faster-introducing-partial.html

[3] http://yaoganglian.com/2013/07/01/partial-response/