Hotwire Tech Blog

Scribes from Hotwire Engineering

Overview

We will look at running jobs on a Spark cluster and tuning the configuration settings for a simple example to achieve significantly lower runtimes. We will also touch on the trade-off between the number of tasks per executor and the number of executors per node for a given cluster node configuration.
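To make the trade-off concrete, here is a minimal sketch that enumerates possible executor layouts for a single worker node. It is not taken from the post: the function name `executor_layouts`, the node size, and the one core and 1 GB held back for the operating system are all assumptions chosen for illustration. The resulting numbers correspond roughly to what one would pass as `spark.executor.cores` and `spark.executor.memory`.

```python
def executor_layouts(node_cores, node_memory_gb,
                     reserved_cores=1, reserved_memory_gb=1):
    """Enumerate (executors_per_node, cores_per_executor, memory_gb_per_executor).

    reserved_cores and reserved_memory_gb are hypothetical amounts held back
    for the OS and cluster daemons; real values depend on the deployment.
    """
    usable_cores = node_cores - reserved_cores
    usable_memory_gb = node_memory_gb - reserved_memory_gb
    layouts = []
    for cores_per_executor in range(1, usable_cores + 1):
        # More cores (concurrent tasks) per executor means fewer executors fit.
        executors = usable_cores // cores_per_executor
        # The node's usable memory is split evenly among its executors.
        memory_per_executor = usable_memory_gb / executors
        layouts.append((executors, cores_per_executor, memory_per_executor))
    return layouts


# Hypothetical node: 16 cores, 64 GB of RAM.
for executors, cores, memory_gb in executor_layouts(16, 64):
    print(f"{executors:2d} executors x {cores:2d} cores, "
          f"{memory_gb:5.1f} GB each")
```

Many small executors give each task little memory, while one large executor per node concentrates all tasks in a single JVM. The listing only shows the options; which one performs best depends on the job.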

Spark Cluster Abstraction

This diagram (source) shows the key components of the cluster. The Spark driver is the program you write. The driver program requests executors from the Master (the Worker nodes let the Master know how many resources they each have). The Master allocates the resources…


Introduction

At Hotwire we build machine learning models for various purposes, such as powering our hotel sort or driving our hotel pricing strategy. We spend our time replacing legacy, human-curated sets of business rules, which are difficult to scale and to improve incrementally and systematically, with data-driven machine learning approaches. However, machine learning models are not without issues, as they bring their own unique set of challenges. One of the main difficulties we deal with on a daily basis is understanding what changes to expect from a new “black-box” machine learning model. It is quite easy to see that weights in neural network…


For companies interested in measuring customer satisfaction, the Net Promoter Score (NPS) is a widely adopted method and standard. So what is NPS? NPS can be calculated based on the average response of a set of customers to a specific question: “How likely would you recommend our company to your family or friends?” While several variants of the rating scale exist, the most commonly adopted version uses 0 for least likely to recommend and 10 for most likely to recommend. Naturally, companies are most interested in customers who give high ratings (e.g. 9 or 10), who are referred to as promoters, and customers who…
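As a point of reference, here is a minimal sketch of the commonly used NPS formula: the percentage of promoters minus the percentage of detractors. The excerpt above only states the promoter range (9 or 10). The detractor range of 0 to 6 is the usual convention and is an assumption here, as are the function name `net_promoter_score` and the sample ratings.

```python
def net_promoter_score(ratings):
    """Return NPS on a -100..100 scale from a list of 0-10 ratings.

    Promoters rate 9 or 10. Detractors are assumed to rate 0 through 6
    (the usual convention, not stated in the excerpt). Ratings of 7 or 8
    count toward the total but toward neither group.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100.0 * (promoters - detractors) / len(ratings)


# Made-up survey responses, for illustration only.
sample = [10, 9, 8, 7, 6, 0, 10, 9, 3, 10]
print(net_promoter_score(sample))
```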
