Data science is at the core of Amy Ingram, our AI personal assistant, and Data Scientists make up a big part of our team. To animate Amy (and her twin brother Andrew), we are teaching her to understand scheduling-related emails. Teaching machines to parse natural language is a daunting challenge on its own (and represents an entire field, Natural Language Processing or NLP). But that’s not the end of it, because humans are not very precise communicators, even at our best. As a result, teaching Amy to successfully negotiate meeting times and locations for our beta customers has required collecting large amounts of data and applying some of the most advanced Machine Learning in the academy, stretching it beyond its usual boundaries.
Our Data Science team has been fairly quiet about their work up until now. I’ve convinced Chief Data Scientist, Marcos Jimenez Belenguer, to answer a few questions to shed some light on Amy’s inner workings.
Q: Data science is a baggy term; some people consider it a buzzword. How do you define it? What makes you a data scientist?
Data science is such a novel discipline that it still lacks a clear definition. Roughly speaking, it’s an extension of statistics into computer science. Software engineers tend to model problems in terms of rigid rules, whereas data scientists solve them through probability distributions and confidence scores. For example, we often have to answer the question: “What is the best decision a computer can make based on past decisions made by humans in similar (but not identical) situations?” To answer this question, a data scientist needs to perform operations on (ideally large) sets of examples and have a machine abstract patterns from them that correlate to human decisions. By recognizing those patterns in future situations, the machine is then able to make human-like decisions. This process is called machine learning, and it’s a big aspect of what data science is today, and what data scientists do.
Without going into too much detail, there is definitely hard labor involved in data science. An (often overlooked) aspect of our job is to prepare the data for analysis. We are typically confronted with ill-formatted, wrongly labeled or partially corrupt data which needs to be ‘cleaned up’ for analysis. By careful study, we then identify and derive those features that will enable the machine to abstract and learn patterns. In a sense, we are delineating what sort of things the machine is able to ‘see’ in the data, from which the machine then forms abstract, internal representations and patterns, a process that’s somewhat similar to how neural structures in our brain form from sensory experience.
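To make the idea of deriving features and learning from past human decisions concrete, here is a minimal sketch in Python. It is not x.ai's pipeline: the feature names, the toy emails, the labels and the choice of a naive Bayes model are all assumptions made for illustration. The point is only that the features decide what the machine can 'see', and that the output is a confidence score rather than a rigid rule.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical vocabularies, chosen only for this illustration.
DAY_WORDS = {"monday", "tuesday", "wednesday", "thursday", "friday", "tomorrow"}
CANCEL_WORDS = {"cancel", "reschedule"}


def extract_features(text):
    """Hand-derived features: the only things the model can 'see' in an email."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return {
        "mentions_day": bool(words & DAY_WORDS),
        "mentions_cancel": bool(words & CANCEL_WORDS),
        "has_question": "?" in text,
    }


def train(examples):
    """Count how often each feature value co-occurs with each human-assigned label."""
    label_counts = Counter()
    feature_counts = defaultdict(Counter)  # (label, feature name) -> value counts
    for text, label in examples:
        label_counts[label] += 1
        for name, value in extract_features(text).items():
            feature_counts[(label, name)][value] += 1
    return label_counts, feature_counts


def predict(model, text):
    """Return a confidence score per label (naive Bayes with Laplace smoothing)."""
    label_counts, feature_counts = model
    total = sum(label_counts.values())
    log_scores = {}
    for label, n in label_counts.items():
        log_p = math.log(n / total)
        for name, value in extract_features(text).items():
            # Each feature is binary, so smoothing adds 1 to the count and 2 to the total.
            log_p += math.log((feature_counts[(label, name)][value] + 1) / (n + 2))
        log_scores[label] = log_p
    norm = sum(math.exp(s) for s in log_scores.values())
    return {label: math.exp(s) / norm for label, s in log_scores.items()}


# Toy 'past decisions made by humans' (invented data).
EXAMPLES = [
    ("Can we meet on Monday?", "propose"),
    ("Are you free tomorrow?", "propose"),
    ("How about Friday afternoon?", "propose"),
    ("Please cancel our meeting", "cancel"),
    ("I need to cancel Friday", "cancel"),
    ("Sorry, have to cancel", "cancel"),
]

model = train(EXAMPLES)
confidences = predict(model, "Can we meet Tuesday?")
print(max(confidences, key=confidences.get), confidences)
```

The new email is similar to, but not identical with, any training example, yet the model still assigns it a label with a graded confidence.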
There are many popular algorithms to model the relationship between features in data and correct decisions, each with different strengths and weaknesses. Typically, data scientists first make some educated guess as to what algorithms might work best based on the nature of the problem and the dataset, and then comes an exploratory process of searching, optimizing and fine-tuning. In many modern environments, such as ours at x.ai, all of these steps need to be as automated, versatile and scalable as possible. Software engineers or ‘data engineers’ come into the picture here, working with us to “harden” data analysis pipelines, build platforms that bridge experimental and product environments, and help streamline and scale the whole process.
Perhaps all this appears too technical. In fact there’s a lot of creative thought and innovation involved which I have not emphasized. We are often challenged with unsolved and abstract problems for which there is no blue-book to follow. There’s a sense of beauty in the mathematical models we use as well as a sense of great personal reward in finding workable solutions to these problems. At tech start-ups like x.ai, there’s also a general sense of excitement and wonder to our job. We realize that we’re pushing the ability of machines to express human-like behavior in new domains. And what’s more human than natural language? What’s more fascinating than teaching a machine to understand human language and respond to it in a way that makes sense?
Q: How do you see the current state of machine learning? Are advances in the academy being applied yet? Has the commercial world caught up to the research?
Without pretending to be a historian on the topic, it seems that we’re living through an explosive age of machine learning, both in terms of theoretical progress and in the speed and range of applications. This is undoubtedly fueled by three factors: the omnipresence of data gathering and data sharing devices and networks (smart phones, internet, etc.), the ever-increasing computational power of commercial computers, and the multi-billion dollar industry that supports data analysis, such as real-time adaptive advertising.
Let’s take ‘deep learning’ as an example. Deep learning refers to a family of neural-network-based algorithms which, very roughly speaking, mimic actual neural networks in human or animal brains and support many (or deep) layers of ‘neurons’ connected in various ways. Academic researchers have made many theoretical breakthroughs on this front, in some instances decades ago; however, the then-intractable computational complexity hampered their large-scale application. Not anymore. Nowadays any computer-savvy college kid can ‘train’ a deep neural network on the GPU of her laptop within minutes, using free, out-of-the-box libraries available online.
This immediate feasibility and applicability of cutting-edge research, provided by cheaper computational power and reliable, open source code, has created a direct link between pure research and industry applications. Cutting-edge research from both industry and academia often comes with publicly available open-source software implementations, which the machine learning community quickly tries on an immense number of different problems. This makes some algorithms hugely popular. In general, the trend is towards distributing knowledge, rather than confining it. Cutting-edge research is no longer conducted solely by academia. A lot of it is happening at tech giants, such as Google, Facebook and Alibaba, as well as at valiant tech start-ups, like x.ai.
Q: What are the core ML techniques you’ve used to build Amy and Andrew? Where have you innovated?
Amy is in essence an automated dialog model, in which a machine is conversing with a human through email, and, at each turn, the machine needs to be able to detect the people, places and times the human is talking about and understand the semantic context in which that information appears. The information extraction aspect alone is a very active line of research in industry and academia. It consists of automatically detecting expressions in text related to meeting constraints and resolving them within the context of the meeting.
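As a rough illustration of 'detecting expressions and resolving them within context', the Python sketch below finds a few simple time expressions in an email and resolves them against the date the email was sent. It is a hand-written, rule-based toy, not a description of how Amy works: the function name `resolve_time_expressions`, the small vocabulary, and the rule that a weekday name means its first occurrence strictly after the sent date are all assumptions.

```python
import re
from datetime import date, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]

# Detection step: a deliberately tiny vocabulary of time expressions.
TIME_PATTERN = re.compile(
    r"\b(today|tomorrow|" + "|".join(WEEKDAYS) + r")\b", re.IGNORECASE
)


def resolve_time_expressions(text, sent_on):
    """Return (matched expression, resolved date) pairs, in order of appearance."""
    resolved = []
    for match in TIME_PATTERN.finditer(text):
        expression = match.group(1).lower()
        if expression == "today":
            days_ahead = 0
        elif expression == "tomorrow":
            days_ahead = 1
        else:
            # Assumption: a bare weekday refers to its next occurrence
            # strictly after the sent date (1 to 7 days ahead).
            target = WEEKDAYS.index(expression)
            days_ahead = (target - sent_on.weekday() - 1) % 7 + 1
        resolved.append((match.group(1), sent_on + timedelta(days=days_ahead)))
    return resolved


# The same words resolve to different dates depending on the context (the sent date).
email = "Could we do Thursday or tomorrow?"
for expression, day in resolve_time_expressions(email, date(2024, 1, 1)):
    print(expression, "->", day.isoformat())
```

Even this toy shows why the problem is hard: the rule for a bare weekday is a guess about what the writer meant, and real emails contain far more varied and ambiguous phrasing than a fixed pattern can cover.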
For general-purpose agents like Siri or Cortana, which support questions about nearly anything, the virtual agent must first determine the context of the query before determining what action to take, making the whole process significantly more difficult and less accurate. The fact that Amy is singularly focused on scheduling meetings means she knows the context, and that makes the problem of detecting and resolving relevant constraints significantly more tractable, if still very challenging. The same reasoning applies to the semantic understanding of the text: how we go about answering the question “why was this constraint mentioned?” Here, our single-functionality focus allows us to frame the problem as a multi-label semantic classification task, a well-known line of research in machine learning, where deep learning has made recent advances. We’re excited that we’ve now collected enough data to be able to try various deep learning techniques to tackle this problem.
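What 'multi-label' means in practice can be shown with a few lines of Python. In a single-label task the classifier picks exactly one class; in a multi-label task each label gets its own independent yes/no decision, so one sentence can carry several meanings at once. The label names and raw scores below are invented for illustration, and the scores stand in for the output of whatever model (deep or otherwise) sits upstream.

```python
import math


def sigmoid(score):
    """Map a raw score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-score))


def predict_labels(scores, threshold=0.5):
    """Multi-label decision: every label is accepted or rejected independently."""
    probabilities = {label: sigmoid(s) for label, s in scores.items()}
    chosen = sorted(label for label, p in probabilities.items() if p >= threshold)
    return chosen, probabilities


def predict_single_label(scores):
    """Single-label decision, for contrast: only the top-scoring label survives."""
    return max(scores, key=scores.get)


# Hypothetical raw scores for one sentence, e.g.
# "Tuesday at my office works, but not before noon."
raw_scores = {"propose_time": 2.0, "exclude_time": -1.0, "set_location": 0.5}

labels, probabilities = predict_labels(raw_scores)
print("multi-label :", labels)
print("single-label:", predict_single_label(raw_scores))
```

With these scores the multi-label rule keeps both `propose_time` and `set_location`, whereas a single-label rule would discard the location information entirely.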
Q: What’s on the horizon? How will you take advantage of Deep Learning or other advanced techniques as you refine Amy and Andrew?
There are two schools of thought within NLP: those who (in a sense following Chomsky) construct algorithms by exploiting known grammatical structures of language, and those who think it’s more practical to have a purely statistical model that ‘learns’ an opaque, internal representation of it. At x.ai we think it’s healthy to enlist both schools, and luckily we have enough data to carry out these competing lines of research. On the one hand, we’re excited to try various forms of deep neural networks (RNNs, CNNs) applied to all sorts of NLP tasks; on the other, we’re equally excited about building our own semantic parsers, especially to tackle those problems that are very ‘compositional’ in nature, such as entity coreference and resolution within text.
Deep learning and semantic parsers (which are not totally at odds; hybrid models exist) are currently the most active lines of research at x.ai. We’re applying them as we build ever more sophisticated internal representations of the human-machine dialog, in which entities intertwine semantically throughout the conversation in complex ways to capture meaning. We’re working hard to push Amy’s (and Andrew’s) ability to understand increasingly subtle and complex customer preferences, commands and feedback, and to make the experience of interacting with our digital assistant not only as natural as interacting with an actual human being, but actually better. We strive to make the customers feel that Amy just ‘magically’ understands them, and takes tactful, diligent and flawless actions on their behalf.