[This post was co-authored by Satsuko VanAntwerp and Scott Wright, and was previously published in Rat's Nest]
As part of a monthly meetup I co-lead on the intersection of design and machine intelligence, we invited Stephanie Cruz, Product Designer at Wattpad, and Xavier Snelgrove, CTO and Founder of Whirlscape, to talk us through two design products leveraging artificial intelligence. Stephanie shared how Wattpad uses AI to augment its reading app, and Xavier shared how Whirlscape built its AI-first product, Dango. In an attempt to document and share the thinking and learning emerging from these meetups, below is a round-up of the presentations and the larger discussion.
Case study 1: Wattpad
Wattpad is a user-generated story sharing platform. Wattpad’s main challenge is connecting the right content to the right users at the right time. This challenge has two components: the writer side and the reader side. Wattpad deploys AI to close the gap between what people write and what people want to read.
Many stories aren’t tagged accurately, making it hard for readers to find them. Improper tagging can happen when writers add many tags in hopes of increasing their chances of being featured on top charts, or when writers don’t know which tags are appropriate for their story. Since writers often don’t realize the downside of not being selective with their tags, Wattpad is tackling this challenge in two ways: experimenting with messaging aimed at educating writers on the value of accurate tagging, and exploring AI to scan stories and auto-suggest appropriate tags. Using machine learning techniques, they are also working to remove noise from story tags by engaging their readers in tag rating. In doing so, they hope to help writers shift their focus to reaching the right readers (ones who share an interest in the topics they’re writing about) rather than the most readers, and to drastically improve the discoverability of their stories. This will be the foundation for more sophisticated approaches to tag auto-suggestion in the future.
What we read on the commute home from work is likely different from what we read on a beach holiday or before bed. The same person may be looking to read a different kind (or length) of story based on their mood, time of day, location, recent celebrity news, a newly launched movie, or something else altogether. That’s because reading fulfills different “jobs” for people depending on the context. Leveraging AI to incorporate the reader’s “job to be done” and context, and probing for signals (explicit questions about the relevance of recommendations, or clicks on this content versus that) to improve the personalization of recommendations, will help readers smoothly navigate the close to 250 million user-generated stories.
Case study 2: Dango
Dango is an emoji & GIF assistant for smartphones. Trained on hundreds of millions of tweets, Dango uses machine learning to understand real-time conversations and predict relevant emoji and GIFs to add to text messages. What’s difficult is understanding the nuance and meaning of phrases well enough to predict the emoji a user might employ to express their emotions. In other words, you can’t simply match words directly to emoji. That’s because the meanings of words are highly contextual (e.g. ‘how you doing’ from Joey on Friends versus ‘how you doing’ to a friend in a hospital). Matching whole phrases one-to-one is not flexible or efficient enough to properly map the spectrum of human conversation.
Creating Dango required algorithm-empathy on the part of the designers. How does an AI algorithm see the world? If you don’t tell it about time, it can’t know that people tend to have beers in the evening and coffees in the morning, and thus would not know to suggest related emoji at these times. So, while designers already have user-empathy, they must also develop algorithm-empathy, that is, an intuition for the types of decisions the algorithm will make under the hood. This is a case where designers and engineers can learn from each other!
However, when working with data sets, you can run into trouble and create a confusing user experience by simply allowing the algorithm to do its job in an unrestricted way. For example, during tests, typing the letter “I” at the beginning of a sentence would produce the donut emoji. Puzzled by this result, the team looked at the training data and, sure enough, people were tweeting “I [donut] know” or “I [donut] care”. The team decided that this result would seem like nonsense to most users and, as a result, cause them to lose trust in the product. So, the algorithm was adjusted to reflect user expectations rather than an accurate statistical representation of the training data.
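The talk did not go into how the adjustment was made, so the following is only a sketch of one possible approach: a small post-filter that drops suggestions a human has flagged as confusing, even when the model ranks them highly. The `SUPPRESSED` set, the `filter_suggestions` function, and the scores are all hypothetical and made up for illustration; none of this is Dango's actual code.

```python
# Hypothetical post-filter; names and scores are illustrative assumptions.
# Each entry is a (last typed word, emoji) pair that reviewers judged confusing.
SUPPRESSED = {("i", "🍩")}


def filter_suggestions(text, ranked):
    """Return the ranked (emoji, score) list minus any suppressed pairs."""
    words = text.lower().split()
    last_word = words[-1] if words else ""
    return [
        (emoji, score)
        for emoji, score in ranked
        if (last_word, emoji) not in SUPPRESSED
    ]


# Made-up model output for the input "I": the donut wins statistically.
raw = [("🍩", 0.41), ("😀", 0.33), ("❤️", 0.26)]
filtered = filter_suggestions("I", raw)
unchanged = filter_suggestions("I want a", raw)
```

With this filter, the donut disappears after a lone “I” but can still be suggested in other contexts, such as after “I want a”. The trade-off is that someone has to curate the list by hand, which is a design decision as much as an engineering one.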
The Dango team wanted to create a new feature that would predict instances of a particularly good sentence-emoji match. This feature would require a new approach to training their neural net.
Typically, when training a neural net for this kind of task, you would want to primarily use actual user data as the training data. That way, outputs would reflect the intuitions of the target population. However, what one defines as a particularly good emoji-phrase match is a matter of taste, sense of humour, and what one finds clever or interesting, and this can vary widely from person to person. So, in this case, using statistically averaged user data would produce a washed-out, bland result. For example, Xavier joked that if a neural net were trained to suggest clothing choices, everyone would be wearing a white t-shirt and jeans.
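To make the white t-shirt joke concrete, here is a toy illustration with made-up data (the names and outfits are invented, and a simple count stands in for whatever a real model would learn). Each person has distinctive favourites, but the only item they all share is the basic one, so that is what the aggregate picks.

```python
from collections import Counter

# Made-up data: each person's favourite items, one basic and two distinctive.
favourites = {
    "ana": ["white t-shirt", "red kimono", "leather jacket"],
    "ben": ["white t-shirt", "tweed blazer", "hawaiian shirt"],
    "cho": ["white t-shirt", "sequin top", "denim overalls"],
}

# Pool everyone's choices and pick the single most common item.
counts = Counter(item for items in favourites.values() for item in items)
top_item, top_count = counts.most_common(1)[0]
```

Here `top_item` comes out as the white t-shirt, even though nobody's most interesting choice was a white t-shirt. The pooled answer is safe for everyone and exciting for no one.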
Back to the question of how to define, in simple terms, what a particularly good prediction is. The Dango team came up with an unusual solution to this problem. Since Whirlscape CEO Will Walmsley had explained it as “I know it when I see it”, the Dango team had Will hand-label 30,000 examples to create the training data. The result was a bespoke neural net that mirrored Will’s taste and intuition for what makes a particularly good match. And, as users signal that a prediction is accurate (by clicking on the feature), the algorithm continues to learn and evolve.
Want more? Here is another concept that was discussed at the event — it gets a bit technical…enjoy!
Thought Vectors: Dango’s strategy is driven by a stack of recurrent neural nets. This type of net is particularly good at making sense of sequential input, like the words in a sentence. The basic job of the net is to abstract words and phrases into ideas or sequences of ideas, each of which can ultimately be represented as a list of numbers. In the same way that you can take simple x/y coordinates and plot them on a graph in two-dimensional space, you can take this longer list of numbers and plot it as a single point. However, in this case there are many more coordinates, or dimensions. This particular ‘high-dimensional’ space can be called ‘semantic space’, as it maps the relationship of ideas to one another. For instance, this mapping makes it possible to tell whether two ideas are ‘close’ to one another semantically. These points in semantic space have been named ‘thought vectors’ by deep learning pioneer Geoff Hinton. To learn more, check out Dango’s tech explainer article.
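For a feel of what ‘close’ can mean in semantic space, here is a minimal sketch. It assumes cosine similarity as the closeness measure, which is a common choice but not one confirmed in the talk, and it uses made-up four-dimensional vectors; real thought vectors are learned by the network and have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: near 1.0 means similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up vectors for illustration only; a trained net would produce these.
vectors = {
    "let's grab a beer": [0.9, 0.1, 0.8, 0.0],
    "drinks tonight?": [0.8, 0.2, 0.9, 0.1],
    "my flight is delayed": [0.0, 0.9, 0.1, 0.7],
}

beer = vectors["let's grab a beer"]
sim_close = cosine_similarity(beer, vectors["drinks tonight?"])
sim_far = cosine_similarity(beer, vectors["my flight is delayed"])
```

In this toy setup, the two phrases about going out for a drink score much higher than the beer phrase and the delayed flight, even though they share no words. That is the practical payoff of working with ideas as points in space rather than matching text directly.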