Exploring the Possibilities of Artificial Intelligence

An interview with Paco Nathan  |  November 9th, 2017

I recently spoke with Paco Nathan, Director of the Learning Group at O’Reilly Media. In the interview below, we discuss making life more livable, AI fears, and more.

I want to start by just asking what are you up to and what are you working on in the data space right now?

My job changed a bit back in February. We were at our own AI conference in New York about a year ago, talking to people and recognizing that some of what we were showcasing had a lot of applicability for our own products and services.

So we started up an effort to do AI applications in media: mostly for Safari, but partly for editorial and partly for conferences as well. And so I have been building out a different kind of business unit, if you will, one that not only leverages machine learning but applies some of the available AI technology to make life more livable in a world where we are swimming in data and media.

What does making it more livable look like?

One example is when we do a conference—we just got done with AI SF, and we are going to Strata New York next week. We’ll come out of a conference with maybe a couple hundred hours of video after the professional video editors have worked with it. The notion is that someone who is an acquisitions editor would be reviewing the product. The talks given at our conferences are very leading edge: very notable people, quite often saying things that are surprises, announcements, hot-off-the-press kind of stuff. When those videos come out, they are very popular on Safari with both our enterprise and our B2C customers.

One of the issues, though, is that we do 20 conferences a year, and growing. So if I do the math on how many post-edit hours of video there are per conference and how many conferences there are per year, it comes out that a development editor would have to keep a finger on the fast-forward button for 10 months out of the year to review those videos.
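
As a rough sanity check on that math (the per-conference hours, review speed, and work week below are illustrative assumptions, not O’Reilly’s actual figures):

```python
# Back-of-the-envelope check on the video-review workload.
# All figures below are assumptions for illustration.
hours_per_conference = 200    # "a couple hundred hours" post-edit
conferences_per_year = 20
playback_speedup = 2.5        # reviewing with a finger on fast-forward
work_hours_per_week = 40

total_video_hours = hours_per_conference * conferences_per_year  # 4,000
review_hours = total_video_hours / playback_speedup              # 1,600
review_weeks = review_hours / work_hours_per_week                # 40
print(f"~{review_weeks / 4.33:.0f} months of full-time review")  # ~9 months
```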

Instead, we have been working with Google and talking to some other cloud services that have AI APIs. We can do speech-to-text and then parsing, which allows us to index video for search and content recommendation. What we are moving toward is that rather than having to watch 40 minutes of video, we can give you a one-page summary where some of the key parts are time-coded, and you can click to go directly to that portion of the video. Hopefully that makes life more livable as an editor, because now we can put what’s important up front.
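
To illustrate the indexing idea (this is a sketch, not O’Reilly’s actual pipeline): once a speech-to-text service returns a transcript with timestamps, building a searchable, time-coded index is straightforward. The segment format and stopword list below are assumptions.

```python
from collections import defaultdict

# Illustrative sketch: build a keyword -> timecode index from a
# time-stamped transcript, the kind of output a speech-to-text API
# can return.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "we", "for"}

def build_index(segments):
    """segments: list of (start_seconds, text) pairs."""
    index = defaultdict(list)
    for start, text in segments:
        for token in text.lower().split():
            token = token.strip(".,!?")
            if token and token not in STOPWORDS:
                index[token].append(start)
    return index

def timecode(seconds):
    return f"{seconds // 3600:02d}:{seconds % 3600 // 60:02d}:{seconds % 60:02d}"

segments = [(95, "TensorFlow pipelines for production"),
            (610, "Active learning and human in the loop")]
index = build_index(segments)
for t in index["learning"]:
    print("jump to", timecode(t))  # jump to 00:10:10
```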

Do you think that using detection, using data, using technology in that way is either the future or needs to be the future for content producers? I know O’Reilly tends to be a little bit ahead of the curve with that kind of stuff, but do you think in general content producers are going to need to start to become more comfortable with using technology in that way?

Definitely. I have been going out and giving talks in general about what we are doing in Safari, or maybe how we are using Jupyter in a few different ways that are novel. But lately I have been talking more about how we approach AI, and particularly the management side—there is a design pattern you are probably familiar with called human-in-the-loop, the idea of active learning.

The gist of it is that rather than try to automate everything, if there is a complex thing that needs to be done a lot, like annotating categories on content, we can build up machine learning models. In particular we build up ensembles. So we have a machine learning pipeline that will take new content and then start to say what topics we think are most important in it. If the ensembles can agree and have confidence in what they are predicting, then great, that’s all automated. It just goes right through the pipeline, but when the ensembles disagree, or they have low confidence, then we feed that back to a human and somebody makes a decision.

They exercise judgment, and the example—the decision that they make—gets fed back in to train the machine learning pipeline. So we have built up these content annotation pipelines purely based off of examples. The people making those decisions never really touch anything having to do with machine learning model parameters. It is just: this chapter is about XYZ, and about ABC, and about QPW. By building up examples pro and con, we can say, okay, you are searching for the iOS operating system. Are you talking about Apple smartphones or Cisco switches? Because we have a lot of content about both on Safari. And those kinds of ambiguities are really hard to solve with deep learning alone, because they involve a lot of expertise and context.
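
A minimal sketch of that routing logic, in scikit-learn terms (the models, confidence threshold, and review queue here are illustrative assumptions, not the actual pipeline):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

CONFIDENCE_THRESHOLD = 0.85  # assumed cutoff for automating a decision

def route(models, X):
    """Auto-label items the ensemble agrees on; queue the rest for a human."""
    auto_labeled, review_queue = [], []
    for i, x in enumerate(X):
        probs = [m.predict_proba(x.reshape(1, -1))[0] for m in models]
        labels = [int(p.argmax()) for p in probs]
        confidences = [float(p.max()) for p in probs]
        if len(set(labels)) == 1 and min(confidences) >= CONFIDENCE_THRESHOLD:
            auto_labeled.append((i, labels[0]))  # agreement + confidence: automate
        else:
            review_queue.append(i)               # disagreement or low confidence
    return auto_labeled, review_queue

def incorporate_feedback(models, X_train, y_train, X_reviewed, y_human):
    """Human decisions become new training examples for every ensemble member."""
    X_all = np.vstack([X_train, X_reviewed])
    y_all = np.concatenate([y_train, y_human])
    for m in models:
        m.fit(X_all, y_all)
    return X_all, y_all

# Usage sketch: models = [LogisticRegression(max_iter=1000).fit(X, y),
#                         RandomForestClassifier().fit(X, y)]
```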

We do the human-in-the-loop part, and that allows the machines to do the heavy lifting while people are able to interject for the exceptions. So it is not an overwhelming amount of people’s time, and at the same time we don’t have to automate 100% of what we are doing. You know, we can get by automating 90 to 95%. And as we are showing this, it is really resonating. We are finding enterprise customers who are running into similar conditions and needs and taking similar approaches. We are finding in academia this is a pain point, particularly for digital humanities, and so there is some pretty good innovation there. I can’t really point to any other content publisher, per se.

Definitely people working with media are running into this. I think the publishers may not be quite up to deploying the latest in deep reinforcement learning and I wouldn’t expect them to. We are kind of enjoying the privilege of being in touch with a lot of leading researchers and developers. But I do think it is the way things are shaping up. And we have definitely seen several talks like that at AI SF this last week. We had several talks about human-in-the-loop, some of them from large consulting firms like Deloitte, others from smaller consulting firms that are really hot properties now like CrowdFlower.

Part of it is if you want to get in the game of using deep learning, you must have labeled data sets. To get labeled data sets, active learning is a really good technique. We are at that stumbling block, and a lot of other companies are too. So I do think this is a trend that’s picking up.

When you are out there talking to people, what kinds of questions or fears do they come up with? I know it is natural for people to occasionally be resistant when technology starts in new areas, or takes on jobs that humans were doing.

Definitely. I think there are two that really come up. One of them is that AI has hit an inflection point. The majors all use it: anybody who is producing smartphones and doing search, all of that, there are a lot of AI use cases. And we are seeing more now start to percolate out into big use cases in manufacturing, transportation, energy, etc.

I think there are more people coming in who haven’t traditionally had as much exposure to machine learning or data science or any of the computer science side of things. They are coming out of mechanical engineering, out of how to run or build a factory. Peter Norvig has a great way of describing it: uncertain domains, working with uncertainty. Because they are leveraging uncertainty to be able to do this kind of work.

I think people coming out of more traditional engineering fields start to see the uncertainty aspects of it, and they kind of balk. So how can they really begin to trust the systems, and also change out the tools? I mean, when we talk about testing, we are not talking about running a deterministic, exact-match unit test anymore; we are talking about doing statistical testing. So I think in some ways even software engineering is struggling with AI. Again, Peter has given some great talks about that.

The other thing that I think is a real problem people stumble on right now is transparency. If you have these machine learning pipelines that are doing fantastic work, even if you have humans-in-the-loop, how much can we explain the decisions the automated part is making? How much model transparency do we have? There are some good resources—datascience.com has a GitHub project called Skater that does model interpretation, working toward the general case. I think we’ll be seeing more efforts like that.
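
Skater is one library in that space; as a flavor of what model-agnostic interpretation can look like, here is a minimal sketch of permutation feature importance (the implementation is illustrative, not Skater’s API):

```python
import numpy as np
from sklearn.metrics import accuracy_score

def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Shuffle one feature at a time and measure how much accuracy drops.
    A large drop suggests the model relies heavily on that feature."""
    rng = np.random.default_rng(seed)
    baseline = accuracy_score(y, model.predict(X))
    importances = []
    for col in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            rng.shuffle(X_perm[:, col])  # destroy this feature's signal
            drops.append(baseline - accuracy_score(y, model.predict(X_perm)))
        importances.append(float(np.mean(drops)))
    return importances  # one score per feature, in column order
```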

I’m always curious about how people predict how new ideas will take hold. Not to put you on the spot or hold you to anything, but do you feel this is something that is going to move pretty rapidly, or do you think five years from now we will still be getting people comfortable with it? I know with AI in general, who knows what is going to happen, but just with this particular usage of it.

This kind of usage, where it is mostly automated but people can jump in where they need to, that speaks to what customer service departments do.

I have had this conversation repeatedly. We have AI experts going in and making the judgment calls on the edge cases right now, but realistically our customer service department does that all day every day when they are talking to customers. So maybe we just need to tilt our user interface so we have our customer service organization going in and clicking the buttons in the right places to train the AI because that’s basically what we have built. I’m talking with other large organizations that are coming to a very similar conclusion.

Because of that I think, okay, this is kind of a lab thing right now, but it is being used in production, and I think it will be rolled out for us. Looking ahead a year, we will have essentially customer service people training AI, and I don’t think it will be much longer before we see a lot of other cases like that. So I would put it definitely in the two-to-five-year horizon where experts training by example will be rolled out even in pretty mainstream enterprises.

As a side note, enterprise customers could see significant tax benefits from this. Customer service organizations are usually cost centers, but could claim R&D tax credits because they’re helping train AIs and accumulate knowledge for use in automation, which is arguably R&D.

The idea of customer service training AI, how do you think that relates to the chatbot craze right now?

Yeah, it is interesting. I think they have some natural overlap because the kind of dialogue that customer service people have over and over can start to be portioned, segmented, and categorized in ways that start to fit with bot development. And certainly I think some people who are in particular customer support roles sometimes feel like they are bots. I have definitely heard that from friends before. So I think there is some natural overlap there.

But doing bots is difficult, and parsing language is difficult. Natural language generation is still pretty early, but there are a lot of good advances. I was involved in a project in 1995 doing bots for customer service, and we did have a way to fail over to a human, like a retail clerk, if something got interesting. That team ended up competing internationally in something called the Loebner Prize. So I’ll give props to Robby Garner, who is the real competitor there and actually took first in the Loebner Prize a few times. I guess the reason I’m saying this is that it is not entirely a new field. We saw inklings of it at least 20 years ago.

Making convincing bots is kind of an art form, but it is not impossible. I have definitely seen great examples of it. I do worry now that some of the bot toolkits are over-promising, because again it is kind of an art form, and unless you can really get in and do the artistry, the results are going to be iffy. Still, I am pretty hopeful in that area.

We are doing chatbots in production on Safari. Oddly enough, when we do live online training, one of the hard things is getting hundreds of people in a course to break up into groups of four to do group exercises. If you are a professor standing in front of 300 people and you say “break into groups of four,” you can pretty much stare them down until they do. Been there, done that. But when you are online, you can’t. You get this dead space, and people are like, “What do I do now?”

We made a chatbot that basically puts people into private DMs in groups of four and then rebalances the groups if some people don’t show up. It works great, because otherwise that is a huge amount of overhead for online training.
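
The core grouping logic for a bot like that can be quite small. A minimal sketch (the function name, group sizes, and rebalancing rule are assumptions for illustration):

```python
import random

def make_groups(attendees, size=4, min_size=3):
    """Split attendees into groups of `size`; rebalance stragglers so
    nobody lands in a group smaller than `min_size`."""
    attendees = list(attendees)
    random.shuffle(attendees)
    groups = [attendees[i:i + size] for i in range(0, len(attendees), size)]
    if len(groups) > 1 and len(groups[-1]) < min_size:
        stragglers = groups.pop()
        for i, person in enumerate(stragglers):
            groups[i % len(groups)].append(person)  # spread across other groups
    return groups

# e.g. make_groups(range(10)) -> two groups of five (4 + 4 + 2, rebalanced)
```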

So we are doing some work with bots, and some information retrieval work with chatbots also. And I’m certain that Alexa, Google Assistant, and others are going to continue to grow and push demand for transforming bot-making from an art into a practice.

A couple of my colleagues wrote a post on chatbots and banking. I am intrigued but, especially with the recent Equifax situation, do I trust any of these people messing with my personal data? But it is definitely seeping into everything. I have had very pleasant interactions with chatbots and I always speak nicely to them because of Skynet.

[laughing] That’s great, you never know when it will come back.

I have a friend at Concur Labs down here for AI SF, and we are talking about that because they are partnering with Slack and doing some really interesting things on getting enterprise services into chatbots and I hope to be participating in some of that soon too.

One thing I will predict: there has been this real emphasis on deep learning. When we did the first AI conference in New York last year, I think I counted that over 80% of the talks were about deep learning. And a lot of it was just people bringing up their NIPS talks and recycling them for industry.

What we are seeing this year is that it is really branching out into a much more balanced approach. There are the deep learning talks, but then there are also people doing really important work in evolutionary software, people doing graph algorithms, people doing ontology, people doing reinforcement learning, and on and on. If I were to roll the clock back and listen to the industry buzz about chatbots and deep learning, I wouldn’t believe half of it.

Because the thing is, to really do chatbots well you do need to have a kind of knowledge graph. You have to have context to focus the conversation in smart ways. And that context is not something that neural networks are going to be very good at giving you. But if you are working with an ontology, it is much easier to focus a conversation.

We are already seeing it—Amazon, Microsoft, and Google, on their chat APIs and speech-to-text APIs, are taking in controlled vocabularies. They are taking in bits of ontology to help refine the quality. And that is my main prediction: you will see a lot more of the ontology work driving that direction.
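
As one concrete example, Google’s Cloud Speech-to-Text client accepts a controlled vocabulary as “phrase hints.” A minimal sketch, assuming the Python client; the phrase list and audio URI below are illustrative, and field names should be checked against the current client docs:

```python
from google.cloud import speech

# Sketch: bias speech recognition toward a domain vocabulary by passing
# phrase hints. The phrases and the gs:// URI below are assumptions.
client = speech.SpeechClient()

config = speech.RecognitionConfig(
    language_code="en-US",
    speech_contexts=[
        speech.SpeechContext(phrases=["Jupyter", "Kubernetes", "Cisco IOS"])
    ],
)
audio = speech.RecognitionAudio(uri="gs://example-bucket/talk.flac")

response = client.recognize(config=config, audio=audio)
for result in response.results:
    print(result.alternatives[0].transcript)
```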

We are almost at time so I want to ask my favorite question because it results in such random answers. This can be related to what we have been talking about, or something else in data, or nothing to do with data at all, but is there anything you are looking forward to in the future that you are going to jump on as a project?

One thing that I would put into my crystal ball: I think we have been able to do a lot of work lately with machine learning, and particularly with deep learning AI techniques, to do metadata cleanup. It is always a problem when you have a business that has been going for decades and has made acquisitions, and a lot of that has come forward in the last couple of years. Cleaning up metadata is a lot harder than it sounds, and also extremely valuable.

Lukas Biewald, founder of CrowdFlower, gave a great talk about active learning at AI SF, using the timeline of an algorithm’s introduction versus its first “killer data set” as an example. He stressed the need for metadata cleanup too. It was this repeated theme: you may have the best algorithm in the world, but until your data sets can really be leveraged, it is not going to have a lot of impact.

Right now I’m really hopeful about doing a lot more metadata cleanup. It has big value, because then we can deploy the AI side of it. But the thing I see on the horizon right beyond that is, for instance, topological data analysis (TDA). There has been a lot of interesting work there, not just in AI but in other fields as well. And frankly, the math behind it is so compute-intensive that until you could run GPU clusters in the cloud, it didn’t make sense. So I think we are at the point where, once you can clean up your metadata, you can get into stuff like TDA and get to some really interesting, complex insights.

Editor’s note: The above has been edited for length and clarity.