Gadgetory


Reporters' Roundtable Ep. 121: Wavii founder on the future of news

2012-04-24
Rafe Needleman: Hey guys, it's Rafe Needleman in San Francisco. Welcome to Reporters' Roundtable. Last week I covered a really interesting company called Wavii, W-A-V-I-I, and I thought the company was fascinating. It is a semantic analysis product that takes the news of the world, all the news that's coming in, and gives you a Facebook-like feed of what is happening with businesses, people, and events: so-and-so bought this, a company was acquired by such-and-such. The system that does it is, as far as I can tell, magic. It's a computer that reads the news and parses it for you, and I just think that's fascinating. Because I think it's fascinating, I have the CEO of the company, Adrian Aoun, here with us to talk about his business, Wavii, what it's doing, why it's doing it, and the future of computers that read the news so you don't have to. Adrian, thanks for making the time to join us on the Roundtable.

Adrian Aoun: It's a pleasure. Thanks for having me.

Rafe Needleman: So give us a brief pitch of what this business is, and then let's get into the science and the future here. What is Wavii and why does it exist?

Adrian Aoun: Yeah, sure. To understand Wavii, it's probably easiest to put it in the context of something that we all know, which is Facebook. What Facebook does so well is give you these quick, three-second visual updates about what's going on in the world: Bob checked in to a location, Julie has a new job, Eric is now dating someone. Those nice visual updates are just really, really pleasant to consume. They're fast, and I get to decide what I want to click into. So if I see that you, Rafe, checked into a restaurant and you've got some photos, I get to say, hey, I want to click into these photos and learn a little more, or I get to say, you know, I'm just not that interested. Or let's say you start dating someone new. I get to click on her and go find out more about her. Who do I know in common with
her? Where does she work? So Facebook gives us this nice interface for exploring the information about our friends, and it's incredibly good at keeping us up to date with all of them. In a few minutes a day you can keep up to date on, God knows, a thousand people, or 1,500, or 2,000 people. At Wavii we basically just wanted that same product, but for the world. I just want to keep up to date with everything that's going on in the world in these nice little visual updates where I can click around and explore, and so that's what we set out to do.

The challenge is this. Let's say I get in my feed "Rafe is dating Julie" (sorry, I don't know your wife's name), and maybe I get some photos of you and Julie below it. The way Facebook was able to give me that is that maybe you filled out a relationship status and your wife confirmed it, maybe one of your friends was there taking some photos, and maybe one of your other friends was at home saying, oh, I see these photos, let's tag them.

Rafe Needleman: Yeah, so it's all very structured, and you're talking about applying structure to other stuff.

Adrian Aoun: Exactly. But the trick for us is that, unfortunately, that structure isn't being provided by users for everything going on on the web. When Facebook bought Instagram, they didn't change their relationship status. When Whitney Houston died, she didn't check into her death. There's basically tons of data out there that's not in a format the computer understands. It's in a tweet, it's in a blog post, it's in an article, maybe it's in a YouTube video. So what we're doing is teaching the computer to read and understand that, much the same way a human would, and then build these feed items that represent what occurred, if that makes any sense.

Rafe Needleman: And the product, then, is like a feed of the news. Now why do we need this? In my field, in technology, I
read News.com, of course, I scan a couple of blogs and RSS feeds, and I look at Techmeme. Why do I need something that is giving me a timeline view of the news when the news is all there right in front of me anyway?

Adrian Aoun: So the first aspect is that it's just hard to get through a lot of the news today. When Facebook bought Instagram, it came into my feed, you know, 2,000 times. It's in my RSS, it's being posted in five different places. So first, it's hard to consume, because there are different fragments of the story. This article talks about the purchase price, this article talks about how much Kevin made, and this article talks a little about the background. You really want an aggregate, single view: here are all the details that have been pulled out of these things.

There's a second aspect too, which is things that you care about that aren't so large in the news. Maybe your local ice cream shop releases a new flavor, and the local paper covered it in two sentences. That sort of stuff just gets buried, and so to an extent we also want to surface that information to the people who care about it.

Then there's a third aspect, which is putting things in context. As I mentioned, if you're dating someone, I can click on that person on Facebook and go find out about her. When you're reading a news article today, you just don't have that ability to really get context or analyze information. You only get a singular viewpoint at each point in time. Think of something really simple that happens to us all the time. Let's say I want to know when Apple is releasing the next iPhone, and I'm pretty sure everybody wants to know this. What do I do? I can read your article about it, Rafe, or I can read something on, who knows, some other site. But wouldn't it be great if I just had some
visualization that was like: look, 14 sites said it's coming out in April, nine sites said it's coming out in June, and here's the timeline of when they've released them historically. Even better would be: typically these seven sites are the accurate ones, and look at what they're saying. They're saying June, so you probably want to focus on that. We believe that giving users access to more information and visualizing that information is a very compelling experience. Now, I don't think it replaces your News.com. I don't think it replaces your Techmeme. At the end of the day, people stay up to date at many different sites, and that's okay. We just want to provide a set of information and experiences that users aren't getting elsewhere and that are valuable.

Rafe Needleman: When I saw this product I was very impressed by the demo, and I've been using it, and it is still very impressive technology. What it does is look at all this information, based on stuff that you say you're interested in (we'll talk about that later), and it distills the headlines, in some cases into better headlines than the original articles', and puts them in front of you. That seems fairly magical. How do you do that? How do you figure out that a story written by me in a terrible rush is actually about such-and-such buying so-and-so, or Google Mail being down, or something like that?

Adrian Aoun: Sure. I don't have my wizardry degree, so it's not actually magic, though it would probably have been faster to produce if it was. Think about how children learn, or think about how you learned when you were very young. Maybe one day your father comes home and says, "You know what? I want a new watch." Then later that day your mother comes home and says, "I really want a massage." As a little kid, you're hearing tons of language, and then you're starting to discern some patterns. The pattern in
this case is maybe you're hearing, hmm, "I want something," "I want something," and you're going to figure out what that really means. Maybe you'll ask someone. Maybe the kid will go up to his cousin and say, "Hey, what does it mean to want something?" Then the kid learns what wanting really means, and now, because he's learned the pattern, the kid can go ahead and try it himself. He can say, "Well, I want a sandwich," or "I want a toy," and the kid has learned that concept from now until the dawn of time.

Well, this is how we train our system. Our system is looking at all language. It's a big data problem: it's just ingesting tons of content off the web, and then it's using machine learning to discern patterns. It'll say, a lot of these things all look the same, and it'll maybe show them to one of us, and we'll say, oh, those are engagements. So we'll just tell it: that's an engagement, when two people get engaged, it leads to marriage. We give it a couple of details, and then the system will go off and start trying out its knowledge, much like a little kid. It'll say, okay, that means this actor got engaged to this actor, this politician got engaged to this politician, and it goes through, and it's probably guessing pretty correctly most of the time. Then maybe it makes a mistake. It says Barack Obama is engaged to Ahmadinejad, and we're going, wait, wait, wait, how did this happen? Maybe what it saw was "Barack Obama was engaged in a heated debate with Ahmadinejad." What will happen there is we'll tell the system no, and it'll figure out, once again using machine learning, that being engaged to someone and being engaged in a heated debate with someone are two different things. So it's honing its knowledge over time, much like a little kid, where a little kid may say "I want a happy" and you're
you know, you'll correct that little kid: no, you can say "I want" an object. The kid gets better and better over time, and that's what our system is doing. It's getting better and better over time.

Rafe Needleman: Now, people have had theories about this for probably hundreds of years: about linguistics, about meaning, about deconstructing language. And human beings, of course, even my five-year-old, or before, when he was learning language, have trillions of neurons and interconnections, and our brains work fundamentally differently from the way binary computers work, no matter what the programming is. There have got to be precedents here, paths that scientists and linguists have gone down that have been wrong and paths that have been right. What are you basing your technology on?

Adrian Aoun: Sure. The first and foremost thing to understand is that, at the end of the day, we haven't built Skynet. Our computer can't think. Our computer is just running some math where it's recognizing patterns and representing that information. Now, I take it as a compliment that a lot of people look at our product and say there must be something going on there that really is almost artificial intelligence, but it's not. It's really just a first step along the long, long, long path of building full artificial intelligence.

What people have tried to do in the past when teaching computers language, and this is somewhat of a generalization, is basically teach the computer what you learned in third grade, fourth grade, fifth grade, which is the rules of grammar: this is a subject, this is a verb, this is an object, this is a past participle. You remember all those classes that you took and all that information you learned. What we found is that it doesn't really make sense to try teaching language that way, because that's not really how humans learn.
I mean, think about it: a three-year-old speaks. When your child was three, he or she was speaking, but that three-year-old doesn't know what a verb is, or at least if they do, they're smarter than I was at that age. What we found was that the grammar rules we learn in school are very much an attempt at retrofitting rules on top of language. And you didn't learn all of language in your third, fourth, and fifth grade classes. By the time you were done with fifth grade you weren't saying, "Oh, I now know a hundred percent of everything." You're actually learning one concept at a time. I'm teaching you a concept right now about machine learning and artificial intelligence, and maybe later today you're going to teach me a concept about journalism. Learning these concepts is a never-ending process. We're going to learn them until the day we die, and that's how we decided to teach our computer, rather than trying to train all the rules up front, once and for all.

Rafe Needleman: You come to this field honestly, I believe. Tell us about your upbringing and the lineage of what you've been doing here, back to some of the great linguists of our time.

Adrian Aoun: Yes. I've been surrounded by linguistics. It's that thorn in the side that has kept poking me since I was really young, and the reason is that my father is a linguist. He studied at MIT under Chomsky, and so I was constantly surrounded by my father and his friends. One of the things that's great about MIT is this culture of debate, where they're constantly arguing about this language structure or that language structure. So when I was growing up, we'd just sit around the dinner table, or my dad would invite his friends over for drinks, and they'd be arguing about various aspects of these rules. Which rule is correct? Is it this rule or this rule? Which covers
more cases? What's funny is they would often just turn to me, as the young child, and say, "Well, which do you think is right?", because there's a degree to which intuition and native understanding of language is often more accurate than overanalyzing language. Now, I never wanted to be a linguist. I still don't want to be a linguist. But I spent a lot of time thinking about language, and what that experience taught me is that language just isn't that hard. By that I don't mean that humans can't speak with incredibly difficult sentences and structures. What I mean is that we've all learned it fairly intuitively. What that tells me is: look, if two-year-olds can speak, we can at least get the computer to understand language at the two-year-old level, or at the three-year-old level. I don't think we're going to do it at the 50-year-old level, but it certainly taught me that language has some core elements to it that are fairly simple.

Combine that with my focus more recently on trying to unlock meaning on the web. Think about the experience that Facebook gives you. They've unlocked all the meaning about your friends. I can figure out which of my friends live in Seattle, which of my friends work at this company, which have been to a restaurant, who's dating whom. They've really unlocked all of the information about my friends. But if we want that same thing for the internet, all of the information on the internet, for better or for worse, is in natural language. If I want to know who all the celebrities are that had DUIs in 2008 (I'm not saying I do, but let's pretend), how can I get that? Well, you've got to think that Perez Hilton is working all day long to make sure that content is on the web. It's in a tweet or a blog post. If I want to know what all the Series A valuations are,
that content is all on the web, but the computer can't get at it, because the computer doesn't understand natural language. From our perspective, that was the cost of doing business. If we want to build this product, we have to teach the computer to understand language, and for better or for worse, we didn't back down, merely because we were comfortable with language. We'd been surrounded by it for a while, so it wasn't that scary to tackle the problem.

Rafe Needleman: There are other things that signal importance to people, in addition to the facts that you are so far doing a pretty good job of extracting from unstructured articles. One of those signals is the social signal. If my friends read a story, then that's arguably important to me. If I say I'm interested in something by retweeting it on Twitter, then that can be picked up as a signal. How does that play into what the user of Wavii sees? One of the things about Wavii, just as a little side note, is that the display of the Wavii page is not information-dense in the same way that a New York Times front page or a Techmeme page is, where there are like a hundred headlines. On Wavii you see a stream, and it's far fewer stories. How do you decide what the user sees?

Adrian Aoun: Sure. There are two aspects here. There's deciding which items to show the user, and then there's a second aspect of deciding how much to show the user. Deciding which items to show a user really comes down to, I'd say, three main aspects. The first is what we see in the world, and by this I mean how often we are seeing something and how rapidly we are seeing it. We may see something often, for example the rumor that Apple is releasing a new iPhone. We see it all the time, but we don't see it very rapidly, in one big spurt. We see it all the time, but just occasionally. So that's the first aspect: what we think is world heat.
The second aspect that we think of is a priori knowledge of the concept. We know that a death is more interesting than a birth, almost always. We also know, for example, that acquiring a company for a billion dollars when you've only ever acquired companies for tens of millions is actually a really big deal. Because we have this underlying information, we can play some extra analysis games to really make sure we're surfacing relevant content.

The third aspect has to do with the user and their world. What this comes down to is what the user is following in our system. Are you following Barack Obama, and is this something happening with Barack Obama? If so, it's probably pretty relevant to you. Or are you following Michelle Obama, but it's something happening with Barack Obama? Then it's maybe slightly further from your interest, but it also might be relevant to you. Or in the past, when we've shown you things about Barack Obama, have you clicked on them or not? And then there's the portion that you're discussing, which is how much your friends are engaging with this piece of content. Are your friends all liking, commenting, reading? If so, that's also going to build heat in our system. That all goes into building a ranker that decides what's the most important thing to show the user.

Now, the second aspect is how much we show the user. Do we give you ten items, a hundred items, a thousand feed items? This is a trade-off between what's known as accuracy and recall. The fewer items we show you, the lower the recall, but typically the better the accuracy, and by that I mean they're going to be things you care about far, far more. If I just showed you one feed item, "Facebook bought Instagram," the accuracy is really high, you're probably going to care about it, but the recall is pretty low. There are tons of things
you missed. So this is a game we're constantly playing. We don't want to give too much recall and inundate the user with tons of things they don't care about, but we also don't want them to miss things. What exists in the product today, and what we're also constantly working on, is looking at how the user engages with our content and figuring out, on a per-user basis, whether this user wants more content or less content. This is something that you see at Facebook. The more often you come to Facebook, the more stuff they give you in your stream or feed, but it may be less and less interesting stuff over time, because they run out of the good stuff. We have the same notion.

Rafe Needleman: And finally, what is the role, in your estimation as the reluctant linguist, of article writers who are putting their blood, sweat, and tears into gathering their information, gathering the opinion, and crafting it together? What is the role of the writer?

Adrian Aoun: Think of this from probably three perspectives. The first perspective is at the very beginning: what was a reporter doing? They were reporting, right? So it's breaking the news, and our system needs that. Our system wants that. We're not looking to replace that. The first one or two people that break the news, that's particularly interesting to us. Now, the fact that right now many, many articles printed on the web are just duplicates in terms of what actually happened, that's less interesting. So we'd like to take the 50 articles that say Facebook bought Instagram and merge them into one feed item.

But what is interesting, and let's think about the second aspect, is this: when you're reporting something, Rafe, maybe you didn't break it, but what are you doing? You're adding your analysis, your opinion, your context, and that's extremely valuable. I do think that 50 people adding their opinion and
analysis is still something that we don't want to change. We want people to do that. In our perfect world, our system would say, "Here are Rafe's thoughts," and if you click on that, now I dive into Rafe's article to learn a little more about it.

Now there's a third aspect too, and this has to do with the fact that we often approximate our interests by sources, which is probably the best way to put it. What I mean by that is: I'm interested in tech. I don't necessarily know what in tech I'm interested in, so instead I follow what Rafe writes, and Rafe is good at introducing me to things. It's almost a serendipitous experience of discovery, and we think that's extremely valuable too. That's why, over time, we want our users to have as much control as they want. If they want to follow the reporter, great. If they want to follow the company, or the product, or the story type, namely an acquisition, great. We basically want to give users the ultimate control over that and let them define their experience, if that makes sense.

Rafe Needleman: Okay. Hey, Adrian, thank you so much for making the time for us today. A very interesting product. You guys have to check out Wavii, wavii.com. There's also a mobile app for it. And I just want to point out, for people who have gotten this far, that this is a very interesting space to be in from a business perspective. The news aggregator Zite was acquired by CNN. Powerset, the search company, was acquired by Microsoft. I don't know what's going to happen with Wavii, but get it while you still can and it's still independent. This is a really fascinating company to watch. Adrian, good luck to you, and thanks for the time.

Adrian Aoun: Thank you very much, Rafe. I really appreciate it.
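
The ranking and feed-length trade-off described in the interview can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Wavii's code: the interview names the signals (world heat, a priori knowledge of the concept, what the user follows, and friends' engagement) but gives no formula, so the `FeedItem` fields, the `WEIGHTS` values, the linear `score` function, and the sample items are all hypothetical. What the interview calls "accuracy" is computed here as precision, the usual information-retrieval counterpart to recall.

```python
from dataclasses import dataclass


@dataclass
class FeedItem:
    headline: str
    world_heat: float       # how often and how rapidly the event is reported (0..1)
    concept_prior: float    # a priori interest of the event type, e.g. death > birth (0..1)
    user_affinity: float    # closeness to what this user follows or has clicked (0..1)
    friend_activity: float  # likes, comments, reads by the user's friends (0..1)
    relevant: bool          # demo-only ground truth: does the user actually care?


# Hypothetical weights: the interview lists the signals but no numbers.
WEIGHTS = {
    "world_heat": 0.3,
    "concept_prior": 0.2,
    "user_affinity": 0.3,
    "friend_activity": 0.2,
}


def score(item: FeedItem) -> float:
    """Combine the four signals into one number (assumed linear blend)."""
    return sum(weight * getattr(item, name) for name, weight in WEIGHTS.items())


def rank(items: list[FeedItem]) -> list[FeedItem]:
    """Order items from most to least important for this user."""
    return sorted(items, key=score, reverse=True)


def precision_recall(ranked: list[FeedItem], k: int) -> tuple[float, float]:
    """Precision and recall if only the top k ranked items are shown."""
    shown = ranked[:k]
    hits = sum(1 for item in shown if item.relevant)
    total_relevant = sum(1 for item in ranked if item.relevant)
    precision = hits / len(shown) if shown else 0.0
    recall = hits / total_relevant if total_relevant else 0.0
    return precision, recall


# Made-up sample feed for one user.
items = [
    FeedItem("iPhone release rumor", 0.4, 0.3, 0.2, 0.1, False),
    FeedItem("Local ice cream shop releases a new flavor", 0.1, 0.2, 0.7, 0.2, True),
    FeedItem("Facebook bought Instagram", 0.9, 0.9, 0.8, 0.9, True),
    FeedItem("Two actors got engaged", 0.6, 0.5, 0.1, 0.2, False),
    FeedItem("Barack Obama gave a speech", 0.5, 0.4, 0.9, 0.3, True),
]

ranked = rank(items)
for k in (1, 3, 5):
    precision, recall = precision_recall(ranked, k)
    print(f"show {k} item(s): precision={precision:.2f} recall={recall:.2f}")
```

With this made-up data, showing only the top item gives perfect precision but recall of one third, while showing all five items raises recall to 1.0 and lowers precision to 0.6, which is the trade-off described above. A per-user feed length, as discussed in the interview, would amount to choosing `k` separately for each user.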