Gadgetory


All Cool Mind-blowing Gadgets You Love in One Place

Veenome: The future of interacting with video is here

2012-03-13
Hey everyone, I'm Molly Wood from CNET.com, here at South by Southwest 2012 in Austin, Texas.

I'm Brian Tong, and here today we have Kevin Lenane. He's the founder and CEO of Veenome, and this is a really cool video product. We talk about some of the stuff that we see: video is the future. Can you describe to people what Veenome is doing?

Sure. It's really simple. Veenome just tells you what is in a video, right? So you can find out the stuff that's in a video (the products, the people, the brands), and then you can use that data for commerce, for advertising, to target advertising, and for search and discovery.

Yeah, and when you say that's really simple... I mean, you guys have built your own technology, right, to do actual video scanning and pull out not just objects but people?

Exactly. The concept is simple; the machine in the back is not at all. At the end of the day you're going to get data on what's in your video, but the process to get there is actually really complex.

Yeah, so it's going to go through... no, I didn't mean it.

So basically, how we do this is we take a whole video. Let's say it's ten seconds long; that's 300 frames. We find the key frames, which is where things are changing by a certain percentage. We then take tags for each one of those key frames, and then we look at all that data linearly and say, okay, what kind of relationships can we find amongst those tags? Say there are two tags that say "car" and another one that says "Mercedes": let's mush this together and call it a "Mercedes car," and then provide more detailed tags. And that's really the key. If I just tell you there's a car in the video, it's neat, but it's not that interesting or useful, because you know it's a car; you can see it's a car. But you might not know what kind it is.

Now, I got a chance to go to your website and at least play, and you have examples of how this technology works, and it's amazing because you're literally dragging
your mouse around specific items, and when it changes (for example, there's a MacBook Air in the frame, then it changes to the iPhone), it detects that. Now, is there any human interaction in the scanning of these video files, or is this all a computer brain interpreting these images and determining what's on the screen?

So there is a human element if you want it. The way this product works is it's really a B2B kind of product, in the sense that I don't want to build a site that people come to to watch videos. I want to build a way for people to index their videos and productize their videos.

Productize. I'm owning this now.

Yeah, that's a good word. So there's a hashtag: #productize. We don't really want to create this destination site, so we allow people to take this technology and use it on their own site. So it's a platform, and if they want to, they can actually brand objects on their own. If we're here, and I know what kind of jeans I'm wearing, and it's my video on my site, I can say these are Levi's jeans in the case that we can't identify them. So there is an option to manually correct. Thus far, the people we're talking to about this, potentially about our API, are really interested in being able to control that stuff, for brand safety reasons and for the idea that you might not ever be able to tell exactly what kind of sweater I'm wearing. So if you want to add manual detail, you can.

So you have a catalog of tags already, right? You're just sucking up all kinds of data about what a Mercedes looks like and that kind of thing. But you're saying if a business contracts with you, then they can...

Right. I mean, the problem with it, obviously, is that the world of possible things you can shoot in your videos is infinite, and so there's no way for us to always have everything. And so we allow
for that. And it can be used for custom control too: if you want to do product placement and things like that in your videos, you can actually control that and say, you know, just make all the pizza Domino's pizza.

So what are some potential uses? Commerce seems like the obvious one, right? "I saw that scarf on a TV show." How do you see companies implementing this in a consumer-facing way?

So originally I was kind of infatuated with this idea, what you saw, which is neat: it's a clickable video. Just hover over something and buy it, right? And that's really straightforward.

Yes, absolutely.

Yeah, and so that's one of those things that was really straightforward, and I really liked it. But what I found is that the engine that drives it (we have an API that we're launching here at South by Southwest) is much more versatile, because you can use that API to take this data and target advertising based on what's in the video. You can connect videos together now. So if I know I'm watching this one video and it has a bunch of iPhones in it, I can connect these other videos together. It allows you to discover things in video more, so you can go from one video to the next and actually have it connected through a line of content, which I think is pretty powerful.

So you've actually potentially cracked the nut on video recommendation, on related video. I mean, did you just win the Netflix contest?

Yeah, we just won it here, actually.

Congratulations.

We did. I mean, it's one of those things where I was very interested in this clickable video concept, but I think that, at least in the early stages of Veenome, we're going to do more business around the uses of the API for things like that: connecting videos together, doing ad targeting, helping people find videos. So with that
same data I can now find things more easily. If you've ever looked for something on YouTube that doesn't have a hundred thousand views or a million views, there's no chance; you have no way of finding it unless you know what the title of it is.

Yeah. Now, I've got to imagine also: product placement is amazing in TV shows, and a lot of these TV networks are trying to find some sort of product or something that can bridge the gap, because we now know a lot of people are interacting on computers at the same time as they're watching TV shows; they're doing two things at once. So have you talked to some TV networks about how to use this API? We've seen programs like Shazam and others that can hear what's playing on the TV and then say, okay, you're watching this show. So my imagination is running a little wild. Is it a thing where these networks are approaching it like: not only can we hear and know where you are in the show, but ads or recommendations are being served to my screen, now that it knows exactly what's on that screen while being able to listen to it?

Yeah, you just identified another use of the API, which is: if you're not going to make the stuff clickable on the screen, maybe you can pipe that into a second-screen app. If you're watching the show, you pull up your iPad app, and we know how to sync it with the programming. So now we know what show you're watching and what time it's at, and we can say, as you're watching the show, here's a shirt you can buy, here's this thing you can buy. And the network or the video producer can control that experience, right, because it's their app. That's another big piece of it, and the reason why the API is such a strong business case is that the big publishers who have premium content, they
can control the experience. They might not want me to put my little hover dot over their ten-million-dollar episode, and so that's one of those things that I think is pretty suitable for them.

So really what you're saying is there's going to be a bidding war for your company between CBS and Google.

Yes.

Because, I mean, when you describe that algorithm for recommending videos, for daisy-chaining related videos for discovery, that's got YouTube written all over it.

Yeah, it does. And I guess my theory in this is that the data of videos is opaque. Right now a video sits on a site and no one knows what's in it, so someone is going to solve that problem. You can't do anything truly interesting with experiences around video until you have the data behind it, to be able to connect things and find out where things are. So someone's going to do that.

Yeah, exactly: search.

So someone's going to do it, and we think that we're going to be the first people to do it and do it well.

So I have to say it's super impressive, and it's kind of amazing. Honestly, I feel like you are the first credible source that I've talked to who is doing this, and that's five or six years, probably, of people publicly trying to figure out video search and indexing. What kind of geniuses do you have in there?

Well, I'm one of them, you know. [crosstalk] I think a lot of it had to do with looking at the problem differently. This is a little bit technical, but I think it's pretty straightforward. The way the idea came about was I was building iPhone apps and Android apps, and I was using image recognition to do alternate entry: you take a picture of a credit card and the data gets into the thing, instead of having to enter it with your finger, which no one really loves to do. Sixteen
digits, expiration date, CVV code and address is not very fun, so if you can take a picture of the card, that saves some time. What I found was that the experience was really... you have a flash on the camera, you don't get the right data. And I started to see, okay, I just need more chances to get this picture just right. And from that it became, okay, well, I need more frames. That's video.

Now, the challenge with image recognition is that it's all about getting what you can from this one frame, saying I want to glean these patterns from this one image and figure out what the products are in that. But what we do is we say, you know what, let's step back and look at all those frames as a whole, look at them linearly, and say what can we know about all these frames, and then how can we connect them? So that if I'm looking at ten frames, maybe I can find out a little bit more about what's in those things, because I have so many different angles and things like that.

So it's more about looking at the problem a little differently than I think it is raw brain power, because the reality is you could stare at a frame all day and never know that it's an Apple iPad, or never know that it's a Mercedes-Benz car, but 30 frames later you might see. If you can connect that stuff, that's the insight that I think is helping us get there. And we have the processing power now; the computer horsepower is there to process that much.

Yeah, interesting. So you're not trying to attach all-new data to videos, which I think is what video search has tried to do in the past. You're using the data in the video and then matching it up against...

Yeah. I mean, it's hard with the volume of video, right? Billions of videos. I think YouTube does five billion views a day. You can't have people going in there and tagging this [inaudible] unless you
really [inaudible]. There are more videos online than there are people in the world. That's a pretty staggering amount, and they just keep coming, right? Every cell phone camera takes HD video now, so the volume just gets larger and larger.

So now the 65-million-dollar question is: how accurate are you?

So we get asked that a lot, and it's sort of a hard question, because I always say, like, we're really accurate, right? [crosstalk] It depends on the video. But we have an internal QA system. Every time we come out with a new algorithm for how we are connecting the data together and how we're processing the frames, we do blind QA. Everyone in our company (it's sort of the startup atmosphere) does QA. You look at a frame and a tag, and you say, I'm going to give this a 1 to 5 on a sliding scale of specificity and accuracy. So if I'm looking at a Mercedes-Benz, "popsicle stick" is a zero, but "Mercedes-Benz" is a five, and "car" is a three. So there's a sliding scale, and in that way we can see if we're improving with each little tweak and turn of the screw. We think right now we're at a four, or sort of a B, of where we could be, and I think there's a lot of room. That extra mile is a big deal, and that's part of the reason why we provide the manual option: someone can go do a quick find-and-replace and say, you know what, these are actually all Pepsis, just let me do that. Since we're using natural language to process the tags already, they can just go in and say, make all the cans Pepsi cans, and set it and forget it.

So you obviously have this amazing, intriguing product and an algorithm. What's the end goal for you guys?

I just really want to solve this problem. I think it's one of those things that, you know, people, again, like
you said, there are a lot of companies that have come and gone around this idea of clickable video and video search and video discovery. I think there's a lot of stuff that can still be done in video once we solve this problem, and I just really want to be part of that solution.

Also the bidding war between CBS and Google.

Yeah, yeah, me too.

We don't judge you for that. Awesome. Kevin, super cool technology. Thank you for talking to us.

Thank you.

You can find all of our South by Southwest interviews, and of course a lot more interesting video that will hopefully soon be scanned and indexed, at CNETTV.com.
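
The two steps described in the interview (picking key frames where the picture changes by a certain percentage, then merging tags such as "car" and "Mercedes" into "Mercedes car") can be illustrated with a small Python sketch. This is only a toy under stated assumptions, not Veenome's actual method, which the interview does not detail: frames are modeled as flat tuples of pixel values, and the names `key_frames`, `merge_tags`, `brand_to_object` and the threshold value are all hypothetical.

```python
def key_frames(frames, threshold=0.3):
    """Return indices of frames that differ from the previous key frame
    in more than `threshold` (a fraction) of their pixels.
    Frames are assumed to be equal-length tuples of pixel values."""
    if not frames:
        return []
    keys = [0]  # the first frame is always treated as a key frame
    for i in range(1, len(frames)):
        ref = frames[keys[-1]]
        changed = sum(1 for a, b in zip(ref, frames[i]) if a != b)
        if changed / len(ref) > threshold:
            keys.append(i)
    return keys


def merge_tags(tags_per_key_frame, brand_to_object):
    """Combine tags seen across key frames, folding a brand tag into the
    object tag it belongs to ("car" + "Mercedes" -> "Mercedes car").
    `brand_to_object` is an assumed lookup from brand name to object type."""
    seen = []
    for tags in tags_per_key_frame:
        for tag in tags:
            if tag not in seen:
                seen.append(tag)

    merged = []
    for tag in seen:
        if tag in brand_to_object:
            # Skip a brand tag when its object was also seen; it is
            # emitted as part of the combined tag instead.
            if brand_to_object[tag] in seen:
                continue
            merged.append(tag)
        else:
            brands = [b for b in seen if brand_to_object.get(b) == tag]
            merged.append(f"{brands[0]} {tag}" if brands else tag)
    return merged


# Usage: four tiny 4-pixel "frames"; only the third is a big enough change.
frames = [(0, 0, 0, 0), (0, 0, 0, 1), (1, 1, 1, 1), (1, 1, 1, 1)]
print(key_frames(frames, threshold=0.5))  # [0, 2]
print(merge_tags([["car"], ["car", "Mercedes"]], {"Mercedes": "car"}))  # ['Mercedes car']
```

Looking at tags across several key frames, rather than one image at a time, is the point made in the interview: a brand that is unreadable in one frame may be visible 30 frames later.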