For developers: reversible debugging tool for Android [ARM TechCon 2015]

2015-11-19
Gary Sims: Hello, my name's Gary Sims from Android Authority. Today I'm at ARM TechCon 2015 and I've come to Undo Software to speak to Greg Law. Hi Greg, how are you doing?

Greg Law: Yeah, good.

Gary Sims: Tell me something about what you offer.

Greg Law: Right, so we help software developers to understand what their code is really doing. All too often the reality of what a program does is almost, but not quite, what the developer expected it to be, and this is debugging, right? So understanding your code at that deep level is very important. What we can do is record, a little bit like CCTV, the execution of a program as it runs and allow the developer to wind that tape back and forth. They can take a recording, take it away, load it up somewhere else, at a different time and a different place, and see exactly what happened, right down to the instruction level.

Gary Sims: Wow, great.

Greg Law: So I'd love to show you a demo. We can see it in action.

Gary Sims: That would be great, yes, very much.

Greg Law: All right, so this is my little demo program here. It's a very small program so we can understand it for the purpose of this demo. I'm demoing it in Eclipse. Obviously you don't have to use Eclipse. In fact, what we have is a reversible execution engine that different debuggers can plug into. GDB is one, so you can use it anywhere you use GDB: in Eclipse, we have customers who use it from within Emacs, or on the command line. Also ARM's DS-5. In fact we're inside DS-5, as the feature known as Application Rewind. And others as well. But I'm going to show it in Eclipse here.

So this is my little program, and all it does is store values and their square roots in a cache. It's very simple: an array of a hundred elements, mapping values onto square roots. The function we care about today is called cache_calculate. Give it a value and it will loop through the cache, a really dumb linear search, and either find a match and return the square root, or miss the cache, get the square root, put it in the cache, and then populate one entry either
side, on the basis that there's some kind of locality of reference, and then it returns the square root. The main function is just a unit test that loops forever, getting random values, passing them into cache_calculate and checking that what is returned really is the square root.

So I'm in the debugger, ready to go. Let me run the program, and it's crashed. Right, this demo is supposed to crash. So here we are inside the C library. There's no debug information here, so it's just machine code. That's fine. You can do what you do in every debugger. What programmers always do is look up the call stack, because debugging, as I said, is that process of: reality has deviated from my expectations, and I need to find out why that happened and where the source of the problem is. Where did that deviation of reality from expectations first happen?

Gary Sims: Cool, that's very useful. You can see how you got here.

Greg Law: So we look up the call stack, and any debugger will do this. It gives you a kind of best guess based on what's in the registers and what's on the stack, but usually it's fine. Unless you've smashed the stack or something, you can see where you've been. And we can see here that cache_calculate was given a passed-in value and it returns the square root. I can look up here and see that we passed in 255 and it returned 0. So this clearly is a bug: 0 is not the square root of 255, and I need to know why cache_calculate returned what it did. Why did that happen?

Now, normal debuggers can't take you any further at this point. We've got the call stack, that sliver of execution history, and if what you want is in there, great, but all too often it's not. What we can do is rather better. I'm going to hit this button here, which is "uncall", which is like popping up the call stack, but it's no longer a guess. All the global state has gone back to what it was before. And now, more interestingly
than that, I can start to step back in time.

Gary Sims: Wow.

Greg Law: So if I click this back button, we're actually unwinding the program's execution. All the globals are going back to what they were. We've now gone back to a point in time just after cache_calculate returned. I'm at the top of this line and cache_calculate has just returned, so if I reverse step into, I can actually step into cache_calculate and see exactly what it did. How did that happen?

So we step back to here. Now, it's returning from the cache, so this is looking like some kind of corruption of the cache, which is always a horrible kind of bug to look at. It's returning the i-th entry of the cache, and I can see here that i is 90, so it's returning the 90th entry of the cache. Let me just come across and quickly do some typing so I can look at the 90th entry in the cache. We can see, sure enough, it contains garbage. I've got data corruption in my cache. I don't know whether that was a pointer error, a logic error or a threading error. I've got no idea who stomped on that data.

But what I can do here, to really powerfully answer that "how did that happen?" question, is add a watchpoint. Sometimes these are called data breakpoints. Usually in a debugger what you would do is set that watchpoint and run the program forward until the data changes. What I'm going to do is run backwards until it changes. That's going to be the line of code that wrote to that data structure. So here we go, back in time.

We've gone back in time now to a point in the past where the cache contains good data: the square root of 40 really is six. If I step forwards, this is a little bit like action replay, watching sports on the television. So if I step forwards, watch the data in the top right-hand corner. You can see: step, step, that's it. That's the corruption happening right there. So let's back up a little bit and see what really is going on. This is definitely the smoking gun: we're writing value2 and its square root into the
cache. Let me go up here and have a look. I can see that value2 is minus 1, so I've tried to take the square root of minus one and it's giving me zero, because you can't do that. So again, the question once again: why did that happen? Actually, at this point we can kind of get away with code inspection, but this is a demo, so let's just keep going. Let's add another watchpoint, to value2, and go back again.

So we go back in time. OK, this is where value2 is being set. value2 is being set to value minus 1, and value is 0. So here's our bug: we called the function with the value of zero, it returned the right thing, and as a side effect it left one entry in my cache corrupted. I didn't see that until some time later.

Now this obviously is a very small, canned demo, but actually it's a canned demo of a real-life bug, one of our early real victories with one of our early customers, Cadence. They're the guys who write all that software for chip design and simulation, and one of their biggest customers was having a problem. They were trying to tape out, they were running the simulation, the simulation ran for eight hours, and about one run in 300 the simulator would crash. So Cadence engineers were on site for three months, looking at the core file. The core file contains a minus one where there should be a pointer, but there's no, you know, "how did that happen?". There's no answer to how it got into that state.

So that's when they came to us. They deployed UndoDB. They had to run it a bunch of times, because it only failed one in three hundred runs, and there is some overhead, there is some slowdown. It took eight hours to run normally; inside UndoDB it took about 20 hours. But they just ran it on a little server farm, a bunch of machines, again and again and again until eventually they caught it. They put a watchpoint on that minus one, they went back in time, and they had it fixed in three hours, after three
months of getting, you know, nowhere.

Gary Sims: Fantastic.

Greg Law: So it's a great story, and it really shows the power of this stuff. But I always say it's not just for those really extreme cases. Obviously it's very useful for bugs that otherwise wouldn't get fixed, but if you can repeatedly turn an afternoon debug session into ten minutes, then that's a good win as well.

Gary Sims: Absolutely. Tell me a bit about operating system support. You're on Android and Linux?

Greg Law: Android and Linux, yeah.

Gary Sims: Any particular version of Android?

Greg Law: Anything, really. We need the Linux kernel or the Android kernel to be 2.6 or later, which these days is basically anything. And that's it. ARM 32-bit today; 64-bit has just been announced last week, so 64-bit ARM support is in beta right now. And also x86, 32 and 64-bit.

Gary Sims: Fantastic. And if people want to find out more, where do they go?

Greg Law: They go to our website, undo-software.com, and you can find everything you need from there.

Gary Sims: That's excellent. Thank you very much.
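
The demo's source code is not shown in the transcript, so the following Python model is only a rough sketch of the bug as described: a linear-search cache that, on a miss, also stores one neighbour either side. Two details are assumptions made purely for illustration: that values are held in 8-bit unsigned storage (so -1 wraps to 255), and that the root of a negative number comes out as 0, as Greg reports. The names SqrtCache, write_log and last_writer are hypothetical. The write log stands in, very loosely, for a recording: scanning it backwards plays the role of a watchpoint run in reverse.

```python
import math

CACHE_SIZE = 100  # the talk mentions an array of a hundred elements


class SqrtCache:
    """Toy square-root cache that also logs every write it performs."""

    def __init__(self):
        self.entries = [None] * CACHE_SIZE  # each entry: (value, root) or None
        self.next_slot = 0
        self.write_log = []  # (slot, stored_value, root, source_value), oldest first

    def _store(self, value):
        # Assumption: values live in 8-bit unsigned storage, so -1 wraps to 255.
        stored = value & 0xFF
        # Assumption: the root of a negative number comes out as 0, as in the talk.
        root = math.isqrt(value) if value >= 0 else 0
        slot = self.next_slot
        self.entries[slot] = (stored, root)
        self.write_log.append((slot, stored, root, value))
        self.next_slot = (slot + 1) % CACHE_SIZE

    def find_slot(self, value):
        """Linear search; returns the slot index holding value, or None."""
        for slot, entry in enumerate(self.entries):
            if entry is not None and entry[0] == value:
                return slot
        return None

    def calculate(self, value):
        slot = self.find_slot(value)
        if slot is not None:
            return self.entries[slot][1]  # cache hit
        # Cache miss: store the value, then one neighbour either side.
        self._store(value)
        self._store(value - 1)  # BUG: for value == 0 this stores -1
        self._store(value + 1)
        return math.isqrt(value)

    def last_writer(self, slot):
        """Scan the write log backwards, like a watchpoint run in reverse."""
        for record in reversed(self.write_log):
            if record[0] == slot:
                return record
        return None


cache = SqrtCache()
first = cache.calculate(0)      # correct result, but corrupts a neighbour entry
wrong = cache.calculate(255)    # hits the corrupted entry
bad_slot = cache.find_slot(255)
culprit = cache.last_writer(bad_slot)
print(first, wrong, math.isqrt(255), culprit)
```

The last element of the culprit record is the value the writer was originally given (-1), which points back to the earlier call with 0, mirroring the two backward watchpoint steps in the demo. A real reversible debugger records the program's execution rather than relying on an application-level log like this one.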