About Debugging
Debugging is a strange thing. You do your design, write your code, and then the code you've written doesn't work. Of course, you might not know it doesn't work. Sometimes you will take a chance, run the code and look at the results - occasionally I will, especially if I have only made a small change I'm confident in. But mostly I go straight into a debugging mode of some sort.
Usually I develop incrementally: write a bit, debug a bit and repeat. This means that I know straight away if I'm doing something stupid. You also know exactly where the problem must have been introduced, so no complex debugging strategies are usually required. Of course, as the problem as a whole gets larger there can be complex interactions - no way of debugging is sure-fire.
This style of debugging usually involves running test data through the program and, if things go wrong, adding some sort of print statement to show me the data or the program flow. This seems a very simple and effective way of debugging.
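To make that concrete, here is a minimal sketch of print-statement debugging in Python. The `parse_record` function and its field names are hypothetical, purely for illustration; the temporary `DEBUG` prints are the kind of thing I'd drop in and later remove.

```python
# Hypothetical routine under test; the DEBUG prints are temporary
# trace statements showing the data at each step.
def parse_record(line):
    fields = line.strip().split(",")
    print(f"DEBUG parse_record: raw={line!r} fields={fields}")
    return {"name": fields[0], "count": int(fields[1])}

record = parse_record("widget,3\n")
print(f"DEBUG result: {record}")
```

Running a line of test data through it shows both the intermediate split and the final result, which is usually enough to spot where things go off the rails.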
Sometimes you have to write a significant chunk of code where debugging incrementally is not possible. This is usually some complicated set of routines where writing test harnesses, or forcing data in, would produce a lot more work than debugging the whole thing at once. Such code quite often has a lot of internal dependencies. I've no real problem with this - after all, I've found I'm not bad at writing code with a low error rate.
Often this demands a rather different approach, using a debugger so that you can see where the flow of the code is going and what data is being operated on.
In Python I tend to avoid the debugger. It's not that it's bad, just that the one I normally use is the basic text debugger, which isn't as easy to use as a GUI-driven debugger. Additionally, with Python, you definitely need to test every code path. I agree it's a good strategy in any language - and my experience with Python is that if you don't, your code will not work: there will be some runtime error lurking in some code path. This sounds a tad superstitious - but most of the checks in Python are done at runtime.
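A tiny illustrative example of why every code path matters in Python (the `safe_divide` function and the deliberate typo are mine, not from any real project): a misspelled name only blows up when the branch containing it actually runs, because Python resolves names at runtime rather than at compile time.

```python
default_value = 0

def safe_divide(a, b):
    if b != 0:
        return a / b
    else:
        return defalt_value  # typo: meant default_value; only fails if b == 0

print(safe_divide(10, 2))  # the buggy branch never executes here
try:
    safe_divide(1, 0)
except NameError as e:
    print("only caught at runtime:", e)
```

A casual test with `b != 0` would pass every time; you have to actually drive the program through the error branch to see the `NameError`.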
I know people have criticised Java for throwing up run-time errors - and perhaps Java's runtime errors are different. Also, I haven't written very large programs in Python (usually the leverage of the libraries means that's not necessary). However, at least in my experience, all you need to do is make sure that you've run every line of code, with a reasonable set of data, before you declare it done. That's not all that bad - and the fact that you can get away with not running every single line of code in languages like C and C++ is not necessarily a benefit. I guess you can get away with it most of the time because compile-time checking usually warns you of stupid syntax or type errors. However, in reality, not running every line of code is a mistake in my opinion - it leaves you open to not fully understanding the code you've written and its associated problems.
This brings me onto another point about C and C++. For some reason - and I guess it's a mixture of the language, the libraries, the way applications are written and the type of applications that I write in C and C++ - I find that a debugger is much more important there, and a good debugger can save a lot of work. It's quite often harder to put a good range of test data through a C or C++ routine. One way round this is a test harness: a set of code (not part of the application) designed to test a specific routine, module or class.
Whilst test harnesses are definitely the right approach in some circumstances, they certainly aren't in all. For a start, the harness for a specific piece of code can be significant because of the number of test cases. The module might not just need calling but might also need all the routines it calls replaced - so it might need a separate project with a lot of support code. There comes a point where the effort required to produce the harness - which includes debugging the harness itself - can be way more than that of the code being tested. Remember, the object of the exercise is to know that a piece of code works - not to stick to one method. Faster (but effective) solutions should be seriously considered.
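The post has C/C++ in mind here, where the replaced routines would be stub functions linked in place of the real ones; as a rough sketch of the same idea in Python (all names hypothetical), a harness can inject a stub for a dependency so the routine under test runs in isolation:

```python
# Real routine - imagine it hits a database and can't run under test.
def fetch_price(item):
    raise RuntimeError("not available under test")

# Routine under test; the dependency is a parameter so it can be replaced.
def apply_discount(item, rate, price_lookup=fetch_price):
    return price_lookup(item) * (1 - rate)

# Harness: a stub standing in for the real fetch_price.
def stub_price(item):
    return {"widget": 10.0}.get(item, 0.0)

assert apply_discount("widget", 0.2, price_lookup=stub_price) == 8.0
print("harness passed")
```

Even in this toy form you can see the cost: the stub, the canned data and the harness logic are all extra code that itself has to be got right.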
The other way is to test in place, usually with a debugger to examine the state before, during and after. Stepping through each line once very often illuminates errors or problem cases (not necessarily ones happening in the current run). Get the program under test to perform actions that provide boundary conditions for the code under test. Sometimes this is via program input data (files, user actions, etc.), but other times the rest of the program needs to be modified temporarily to 'stress' the module under test and get the target code to perform in all the required modes of operation. If it's an important test that needs to be repeated when the code is modified, the test code can be left in the source but conditionally compiled out. (Of course, if it's not required then take it out, since more lines of code = harder to change. These bits of code can always be re-made if the test needs to be repeated.)
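In C or C++ the "conditionally compiled out" part would be an `#ifdef` block; a rough Python analogue (with a hypothetical `buffer_push` routine) guards the in-place stress test behind a module-level flag:

```python
# Rough analogue of test code left in the source but compiled out:
# in C/C++ this would sit inside #ifdef STRESS_TEST ... #endif.
STRESS_TEST = False  # flip to True to re-run the boundary-condition test

def buffer_push(buf, item, limit=3):
    if len(buf) >= limit:
        raise OverflowError("buffer full")
    buf.append(item)

if STRESS_TEST:
    # Boundary test: fill to the limit, then confirm the overflow path.
    b = []
    for i in range(3):
        buffer_push(b, i)
    try:
        buffer_push(b, 99)
    except OverflowError:
        print("overflow path exercised")
```

The test stays next to the code it exercises and costs nothing at run time while the flag is off, but it's still extra text to read and maintain - which is the trade-off the paragraph above describes.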
Of course there are many more approaches to debugging (e.g. support programs). But these are the ones that I use the most.
Apple's Xcode debugger uses gdb underneath. They've added some nice features recently to the GUI front end. Breakpoints especially are getting powerful: you can do things like hit counts, and you can also set them up to log. I guess this means that rather than sprinkling prints over the code you can just click and it will log. This seems neater somehow and I really fancy trying it out.
Additionally, you can run a program when a breakpoint is hit. This is very cool - because you could set an iChat status, send an email, get another program to change one of the data files that the application is using, and so on.
Finally, I remember when debugging meant not running anything else on the computer. Of all the things that have changed in debugging, I think this is my favourite.