Sunday, February 10, 2008

Random read if you have a spare 5 minutes

You know, I thought that unchecked access to vectors via [] would be fine in programs you understand totally yourself. However, a recent medium-complexity C++ program* (a text converter in this case) showed me that this is not the case. It was crashing on the gcc build. I compiled it with VS2005 (to debug) and luckily the debug version traps out-of-range vector [] accesses (thanks Microsoft! I don't say that very often!). It immediately showed me the illegal access!

It's trivial to subclass vector to make Vec (it's in Stroustrup's tC++PL) where [] access causes at() to run - at() is the same as [] but checked, so it throws on an out-of-range index instead of silently reading or writing past the end. I hadn't done it because I thought there was no way I was going to fall foul of this in such 'simple' programs that I had written myself.
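For anyone who hasn't seen it, here is a rough sketch of the Vec idea, written from memory rather than copied from the book. It assumes a C++11 compiler (for the inherited constructors), which is newer than the compilers I mention above; on an older one you would write the forwarding constructors by hand.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Sketch of a range-checked vector: same interface as std::vector,
// but operator[] forwards to at(), which throws std::out_of_range
// when the index is not less than size().
template <typename T>
class Vec : public std::vector<T> {
public:
    // Inherit all of std::vector's constructors (C++11 feature).
    using std::vector<T>::vector;

    // Checked, mutable element access.
    T& operator[](std::size_t i) { return this->at(i); }

    // Checked, read-only element access.
    const T& operator[](std::size_t i) const { return this->at(i); }
};
```

With this in place, something like `Vec<int> v{1, 2, 3}; v[3] = 0;` throws std::out_of_range rather than scribbling over memory. One caveat: std::vector has no virtual destructor, so a Vec should not be deleted through a pointer to std::vector.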

(This isn't meant to be a lecture by the way!)

Which goes to show - perhaps all array access should be checked, all strings/containers should be self-expanding (i.e. avoid buffer overflows), memory should be garbage collected, etc., and the Java people really had it right - in all cases, even for games, embedded, real-time, and all the rest.(*2)

I'm not sure how far Obj-C goes in this regard ... I've asked Stu.

Python has these things. C++ is a sort of poor halfway house from C (if I'm being honest) even if it's blazingly fast(*3). And I'm not suggesting we all convert to Java - just that I think these features should be available in all programming languages regardless - at least as a switchable option. Let's face it - a lot of the ancient BASICs had checked limits on array access.

Some new, big programs that I'm involved in worry me - manual locking with threads, manual memory allocation, etc. These are all sources of bugs that won't necessarily get caught, which leaves us nowhere near a CORRECT and ROBUST program (let alone proving those things). Are we building in the same faults that our current systems suffer from - great from the outside but an extension and maintenance nightmare in 10 years' time?

Also - why does it appear that Java is so much slower than C++ on the applications we run at the moment? It can't just be the limit checks outlined above. And surely the JIT compilers are fast? Is it the type of applications that are written with it - i.e. network heavy and dependent on the response time of remote servers?


*NOTE1: The reason it's C++ (when all the other text-processing programs I was writing are Python) is the number of cross-references across the several hundred files I have to parse. Even though I keep duplicate data in both vectors and maps to make access hyper fast, it still takes many minutes to run over the code-set I'm processing.

*NOTE2: I'm aware of things like programming by contract, defensive programming, leaving your assertions in production code, etc.

*NOTE3: I won't even mention that it appears that only 50% of C++ compilers complain about uninitialised variables being used... Grrrr...

