GCC C++ Link problems on small embedded target
Thought someone might be interested in this...
I spent Thursday night and Friday night investigating some link errors I was getting on my robot project, which is a small embedded ARM7 target compiling C++ code under GCC.
So here is the detail of the errors:
speed_control.o: In function `~Speed_Control':
source/speed_control.cpp:35: undefined reference to `operator delete(void*)'
speed_control.o: In function `~Sensing_Callback':
source/motor_sensing.h:38: undefined reference to `operator delete(void*)'
speed_control.o:(.rodata._ZTI13Speed_Control[typeinfo for Speed_Control]+0x0): undefined reference to `vtable for __cxxabiv1::__vmi_class_type_info'
speed_control.o:(.rodata._ZTI16Sensing_Callback[typeinfo for Sensing_Callback]+0x0): undefined reference to `vtable for __cxxabiv1::__class_type_info'
speed_control.o:(.rodata._ZTV16Sensing_Callback[vtable for Sensing_Callback]+0x8): undefined reference to `__cxa_pure_virtual'
All of these problems were caused by the fact that I'm not using the C++ libraries - either the compiler's C++ support libraries or the standard libraries. The very summarized version is that none of them were particularly hard to overcome - once you know why they're happening! This post is about the causes.
Why aren't I using the standard stuff? Because I don't have much flash or RAM. Well, actually I have 128K of Flash (which I expect I won't use up) and 60K of RAM (a significant fraction of which I'm going to use for a large data store ... all will be revealed in a later blog entry). It's quite common for embedded systems to roll their own nearly everything. I do link against certain libraries. But certainly nothing like heap management, cout or printf.
Undefined References to class_type_info
Starting with the undefined references to
`vtable for __cxxabiv1::__vmi_class_type_info'
and
`vtable for __cxxabiv1::__class_type_info'
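- these are related to RTTI (run-time type information) as suspected. They are the vtables of the type_info classes that live in the C++ support library, and each class's RTTI (typeinfo) table starts with a pointer into one of them. The class's own vtable in turn holds a pointer to its typeinfo table.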
This is the vtable:
507 _ZTV16Sensing_Callback:
508 0000 00000000 .word 0
509 0004 00000000 .word _ZTI16Sensing_Callback
510 0008 00000000 .word __cxa_pure_virtual
511 000c 00000000 .word _ZN16Sensing_CallbackD1Ev
512 0010 00000000 .word _ZN16Sensing_CallbackD0Ev
As you can see, the word at offset 4 points at the typeinfo table:
492 _ZTI16Sensing_Callback:
493 0000 08000000 .word _ZTVN10__cxxabiv117__class_type_infoE+8
494 0004 00000000 .word _ZTS16Sensing_Callback
Obviously those are mangled names ... and I've left off a whole other block. This is the pure abstract base class (effectively an interface) of a concrete class.
Since I'm not using RTTI at all, we can get rid of this by adding "-fno-rtti" to the gcc command line options. Both errors go away.
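As a rough illustration (not taken from the project itself): GCC predefines the macro __GXX_RTTI only when RTTI is enabled, so a source file can tell which mode it is being built in. The sketch below reuses the two class names from the linker errors, but the members are made up, and the hierarchy is simplified to single inheritance so that it builds on a PC.

```cpp
// Host-buildable sketch. Only the class names come from the linker output
// above; on_sample() and last are hypothetical.
#include <cassert>

struct Sensing_Callback {
    virtual void on_sample(int value) = 0;   // hypothetical interface method
    virtual ~Sensing_Callback() {}
};

struct Speed_Control : Sensing_Callback {
    int last;
    Speed_Control() : last(0) {}
    void on_sample(int value) override { last = value; }
};

// GCC defines __GXX_RTTI when RTTI is on and leaves it undefined
// when the file is compiled with -fno-rtti.
constexpr bool rtti_enabled() {
#ifdef __GXX_RTTI
    return true;
#else
    return false;
#endif
}

// Virtual dispatch goes through the vtable, not the typeinfo tables, so it
// keeps working with -fno-rtti. typeid and dynamic_cast are what you lose.
int dispatch_without_rtti() {
    Speed_Control sc;
    Sensing_Callback &cb = sc;
    cb.on_sample(42);
    assert(sc.last == 42);
    return sc.last;
}
```

The same file should compile with or without -fno-rtti, because nothing in it uses typeid or dynamic_cast.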
Undefined Reference to `__cxa_pure_virtual'
This one is interesting. Effectively it's a function that is called if you actually (somehow) call a pure virtual member function. Remember that you declare these without a definition, by putting =0 after the member function declaration in the class. As you know, this does two things: it forces you to define the function in any derived class you want to instantiate, and it stops you creating an object of the base class itself.
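Those two effects can be checked at compile time. This is only a sketch with hypothetical class names (Callback and Counter are not from the robot project), using std::is_abstract from the standard type_traits header, so it is meant for a hosted build rather than the bare-metal target.

```cpp
// Hypothetical classes, for illustration only.
#include <cassert>
#include <type_traits>

struct Callback {
    virtual void notify() = 0;    // pure virtual: declared, no body here
    virtual ~Callback() {}
};

struct Counter : Callback {
    int hits = 0;
    void notify() override { ++hits; }
};

// A class with a pure virtual function is abstract, so "Callback c;" would
// not compile. Counter overrides notify(), so it can be instantiated.
static_assert(std::is_abstract<Callback>::value, "Callback is abstract");
static_assert(!std::is_abstract<Counter>::value, "Counter is concrete");

int notify_twice() {
    Counter c;
    Callback &cb = c;     // calls go through the interface
    cb.notify();
    cb.notify();
    assert(c.hits == 2);
    return c.hits;
}
```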
I guess you'd have to be hacking the vtable or doing some very bizarre casting to get this at all without the compiler spotting it. Either way, I think it's part of the C++ ABI that gcc follows rather than the language standard itself - I believe gcc's support library prints a message and terminates.
eCos has some information on it, as does the OS Dev Wiki.
http://sourceware.org/ml/ecos-patches/2003-03/msg00209.html
http://www.osdev.org/wiki/C_PlusPlus
We just sit in a loop, because I think it will never happen to us.
extern "C" void __cxa_pure_virtual(void)
{
    // call to a pure virtual function happened ... wow, should never happen ... stop
    while(1)
        ;
}
Undefined Reference to operator delete(void *)
The final one's quite good ... and Google really didn't help direct me to the information in any sort of quick way.
So, I don't use new and delete - because we haven't got a heap, and the memory on the single-chip microcontroller is not really large enough to use that type of memory management (certainly not with my usage). It's all stack based and static objects for us. But that's ok.
So why is gcc generating a delete? Turns out there is more than one destructor in gcc (and there can also be two types of constructor - but I'll just cover destructors here). If the destructors are virtual they will appear in the vtable (there are two in the vtable above - but another class has three). How gcc decides to generate 2 or 3, I haven't found out.
In summary, these three are:
- in-charge deleting (the destructor also frees the object's memory by calling operator delete) ... has D0Ev at the end of the mangled name. The C++ ABI calls this the deleting destructor.
- in-charge (the destructor also destroys any virtual base classes) ... has D1Ev at the end of the mangled name. The ABI calls this the complete object destructor.
- not-in-charge (the destructor does NOT destroy virtual base classes) ... has D2Ev at the end of the mangled name. The ABI calls this the base object destructor.
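To see the difference between the first two in action, here is a small sketch for a hosted build. Widget, pool and the counter are hypothetical names, and the class-specific operator new/delete are only there so the example can count deallocations without touching the global operators.

```cpp
// Hypothetical example: which destructor variant ends up calling a
// deallocation function?
#include <cassert>
#include <cstddef>

namespace {
int deallocations = 0;                              // calls to Widget::operator delete
alignas(std::max_align_t) unsigned char pool[64];   // one-slot stand-in for a heap
}

struct Widget {
    virtual ~Widget() {}
    // Class-specific allocation functions: "new Widget" and "delete p"
    // use these instead of the global operator new/delete.
    static void *operator new(std::size_t size) {
        assert(size <= sizeof pool);
        return pool;
    }
    static void operator delete(void *) { ++deallocations; }
};

int deallocations_after_stack_object() {
    deallocations = 0;
    {
        Widget w;          // destroyed by the in-charge (D1) destructor
    }
    return deallocations;  // nothing was freed
}

int deallocations_after_delete() {
    deallocations = 0;
    Widget *p = new Widget;  // storage comes from pool
    delete p;                // virtual call to the in-charge deleting (D0) destructor
    return deallocations;
}
```

The stack object never reaches a deallocation function, but the deleting destructor still has to exist because the vtable has a slot for it, which is why the linker wants an operator delete even in a program that never writes delete.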
So why the difference between in-charge and not-in-charge? Well, it's got to do with virtual inheritance (as opposed to virtual member functions). The summarised version is that with multiple inheritance, several inheritance paths can lead to the same base class, and virtual inheritance gives us one shared copy of that base object rather than a copy for each path.
The rules say, to avoid trying to 'destruct' these common base classes multiple times, that only the most-derived class can sort out calling the destructors of the virtual base classes (and this is probably the simplest method, anyhow).
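A minimal diamond shows the rule at work. All the class names here are hypothetical, and the counter exists only so the behaviour can be observed on a PC build.

```cpp
// Hypothetical diamond hierarchy with one shared virtual base.
#include <cassert>

namespace {
int base_destroyed = 0;   // how many times ~Base has run
}

struct Base {
    virtual ~Base() { ++base_destroyed; }
};
struct Left  : virtual Base {};
struct Right : virtual Base {};
struct Most_Derived : Left, Right {};   // holds exactly one Base subobject

int base_destructor_runs() {
    base_destroyed = 0;
    {
        Most_Derived d;
    }
    // The in-charge destructor of Most_Derived destroys Left and Right
    // through their not-in-charge destructors, which leave the virtual
    // base alone, and then destroys the single Base itself.
    assert(base_destroyed == 1);
    return base_destroyed;
}
```

If Left and Right each destroyed Base on their own, the counter would reach 2 and the shared subobject would be destroyed twice.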
Also note that having multiple destructors is not the only way of handling this ... earlier versions of gcc passed a parameter into the destructor and generated code to select the desired operation. However, you pay that speed overhead all the time. Extra entries remove this problem - because you can call the one you want directly.
The 'in-charge deleting' version is, I'm guessing, used when you destroy an object by calling delete on it. That way the code doing the delete only needs to arrange a call to the destructor, and the destructor frees the memory. Of course this will always be "in-charge", since the object being deleted is likely to be at the top of a hierarchy ("most-derived").
Some more information for this topic, mainly about virtual inheritance:
The OS Dev Wiki touches on the solution: http://www.osdev.org/wiki/C_PlusPlus
This GNU list describes the virtual inheritance stuff, what's called when: http://lists.gnu.org/archive/html/bug-gnu-utils/2004-07/msg00042.html
Notes about the implementation: http://gcc.gnu.org/ml/gcc-patches/2000-04/msg00403.html
Actual details of what's called when from the closed items of a bug tracking system. Search for C-5 and C-6 - there are quite a few more details. C-4 is interesting as well. http://www.codesourcery.com/cxx-abi/cxx-closed.html
My actual solution? Added this to my project...
void operator delete(void *)
{
    // should never get here ... we don't use new
    while(1)
        ;
}
And now it builds.
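One of the comments below suggests making the destructors non-virtual instead. Here is a rough sketch of that idea, assuming objects are never destroyed through an interface pointer (which holds when everything is static or on the stack). The class names are hypothetical. With no virtual destructor there are no destructor slots in the vtable, so the deleting variant should not be emitted at all.

```cpp
// Hypothetical interface with a protected, non-virtual destructor.
#include <cassert>
#include <type_traits>

class Tick_Listener {
public:
    virtual void tick() = 0;
protected:
    // Protected and non-virtual: code outside the hierarchy cannot destroy
    // an object through a Tick_Listener pointer or reference.
    ~Tick_Listener() {}
};

class Tick_Counter final : public Tick_Listener {
public:
    int ticks = 0;
    void tick() override { ++ticks; }
};

static_assert(!std::has_virtual_destructor<Tick_Listener>::value,
              "interface destructor is deliberately non-virtual");

int count_three_ticks() {
    Tick_Counter counter;                 // stack or static storage only
    Tick_Listener &listener = counter;    // used through the interface
    for (int i = 0; i < 3; ++i)
        listener.tick();
    assert(counter.ticks == 3);
    return counter.ticks;
}
```

The trade-off is that this only stays safe while nobody deletes through the base class, which the protected destructor enforces at compile time.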
7 Comments:
Thank you! This is exactly the type of explanation I was looking for that fit my project.
Another solution to the delete problem is to make all destructors non-virtual. This is preferable in non-embedded contexts where you might link to a library at runtime (such as a plugin) which does use the delete operator, in which case you want it to use the one in libstdc++ instead of the fake one. The -Wno-non-virtual-dtor option to GCC is useful in this case.
Thanks, it really helped! :)
Actually, you wouldn't have to do anything that bizarre to call a pure virtual function. In my experience, I have found two examples of code which would call a pure virtual function.
One is from stackoverflow.com. However, gcc (4.6) catches this one with "warning: pure virtual (...) called from constructor [enabled by default]".
Scott Meyers in his "Effective C++ in an Embedded Environment" gives a much more evil example. Check out this paste. Notice that gcc emits no warning. Now isn't that evil? :)
dare2be: Very nasty. At least terminate is called in a controlled manner... although whether you could tell that on all embedded (or even PC targets) is a different matter...
Destruction problems (as mentioned in the stack overflow article) could be harder to track down.
As a side note, a while after writing this post, I decided new and delete would be useful even on a small RAM target - so I wrote a very small allocator/deallocator. Yes, don't write an allocator at home, it's a waste of time :-)
A great investigation into these errors! I am also trying to wrestle with these compiler errors when compiling C++ for the embedded PSoC platform. Most of them came around when I tried to use virtual functions.
Thank you! This was really helpful.