How slow is dynamic_cast?

C++ users are advised frequently not to use dynamic_cast, because it's slow.

I thought it would be nice to measure this, so I made up a test on Visual C++. All you need is a simple hierarchy of types with a baseclass, a derived class, and another class derived from that.

The test is this: if you only know that an object is derived from the baseclass, prove whether or not it's of a certain type and (if so) call an appropriate method on it. And you typically want to be able to measure how fast you can determine that a base class is not of a certain type too.

The typical workaround to dynamic_cast is to make a base virtual method like this:

virtual int32 Name() = 0;

I measured that too.

Here's how things wind up in my test:

Base case
  • Use reinterpret_cast on a known type, call a simple 2-cycle method.
  • Result: 1.1B operations/sec (half my CPU speed, good)
Virtual case
  • Implement a virtual method per derived class, call it, check the result. If it matches what we want, follow the base case.
  • Result: 270M operations/sec for positive and negative tests
dynamic_cast case
  • Calls dynamic_cast, and if non-NULL result, calls a method.
  • Proof positive (is the type you want): 21M operations/sec
  • Proof negative (NULL result, single derivation): 15M operations/sec
  • Proof negative (NULL result, two-level derivation): 11M operations/sec
Conclusions

dynamic_cast costs a lot. A simple call can use 100-200 cycles under Visual C++. Calling a virtual function (or checking a base-class member) and testing the result can run up to 20x faster.

However, it is more general than the other listed methods (because you can cast to intermediate types), and it doesn't use extra space in your binary like a bunch of virtual functions do.

But it's not so advisable in performance-critical code.

6 comments:

  1. Anonymous4:37 PM

    Post code, please, so we can post follow-ups or poke holes :)

    ReplyDelete
  2. Interesting results. Thanks for posting.

    John

    ReplyDelete
  3. It does cost you space in the binary -- the compiler creates a struct describing each and every type that has at least one virtual function. that struct then stores the full name of the class (thus leaking all of your naming conventions and making code easier to reverse-engineer). The whole thing then works by comparing these strings which all have long common prefixes (i.e. namespace names). A system that doesn't encoding types with 4-letter codes (that's so 1984) is using GUIDs. Scales well, too (unlike ints).
    Thanks for the timings, I was always wondering how slow dynamic cast really was.
    Alexei Lebedev

    ReplyDelete
  4. I have an even more extreme case of this. I've found that on Intel x64 code, when a process uses > 3GB of memory and you are calling dynamic_cast<> on two-level derivation with 50 methods in vtable, dynamic_cast<> can cost 8us on a 2.4Ghz Q6600, or only 125K ops/sec! Getting to the bad case is actually very hard and involves certain sequences of memory allocation. I have no idea why this is so.

    ReplyDelete
  5. Can you please compare dynamic_cast<> on gcc vs MSVC?

    Ive heard horror stories of what MSVC has to do to get dynamic_cast to work and I'm not sure it reflects properly on the rest of the C++ compilers.

    ReplyDelete
  6. Note that dynamic_cast covers much more than the simple name method. With

    class B : public A {};
    class C : public B {};
    A * a = new C;

    dynamic_cast will allow you to cast a to B *, while the simple Name() method will not. Matters are further complicated by multiple inheritance.

    Still, it's not understandable that it should cost that many cycles.

    ReplyDelete