J@ArangoDB

{ "subject" : "ArangoDB", "tags": [ "multi-model", "nosql", "database" ] }

C++ Constructors and Memory Leaks

Preventing leaks in throwing constructors

The easiest way to prevent memory leaks is to create all objects on the stack and not use dynamic memory at all. However, this is often not possible, for example because stack size is limited or objects need to outlive the caller’s scope.

Another way to prevent memory leaks and leaks of other resources is obviously to employ the RAII pattern. How can it be used safely and easily in practice, so memory leaks can be avoided?

This post will start with a few seemingly working but subtly broken techniques that illustrate a few common pitfalls. Later on it will provide a few very simple solutions for getting it right.

None of the solutions here are new or original.

I took some inspiration from the excellent constructor failures GotW post. That post doesn’t cover smart pointers and is not explicitly about preventing memory leaks, so I put together this overview myself.

Naive implementation

Let’s pretend we have a simple test program main.cpp, which creates an object of class MyClass on the stack like this:

main.cpp
#include <iostream>
#include "MyClass.h"

int main () {
  try {
    MyClass myClass;
    std::cout << "NO EXCEPTION" << std::endl;
  }
  catch (...) {
    std::cout << "CAUGHT EXCEPTION" << std::endl;
  }
}

The above code creates the myClass instance on the stack, so the instance itself will not leak any memory. If creation of the myClass instance fails for whatever reason, the instance never existed, and the memory for holding a MyClass object is reclaimed automatically. If object creation succeeds and the object goes out of scope at the end of the try block, the object’s destructor is called and its resources can be freed, too.

Obviously this is already good, so let’s keep it as it is and have a look at the implementation of MyClass now. This class will manage two heap objects of type A, which are created using the helper function createInstance:

MyClass.h
#include <iostream>
#include "A.h"

struct MyClass {
  A* a1;
  A* a2;

  MyClass ()
    : a1(createInstance()),
      a2(createInstance()) {

    std::cout << "CTOR MYCLASS" << std::endl;
  }

  ~MyClass () {
    std::cout << "DTOR MYCLASS" << std::endl;
    delete a1;
    delete a2;
  }
};

For completeness, here is class A. It won’t manage any resources itself:

A.h
#include <iostream>

struct A {
  A () {
    std::cout << "CTOR A" << std::endl;
  }
  ~A () {
    std::cout << "DTOR A" << std::endl;
  }
};

// helper function for creating an instance of A
// (declared inline because it is defined in a header file)
inline A* createInstance (bool shouldThrow = false) {
  if (shouldThrow) {
    throw "THROWING AN EXCEPTION";
  }
  return new A;
}

Throughout this post, the code of A.h will remain unchanged.

Compiling and running the initial version of main.cpp will produce the following output:

output of naive implementation
CTOR A
CTOR A
CTOR MYCLASS
NO EXCEPTION
DTOR MYCLASS
DTOR A
DTOR A

Valgrind also reports no memory leaks. Are we done already?

Introducing exceptions

No, because everything still went well. Let’s introduce exceptions into the picture and check what happens then.

Let’s first introduce an exception in the constructor of MyClass. We’ll make the createInstance function throw on second invocation (we do this by passing a value of true to it):

constructor throwing an exception
MyClass ()
  : a1(createInstance()),
    a2(createInstance(true)) {

  std::cout << "CTOR MYCLASS" << std::endl;
}

Running the program will now emit the following:

output of naive implementation, with exception
CTOR A
CAUGHT EXCEPTION

As we’re already throwing in the initializer list, we don’t even reach the constructor body. That by itself is no problem, but what’s worse is that the destructor of class MyClass is not called at all. Valgrind therefore reports the memory for the first A instance as leaked.

By the way, the destructor for the MyClass instance is intentionally not being called as the object hasn’t been fully constructed and logically never existed.

Will it help if we move the heap allocations from the initializer list into the constructor body like this?

using the constructor body instead of the initializer list
MyClass () {
  std::cout << "CTOR MYCLASS" << std::endl;
  a1 = createInstance();
  a2 = createInstance(true);
}

Unfortunately not. Still no destructor invocations:

output of constructor body variant
CTOR MYCLASS
CTOR A
CAUGHT EXCEPTION

Remember: an object’s destructor won’t be called if its constructor exits with an exception. That also means that releasing an object’s resources solely via the destructor, as in the implementation above, is not sufficient if resources are allocated in the constructor and the constructor can throw.

What can be done about that?

Obviously all resource allocations can be moved into the constructor body so exceptions can be caught there:

catching exceptions in constructor of MyClass
MyClass () {
  std::cout << "CTOR MYCLASS" << std::endl;
  a1 = createInstance();

  try {
    a2 = createInstance(true);
  }
  catch (...) {
    // must clean up a1 to prevent a leak
    delete a1;
    // and re-throw the exception
    throw;
  }
}

While the above will work, it’s clumsy, verbose and error-prone. If more objects need to be managed this will make us end up in deeply nested try…catch blocks.

try…catch for the initializer list

But wait, wasn’t there a try…catch feature especially for initializer list code? Sounds like it could be useful. Maybe we can use this instead so we can catch exceptions during initialization?

There is indeed something like that: exceptions thrown from the initializer list can be caught using the following special syntax:

catching exceptions thrown in the initializer list
MyClass ()
  try : a1(createInstance()),
        a2(createInstance(true)) {

    std::cout << "CTOR MYCLASS" << std::endl;
  }
  catch (...) {  // catch block for initializer list code
    std::cout << "CATCH BLOCK MYCLASS" << std::endl;
    delete a1;
  }

Running the program with the above MyClass constructor will also do what is expected: when creating the second A instance, the initializer list code will throw, invoking its catch block. Again code execution won’t make it into the constructor body, and we don’t see the destructor code in action. When the catch block of a constructor’s function-try-block ends, the exception is rethrown automatically, which is why main still catches it.

The output of the program is:

output of initializer list variant
CTOR A
CATCH BLOCK MYCLASS
DTOR A
CAUGHT EXCEPTION

Valgrind does not report a leak, so are we done now?

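No, as the above code has a severe problem. It appeared to work only because we knew the second invocation of createInstance would fail. Strictly speaking, not even that is guaranteed: the C++ standard makes referring to non-static members in the handler of a constructor’s function-try-block undefined behavior.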

But in the general case, either the first call or the second call can fail. If the first call fails, then the initializer hasn’t initialized any of the object’s members, and it would be unsafe to delete any object members in the initializer’s catch block. If the second createInstance call fails, then the initializer has created a1 but not a2. To prevent a leak in this case, we should delete a1, but we must not delete a2.

But how do we tell in the catch block at what stage the initializer list threw? There is no natural way to do this correctly without introducing more state. And without that, we have the choice between undefined behavior when deleting the not-yet-initialized object members, and memory leaks when ignoring them.

Not using pointers at all

Note that if we hadn’t used pointers for our managed A objects, we could have relied on the fact that the destructors of all already-initialized object members are called when object construction fails.

However, simple pointers don’t have a destructor, so the objects they point to remain and the memory is lost.

So one obvious solution for preventing memory leaks is to not use pointers, and get rid of all new and delete statements.

In some situations we can probably get away with making the managed objects regular class members of the class that manages them:

not using pointers
struct MyClass {
  A a1; // no pointer anymore!
  A a2; // no pointer anymore!

  MyClass ()
    : a1(),
      a2() {

    std::cout << "CTOR MYCLASS" << std::endl;
  }

  ~MyClass () {
    std::cout << "DTOR MYCLASS" << std::endl;
    // no delete statements needed anymore!
  }
};

Now if any of the A constructors throws an exception during initialization, everything will be cleaned up properly, because we can make use of the destructor of A. If the A instances are not pointers but regular objects, the destructors for already created instances will be called normally, and no destructors will be called for the not-yet-initialized A instances. That’s how it should be. We don’t get this benefit with regular pointers, which don’t have a destructor.
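To see this in a failing case, here is a minimal self-contained sketch. The names Part, Holder, eventLog and runByValueDemo are made up for this illustration and are not part of the code above. Part stands in for A, but its constructor can be told to throw, and events are recorded in a vector instead of being printed so their order can be checked.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical event log, used instead of std::cout so the order of
// constructor and destructor calls can be inspected afterwards.
inline std::vector<std::string>& eventLog () {
  static std::vector<std::string> log;
  return log;
}

// Hypothetical stand-in for class A whose constructor can fail on request.
struct Part {
  explicit Part (bool shouldThrow = false) {
    if (shouldThrow) {
      throw std::runtime_error("Part construction failed");
    }
    eventLog().push_back("CTOR PART");
  }
  ~Part () {
    eventLog().push_back("DTOR PART");
  }
};

// Holds two Part objects by value; the second one fails to construct.
struct Holder {
  Part p1;
  Part p2;

  Holder ()
    : p1(),
      p2(true) {
    eventLog().push_back("CTOR HOLDER");  // never reached
  }

  ~Holder () {
    eventLog().push_back("DTOR HOLDER");  // never reached either
  }
};

// Tries to create a Holder and returns the recorded events.
std::vector<std::string> runByValueDemo () {
  eventLog().clear();
  try {
    Holder holder;
    eventLog().push_back("NO EXCEPTION");
  }
  catch (std::exception const&) {
    eventLog().push_back("CAUGHT EXCEPTION");
  }
  // p1 was constructed and destroyed again, p2 never existed
  assert(eventLog().size() == 3);
  return eventLog();
}
```

Calling runByValueDemo() records CTOR PART, DTOR PART and CAUGHT EXCEPTION, in that order: the already constructed p1 is destroyed during stack unwinding, while neither the Holder constructor body nor the Holder destructor runs.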

As an aside, we got rid of the delete statements in the destructor and may even get away with the default destructor.

Obviously this is an easy and safe solution, but it also has a few downsides. Here are a few (incomplete list):

  • when compiling MyClass, the compiler now needs to know the definition of class A. You can’t get away with a simple forward declaration of class A anymore, as you could when the class only contained pointers to A. So this solution increases source code dependencies and coupling.
  • instances of managed objects (e.g. A) need to be created when the managing object (e.g. MyClass) is created. There is no way to postpone the object creation, as is possible when using pointers.
  • in general, the lifetime of the managed objects is tied to the lifetime of the managing object. This may or may not be ok, depending on requirements.

Using smart pointers (e.g. std::unique_ptr)

In many cases the superior alternative to all the above is using one of the available smart pointer classes for managing resources.

The promise of smart pointers is that resource management becomes easier, safer and more flexible with them.

Really useful smart pointers (this excludes std::auto_ptr) have been part of standard C++ since C++11, and to my knowledge they can be used with all C++11-compatible compilers and even some older ones. Apart from that, smart pointers have been available in Boost for a long time.

In the following snippets, I’ll be using smart pointers of type std::unique_ptr as it is the perfect fit for this particular problem. I won’t cover shared_ptr, weak_ptr or other types of smart pointers here.

When using an std::unique_ptr for managing the resources of MyClass, the MyClass code becomes:

using std::unique_ptr
#include <memory>

struct MyClass {
  std::unique_ptr<A> a1;
  std::unique_ptr<A> a2;

  MyClass () :
    a1(createInstance()),
    a2(createInstance(true)) {

    std::cout << "CTOR MYCLASS" << std::endl;
  }

  ~MyClass () {
    std::cout << "DTOR MYCLASS" << std::endl;
  }
};

With a unique_ptr, we can still create resources when needed, either in the initializer list, the constructor body or even later. The resources can still be created dynamically using new (as is still done by function createInstance). As long as we don’t take the resources away from the unique_ptrs, they will free their managed objects automatically and safely. We don’t need to bother with delete.
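As a rough sketch of the "or even later" case, the following snippet creates its resource on first use instead of in the constructor. Resource, LazyOwner and lazyCreationDemo are hypothetical names for this illustration only; the live-instance counter is just there to make the cleanup observable.

```cpp
#include <cassert>
#include <memory>

// Hypothetical resource type; liveCount tracks how many instances exist.
struct Resource {
  static int& liveCount () {
    static int count = 0;
    return count;
  }
  Resource () { ++liveCount(); }
  ~Resource () { --liveCount(); }
};

// Hypothetical owner that creates its resource on first use
// instead of in its constructor.
struct LazyOwner {
  std::unique_ptr<Resource> resource;  // starts out empty (nullptr)

  Resource& get () {
    if (!resource) {
      resource.reset(new Resource);  // created only when first needed
    }
    return *resource;
  }
};

void lazyCreationDemo () {
  assert(Resource::liveCount() == 0);
  {
    LazyOwner owner;
    assert(Resource::liveCount() == 0);  // nothing created yet
    owner.get();
    owner.get();                         // second call reuses the object
    assert(Resource::liveCount() == 1);
  }                                      // owner and its resource die here
  assert(Resource::liveCount() == 0);    // freed without an explicit delete
}
```

The owner needs neither a destructor nor any delete: the unique_ptr member frees the resource if one was created, and does nothing if it is still empty.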

And we don’t need to bother with nested try…catch blocks either. If anything goes wrong during object creation, any already assigned unique_ptrs will happily release the resources they manage in their own destructors.

It does not matter if the above code throws an exception in the first invocation of createInstance, in the second or not at all: in every case any allocated resources are released properly, and still there is no need for any explicit exception handling or cleanup code. This is what a smart pointer will do for us, behind the scenes.
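This claim can be checked with a small self-contained sketch. It assumes a hypothetical class Item that counts its live instances (standing in for A), a createItem helper modeled after createInstance, and an Owner class modeled after MyClass; none of these names appear in the code above.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical stand-in for class A that counts its live instances.
struct Item {
  static int& liveCount () {
    static int count = 0;
    return count;
  }
  Item () { ++liveCount(); }
  ~Item () { --liveCount(); }
};

// Same idea as createInstance from A.h, but for Item.
Item* createItem (bool shouldThrow = false) {
  if (shouldThrow) {
    throw std::runtime_error("createItem failed");
  }
  return new Item;
}

// Modeled after MyClass: two dynamically allocated members.
struct Owner {
  std::unique_ptr<Item> i1;
  std::unique_ptr<Item> i2;

  Owner (bool failFirst, bool failSecond)
    : i1(createItem(failFirst)),
      i2(createItem(failSecond)) {
  }
  // no user-defined destructor and no try...catch needed
};

// Tries to construct an Owner and returns the number of Item
// instances that are still alive afterwards (0 means no leak).
int leakedItems (bool failFirst, bool failSecond) {
  try {
    Owner owner(failFirst, failSecond);
    assert(Item::liveCount() == 2);  // both items exist while owner lives
  }
  catch (std::exception const&) {
    // construction failed; any Item created so far was already released
  }
  return Item::liveCount();
}
```

leakedItems returns 0 for all three cases: no failure, failure in the first call, and failure in the second call.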

Simply compare the following two code snippets, which both create three instances of A while making sure no memory will be leaked if the initialization goes wrong:

solution using smart pointers
std::unique_ptr<A> a1(createInstance());
std::unique_ptr<A> a2(createInstance());
std::unique_ptr<A> a3(createInstance());

// now do something with a1, a2, a3
// managed objects will be released automatically when
// the unique_ptrs go out of scope
// note: they may go out of scope unintentionally if
// some code below throws an exception...

solution using nested try…catch blocks
A* a1 = nullptr;
A* a2 = nullptr;
A* a3 = nullptr;

a1 = new A;
try {
  a2 = new A;
  try {
    a3 = new A;
  }
  catch (...) {
    delete a2;
    throw;
  }
}
catch (...) {
  delete a1;
  throw;
}

// now do something with a1, a2, a3
// objects a1, a2, a3 will not be released automatically
// when a1, a2, a3 go out of scope. any user of a1, a2, a3
// below must make sure to release the objects when they
// go out of scope or when an exception is thrown...

Obviously the smart pointer-based solution is less verbose, but it is also safer and harder to get wrong. It is especially useful for initializing and managing dynamically allocated object members, because as we’ve seen most of the other ways to do this are either subtly broken or much more complex.

Apart from that, we can still take the managed object out of a unique_ptr (by calling release() on it) and take over responsibility for managing its lifetime ourselves.
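A minimal sketch of taking the object out of a unique_ptr, using the standard release() member function. Payload, consumeAndDelete and handOverOwnership are hypothetical names; consumeAndDelete plays the role of some older API that expects to own a raw pointer.

```cpp
#include <cassert>
#include <memory>

// Hypothetical payload type.
struct Payload {
  int value;
  explicit Payload (int v) : value(v) {}
};

// Hypothetical legacy-style function that takes ownership of a raw
// pointer and is responsible for deleting it.
int consumeAndDelete (Payload* p) {
  int v = p->value;
  delete p;
  return v;
}

int handOverOwnership () {
  std::unique_ptr<Payload> owner(new Payload(42));

  // release() returns the raw pointer and leaves the unique_ptr empty,
  // so the unique_ptr will not delete the object anymore
  Payload* raw = owner.release();
  assert(owner.get() == nullptr);

  // from here on we are responsible for the object's lifetime
  return consumeAndDelete(raw);
}
```

Note that after release() we are back in manual territory: if anything throws between release() and the eventual delete, the object leaks again, so it makes sense to release as late as possible.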

Further on the plus side, a class definition that contains unique_ptrs can be compiled with only forward declarations for the managed types. However, when the unique_ptr is a regular object member, the managed type must be a complete type at least where the class destructor is implemented, so that delete can be called on it properly.
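The following sketch shows one way to arrange this. Car, Engine and carDemo are hypothetical names, and the header and implementation file are shown in one block for brevity. The constructor is defined out of line as well, because a throwing constructor may also need to destroy the already initialized unique_ptr member.

```cpp
#include <cassert>
#include <memory>

// --- what would go into the header ----------------------------------
struct Engine;  // forward declaration only; Engine is incomplete here

struct Car {
  std::unique_ptr<Engine> engine;

  Car ();   // only declared here, defined where Engine is complete
  ~Car ();  // same: deleting the Engine requires the complete type
  int horsepower () const;
};

// --- what would go into the implementation file ----------------------
struct Engine {
  int hp = 150;
};

Car::Car () : engine(new Engine) {}
Car::~Car () = default;  // Engine is complete here, so delete works
int Car::horsepower () const { return engine->hp; }

// Small usage example.
int carDemo () {
  Car car;
  assert(car.engine != nullptr);
  return car.horsepower();
}
```

Code that only includes the header part can create, use and destroy Car objects without ever seeing the definition of Engine.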

The downside of using smart pointers is that they may impose minimal overhead when compared to the pure pointer-based solution. However in most cases this overhead should be absolutely negligible or even be optimized away by the compiler. It may make a difference though when compiling without any optimizations, but this shouldn’t matter too much in reality.
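To get a feeling for the space overhead, the following sketch compares the size of a unique_ptr with the size of a raw pointer. Thing and uniquePtrSizeOverhead are hypothetical names. With the default deleter, the common standard library implementations I know of store nothing but the pointer, but the C++ standard does not guarantee this, so treat the result as implementation-specific.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical managed type.
struct Thing {
  int x;
};

// Returns how many bytes a unique_ptr<Thing> (with the default deleter)
// occupies beyond a raw Thing*. The result depends on the standard
// library implementation in use.
std::size_t uniquePtrSizeOverhead () {
  static_assert(sizeof(std::unique_ptr<Thing>) >= sizeof(Thing*),
                "a unique_ptr has to store at least the raw pointer");
  return sizeof(std::unique_ptr<Thing>) - sizeof(Thing*);
}
```

A custom deleter that carries state (for example a function pointer) will typically make the unique_ptr larger than a raw pointer.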