This is another complex topic with a lot of nuance, so here is my breakdown.

What is the difference between declaration, definition, and initialization?

*This can be extended to assignment, but that’s trivial.

  • Declaration: tells the compiler that an entity of type T named x exists. No storage is allocated for it yet (unless the declaration is also a definition).
  • Definition: a declaration that fully specifies the entity. For a variable, this is what allocates storage.
  • Initialization: gives the entity its initial value at the point of definition. Setting a value later than the definition is an assignment, not an initialization.

Loosely speaking, every definition of a variable is also an “initialization” (a defined object holds some value, even if only an indeterminate one) if you count default initialization. From what I have read, though, default initialization of something like a local int performs no initialization at all, so it is not considered an initialization in the strict sense.

Per the standard, a declaration is a definition unless it’s one of the following (among a few other cases):

  • extern without an initializer
  • a forward declaration of a class, or a function declaration without a body

Examples:

// declaration only
extern int x; // no storage allocated, no initialization
void f();
class C;
 
// definition
int x;
void f() {}
class C {};
 
// initialization
int y = 5; // definition and initialization
// no concept of an initialization for a function or class

A sizeof gotcha. What is sizeof of a variable declared as extern int x;? You might be tempted to say 0 based on this new-found knowledge (no storage is allocated), but it’s not. extern is a storage class specifier, not a type specifier, so it doesn’t affect the type of the variable.

#include <cstdio>
 
extern int c;
 
int main() {
    printf("Size: %zu bytes\n", sizeof(c));  // sizeof yields std::size_t, so use %zu rather than %d
    return 0;
}

Why? Because sizeof applied to a variable evaluates to sizeof of its type, here sizeof(int), and it is evaluated at compile time without ever touching the object. So this prints 4 on most mainstream platforms. Note that the standard does not fix the size of int; it happens to be 4 bytes on the common 32-bit and 64-bit ABIs, whereas pointer sizes vary with the architecture.
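
The same point can be checked at compile time. This is a minimal sketch: table and size_of_c are hypothetical names added for illustration, and neither c nor table is defined anywhere, which is fine because sizeof never uses their storage.

```cpp
#include <cstddef>
#include <type_traits>

extern int c;            // declaration only; this file never defines c
extern double table[8];  // hypothetical array, also only declared

// sizeof is an unevaluated operand: only the declared type matters.
static_assert(sizeof(c) == sizeof(int), "sizeof(c) is sizeof(int)");
static_assert(sizeof(table) == 8 * sizeof(double), "array size comes from the declared type");
static_assert(std::is_same<decltype(c), int>::value, "extern does not change the type");

// Usable in constant expressions, and links without a definition of c.
constexpr std::size_t size_of_c() { return sizeof(c); }
```

Reading c itself (for example, return c;) would be a different story: that needs the definition and fails at link time without it.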

Default initialization

What default initialization does depends on the storage class (more precisely, the storage duration) of the variable.

| Storage class | Keyword | Scope | Lifetime | Initial value | Memory location |
|---|---|---|---|---|---|
| Automatic | auto | Local | Function block | Garbage | Stack |
| Static | static | Local/Global | Entire program | Zero | Data segment |
| Register | register | Local | Function block | Garbage | Register/Memory |
| External | extern | Global | Entire program | Zero | Data segment |
| Mutable | mutable | Class member | Object lifetime | Garbage | Object memory |
| Thread-local | thread_local | Thread-specific | Thread duration | Varies | Thread storage |

The common takeaway is:

  • a local variable is automatic and starts with an indeterminate (garbage) value
  • a global or explicitly static variable has static storage duration and gets zero initialization as its default initialization ( covered next )

When is default initialization not possible?

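  • a const object of non-class type, or a reference, cannot be default initialized: const int k; and int& r; are both compile-time errors. (A const object of class type is fine if the class has a user-provided default constructor.)
  • default initialization of a class default-initializes its members recursively, so the rule above applies to members as well.
  • a class must have an accessible, non-deleted default constructor to be default initialized.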
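
These rules can be probed with std::is_default_constructible. The struct names below are made up for illustration. Note that the trait tests the expression T(), which is a slightly different form from a plain T t;, but it fails in the same situations.

```cpp
#include <type_traits>

struct Fine      { int a; };              // implicit default constructor is usable
struct WithRef   { int& r; };             // reference member with nothing to bind to
struct WithConst { const int c; };        // const member without an initializer
struct ConstInit { const int c = 1; };    // a default member initializer fixes that
struct NoDefault { NoDefault(int) {} };   // declaring a constructor suppresses the implicit default one

static_assert(std::is_default_constructible<Fine>::value, "plain struct is fine");
static_assert(!std::is_default_constructible<WithRef>::value, "reference member blocks it");
static_assert(!std::is_default_constructible<WithConst>::value, "uninitialized const member blocks it");
static_assert(std::is_default_constructible<ConstInit>::value, "initialized const member is fine");
static_assert(!std::is_default_constructible<NoDefault>::value, "no default constructor available");

// The non-class cases cannot be written at all:
// const int k;   // error: uninitialized const
// int& r;        // error: reference must be initialized
```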

Example of doing a default initialization:

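int x;              // static storage duration: zero-initialized
void f(){
    int local_x;    // automatic: default-initialized, value is indeterminate
}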
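
A slightly larger sketch of the same idea, with hypothetical names. It only reads the variables with static storage duration, since reading an uninitialized local is undefined behavior and cannot be demonstrated safely.

```cpp
#include <cassert>

int global_total;          // static storage duration: zero before main runs
static double file_scale;  // internal linkage, same rule

int next_ticket() {
    static int issued;     // function-local static: also starts at 0
    return ++issued;
}

int local_sum() {
    int sum = 0;           // automatic: must be set explicitly; a bare `int sum;` would be indeterminate
    for (int i = 1; i <= 3; ++i) sum += i;
    return sum;
}

void check_defaults() {
    assert(global_total == 0);
    assert(file_scale == 0.0);
    assert(next_ticket() == 1);   // first call sees the zero-initialized counter
    assert(next_ticket() == 2);
    assert(local_sum() == 6);
}
```

Calling check_defaults() once from main exercises all three kinds of variables.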

Zero initialization

Zero initialization zeroes out the memory of static and global variables that have no initializer. These variables are typically placed in the BSS segment. At program startup, the OS maps zero-filled memory pages for the BSS segment, which also prevents information leaking from previous processes. Static and global variables with an initializer go to the data segment and are loaded with their specified values from the executable file ( the OS still hands out clean pages before loading, but since the values come from the file, they are not “zeroed” in that sense ).

Can you trigger a zero initialization? Yes, by using what’s called value-initialization. For a type without a user-provided default constructor, this performs a zero initialization followed by a default initialization.

Also note that if you define a parameterized constructor, the default constructor is no longer generated. Value-initializing such a class is then a compile-time error, since the required constructor is not available.

int x{};
MyClass obj{};
MyClass obj2 = MyClass(); // with a user-provided default constructor, members might not be zero

When can zero initialization not be done?

  • When the class has a user-provided default constructor (anything other than = default on its first declaration). In this case, value initialization is just a constructor call and the memory is not zeroed out first. ( the idea being that since you wrote the constructor, you should know what you’re doing; with {} the list initialization rules apply first and simply select that constructor )
  • When there’s no default initialization possible, though this will result in a compile-time error ( like reference or const member variables without initializers )
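  • When it is explicitly disabled for performance reasons through a compiler-specific attribute. For example, where locals are auto-initialized via a flag such as Clang’s -ftrivial-auto-var-init, a buffer can opt out: __attribute__((__uninitialized__)) char buf[BUFSIZ];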
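
To make the first bullet concrete, here is a small sketch with made-up types. Provided sets its member to 7 only so there is something well-defined to observe; if its constructor left v untouched, the value after Provided u{}; would be indeterminate.

```cpp
#include <cassert>

struct Plain     { int a; double b; };               // no constructors at all
struct Defaulted { int v; Defaulted() = default; };  // defaulted on first declaration: not user-provided
struct Provided  { int v; Provided() : v(7) {} };    // user-provided: only this constructor runs

void check_value_init() {
    int n{};         // scalar: zero
    Plain p{};       // members zeroed
    Defaulted d{};   // still zeroed
    Provided u{};    // constructor call only, no zeroing step beforehand

    assert(n == 0);
    assert(p.a == 0 && p.b == 0.0);
    assert(d.v == 0);
    assert(u.v == 7);
}
```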

Why is my static variable not zero initialized?

Stackoverflow question

class Test {
    static int counter;  // declaration only, so nothing is zero-initialized yet
};

The declaration of a non-inline static data member inside the class is not a definition, and hence nothing is zero-initialized yet. Also, since the member is declared but not defined, any use that needs its storage fails at link time with an undefined reference. For example, the code below compiles and links as long as nothing reads x, but it fails once val() is actually called.

#include <iostream>
 
class C {
    static int x;
 
   public:
    int val() const {
        return x;
    }
};
 
int main() {
    C c;
    // std::cout << "Value of x: " << c.val() << std::endl; // uncommenting this causes a link-time error (undefined reference to C::x)
    return 0;
}

Which makes sense since in this case, there’s only a declaration and no definition, so no memory is allocated for x to read from.

A definition looks as follows:

class Test {
    static int counter;  // declaration only
};
int Test::counter;  // now it's defined and zero-initialized
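
As a runnable variation on the same pattern, here is a hypothetical Session class that counts live objects. It relies on the out-of-class definition being zero-initialized, so no explicit = 0 is needed.

```cpp
#include <cassert>

class Session {
   public:
    Session() { ++live_; }
    ~Session() { --live_; }
    static int live() { return live_; }

   private:
    static int live_;   // declaration only
};

int Session::live_;     // the single definition; zero-initialized (static storage duration)

void check_sessions() {
    assert(Session::live() == 0);
    {
        Session a, b;
        assert(Session::live() == 2);
    }
    assert(Session::live() == 0);   // both destructors have run
}
```

In a real project the definition line belongs in exactly one .cpp file, not in the header.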

Both of the following would work as well:

class Test {
    static const int counter = 0;  // in-class initializer; only allowed for const integral/enum types
};
class Test2 {
    static inline int counter;  // C++17 onwards; an initializer expression is allowed here too
};

What does the inline keyword do here?

Inline variables are a feature introduced in C++17 (inline cannot be applied to a non-static data member). For static data members it allows a couple of nice things:

  • runtime initialization of the variable: static inline int random_value = rand();
  • allows for the variable to be defined in a header file without violating the One Definition Rule (ODR) since it can be included in multiple translation units without causing multiple definitions

The benefit over a static const in-class initializer is that the latter only works for integral and enum types; you cannot initialize, say, a string or a vector with it.
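
A short sketch of what that buys you, assuming a C++17 compiler. Config and its members are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct Config {
    static const int kMaxRetries = 3;                      // in-class initializer: integral/enum const only
    static inline std::string name = "default";            // C++17: a definition, any type allowed
    static inline std::vector<int> ports{80, 443};
    static inline std::size_t name_length = name.size();   // initializer evaluated at run time
};

void check_config() {
    assert(Config::kMaxRetries == 3);
    assert(Config::name == "default");
    assert(Config::ports.size() == 2);
    assert(Config::name_length == 7);
}
```

Because every member is defined inside the class, this struct can live in a header included from many translation units without any out-of-class definitions.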

Copy initialization

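Copy initialization is the form with an = followed by a right-hand-side expression. For class types it invokes the copy (or move) constructor unless the compiler can optimize that away ( eg: copy elision ). It also allows implicit conversions, but only through non-explicit constructors.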

int x = 5;
MyClass obj = some_other_obj;

What about this though?

MyClass obj = MyClass(10);

Will this call a parameterized constructor and then a copy constructor? No. Will this call the parameterized constructor and then a move constructor? No. This will call the parameterized constructor and then do copy elision, so no copy or move constructor is called at all.

While this was an optional optimization before C++17, most compilers did it anyway ( though technically, per the standard, you could observe an extra copy or move constructor call ). Since C++17, copy elision is guaranteed in this case.
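
One way to convince yourself that the elision is guaranteed rather than merely likely is a type that can be neither copied nor moved. This is a sketch with a hypothetical Pinned type; it compiles as C++17 or later and is rejected by earlier standards.

```cpp
#include <cassert>

struct Pinned {
    int value;
    explicit Pinned(int v) : value(v) {}
    Pinned(const Pinned&) = delete;   // no copy constructor
    Pinned(Pinned&&) = delete;        // no move constructor
};

Pinned make_pinned(int v) {
    return Pinned(v);   // prvalue: constructed directly in the caller's object
}

void check_elision() {
    Pinned p = make_pinned(42);   // OK since C++17; needed a copy/move constructor before
    Pinned q = Pinned(7);         // same rule, no function involved
    assert(p.value == 42);
    assert(q.value == 7);
}
```

Returning a named local from make_pinned would not compile here, since NRVO is optional and still requires a usable copy or move constructor.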

#include <cstdlib>   // rand
#include <iostream>
 
class C {
    int x;
 
   public:
    C() {
        std::cout << "Default constructor called" << std::endl;
    }
    C(int x) : x(x) {
        std::cout << "Parameterized constructor called" << std::endl;
    }
    C(const C&) {
        std::cout << "Copy constructor called" << std::endl;
    }
    // if commented out, copy constructor is called for c9 case
    C(C&&) {
        std::cout << "Move constructor called" << std::endl;
    }
};
 
C createObject() {
    return C(42);
}
 
C createObject2() {
    C something(10);
    return something;
}
 
C createObject3() {
    C obj1(10);
    C obj2(20);
    if (rand() % 2 == 0) {
        return obj1;
    } else {
        return obj2;
    }
}
 
int main() {
    std::cout << "c1:\n";
    C c1;
    std::cout << "c2:\n";
    C c2 = c1;
    std::cout << "c3:\n";
    C c3 = C();
    std::cout << "c4:\n";
    C c4(10);
    std::cout << "c5:\n";
    C c5 = C(20);
    std::cout << "c6:\n";
    C c6 = c4;
 
    std::cout << "c7:\n";
    C c7 = createObject();
    // optional NRVO done
    std::cout << "c8:\n";
    C c8 = createObject2();
 
    // no NRVO possible here (the returned object is chosen at run time): the move constructor is preferred, then the copy constructor; it fails to compile if both are deleted
    std::cout << "c9:\n";
    C c9 = createObject3();
    return 0;
}

Output:

c1:
Default constructor called
c2:
Copy constructor called
c3:
Default constructor called
c4:
Parameterized constructor called
c5:
Parameterized constructor called
c6:
Copy constructor called
c7:
Parameterized constructor called
c8:
Parameterized constructor called
c9:
Parameterized constructor called
Parameterized constructor called
Move constructor called

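Note that createObject2() is an NRVO (Named Return Value Optimization) case. Even in C++23 this remains an optional optimization, since it is not always possible: in createObject3(), which object gets returned depends on a branch whose outcome is only known at run time, so the compiler cannot construct a single object directly in the caller’s storage. Also note the rules for automatic generation of the special constructors: because C declares a copy constructor, no move constructor is generated implicitly, so commenting out C(C&&) makes c9 fall back to the copy constructor.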

Direct initialization

Direct initialization uses parentheses () to initialize an object. This is the form most people usually write.

int x(5);
MyClass obj(arg1, arg2);
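
Where direct and copy initialization differ in practice is with explicit constructors. The following sketch uses two made-up types and standard type traits to show which forms are allowed.

```cpp
#include <cassert>
#include <type_traits>

struct Meters {
    explicit Meters(int v) : value(v) {}   // explicit: direct initialization only
    int value;
};

struct Label {
    Label(const char* s) : text(s) {}      // non-explicit: usable for implicit conversion
    const char* text;
};

static_assert(std::is_constructible<Meters, int>::value, "Meters m(5); compiles");
static_assert(!std::is_convertible<int, Meters>::value, "Meters m = 5; does not compile");
static_assert(std::is_convertible<const char*, Label>::value, "Label l = \"x\"; compiles");

void check_direct_vs_copy() {
    Meters m(5);       // direct initialization
    Label l = "x";     // copy initialization through a converting constructor
    // Meters bad = 5; // error: the constructor is explicit
    assert(m.value == 5);
    assert(l.text[0] == 'x');
}
```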

Uniform initialization ( List initialization )

Introduced in C++11, this uses curly braces {} to initialize objects. This is the preferred way to initialize in modern C++.

There are two forms. One is direct list initialization, which looks like this:

int x{5};
MyClass obj{arg1, arg2};
vector<int> v{1, 2, 3};

The other is copy list initialization where you have an = sign:

int x = {5};
MyClass obj = {arg1, arg2};
vector<int> v = {1, 2, 3};

Is there a difference between the two? Yes, most visibly for auto type deduction.

#include <iostream>
#include <typeinfo>  // typeid
 
int main() {
    auto list1 = {42};       // std::initializer_list<int>
    auto list2 = {1, 2, 3};  // std::initializer_list<int>
    auto list3{42};          // int
 
    std::cout << "list1 type: " << typeid(list1).name() << "\n";
    std::cout << "list2 type: " << typeid(list2).name() << "\n";
    std::cout << "list3 type: " << typeid(list3).name() << "\n";
 
    // auto mixed = {42, 84.42};  // ERROR: mixed types not allowed
    // auto list4{1, 2, 3}; // ERROR: multiple values not allowed
    return 0;
}

One other key behavior to note is that implicit narrowing conversions are not allowed.

int x{5.5}; // error: narrowing conversion from 'double' to 'int'
int y = 5.5; // ok, implicit conversion
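
Narrowing is judged per conversion, and constants that provably fit are allowed. A few cases, as a sketch; the commented-out lines are the ones a conforming compiler rejects.

```cpp
#include <cassert>

void check_narrowing() {
    char ch{65};      // OK: constant expression whose value fits in char
    double d{5};      // OK: constant int that double represents exactly
    unsigned u{42};   // OK: non-negative constant that fits

    // int bad{5.9};           // error: floating-point to integer is always narrowing
    // int n = 65; char c{n};  // error: n is not a constant expression

    int truncated{static_cast<int>(5.9)};   // an explicit cast states the intent

    assert(ch == 'A' || ch == 65);
    assert(d == 5.0);
    assert(u == 42u);
    assert(truncated == 5);   // conversion truncates toward zero
}
```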

Aggregate initialization is a special case of list initialization.

struct Point { int x, y; };
Point p{1, 2};           // Aggregate initialization
 
int arr[]{1, 2, 3, 4};   // Aggregate initialization for arrays
int arr2[5]{1, 2};       // Remaining elements zero-initialized

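This also works for classes that are essentially plain structs: no private or protected non-static data members, no user-provided constructors, and no virtual functions. Base classes were also disallowed before C++17; since C++17, public non-virtual base classes are permitted. Members are always initialized in declaration order.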
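
The aggregate rules can be checked with std::is_aggregate, assuming C++17. The types below are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

struct Point { int x, y; };
struct Tagged : Point { int tag; };                      // public base: still an aggregate in C++17
class Hidden { int secret; public: int shown; };         // private data member: not an aggregate
struct WithCtor { int v; WithCtor(int v_) : v(v_) {} };  // user-provided constructor: not an aggregate

static_assert(std::is_aggregate_v<Point>);
static_assert(std::is_aggregate_v<Tagged>);
static_assert(!std::is_aggregate_v<Hidden>);
static_assert(!std::is_aggregate_v<WithCtor>);

void check_aggregates() {
    Point p{1};            // y has no initializer, so it becomes 0
    Tagged t{{3, 4}, 5};   // base class subobject first, then members in declaration order
    int arr[5]{1, 2};      // remaining elements are zero

    assert(p.x == 1 && p.y == 0);
    assert(t.x == 3 && t.y == 4 && t.tag == 5);
    assert(arr[1] == 2 && arr[2] == 0 && arr[4] == 0);
}
```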

Value initialization

I find the name to be a misnomer since no “value” is supplied. It is essentially the explicit way of requesting the zero initialization covered above: empty braces or parentheses, as in T{} or T().