Memory & Addresses

One of the principles behind the design of C++ is that programmers should have as much access as possible to the underlying hardware. For this reason, C++ makes memory addresses explicitly visible to the programmer. An object whose value is an address in memory is called a pointer.

The illustration on the right provides a rough sketch of how memory is organized in a typical C++ program when it is loaded from disk and run. The machine instructions are placed in the text (or code) section. This section of memory is read-only and protected by the operating system.

Global variables and static variables are stored in the static area (which, in the illustration, is marked as Initialized Data and Uninitialized Data). You can read and write data in this area of memory, but variables stored here don't move around. They are put in place when the program loads, before it starts executing.

At the opposite end of memory is the stack. Each time your program calls a function, the computer creates a new stack frame in this memory region, containing parameters, local variables and the return address. When that function returns, the stack frame is discarded, leaving the memory free to be used for subsequent calls.

The region between the end of the program data and the stack is called the heap, which is used for dynamically allocated memory. We'll study dynamic memory allocation a little later in this course. It will be a big part of CS 250.

Where a variable is stored depends on how the variable is defined.

Naming Concepts

Three terms are used to describe the characteristics of a variable or function name:

Scope: where the name is visible.

Variables with block scope are visible from the point of their declaration to the end of the block where they are declared. Local variables and parameters have block scope.
Functions and global variables have file scope; they are visible from the point of declaration to the end of the file in which they are declared.

Storage and duration: where a variable is located, and how long it stays there.

Variables in static storage are placed there when the program starts running and stay at the same address until the program is finished. Global variables and local static variables have static storage class.
Variables with automatic storage are placed on the stack when they are defined, and then destroyed when the block they are defined in ends; automatic variables always have block scope. Local variables and parameters have automatic storage.
Dynamic storage is determined by the programmer; dynamic variables are placed on the heap and removed from the heap in response to specific programmer commands, such as new and delete.

Linkage: how variables and functions can be shared between different files.

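External linkage means that a global variable or function can be used from other files. This is the normal case for functions and for global variables that are not const; in C++, a const global variable has internal linkage unless it is declared extern.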
Internal linkage means that a "global" variable or function is only visible to other functions in the same file. This is indicated by placing the keyword static before the definition of the variable or function.
No linkage means that a variable cannot be used inside any other function. All local variables and parameters have no linkage.

Global Variables

Global variables—usually constants in this class—are allocated in the static storage area. Thus, if the compiler sees the definition below (outside of any function), it reserves eight bytes in the static area, and stores the literal value when the program is compiled.

const double kPi = 3.14159;

A picture of a global variable in memory. As a programmer, you have no idea what address the compiler will choose, but it often helps to visualize what is happening if you make up an address and use that in a diagram. Here you might imagine that the constant kPi is stored at the address 0200.

Most platforms support a much more accurate value for pi. We can calculate that value at run time using the expression acos(-1.0).

#include <cmath>               // declares acos
const auto kPi = acos(-1.0);

This produces the following output when printed with 16 digits of precision:

kPi->3.1415926535897931

Local Variables

Parameter variables, and variables created inside a function, are local variables, allocated on the stack in a block of memory called a stack frame. Internally, these variables are pushed onto the top of the stack at the time of each function call.

The same local variable may be stored at a different address each time the function is called. In fact, when we covered recursion earlier in the semester, we saw that there may be multiple copies of the same local variable, each stored at a different location on the stack. This is what makes recursion possible.

Local static Variables

A local variable that uses the static modifier is not stored on the stack, but in the static storage area, like a global variable. As far as its storage class goes, it behaves like a global variable, but as far as its scope and linkage go, it is a local variable.

Characteristics of Variables

Every variable has at least three characteristics.

Name: used to access the data in your code.
Type: used to determine the amount of memory required to store the variable, the representation or interpretation of the bits stored in memory at that location, and the operations that are legal on that memory location.
Value: the meaning of the bits stored at the memory location selected by the compiler, when interpreted according to its type.

When you define a variable in a C++ program, the compiler makes sure that the variable is allocated enough memory to hold a value of that type. Here's an example:

int a = 3;          // name->a, type->int, value->3
auto b = 3.14159;   // name->b, type->double, value->pi
cout << a << endl;  // print value
cout << b << endl;  // print value

The sizeof Operator

The sizeof operator returns the amount of memory allocated for a variable. The operator takes a single operand, which must be either an expression, such as the name of a variable, or a type name. Type names must be enclosed in parentheses.

If used with a variable or an expression, the sizeof operator returns the number of bytes required to store the value of that expression. If used with a type, sizeof returns the number of bytes required to store a value of that type.

int a = 42;
cout << sizeof a << endl;       // 4 (on our platform)
cout << sizeof(double) << endl; // 8
cout << sizeof 7LL + 4 << endl; // 12-WHY?

The first output statement prints the number of bytes required to store the int variable a; the second prints the number of bytes required to store any value of type double. The third is more confusing. On our platform, a long long should take 8 bytes, but this prints 12. Why?

Simple: sizeof is a unary operator, and unary operators bind more tightly than +. That means that the expression shown here reads as sizeof(7LL) + 4, which is 8 + 4, or 12. The fix is equally simple: always parenthesize the operand of the sizeof operator, as in sizeof(7LL + 4), even when the parentheses are not needed.

The Address Operator

The address operator, &, when placed in front of a variable, returns the address where the variable is stored in memory.

cout << "&a->" << &a << endl;
cout << "&b->" << &b << endl;

Addresses are normally printed in hexadecimal, and the number of digits depends on the size of a pointer on the platform. Here's the output from this fragment of code when run on two platforms:

&a->009CF808 - Visual C++ 19 (Windows)
&b->009CF7F8
&a->0x7fff448e448c - G++ & Clang (Unix)
&b->0x7fff448e4490

Notice that this Visual C++ build uses 32-bit addresses, while the Unix build uses 64-bit addresses. Of course, the actual values of the addresses printed on your machine will be entirely different. You can take the address of a variable, such as &a or &b, but not a type. Writing &int is nonsensical and illegal.

You also cannot take the address of a numeric literal or of the temporary result of an expression: &12 and &(a + 1) are both illegal.