Structured Types
Primitive types are fine for creating simple programs. For most tasks, however, you need more complex user-defined data types, such as string and vector. When we combine multiple data items into a larger unit, the result is called a structured type. The types in the standard library, such as string and vector, are structured types.
The C++ language has two derived structured data types. The built-in linear list-type collection is called an array. In an array, all of the elements must be of the same type, so we say that an array is a homogeneous structured type. With an array, the elements are accessed by using their index, just like the string type.
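As a small illustration of a homogeneous collection accessed by index, the sketch below sums the elements of a built-in array. The function name sumTemperatures and the sample values are hypothetical, chosen only for this example.

```cpp
#include <cstddef>

// Hypothetical helper: adds up the first `size` elements of an int array.
// Every element has the same type (int), and each is reached by its index.
int sumTemperatures(const int temps[], std::size_t size)
{
    int total = 0;
    for (std::size_t i = 0; i < size; ++i)
    {
        total += temps[i];   // index access, much like str[i] on a string
    }
    return total;
}
```

Given int week[] = {70, 72, 68}; the call sumTemperatures(week, 3) adds the three elements.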
In addition to arrays, programs often combine related pieces of information into a composite object which can be manipulated as a unit, such as an employee record. Each worker has an employee number, a name, an address, a job title and so on. Such types are called records (which is a generic Computer Science term) or structures (which is the C/C++ specific term).
The data members which make up a structure do not all need to be of the same type, so we say that a structure is a heterogeneous data type. The Date type shown here is a structure type, consisting of a month, day and year. C++ also has more advanced record types, called classes, which you'll study later in the semester.
Structure Definitions
Here is the C++ definition for a Date user-defined structure type:
#include <string>
struct Date
{
std::string month;
int day;
int year;
};
Unlike a variable definition, a structure definition does not create an object in memory. Instead, it defines or specifies a new type which contains three data members. (You should not call them fields, as Java does, since the term field has a slightly different meaning in C++.)
The structure name (Date) is formally known as the structure tag. As mentioned earlier, structure members do not all need to be of the same type. In Date, month is of type string, while day and year are of type int.
Nested Structures
A structure member may be another type of structure. This is called a nested structure. For instance, a person has a name and a birthday. We have a Date type, so we can use it as part of a Person definition.
struct Person
{
std::string name;
Date birthday;
};
Structure definitions are normally found inside header files. Because a header should not contain using directives, all library names (such as std::string in the month member) must be fully qualified. It is illegal to define the same type more than once in a single source file, even if the definitions are exactly the same, and that is what happens when the same header is included twice. Protect against this by using header guards (omitted from the listings above).
Don't forget the semicolon appearing after the final brace. If you leave it off, you're likely to see a misleading error message pointing to a different area of your code.
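The header guards mentioned above could look like the following sketch. The file name person.h and the macro name PERSON_H are assumptions made for this example; any macro name that is unique within the program would do.

```cpp
// person.h (hypothetical file name)
#ifndef PERSON_H      // skip everything below if PERSON_H is already defined
#define PERSON_H

#include <string>

struct Date
{
    std::string month;
    int day;
    int year;
};

struct Person
{
    std::string name;   // fully qualified: no using directive in a header
    Date birthday;      // nested structure
};

#endif // PERSON_H
```

The first time the header is included, PERSON_H is not yet defined, so the definitions are processed. On any later inclusion in the same source file the preprocessor skips straight to the #endif.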
Structure Variables
A structure definition introduces a new type. Once you have the type definition, you can define variables, as you would with any other type.
int n; // uninitialized int variable n
Date today; // uninitialized Date variable today
These two lines instruct the compiler to allocate memory for the int variable n, and for the Date variable today. The Date variable today includes data members that store the values of its month, day and year components.
If you were to draw a box diagram of the variable, it would look something like the picture on the right. Just as the int variable n is uninitialized, day and year in the variable today are also uninitialized.
The month member, however, is default initialized to an empty string, because std::string is a library class type whose constructor runs automatically. This differs from Java. If we were to create a Java Date class with a public String field, that field would hold null rather than a String object, while the primitive fields would be default initialized to 0.
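A minimal sketch of this behavior follows. The function name monthStartsEmpty is hypothetical. Note that the code examines only month, because reading the uninitialized day or year members would be undefined behavior.

```cpp
#include <string>

struct Date
{
    std::string month;
    int day;
    int year;
};

// Hypothetical check: a freshly defined Date has an empty month string.
// day and year are deliberately not read; they hold indeterminate values.
bool monthStartsEmpty()
{
    Date today;                    // month is constructed as ""
    return today.month.empty();
}
```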
Anonymous Structures
You may also create a structure variable along with the definition. This can be useful when you need to group together a pair of variables for immediate use.
struct iPair {int a, b;} p1;
struct {int a, b;} p2;
Here, p1 is a structure variable, of type iPair. When you do this, you may also omit the structure tag, as is done for the variable p2, creating an anonymous structure.
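One place this might be handy is inside a function that needs a short-lived pair of values. The sketch below is only an illustration; the function name sumOfPair and the values 3 and 4 are made up for the example.

```cpp
// Hypothetical function: groups two ints in an unnamed structure for local use.
int sumOfPair()
{
    struct {int a, b;} p2;   // no tag, so there is no type name to reuse later
    p2.a = 3;
    p2.b = 4;
    return p2.a + p2.b;
}
```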
Initialization
Starting with C++11 you can provide in-definition initializers for each of your data members, just like Java. You should definitely take advantage of this as it will eliminate uninitialized data members.
struct Date
{
std::string month; // no initializer required
int day = 0; // legacy initialization
int year{0}; // uniform initialization
};
Use legacy ("assignment") initialization (day), or uniform initialization (year). You may not use direct initialization with parentheses instead of braces. Note that month does not need an initializer, since it is a library type, and it will automatically be initialized by its constructor. However, you may explicitly initialize it as well, if you like.
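To see the effect, here is a small sketch in which every member, including month, has an in-definition initializer. The starting values and the function name makeDefaultDate are assumptions chosen for illustration.

```cpp
#include <string>

struct Date
{
    std::string month{"January"};  // optional explicit initializer
    int day = 1;                   // legacy initialization
    int year{1970};                // uniform initialization
};

// Hypothetical helper: returns a Date defined without any explicit values.
Date makeDefaultDate()
{
    Date d;      // every member takes its in-definition initializer
    return d;
}
```

Because each member has an initializer, no member of d is left uninitialized, even though the definition of d supplies no values.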
Aggregate Initialization
You may explicitly initialize a structure variable by supplying a list of values, one for every data member, inside curly braces, separated by commas and ending with a semicolon. This is called aggregate initialization.
Date birthday = {"February", 2, 1950};
Date empty = {};
If you supply no initializers, as with the Date empty, then all members are value initialized. In this case, that means that day and year are set to 0 rather than left with an indeterminate value, and month is an empty string. If the members already have default initializers (from the structure definition), then those default initializers are used instead.
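The list may also contain fewer values than there are members; the members left over are then treated the same way as with empty braces. The sketch below assumes the original Date definition without in-definition initializers, and the function name makeMonthOnly is hypothetical.

```cpp
#include <string>

struct Date
{
    std::string month;
    int day;
    int year;
};

// Hypothetical helper: supplies a value for month only.
Date makeMonthOnly()
{
    Date partial = {"March"};   // day and year get no value here, so they become 0
    return partial;
}
```

Some compilers can be configured to warn about the missing values, so listing every member remains the clearer style.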
Member Access
You select the individual members of a structure variable by using the member access operator, or, more informally, the dot operator, like this:
cout << birthday.month << endl;
Here, birthday is the structure variable and month is the data member it contains. Such selection expressions are assignable, so you can modify the components of birthday like this:
Date birthday;
birthday.month = "February";
birthday.day = 2;
birthday.year = 1950;
Since this is assignment, and not initialization, this must appear inside a function.
With a nested structure:
- You can access the nested member in its entirety (aggregate)
- You can access the data members of the nested structure, using another level of "dots". Here is an example.
Date groundhog = {"February", 2, 1950};
Person steve;
steve.name = "Stephen";
steve.birthday = groundhog; // aggregate assignment, or ...
steve.birthday.year = 2023; // nested dots
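Reading nested members works the same way as writing them. As a sketch, the hypothetical helper describe below builds a line of text from a Person by following the dots down to each member. It takes its parameter by const reference, which avoids copying the structure.

```cpp
#include <string>

struct Date
{
    std::string month;
    int day;
    int year;
};

struct Person
{
    std::string name;
    Date birthday;
};

// Hypothetical helper: combines the name and the nested birthday members.
std::string describe(const Person& p)
{
    return p.name + ": " + p.birthday.month + " "
         + std::to_string(p.birthday.day) + ", "
         + std::to_string(p.birthday.year);
}
```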
Aggregate & Unsupported Operations
Structure types in the C programming language cannot automatically perform all of the common operations that the built-in types can, so we say that such derived types are second-class types. Operations that work with the structure as a whole are called aggregate operations.
Four built-in aggregate operations work in both C and in C++: assignment, initialization, passing parameters and returning structures. Given a Date variable:
- You can assign it to another variable, just as if it were an int or double.
- You can use it to initialize another variable.
- You can pass it to a function as an argument.
- You can return it from a function.
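The last two operations in this list can be sketched with a single function. The name sameDayNextYear is hypothetical; the point is only that a whole Date goes in and a whole Date comes out.

```cpp
#include <string>

struct Date
{
    std::string month;
    int day;
    int year;
};

// Hypothetical function: receives a Date by value (a copy) and returns a Date.
Date sameDayNextYear(Date d)
{
    d.year = d.year + 1;   // changes only the local copy
    return d;              // the whole structure is returned
}
```

Since the parameter is passed by value, the caller's variable is unchanged after the call.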
All of these are closely related to assignment. Here are some things you can and cannot do with structure variables:
Date d1{"February", 2, 1950}, d2;
d2 = d1; // assignment OK
Date d3{d2}; // initialize OK
if (d1 == d2) // NO aggregate comparison
cout << d1 << endl; // NO aggregate I/O
d1++; // NO built-in arithmetic
As you can see:
- You cannot compare two structures using either equality or the relational operators. You must compare the individual data members instead.
- You cannot automatically display a structure variable using cout; you must access and print the individual data members.
- There is no built-in arithmetic.
It is, however, easy to turn each of these operations into an aggregate operation by simply writing some functions. We'll look at those shortly.