Something about programming

Data Types and Variables in C++

A computer's purpose is to process information (or data). It takes information and does something with it. In this tutorial we'll talk about how C++ interprets information.

Information is anything that has meaning (in some context). It can be the amount of money in your bank account, your age in your profile on some site, the number of points in the game you play, the text of the book you are reading, the music track you listened to this morning, or the cat video you watched just before you started reading this tutorial. So information may be very small or very large. We'll start with very small chunks of data in C++.

What are literals in C++

A literal is just a piece of data written directly in the code, as is. There will be a lot of literals in your code. Here are some examples of literals:

384 11.1 true false "hey" 'c'

So wherever you see a number, a text string or a single text character in code, it is a literal. Literals are hardcoded in the code. An example of a literal from the last tutorial is the string "Hello World\n" that we printed to the console.

As you can see, there are different kinds of literals: numeric, text and so on. In C++ every piece of data must have a data type.

Data Types in C++

C++ has a specific set of data types. Each data type has its own size: how much computer memory it occupies.

Integer Data Types in C++

In C++ there are several data types for integer numbers. One reason for that is legacy, and another is that it allows us to save memory. The main difference between the integer types is that they occupy different amounts of memory and can store different ranges of numbers.

The smallest one is char. It takes one byte of memory (8 bits on practically every modern platform). It's very small, and the problem with it is that it can store only 256 values: 0 to 255 or -128 to 127. So we can store only very small numbers in it. Traditionally char is used to store text. This type is enough to encode all the letters of the English alphabet (52 counting uppercase and lowercase), punctuation and mathematical signs, and also the alphabet of one more language.

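int. On most modern platforms it occupies 4 bytes (32 bits) and can store about 4.3 billion different values (roughly -2.1 billion to 2.1 billion). The C++ standard itself only guarantees at least 2 bytes. When we saw the word int before our main function, it referred to this type.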

long long - takes at least 8 bytes. Why is the word long repeated twice? Because earlier, when computers had less memory, the long type was reserved to occupy 4 bytes. But what about int? In those times int took 2 bytes. When computers became more powerful, the capacity of int was increased to 4 bytes, so a new name was needed for the 8-byte type.

Signed and unsigned numbers

In C++ it matters whether a number is signed or unsigned. Let's take a look at the char type. It can store 256 values. On most common platforms plain char is signed (strictly speaking, the standard leaves this choice to the compiler), so its range is -128 to 127. We can tell C++ to treat an integer type as unsigned by putting the keyword unsigned before it. This gives up the negative numbers and doubles the range of positive ones, so unsigned char has a range from 0 to 255. The same can be applied to any integer type.

Floating-Point Data Types in C++

Floating-point numbers are numbers like 0.1, 1.5, 0.132589869235982 and so on. For a computer there is a big difference between integers and floating-point numbers, as they are stored very differently in memory.

The float type takes 4 bytes. It can store much bigger numbers than int, but it is not as precise as the integer types: a float keeps only about 7 significant decimal digits, so there is almost always a small precision error. We'll discuss this interesting fact later.

Another data type for floating-point numbers is double. It takes 8 bytes, can store huge numbers and keeps about 15 significant decimal digits.

bool type in C++

bool takes 1 byte on most platforms, but can store only two values. Theoretically one bit would be enough for this type, but modern computers address memory in whole bytes, not single bits, so bool generously got 1 byte for itself. C++ has special keywords for the bool values: true and false.

How to store text in C++?

As we saw earlier, char is an integer data type. At the same time it can be used to store text. How is that done? There is a thing called an encoding: it was decided that each number in the range 0 to 127 is assigned a specific symbol (a letter, a mathematical sign, a punctuation mark...). There are many encodings, but the most common one is ASCII. In this encoding the number 65 means A, 97 means a, the number 48 represents the symbol 0, and 57 represents the symbol 9. So when you see the digit 0 in this text, for the computer it is the number 48. That's just the convention of ASCII. ASCII uses only 7 bits, so it has 128 different values. The eighth bit gives another 128 values, which extended encodings use for national alphabets (such as Greek or Cyrillic).

There are two different text literals in C++: single quoted and double quoted. In single quotes we can put only one character. It's a character literal: 'a', '1', 'b'... In double quotes we can put several characters. It's a string literal: "Hey", "Some text"...

Unicode in C++

There are only 256 values in the char type, and ASCII uses 128 of them. As you can imagine, the world has many languages and it's not possible to encode all alphabets in one char. To solve this problem Unicode can be used. Unicode is a character set that assigns a number to every symbol, and its first 128 values match ASCII. There are several Unicode encodings (UTF-8, UTF-16, UTF-32). In C++ such text can be represented by the types wchar_t (wide character - 2 bytes on Windows, 4 bytes on most other platforms), char16_t (UTF-16 - 2 bytes) and char32_t (UTF-32 - 4 bytes). All these types have their own literals: L"Some text" is a wchar_t literal, u"Some text" is a char16_t literal, U"Some text" is a char32_t literal.

When there is no type - void in C++

There are situations when we need to tell the compiler that there is no type at all. For this case we use the keyword void.

There is also a special value in C++ - nullptr. We'll discuss it when we talk about pointers.

Variables

A variable is a part of memory that has a name and a data type. Memory is just a sequence of bytes. We can tell the compiler to reserve a part of that memory, and then we can use this part as we wish.

Declaring variables in C++

To use a variable we first need to declare it. For this we need a data type and a name:

int a; // variable a of type int
char c1; // variable c1 of type char
float pointsOfThePlayer; // variable pointsOfThePlayer of type float

a, c1 and pointsOfThePlayer are variable names. A name must start with a letter or an underscore _ and can also contain digits. For the variables a and pointsOfThePlayer the compiler will typically reserve 4 bytes each, and for the variable c1 it will reserve 1 byte. From now on the compiler knows what type each variable has and we can't mix them up. For example, the variable a can hold only whole numbers: if we assign a floating-point value to it, the compiler converts the value to int and the fractional part is lost (many compilers warn about this).

Assignment operations

One of the most important operations in any programming language is assignment. It's just the = sign. It takes what is to the right of the sign and puts it into the variable on the left.

int i1;
char c1;
i1 = 1000;
c1 = 'a';

From now on the memory that is reserved for i1 holds the value 1000, and another chunk of memory that was allocated for c1 holds the number 97, which represents 'a' in the ASCII encoding.

int i1 = 4;
char c1 = 127;
unsigned char c2 = 255;
char c3 = 'a';
float f1 = 3.14;

When we declare a variable and assign it a value in one statement, it's called initialization. This kind of initialization was inherited from the C language. In C++ we can initialize variables in other ways:

int i1(10);
int i2{10};

On the first line we use parentheses, and on the second - curly braces. Curly braces were introduced in C++11.

Printing variables

In the Hello World tutorial we saw how to print a string literal to the console. Let's see how we can print variables.

int i1{1};
int i2{2};
cout << i1;
cout << i2;

In C++ we can print values of the built-in types with cout. We just replaced the string literal with a variable. But if you compile and run this code you'll see 12 in the console: the values are on the same line and it's not clear that there are two separate values. We can delimit them by printing each of them on a separate line.

cout << i1 << endl;
cout << i2;

endl is not a keyword but a stream manipulator from the standard library: it inserts a new line character where it appears and flushes the output. Note that we can chain many << operations in one output statement.

Let's see what else we can do to separate values:

int i1{1};
int i2{2};
int i3{3};
char c1 = '\n';
cout << i1 << endl << i2 << endl;
cout << i1 << ' ' << i2 << endl;
cout << i1 << c1 << i2 << endl;

On the first output line we separated the values with a new line. On the second we used a character literal with a space. The third line used the variable c1 that stores \n. \n is a special character that breaks the line. In ASCII it has the value 10.

Conclusion

There are more data types in C++, but we discussed the most important ones.

Exercises

  1. Declare a variable but don't initialize it. Print this variable to the console and check what is printed for an uninitialized variable. Try different types: int, char, float.
