Functions and vectors

In this introduction to functions in C++, we have so far seen examples of functions that return values of the basic types of the language, like int and double. But, could a function return a vector? In addition, the functions used in previous examples only have arguments of the basic types of the language. But, could a vector be an argument of a function? For instance, one may want to write a function that displays the list of values stored in a given vector.

We address now these questions.

Vectors as return values

Yes, functions in C++ can return a value of type std::vector. Let us see an example of how this can be used in practice.

Recall again the example about the course grades given in the beginning of the motivation section for functions. A vector is used to count the number of grades that fall in each of the $11$ clusters (or intervals). Slot with index $0$ counts the number of grades in the interval $[0,9]$, slot with index $1$ counts the number of grades in the interval $[10,19]$, slot with index $2$ counts the number of grades in the interval $[20,29]$, $\cdots$, slot with index $9$ counts the number of grades in the interval $[90,99]$, and finally, slot with index $10$ counts the number of grades in the last interval $[100,100]$.

A useful function in this context is one that reads the grades of a class of students, at same time counts the number of grades that fall in each cluster, and then returns a vector with the $11$ cluster counters.

// Read a list of grades
// Return a vector of 11 cluster counters
std::vector<int> read_grades();

Since the program should handle the grades of two different classes of students, two vectors are used, named clustersA and clustersB. These vectors can then be initialized by calling function read_grades().

int main() {
    // Read grades for class A and create the clusters counters
    std::cout << "Enter grades for  class A:\n";
    std::vector<int> clustersA = read_grades();

    // Read grades for class B and create the clusters counters
    std::cout << "Enter grades for  class B:\n";
    std::vector<int> clustersB = read_grades();
	
   // ...
}

Function read_grades can be defined as follows.

// Read a list of grades
// Return a vector of 11 cluster counters
std::vector<int> read_grades() {
    std::vector<int> C(11, 0); // create a vector of 11 cluster counters initialized with zeros

    // Read each grade and increment the corresponding cluster counter
    for (int grade = 0; std::cin >> grade && grade >= 0 && grade <= 100;) {
        ++C[grade / 10];
    }

    return C;
}

The for-loop above could also be written as a while-loop.

int grade;
while (std::cin >> grade && grade >= 0 && grade <= 100) {
    int cluster = grade / 10;

    ++C[cluster];   
}

Note that function read_grades is called twice in the program’s main in order to read the grades of both student classes. This illustrates an advantage of using functions: code reusability. By using a function, the programmer was kept away from re-writing the same block of code in different places of the program, what would have made the code lengthier and more complicated.

Vectors as function arguments

Functions can also have arguments which are vectors.

Continuing with the same example of the course grades, another useful function is one that displays the counters stored in a vector. Thus, one could declare this function as follows. Note that the function has an argument which is a vector of integer values (i.e. the counters).

// Display the cluster counters in vector C
void display_cluster_counters(std::vector<int> C);

The main can then call this function for variables clustersA and clustersB (lines $12$ and $16$ below).

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
int main() {
    // Read grades for class A and create the clusters counters
    std::cout << "Enter grades for  class A:\n";
    std::vector<int> clustersA = read_grades();

    // Read grades for class B and create the clusters counters
    std::cout << "Enter grades for  class B:\n";
    std::vector<int> clustersB = read_grades();

    // Display cluster counters, for class A
    std::cout << "\nClass A\n";
    display_cluster_counters(clustersA);

    // Display cluster counters, for class B
    std::cout << "\nClass B\n";
    display_cluster_counters(clustersB);
}

The function display_cluster_counters can be defined as follows. The $11$th cluster is handled separately (lines $6$ and $7$) from the other ten clusters because the $11$th cluster corresponds to an interval with a single value $[100,100]$.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
// Display the cluster counters in vector C
void display_cluster_counters(std::vector<int> C) { // call-by-value: not efficient!!
    int counter = 0;
	 
    for (int c : C) {
        if (counter == 10) { // 11th cluster?
            std::cout << "[" << 100 << "," << 100 << "]: ";
        } else {
            std::cout << "[" << counter * 10 << "," << (counter + 1) * 10 - 1 << "]: ";
        }

        std::cout << c << "\n";

        ++counter;
    }
}

Note that call-by-value is used to pass a vector to function display_cluster_counters. One can motivate this decision by the fact that the function should not modify the actual arguments, clustersA and clustersB (not even accidentally by a programming bug in the function). Logically speaking, all the function should do is to walk through the list of values in the vector passed to it and display them, but the vector should not be modified in any way. So, call by-by-value gives us this guarantee, since a copy of the vector clustersA (clustersB) is made when the function is called in display_cluster_counters(clustersA) (display_cluster_counters(clustersB)), see lines $12$ and $16$ of the main above. However, the problem is that the vector clustersA (or clustersB) is a variable with too many bytes to be copied into the function. In many systems, an int occupies $4$ bytes. Thus, the vector clustersA (or clustersB) occupies $11 \times 4 = 44$ bytes, at least. The bottom line is that copying vectors is not efficient and unnecessary copying of vectors should be avoided.

To tackle this problem one should use call by const-reference, instead.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
// Display the cluster counters in vector C
// Note: call-by-reference is used, but the function cannot modify C
void display_cluster_counters(const std::vector<int>& C) {
    int counter = 0;
	
    for (int c : C) {
        if (counter == 10) { // 11th cluster?
            std::cout << "[" << 100 << "," << 100 << "]: ";
        } else {
            std::cout << "[" << counter * 10 << "," << (counter + 1) * 10 - 1 << "]: ";
        }

        std::cout << c << "\n";

        ++counter;
    }
}

Similar to call-by-reference, the function display_cluster_counters receives the address of the vector passed to it and, therefore, it has access to the actual arguments defined in the main, clustersA and clustersB. Thus, there is no copying of the vector passed to the function. In addition, in C++, it is possible to state that an argument passed to the function should be handled as a constant within the function. In the case of the example above, this is expressed by adding const to the function’s argument type (const std::vector<int>&) which implies that the vector referred by C cannot be modified by the function. Any such attempts would result in a compiler error. In this way, it’s possible to combine the efficiency of call-by-reference with the safety of call-by-value for function arguments that occupy many bytes of memory and should not be modified by the function.

We suggest that you make the following experiment: add a line of code to the function display_cluster_counters which attempts to modify vector C, like line of code $17$ in the function below. Note then the compilation error.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
// Display the cluster counters in vector C
void display_cluster_counters(const std::vector<int>& C) { // call by const reference
    int counter = 0;
	
    for (int c : C) {
        if (counter == 10) { // 11th cluster?
            std::cout << "[" << 100 << "," << 100 << "]: ";
        } else {
            std::cout << "[" << counter * 10 << "," << (counter + 1) * 10 - 1 << "]: ";
        }

        std::cout << c << "\n";

        ++counter;
    }

    C[0] = -1;  //Compiler error -- "C: you cannot assign to a variable that is const"
}

The course grades example

You can find below the complete program for the course grades example introduced in motivation section for functions and discussed also in the previous sections.

#include <iostream>
#include <iomanip>
#include <vector>

// Read a list of grades
// Return a vector of 11 cluster counters
std::vector<int> read_grades();

// Display the cluster counters in C
void display_cluster_counters(const std::vector<int>& C);

/* ***************************** */

int main() {
    // Read grades for class A and create the clusters counters
    std::cout << "Enter grades for  class A:\n";
    std::vector<int> clustersA = read_grades();

   // Read grades for class B and create the clusters counters
    std::cout << "Enter grades for  class B:\n";
    std::vector<int> clustersB = read_grades();

    // Display cluster counters, for class A
    std::cout << "\nClass A\n";
    display_cluster_counters(clustersA);

    // Display cluster counters, for class B
    std::cout << "\nClass B\n";
    display_cluster_counters(clustersB);
}

/* ***************************** */

// Read a list of grades
// Return a vector of 11 cluster counters
std::vector<int> read_grades() {
    std::vector<int> C(11, 0); //create a vector of 11 cluster counters initialized with zeros

    for (int grade; std::cin >> grade && grade >= 0 && grade <= 100;) {
        ++C[grade / 10];
    }
    return C;
}

// Display the cluster counters in C
void display_cluster_counters(const std::vector<int>& C) {
    int counter = 0;

    for (int c : C) {
        if (counter == 10) { // 11th cluster?
            std::cout << "[" << 100 << "," << 100 << "]: ";
        } else {
            std::cout << "[" << counter * 10 << "," << (counter + 1) * 10 - 1 << "]: ";
        }

        std::cout << c << "\n";

        ++counter;
    }
}

Good programming practices

In the section Passing arguments to functions, we presented some guidelines for which method to use when passing arguments to functions. We now complete that information with one more case: call by const-reference.

Call-by-value Call-by-reference Call by
const
reference
void f(T x) {
   //Do something with x
}
void update(T& x) {
   //Do something that modifies x
}
void f(const T& x) {
   //Do something with x
   //x cannot be modified
}
Type T is cheap to copy. Thus, T occupies few bytes (e.g. int, double, bool). Function modifies the value stored in the variable referred by x. Type T is expensive to copy. Thus, T occupies many bytes. But, x cannot be modified in f.

Good programming practice: use call by const-reference for a vector passed to a function f, if f should not modify the vector passed to it.

When using call-by-reference (or call by const-reference) to pass a vector as argument to a function f, the memory address of the vector is sent to f. A memory address is a non-negative integer usually ocuppying not more than $8$ bytes, while a vector often occupies many more bytes of memory.

Recall that C++-programmers assume that if a function uses call-by-reference to pass an argument x then the function can, and it will, modify x.

We also stress that call by const-reference shouldn’t be used for function arguments that occupy few bytes of memory. The reason is that call-by-value results in code that executes usually faster than call by const-reference, in the case of variables that require a few bytes of memory.

Remove an item from a vector

Programs that store a list of values in a vector may need, at some point, to remove a given value from the list. Thus, it’s useful to create a function to remove a given value x from a given vector V.

Recall that it’s only possible to remove the last element stored in a vector (this operation is called std::pop_back()). Removing an element which is not the last element in the vector is not directly supported by the language and, therefore, requires some extra coding to accomplish this task. We present two functions, based on two different algorithms, to accomplish this task.

Algorithm $1$

The first function is based on the following simple algorithm:

  1. Copy all elements of V to a new vector, with exception of the elements equal to x. Let us call this new vector aux (from auxliary).
  2. Return the newly created vector aux.
// Return a vector with all elements in V except those equal to x
std::vector<int> copy_some(const std::vector<int>& V, int x) {
    std::vector<int> aux;  // empty vector

    for (int e : V) {
        if (e != x) {
            aux.push_back(e); // add e in the end of aux
        }
    }
    return aux; 
}

As you can see, the function above considers vectors storing values of type int, though the code could be easily generalized for other types (e.g. double). In C++, it is also possible to write generic functions which are not dependent on the type of elements stored in the vector. These are called function templates and will be introduced in a more advanced C++ programming course.

Note that call by const-reference is used to pass a vector V to the function, since the function does not modify the vector passed to it, while call-by-value is used for argument x. In addition, if there are multiple occurrences of x in V then none of them will be copied to the new vector aux. As an exercise, we suggest that you modify the function above so that it copies all elements in V to a new vector, with exception for the first element in V equal to x.

You can find below a main that calls function copy_some.

int main() {
    std::vector<int> list = {2, 4, 5, 8, 10, 12, 23, 16, 10, 300};

    // Read a value
    std::cout << "Enter an integer:\n";
    int value;
    std::cin >> value;

    // Remove all occurrences of value from the list
    list = copy_some(list, value);
	
    // Display the list
    std::cout << "List = ";

    for (int e : list) {
        std::cout << e << " ";
    }
}

Algorithm $2$

Our second function is based on a slighly more complicated algorithm with three steps.

  1. Search for the first occurrence of x in vector V.
  2. Shift all elements after (the first occurrence of) x one slot to the left.
  3. Remove the last element of vector V.

Let us illustrate this algorithm with a small example. Consider the vector V given below and that one wants to remove 3.

V[0] V[1] V[2] V[3]
2 3 6 8

Step $1$: The first occurence of 3 in V is found in slot with index $1$.

Step $2$: All elements in V stored after slot $1$ are shifted to the left, i.e. V[2] is copied into V[1] and V[3] is copied into V[2].

V[0] V[1] V[2] V[3]
2 6 8 8

Step $3$: The last element in the vector is now removed (V.pop_back() can be used here).

V[0] V[1] V[2]
2 6 8

Next, we translate this second algorithm into C++-code. Since the first step of the algorithm requires a search, we define first a function to make a linear search for a value in a vector (i.e. to search for the value to be removed). Note that the call by const-reference is used to pass a vector to the search function, since a vector usually occupies too many bytes of memory to be copied and a search function should not modify the values stored in the vector.

constexpr int not_found = -1;

// Return the index of the first occurrence of value in V
// Return not_found, if value does not occur in V
int linear_search(const std::vector<int>& V, int value) {
    int counter = 0; // to count slots visited

    for (int e : V) {
        if (e == value) {
            return counter;
        } else {
            ++counter;
        }
    }
    return not_found;
}

We can now define a function named remove based on the second algorithm presented above. Note that call-by-reference is used to pass a vector to this function because function remove modifies the vector passed to it.

void remove(std::vector<int>& V, int x) {
    // Step 1: find first occurrenc of x in V
    int found = linear_search(V, x);

    if (found == not_found) // end the function, if x does not exist in V
        return;

    // Step 2: shift all values after x to the left
    for (int i = found + 1; i < std::ssize(V); ++i) {
        V[i - 1] = V[i];
    }

    // Step 3: remove last element of V
    V.pop_back();
}

You can find below a main that calls this function remove.

int main() {
    // Declare a vector and initialize it
    std::vector<int> list = {2, 4, 5, 8, 10, 12, 23, 16, 10, 300};

    // Read a value
    std::cout << "Enter an integer:\n";
    int value;
    std::cin >> value;

    remove(list, value);

    // Display the list
    std::cout << "List = ";

    for (int e : list) {
        std::cout << e << " ";
    }
}

Let us now compare both algorithms given above for removing an element from a vector.

  • Algorithm $1$ does not modify the vector V passed to it, while algorithm $2$ modifies V. Thus, call by const-reference is used to pass the vector to the first function (named copy_some), but call-by-reference is used to pass the vector to the second function (named remove).
  • Algorithm $1$ creates a new vector and, consequently, requires more memory space than Algorithm $2$.
  • Algorithm $1$ can be used to eliminate all values equal to x from a list stored in a vector. Algorithm $2$ removes only the first occurrence of x.

Algorithm $1$ can be easily modified to remove only the first occurence of x. An interesting exercise that we suggest that you try is to modify the function remove based on the $2$nd algorithm so that it removes all occurrences of x from the vector.