Saturday, 26 January 2013

Practice Problem

Problem:01

In this challenge, write a program that takes in three arguments: a start temperature (in Celsius), an end temperature (in Celsius) and a step size. Print out a table that goes from the start temperature to the end temperature, in steps of the step size; you do not actually need to print the final end temperature if the step size does not exactly match. You should perform input validation: do not accept start temperatures less than a lower limit (which your code should specify as a constant) or higher than an upper limit (which your code should also specify). You should not allow a step size greater than the difference in temperatures. (This exercise is based on a problem from The C Programming Language.)

Sample run:

  Please give in a lower limit, limit >= 0: 10
  Please give in a higher limit, 10 < limit <= 50000: 20
  Please give in a step, 0 < step <= 10: 4

  Celsius         Fahrenheit
  -------         ----------
  10.000000       50.000000
  14.000000       57.200000
  18.000000       64.400000

Problem:02

Line Count Programming Challenge

Here's a simple, help-free challenge to get you started: write a program that takes a file as an argument and counts the total number of lines. Lines are defined as ending with a newline character. Program usage should be count filename.txt and the output should be the line count.

Problem:03

File Size Challenge

In this challenge, given the name of a file, print out the size of the file, in bytes. If no file is given, provide a help string to the user that indicates how to use the program. You might need help with taking parameters via the command line or file I/O in C++ (if you want to solve this problem in C, you might be interested in this article on C file I/O).

Solve all of the problems. If you get stuck on any of them, email us and we will send you a solution.

Final Lesson:The C++ Modulus Operator

Take a simple arithmetic problem: what's left over when you divide 11 by 3? The answer is easy to compute: divide 11 by 3 and take the remainder: 2. But how would you compute this in a programming language like C or C++? It's not hard to come up with a formula, but the language provides a built-in mechanism, the modulus operator ('%'), that computes the remainder that results from performing integer division.
The modulus operator is useful in a variety of circumstances. It is commonly used to take a randomly generated number and reduce that number to a random number on a smaller range, and it can also quickly tell you if one number is a factor of another.

If you wanted to know if a number was odd or even, you could use modulus to quickly tell you by asking for the remainder of the number when divided by 2.

#include <iostream>

using namespace std;

int main()
{
    int num;
    cin >> num;
    // num % 2 computes the remainder when num is divided by 2
    if ( num % 2 == 0 )
    {
        cout << num << " is even ";
    }

    return 0;
}

The key line is the one that performs the modulus operation: "num % 2 == 0". A number is even if and only if it is divisible by two, and a number is divisible by another only if there is no remainder.

How could you use modulus to write a program that checks if a number is prime?

Lesson 29:Getting Random Values in C and C++ with Rand

At some point in any programmer's life, he or she must learn how to get a random value, or values, in their program. To some this seems involved, difficult, or even beyond their personal ability. This, however, is simply not the case.

Randomizing of values is, at its most basic form, one of the easier things a programmer can do with the C++ language. I have created this short tutorial to aid you in learning, constructing, and using the functions available to you to randomize values.

I will first start with an introduction to the idea of randomizing values, followed by a simple example program that will output three random values. Once a secure understanding of these concepts is in place (hopefully it will be), I will include a short program that uses a range of values from which the random values can be taken.

Ok, now that you know why this tutorial was written, and what it includes, you are ready to learn how to randomize values! So without further ado, let's get started, shall we?

Many programs that you will write require the use of random numbers. For example, a game such as backgammon requires a roll of two dice on each move. Since there are 6 numbers on each die, you could calculate each roll by finding a random number from 1 to 6 for each die.

To make this task a little easier, C++ provides us with a library function called rand, which returns an integer between 0 and RAND_MAX. Let's take a break to explain what RAND_MAX is. RAND_MAX is a compiler-dependent constant, and it is inclusive. Inclusive means that the value of RAND_MAX is included in the range of values. The function, rand, and the constant, RAND_MAX, are declared in the library header file stdlib.h.

The number returned by the function rand depends on an initial value, called a seed, which by default remains the same for each run of a program. This means that the sequence of random numbers that is generated by the program will be exactly the same on each run of the program.

How do you solve this problem you ask? Well I'll tell you! To help us combat this problem we will use another function, srand(seed), which is also declared in the stdlib.h header file. This function allows an application to specify the initial value used by rand at program startup.

Using this method of randomization, the program will use a different seed value on every run, causing a different set of random values every run, which is what we want in this case. The problem posed to us now, of course, is how to get an arbitrary seed value. Forcing the user or programmer to enter this value every time the program was run wouldn't be very efficient at all, so we need another way to do it.

So we turn to the perfect source for our always-changing value, the system clock. The C++ data type time_t and the function time, both declared in time.h, can be used to easily retrieve the time on the computer's clock.

When converted to an unsigned integer, a positive whole number, the program time (at execution of program) can make a very nice seed value. This works nicely because two program executions will rarely start at the same instant of the computer's clock. Keep in mind that time typically counts in whole seconds, so two runs started within the same second will get the same seed.

As promised, here is a very basic example program. The following code was written in Visual C++ 6.0, but should compile fine on most computers (given you have a compiler, which if you're reading this I assume you do). The program outputs three random values.

/*Steven Billington
January 17, 2003
Ranexample.cpp
Program displays three random integers.
*/
/*
Header: iostream
Reason: Input/Output stream
Header: cstdlib
Reason: For functions rand and srand
Header: time.h
Reason: For function time, and for data type time_t
*/
#include <iostream>
#include <cstdlib>
#include <time.h>

using namespace std;

int main()
{
/*
Declare variable to hold seconds on clock.
*/
time_t seconds;
/*
Get value from system clock and
place in seconds variable.
*/
time(&seconds);
/*
Convert seconds to an unsigned
integer.
*/
srand((unsigned int) seconds);
/*
Output random values.
*/
cout<< rand() << endl;
cout<< rand() << endl;
cout<< rand() << endl;
return 0;
}

Users of a random number generator might wish to have a narrower or a wider range of numbers than provided by the rand function. Ideally, to solve this problem a user would specify the range with integer values representing the lower and the upper bounds. To understand how we might accomplish this with the rand function, consider how to generate a number between 0 and an arbitrary upper bound, referred to as high, inclusive.

For any non-negative integer a and positive integer b, a % b is between 0 and b - 1, inclusive. With this in mind, the expression rand() % high + 1 would generate a number between 1 and high, inclusive, where high is less than or equal to RAND_MAX, a constant defined by the compiler. To replace the lower bound of 1 with an arbitrary value low, we can have the program generate a random number between 0 and high - low, inclusive, and then add low: rand() % (high - low + 1) + low.

I realize how confused you might be right now, so take a look at the next sample program I promised, run it, toy with it, and alter it to give you different values. It has been a pleasure to teach you another chapter in the world of C++, and you may feel free to email me at Silent_Death17@hotmail.com or to contact me on the message boards of this fine website, where I use the name RoD.

Enjoy, and happy programming!

/*
Steven Billington
January 17, 2003
exDice.cpp
Program rolls two dice with random
results.
*/
/*
Header: iostream
Reason: Input/Output stream
Header: cstdlib
Reason: For functions rand and srand
Header: time.h
Reason: For function time, and for data type time_t
*/
#include <iostream>
#include <cstdlib>
#include <time.h>
/*
These constants define our upper
and our lower bounds. The random numbers
will always be between 1 and 6, inclusive.
*/
const int LOW = 1;
const int HIGH = 6;

using namespace std;

int main()
{
/*
Variables to hold random values
for the first and the second die on
each roll.
*/
int first_die, sec_die;
/*
Declare variable to hold seconds on clock.
*/
time_t seconds;
/*
Get value from system clock and
place in seconds variable.
*/
time(&seconds);
/*
Convert seconds to an unsigned
integer.
*/
srand((unsigned int) seconds);
/*
Get first and second random numbers.
*/
first_die = rand() % (HIGH - LOW + 1) + LOW;
sec_die = rand() % (HIGH - LOW + 1) + LOW;
/*
Output first roll results.
*/
cout<< "Your roll is (" << first_die << ", "
<< sec_die << ")" << endl << endl;
/*
Get two new random values.
*/
first_die = rand() % (HIGH - LOW + 1) + LOW;
sec_die = rand() % (HIGH - LOW + 1) + LOW;
/*
Output second roll results.
*/
cout<< "My roll is (" << first_die << ", "
<< sec_die << ")" << endl << endl;
return 0;
}

Lesson 28:Formatting Cout Output in C++ using iomanip

Creating cleanly formatted output is a common programming requirement--it improves your user interface and makes it easier to read any debugging messages that you might print to the screen. In C, formatted output works via the printf statement, but in C++, you can create nicely formatted output to streams such as cout. This tutorial covers a set of basic I/O manipulations possible in C++ from the iomanip header file. Note that all of the functions in the iomanip header are inside the std namespace, so you will need to either prefix your calls with "std::" or put "using namespace std;" before using the functions.

Dealing with Spacing Issues using iomanip

A principal aspect of nicely formatted output is that the spacing looks right. There aren't columns of text that are too long or too short, and everything is appropriately aligned. This section deals with ways of spacing output correctly.

Setting the field width with setw

The std::setw function allows you to set the minimum width of the next output via the insertion operator. setw takes one argument, an integer giving the width of the next output (insertion). If the next output is too short, then spaces will be used for padding. There is no effect if the output is longer than the width--note that the output won't be truncated. The only strange thing about setw is that its return value must be inserted into the stream. The setw function has no effect if it is called without reference to a stream. A simple example is

using namespace std;
cout<<setw(10)<<"ten"<<"four"<<"four";

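Because output is right-aligned by default, the padding goes in front of "ten", and the width applies only to that first insertion. The output from the above would look like this:

       tenfourfour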

Note that since setw takes an argument, at runtime it would be possible to specify the width of a column of output so that it is slightly wider than the longest element of the column.

You might wonder whether it is possible to change the padding character. It turns out that yes, you can, by using the setfill function, which takes a character to use for the padding. Note that setfill should also be used as a stream manipulator only, so it must be inserted into the stream:

cout<<setfill('-')<<setw(80)<<"-"<<endl;

The above code sets the padding character to a dash, the width of the next output to be at least 80 characters, and then outputs a dash. This results in the rest of the line being filled with dashes too. The output would look like this:

--------------------------------------------------------------------------------

Note that the pad character is changed until the next time you call setfill to change it again.

Aligning text with iomanip

It's possible to specify whether output is left or right aligned by using the manipulator flags that are part of ios_base. In particular, it is possible to specify that output should be either left or right aligned by passing in the stream manipulators std::left and std::right.

Putting Your Knowledge of iomanip Together

Now that we know how to space and align text, we can correctly print formatted data in columns. For instance, if you had a struct containing the names of individuals:

using namespace std;

struct person
{
    string firstname;
    string lastname;
};

If you then had a vector of persons, then you could output them in a nice way with evenly spaced columns for the first and last name as follows:

// given the above code, we could write this
vector<person> people;
// fill the vector somehow

int field_one_width = 0, field_two_width = 0;

// get the max widths

for ( vector<person>::iterator iter = people.begin();
      iter != people.end();
      ++iter )
{
    if ( iter->firstname.length() > field_one_width )
    {
        field_one_width = iter->firstname.length();
    }
    if ( iter->lastname.length() > field_two_width )
    {
        field_two_width = iter->lastname.length();
    }
}

// print the elements of the vector
for ( vector<person>::iterator iter = people.begin();
      iter != people.end();
      ++iter )
{
    cout<<setw(field_one_width)<<left<<iter->firstname;
    cout<<" ";
    cout<<setw(field_two_width)<<left<<iter->lastname<<endl; // one person per line
}

Note that the space output between the two fields wasn't strictly necessary because we could have added it by changing the first call to setw to set the width to one more than the longest first name (since it would use a space as the padding for the extra character).

Printing Numbers

Another challenge in creating nice output is correctly formatting numbers; for instance, when printing out a hexadecimal value, it would be nice if it were preceded by the "0x" prefix. More generally, it's nice to correctly set the number of trailing zeros after a decimal place.

Setting the precision of numerical output with setprecision

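The setprecision function can be used to set the maximum number of digits that are displayed for a number; in the default floating-point notation, these are significant digits, not digits after the decimal point. Like setw, it should be inserted into the stream. Its usage is similar to setw, except that the setting persists rather than applying only to the next output. For instance, to print the number 2.71828 to 3 significant digits: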

std::cout << setprecision(3) << 2.71828;

Note that setprecision will change the precision until the next time it is passed into a given stream. So changing the above example to also print out 1.412 would result in the output of

2.72 1.41

Output in different bases

In computer science, frequently numbers need to be printed in octal or hexadecimal. The setbase function returns a value that can be passed into a stream to set the base of numbers to either base 8, 10, or 16. The input number is still read as a number in base ten, but it is printed in the given base. For instance,

std::cout << setbase(16) << 32;

will print out "20", which is 32 written in base 16. Note that you can use dec, oct, and hex as shorthand for setbase(10), setbase(8), and setbase(16) respectively when inserting into a stream. If you wish to include an indication of the base along with the printed number, you can use the setiosflags function, again passed into a stream, with an input of ios_base::showbase. Using the ios_base::showbase flag will put a "0x" in front of hexadecimal numbers and a 0 in front of octal numbers. Decimal numbers will be printed as normal.

std::cout << setiosflags(ios_base::showbase) << setbase(16) << 32;

This should get you started with the ability to create nicely formatted output in C++ without having to resort to returning to printf!

Lesson 27:The C Preprocessor

The C preprocessor modifies a source code file before handing it over to the compiler. You're most likely used to using the preprocessor to include files directly into other files, or #define constants, but the preprocessor can also be used to create "inlined" code using macros expanded at compile time and to prevent code from being compiled twice.
There are essentially three uses of the preprocessor--directives, constants, and macros. Directives are commands that tell the preprocessor to skip part of a file, include another file, or define a constant or macro. Directives always begin with a sharp sign (#) and for readability should be placed flush to the left of the page. All other uses of the preprocessor involve processing #define'd constants or macros. Typically, constants and macros are written in ALL CAPS to indicate they are special (as we will see).

Header Files

The #include directive tells the preprocessor to grab the text of a file and place it directly into the current file. Typically, such statements are placed at the top of a program--hence the name "header file" for files thus included.

Constants

If we write

#define [identifier name] [value]

whenever [identifier name] shows up in the file, it will be replaced by [value].

If you are defining a constant in terms of a mathematical expression, it is wise to surround the entire value in parentheses:

#define PI_PLUS_ONE (3.14 + 1)

By doing so, you avoid the possibility that an order of operations issue will destroy the meaning of your constant:

x = PI_PLUS_ONE * 5;

Without parentheses, the above would be converted to

x = 3.14 + 1 * 5;

which would result in 1 * 5 being evaluated before the addition, not after. Oops!

It is also possible to write simply

#define [identifier name]

which defines [identifier name] without giving it a value. This can be useful in conjunction with another set of directives that allow conditional compilation.

Conditional Compilation

There are a whole set of options that can be used to determine whether the preprocessor will remove lines of code before handing the file to the compiler. They include #if, #elif, #else, #ifdef, and #ifndef. An #if or #if/#elif/#else block or a #ifdef or #ifndef block must be terminated with a closing #endif.

The #if directive takes a numerical argument that evaluates to true if it's non-zero. If its argument is false, then code until the closing #else, #elif, or #endif will be excluded.

Commenting out Code

Conditional compilation is a particularly useful way to comment out a block of code that contains multi-line comments (which cannot be nested).

#if 0
/* comment ...
*/

// code

/* comment */
#endif

Include Guards

Another common problem is that a header file is required in multiple other header files that are later included into a source code file, with the result often being that variables, structs, classes or functions appear to be defined multiple times (once for each time the header file is included). This can result in a lot of compile-time headaches. Fortunately, the preprocessor provides an easy technique for ensuring that any given file is included once and only once.

By using the #ifndef directive, you can include a block of text only if a particular expression is undefined; then, within the header file, you can define the expression. This ensures that the code in the #ifndef is included only the first time the file is loaded.

#ifndef _FILE_NAME_H_
#define _FILE_NAME_H_

/* code */

#endif // #ifndef _FILE_NAME_H_

Notice that it's not necessary to actually give a value to the expression _FILE_NAME_H_. It's sufficient to include the line "#define _FILE_NAME_H_" to make it "defined". (Note that there is an n in #ifndef--it stands for "if not defined").

A similar tactic can be used for defining specific constants, such as NULL:

#ifndef NULL
#define NULL ((void *)0)
#endif // #ifndef NULL

Notice that it's useful to comment which conditional statement a particular #endif terminates. This is particularly true because preprocessor directives are rarely indented, so it can be hard to follow the flow of execution.

On many compilers, the #pragma once directive can be used instead of include guards.

Macros

The other major use of the preprocessor is to define macros. The advantage of a macro is that it can be type-neutral (this can also be a disadvantage, of course), and it's inlined directly into the code, so there isn't any function call overhead. (Note that in C++, it's possible to get around both of these issues with templated functions and the inline keyword.)

A macro definition is usually of the following form:

#define MACRO_NAME(arg1, arg2, ...) [code to expand to]

For instance, a simple increment macro might look like this:

#define INCREMENT(x) x++

They look a lot like function calls, but they're not so simple. There are actually a couple of tricky points when it comes to working with macros. First, remember that the exact text of the macro argument is "pasted in" to the macro. For instance, if you wrote something like this:

#define MULT(x, y) x * y

and then wrote

int z = MULT(3 + 2, 4 + 2);

what value do you expect z to end up with? The obvious answer, 30, is wrong! That's because what happens when the macro MULT expands is that it looks like this:

int z = 3 + 2 * 4 + 2;    // 2 * 4 will be evaluated first!

So z would end up with the value 13! This is almost certainly not what you want to happen. The way to avoid it is to force the arguments themselves to be evaluated before the rest of the macro body. You can do this by surrounding them by parentheses in the macro definition:

#define MULT(x, y) (x) * (y)
// now MULT(3 + 2, 4 + 2) will expand to (3 + 2) * (4 + 2)

But this isn't the only gotcha! It is also generally a good idea to surround the macro's code in parentheses if you expect it to return a value. Otherwise, you can get similar problems as when you define a constant. For instance, the following macro, which adds 5 to a given argument, has problems when embedded within a larger statement:

#define ADD_FIVE(a) (a) + 5

int x = ADD_FIVE(3) * 3;
// this expands to (3) + 5 * 3, so 5 * 3 is evaluated first
// Now x is 18, not 24!

To fix this, you generally want to surround the whole macro body with parentheses to prevent the surrounding context from affecting the macro body.

#define ADD_FIVE(a) ((a) + 5)

int x = ADD_FIVE(3) * 3;

On the other hand, if you have a multiline macro that you are using for its side effects, rather than to compute a value, you probably want to wrap it within curly braces so you don't have problems when using it following an if statement.

// We use a trick involving exclusive-or to swap two variables
#define SWAP(a, b)  a ^= b; b ^= a; a ^= b; 

int x = 10;
int y = 5;

// works OK
SWAP(x, y);

// What happens now?
if(x < 0)
    SWAP(x, y);

When SWAP is expanded in the second example, only the first statement, a ^= b, is governed by the conditional; the other two statements will always execute. What we really meant was that all of the statements should be grouped together, which we can enforce using curly braces:

#define SWAP(a, b)  {a ^= b; b ^= a; a ^= b;}

Now, there is still a bit more to our story! What if you write code like so:

#define SWAP(a, b)  { a ^= b; b ^= a; a ^= b; }

int x = 10;
int y = 5;
int z = 4;

// What happens now?
if(x < 0)
    SWAP(x, y);
else
    SWAP(x, z);

Then it will not compile because the semicolon after the closing curly brace will break the flow between if and else. The solution? Use a do-while loop:

#define SWAP(a, b)  do { a ^= b; b ^= a; a ^= b; } while ( 0 )

int x = 10;
int y = 5;
int z = 4;

// What happens now?
if(x < 0)
    SWAP(x, y);
else
    SWAP(x, z);

Now the semicolon doesn't break anything because it simply terminates the do-while statement. (By the way, note that we didn't surround the arguments in parentheses because we don't expect anyone to pass an expression into swap!)

More Gotchas

By now, you've probably realized why people don't really like using macros. They're dangerous, they're picky, and they're just not that safe. Perhaps the most irritating problem with macros is that you don't want to pass arguments with "side effects" to macros. By side effects, I mean any expression that does something besides evaluate to a value. For instance, ++x evaluates to x+1, but it also increments x. This increment operation is a side effect.

The problem with side effects is that macros don't evaluate their arguments; they just paste them into the macro text when performing the substitution. So something like

#define MAX(a, b) ((a) < (b) ? (b) : (a))
int x = 5, y = 10;
int z = MAX(x++, y++);

will end up looking like this:

int z = ((x++) < (y++) ? (y++) : (x++));

The problem here is that y++ ends up being evaluated twice! The nasty consequence is that after this expression, y will have a value of 12 rather than the expected 11. This can be a real pain to debug!

Multiline macros

Until now, we've seen only short, one line macros (possibly taking advantage of the semicolon to put multiple statements on one line.) It turns out that by using a backslash ("\") to indicate a line continuation, we can write our macros across multiple lines to make them a bit more readable.

For instance, we could rewrite swap as

#define SWAP(a, b)  {                   \
                        a ^= b;         \
                        b ^= a;         \
                        a ^= b;         \
                    }

Notice that you do not need a backslash at the end of the last line! The backslash tells the preprocessor that the macro continues to the next line, not that the line is a continuation from a previous line. The backslash must also be the very last character on its line, with no spaces after it.

Aside from readability, writing multi-line macros may make it more obvious that you need to use curly braces to surround the body because it's more clear that multiple effects are happening at once.

Advanced Macro Tricks

In addition to simple substitution, the preprocessor can also perform a bit of extra work on macro arguments, such as turning them into strings or pasting them together.

Pasting Tokens

Each argument passed to a macro is a token, and sometimes it might be expedient to paste arguments together to form a new token. This could come in handy if you have a complicated structure and you'd like to debug your program by printing out different fields. Instead of writing out the whole structure each time, you might use a macro to pass in the field of the structure to print.

To paste tokens in a macro, use ## between the two things to paste together.

For instance

#define BUILD_FIELD(n) my_struct.inner_struct.union_a.field##n

Now, when used with a particular field suffix, such as BUILD_FIELD(1), it will expand to something like

my_struct.inner_struct.union_a.field1

The tokens are literally pasted together.

String-izing Tokens

Another potentially useful macro option is to turn a token into a string containing the literal text of the token. This might be useful for printing out the token. The syntax is simple--simply prefix the token with a pound sign (#).

#define PRINT_TOKEN(token) printf(#token " is %d", token)

For instance, PRINT_TOKEN(foo) would expand to

printf("foo" " is %d", foo)

(Note that in C, string literals next to each other are concatenated, so something like "token" " is " " this " will effectively become "token is this". This can be useful for formatting printf statements.)

For instance, you might use it to print the value of an expression as well as the expression itself (for debugging purposes).

PRINT_TOKEN(x + y);

Avoiding Macros in C++

In C++, you should generally avoid macros when possible. You won't be able to avoid them entirely if you need the ability to paste tokens together, but with templated classes and type inference for templated functions, you shouldn't need to use macros to create type-neutral code. Inline functions should also get rid of the need for macros for efficiency reasons. (Though you aren't guaranteed that the compiler will inline your code.)

Moreover, you should use const to declare typed constants rather than #define to create untyped (and therefore less safe) constants. Const should work in pretty much all contexts where you would want to use a #define, including declaring static sized arrays or as template parameters.

Lesson 26:Enumerated Types - enums

Sometimes as programmers we want to express the idea that a variable will be used for a specific purpose and should only be able to have a small number of values--for instance, a variable that stores the current direction of the wind might only need to store values corresponding to north, south, east, and west. One solution to this problem might be to use an int and some #define'd values:

#define NORTH_WIND        0
#define SOUTH_WIND        1
#define EAST_WIND         2
#define WEST_WIND         3     
#define NO_WIND           4       

int wind_direction = NO_WIND;

The problem with this approach is that it doesn't really prevent someone from assigning a nonsensical value to wind_direction; for instance, I could set wind_direction to 453 without any complaints from my compiler. And if I looked at the type of wind_direction, I would see that it's just a plain old integer. There's just no way to know that something is wrong.

The idea behind enumerated types is to create new data types that can take on only a restricted range of values. Moreover, these values are all expressed as constants rather than magic numbers--in fact, there should be no need to know the underlying values. The names of the constants should be sufficient for the purposes of comparing values.

When you declare an enumerated type, you specify the name of the new type, and the possible values it can take on:

enum wind_directions_t {NO_WIND, NORTH_WIND, SOUTH_WIND, EAST_WIND, WEST_WIND};

Note the _t at the end of the name of the type: this stands for "type" and is a way to visually distinguish the name of the type from the name of variables. Your text editor may also have the ability to use syntax highlighting to make the new type look like other built-in types, such as int, for you.

Now we can declare a wind_directions_t variable that can only take on five values:

wind_directions_t wind_direction = NO_WIND;

wind_direction = 453; // doesn't work, we get a compiler error!

Note to C Programmers: If you're planning on using enums in C, however, you don't get this type safety. The above assignment will compile without giving you an error. By the way, if you're using enums in C, you will also need to prefix the declaration with the keyword enum: enum wind_directions_t wind_direction = NO_WIND;

You might be wondering exactly what values the constants take on--what if you wanted to compare them using < or >? You actually have a choice: if you want to set the values yourself, you may, or you can choose to use default values, which start at zero for the first constant and increase by one. In our example, NO_WIND has the value 0, and WEST_WIND has the value 4 (note that this ordering differs from our #define'd constants, where NO_WIND was 4).

On the other hand, we could reverse this by giving explicit values:

enum wind_directions_t {NO_WIND = 4, NORTH_WIND = 3, SOUTH_WIND = 2, EAST_WIND = 1, WEST_WIND = 0};

Why would you ever want to give explicit values to elements of an enumerated type? Isn't the whole point of constants so that you don't need to know what the values are? The answer is that if the values of the constant are never used outside of comparisons between elements of the enumeration, then there's almost no reason to define the values to be anything in particular (one exception is if you want one value to have multiple names, you'd have to set at least one value explicitly). But if you need the values for communicating with the outside world, you might need to give specific values. For example, if you decided to use an enum to store all of the possible text colors you could pass into a function to set the text colors, you'd probably need to make sure that the enum names, such as RED or BLUE, matched up to the values corresponding to those colors.

Printing Enums

You might wonder what happens when you print out an enum: by default, you'll get the integer value of the enum. If you want to do something fancier than that, you'll have to handle it specially.
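One way to "handle it specially" is a small function that maps each constant to a name. This is only a sketch: the function name wind_name and the strings it returns are made up for illustration, and the enum is assumed to be declared as in this lesson with its default values.

```cpp
#include <string>

// Assumed to match the enum used throughout this lesson (default values 0..4).
enum wind_directions_t {NO_WIND, NORTH_WIND, SOUTH_WIND, EAST_WIND, WEST_WIND};

// Hypothetical helper: returns a readable name for a wind direction.
std::string wind_name( wind_directions_t dir )
{
    switch ( dir )
    {
        case NO_WIND:    return "NONE";
        case NORTH_WIND: return "NORTH";
        case SOUTH_WIND: return "SOUTH";
        case EAST_WIND:  return "EAST";
        case WEST_WIND:  return "WEST";
    }
    return "UNKNOWN";  // reached only if dir holds an out-of-range value
}
```

With this in place, std::cout << wind_name( wind_direction ) prints a word, while std::cout << wind_direction prints the integer value.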

Naming Enums

One issue with enums is that the name of the enumerated type doesn't show up along with the enum. When you use the enum constant, it could really mean anything. The problem is that if you give your enums names that are too general, you can run into problems. First, it becomes hard to tell which enumeration a constant belongs to if you have several enumerated lists of values. A related problem is that sometimes you really want to use the same name. For instance, what if you had two color schemes, each of which included the color red, but for which the value of the RED constant needed to be different?

The solution to both of these problems is to include part of the name of the enum in the names of the constants. Notice that in the above example, I included "WIND" in the name of each enumerated constant. (Perhaps this wasn't entirely necessary--why not just name the constants after the cardinal directions themselves? The answer is that it depends on whether someone else is already using those names. In this case, we avoid the problem by making the names specific enough that it's unlikely someone else will have a WEST_WIND constant.)
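For the two color schemes mentioned above, the same idea might look like the sketch below. The enum names and the numeric values are invented for illustration; the point is only that each scheme gets its own RED constant.

```cpp
// Hypothetical color schemes; the numeric values are invented for this sketch.
enum text_color_t {TEXT_RED = 1, TEXT_BLUE = 2};
enum background_color_t {BACKGROUND_RED = 41, BACKGROUND_BLUE = 44};
```

Both constants can now be used in the same file, and a reader can tell at a glance which enumeration each one belongs to.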

Type Correctness

Because enums are "integer-like" types, they can safely be assigned into an integer without a cast. For instance, both of the following assignments are totally valid:

int my_wind = EAST_WIND;

wind_directions_t wind_direction = NO_WIND;

int my_other_wind = wind_direction;

As already mentioned, you can't make the reverse assignment in C++ without using a typecast. There might be times when you do need to do this, but you'd like to avoid it as best you can. For instance, if you need to convert a user's input into an enumerated type, it would be a bad idea to just read in an int and typecast it to the enum:

int raw_value;

std::cin >> raw_value;

wind_directions_t wind_direction = static_cast<wind_directions_t>( raw_value );

This would let the user input any value at all and, almost as bad, force the user to know the range of values that the enum could take on. A much better solution would be to shield the user from the enumeration by asking for a string and then validating the input by comparing it to the possible input strings to choose which constant to assign the enum. For instance,

std::cout << "Please enter NORTH, SOUTH, EAST, WEST, or NONE for our wind direction";
std::cout << std::endl;

std::string user_wind_dir;
std::cin >> user_wind_dir;

wind_directions_t wind_dir;

if ( user_wind_dir == "NORTH" )
{
        wind_dir = NORTH_WIND;
}
else if ( user_wind_dir == "SOUTH" )
{
        wind_dir = SOUTH_WIND;
}
else if ( user_wind_dir == "EAST" )
{
        wind_dir = EAST_WIND;
}
else if ( user_wind_dir == "WEST" )
{
        wind_dir = WEST_WIND;
}
else if ( user_wind_dir == "NONE" )
{
        wind_dir = NO_WIND;
}
else
{
        std::cout << "That's not a valid direction!" << std::endl;
}

Polymorphic Enums?

In C++, we often use polymorphism to allow old code to handle new code--for instance, as long as we subclass the interface expected by a function, we can pass in the new class and expect it to work correctly with the code that was written before the new class ever existed. Unfortunately, with enums, you can't really do this, even though there are occasional times you'd like to. (For instance, if you were managing the settings for your program and you stored all of them as enum values, then it might be nice to have an enum, settings_t, from which all of your other enums inherited so that you could store every new enum in the settings list. Note that since the list contains values of different types, you can't use templates.)

If you need this kind of behavior, you're forced to store the enums as integers and then retrieve them using typecasts to assign the particular value to the setting of interest. And you won't even get the benefit of dynamic_cast to help you ensure that the cast is safe--you'll have to rely on the fact that incorrect values cannot be stored in the list.
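A minimal sketch of that workaround follows. The two enums, the index constants, and the helper functions are all hypothetical; they only illustrate storing different enum types as plain ints and casting on the way back out.

```cpp
#include <vector>

// Two unrelated enums; the names and values are made up for illustration.
enum sound_setting_t {SOUND_OFF = 0, SOUND_ON = 1};
enum difficulty_setting_t {DIFFICULTY_EASY = 0, DIFFICULTY_HARD = 1};

// Hypothetical positions of each setting in the list.
const int SOUND_INDEX = 0;
const int DIFFICULTY_INDEX = 1;

// Builds a settings list; every enum is stored as a plain int.
std::vector<int> make_settings( sound_setting_t sound, difficulty_setting_t difficulty )
{
    std::vector<int> settings( 2 );
    settings[SOUND_INDEX] = sound;              // enum to int needs no cast
    settings[DIFFICULTY_INDEX] = difficulty;
    return settings;
}

// Reading a value back requires a cast, and nothing checks that the
// stored int really is a valid difficulty_setting_t.
difficulty_setting_t get_difficulty( const std::vector<int>& settings )
{
    return static_cast<difficulty_setting_t>( settings[DIFFICULTY_INDEX] );
}
```

Notice that nothing stops you from reading the sound setting back as a difficulty; keeping the indices straight is entirely up to you.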

Lesson 25:Template Specialization and Partial Template Specialization

Template Specialization

In many cases when working with templates, you'll write one generic version for all possible data types and leave it at that--every vector may be implemented in exactly the same way. The idea of template specialization is to override the default template implementation to handle a particular type in a different way.
For instance, while most vectors might be implemented as arrays of the given type, you might decide to save some memory and implement vectors of bools as a vector of integers with each bit corresponding to one entry in the vector. So you might have two separate vector classes. The first class would look like this.

template <typename T>
class vector
{
    // accessor functions and so forth
    private:
    T* vec_data;   // we'll store the data as block of dynamically allocated 
                   // memory
    int length;    // number of elements used 
    int vec_size;  // actual size of vec_data
};

But when it comes to bools, you might not really want to do this because most systems are going to use at least a full byte for each bool even though all that's required is a single bit. So we might make our boolean vector look a little bit different by representing the data as an array of integers whose bits we manually manipulate. (For more on manipulating bits directly, see bitwise operators and bit manipulations in C and C++.)

To do this, we still need to specify that we're working with something akin to a template, but this time the list of template parameters will be empty:

template <>

and the class name is followed by the specialized type: class className<type>. In this case, the template would look like this:

template <>
class vector <bool>
{
    // interface

    private:
    unsigned int *vector_data;
    int length;
    int size;
};

Note that it would be perfectly reasonable if the specialized version of the vector class had a different interface (set of public methods) than the generic vector class--although they're both vector templates, they don't share any interface or any code.
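To see the two versions side by side, here is a compact sketch. It assumes a fixed-size container named simple_vector (a made-up name, chosen so it doesn't clash with std::vector) with hypothetical set and get methods; a real vector would also need to grow and to support copying.

```cpp
#include <climits>
#include <cstddef>

// Generic version: stores one T per element.
template <typename T>
class simple_vector
{
    public:
    explicit simple_vector( std::size_t n ) : vec_data( new T[n]() ), length( n ) { }
    ~simple_vector() { delete [] vec_data; }
    void set( std::size_t i, T val ) { vec_data[i] = val; }
    T get( std::size_t i ) const { return vec_data[i]; }
    std::size_t size() const { return length; }

    private:
    simple_vector( const simple_vector& );             // copying is left out of this sketch
    simple_vector& operator=( const simple_vector& );
    T *vec_data;
    std::size_t length;
};

// Full specialization for bool: packs one element into each bit.
template <>
class simple_vector<bool>
{
    public:
    explicit simple_vector( std::size_t n )
        : vector_data( new unsigned int[( n + BITS - 1 ) / BITS]() ), length( n ) { }
    ~simple_vector() { delete [] vector_data; }
    void set( std::size_t i, bool val )
    {
        unsigned int mask = 1u << ( i % BITS );
        if ( val )
        {
            vector_data[i / BITS] |= mask;    // turn the bit on
        }
        else
        {
            vector_data[i / BITS] &= ~mask;   // turn the bit off
        }
    }
    bool get( std::size_t i ) const
    {
        return ( ( vector_data[i / BITS] >> ( i % BITS ) ) & 1u ) != 0;
    }
    std::size_t size() const { return length; }

    private:
    simple_vector( const simple_vector& );
    simple_vector& operator=( const simple_vector& );
    static const std::size_t BITS = sizeof( unsigned int ) * CHAR_BIT;
    unsigned int *vector_data;
    std::size_t length;
};
```

Code that declares a simple_vector<int> gets the first class, and code that declares a simple_vector<bool> gets the second, without the caller doing anything special.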

It's worth pointing out that the salient reason for the specialization in this case was to allow for a more space-efficient implementation, but you could think of other reasons why this might come in handy--for instance, if you wanted to add extra methods to one templated class based on its type, but not to other templates. For instance, you might have a vector of doubles with a method that returns the non-integer component of each element, although you might prefer inheritance in this case. There isn't a particular reason to prevent the existence of a vector of doubles without those extra features. If, however, you felt strongly about the issue and wanted to prevent it, you could do so using template specialization.

Another time when you might want to specialize certain templates could be if you have a template type that relies on some behavior that was not implemented in a collection of classes you'd like to store in that template. For example, if you had a templated sortedVector type that required the > operator to be defined, and a set of classes written by someone else that didn't include any overloaded operators but did include a function for comparison, you might specialize your template to handle these classes separately.

Template Partial Specialization

Partial template specialization stems from similar motives as full specialization as described above. This time, however, instead of implementing a class for one specific type, you end up implementing a template that still allows some parameterization. That is, you write a template that specializes on one feature but still lets the class user choose other features as part of the template. Let's make this more concrete with an example.

Going back to the idea of extending the concept of vectors so that we can have a sortedVector, let's think about how this might look: we'll need a way of making comparisons. Fine; we can just use > if it's been implemented, or specialize if it hasn't. But now let's say that we wanted to have pointers to objects in our sorted vector. We could sort them by the value of the pointers, just doing a standard > comparison (with the loop below, larger values end up first, so we'll have a vector sorted from high to low):

template <typename T>
class sortedVector
{
    public:
    void insert (T val)
    {
        if ( length == vec_size )   // length is the number of elements
        {
            vec_size *= 2;    // we'll just ignore overflow possibility!
            T *new_data = new T[vec_size];
            for ( int i = 0; i < length; ++i )   // keep the existing elements
            {
                new_data[i] = vec_data[i];
            }
            delete [] vec_data;
            vec_data = new_data;
        }

        // we'll start at the end, sliding elements back until we find the
        // place to insert the new element
        int pos;
        for( pos = length; pos > 0 && val > vec_data[pos - 1]; --pos )
        {
            vec_data[pos] = vec_data[pos - 1];
        }
        vec_data[pos] = val;
        ++length;  // we have now added an element
    }
    // other functions...
    private:
    T *vec_data;
    int length;
    int vec_size;  // capacity of vec_data; assumed to start out nonzero
};

Now, notice that in the above for loop, we're making a direct comparison between elements of type T. That's OK for most things, but when T is a pointer type it would probably make more sense to have sorted on the objects being pointed to instead of on the pointer addresses. To do that, we'd need to write code that had this line:

for( pos = length; pos > 0 && *val > *vec_data[pos - 1]; --pos )

Of course, that would break for any non-pointer type. What we want to do here is use a partial specialization based on whether the type is a pointer or a non-pointer (you could get fancy and have multiple levels of pointers, but we'll stay simple).

To declare a partially specialized template that handles any pointer types, we'd add this class declaration:

template <typename T>
class sortedVector<T *>
{
    public:
    // same functions as before.  Now the insert function looks like this:
    void insert( T *val )
    {
        if ( length == vec_size ) // length is the number of elements
        {
            vec_size *= 2;// we'll just ignore overflow possibility!
            T **new_data = new T*[vec_size];
            for ( int i = 0; i < length; ++i )   // keep the existing pointers
            {
                new_data[i] = vec_data[i];
            }
            delete [] vec_data;
            vec_data = new_data;
        }

        // we'll start at the end, sliding elements back until we find the
        // place to insert the new element
        int pos;
        for( pos = length; pos > 0 && *val > *vec_data[pos - 1]; --pos )
        {
            vec_data[pos] = vec_data[pos - 1];
        }
        vec_data[pos] = val;
        ++length;  // we have now added an element
    }

    private:
    T** vec_data;
    int length;
    int vec_size;  // capacity of vec_data; assumed to start out nonzero
};

There are a couple of syntax points to notice here. First, our template parameter list still names T as the parameter, but the declaration now has a T * after the name of the class; this tells the compiler to match a pointer of any type with this template instead of the more general template. The second thing to note is that T is now the type pointed to; it is not itself a pointer. For instance, when you declare a sortedVector<int *>, T will refer to the int type! This makes some sense if you think of it as a form of pattern matching where T matches the type if that type is followed by an asterisk. This does mean that you have to be a tad bit more careful in your implementation: note that vec_data is a T** because we need a dynamically sized array made up of pointers.
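The pattern matching is easier to see in a tiny example. The greater_than helper below is hypothetical (it isn't part of sortedVector); it only shows that inside the T * version, T names the pointed-to type.

```cpp
// Hypothetical helper: the general version compares the values directly.
template <typename T>
struct greater_than
{
    static bool compare( const T& a, const T& b ) { return a > b; }
};

// Partial specialization: chosen for any pointer type. Here T is the
// pointed-to type, so greater_than<int *> uses this version with T = int.
template <typename T>
struct greater_than<T *>
{
    static bool compare( const T *a, const T *b ) { return *a > *b; }
};
```

Calling greater_than<int *>::compare on two pointers compares the ints they point to, regardless of where those ints happen to live in memory.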

You might wonder if you really want your sortedVector type to work like this--after all, if you're putting them in an array of pointers, you'd expect them to be sorted by pointer value. But there's a practical reason for doing this: when you allocate memory for an array of objects, the default constructor must be called to construct each object. If no default constructor exists (for instance, if every object needs some data to be created), you're stuck needing a list of pointers to objects, but you probably want them to be sorted the same way the actual objects themselves would be!

Note, by the way, that you can also partially specialize on template arguments--for instance, if you had a fixedVector type that allowed the user of the class to specify both a type to store and the length of the vector (possibly to avoid the cost of dynamic memory allocations), it might look something like this:

template <typename T, unsigned length>
class fixedVector { ... };

Then you could partially specialize for booleans with the following syntax:

template <unsigned length>
class fixedVector<bool, length> { ... };

Note that since T is no longer a template parameter, it's left out of the template parameter list, leaving only length. Also note that length now shows up as part of fixedVector's name (unlike when you have a generic template declaration, where you specify nothing after the name). (By the way, don't be surprised to see a template parameter that's a non-type: it's perfectly valid, and sometimes useful, to have template arguments that are integer types such as unsigned.)
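Here is a sketch of how the two fixedVector versions could differ in practice. The storage_bytes function is an assumption made up for this example; it just reports how much space each version would need, with the bool version packing its elements into bits.

```cpp
#include <climits>

// General version: one T per element.
template <typename T, unsigned length>
class fixedVector
{
    public:
    // Hypothetical helper reporting the space needed for the elements.
    static unsigned storage_bytes()
    {
        return static_cast<unsigned>( length * sizeof( T ) );
    }
};

// Partial specialization: T is fixed to bool, length is still a parameter.
template <unsigned length>
class fixedVector<bool, length>
{
    public:
    // One bit per element, rounded up to a whole number of bytes.
    static unsigned storage_bytes()
    {
        return ( length + CHAR_BIT - 1 ) / CHAR_BIT;
    }
};
```

A fixedVector<bool, 16> would pick the second version, while a fixedVector<int, 16> would pick the first.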

A final implementation detail comes up with partial specializations: how does the compiler pick which specialization to use if there is a combination of a completely generic template, some partial specializations, and maybe even some full specializations? The general rule of thumb is that the compiler will pick the most specific template specialization--the most specific template specialization is the one whose template arguments would be accepted by the other template declarations, but which would not accept all possible arguments that other templates with the same name would accept.

For instance, if you decided that you wanted a sortedVector<int *> that sorted by memory location, you could create a full specialization of sortedVector and if you declared a sortedVector<int *>, then the compiler would pick that implementation over the less-specific partial specialization for pointers. It's the most specialized since only an int * matches the full specialization, not any other pointer type such as a double *, whereas int * certainly could be a parameter to either of the other templates.
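You can watch the compiler make this choice with a throwaway template whose only job is to report which version was picked. The name chosen_version and the strings are made up for this sketch.

```cpp
#include <string>

// Completely generic version.
template <typename T>
struct chosen_version
{
    static std::string name() { return "generic template"; }
};

// Partial specialization for any pointer type.
template <typename T>
struct chosen_version<T *>
{
    static std::string name() { return "partial specialization for pointers"; }
};

// Full specialization for int * only.
template <>
struct chosen_version<int *>
{
    static std::string name() { return "full specialization for int *"; }
};
```

Here chosen_version<double>::name() comes from the generic template, chosen_version<double *>::name() from the pointer version, and chosen_version<int *>::name() from the full specialization.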

Lesson 24:Templated Functions

C++ templates can be used both for classes and for functions. Templated functions are actually a bit easier to use than templated classes, as the compiler can often deduce the desired type from the function's argument list.
The syntax for declaring a templated function is similar to that for a templated class:

template <class type> type func_name(type arg1, ...);

For instance, to declare a templated function to add two values together, you could use the following syntax:

template <class type> type add(type a, type b)
{
    return a + b;
}

Now, when you actually use the add function, you can simply treat it like any other function because the desired type is also the type given for the arguments. This means that upon compiling the code, the compiler will know what type is desired:

int x = add(1, 2);

will correctly deduce that "type" should be int. This would be the equivalent of saying:

int x = add<int>(1, 2);

where the template is explicitly instantiated by giving the type as a template parameter.
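Giving the type explicitly also helps when the arguments disagree about what "type" should be. The sketch below repeats the add template so it stands on its own; the add_examples function is a hypothetical helper that only exists to show both kinds of call.

```cpp
template <class type> type add(type a, type b)
{
    return a + b;
}

// Hypothetical helper showing both forms of the call.
double add_examples()
{
    int whole = add(1, 2);                  // "type" is deduced as int
    // add(1, 2.5) would not compile: "type" cannot be both int and double
    double mixed = add<double>(1, 2.5);     // explicit type; 1 is converted to 1.0
    return whole + mixed;
}
```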

On the other hand, type inference of this sort isn't always possible because it's not always feasible to guess the desired types from the arguments to the function. For instance, if you wanted a function that performed some kind of cast on the arguments, you might have a template with multiple parameters:

template <class type1, class type2> type2 cast(type1 x)
{
    return (type2)x;
}

Using this function without specifying the correct type for type2 would be impossible, and because type2 comes second in the list, you would have to spell out type1 as well. On the other hand, it is possible to take advantage of some type inference if the template parameters are correctly ordered. In particular, if the parameter that must be specified comes first and the parameter that can be deduced comes second, it is only necessary to specify the first.

For instance, given the following declaration

template <class rettype, class argtype> rettype cast(argtype x)
{
    return (rettype)x;
}

this function call specifies everything that is necessary to allow the compiler to deduce the correct type:

cast<double>(10);

which will cast an int to a double. Note that arguments to be deduced must always follow arguments to be specified. (This is similar to the way that default arguments to functions work.)

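You might wonder why you cannot use the same type inference for classes in C++. The problem is that it would be a much more complex process with classes, especially as constructors may have multiple versions that take different numbers of parameters, and not all of the necessary template parameters may be used in any given constructor. (C++17 later added class template argument deduction for many of these cases, but in earlier versions of the language you must always spell out a class template's arguments.)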

Templated Classes with Templated Functions

It is also possible to have a templated class that has a member function that is itself a template, separate from the class template. For instance,

template <class type> class TClass
{
    // constructors, etc
    
    template <class type2> type2 myFunc(type2 arg);
};

The function myFunc is a templated function inside of a templated class, and when you actually define the function, you must respect this by using the template keyword twice:

template <class type>  // For the class
    template <class type2>  // For the function
    type2 TClass<type>::myFunc(type2 arg)
    {
        // code
    }

The following attempt to combine the two is wrong and will not work:

// bad code!
template <class type, class type2> type2 TClass<type>::myFunc(type2 arg)
{
    // ...
}

because it suggests that the template is entirely the class template and not a function template at all.
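Putting the correct form together, a complete version might look like the sketch below. The stored field, the constructor, and the body of myFunc are assumptions added so that there is something to compile and call.

```cpp
template <class type> class TClass
{
    public:
    explicit TClass( type value ) : stored( value ) { }

    // A member template: type2 is independent of the class's own type
    template <class type2> type2 myFunc( type2 arg );

    private:
    type stored;
};

template <class type>  // For the class
    template <class type2>  // For the function
    type2 TClass<type>::myFunc( type2 arg )
    {
        // Convert the stored value to type2 and add the argument to it
        return static_cast<type2>( stored ) + arg;
    }
```

Given TClass<int> t( 2 ), the call t.myFunc( 1.5 ) deduces type2 as double, while t.myFunc( 3 ) deduces it as int; in both cases the class's own type stays int.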

Lesson 23:Templates and Template Classes in C++

What's better than having several classes that do the same thing to different datatypes? One class that lets you choose which datatype it acts on.

Templates are a way of making your classes more abstract by letting you define the behavior of the class without actually knowing what datatype will be handled by the operations of the class. In essence, this is what is known as generic programming; this term is a useful way to think about templates because it helps remind the programmer that a templated class does not depend on the datatype (or types) it deals with. To a large degree, a templated class is more focused on the algorithmic thought rather than the specific nuances of a single datatype. Templates can be used in conjunction with abstract datatypes in order to allow them to handle any type of data. For example, you could make a templated stack class that can handle a stack of any datatype, rather than having to create a stack class for every different datatype for which you want the stack to function. The ability to have a single class that can handle several different datatypes means the code is easier to maintain, and it makes classes more reusable.

The basic syntax for declaring a templated class is as follows:

template <class a_type> class a_class {...};

The keyword 'class' above simply means that the identifier a_type will stand for a datatype. NB: a_type is not a keyword; it is an identifier that stands for a single datatype, which is fixed at compile time when the template is instantiated. For example, you could, when defining variables in the class, use the following line:

a_type a_var;

and when the programmer defines which datatype 'a_type' is to be when the program instantiates a particular instance of a_class, a_var will be of that type.

When defining a function as a member of a templated class, it is necessary to define it as a templated function:

template<class a_type> void a_class<a_type>::a_function(){...}

When declaring an instance of a templated class, the syntax is as follows:

a_class<int> an_example_class;

A class generated from a template for a particular datatype is called a specialization; the term specialization is useful to remember because it reminds us that the original class is a generic class, whereas a specific instantiation of the class is specialized for a single datatype (although it is possible to template multiple types).

Usually when writing code it is easiest to proceed from concrete to abstract; therefore, it is easier to write a class for a specific datatype and then proceed to a templated (generic) class. Since brevity is the soul of wit, this example will be brief and therefore of little practical application.

We will define the first class to act only on integers.

class calc
{
  public:
    int multiply(int x, int y);
    int add(int x, int y);
 };
int calc::multiply(int x, int y)
{
  return x*y;
}
int calc::add(int x, int y)
{
  return x+y;
}

We now have a perfectly harmless little class that functions perfectly well for integers; but what if we decided we wanted a generic class that would work equally well for floating point numbers? We would use a template.

template <class A_Type> class calc
{
  public:
    A_Type multiply(A_Type x, A_Type y);
    A_Type add(A_Type x, A_Type y);
};
template <class A_Type> A_Type calc<A_Type>::multiply(A_Type x,A_Type y)
{
  return x*y;
}
template <class A_Type> A_Type calc<A_Type>::add(A_Type x, A_Type y)
{
  return x+y;
}

To understand the templated class, just think about replacing the identifier A_Type everywhere it appears, except as part of the template or class definition, with the keyword int. It would be the same as the above class; now when you instantiate an object of class calc you can choose which datatype the class will handle.

calc <double> a_calc_class;

Templates are handy for making your programs more generic and allowing your code to be reused later.
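The templated stack mentioned at the start of this lesson could be sketched along the same lines. The name simple_stack, the fixed capacity of 16, and the bool return values are all choices made for this example, and the element type is assumed to have a default constructor.

```cpp
#include <cstddef>

// Hypothetical fixed-capacity stack that works for any datatype.
template <class a_type> class simple_stack
{
  public:
    simple_stack() : count(0) { }
    bool push(const a_type& value);   // returns false if the stack is full
    bool pop(a_type& out);            // returns false if the stack is empty
    std::size_t size() const { return count; }
  private:
    static const std::size_t CAPACITY = 16;  // arbitrary size for this sketch
    a_type items[CAPACITY];
    std::size_t count;
};

template <class a_type> bool simple_stack<a_type>::push(const a_type& value)
{
  if (count == CAPACITY)
  {
    return false;
  }
  items[count++] = value;
  return true;
}

template <class a_type> bool simple_stack<a_type>::pop(a_type& out)
{
  if (count == 0)
  {
    return false;
  }
  out = items[--count];
  return true;
}
```

The same class definition then serves for simple_stack<int>, simple_stack<double>, or a stack of any other type that can be copied.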

Lesson 22:Understanding Initialization Lists in C++

Understanding the Start of an Object's Lifetime

In C++, whenever an object of a class is created, its constructor is called. But that's not all--its parent class constructor is called, as are the constructors for all objects that belong to the class. By default, the constructors invoked are the default ("no-argument") constructors. Moreover, all of these constructors are called before the class's own constructor is called.

For instance, take the following code:

#include <iostream>
class Foo
{
        public:
        Foo() { std::cout << "Foo's constructor" << std::endl; }
};
class Bar : public Foo
{
        public:
        Bar() { std::cout << "Bar's constructor" << std::endl; }
};

int main()
{
        // a lovely elephant ;)
        Bar bar;
}

The object bar is constructed in two stages: first, the Foo constructor is invoked and then the Bar constructor is invoked. The output of the above program will be to indicate that Foo's constructor is called first, followed by Bar's constructor.

Why do this? There are a few reasons. First, each class should only need to initialize things that belong to it, not things that belong to other classes. So a child class should hand off the work of constructing the portion of it that belongs to the parent class. Second, the child class may depend on these fields when initializing its own fields; therefore, the parent's constructor needs to be called before the child class's constructor runs. In addition, all of the objects that belong to the class should be initialized so that the constructor can use them if it needs to.
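The same ordering applies to objects that are fields of the class. In the sketch below, the class names and the construction_log vector are made up; the log simply records which constructor body ran when.

```cpp
#include <string>
#include <vector>

// Hypothetical log used only to record the order of constructor calls.
std::vector<std::string> construction_log;

class Engine
{
        public:
        Engine() { construction_log.push_back( "Engine" ); }
};

class Vehicle
{
        public:
        Vehicle() { construction_log.push_back( "Vehicle" ); }
};

class Car : public Vehicle
{
        public:
        Car() { construction_log.push_back( "Car" ); }

        private:
        Engine engine;  // constructed after Vehicle, before Car's body runs
};
```

After a Car is created, the log holds "Vehicle", "Engine", "Car" in that order: parent class first, then fields, then the class's own constructor body.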

But what if you have a parent class that needs to take arguments to its constructor? This is where initialization lists come into play. An initialization list immediately follows the constructor's signature, separated by a colon:

class Foo : public parent_class
{
        Foo() : parent_class( "arg" ) // sample initialization list
        {
                // you must include a body, even if it's merely empty
        }
};

Note that to call a particular parent class constructor, you just need to use the name of the class (it's as though you're making a function call to the constructor).

For instance, in our above example, if Foo's constructor took an integer as an argument, we could do this:

#include <iostream>
class Foo
{
        public:
        Foo( int x ) 
        {
                std::cout << "Foo's constructor " 
                          << "called with " 
                          << x 
                          << std::endl; 
        }
};

class Bar : public Foo
{
        public:
        Bar() : Foo( 10 )  // construct the Foo part of Bar
        { 
                std::cout << "Bar's constructor" << std::endl; 
        }
};

int main()
{
        Bar stool;
}

Using Initialization Lists to Initialize Fields

In addition to letting you pick which constructor of the parent class gets called, the initialization list also lets you specify which constructor gets called for the objects that are fields of the class. For instance, if you have a string inside your class:

class Qux
{
        public:
                Qux() : _foo( "initialize foo to this!" ) { }
        // This is nearly equivalent to 
        // Qux() { _foo = "initialize foo to this!"; }
        // but without the extra call to construct an empty string

        private:
        std::string _foo;
};

Here, the constructor is invoked by giving the name of the object to be constructed rather than the name of the class (as in the case of using initialization lists to call the parent class's constructor).

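If you have multiple fields of a class, then the names of the objects being initialized should appear in the order they are declared in the class (and after any parent class constructor call). The fields are always initialized in declaration order no matter how the list is written, so keeping the two in the same order avoids surprises: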

class Baz
{
        public:
                Baz() : _foo( "initialize foo first" ), _bar( "then bar" ) { }

        private:
        std::string _foo;
        std::string _bar;
};
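Because fields are initialized in the order they are declared, a later field can be computed from an earlier one. The Range class here is a hypothetical example of that.

```cpp
// Hypothetical class: _last is computed from _first, which is safe only
// because _first is declared (and therefore initialized) before _last.
class Range
{
        public:
                Range( int start, int count )
                        : _first( start ), _last( _first + count - 1 ) { }
                int first() const { return _first; }
                int last() const { return _last; }

        private:
                int _first;
                int _last;
};
```

If the two declarations were swapped, _last would be computed before _first had a value, even though the initialization list itself looked the same.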

Initialization Lists and Scope Issues

If you have a field of your class that is the same name as the argument to your constructor, then the initialization list "does the right thing." For instance,

class Baz
{
        public:
                Baz( std::string foo ) : foo( foo ) { }
        private:
            std::string foo;
};

is roughly equivalent to

class Baz
{
        public:
                Baz( std::string foo )
                {
                    this->foo = foo;
                }
        private:
            std::string foo;
};

That is, the compiler knows which foo belongs to the object, and which foo belongs to the function.

Initialization Lists and Primitive Types

It turns out that initialization lists work to initialize both user-defined types (objects of classes) and primitive types (e.g., int). When the field is a primitive type, giving it an argument is equivalent to assignment. For instance,

class Quux
{
        public:
                Quux() : _my_int( 5 )  // sets _my_int to 5
                { }

        private:
                int _my_int;
};

This behavior allows you to specify templates where the templated type can be either a class or a primitive type (otherwise, you would have to have different ways of handling initializing fields of the templated type for the case of classes and primitive types).

template <class T>
class my_template
{
        public:
        // works as long as T has a copy constructor
         my_template( T bar ) : _bar( bar ) { }

        private:
                T _bar;
};

Initialization Lists and Const Fields

Using initialization lists to initialize fields is not always necessary (although it is probably more convenient than other approaches). But it is necessary for const fields. If you have a const field, then it can be initialized only once, so it must be initialized in the initialization list.

class const_field
{
        public:
                const_field() : _constant( 1 ) { }
                // this is an error: const_field() { _constant = 1; } 

        private:
                const int _constant;
};

When Else do you Need Initialization Lists?

No Default Constructor

If you have a field that has no default constructor (or a parent class with no default constructor), you must specify which constructor you wish to use.

References

If you have a field that is a reference, you also must initialize it in the initialization list; since a reference cannot be made to refer to something else after it is created, it can be initialized only once.
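Both cases can be seen in one small sketch. Label and Widget are hypothetical classes: Label has no default constructor, and Widget holds both a Label and a reference.

```cpp
#include <string>

// Hypothetical class with no default constructor.
class Label
{
        public:
        explicit Label( const std::string& text ) : _text( text ) { }
        const std::string& text() const { return _text; }

        private:
        std::string _text;
};

class Widget
{
        public:
        Widget( int& counter ) : _label( "widget" ), _counter( counter )
        {
                ++_counter;  // changes the int the caller passed in
        }
        const std::string& name() const { return _label.text(); }

        private:
        Label _label;   // no default constructor, so it must be in the list
        int& _counter;  // a reference, so it must be bound in the list
};
```

Leaving either _label or _counter out of the initialization list would be a compiler error.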

Initialization Lists and Exceptions

Since constructors can throw exceptions, it's possible that you might want to be able to handle exceptions that are thrown by constructors invoked as part of the initialization list.

First, you should know that even if you catch the exception, it will get rethrown: one of the object's fields (or part of its parent class) couldn't be initialized, so there is no guarantee that your object is in a valid state. That said, one reason you'd want to catch an exception here is that there's some kind of translation of error messages that needs to be done.

The syntax for catching an exception in an initialization list is somewhat awkward: the 'try' goes right before the colon, and the catch goes after the body of the function:

class Foo
{
        public:
        Foo() try : _str( "text of string" )
        {
        }
        catch ( ... )
        {
                std::cerr << "Couldn't create _str";
                // now, the exception is rethrown as if we'd written
                // "throw;" here
        }

        private:
        std::string _str;
};
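As a sketch of the "translation" use, the handler can throw a different exception in place of the automatic rethrow. Connection and Client are made-up classes, and the choice of std::invalid_argument and std::runtime_error is just for illustration.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical field type whose constructor can fail.
class Connection
{
        public:
        explicit Connection( const std::string& host )
        {
                if ( host.empty() )
                {
                        throw std::invalid_argument( "empty host" );
                }
        }
};

class Client
{
        public:
        explicit Client( const std::string& host )
        try : _conn( host )
        {
        }
        catch ( const std::invalid_argument& e )
        {
                // Translate the error; throwing here takes the place of
                // the automatic rethrow of the original exception
                throw std::runtime_error(
                        std::string( "Client could not connect: " ) + e.what() );
        }

        private:
        Connection _conn;
};
```

Code that creates a Client with an empty host now sees a std::runtime_error with the longer message rather than the original std::invalid_argument.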

Initialization Lists: Summary

Before the body of the constructor is run, all of the constructors for its parent class and then for its fields are invoked. By default, the no-argument constructors are invoked. Initialization lists allow you to choose which constructor is called and what arguments that constructor receives.

If you have a reference or a const field, or if one of the classes used does not have a default constructor, you must use an initialization list.

Lesson 21:Class Design in C++

Understanding Interfaces

When you're designing a class in C++, the first thing you should decide is the public interface for the class. The public interface determines how your class will be used by other programmers (or you), and once designed and implemented it should generally stay pretty constant. You may decide to add to the interface, but once you've started using the class, it will be hard to remove functions from the public interface (unless they aren't used and weren't necessary in the first place).

But that doesn't mean that you should include more functionality in your class than necessary just so that you can later decide what to remove from the interface. If you do this, you'll just make the class harder to use. People will ask questions like, "why are there four ways of doing this? Which one is better? How can I choose between them?" It's usually easier to keep things simple and provide one way of doing each thing unless there's a compelling reason why your class should offer multiple methods with the same basic functionality.

At the same time, just because adding methods to the public interface (probably) won't break anything, that doesn't mean that you should start off with a tiny interface and plan to grow it later. If anybody has inherited from your class and you later add a function with the same name as one of theirs, you're in for a boatload of confusion. First, if you don't declare the function virtual, then which version gets called on an object of the subclass will depend on the static type of the pointer or reference used to call it. This can be messy. Moreover, if you do declare it virtual, then the subclass's existing function becomes an override, and it might provide a different type of functionality than you intended for the new function. Finally, you just can't add a pure virtual function to a class that's already in use because nobody who has inherited from it will have implemented that function.
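The non-virtual case is easy to demonstrate. Base, Derived, and describe below are hypothetical names; describe is deliberately not virtual.

```cpp
#include <string>

class Base
{
        public:
        // Not virtual: the call is resolved from the static type
        std::string describe() const { return "Base"; }
};

class Derived : public Base
{
        public:
        // Hides Base::describe rather than overriding it
        std::string describe() const { return "Derived"; }
};
```

Given a Derived object d, calling d.describe() runs the Derived version, but calling describe through a Base pointer to that same object runs the Base version.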

The public interface, then, should remain as constant as possible. In fact, a good approach to designing classes is to write the interface before the implementation because it's what determines how your class interacts with the rest of the world (which is more important for the program as a whole than how the class is actually implemented). Moreover, if you write the interface first, you can get a feel for how the class will work with other classes before you actually dive into the implementation details.

Inheritance and Class Design

The second issue of your class design is what should be available to programmers who wish to create subclasses. This interface is primarily determined by virtual functions, but you can also include protected methods that are designed for use by the class or its subclasses (remember that protected methods are visible to subclasses while private methods are not).

A key consideration is whether it makes sense for a function to be virtual. A function should be virtual when the implementation is likely to differ from subclass to subclass. Conversely, whenever a function should not change, it should be made non-virtual. The key idea is to decide whether to make a function virtual by asking if the function should always behave the same way for every class in the hierarchy.

For example, if you have a class that is designed to allow users to monitor network traffic and you want to allow subclasses that implement different ways of analyzing the traffic, you might use the following interface:

class TrafficWatch
{
  public:
  // Packet is some class that implements information about network
  // packets
  void addPacket (const Packet& network_packet);
  int getAveragePacketSize ();
  int getMaxPacket ();
  virtual bool isOverloaded ();
};

In this class, some methods will not change from implementation to implementation; adding a packet should always be handled the same way, and the average packet size isn't going to change either. On the other hand, someone might have a very different idea of what it means to have an overloaded network. This will change from situation to situation and we don't want to prevent someone from changing how this is computed--for some, anything over 10 Mbits/sec of traffic might be an overloaded network, and for others, it would require 100 Mbits/sec on some specific network cables.
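Here is a minimal sketch of how a subclass might change only the overload test. Everything beyond the interface above is an assumption made for illustration: the Packet struct with a size_in_bytes field, the byte-count thresholds (used in place of real Mbits/sec measurements), the protected total_bytes member, and the StrictTrafficWatch subclass.

```cpp
// Packet is a stand-in; the lesson leaves its details unspecified
struct Packet
{
    int size_in_bytes;
};

class TrafficWatch
{
    public:
        TrafficWatch() : total_bytes(0), packet_count(0), max_packet(0) {}
        virtual ~TrafficWatch() {}

        // Handled the same way for every kind of watcher
        void addPacket(const Packet& network_packet)
        {
            total_bytes += network_packet.size_in_bytes;
            ++packet_count;
            if (network_packet.size_in_bytes > max_packet)
                max_packet = network_packet.size_in_bytes;
        }
        int getAveragePacketSize()
        {
            return packet_count == 0 ? 0 : total_bytes / packet_count;
        }
        int getMaxPacket() { return max_packet; }

        // Default policy (an assumption): more than 1000 bytes seen in total
        virtual bool isOverloaded() { return total_bytes > 1000; }

    protected:
        int total_bytes;    // subclasses need this to apply their own policy

    private:
        int packet_count;
        int max_packet;
};

// A subclass with a stricter idea of what "overloaded" means
class StrictTrafficWatch : public TrafficWatch
{
    public:
        virtual bool isOverloaded() { return total_bytes > 100; }
};
```

After a single 500-byte packet, a plain TrafficWatch is not overloaded under these made-up thresholds, while a StrictTrafficWatch reached through a TrafficWatch pointer reports that it is.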

Finally, when publicly inheriting from any class or designing for inheritance, remember that you should strive for it to be clear that inheritance models is-a. At heart, the is-a relationship means that the subclass should be able to appear anywhere the parent class could appear. From the standpoint of the user of the class, it should not matter whether a class is the parent class or a subclass.

To design an is-a relationship, make sure that every function in the base class makes sense for all of its subclasses, so that the class doesn't include functions that subclasses might not actually need. One example of having an extra function is that of a Bird class that implements a fly function. The problem is that not all birds can fly--penguins and emus, for instance. This suggests that a more prudent design choice might be to have two subclasses of birds, one for birds that can fly and one for flightless birds. Of course, it might be overkill to have two subclasses of bird, depending on how complex your class hierarchy will be. If you know that nobody would ever expect to use your class for a flightless bird, then it's not so bad. Still, you won't always know what someone will use your class for, and it's much easier to think carefully before you start to implement an entire class hierarchy than it will be to go back and change it once people are using it.
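The two-subclass design might be sketched like this. The class names FlyingBird, FlightlessBird, Pigeon and Penguin, and the functions eat, walk and takeOff, are assumptions chosen for the illustration.

```cpp
#include <string>

class Bird
{
    public:
        virtual ~Bird() {}
        // Something every bird can do, so it belongs in the base class
        std::string eat() { return "eating"; }
};

class FlyingBird : public Bird
{
    public:
        std::string fly() { return "flying"; }
};

class FlightlessBird : public Bird
{
    public:
        std::string walk() { return "walking"; }
};

class Pigeon : public FlyingBird {};
class Penguin : public FlightlessBird {};

// Accepts any bird that can fly; passing a Penguin would not compile
std::string takeOff(FlyingBird& bird)
{
    return bird.fly();
}
```

With this layout, asking a Penguin to fly is caught by the compiler instead of being discovered while the program runs.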

Lesson 20: C++ Inheritance - Syntax

Before beginning this lesson, you should have an understanding of the idea of inheritance. If you do not, please read lesson 19. This lesson will consist of an overview of the syntax of inheritance, the use of the keywords public, private, and protected, and then an example program following to demonstrate each.
The syntax to denote one class as inheriting from another is simple. It looks like the following: class Bear : public Animal, in place of simply the keyword class and then the class name. The ": public base_class_name" is the essential syntax of inheritance; its effect is that the derived class will have access to all public and protected variables and functions of the base class. Do not confuse the idea of a derived class having access to the members of a base class with the idea of specific instances of the derived class sharing data. The members (variables and functions) possessed by the derived class are specific to the type of class, not to each individual object of that type. So, two different Bear objects, while having the same member variables and functions, may have different information stored in their variables; furthermore, if there is an object of class Animal, say BigAnimal, that is not of a more specific type inherited from that class, those two bears will not have access to the data within BigAnimal. They will simply possess variables and functions with the same name and of the same type.

A quick example of inheritance:

class Animal
{
  public:
  Animal();
  ~Animal();
  void eat();
  void sleep();
  void drink();

  private:
  int legs;
  int arms;
  int age;
};
//The class Animal contains information and functions
//related to all animals (at least, all animals this lesson uses)
class Cat : public Animal
{
  public:
  int fur_color;
  void purr();
  void fish();
  void markTerritory();
};
//each of the above operations is unique
//to your friendly furry friends
//(or enemies, as the case may be)
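To see what a derived class can and cannot reach, here is a cut-down sketch of the same two classes. The function bodies, the default of four legs, and the getLegs getter are assumptions added so that the example is complete; the listing above only declares the members.

```cpp
#include <string>

class Animal
{
    public:
        Animal() : legs(4) {}                  // default value is an assumption
        std::string eat() { return "eating"; }
        int getLegs() { return legs; }         // getter added for illustration

    private:
        int legs;
};

class Cat : public Animal
{
    public:
        std::string purr() { return "purring"; }
        // int countLegs() { return legs; }    // would not compile: legs is private in Animal
};
```

A Cat object can call eat() and getLegs(), which it inherits from Animal, as well as its own purr(). It cannot touch legs directly, because that member is private to Animal.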

A discussion of the keywords public, private, and protected is useful when discussing inheritance. The three keywords are used to control access to functions and variables stored within a class.

public:

The most open level of data hiding is public. Anything that is public is available to all derived classes of a base class, and the public variables and functions of each object of both the base and derived class are accessible by code outside the class. Functions marked public are generally those the class uses to give information to and take information from the outside world; they are typically the interface with the class. The rest of the class should be hidden from the user using private or protected data (this hidden nature and the highly focused nature of classes are known collectively as encapsulation). The syntax for public is:

public:

Everything following is public until the end of the class or another data hiding keyword is used.

In general, a well-designed class will have no public fields--everything should go through the class's functions. Functions that retrieve variables are known as 'getters' and those that change values are known as 'setters'. Since the public part of the class is intended for use by others, it is often sensible to put the public section at the top of the class.

protected:

Variables and functions marked protected are inherited by derived classes; however, they are hidden from code outside the class and its derived classes. Keep in mind that access is controlled per class, not per object: each object has its own copy of a protected variable, with the same name but not necessarily the same data, and only member functions of the class or its derived classes can read or change it. Protected is a useful level of access control for important aspects of a class that must be passed on to derived classes without being exposed to outside code. The syntax is the same as that of public. Specifically,

protected:

private:

Private is the highest level of data-hiding. Not only are the functions and variables marked private not accessible by code outside the class in which they are declared, but private variables and functions are not inherited (in the sense that the derived class cannot directly access these variables or functions). The level of data protection afforded by protected is generally more flexible than that of the private level. On the other hand, if you do not wish derived classes to access a method, declaring it private is sensible.

private:
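The three access levels can be seen side by side in a small sketch. The Counter and DoubleCounter classes are hypothetical and exist only for this illustration.

```cpp
class Counter
{
    public:
        Counter() : step(1), count(0) {}
        int get() { return count; }            // public: callable by any code
        void increment() { count += step; }

    protected:
        int step;       // usable by Counter and by classes derived from it

    private:
        int count;      // usable only inside Counter itself
};

class DoubleCounter : public Counter
{
    public:
        DoubleCounter() { step = 2; }          // allowed: step is protected
        // void reset() { count = 0; }         // would not compile: count is private
};
```

Code outside both classes can call only get() and increment(). After two calls to increment(), a Counter reports 2 and a DoubleCounter reports 4.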

Lesson 19: Inheritance in C++

The ability to use object-oriented programming is an important feature of C++. Lesson 12: classes in C++ introduced the idea of the class; if you have not read it and do not know the basic details of classes, you should read it before continuing this tutorial.

Inheritance is an important feature of classes; in fact, it is integral to the idea of object-oriented programming. Inheritance allows you to create a hierarchy of classes, with various classes of more specific natures inheriting the general aspects of more generalized classes. In this way, it is possible to structure a program starting with abstract ideas that are then implemented by specific classes. For example, you might have a class Animal from which the classes Dog and Cat inherit the traits that are general to all animals; at the same time, each of those classes will have attributes specific to the animal dog or cat.
Inheritance offers many useful features to programmers. The ability, for example, of a variable of a more general class to function as any of the more specific classes which inherit from it, called polymorphism, is handy. For now, we will concentrate on the basic syntax of inheritance. Polymorphism will be covered in its own tutorial.

Any class can inherit from any other class, but it is not necessarily good practice to use inheritance (put it in the bank rather than go on a vacation). Inheritance should be used when you have a more general class of objects that describes a set of objects. The features of every element of that set (of every object that is also of the more general type) should be reflected in the more general class. This class is called the base class. Base classes usually contain functions that all the classes inheriting from them, known as derived classes, will need. Base classes should also have all the variables that every derived class would otherwise contain.

Let us look at an example of how to structure a program with several classes. Take a program used to simulate the interaction between types of organisms: trees, birds, bears, and other creatures coinhabiting a forest. There would likely be several base classes that would then have derived classes specific to individual animal types. In fact, if you know anything about biology, you might wish to structure your classes to take advantage of the biological classification from Kingdom to species, although it would probably be overly complex. Instead, you might have base classes for the animals and the plants. If you wanted to use more base classes (a class can be both a derived class of one class and a base class of another), you might have classes for flying animals and land animals, and perhaps trees and scrub. Then you would want classes for specific types of animals: pigeons and vultures, bears and lions, and specific types of plants: oak and pine, grass and flower. These are unlikely to live together in the same area, but the idea is essentially there: more specific classes ought to inherit from less specific classes.
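One branch of such a hierarchy might look like the sketch below. The member functions breathe, walk and hibernate, and the helper describeBear, are assumptions made up for the example; the syntax itself is covered in lesson 20.

```cpp
#include <string>

// Most general level: things every animal does
class Animal
{
    public:
        std::string breathe() { return "breathing"; }
};

// Both a derived class (of Animal) and a base class (of Bear)
class LandAnimal : public Animal
{
    public:
        std::string walk() { return "walking"; }
};

// Most specific level
class Bear : public LandAnimal
{
    public:
        std::string hibernate() { return "hibernating"; }
};

// A Bear has everything from all three levels of the hierarchy
std::string describeBear()
{
    Bear bear;
    return bear.breathe() + ", " + bear.walk() + ", " + bear.hibernate();
}
```

Each level adds only what is new at that level, and Bear ends up with the functions of all three classes.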

Classes, of course, share data. A derived class has access to most of the functions and variables of the base class. There are, however, ways to keep a derived class from accessing some attributes of its base class. The keywords public, protected, and private are used to control access to information within a class. It is important to remember that public, protected, and private control information both for specific instances of classes and for classes as general data types. Variables and functions designated public are both inheritable by derived classes and accessible to outside functions and code when they are elements of a specific instance of a class. Protected variables are not accessible by functions and code outside the class, but derived classes inherit these functions and variables as part of their own class. Private variables are neither accessible outside the class nor available to derived classes. Private variables are useful when you have variables that make sense only in the context of the larger idea that the class represents.

Binary Trees in C++: Part 1

The binary tree is a fundamental data structure used in computer science. It is useful for rapidly storing sorted data and rapidly retrieving stored data. A binary tree is composed of nodes (this tutorial also calls them leaves), each of which stores data and also links to up to two child nodes, which can be visualized spatially as below the first node with one placed to the left and one placed to the right. It is the relationship between the linking node, also known as the parent node, and the nodes it links to that makes the binary tree such an efficient data structure. The child on the left has a lesser key value (i.e., the value used to search for a node in the tree), and the child on the right has an equal or greater key value. As a result, the nodes on the farthest left of the tree have the lowest values, whereas the nodes on the right of the tree have the greatest values. More importantly, as each node connects to up to two other nodes, it is the beginning of a new, smaller, binary tree. Due to this nature, it is possible to easily access and insert data in a binary tree using search and insert functions recursively called on successive nodes.
The typical graphical representation of a binary tree is essentially that of an upside down tree. It begins with a root node, which contains the original key value. The root node has two child nodes; each child node might have its own child nodes. Ideally, the tree would be structured so that it is a perfectly balanced tree, with each node having the same number of child nodes to its left and to its right. A perfectly balanced tree allows for the fastest average insertion of data or retrieval of data. The worst case scenario is a tree in which each node only has one child node, so it becomes as if it were a linked list in terms of speed. The typical representation of a binary tree looks like the following:

   
             10
           /    \
          6      14
         / \    /  \
        5   8  11  18

The node storing the 10, represented here merely as 10, is the root node, linking to the left and right child nodes, with the left node storing a lower value than the parent node, and the node on the right storing a greater value than the parent node. Notice that if one removed the root node and its right subtree, the node storing the value 6 would be the root of a new, smaller, binary tree.
The structure of a binary tree makes the insertion and search functions simple to implement using recursion. In fact, the insertion and search functions are very similar. Inserting data into a binary tree involves searching for an unused position in the tree that follows the rules of placing nodes. The insert function is generally a recursive function that continues moving down the levels of a binary tree until it finds such a position. The rules are that a lower value should be to the left of the node, and a greater or equal value should be to the right. Following the rules, an insert function should check each node to see if it is empty; if so, it would insert the data to be stored along with the key value (in most implementations, an empty node will simply be a NULL pointer from a parent node, so the function would also have to create the node). If the node is filled already, the insert function should check whether the key value to be inserted is less than the key value of the current node; if so, the insert function should be recursively called on the left child node, and if the key value to be inserted is greater than or equal to the key value of the current node, it should be recursively called on the right child node.
The search function works in a similar fashion. It should check whether the key value of the current node is the value being searched for. If not, it should check whether the value being searched for is less than the value of the node, in which case it should be recursively called on the left child node; otherwise, it should be recursively called on the right child node. Of course, it is also necessary to check that the left or right child node actually exists before calling the function on the node.
Because a balanced binary tree has about log (base 2) n layers, the average search time for a binary tree is proportional to log (base 2) n. To fill an entire binary tree, sorted, takes roughly n * log (base 2) n steps. Let's take a look at the necessary code for a simple implementation of a binary tree. First, it is necessary to have a struct, or class, defined as a node.

struct node
{
  int key_value;
  node *left;
  node *right;
};

The struct stores the key_value and contains pointers to the two child nodes, which define the node as part of a tree. In fact, the node itself is very similar to the node in a linked list. A basic knowledge of the code for a linked list will be very helpful in understanding the techniques of binary trees. Essentially, pointers are necessary to allow the arbitrary creation of new nodes in the tree.
It is most logical to create a binary tree class to encapsulate the workings of the tree in a single place and to make it reusable. The class will contain functions to insert data into the tree and to search for data. Because the nodes are allocated dynamically through pointers, it will also be necessary to include a function to delete the tree, so that its memory is freed once the tree is no longer needed.

 
class btree
{
    public:
        btree();
        ~btree();

        void insert(int key);
        node *search(int key);
        void destroy_tree();

    private:
        void destroy_tree(node *leaf);
        void insert(int key, node *leaf);
        node *search(int key, node *leaf);
        
        node *root;
};

The insert and search functions that are public members of the class are designed to allow the user of the class to use the class without dealing with the underlying design. The insert and search functions which will be called recursively are the ones which contain two parameters, allowing them to travel down the tree. The destroy_tree function without arguments is a front for the destroy_tree function which will recursively destroy the tree, node by node, from the bottom up.
The code for the class would look similar to the following:

btree::btree()
{
  root=NULL;
}

It is necessary to initialize root to NULL for the later functions to be able to recognize that it does not exist.

btree::~btree()
{
  destroy_tree();
}

The destroy_tree function will set off the recursive function destroy_tree shown below which will actually delete all nodes of the tree.

void btree::destroy_tree(node *leaf)
{
  if(leaf!=NULL)
  {
    destroy_tree(leaf->left);
    destroy_tree(leaf->right);
    delete leaf;
  }
}

The function destroy_tree goes to the bottom of each part of the tree, that is, searching while there is a non-null node, deletes that leaf, and then it works its way back up. The function deletes the leftmost node, then the right child node from the leftmost node's parent node, then it deletes the parent node, then works its way back to deleting the other child node of the parent of the node it just deleted, and it continues this deletion working its way up to the node of the tree upon which destroy_tree was originally called. In the example tree above, the order of deletion of nodes would be 5 8 6 11 18 14 10. Note that it is necessary to delete all the child nodes to avoid wasting memory.
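If you would like to check that order yourself, the following sketch rebuilds the example tree by hand and records each key just before its node is deleted. The helper names make_node, destroy_and_record and deletion_order_of_example_tree are inventions for this illustration and are not part of the btree class.

```cpp
#include <cstddef>
#include <vector>

struct node
{
    int key_value;
    node *left;
    node *right;
};

// Helper (not in the lesson's class) that allocates a node with no children
node *make_node(int key)
{
    node *n = new node;
    n->key_value = key;
    n->left = NULL;
    n->right = NULL;
    return n;
}

// Same visiting order as destroy_tree, but records each key before deleting
void destroy_and_record(node *leaf, std::vector<int>& order)
{
    if(leaf != NULL)
    {
        destroy_and_record(leaf->left, order);
        destroy_and_record(leaf->right, order);
        order.push_back(leaf->key_value);
        delete leaf;
    }
}

// Builds the example tree by hand and returns the order of deletion
std::vector<int> deletion_order_of_example_tree()
{
    node *root = make_node(10);
    root->left = make_node(6);
    root->right = make_node(14);
    root->left->left = make_node(5);
    root->left->right = make_node(8);
    root->right->left = make_node(11);
    root->right->right = make_node(18);

    std::vector<int> order;
    destroy_and_record(root, order);
    return order;   // 5 8 6 11 18 14 10
}
```

Both children are always handled before their parent, which is why the root, 10, is the last node to go.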

void btree::insert(int key, node *leaf)
{
  if(key< leaf->key_value)
  {
    if(leaf->left!=NULL)
      insert(key, leaf->left);
    else
    {
      leaf->left=new node;
      leaf->left->key_value=key;
      leaf->left->left=NULL;//Sets the left child of the child node to null
      leaf->left->right=NULL;//Sets the right child of the child node to null
    }  
  }
  else if(key>=leaf->key_value)
  {
    if(leaf->right!=NULL)
      insert(key, leaf->right);
    else
    {
      leaf->right=new node;
      leaf->right->key_value=key;
      leaf->right->left=NULL;  //Sets the left child of the child node to null
      leaf->right->right=NULL; //Sets the right child of the child node to null
    }
  }
}

The case where the root node is still NULL will be taken care of by the insert function that is nonrecursive and available to non-members of the class. The insert function searches, moving down the tree of children nodes, following the prescribed rules, left for a lower value to be inserted and right for a greater value, until it finds an empty node which it creates using the 'new' keyword and initializes with the key value while setting the new node's child node pointers to NULL. After creating the new node, the insert function will no longer call itself.
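Because insert never rearranges existing nodes, the shape of the tree depends entirely on the order in which keys arrive. The sketch below measures this. It uses a slightly different style of insert, one that returns the subtree's root so that no separate root case is needed; that variation, along with the helpers height, destroy and height_after_inserting, is an assumption made for this example and not the class's own code.

```cpp
#include <cstddef>

struct node
{
    int key_value;
    node *left;
    node *right;
};

// Same left/right rule as above, but returns the (possibly new) subtree root
node *insert(node *leaf, int key)
{
    if(leaf == NULL)
    {
        leaf = new node;
        leaf->key_value = key;
        leaf->left = NULL;
        leaf->right = NULL;
    }
    else if(key < leaf->key_value)
        leaf->left = insert(leaf->left, key);
    else
        leaf->right = insert(leaf->right, key);
    return leaf;
}

// Number of layers in the tree; an empty tree has height 0
int height(node *leaf)
{
    if(leaf == NULL)
        return 0;
    int l = height(leaf->left);
    int r = height(leaf->right);
    return 1 + (l > r ? l : r);
}

void destroy(node *leaf)
{
    if(leaf != NULL)
    {
        destroy(leaf->left);
        destroy(leaf->right);
        delete leaf;
    }
}

// Inserts the keys in the given order and returns the resulting height
int height_after_inserting(const int *keys, int count)
{
    node *root = NULL;
    for(int i = 0; i < count; ++i)
        root = insert(root, keys[i]);
    int h = height(root);
    destroy(root);
    return h;
}
```

Inserting 10, 6, 14, 5, 8, 11, 18 in that order reproduces the balanced example tree with 3 layers. Inserting the same seven keys in sorted order gives 7 layers, the linked-list worst case mentioned earlier.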

node *btree::search(int key, node *leaf)
{
  if(leaf!=NULL)
  {
    if(key==leaf->key_value)
      return leaf;
    if(key<leaf->key_value)
      return search(key, leaf->left);
    else
      return search(key, leaf->right);
  }
  else return NULL;
}

The search function shown above recursively moves down the tree until it either reaches a node with a key value equal to the value for which the function is searching or reaches a NULL pointer, meaning that the value being searched for is not stored in the binary tree. Each call returns its result (a pointer to the matching node, or NULL) to the instance of the function which called it, handing the pointer back up to the search function accessible outside the class.

void btree::insert(int key)
{
  if(root!=NULL)
    insert(key, root);
  else
  {
    root=new node;
    root->key_value=key;
    root->left=NULL;
    root->right=NULL;
  }
}

The public version of the insert function takes care of the case where the root has not been initialized by allocating the memory for it and setting both child nodes to NULL and setting the key_value to the value to be inserted. If the root node already exists, insert is called with the root node as the initial node of the function, and the recursive insert function takes over.

node *btree::search(int key)
{
  return search(key, root);
}

The public version of the search function is used to set off the search recursion at the root node, so that the user never needs access to the root node.

void btree::destroy_tree()
{
  destroy_tree(root);
  root=NULL; //Reset root so the emptied tree is never deleted twice
}

The public version of the destroy tree function is merely used to initialize the recursive destroy_tree function which then deletes all the nodes of the tree.
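Putting the pieces together, a condensed version of the whole class might look like the sketch below. It follows the code above with a few changes that are assumptions of this sketch rather than part of the lesson's listing: a new_leaf helper to shorten insert, root reset to NULL after the tree is destroyed, copying disabled so that two btree objects can never delete the same nodes, and a small contains function for convenience.

```cpp
#include <cstddef>

struct node
{
    int key_value;
    node *left;
    node *right;
};

class btree
{
    public:
        btree() : root(NULL) {}
        ~btree() { destroy_tree(); }

        void insert(int key);
        node *search(int key) { return search(key, root); }
        // Resetting root makes a second call (or the destructor) harmless
        void destroy_tree() { destroy_tree(root); root = NULL; }

    private:
        btree(const btree&);              // copying disabled (an addition)
        btree& operator=(const btree&);

        void destroy_tree(node *leaf);
        void insert(int key, node *leaf);
        node *search(int key, node *leaf);

        node *root;
};

// Helper (not in the lesson's listing) that allocates a childless node
static node *new_leaf(int key)
{
    node *leaf = new node;
    leaf->key_value = key;
    leaf->left = NULL;
    leaf->right = NULL;
    return leaf;
}

void btree::destroy_tree(node *leaf)
{
    if(leaf != NULL)
    {
        destroy_tree(leaf->left);
        destroy_tree(leaf->right);
        delete leaf;
    }
}

void btree::insert(int key)
{
    if(root != NULL)
        insert(key, root);
    else
        root = new_leaf(key);
}

void btree::insert(int key, node *leaf)
{
    if(key < leaf->key_value)
    {
        if(leaf->left != NULL)
            insert(key, leaf->left);
        else
            leaf->left = new_leaf(key);
    }
    else
    {
        if(leaf->right != NULL)
            insert(key, leaf->right);
        else
            leaf->right = new_leaf(key);
    }
}

node *btree::search(int key, node *leaf)
{
    if(leaf == NULL)
        return NULL;
    if(key == leaf->key_value)
        return leaf;
    if(key < leaf->key_value)
        return search(key, leaf->left);
    return search(key, leaf->right);
}

// Convenience function (an addition): true if key is stored in the tree
bool contains(btree& tree, int key)
{
    return tree.search(key) != NULL;
}
```

To try it out, create a btree, insert the keys 10, 6, 14, 5, 8, 11 and 18, and call contains with 8 (which is stored) and 7 (which is not). The tree frees its nodes automatically when it goes out of scope.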