Arrays of c-strings

There are many use cases in which you deal with a set of strings.  An array of strings is a logical, first-pass choice for storing them.  Because c-strings are fundamentally just arrays of char, they are a great topic for understanding the array/pointer relationship more fully.

Consider the following array declaration:

char data[5][10] = {"red", "yellow", "green", "blue", "violet"};

Because data is the name of a 2D array, in most expressions it is converted to a pointer to its first row.  Note that this is not a pointer to a pointer; more precisely, it is a pointer to an array of 10 chars (its type is char (*)[10]).  Because each row of this 2D array of chars is a null-terminated c-string, we can also conceptualize data as an array of c-strings.  So, if I have an array of c-strings, then I can display just one c-string from the array by sending the address of the first character of that row to cout like this:

cout << data[1] << endl; //would print yellow

Remember, if you send an ostream object a char pointer, it will start printing at that character and continue until it reaches a null terminator.  This means that if we send it the address of the first letter of a row plus an additional offset (added with +, not written as a second subscript), it will start printing at the offset letter and continue to the null terminator as well.  Here is an example:

cout << data[2] + 2 << endl; //would print een of green

The expression data[2]+2 is the memory address of the 3rd character in the 3rd string of the array.  Therefore, it will start printing at that character and print until the end of the string.

You may be asking yourself, “What’s the difference between the example above and data[2][2]?”  Great question!  The difference is that in data[2][2], the second number is a subscript, which means that the offset is added and the resulting pointer is then dereferenced.  So, the data type of the expression data[2]+2 is char* while the data type of the expression data[2][2] is char.  If we send cout a char, it will only print that one character.

Pointer Offset Notation and Subscript Notation

As we know,

data[2]

is equivalent to

*(data + 2)

The first is subscript notation; the second is pointer-offset notation.  We can nest multiple levels of offsets depending on the data type of the base construct (the data type of data, in this case).  So, this means that the expression

data[2][2]

is equivalent to the expression

*(*(data + 2)+2)

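If you start at the deepest level of nesting and work your way out, you’ll see that they are equivalent: data + 2 points to the third row, *(data + 2) is that row (which converts to a pointer to its first char), adding 2 moves to the third char of the row, and the outer * dereferences that pointer to give the char itself.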

As an example, the following two statements produce the same output:

cout << (data[1] + 3) << endl; //would print low of yellow
cout << (*(data+1)+3) << endl; //would also print low

The relationship between pointer-offset notation and subscript notation can seem a little confusing at first.  But if you take it slowly, one level of nesting at a time, it all makes logical sense.

Peace, love, c++!
