ZeePedia

String Handling, String Manipulation Functions, Character Handling Functions, String Conversion Functions

CS201 - Introduction to Programming
Lecture Handout
Introduction to Programming
Lecture No. 17
Reading Material
Deitel & Deitel - C++ How to Program
Chapter 5
5.29, 5.30, 5.31, 5.32, 5.33, 5.34
Chapter 16
16.16 - 16.33 (Pages 869 - 884)
Summary
· String Handling
· String Manipulation Functions
· Character Handling Functions
· Sample Program
· String Conversion Functions
· String Functions
· Search Functions
· Examples
· Exercises
String Handling
We have briefly talked about strings in some of the previous lectures. In this lecture,
you will see how strings are handled. Before discussing the subject itself, it is useful
to know how text was processed before string-handling facilities evolved.
When the C language and the UNIX operating system were being developed at Bell
Laboratories, the scientists there wanted to publish their articles. They needed an easy
mechanism by which articles could be formatted and published. This was a time when PCs
and word processors did not exist. That may seem strange to people who can make
characters bold or large, or format a paragraph, with a word processor today. Those
scientists had no such facility. The task of writing an article and turning it into
publishable material was mainly done with typewriters. The computer experts therefore
decided to develop a program that would make the editing of text easier. The process of
editing text with such a program was called text processing. In-line commands were
written as part of the text and were processed on output. Later, programs evolved in
which a command was inserted for functions such as making a character bold. The effect
of the command could be previewed and then modified if needed.
Returning to the topic of strings, we will now discuss in detail the built-in functions
for handling strings.
String Manipulation Functions
The C language provides many functions to manipulate strings. To understand these
functions, let's first consider the building block (or unit) of a string, i.e. a
character. Characters are represented inside the computer as numbers. There is a code
number for each character used by a computer. Most computers use the ASCII (American
Standard Code for Information Interchange) code to store a character. This code is used
in the computer's memory for manipulation, and it is shown as a character on output. We
can write a program to see the ASCII values.
We have the data type char to store a character. A character is anything we can type
with a keyboard: white space, comma, full stop and colon are all characters. 0, 1 and 2
are also characters. Although they are treated differently as numbers, they are typed as
characters. Another data type is int, which stores whole numbers. As characters are
stored inside the computer as numbers, they can be manipulated in the same form. A
character is stored in memory in one byte, i.e. 8 bits. This means that 2^8 (256)
different values can be stored. Standard ASCII itself defines only the codes 0 to 127;
what is displayed for the codes 128 to 255 depends on the system. We want to find out
which number is stored when we press a key on the keyboard. In other words, we will see
which character is displayed for a given number in memory.
The program that displays the characters and their corresponding integer values (ASCII
codes) is given below.
In the program, the statement c = i ; has an integer value on the right-hand side (as i
is an int), while c holds its character representation. We display the values of i and
c, which shows us the characters and their integer values.
//This program displays the ASCII code table
#include <iostream>
using namespace std;
int main()
{
    int i;
    char c;
    for (i = 0; i < 256; i++)
    {
        c = i;   // store the integer code in a char
        cout << i << "\t" << c << "\n";
    }
    return 0;
}
In the output of this program, we will see the integer numbers and their character
representations. For example, consider the white-space character (which we use between
two words). It has no visible shape and simply leaves a gap. From the ASCII table, we
can see that the values of a-z and of A-Z are contiguous. We can get the value of a
letter by adding 1 to the value of the letter before it. So all we need to remember as a
baseline is the value of 'a' and 'A'.
Character Handling Functions
The C language provides many functions to perform useful tests and manipulations of
character data. These functions are declared in the header file ctype.h. Programs that
test or manipulate character data must include this header file to avoid a compiler
error. Each function in ctype.h receives a character (as an int) or EOF (end of file, a
special value that is not a character) as an argument. ctype.h has many functions, and
they have self-explanatory names.
Of these, int isdigit (int c) takes a single character as its argument and returns true
or false (true here means a non-zero int and false means zero). The function is like a
question: is this character a digit? If the argument is a numeric character (digit),
the function returns true, otherwise false. This is a useful function for testing
input. To check for a letter of the alphabet, the function isalpha can be used. isalpha
returns true for the letters a-z, both small and capital, and false for anything else.
The function isalnum (is alphanumeric) returns true if its argument is a digit or a
letter, and false otherwise. The functions of ctype.h are shown in the following table
with their descriptions.
Prototype                Description
int isdigit( int c )     Returns true if c is a digit and false otherwise.
int isalpha( int c )     Returns true if c is a letter and false otherwise.
int isalnum( int c )     Returns true if c is a digit or a letter and false otherwise.
int isxdigit( int c )    Returns true if c is a hexadecimal digit character and false
                         otherwise.
int islower( int c )     Returns true if c is a lowercase letter and false otherwise.
int isupper( int c )     Returns true if c is an uppercase letter and false otherwise.
int tolower( int c )     If c is an uppercase letter, tolower returns c as a lowercase
                         letter. Otherwise, tolower returns the argument unchanged.
int toupper( int c )     If c is a lowercase letter, toupper returns c as an uppercase
                         letter. Otherwise, toupper returns the argument unchanged.
int isspace( int c )     Returns true if c is a white-space character, i.e. newline ('\n'),
                         space (' '), form feed ('\f'), carriage return ('\r'), horizontal
                         tab ('\t') or vertical tab ('\v'), and false otherwise.
int iscntrl( int c )     Returns true if c is a control character and false otherwise.
int ispunct( int c )     Returns true if c is a printing character other than a space, a
                         digit or a letter, and false otherwise.
int isprint( int c )     Returns true if c is a printing character including space (' ')
                         and false otherwise.
int isgraph( int c )     Returns true if c is a printing character other than space (' ')
                         and false otherwise.
The functions tolower and toupper are conversion functions. The tolower function
converts its uppercase letter argument into a lowercase letter. If its argument is other than
uppercase letter, it returns the argument unchanged. Similarly the toupper function
converts its lowercase letter argument into uppercase letter. If its argument is other than
lowercase letter, it returns the argument without effecting any change.
Sample Program
Let's consider the following example to further demonstrate the use of the functions of
ctype.h. Suppose we write a program which prompts the user to enter a string. The string
entered is then checked to count the different types of characters (digits, uppercase
and lowercase letters, white space etc). We keep a counter for each category of
character entered. When the user ends the input, the number of characters in each
category is displayed. In this example we use the function getchar(), instead of cin,
to get the input. This function is declared in the header file stdio.h. It reads a
single character from the input buffer or keyboard. It can also read the new line
character '\n' (the ENTER key), so we run the input loop until the user presses the
ENTER key. As soon as getchar() reads the new line character '\n', the loop is
terminated. We know that every expression in C has a value. When we use an assignment
(as in c = getchar() in our program), the value assigned to the left-hand side variable
is also the value of the whole expression. Thus, the expression (c = getchar()) yields
the value that is assigned to c. This value is then compared with the new line
character '\n'. If it is not equal, then inside the loop we apply the tests on c to
check whether it is an uppercase letter, a lowercase letter, a digit etc. In this
program, the whole string entered by the user is processed character by character.
Following is the code of this program.
// Example: analysis of text using <ctype.h> library
#include <iostream>
#include <stdio.h>
#include <ctype.h>
using namespace std;
int main()
{
    int c;   // int, because getchar() returns an int so that it can also report EOF
    int lc = 0, uc = 0, dig = 0, ws = 0, pun = 0, oth = 0;
    cout << "Please enter a character string and then press ENTER: ";
    // Analyse text as it is input; stop at ENTER or at end of input
    while ((c = getchar()) != '\n' && c != EOF)
    {
        if (islower(c))
            lc++;
        else if (isupper(c))
            uc++;
        else if (isdigit(c))
            dig++;
        else if (isspace(c))
            ws++;
        else if (ispunct(c))
            pun++;
        else
            oth++;
    }
    // display the counts of different types of characters
    cout << "You typed:" << endl;
    cout << "lower case letters = " << lc << endl;
    cout << "upper case letters = " << uc << endl;
    cout << "digits = " << dig << endl;
    cout << "white space = " << ws << endl;
    cout << "punctuation = " << pun << endl;
    cout << "others = " << oth << endl;
    return 0;
}
A sample output of the program is given below.
Please enter a character string and then press ENTER: Sixty Five = 65.00
You typed:
lower case letters = 7
upper case letters = 2
digits = 4
white space = 3
punctuation = 2
others = 0
String Conversion Functions
The header file stdlib.h declares functions used for different conversions. When the
input we receive is of a type other than the type of the variable in which the value is
to be stored, we need to convert it from one type into the other. These conversion
functions take an argument of one type and return it after converting it into another
type. The functions and their descriptions are given in the table below.
Prototype
    Description

double atof( const char *nPtr )
    Converts the string nPtr to double.
int atoi( const char *nPtr )
    Converts the string nPtr to int.
long atol( const char *nPtr )
    Converts the string nPtr to long int.
double strtod( const char *nPtr, char **endPtr )
    Converts the string nPtr to double.
long strtol( const char *nPtr, char **endPtr, int base )
    Converts the string nPtr to long.
unsigned long strtoul( const char *nPtr, char **endPtr, int base )
    Converts the string nPtr to unsigned long.
Use of these functions:
While writing main() in a program, we can put parameters inside its parentheses:
int argc, char **argv. Here argc is the count of the arguments passed to the program,
including the name of the program itself, while argv is a vector (an array) of strings.
These parameters are used to receive command-line arguments. The arguments on the
command line are always character strings. Numbers on the command line (for example
12.8 or 45) are also stored as strings. To use them as numbers in the program, we need
these conversion functions.
Following is a simple program which demonstrates the use of the atoi function. This
program prompts the user to enter an integer between 10 and 100, and checks whether a
valid integer is entered.
//This program demonstrates the use of atoi function
#include <iostream>
#include <iomanip>
#include <stdlib.h>
using namespace std;
int main()
{
    int anInteger;
    char myInt[20];
    cout << "Enter an integer between 10-100 : ";
    cin >> setw(20) >> myInt;   // read at most 19 characters into the array
    if (atoi(myInt) == 0)
        cout << "\nError : Not a valid input"; // could be non numeric
    else
    {
        anInteger = atoi(myInt);
        if (anInteger < 10 || anInteger > 100)
            cout << "\nError : only integers between 10-100 are allowed!";
        else
            cout << "\n OK, you have entered " << anInteger;
    }
    return 0;
}
The output of the program is as follows.
Enter an integer between 10-100 : 45.5
OK, you have entered 45
String Functions
We are familiar with a program that guesses a number stored in the computer. Similarly,
to find a name (which is a character array) among many names in memory, we can compare
two strings by comparing each character of the first string with the corresponding
character of the second string. Before doing this, we can check the lengths of the two
strings. The C library provides functions to compare strings, copy a string and perform
other string manipulations.
The following table shows the string manipulation functions and their descriptions. All
these functions are declared in the header file string.h of the C library.
Function prototype
    Function description

char *strcpy( char *s1, const char *s2 )
    Copies string s2 into character array s1. The value of s1 is returned.
char *strncpy( char *s1, const char *s2, size_t n )
    Copies at most n characters of string s2 into array s1. The value of s1 is
    returned.
char *strcat( char *s1, const char *s2 )
    Appends string s2 to array s1. The first character of s2 overwrites the
    terminating null character of s1. The value of s1 is returned.
char *strncat( char *s1, const char *s2, size_t n )
    Appends at most n characters of string s2 to array s1. The first character of s2
    overwrites the terminating null character of s1. The value of s1 is returned.
int strcmp( const char *s1, const char *s2 )
    Compares string s1 to s2. Returns a negative number if s1 < s2, zero if s1 == s2
    or a positive number if s1 > s2.
int strncmp( const char *s1, const char *s2, size_t n )
    Compares up to n characters of string s1 to s2. Returns a negative number if
    s1 < s2, zero if s1 == s2 or a positive number if s1 > s2.
size_t strlen( const char *s )
    Determines the length of string s. The number of characters preceding the
    terminating null character is returned.
Let's look at the string copy function, strcpy. The prototype of this function is
char *strcpy( char *s1, const char *s2 )
Here the first argument is a pointer to a character array or string s1, whereas the
second argument is a pointer to a string s2. The string s2 is copied into s1 and a
pointer to the resultant string is returned. The string s2 remains the same. We can
describe s1 as the destination string and s2 as the source string. As the source
remains the same during the execution of strcpy and other string functions, the const
keyword is used in the declaration of the source string. The const keyword prevents the
function from changing the source string (i.e. s2). If we want to copy a number of
characters of a string instead of the entire string, the function strncpy is employed.
The function strncpy takes a pointer to the destination string (s1), a pointer to the
source string (s2) and a third argument, size_t n. Here n is the number of characters
which we want to copy from s2 into s1. The array s1 must be large enough to hold these
n characters. Note that if s2 has n or more characters, strncpy does not write a
terminating null character into s1, so the program must add it.
The next function is strcat (string concatenation). This function concatenates (joins)
two strings. For example, one string holds the first name of a student and another
string holds the last name. We can concatenate these two strings to get a string which
holds the first and the last name of the student. For this purpose, we use the strcat
function. The prototype of this function is char *strcat( char *s1, const char *s2 ).
This function writes the string s2 (source) at the end of the string s1 (destination).
The characters of s1 are not overwritten, except for its terminating null character,
which is replaced by the first character of s2. The array s1 must be large enough to
hold both strings. We can concatenate a number of characters of s2 to s1 by using the
function strncat. Here we give the function three arguments: a character pointer to s1,
a character pointer to s2 and the number of characters to be concatenated. The
prototype of this function is written as
char *strncat( char *s1, const char *s2, size_t n )
Examples
Let's consider some simple examples to demonstrate the use of the strcpy, strncpy,
strcat and strncat functions. We begin with the functions strcpy and strncpy.
Example 1
//Program to display the operation of the strcpy() and strncpy()
#include <iostream>
#include <string.h>
using namespace std;
int main()
{
    char string1[15] = "String1";
    char string2[15] = "String2";
    char string3[15] = "";   // destination for strncpy; every element starts as '\0'
    cout << "Before the copy :" << endl;
    cout << "String 1:\t" << string1 << endl;
    cout << "String 2:\t" << string2 << endl;
    //copy the whole string
    strcpy(string2, string1); //copy string1 into string2
    cout << "After the copy :" << endl;
    cout << "String 1:\t" << string1 << endl;
    cout << "String 2:\t" << string2 << endl;
    //copy three characters of the string1 into string3
    strncpy(string3, string1, 3);
    string3[3] = '\0';   // strncpy does not add the terminating null character here
    cout << "strncpy (string3, string1, 3) = " << string3 << endl;
    return 0;
}
Following is the output of the program.
Before the copy :
String 1:       String1
String 2:       String2
After the copy :
String 1:       String1
String 2:       String1
strncpy (string3, string1, 3) = Str
Example 2 (strcat and strncat)
The following example demonstrates the use of the functions strcat and strncat.
//Program to display the operation of the strcat() and strncat()
#include <iostream>
#include <string.h>
using namespace std;
int main()
{
    char s1[ 40 ] = "Welcome to ";   // large enough to hold s1 followed by s2
    char s2[] = "Virtual University ";
    char s3[ 40 ] = "";
    cout << "s1 = " << s1 << endl << "s2 = " << s2 << endl << "s3 = " << s3 << endl;
    cout << "strcat( s1, s2 ) = " << strcat( s1, s2 ) << endl;
    // append only the first 7 characters of s1 to the empty string s3
    cout << "strncat( s3, s1, 7 ) = " << strncat( s3, s1, 7 ) << endl;
    return 0;
}
The output of the program is given below.
s1 = Welcome to
s2 = Virtual University
s3 =
strcat( s1, s2 ) = Welcome to Virtual University
strncat( s3, s1, 7 ) = Welcome
Now we come to the function strcmp. This function compares two strings and returns an
integer value depending upon the result of the comparison. The prototype of this
function is
int strcmp( const char *s1, const char *s2)
This function returns a number less than zero (a negative number) if s1 is less than
s2. It returns zero if s1 and s2 are identical, and a positive number (greater than
zero) if s1 is greater than s2. The comparison is made character by character, using
the character codes. The space character and the difference between lower and upper
case letters are also taken into account while comparing two strings. So the strings
"Hello", "hello" and "He llo" are three different strings; no two of them are identical.
Similarly, there is a function strncmp, which can be used to compare a number of
characters of two strings. The prototype of this function is
int strncmp( const char *s1, const char *s2, size_t n )
Here s1 and s2 are two strings and n is the number of characters up to which s1 and s2
are compared. Its return type is also int. It returns a negative number if the first n
characters of s1 are less than the first n characters of s2. It returns zero if the
first n characters of s1 and s2 are identical, and a positive number if the first n
characters of s1 are greater than the first n characters of s2.
Now we will talk about the function strlen (string length), which is used to determine
the length of a character string. This function returns the length of the string passed
to it. The prototype of this function is given below.
size_t strlen ( const char *s)
This function determines the length of string s. The number of characters preceding the
terminating null character is returned. The return type size_t is an unsigned integer
type.
Search Functions
C provides another set of functions relating to strings, called search functions. With
the help of these functions, we can carry out different types of search in a string.
For example, we can find the position at which a specific character occurs. We can
search for a character starting from any position in the string. We can find the part
of a string that precedes or follows a specific position. We can find a string inside
another string. These functions are also declared in string.h and are given in the
following table.
Function prototype
    Function description

char *strchr( const char *s, int c );
    Locates the first occurrence of character c in string s. If c is found, a pointer
    to c in s is returned. Otherwise, a NULL pointer is returned.
size_t strcspn( const char *s1, const char *s2 );
    Determines and returns the length of the initial segment of string s1 consisting
    of characters not contained in string s2.
size_t strspn( const char *s1, const char *s2 );
    Determines and returns the length of the initial segment of string s1 consisting
    only of characters contained in string s2.
char *strpbrk( const char *s1, const char *s2 );
    Locates the first occurrence in string s1 of any character in string s2. If a
    character from string s2 is found, a pointer to the character in string s1 is
    returned. Otherwise, a NULL pointer is returned.
char *strrchr( const char *s, int c );
    Locates the last occurrence of c in string s. If c is found, a pointer to c in
    string s is returned. Otherwise, a NULL pointer is returned.
char *strstr( const char *s1, const char *s2 );
    Locates the first occurrence in string s1 of string s2. If the string is found, a
    pointer to the string in s1 is returned. Otherwise, a NULL pointer is returned.
char *strtok( char *s1, const char *s2 );
    A sequence of calls to strtok breaks string s1 into "tokens" (logical pieces such
    as words in a line of text) separated by characters contained in string s2. The
    first call contains s1 as the first argument, and subsequent calls to continue
    tokenizing the same string contain NULL as the first argument. A pointer to the
    current token is returned by each call. If there are no more tokens when the
    function is called, NULL is returned.
Example 3
Here is an example, which shows the use of different string manipulation functions.
The code of the program is given below.
//A program which shows string manipulation using <string.h> library
#include <iostream>
#include <string.h>
using namespace std;
int main()
{
    char s1[] = "Welcome to " ;
    char s2[] = "Virtual University" ;
    char s3[] = "Welcome to Karachi" ;
    char city[] = "Karachi";
    char province[] = "Sind";
    char s[80];
    cout << "s1 = " << s1 << endl << "s2 = " << s2 << endl ;
    cout << "s3 = " << s3 << endl ;
    // function for string length
    cout << "The length of s1 = " << strlen(s1) << endl ;
    cout << "The length of s2 = " << strlen(s2) << endl ;
    cout << "The length of s3 = " << strlen(s3) << endl ;
    strcpy(s, "Hyderabad"); // string copy
    cout << "The nearest city to " << city << " is " << s << endl ;
    strcat(s, " and "); // string concatenation
    strcat(s, city);
    strcat(s, " are in ");
    strcat(s, province);
    strcat(s, ".\n");
    cout << s;
    if (!(strcmp (s1,s2))) // ! is used as zero is returned if s1 & s2 are equal
        cout << "s1 and s2 are identical" << endl ;
    else
        cout << "s1 and s2 are not identical" << endl ;
    if (!(strncmp (s1,s3,7))) // ! is used as zero is returned for equality
        cout << "First 7 characters of s1 and s3 are identical" << endl ;
    else
        cout << "First 7 characters of s1 and s3 are not identical" << endl ;
    return 0;
}
Following is the output of the program.
s1 = Welcome to
s2 = Virtual University
s3 = Welcome to Karachi
The length of s1 = 11
The length of s2 = 18
The length of s3 = 18
The nearest city to Karachi is Hyderabad
Hyderabad and Karachi are in Sind.
s1 and s2 are not identical
First 7 characters of s1 and s3 are identical
Exercises
1: Write a program that displays the ASCII code set in tabular form on the screen.
2: Write your own functions for different manipulations of strings.
3: Write a program, which uses different search functions.