Previously, we covered a simple introduction to the C language. Today, we are going to delve into characters by working through a family of related programs for processing character data. You will find that many programs are just expanded versions of the prototypes discussed here:
- File Copying
- Character Counting
- Line Counting
- Word Counting
Text input and output are dealt with as streams of characters. A text stream is a sequence of characters divided into lines; each line consists of zero or more characters followed by a newline character.
The standard library in C provides several functions for reading and writing one character at a time, of which getchar and putchar are the simplest.
Each time it is called,
- getchar reads the next input character from a text stream and returns that as its value. After the assignment c = getchar(), the variable c contains the next character of input. The characters usually come from the keyboard.
- putchar prints a character. The call putchar(c) prints the contents of the integer variable c as a character, usually on the screen.
1. File Copying
We can write a surprising amount of useful code given just getchar and putchar. Consider the following file copying program.
The relational operator != means “not equal to”.
Why is c declared as int type?
What appears to be a character on the keyboard or screen is stored internally just as a bit pattern (for example, its ASCII code). The type char is specifically meant for storing character data, but any integer type can be used. We used int for a subtle but important reason.
The problem is distinguishing the end of input from valid data. The solution is that getchar returns a distinctive value when there is no more input, a value that cannot be confused with any real character. This value is called EOF (“end of file”). We must declare c to be of a type big enough to hold any value that getchar returns. We cannot use char, since c must be able to hold EOF in addition to every possible char. Hence, we use int.
EOF is an integer symbolic constant defined in the <stdio.h> header. By using the symbolic constant, we are assured that nothing in the program depends on its specific numeric value.
Concise version of the program
An experienced C programmer would write the copy program more concisely:
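In C, any assignment, such as c = getchar(), is an expression and has a value, which is the value of the left-hand side after the assignment. This means that an assignment can appear as part of a larger expression, so the assignment of a character to c can be placed inside the test part of a while loop. The while gets a character, assigns it to c, and then tests whether the character was the end-of-file signal. If it was not, the body of the while is executed, printing the character. The while then repeats.
This version is shorter and centralizes the input: there is now only one reference to getchar. It is more compact and, once the idiom is familiar, easier to read. You’ll see this style often.
The parentheses around the assignment within the condition are necessary. The precedence of != is higher than that of =, which means that in the absence of parentheses the relational test != would be done before the assignment. So c = getchar() != EOF is equivalent to c = (getchar() != EOF), which sets c to 0 or 1 depending on whether the call of getchar encountered end of file.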
2. Character Counting
This next program counts the number of characters in its input:
The new operator ++ means increment by one. You could instead write nc = nc + 1, but ++nc is more concise and often more efficient. (There is a corresponding operator -- which means decrement by 1.)
Note that the operators ++ and -- can be used either as prefix operators (++nc) or as postfix operators (nc++).
The conversion specification %ld tells printf that the corresponding argument is a long integer.
It is possible to cope with even bigger numbers by using a double (double-precision floating point).
Version 2 of the Character Counting Program
We’ve replaced the while loop with a for loop, which states the logic more concisely. printf uses %f for both float and double; %.0f suppresses printing of the decimal point and the fraction part, which is zero.
Note that the body of the for loop is empty: all the work is done in the test and increment parts. The grammatical rules of C require a for statement to have a body, so the isolated semicolon, called the null statement, is there to satisfy that requirement.
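Here is the output of these programs. I used Ctrl+Z to signal end of file (this is the Windows convention; on Unix-like systems the equivalent is Ctrl+D at the start of a line).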
3. Line Counting
This next program counts input lines. Remember that the standard library ensures that an input text stream appears as a sequence of lines, each terminated by a newline. Counting lines is therefore just counting newlines.
Our line counting program is:
The expected output is a single number: the count of newline characters read from the input.
4. Word Counting
The fourth in our series of useful programs counts lines, words, and characters, with the loose definition that a word is any sequence of characters that does not contain a blank, tab, or newline.
Example:
The phrase “steem is going to the moon”, typed as one line of input, consists of a single line (1), 6 words, and 27 characters (the 26 characters of the phrase plus the newline that ends the line).
The code for this program is given below:
The state OUT means that the program is currently not examining a word; it is “outside a word”. We prefer the symbolic constants IN and OUT to the literal values 1 and 0 because they make the program more readable. (You’ll appreciate this technique when you start writing larger programs.) The line nl = nw = nc = 0; sets all three variables to zero. This works because an assignment is an expression with a value, and assignments associate from right to left. It is as if we had written nl = (nw = (nc = 0)); but the first form is the one you will meet in real programs.
The operator || means OR. The line if (c == ' ' || c == '\n' || c == '\t') says “if c is a blank OR c is a newline OR c is a tab”. There is a corresponding operator && for AND; its precedence is just higher than that of ||. Expressions connected by && or || are evaluated from left to right, and evaluation stops as soon as the truth or falsehood of the whole expression is known. In our program, if c is a blank, there is no need to test whether it is a newline or a tab, so those tests are not made. (This will be significant in more complicated situations, as we will soon see.)
The example also shows an else, which specifies an alternative action when the condition part of an if statement is false.
The general form is
    if (expression) statement1 else statement2
One and only one of the two statements is executed: if the expression is true, statement1 is executed; if not, statement2 is executed. Each statement can be a single statement or several in braces.
Disclaimer: this article is a summary of Section 1.5 of The C Programming Language (ANSI C) by Brian Kernighan and Dennis Ritchie. Apart from rephrasing, the content is identical: most of the code listings are screenshots from the book, and the same lines of code are discussed.
Thank you for reading. copyright 2018 by @sinbad989