Invariant File Character Selection

CSC511 Formal Methods in Programming

James Canavan

E-Mail: jaco@donotenter.com
WWW: www.donotenter.com

© September 25, 1993

Introduction:

This (short) program reads the values from an input file and checks, character by character, whether they are repeated, until a "*" (star) is read in. Using a termination character is fortuitous because it allows the use of an empty file (one that contains only the "*"). The program keeps a running count of the longest chain of repeated characters and outputs this value when all characters have been read in, i.e. when a "*" is read in.

Details:

The program was written on the SUNY system using ANSI C - this is new for me as I am not a C programmer, but I am determined to become proficient in this language in order to make myself more marketable. Anyway, the program first checks for the correct file input using:
if(argc==1)
{
	printf("no file specified");
	exit(1);
}
... and, in case there is a problem opening the input file:
if((fptr1=fopen(argv[1], "r")) == (FILE*) NULL)
{
	printf("error opening file one, program terminates ...");
	exit(2);
}
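The two checks above can also be folded into one helper. This is only a minimal sketch: the name open_input is hypothetical and not part of the original program, and, unlike the original, it writes its messages to stderr and returns NULL instead of calling exit, so that the caller decides what to do.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not in the original program): returns the opened
   input file, or NULL after printing a message to stderr when no file
   name was given or the file cannot be opened. */
FILE *open_input(int argc, char *argv[])
{
    FILE *fp;

    assert(argv != NULL);

    if (argc < 2) {
        fprintf(stderr, "no file specified\n");
        return NULL;
    }
    fp = fopen(argv[1], "r");
    if (fp == NULL) {
        fprintf(stderr, "error opening %s\n", argv[1]);
        return NULL;
    }
    return fp;
}
```

A caller would then write something like fptr1 = open_input(argc, argv); and exit with a non-zero status if the result is NULL.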
Up until this point, the program has only set up three counters (newcount, total, and oldcount) and three character storage locations (ch, chOld, and chNew). The count values (newcount and oldcount) are set to one (1) and total is set to zero (0). The reason is that the count variables only come into play once there has been a repetition, so the count starts at one; total starts at zero to cover the empty string (i.e. a file with just the termination character, "*").
The program gets a character with:
fscanf(fptr1, "%c", &ch);
... and the algorithm fun begins...
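while(ch!='*')
{
	chNew = ch;
	if (chNew == chOld)
	{
		/* same character again: extend the current run */
		newcount = newcount + 1;
		oldcount = newcount;
		if (oldcount > total) total = oldcount;
	}
	else
	{
		/* different character: start a new run of length one */
		newcount = 1;
	}
	chOld = chNew;
	/* stop if the file ends before a '*' is found */
	if (fscanf(fptr1, "%c", &ch) != 1) break;
}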
The core logic here is that the loop first tests that the current character is NOT the termination character, and then copies it into chNew. It is then compared with the chOld character to see whether this new character is repeated or not. If it is, newcount is incremented, this count is put into oldcount, and total is updated if oldcount (which is really the present count) is greater than the total for the entire file. After the if statement, the new character (chNew) is saved in chOld, and another character is read in. Note that if a running count is NOT in progress, newcount is reset to one (1), in order to begin counting a new series of repeated characters. Because total is only updated on a repetition, a file with no repeated characters leaves total at zero.
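The loop described above can be tried by hand on an in-memory string rather than a file. The following is a minimal sketch under some assumptions: the function name longest_repeat and the string interface are hypothetical, chOld is started at '*' (a value that can never match a character inside the loop), and the scan also stops at the end of the string if no "*" is present. The counting rule is the one in the loop as written, so only repetitions update the total.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: same counting rule as the loop in the report,
   applied to a string instead of a file. Stops at '*' or at the end of
   the string, whichever comes first. */
int longest_repeat(const char *s)
{
    int newcount = 1;   /* length of the run that ends at chOld */
    int total = 0;      /* longest repeated run seen so far; 0 if none */
    char chOld = '*';   /* assumption: '*' never matches inside the loop */
    size_t i;

    assert(s != NULL);

    for (i = 0; s[i] != '\0' && s[i] != '*'; i++) {
        char chNew = s[i];
        if (chNew == chOld) {
            /* same character again: extend the current run */
            newcount = newcount + 1;
            if (newcount > total)
                total = newcount;
        } else {
            /* different character: start a new run of length one */
            newcount = 1;
        }
        chOld = chNew;
    }
    return total;
}
```

With this rule, longest_repeat("aabbbbc*") returns 4, while longest_repeat("*") and longest_repeat("abc*") both return 0, since neither input contains a repeated character.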

Lastly, the value of total is printed out and the input file is closed:

printf( "longest array: %d\n",total);
fclose(fptr1);

Output:

Included at the back of this file are the two input files (actually, one file has only a "*", the termination character, to indicate an empty array). The output script is also included. The program successfully counted the longest run of repeated characters: zero (0) for the empty-set file and ten (10) for the other.
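The two cases can be reproduced without hand-made input files. The sketch below is an illustration only: the helper names are hypothetical, the test text is made up (it is not the content of the input files mentioned above), and fgetc is used in place of fscanf so that the end of the file is detected even when no "*" is present.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: longest run of repeated characters read from fp,
   stopping at '*' or at end of file. Uses the report's counting rule, so
   an input with no repetition gives 0. */
int longest_repeat_in_file(FILE *fp)
{
    int newcount = 1, total = 0;
    int chOld = EOF;    /* EOF never matches a character read in the loop */
    int ch;

    assert(fp != NULL);

    while ((ch = fgetc(fp)) != EOF && ch != '*') {
        if (ch == chOld) {
            newcount++;
            if (newcount > total)
                total = newcount;
        } else {
            newcount = 1;
        }
        chOld = ch;
    }
    return total;
}

/* Hypothetical helper: write text to a temporary file, then count.
   Returns -1 if the temporary file cannot be created. */
int longest_repeat_of_text(const char *text)
{
    FILE *fp = tmpfile();
    int result;

    if (fp == NULL)
        return -1;
    fputs(text, fp);
    rewind(fp);
    result = longest_repeat_in_file(fp);
    fclose(fp);
    return result;
}
```

Calling longest_repeat_of_text("*") gives 0 for the empty case, and a text containing a run of ten identical characters followed by "*" gives 10, provided the temporary file can be created.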

Conclusions:

The program does what it should: it counts the longest array of repeated characters from a list of characters, and it also works for an empty array.

In completing this C program I noticed the transition from the theory we are learning in class to the actual programming in the language (on the SUNY pool machine). That is, I knew what I wanted to do using the symbology from chapters 3 and 4 of the text (The Science of Programming, by David Gries), but getting it into code was NOT a clear-cut translation. This was interesting for me because it showed that solving a problem includes not only the higher-level algorithmic solution but also the lower-level "nuts-and-bolts" implementation.

