Assignment #6 - 92B


Assignment 6                                  ==>>>>>>> DUE Fri 20 Nov
============                                            WEIGHT 35
                                              ==>>>>>>> E-mail HDNY
                                                        TA A. Ezust
!!! Last UPDATED 11 November 1992 !!!

1. This assignment can be completed as originally set and will be
graded out of 25.  That is, there are 5 bonus marks for students
who do the first version of the assignment.

2. To make the assignment easier and still get full marks of 20/20,
only output THE most frequent word, and ignore the percentage.

3. Note the revised deadline for submitting your assignment.


If you have questions, comments or other requests, send e-mail to
depeche@cs.mcgill.ca. Assignments, however, must still be sent
to HDNY from your HD-- code on MUSICB!!

Write a Turbo Pascal program that reads a text file, counts occurrences
of each unique word, and reports information about the file,
as described below.

Your program must throw out all words with 2 or fewer letters, all
punctuation marks, and all non-alphabetic characters. In addition, the
case of a word is to be ignored (e.g. tOtalLy and Totally are the
same word). Words like e-mail should be treated as TWO words, e and mail
(so punctuation marks, newlines, OR spaces can separate words).
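
To make the word rules concrete, here is a rough sketch of how one line
of text might be split. It is only an illustration, not a required
design: the names IsLetter, LowerCh, HandleWord and SplitLine are made
up, and it assumes that ANY non-letter (digits included) ends a word and
that thrown-away words are still counted.

```pascal
program SplitDemo;
{ Sketch only: splits one line into words using the rules above. }
{ All identifiers are hypothetical; digits are assumed to separate words. }

const
  MinLen = 3;   { words with 2 or fewer letters are thrown away }

var
  Kept, Thrown: Integer;

function IsLetter(Ch: Char): Boolean;
begin
  IsLetter := (Ch in ['A'..'Z']) or (Ch in ['a'..'z']);
end;

function LowerCh(Ch: Char): Char;
begin
  if Ch in ['A'..'Z'] then
    LowerCh := Chr(Ord(Ch) + 32)
  else
    LowerCh := Ch;
end;

procedure HandleWord(W: string);
begin
  if Length(W) >= MinLen then
  begin
    Inc(Kept);
    WriteLn('keep:  ', W);
  end
  else
  begin
    Inc(Thrown);
    WriteLn('throw: ', W);
  end;
end;

procedure SplitLine(Line: string);
var
  I: Integer;
  W: string;
begin
  W := '';
  for I := 1 to Length(Line) do
    if IsLetter(Line[I]) then
      W := W + LowerCh(Line[I])   { build the word in lower case }
    else if W <> '' then
    begin
      HandleWord(W);              { a non-letter ends the current word }
      W := '';
    end;
  if W <> '' then
    HandleWord(W);                { flush the last word on the line }
end;

begin
  Kept := 0;
  Thrown := 0;
  SplitLine('tOtalLy and Totally: send e-mail to me!');
  WriteLn('kept = ', Kept, ', thrown away = ', Thrown);
end.
```

On the sample line, both spellings of "totally" come out identical, and
"e", "to" and "me" are thrown away while "mail" is kept.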

When the information gathering is finished, output the 10 words with the
highest frequency in descending order of frequency. In addition, output the
following:

o total word count
o word count of words thrown away
o average word length over ALL words, including those thrown away
o percent of words in the file which made it into the top ten
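
The arithmetic behind these figures can be sketched as follows. This is
one possible reading, not the official one: it assumes the percentage is
taken over the total word count (thrown-away words included), uses a
selection sort, and runs on a tiny made-up table with TopN = 3 instead
of 10. Every name and number in it is hypothetical.

```pascal
program TopDemo;
{ Sketch only: a tiny made-up table stands in for the real word table. }

const
  TableSize = 5;
  TopN = 3;        { the assignment asks for 10 }

type
  Entry = record
    Name: string[20];
    Count: Integer;
  end;

var
  Table: array[1..TableSize] of Entry;
  TotalWords, TotalLetters, TopSum: LongInt;
  I: Integer;

procedure SetEntry(Index: Integer; W: string; C: Integer);
begin
  Table[Index].Name := W;
  Table[Index].Count := C;
end;

procedure SortByCount;
{ Selection sort, highest count first. }
var
  I, J, Best: Integer;
  Tmp: Entry;
begin
  for I := 1 to TableSize - 1 do
  begin
    Best := I;
    for J := I + 1 to TableSize do
      if Table[J].Count > Table[Best].Count then
        Best := J;
    Tmp := Table[I];
    Table[I] := Table[Best];
    Table[Best] := Tmp;
  end;
end;

begin
  SetEntry(1, 'word', 4);
  SetEntry(2, 'the', 9);
  SetEntry(3, 'file', 2);
  SetEntry(4, 'program', 6);
  SetEntry(5, 'your', 3);
  TotalWords := 40;     { made-up total, thrown-away words included }
  TotalLetters := 180;  { made-up letter count over ALL words }

  SortByCount;
  TopSum := 0;
  for I := 1 to TopN do
  begin
    WriteLn(Table[I].Name, ' ', Table[I].Count);
    TopSum := TopSum + Table[I].Count;
  end;
  WriteLn('average word length: ', TotalLetters / TotalWords:0:2);
  WriteLn('percent in top ', TopN, ': ', 100.0 * TopSum / TotalWords:0:1);
end.
```

With these made-up numbers the top three are the (9), program (6) and
word (4), the average length is 180/40 = 4.50, and the top three cover
19 of 40 words, or 47.5 percent.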

NOTE: I want to see separate procedures or functions for each major
operation, so the main program should be kept very small (no more than 20
lines of code, but preferably <= 10).
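
As a rough picture of what "very small" means, a skeleton along these
lines would do. The procedure names are placeholders only, and the
bodies are left empty on purpose; you are free to divide the work
differently.

```pascal
program WordFreq;
{ Skeleton only: every procedure name here is a placeholder. }

procedure InitTable;
begin
  { clear the word table and the counters }
end;

procedure ReadWords;
begin
  { read the file, split it into words, update the table }
end;

procedure SortTable;
begin
  { order the table by descending frequency }
end;

procedure ReportResults;
begin
  { print the top ten words and the summary figures }
  WriteLn('report goes here');
end;

begin
  InitTable;
  ReadWords;
  SortTable;
  ReportResults;
end.
```

The main program here is four lines long, well inside the limit.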

You can use THIS assignment description as your test data.

If you are using arrays with fixed length, and you run out of space
to hold new words as they are being read into your program, OUTPUT AN ERROR
MESSAGE, report what you read in so far, and stop the execution of the
program. Make sure that the program has its constants set up so
that it can read this file without an error message resulting.
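
One way the overflow check might look is sketched below. It is an
illustration under assumptions, not a required layout: MaxWords is set
to 3 only so that the error path actually runs, the words are held in a
simple unsorted array searched linearly, and the names AddWord, Full,
Used and Sample are invented for this example.

```pascal
program TableDemo;
{ Sketch only: MaxWords is deliberately tiny so the overflow path runs. }

const
  MaxWords = 3;

type
  WordStr = string[20];
  Entry = record
    Name: WordStr;
    Count: Integer;
  end;

const
  SampleSize = 6;
  Sample: array[1..SampleSize] of WordStr =
    ('mail', 'word', 'mail', 'file', 'extra', 'word');

var
  Table: array[1..MaxWords] of Entry;
  Used, I: Integer;
  Full: Boolean;

procedure AddWord(W: WordStr);
{ Counts W if already known, adds it if there is room, else sets Full. }
var
  K: Integer;
begin
  for K := 1 to Used do
    if Table[K].Name = W then
    begin
      Inc(Table[K].Count);
      Exit;
    end;
  if Used = MaxWords then
  begin
    Full := True;     { no room for a new word }
    Exit;
  end;
  Inc(Used);
  Table[Used].Name := W;
  Table[Used].Count := 1;
end;

begin
  Used := 0;
  Full := False;
  I := 1;
  while (I <= SampleSize) and not Full do   { stop reading once full }
  begin
    AddWord(Sample[I]);
    Inc(I);
  end;
  if Full then
    WriteLn('ERROR: word table is full; reporting words read so far');
  for I := 1 to Used do
    WriteLn(Table[I].Name, ' ', Table[I].Count);
end.
```

Here the fifth sample word does not fit, so the program prints the error
message, reports mail 2, word 1 and file 1, and stops. In your own
program the constant would of course be large enough for this file.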