The program wfc counts the frequencies of words in
a text document and writes the words along with their
frequencies to a file.
The lines of the output file are ordered according
to the word frequencies in descending order, i.e.,
the most frequent word is at the top.
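
The counting and ordering step can be sketched as follows. This is a minimal single-process illustration, not the code in wfc.cpp: the helper names count_words and sort_by_frequency are hypothetical, words are assumed to be separated by whitespace, and ties are broken alphabetically, which the description above does not specify.

```cpp
#include <algorithm>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<std::string, long> WordCount;

// Hypothetical helper: count whitespace-separated words in `text`.
// The tokenization rules of wfc itself may differ.
std::map<std::string, long> count_words(const std::string& text) {
    std::map<std::string, long> counts;
    std::istringstream in(text);
    std::string word;
    while (in >> word) {
        ++counts[word];
    }
    return counts;
}

// Compares two entries so that the higher frequency sorts first.
bool more_frequent(const WordCount& a, const WordCount& b) {
    return a.second > b.second;
}

// Hypothetical helper: order the entries by descending frequency.
// std::map iterates alphabetically and std::stable_sort keeps that
// order among equal frequencies, so ties end up alphabetical.
std::vector<WordCount> sort_by_frequency(const std::map<std::string, long>& counts) {
    std::vector<WordCount> ordered(counts.begin(), counts.end());
    std::stable_sort(ordered.begin(), ordered.end(), more_frequent);
    return ordered;
}
```

Writing each entry of the returned vector on its own line gives an output file with the most frequent word first.
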
The program wfc makes use of parallelism by
letting child processes parse different parts
of the input file.
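
The idea of letting forked children work on different parts of the input can be sketched as below. This is an illustration under stated assumptions, not the implementation in wfc.cpp: all function names are hypothetical, the children only count the total number of words instead of per-word frequencies, chunk boundaries are moved to whitespace so that no word is cut in half, and an anonymous shared mmap region is used for the results, which may differ from the shared memory mechanism wfc actually uses.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

typedef std::pair<std::size_t, std::size_t> Range;  // half-open [begin, end)

static bool is_space(char c) {
    return std::isspace(static_cast<unsigned char>(c)) != 0;
}

// Split the text into at most `parts` ranges whose ends fall on whitespace
// (or on the end of the text), so every word lies entirely in one range.
std::vector<Range> split_on_whitespace(const std::string& text, unsigned parts) {
    std::vector<Range> ranges;
    if (parts == 0) parts = 1;
    const std::size_t n = text.size();
    std::size_t begin = 0;
    for (unsigned i = 1; i <= parts && begin < n; ++i) {
        std::size_t end = (i == parts) ? n : std::max(begin, n / parts * i);
        while (end < n && !is_space(text[end])) ++end;  // finish the current word
        ranges.push_back(Range(begin, end));
        begin = end;
    }
    return ranges;
}

// Count the words that start inside [begin, end).
std::size_t count_words_in_range(const std::string& text, std::size_t begin, std::size_t end) {
    std::size_t count = 0;
    bool in_word = false;
    for (std::size_t i = begin; i < end; ++i) {
        const bool space = is_space(text[i]);
        if (!space && !in_word) ++count;
        in_word = !space;
    }
    return count;
}

// Fork one child per range; each child writes its result into its own slot
// of a shared mapping, and the parent sums the slots after waiting.
std::size_t parallel_word_count(const std::string& text, unsigned parallelism) {
    const std::vector<Range> ranges = split_on_whitespace(text, parallelism);
    if (ranges.empty()) return 0;
    const std::size_t bytes = ranges.size() * sizeof(std::size_t);
    void* mem = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) throw std::runtime_error("mmap failed");
    std::size_t* slots = static_cast<std::size_t*>(mem);
    std::vector<pid_t> children;
    for (std::size_t k = 0; k < ranges.size(); ++k) {
        const pid_t pid = fork();
        if (pid == 0) {  // child: process one range, then exit immediately
            slots[k] = count_words_in_range(text, ranges[k].first, ranges[k].second);
            _exit(0);
        } else if (pid > 0) {
            children.push_back(pid);
        } else {  // fork failed: fall back to doing this range in the parent
            slots[k] = count_words_in_range(text, ranges[k].first, ranges[k].second);
        }
    }
    for (std::size_t k = 0; k < children.size(); ++k) waitpid(children[k], NULL, 0);
    std::size_t total = 0;
    for (std::size_t k = 0; k < ranges.size(); ++k) total += slots[k];
    munmap(mem, bytes);
    return total;
}
```

Because every child writes only to its own slot, the sketch needs no locking; merging per-word frequency tables from several children would require more coordination than is shown here.
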
This is my solution to an assignment in the course CS511 (Concurrent Programming) at Stevens Institute of Technology in the Fall of 2012.
The program can be compiled by running make inside the folder
where wfc.cpp and Makefile are located.
The general usage syntax is: wfc [-p <parallelism>] [-i <input file>] [-o <output file>], where

parallelism is the number of child processes to fork. If this parameter is unspecified, 4 will be used as the parallelism.

input file is the path to the input file. If this parameter is unspecified, test_in.txt will be used as the input file.

output file is the path to the output file. If this parameter is unspecified, test_out.txt will be used as the output file.
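
For illustration, the three options and their defaults could be parsed with POSIX getopt roughly as follows. This is a sketch, not the parsing code in wfc.cpp: the Options struct and the parse_options function are hypothetical names, and rejecting a non-positive or non-numeric -p value with an exception is an assumption about the error handling.

```cpp
#include <cstdlib>
#include <stdexcept>
#include <string>
#include <unistd.h>

// Hypothetical container for the command-line settings, initialized
// with the defaults named in the usage description.
struct Options {
    long parallelism;
    std::string input;
    std::string output;
    Options() : parallelism(4), input("test_in.txt"), output("test_out.txt") {}
};

// Parse -p, -i and -o; options that are absent keep their defaults.
Options parse_options(int argc, char* argv[]) {
    Options opts;
    optind = 1;  // restart scanning so the function can be called more than once
    int c;
    while ((c = getopt(argc, argv, "p:i:o:")) != -1) {
        switch (c) {
        case 'p': {
            char* end = NULL;
            const long value = std::strtol(optarg, &end, 10);
            // Assumption: only positive integers are accepted.
            if (end == optarg || *end != '\0' || value <= 0) {
                throw std::invalid_argument("-p expects a positive integer");
            }
            opts.parallelism = value;
            break;
        }
        case 'i':
            opts.input = optarg;
            break;
        case 'o':
            opts.output = optarg;
            break;
        default:  // unknown option or missing argument
            throw std::invalid_argument("invalid command-line option");
        }
    }
    return opts;
}
```

With this sketch, the arguments -p 8 -i book.txt would yield a parallelism of 8, the input file book.txt, and the default output file test_out.txt.
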
Make sure that you are allowed to allocate enough shared memory
for the input file.
The maximum shared memory size must be set to a value that is slightly
higher than the input file size.
You can set the maximum shared memory size with the command
sysctl -w kernel.shmmax=size where size is the desired size in
bytes.
For example, sysctl -w kernel.shmmax=2147483648 sets the
maximum shared memory segment size to 2 GiB.
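
Whether the current limit is large enough for a given input file can be checked before running wfc. The sketch below is not part of wfc.cpp: the function names are hypothetical, the limit is read from /proc/sys/kernel/shmmax (the procfs view of kernel.shmmax on Linux), and "slightly higher" is interpreted as strictly greater than the file size because no exact margin is stated above.

```cpp
#include <fstream>
#include <string>

// 2 GiB in bytes, matching the sysctl example: 2 * 1024^3 = 2147483648.
const unsigned long long TWO_GIB = 2ULL * 1024 * 1024 * 1024;

// Hypothetical helper: read the shared memory limit in bytes.
// Returns 0 if the file cannot be opened or parsed.
unsigned long long read_shmmax(const std::string& path = "/proc/sys/kernel/shmmax") {
    std::ifstream in(path.c_str());
    unsigned long long value = 0;
    if (!(in >> value)) return 0;
    return value;
}

// Hypothetical helper: size of a file in bytes, or -1 if it cannot be opened.
long long file_size(const std::string& path) {
    std::ifstream in(path.c_str(), std::ios::binary | std::ios::ate);
    if (!in) return -1;
    return static_cast<long long>(in.tellg());
}

// Assumption: the limit is sufficient when it is strictly greater
// than the input size.
bool shmmax_sufficient(unsigned long long shmmax, unsigned long long input_size) {
    return shmmax > input_size;
}
```

Calling shmmax_sufficient(read_shmmax(), size), where size is a non-negative result of file_size for the input file, indicates whether the limit has to be raised with sysctl first.
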
(Copyright) 2012 Fabian Foerg