Answered step by step
Verified Expert Solution
Link Copied!

Question

1 Approved Answer

The main objectives of this lab include: set up a 2d array (or matrix) with proper initial values using vector of vectors given a string,

The main objectives of this lab include:

set up a 2d array (or matrix) with proper initial values using vector of vectors

given a string, implement a tokenizer to identify all the unique tokens contained within this string and the number of times (i.e., frequency) a given token appears in this string.

create a new datatype using struct to hold a token and its frequency; further store all the tokens and their frequencies into a vector

implement the insertion sort algorithm to sort the list of tokens in increasing order of frequency

implement the selection sort algorithm to sort the list of tokens in decreasing order of frequency

1. Implement the following function to create a matrix of dimensionality numRows x numCols, where matrix starts with an initial size of 0. Furthermore, initialize the value at matrix[i][j] to the product of i and j.

void matrixInit( vector< vector >& matrix, int numRows, int numCols); 

For example, if numRows is 3, numCols is 4, you are going to allocate space to matrix so that it contains three row vectors, where the size of each row vector will be 4. At the end of this function call, the content of matrix will be:

size of matrix is: 3x4 matrix[0][0]=0 matrix[0][1]=0 matrix[0][2]=0 matrix[0][3]=0 matrix[1][0]=0 matrix[1][1]=1 matrix[1][2]=2 matrix[1][3]=3 matrix[2][0]=0 matrix[2][1]=2 matrix[2][2]=4 matrix[2][3]=6 

2. Given a string variable string s; , implement a tokenizer to identify the unique tokens contained in this string, identify all the unique tokens and their frequencies, and store such information into a vector. Tokens are sequences of contiguous characters separated by any of the specified delimiters (e.g., white spaces). In this lab, only white spaces will be considered as delimiters. For instance, the string "Hello, what's that thing? " contains four tokens: "Hello", "what's", "that" and "thing?". The frequency of a token is the number of times this token appears in this string. In this example, each token has a frequency of 1. Note that in this lab, these tokens are case insensitive. For example, "Hello" and "hello" are considered to be the same token.

Specifically, you are required to

declare a struct TokenFreq that consists of two data members: (1) string token; and (2) int freq; Obviously, an object of this struct will be used to store a specific token and its frequency. For example, the following object word stores the token "dream" and its frequency 100:

 TokenFreq word; word.value="dream"; word.freq=100; 

implement the following function, where istr is the input string, and tfVec will be used to store the list of unique and case insensitive tokens and their corresponding frequencies identified within istr. You might find it's very convenient to use stringstream objects to tokenize a string.

void getTokenFreqVec( const string& istr, vector & tfVec) 

Assume that the value of istr is

And no, I'm not a walking C++ dictionary. I do not keep every technical detail in my head at all times. If I did that, I would be a much poorer programmer. I do keep the main points straight in my head most of the time, and I do know where to find the details when I need them. by Bjarne Stroustrup 

After calling the above function, tfVec is expected to contain the following values (where order of appearances doesn't matter):

size=46 { [0] = (token = "and", freq = 2) [1] = (token = "no,", freq = 1) [2] = (token = "i'm", freq = 1) [3] = (token = "not", freq = 2) [4] = (token = "a", freq = 2) [5] = (token = "walking", freq = 1) [6] = (token = "c++", freq = 1) [7] = (token = "dictionary.", freq = 1) [8] = (token = "i", freq = 6) [9] = (token = "do", freq = 3) [10] = (token = "keep", freq = 2) [11] = (token = "every", freq = 1) [12] = (token = "technical", freq = 1) [13] = (token = "detail", freq = 1) [14] = (token = "in", freq = 2) [15] = (token = "my", freq = 2) [16] = (token = "head", freq = 2) [17] = (token = "at", freq = 1) [18] = (token = "all", freq = 1) [19] = (token = "times.", freq = 1) [20] = (token = "if", freq = 1) [21] = (token = "did", freq = 1) [22] = (token = "that,", freq = 1) [23] = (token = "would", freq = 1) [24] = (token = "be", freq = 1) [25] = (token = "much", freq = 1) [26] = (token = "poorer", freq = 1) [27] = (token = "programmer.", freq = 1) [28] = (token = "the", freq = 3) [29] = (token = "main", freq = 1) [30] = (token = "points", freq = 1) [31] = (token = "straight", freq = 1) [32] = (token = "most", freq = 1) [33] = (token = "of", freq = 1) [34] = (token = "time,", freq = 1) [35] = (token = "know", freq = 1) [36] = (token = "where", freq = 1) [37] = (token = "to", freq = 1) [38] = (token = "find", freq = 1) [39] = (token = "details", freq = 1) [40] = (token = "when", freq = 1) [41] = (token = "need", freq = 1) [42] = (token = "them.", freq = 1) [43] = (token = "by", freq = 1) [44] = (token = "bjarne", freq = 1) [45] = (token = "stroustrup", freq = 1) 

Implement the selection sort algorithm to sort a vector in ascending order of token frequency. The pseudo code of the selection algorithm can be found at this webpage. You can also watch an animation of the sorting process at http://visualgo.net/sorting -->under "select". This function has the following prototype:

void selectionSort( vector & tokFreqVector ); //This function receives a vector of TokenFreq objects by reference and applies the selections sort algorithm to sort this vector in increasing order of token frequencies. 

Implement the insertion sort algorithm to sort a vector in descending order of token frequency. The pseudo code of the selection algorithm can be found at [http://www.algolist.net/Algorithms/Sorting/Insertionsort}(http://www.algolist.net/Algorithms/Sorting/Insertionsort). Use the same link above to watch an animation of this algorithm. This function has the following prototype:

 void insertionSort( vector & tokFreqVector ); 

Remember to include a main() function to test the above functions.

LAB

Submission Instructions

Deliverables

matrixTokenizerSorting.cpp

You must submit these file(s)

Compile command

g++ matrixTokenizerSorting.cpp -Wall -o a.out

We will use this command to compile your code

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image

Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image

Step: 3

blur-text-image

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

Time Series Databases New Ways To Store And Access Data

Authors: Ted Dunning, Ellen Friedman

1st Edition

1491914726, 978-1491914724

More Books

Students also viewed these Databases questions

Question

What is focal length? Explain with a diagram and give an example.

Answered: 1 week ago