Assignment 5: Hash Table implementation andconcordance There are three parts to this assignment. In the first two parts,you will complete the implementation of a hash
Assignment 5: Hash Table implementation andconcordance
There are three parts to this assignment. In the first two parts,you will complete the implementation of a hash map and aconcordance program. In the third part, you will answer a number ofquestions using the concordance program. There is also an extracredit opportunity.
Prerequisites
Reading chapter 12, watching this week’s lecture on hash tableswith chaining, and completing worksheet 38 will better prepare youto tackle this assignment. It may also be helpful to review C fileI/O with fopen() and fclose().
Hash map
First complete the hash map implementation in hashMap.c. This hashmap uses a table of buckets, each containing a linked list of hashlinks. Each hash link stores the key-value pair (string and integerin this case) and a pointer to the next link in the list. SeehashMap.h and the accompanied drawing posted with this assignmentfor clarification. You must implement each function in hashMap.cwith the // FIXME: implement comment.
At the top of hashMap.h you should see two macros: HASH_FUNCTIONand MAX_TABLE_LOAD. You are free to change their definitions butknow that the default values will be used when grading.HASH_FUNCTION is the name of the hash function you want to use. Youwill change this when answering the written part of the assignment.Make sure everywhere in your implementation to useHASH_FUNCTION(key) instead of directly calling a hash function.MAX_TABLE_LOAD is the table load threshold on which you shouldresize the table.
A number of tests for the hash map are included in tests.c. Eachone of these test cases use several or all of the hash mapfunctions, so don’t expect tests to pass until you implement all ofthem. Each test case is slightly more thorough than the one beforeit and there is a lot of redundancy to better ensure correctness.Use these tests to help you debug your hash map implementation.They will also help your TA grade your submission. You can buildthe tests with make tests or make and run them with ./tests.
Concordance
The concordance counts how many times each word occurs in adocument. You will implement a concordance using the hash mapimplementation from the previous part. Each hash link in the tablewill store a word from the document as the key and the number oftimes the word appeared as the value. You must finish theconcordance implementation in main.c.
You are provided with a function nextWord() which takes a FILE*,allocates memory for the next word in the file, and returns theword. If the end of the file is reached, nextWord() will returnNULL. It is your job to open the file using fopen(), populate theconcordance with the words, and close the file with fclose().
The file name to open should be given as a command line argumentwhen running the program. It will default to input1.txt if no filename is provided. Your concordance code should loop over the wordsuntil the end of the file is reached, doing the following stepseach iteration:
1. Get the next word with getWord.
2. If the word is already in the hash map, then increment itsnumber of occurrences.
3. Otherwise, put the word in the hash map with a count of 1.
4. Free the word.
After processing the text file, print all words and occurrencecounts in the hash map. Please print them in the format of thefollowing example above the call to hashMapPrint():
best: 1
It: 2
was: 2
the: 2
of: 2
worst: 1
times: 2
You can build the program with make prog or make and run it with./prog
Written
Submit a pdf or text file answering the following questions:
1. Give an example of two words that would hash to the same valueusing hashFunction1 but would not using hashFunction2.
2. Why does the above observation make hashFunction2 superior tohashFunction1?
3. When you run your program on the same input file once withhashFunction1 and once with hashFunction2, is it possible for yourhashMapSize function to return different values?
4. When you run your program on the same input file once withhashFunction1 and once with hashFunction2, is it possible for yourhashMapTableLoad function to return different values?
5. When you run your program on the same input file once withhashFunction1 and once with hashFunction2, is it possible for yourhashMapEmptyBuckets function to return different values?
6. Is there any difference in the number of empty buckets when youchange the table size from an even number like 1000 to a prime like997?
Last Item
There are a lot of uses for a hash map, and one of them isimplementing a spell checker. All you need to get started is adictionary, which is provided in dictionary.txt. In spellChecker.cyou will find some code to get you started with the spell checker.It is fairly similar to the code in main.c.
You can build the program with makespellChecker.
FYI:
The spellchecker program flow should as following -
1. The user types in a word
2. Potential matches are outputted Like "Did you mean...?"etc
3. Continue to prompt user for word until they type quit
The best way to implement a dictionary that's used for aspellchecker would probably be to design it with that purpose inmind from the beginning, i.e. associating a similarity for eachword to some base word (maybe "abcdefghijklmnopqrstuvwyz") and thenincorporating that into the hash function. There are better ways (https://en.wikipedia.org/wiki/Levenshtein_distance) to establishsimilarity than computing the cosine of the angle between twovectors (strings) to create a list of candidates and furtherwinnowed that list according to substring comparisons. So, I wouldsay calculating the Levenshtein distance between the misspelledword and all strings in the dictionary, create 5/6 best candidatesand print them as suggestion.
Files are below, must be written in C.
CuTest.h
#ifndef CU_TEST_H
#define CU_TEST_H
#include
#include
#define CUTEST_VERSION "CuTest 1.5"
/* CuString */
char* CuStrAlloc(int size);
char* CuStrCopy(const char* old);
#define CU_ALLOC(TYPE) ((TYPE*)malloc(sizeof(TYPE)))
#define HUGE_STRING_LEN 8192
#define STRING_MAX 256
#define STRING_INC 256
typedef struct
{
int length;
int size;
char* buffer;
} CuString;
void CuStringInit(CuString* str);
CuString* CuStringNew(void);
void CuStringRead(CuString* str, const char* path);
void CuStringAppend(CuString* str, const char* text);
void CuStringAppendChar(CuString* str, char ch);
void CuStringAppendFormat(CuString* str, const char* format,...);
void CuStringInsert(CuString* str, const char* text, intpos);
void CuStringResize(CuString* str, int newSize);
void CuStringDelete(CuString* str);
/* CuTest */
typedef struct CuTest CuTest;
typedef void (*TestFunction)(CuTest *);
struct CuTest
{
char* name;
TestFunction function;
int failed;
int ran;
int parents;
char* message;
jmp_buf *jumpBuf;
};
void CuTestInit(CuTest* t, const char* name, TestFunctionfunction);
CuTest* CuTestNew(const char* name, TestFunction function);
CuTest* CuTestCopy(CuTest* t);
void CuTestRun(CuTest* tc);
void CuTestDelete(CuTest *t);
/* Internal versions of assert functions -- use the publicversions */
void CuFail_Line(CuTest* tc, const char* file, int line, constchar* message2, const char* message);
void CuAssert_Line(CuTest* tc, const char* file, int line, constchar* message, int condition);
void CuAssertStrEquals_LineMsg(CuTest* tc, const char* file, intline, const char* message, const char* expected, const char*actual);
void CuAssertIntEquals_LineMsg(CuTest* tc, const char* file, intline, const char* message, int expected, int actual);
void CuAssertDblEquals_LineMsg(CuTest* tc, const char* file, intline, const char* message, double expected, double actual, doubledelta);
void CuAssertPtrEquals_LineMsg(CuTest* tc, const char* file, intline, const char* message, void* expected, void* actual);
/* public assert functions */
#define CuFail(tc,ms) CuFail_Line( (tc), __FILE__, __LINE__, NULL, (ms))
#define CuAssert(tc, ms,cond) CuAssert_Line((tc), __FILE__, __LINE__, (ms), (cond))
#define CuAssertTrue(tc,cond) CuAssert_Line((tc), __FILE__, __LINE__, "assert failed",(cond))
#defineCuAssertStrEquals(tc,ex,ac) CuAssertStrEquals_LineMsg((tc),__FILE__,__LINE__,NULL,(ex),(ac))
#define CuAssertStrEquals_Msg(tc,ms,ex,ac) CuAssertStrEquals_LineMsg((tc),__FILE__,__LINE__,(ms),(ex),(ac))
#defineCuAssertIntEquals(tc,ex,ac) CuAssertIntEquals_LineMsg((tc),__FILE__,__LINE__,NULL,(ex),(ac))
#define CuAssertIntEquals_Msg(tc,ms,ex,ac) CuAssertIntEquals_LineMsg((tc),__FILE__,__LINE__,(ms),(ex),(ac))
#defineCuAssertDblEquals(tc,ex,ac,dl) CuAssertDblEquals_LineMsg((tc),__FILE__,__LINE__,NULL,(ex),(ac),(dl))
#define CuAssertDblEquals_Msg(tc,ms,ex,ac,dl)CuAssertDblEquals_LineMsg((tc),__FILE__,__LINE__,(ms),(ex),(ac),(dl))
#defineCuAssertPtrEquals(tc,ex,ac) CuAssertPtrEquals_LineMsg((tc),__FILE__,__LINE__,NULL,(ex),(ac))
#define CuAssertPtrEquals_Msg(tc,ms,ex,ac) CuAssertPtrEquals_LineMsg((tc),__FILE__,__LINE__,(ms),(ex),(ac))
#defineCuAssertPtrNotNull(tc,p) CuAssert_Line((tc),__FILE__,__LINE__,"null pointer unexpected",(p!= NULL))
#define CuAssertPtrNotNullMsg(tc,msg,p)CuAssert_Line((tc),__FILE__,__LINE__,(msg),(p != NULL))
/* CuSuite */
#define MAX_TEST_CASES 1024
#define SUITE_ADD_TEST(SUITE,TEST) CuSuiteAdd(SUITE,CuTestNew(#TEST, TEST))
typedef struct
{
int count;
CuTest* list[MAX_TEST_CASES];
int failCount;
} CuSuite;
void CuSuiteInit(CuSuite* testSuite);
CuSuite* CuSuiteNew(void);
void CuSuiteDelete(CuSuite *testSuite);
void CuSuiteAdd(CuSuite* testSuite, CuTest *testCase);
void CuSuiteAddSuite(CuSuite* testSuite, CuSuite*testSuite2);
void CuSuiteConsume(CuSuite* testSuite, CuSuite* testSuite2);
void CuSuiteRun(CuSuite* testSuite);
void CuSuiteSummary(CuSuite* testSuite, CuString* summary);
void CuSuiteDetails(CuSuite* testSuite, CuString* details);
#endif /* CU_TEST_H */
CuTest.c
#include
#include
#include
#include
#include
#include
#include
#include "CuTest.h"
char* CuStrAlloc(int size)
{
char* newStr = (char*) malloc( sizeof(char) * (size) );
return newStr;
}
char* CuStrCopy(const char* old)
{
int len = strlen(old);
char* newStr = CuStrAlloc(len + 1);
strcpy(newStr, old);
return newStr;
}
void CuStringInit(CuString* str)
{
str->length = 0;
str->size = STRING_MAX;
str->buffer = (char*) malloc(sizeof(char) * str->size);
str->buffer[0] = '0';
}
CuString* CuStringNew(void)
{
CuString* str = (CuString*) malloc(sizeof(CuString));
str->length = 0;
str->size = STRING_MAX;
str->buffer = (char*) malloc(sizeof(char) * str->size);
str->buffer[0] = '0';
return str;
}
void CuStringDelete(CuString *str)
{
if (!str) return;
free(str->buffer);
free(str);
}
void CuStringResize(CuString* str, int newSize)
{
str->buffer = (char*) realloc(str->buffer, sizeof(char) *newSize);
str->size = newSize;
}
void CuStringAppend(CuString* str, const char* text)
{
int length;
if (text == NULL) {
text = "NULL";
}
length = strlen(text);
if (str->length + length + 1 >= str->size)
CuStringResize(str, str->length + length + 1 +STRING_INC);
str->length += length;
strcat(str->buffer, text);
}
void CuStringAppendChar(CuString* str, char ch)
{
char text[2];
text[0] = ch;
text[1] = '0';
CuStringAppend(str, text);
}
void CuStringAppendFormat(CuString* str, const char* format,...)
{
va_list argp;
char buf[HUGE_STRING_LEN];
va_start(argp, format);
vsprintf(buf, format, argp);
va_end(argp);
CuStringAppend(str, buf);
}
void CuStringInsert(CuString* str, const char* text, intpos)
{
int length = strlen(text);
if (pos > str->length)
pos = str->length;
if (str->length + length + 1 >= str->size)
CuStringResize(str, str->length + length + 1 +STRING_INC);
memmove(str->buffer + pos + length, str->buffer + pos,(str->length - pos) + 1);
str->length += length;
memcpy(str->buffer + pos, text, length);
}
void CuTestInit(CuTest* t, const char* name, TestFunctionfunction)
{
t->name = CuStrCopy(name);
t->failed = 0;
t->ran = 0;
t->parents = 0;
t->message = NULL;
t->function = function;
t->jumpBuf = NULL;
}
CuTest* CuTestNew(const char* name, TestFunction function)
{
CuTest* tc = CU_ALLOC(CuTest);
CuTestInit(tc, name, function);
return tc;
}
CuTest* CuTestCopy(CuTest* t)
{
CuTest* copy = CU_ALLOC(CuTest);
memcpy(copy, t, sizeof(CuTest));
return copy;
}
void CuTestDelete(CuTest *t)
{
if (!t) return;
if (--t->parents < 1)
{
free(t->name);
free(t->message);
free(t);
}
}
void CuTestRun(CuTest* tc)
{
jmp_buf buf;
tc->jumpBuf = &buf;
if (setjmp(buf) == 0){
tc->ran = 1;
(tc->function)(tc);
}
tc->jumpBuf = 0;
}
static void CuFailInternal(CuTest* tc, const char* file, intline, CuString* string)
{
char buf[HUGE_STRING_LEN];
sprintf(buf, "%s:%d: ", file, line);
CuStringInsert(string, buf, 0);
tc->failed = 1;
tc->message = string->buffer;
if (tc->jumpBuf != 0) longjmp(*(tc->jumpBuf), 0);
}
void CuFail_Line(CuTest* tc, const char* file, int line, constchar* message2, const char* message)
{
CuString string;
CuStringInit(&string);
if (message2 != NULL){
CuStringAppend(&string, message2);
CuStringAppend(&string, ": ");
}
CuStringAppend(&string, message);
CuFailInternal(tc, file, line, &string);
}
void CuAssert_Line(CuTest* tc, const char* file, int line, constchar* message, int condition)
{
if (condition) return;
CuFail_Line(tc, file, line, NULL, message);
}
void CuAssertStrEquals_LineMsg(CuTest* tc, const char* file, intline, const char* message, const char* expected, const char*actual)
{
CuString string;
if ((expected == NULL && actual == NULL) || (expected !=NULL && actual != NULL && strcmp(expected, actual)== 0))
{
return;
}
CuStringInit(&string);
if (message != NULL){
CuStringAppend(&string, message);
CuStringAppend(&string, ": ");
}
CuStringAppend(&string, "expected <");
CuStringAppend(&string, expected);
CuStringAppend(&string, "> but was <");
CuStringAppend(&string, actual);
CuStringAppend(&string, ">");
CuFailInternal(tc, file, line, &string);
}
void CuAssertIntEquals_LineMsg(CuTest* tc, const char* file, intline, const char* message, int expected, int actual)
{
char buf[STRING_MAX];
if (expected == actual) return;
sprintf(buf, "expected <%d> but was <%d>", expected,actual);
CuFail_Line(tc, file, line, message, buf);
}
void CuAssertDblEquals_LineMsg(CuTest* tc, const char* file, intline, const char* message, double expected, double actual, doubledelta)
{
char buf[STRING_MAX];
if (fabs(expected - actual) <= delta) return;
sprintf(buf, "expected <%f> but was <%f>", expected,actual);
CuFail_Line(tc, file, line, message, buf);
}
void CuAssertPtrEquals_LineMsg(CuTest* tc, const char* file, intline, const char* message, void* expected, void* actual)
{
char buf[STRING_MAX];
if (expected == actual) return;
sprintf(buf, "expected pointer <0x%p> but was <0x%p>",expected, actual);
CuFail_Line(tc, file, line, message, buf);
}
void CuSuiteInit(CuSuite* testSuite)
{
testSuite->count = 0;
testSuite->failCount = 0;
memset(testSuite->list, 0, sizeof(testSuite->list));
}
CuSuite* CuSuiteNew(void)
{
CuSuite* testSuite = CU_ALLOC(CuSuite);
CuSuiteInit(testSuite);
return testSuite;
}
void CuSuiteDelete(CuSuite *testSuite)
{
unsigned int n;
for (n=0; n
CuTestDelete(testSuite->list[n]);
}
}
free(testSuite);
}
void CuSuiteAdd(CuSuite* testSuite, CuTest *testCase)
{
assert(testSuite->count < MAX_TEST_CASES);
testSuite->list[testSuite->count] = testCase;
testSuite->count++;
if (testCase->parents != INT_MAX){
testCase->parents++;
}
else{
testCase = CuTestCopy(testCase);
}
}
void CuSuiteAddSuite(CuSuite* testSuite, CuSuite*testSuite2)
{
int i;
for (i = 0 ; i < testSuite2->count ; ++i){
CuTest* testCase = testSuite2->list[i];
CuSuiteAdd(testSuite, testCase);
}
}
void CuSuiteConsume(CuSuite* testSuite, CuSuite*testSuite2)
{
CuSuiteAddSuite(testSuite, testSuite2);
CuSuiteDelete(testSuite2);
}
void CuSuiteRun(CuSuite* testSuite)
{
int i;
for (i = 0 ; i < testSuite->count ; ++i){
CuTest* testCase = testSuite->list[i];
CuTestRun(testCase);
if (testCase->failed) { testSuite->failCount +=1; }
}
}
void CuSuiteSummary(CuSuite* testSuite, CuString* summary)
{
int i;
for (i = 0 ; i < testSuite->count ; ++i){
CuTest* testCase = testSuite->list[i];
CuStringAppend(summary, testCase->failed ? "F" :".");
}
CuStringAppend(summary, "");
}
void CuSuiteDetails(CuSuite* testSuite, CuString* details)
{
int i;
int failCount = 0;
if (testSuite->failCount == 0){
int passCount = testSuite->count -testSuite->failCount;
const char* testWord = passCount == 1 ? "test" :"tests";
CuStringAppendFormat(details, "OK (%d %s)",passCount, testWord);
}
else{
if (testSuite->failCount == 1)
CuStringAppend(details, "There was 1failure:");
else
CuStringAppendFormat(details, "There were %dfailures:", testSuite->failCount);
for (i = 0 ; i < testSuite->count ;++i){
CuTest* testCase = testSuite->list[i];
if (testCase->failed){
failCount++;
CuStringAppendFormat(details, "%d) %s:%s",
failCount, testCase->name,testCase->message);
}
}
CuStringAppend(details, "!!!FAILURES!!!");
CuStringAppendFormat(details, "Runs: %d", testSuite->count);
CuStringAppendFormat(details, "Passes: %d ",testSuite->count - testSuite->failCount);
CuStringAppendFormat(details, "Fails: %d",testSuite->failCount);
}
}
hashMap.h
#ifndef HASH_MAP_H
#define HASH_MAP_H
#define HASH_FUNCTION hashFunction1
#define MAX_TABLE_LOAD .75
typedef struct HashMap HashMap;
typedef struct HashLink HashLink;
struct HashLink
{
char* key;
int value;
HashLink* next;
};
struct HashMap
{
HashLink** table;
// Number of links in the table.
int size;
// Number of buckets in the table.
int capacity;
};
HashMap* hashMapNew(int capacity);
void hashMapDelete(HashMap* map);
int* hashMapGet(HashMap* map, const char* key);
void hashMapPut(HashMap* map, const char* key, int value);
void hashMapRemove(HashMap* map, const char* key);
int hashMapContainsKey(HashMap* map, const char* key);
int hashMapSize(HashMap* map);
int hashMapCapacity(HashMap* map);
int hashMapEmptyBuckets(HashMap* map);
float hashMapTableLoad(HashMap* map);
void hashMapPrint(HashMap* map);
#endif
?hashMap.c
#include "hashMap.h"
#include
#include
#include
#include
int hashFunction1(const char* key)
{
int r = 0;
for (int i = 0; key[i] != '0'; i++){
r += key[i];
}
return r;
}
int hashFunction2(const char* key)
{
int r = 0;
for (int i = 0; key[i] != '0'; i++){
r += (i + 1) *key[i];
}
return r;
}
/**
* Creates a new hash table link with a copy of the keystring.
* @param key Key string to copy in the link.
* @param value Value to set in the link.
* @param next Pointer to set as the link's next.
* @return Hash table link allocated on the heap.
*/
HashLink* hashLinkNew(const char* key, int value, HashLink*next)
{
HashLink* link = malloc(sizeof(HashLink));
link->key = malloc(sizeof(char) *(strlen(key) + 1));
strcpy(link->key, key);
link->value = value;
link->next = next;
return link;
}
/**
* Free the allocated memory for a hash table link created withhashLinkNew.
* @param link
*/
static void hashLinkDelete(HashLink* link)
{
free(link->key);
free(link);
}
/**
* Initializes a hash table map, allocating memory for a linkpointer table with
* the given number of buckets.
* @param map
* @param capacity The number of table buckets.
*/
void hashMapInit(HashMap* map, int capacity)
{
map->capacity = capacity;
map->size = 0;
map->table = malloc(sizeof(HashLink*) *capacity);
for (int i = 0; i < capacity; i++){
map->table[i] =NULL;
}
}
/**
* Removes all links in the map and frees all allocated memory. Youcan use
* hashLinkDelete to free the links.
* @param map
*/
void hashMapCleanUp(HashMap* map)
{
// FIXME: implement
}
/**
* Creates a hash table map, allocating memory for a link pointertable with
* the given number of buckets.
* @param capacity The number of buckets.
* @return The allocated map.
*/
HashMap* hashMapNew(int capacity)
{
HashMap* map = malloc(sizeof(HashMap));
hashMapInit(map, capacity);
return map;
}
/**
* Removes all links in the map and frees all allocated memory,including the
* map itself.
* @param map
*/
void hashMapDelete(HashMap* map)
{
hashMapCleanUp(map);
free(map);
}
/**
* Returns a pointer to the value of the link with the given key.Returns NULL
* if no link with that key is in the table.
* Use HASH_FUNCTION(key) and the map's capacity to find the indexof the
* correct linked list bucket. Also make sure to search the entirelist.
* @param map
* @param key
* @return Link value or NULL if no matching link.
*/
int* hashMapGet(HashMap* map, const char* key)
{
// FIXME: implement
return NULL;
}
/**
* Resizes the hash table to have a number of buckets equal to thegiven
* capacity. After allocating the new table, all of the links needto be
* rehashed into it because the capacity has changed.
* Remember to free the old table and any old links if you usehashMapPut to
* rehash them.
* @param map
* @param capacity The new number of buckets.
*/
void resizeTable(HashMap* map, int capacity)
{
// FIXME: implement
}
/**
* Updates the given key-value pair in the hash table. If a linkwith the given
* key already exists, this will just update the value. Otherwise,it will
* create a new link with the given key and value and add it to thetable
* bucket's linked list. You can use hashLinkNew to create thelink.
* Use HASH_FUNCTION(key) and the map's capacity to find the indexof the
* correct linked list bucket. Also make sure to search the entirelist.
* @param map
* @param key
* @param value
*/
void hashMapPut(HashMap* map, const char* key, int value)
{
// FIXME: implement
}
/**
* Removes and frees the link with the given key from the table. Ifno such link
* exists, this does nothing. Remember to search the entire linkedlist at the
* bucket. You can use hashLinkDelete to free the link.
* @param map
* @param key
*/
void hashMapRemove(HashMap* map, const char* key)
{
// FIXME: implement
}
/**
* Returns 1 if a link with the given key is in the table and 0otherwise.
* Use HASH_FUNCTION(key) and the map's capacity to find the indexof the
* correct linked list bucket. Also make sure to search the entirelist.
* @param map
* @param key
* @return 1 if the key is found, 0 otherwise.
*/
int hashMapContainsKey(HashMap* map, const char* key)
{
// FIXME: implement
return 0;
}
/**
* Returns the number of links in the table.
* @param map
* @return Number of links in the table.
*/
int hashMapSize(HashMap* map)
{
// FIXME: implement
return 0;
}
/**
* Returns the number of buckets in the table.
* @param map
* @return Number of buckets in the table.
*/
int hashMapCapacity(HashMap* map)
{
// FIXME: implement
return 0;
}
/**
* Returns the number of table buckets without any links.
* @param map
* @return Number of empty buckets.
*/
int hashMapEmptyBuckets(HashMap* map)
{
// FIXME: implement
return 0;
}
/**
* Returns the ratio of (number of links) / (number of buckets) inthe table.
* Remember that the buckets are linked lists, so this ratio tellsyou nothing
* about the number of empty buckets. Remember also that the load isa floating
* point number, so don't do integer division.
* @param map
* @return Table load.
*/
float hashMapTableLoad(HashMap* map)
{
// FIXME: implement
return 0;
}
/**
* Prints all the links in each of the buckets in the table.
* @param map
*/
void hashMapPrint(HashMap* map)
{
for (int i = 0; i < map->capacity;i++){
HashLink* link =map->table[i];
if (link != NULL){
printf("Bucket %i ->", i);
while (link != NULL){
printf(" (%s, %d) ->", link->key, link->value);
link = link->next;
}
}
}
printf("");
}
spellChecker.c
#include "hashMap.h"
#include
#include
#include
#include
#include
/**
* Allocates a string for the next word in the file and returns it.This string
* is null terminated. Returns NULL after reaching the end of thefile.
* @param file
* @return Allocated string or NULL.
*/
char* nextWord(FILE* file)
{
int maxLength = 16;
int length = 0;
char* word = malloc(sizeof(char) *maxLength);
while (1){
char c =fgetc(file);
if ((c >='0'&& c <= '9') || (c >= 'A' && c <= 'Z')|| (c >= 'a' && c <= 'z') || c == '''){
if (length + 1 >= maxLength){
maxLength *= 2;
word = realloc(word, maxLength);
}
word[length] = c;
length++;
}
else if (length > 0|| c == EOF){
break;
}
}
if (length == 0){
free(word);
return NULL;
}
word[length] = '0';
return word;
}
/**
* Loads the contents of the file into the hash map.
* @param file
* @param map
*/
void loadDictionary(FILE* file, HashMap* map)
{
// FIXME: implement
}
/**
* Prints the concordance of the given file and performanceinformation. Uses
* the file input1.txt by default or a file name specified as acommand line
* argument.
* @param argc
* @param argv
* @return
*/
int main(int argc, const char** argv)
{
// FIXME: implement
HashMap* map = hashMapNew(1000);
FILE* file = fopen("dictionary.txt", "r");
clock_t timer = clock();
loadDictionary(file, map);
timer = clock() - timer;
printf("Dictionary loaded in %f seconds",(float)timer / (float)CLOCKS_PER_SEC);
fclose(file);
char inputBuffer[256];
int quit = 0;
while (!quit){
printf("Enter a word or"quit" to quit: ");
scanf("%s",inputBuffer);
// Implement the spellchecker code here..
if (strcmp(inputBuffer,"quit") == 0){
quit = 1;
}
}
hashMapDelete(map);
return 0;
}
tests.c
#include "CuTest.h"
#include "hashMap.h"
#include
#include
#include
#include
// --- Test Helpers ---
typedef struct HistLink HistLink;
typedef struct Histogram Histogram;
struct HistLink
{
char* key;
int count;
HistLink* next;
};
struct Histogram
{
HistLink* head;
int size;
};
void histInit(Histogram* hist)
{
hist->head = NULL;
hist->size = 0;
}
void histCleanUp(Histogram* hist)
{
HistLink* link = hist->head;
while (link != NULL){
HistLink* next =link->next;
free(link);
link = next;
}
}
void histAdd(Histogram* hist, char* key)
{
HistLink* link = hist->head;
while (link != NULL){
if (strcmp(key,link->key) == 0 ){
link->count++;
return;
}
link =link->next;
}
link = malloc(sizeof(HistLink));
link->key = key;
link->count = 1;
link->next = hist->head;
hist->head = link;
hist->size++;
}
/**
* Counts the number of times each key appears in the table.
* @param hist
* @param map
*/
void histFromTable(Histogram* hist, HashMap* map)
{
histInit(hist);
for (int i = 0; i < map->capacity;i++){
HashLink* link =map->table[i];
while (link !=NULL){
histAdd(hist, link->key);
link = link->next;
}
}
}
/**
* Asserts that each key is unique (count is 1 for each key).
* @param test
* @param hist
*/
void assertHistCounts(CuTest* test, Histogram* hist)
{
HistLink* link = hist->head;
while (link != NULL){
CuAssertIntEquals(test,1, link->count);
link =link->next;
}
}
// --- Hash Map tests ---
/*
* Test cases:
* - At most one link in each bucket under threshold.
* - At most one link in each bucket over threshold.
* - Multiple links in some buckets under threshold.
* - Multiple links in some buckets over threshold.
* - Multiple links in some buckets over threshold withduplicates.
*/
/**
* Tests all hash map functions after adding and removing all of thegiven keys
* and values.
* @param test
* @param links The key-value pairs to be added and removed.
* @param notKeys Some keys not in the table to test contains andget.
* @param numLinks The number of key-value pairs to be added andremoved.
* @param numNotKeys The number of keys not in the table.
* @param numBuckets The initial number of buckets (capacity) in thetable.
*/
void testCase(CuTest* test, HashLink* links, const char** notKeys,int numLinks,
int numNotKeys, int numBuckets)
{
HashMap* map = hashMapNew(numBuckets);
Histogram hist;
// Add links
for (int i = 0; i < numLinks; i++){
hashMapPut(map,links[i].key, links[i].value);
}
// Print table
printf("After adding all key-valuepairs:");
hashMapPrint(map);
// Check size
CuAssertIntEquals(test, numLinks,hashMapSize(map));
// Check capacity
CuAssertIntEquals(test, map->capacity,hashMapCapacity(map));
// Check empty buckets
int sum = 0;
for (int i = 0; i < map->capacity;i++)
{
if (map->table[i] ==NULL){
sum++;
}
}
CuAssertIntEquals(test, sum,hashMapEmptyBuckets(map));
// Check table load
CuAssertIntEquals(test, (float)numLinks /map->capacity, hashMapTableLoad(map));
// Check contains and get on valid keys.
for (int i = 0; i < numLinks; i++){
CuAssertIntEquals(test,1, hashMapContainsKey(map, links[i].key));
int* value =hashMapGet(map, links[i].key);
CuAssertPtrNotNull(test,value);
CuAssertIntEquals(test,links[i].value, *value);
}
// Check contains and get on invalid keys.
for (int i = 0; i < numNotKeys; i++){
CuAssertIntEquals(test,0, hashMapContainsKey(map, notKeys[i]));
CuAssertPtrEquals(test,NULL, hashMapGet(map, notKeys[i]));
}
// Check that all links are present and have aunique key.
histFromTable(&hist, map);
CuAssertIntEquals(test, numLinks,hist.size);
assertHistCounts(test, &hist);
histCleanUp(&hist);
// Remove keys
for (int i = 0; i < numLinks; i++){
hashMapRemove(map,links[i].key);
}
// Print table
printf("After removing all key-valuepairs:");
hashMapPrint(map);
// Check size
CuAssertIntEquals(test, 0,hashMapSize(map));
// Check capacity
CuAssertIntEquals(test, map->capacity,hashMapCapacity(map));
// Check empty buckets
CuAssertIntEquals(test, map->capacity,hashMapEmptyBuckets(map));
// Check table load
CuAssertIntEquals(test, 0,hashMapTableLoad(map));
// Check contains and get on valid keys.
for (int i = 0; i < numLinks; i++){
CuAssertIntEquals(test,0, hashMapContainsKey(map, links[i].key));
CuAssertPtrEquals(test,NULL, hashMapGet(map, links[i].key));
}
// Check contains and get on invalid keys.
for (int i = 0; i < numNotKeys; i++){
CuAssertIntEquals(test,0, hashMapContainsKey(map, notKeys[i]));
CuAssertPtrEquals(test,NULL, hashMapGet(map, notKeys[i]));
}
// Check that there are no links in thetable.
histFromTable(&hist, map);
CuAssertIntEquals(test, 0, hist.size);
assertHistCounts(test, &hist);
histCleanUp(&hist);
hashMapDelete(map);
}
/**
* Tests hash map functions for a table with no more than onelink
* in each bucket and without hitting the table loadthreshold.
* @param test
*/
void testSingleUnder(CuTest* test)
{
printf("--- Testing single-link chains underthreshold ---");
HashLink links[] = {
{ .key = "a", .value =0, .next = NULL },
{ .key = "c", .value =1, .next = NULL },
{ .key = "d", .value =2, .next = NULL },
{ .key = "f", .value =3, .next = NULL },
{ .key = "g", .value =4, .next = NULL }
};
const char* notKeys[] = { "b", "e", "h" };
testCase(test, links, notKeys, 5, 3, 10);
}
/**
* Tests hash map functions for a table with no more than onelink
* in each bucket while hitting the table load threshold.
* @param test
*/
void testSingleOver(CuTest* test)
{
printf("--- Testing single-link chains overthreshold ---");
HashLink links[] = {
{ .key = "a", .value =0, .next = NULL },
{ .key = "c", .value =1, .next = NULL },
{ .key = "d", .value =2, .next = NULL },
{ .key = "f", .value =3, .next = NULL },
{ .key = "g", .value =4, .next = NULL }
};
const char* notKeys[] = { "b", "e", "h" };
testCase(test, links, notKeys, 5, 3, 1);
}
/**
* Tests hash map functions for a table with 2+ links in somebuckets without
* hitting the table load threshold.
* @param test
*/
void testMultipleUnder(CuTest* test)
{
printf("--- Testing multiple-link chains underthreshold ---");
HashLink links[] = {
{ .key = "ab", .value =0, .next = NULL },
{ .key = "c", .value =1, .next = NULL },
{ .key = "ba", .value =2, .next = NULL },
{ .key = "f", .value =3, .next = NULL },
{ .key = "gh", .value =4, .next = NULL }
};
const char* notKeys[] = { "b", "e", "hg"};
testCase(test, links, notKeys, 5, 3, 10);
}
/**
* Tests hash map functions for a table with 2+ links in somebuckets while
* hitting the table load threshold.
* @param test
*/
void testMultipleOver(CuTest* test)
{
printf("--- Testing multiple-link chains overthreshold ---");
HashLink links[] = {
{ .key = "ab", .value =0, .next = NULL },
{ .key = "c", .value =1, .next = NULL },
{ .key = "ba", .value =2, .next = NULL },
{ .key = "f", .value =3, .next = NULL },
{ .key = "gh", .value =4, .next = NULL }
};
const char* notKeys[] = { "b", "e", "hg"};
testCase(test, links, notKeys, 5, 3, 1);
}
/**
* Tests that values are updated when inserting with a key alreadyin the table.
* Also tests that keys remain unique after insertion (no duplicatelinks).
* @param test
*/
void testValueUpdate(CuTest* test)
{
int numLinks = 5;
printf("--- Testing value updates---");
HashLink links[] = {
{ .key = "ab", .value =0, .next = NULL },
{ .key = "c", .value =1, .next = NULL },
{ .key = "ba", .value =2, .next = NULL },
{ .key = "ab", .value =3, .next = NULL },
{ .key = "gh", .value =4, .next = NULL }
};
HashMap* map = hashMapNew(1);
// Add links
for (int i = 0; i < numLinks; i++){
hashMapPut(map,links[i].key, links[i].value);
}
// Print table
printf("After adding all key-valuepairs:");
hashMapPrint(map);
int* value = hashMapGet(map, "ab");
CuAssertPtrNotNull(test, value);
CuAssertIntEquals(test, 3, *value);
Histogram hist;
histFromTable(&hist, map);
CuAssertIntEquals(test, numLinks - 1,hist.size);
assertHistCounts(test, &hist);
histCleanUp(&hist);
hashMapDelete(map);
}
// --- Test Suite ---
void addAllTests(CuSuite* suite)
{
SUITE_ADD_TEST(suite, testSingleUnder);
SUITE_ADD_TEST(suite, testSingleOver);
SUITE_ADD_TEST(suite, testMultipleUnder);
SUITE_ADD_TEST(suite, testMultipleOver);
SUITE_ADD_TEST(suite, testValueUpdate);
}
int main()
{
CuSuite* suite = CuSuiteNew();
addAllTests(suite);
CuSuiteRun(suite);
CuString* output = CuStringNew();
CuSuiteSummary(suite, output);
CuSuiteDetails(suite, output);
printf("%s", output->buffer);
CuStringDelete(output);
CuSuiteDelete(suite);
return 0;
}
?main.c
#include "hashMap.h"
#include
#include
#include
#include
#include
/**
* Allocates a string for the next word in the file and returns it.This string
* is null terminated. Returns NULL after reaching the end of thefile.
* @param file
* @return Allocated string or NULL.
*/
char* nextWord(FILE* file)
{
int maxLength = 16;
int length = 0;
char* word = malloc(sizeof(char) *maxLength);
while (1){
char c =fgetc(file);
if ((c >='0'&& c <= '9') || (c >= 'A' && c <= 'Z')|| (c >= 'a' && c <= 'z') || c == '''){
if (length + 1 >= maxLength){
maxLength *= 2;
word = realloc(word, maxLength);
}
word[length] = c;
length++;
}
else if (length > 0|| c == EOF){
break;
}
}
if (length == 0){
free(word);
return NULL;
}
word[length] = '0';
return word;
}
/**
* Prints the concordance of the given file and performanceinformation. Uses
* the file input1.txt by default or a file name specified as acommand line
* argument.
* @param argc
* @param argv
* @return
*/
int main(int argc, const char** argv)
{
// FIXME: implement
const char* fileName = "input1.txt";
if (argc > 1){
fileName =argv[1];
}
printf("Opening file: %s", fileName);
clock_t timer = clock();
HashMap* map = hashMapNew(10);
// --- Concordance code begins here ---
// Be sure to free the word after you are donewith it here.
// --- Concordance code ends here ---
hashMapPrint(map);
timer = clock() - timer;
printf("Ran in %f seconds", (float)timer /(float)CLOCKS_PER_SEC);
printf("Empty buckets: %d",hashMapEmptyBuckets(map));
printf("Number of links: %d",hashMapSize(map));
printf("Number of buckets: %d",hashMapCapacity(map));
printf("Table load: %f",hashMapTableLoad(map));
hashMapDelete(map);
return 0;
}
input1.txt
As the wind does blow
Across the trees I see the
Buds blooming in May
I walk across sand
And find myself blistering
In the hot hot heat
Falling to the ground
I watch a leaf settle down
In a bed of brown.
It's cold and I wait
For someone to shelter me
And take me from here.
input2.txt
Call me Ishmael Some years ago never mind how long preciselyhaving little or no money in my purse and nothing particular tointerest me on shore I thought I would sail about a little and seethe watery part of the world It is a way I have of driving off thespleen and regulating the circulation Whenever I find myselfgrowing grim about the mouth whenever it is a damp drizzly Novemberin my soul whenever I find myself involuntarily pausing beforecoffin warehouses and bringing up the rear of every funeral I meetand especially whenever my hypos get such an upper hand of me thatit requires a strong moral principle to prevent me fromdeliberately stepping into the street and methodically knockingpeoples hats off then I account it high time to get to sea as soonas I can This is my substitute for pistol and ball With aphilosophical flourish Cato throws himself upon his sword I quietlytake to the ship There is nothing surprising in this If they butknew it almost all men in their degree some time or other cherishvery nearly the same feelings towards the ocean with me
There now is your insular city of the Manhattoes belted round bywharves as Indian isles by coral reefs commerce surrounds it withher surf Right and left the streets take you waterward Its extremedowntown is the battery where that noble mole is washed by wavesand cooled by breezes which a few hours previous were out of sightof land Look at the crowds of water gazers there
input3.txt
eat ate tea
Step by Step Solution
3.43 Rating (150 Votes )
There are 3 Steps involved in it
Step: 1
Solutions Step 1 Programs made up of procedures or routines are called programs and C is a procedural program language adhering to this paradigm Explanation C offers lowlevel memory access manipulatin...See step-by-step solutions with expert insights and AI powered tools for academic success
Step: 2
Step: 3
Ace Your Homework with AI
Get the answers you need in no time with our AI-driven, step-by-step assistance
Get Started