Answered step by step
Verified Expert Solution
Link Copied!

Question

1 Approved Answer

Please read the instructions...I only need help with the emboldened methods in the code i displayed The web crawler will start at some web page,

Please read the instructions...I only need help with the emboldened methods in the code i displayed

The web crawler will start at some web page, follow a random link on that page, then follow a random link on that new page, and so on, keeping a list of the pages that it has visited. Your part of this project will be to complete the linked list data structure that stores the web addresses that the crawler visits. Note that this project deals with a singly-linked list (unlike the doubly-linked list we discussed in class). Singlylinked lists are simpler, because each node just points to the next node (i.e., no previous pointers). The last node in the list has a NULL next pointer to indicate the end. (FYI, the disadvantage with singly-linked lists is that you cant traverse them backwards, while doubly-linked lists let you go forward and backward.)

The file crawler.c contains some code already written. You should only modify that file in the places that are marked "TODO: complete this". There are comments to help guide you through the code. Note that there are four functions that operate on the linked list that you need to complete: contains, insertBack, printAddresses, and destroyList. Each of these functions must be recursive. The comments indicate how those functions should behave. Only "printAddresses" should print any output to the terminal.

#include  #include  #include  #define MAX_ADDR_LENGTH 1000 /* * a node in our linked-list of web addresses */ struct listNode{ char addr[MAX_ADDR_LENGTH]; struct listNode *next; }; /* * returns 1 if the list starting at pNode contains the address addr, * and returns 0 otherwise */ int contains(const struct listNode *pNode, const char *addr); /* * inserts the address addr as a new listNode at the end of * the list */ void insertBack(struct listNode *pNode, const char *addr); /* * prints the addresses from pNode to the end of the list, * one on each line */ void printAddresses(const struct listNode *pNode); /* * frees the memory associated with this node and all subsequent nodes */ void destroyList(struct listNode *pNode); /* * srcAddr should be a web address (e.g., http://www.yahoo.com). * link should point to an array of maxLinkLength characters. * getLink returns 1 if a link was found and 0 otherwise. * If a link was found, "link" will be filled in with the web address. */ int getLink(const char* srcAddr, char* link, const int maxLinkLength); int main(int argc, char** argv){ long seed; char startAddr[MAX_ADDR_LENGTH]; char destAddr[MAX_ADDR_LENGTH]; int hopNum, numHops; struct listNode *pListStart; if(argc < 3 || argc > 4){ fprintf(stderr, "USAGE: %s startAddr numHops [rand seed]", argv[0]); return -1; } /* initialization */ if(argc >= 4){ seed = atol(argv[3]); } else{ seed = time(NULL); } printf("seed = %ld ", seed); srand(seed); strncpy(startAddr, argv[1], MAX_ADDR_LENGTH); startAddr[MAX_ADDR_LENGTH - 1] = '\0'; numHops = atoi(argv[2]); pListStart = malloc(sizeof(struct listNode)); if(pListStart == NULL){ fprintf(stderr, "ERROR: could not allocate memory "); return -2; } strncpy(pListStart->addr, startAddr, MAX_ADDR_LENGTH); pListStart->next = NULL; /* start the crawling */ for(hopNum=1; hopNum <= numHops; hopNum++){ int res = getLink(startAddr, destAddr, MAX_ADDR_LENGTH); if(!res){ printf("Dead end on hop %d: no outgoing links ", hopNum); break; } if(contains(pListStart, destAddr)){ printf("Cycle detected on hop %d: address %s ", hopNum, destAddr); hopNum--; // try again for this hop in the next iteration } else{ insertBack(pListStart, destAddr); strncpy(startAddr, destAddr, MAX_ADDR_LENGTH); } } printAddresses(pListStart); destroyList(pListStart); return 0; } /* * returns 1 if the list starting at pNode contains the address addr, * and returns 0 otherwise */ int contains(const struct listNode *pNode, const char *addr){ TODO: complete this } /* * inserts the address addr as a new listNode at the end of * the list */ void insertBack(struct listNode *pNode, const char *addr){ TODO: complete this } /* * prints the addresses from pNode to the end of the list, * one on each line */ void printAddresses(const struct listNode *pNode){ TODO: complete this } /* * frees the memory associated with this node and all subsequent nodes */ void destroyList(struct listNode *pNode){ TODO: complete this }  int getLink(const char* srcAddr, char* link, const int maxLinkLength){ const int bufSize = 1000; char buffer[bufSize]; int numLinks = 0; FILE *pipe; snprintf(buffer, bufSize, "curl -s \"%s\" | python getLinks.py", srcAddr); pipe = popen(buffer, "r"); if(pipe == NULL){ fprintf(stderr, "ERROR: could not open the pipe for command %s ", buffer); return 0; } fscanf(pipe, "%d ", &numLinks); if(numLinks > 0){ int linkNum; double r = (double)rand() / ((double)RAND_MAX + 1.0); for(linkNum=0; linkNum 0){ return 1; } else{ return 0; } } 

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image

Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image

Step: 3

blur-text-image

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

More Books

Students also viewed these Databases questions

Question

How many Tables Will Base HCMSs typically have? Why?

Answered: 1 week ago

Question

What is the process of normalization?

Answered: 1 week ago