Question: You must develop the components of an autocomplete application. To keep things simple, we will assume that our autocomplete application will work in the context

You must develop the components of an autocomplete application. To keep things simple, we will assume that our autocomplete application will work in the context of something like a search engine. Autocomplete will take in a string (well call it a prefix) and will output a list of likely completions (well call each a query). So, a user could type a portion of a query and autocomplete would offer a list of predictions of the completed query that the user intends to complete. The strings can contain any Unicode character except newline.

Our autocomplete application can only work in one context at a time. For example, if we want to let users search for movie titles, we would have to load needed movie title data first. Once loaded, autocomplete can predict possible full movie titles based on a given prefix. Likewise, if we want to provide word completion similar to messaging apps, our autocomplete application would have to load needed word data first. Once loaded, autocomplete can predict possible whole words based on a given prefix.

The data that the autocomplete application needs will be a set of (query, weight) pairs that represent all possible completions. Each query will be a complete search query (like a complete movie title) that a user might want to search for. Each weight will be a non-negative integer that is a distinguishing attribute of the query that will be used for the purposes of ranking queries: The larger the weight, the more likely the query. For example, in a word-based autocomplete for a messaging app the individual queries would be words, and weights would be the frequency of occurrence; thus making more frequently used words the more likely predictions of autocomplete.

the 23135851162 of 13151942776 and 12997637966 to 12136980858 a 9081174698 in 8469404971 for 5933321709 is 4705743816 on 3750423199 that 3400031103

Each line consists of a word (query) and an integer (weight). The integer records the total number of occurrences of the associated word in some large corpus of English text. Note the data is arranged in descending order of weight. This data shows that the is the most frequently occurring English word, appearing over 23 billion times in the corpus from which the data was drawn. So, if t is the prefix, the autocomplete should return the, to, and that - in that order - as the first three predicted completions.

We will develop our autocomplete application in terms of three classes: Term, BinarySearch, and Autocomplete.

The Term class

Term is used to represent a (query, weight) pair. You must write this class so that it is completely consistent with the following API.

public class Term implements Comparable { /** * Initialize a term with the given query and weight. * This method throws a NullPointerException if query is null, * and an IllegalArgumentException if weight is negative. */ public Term(String query, long weight) { } /** * Compares the two terms in descending order of weight. */ public static Comparator byDescendingWeightOrder() { } /** * Compares the two terms in ascending lexicographic order of query, * but using only the first length characters of query. This method * throws an IllegalArgumentException if length is less than or equal * to zero. */ public static Comparator byPrefixOrder(int length) { } /** * Compares this term with the other term in ascending lexicographic order * of query. */ @Override public int compareTo(Term other) { } /** * Returns a string representation of this term in the following format: * query followed by a tab followed by weight */ @Override public String toString(){ } }

Note that Term supports comparison by three different orders: (1) lexicographic order (the natural order), descending order of weight (one alternate ordering), and lexicographic order using only the first length characters of the query string (a family of alternate orderings). By providing three different comparison methods, we are able to sort terms in three different orders. All three orders will be important in delivering the autocomplete functionality.

The BinarySearch class

BinarySearch provides two search methods, both of which are variants of the classic binary search presented in lecture. When binary searching a sorted array that contains more than one key equal to the search key, the client may want to know the index of either the first or last matching key.

public class BinarySearch { /** * Returns the index of the first key in a[] that equals the search key, * or -1 if no such key exists. This method throws a NullPointerException * if any parameter is null. */ public static  int firstIndexOf(Key[] a, Key key, Comparator comparator) { } /** * Returns the index of the last key in a[] that equals the search key, * or -1 if no such key exists. This method throws a NullPointerException * if any parameter is null. */ public static  int lastIndexOf(Key[] a, Key key, Comparator comparator) { } }

Both methods must use the binary search algorithm and make on the order of log N comparisons to a[middle] where N is the number of elements in a[].

The Autocomplete class

Autocomplete uses the Term and BinarySearch class to provide complete autocomplete functionality for a given set of strings and weights.

public class Autocomplete { /** * Initializes a data structure from the given array of terms. * This method throws a NullPointerException if terms is null. */ public Autocomplete(Term[] terms) { } /** * Returns all terms that start with the given prefix, in descending order of weight. * This method throws a NullPointerException if prefix is null. */ public Term[] allMatches(String prefix) { } }

The constructor must store the contents of terms in its own internal array and then sort this array in natural order (lexicographic order of query). The allMatches method must use the binary search methods to identify the range of methods that begin with the given prefix and return these elements in an array sorted in descending order of weight. This returned array represents the predicted completions for the given prefix.

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer

Step: 1 Unlock blur-text-image

Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock

Step: 3 Unlock

Students Have Also Explored These Related Databases Questions!

Read the case study Wildfire Entertainment: Organizational structure archetypes on the following pages and answer the questions below: In the context of rapid growth, how can a tech-based start-up...

List three specific parts of the Case Guide, Objectives and Strategy Section (See below) that you had the most difficulty understanding. Describe your current understanding of these parts. Provide...

The Martial Arts Academy (MAA)-School Management System UNIVERSITY OF NORTH CAROLINA WILMINGTON MIS 419 - IT PROJECT MANAGEMENT Assignment 1: Business Case A business case is the first deliverable...

Read the case study Wildfire Entertainment: Organizational structure archetypes on the following pages and answer the questions below: Describe the industry context Wildfire faced in the early days...

Read the case study Wildfire Entertainment: Organizational structure archetypes on the following pages and answer the questions below: Wildfire used three main organizational archetypes: functional,...

Read the case study Wildfire Entertainment: Organizational structure archetypes on the following pages and answer the questions below: Under what conditions is an organization likely to prefer (a) a...

Wildfire Entertainment: Organizational Structure Archetypes This case was written by Charles Galunic, Professor of Organisational Behaviour and the AVIVA Chair in Leadership & Responsibility, and...

Strategy Report -one thousand four hundred words one thousand four hundred words report as follows: 1. Using appropriate tools, critically evaluate the operating environment of your chosen case study...

To find the present value of an uneven series of cash flows, you must find the PVs of the individual cash flows and then sum them. Annuity procedures can never be of use, even some of the cash flows...

What are the advantages and disadvantages of Best Buys customer centricity strategy?

A private, nonprofit college or university must present its expenses by both natural classification and function in its: Select one: a . Statement of Activities b . Any of these c . Statement of...

How would I timestamp the output data in Arduino software (using an Uno with 1.8.13 version) so the serial monitor would display for each data value an associated time value in seconds? I know about...