Question: You must develop the components of an autocomplete application. To keep things simple, we will assume that our autocomplete application will work in the context
You must develop the components of an autocomplete application. To keep things simple, we will assume that our autocomplete application will work in the context of something like a search engine. Autocomplete will take in a string (well call it a prefix) and will output a list of likely completions (well call each a query). So, a user could type a portion of a query and autocomplete would offer a list of predictions of the completed query that the user intends to complete. The strings can contain any Unicode character except newline.
Our autocomplete application can only work in one context at a time. For example, if we want to let users search for movie titles, we would have to load needed movie title data first. Once loaded, autocomplete can predict possible full movie titles based on a given prefix. Likewise, if we want to provide word completion similar to messaging apps, our autocomplete application would have to load needed word data first. Once loaded, autocomplete can predict possible whole words based on a given prefix.
The data that the autocomplete application needs will be a set of (query, weight) pairs that represent all possible completions. Each query will be a complete search query (like a complete movie title) that a user might want to search for. Each weight will be a non-negative integer that is a distinguishing attribute of the query that will be used for the purposes of ranking queries: The larger the weight, the more likely the query. For example, in a word-based autocomplete for a messaging app the individual queries would be words, and weights would be the frequency of occurrence; thus making more frequently used words the more likely predictions of autocomplete.
the 23135851162 of 13151942776 and 12997637966 to 12136980858 a 9081174698 in 8469404971 for 5933321709 is 4705743816 on 3750423199 that 3400031103
Each line consists of a word (query) and an integer (weight). The integer records the total number of occurrences of the associated word in some large corpus of English text. Note the data is arranged in descending order of weight. This data shows that the is the most frequently occurring English word, appearing over 23 billion times in the corpus from which the data was drawn. So, if t is the prefix, the autocomplete should return the, to, and that - in that order - as the first three predicted completions.
We will develop our autocomplete application in terms of three classes: Term, BinarySearch, and Autocomplete.
The Term class
Term is used to represent a (query, weight) pair. You must write this class so that it is completely consistent with the following API.
public class Term implements Comparable{ /** * Initialize a term with the given query and weight. * This method throws a NullPointerException if query is null, * and an IllegalArgumentException if weight is negative. */ public Term(String query, long weight) { } /** * Compares the two terms in descending order of weight. */ public static Comparator byDescendingWeightOrder() { } /** * Compares the two terms in ascending lexicographic order of query, * but using only the first length characters of query. This method * throws an IllegalArgumentException if length is less than or equal * to zero. */ public static Comparator byPrefixOrder(int length) { } /** * Compares this term with the other term in ascending lexicographic order * of query. */ @Override public int compareTo(Term other) { } /** * Returns a string representation of this term in the following format: * query followed by a tab followed by weight */ @Override public String toString(){ } }
Note that Term supports comparison by three different orders: (1) lexicographic order (the natural order), descending order of weight (one alternate ordering), and lexicographic order using only the first length characters of the query string (a family of alternate orderings). By providing three different comparison methods, we are able to sort terms in three different orders. All three orders will be important in delivering the autocomplete functionality.
The BinarySearch class
BinarySearch provides two search methods, both of which are variants of the classic binary search presented in lecture. When binary searching a sorted array that contains more than one key equal to the search key, the client may want to know the index of either the first or last matching key.
public class BinarySearch { /** * Returns the index of the first key in a[] that equals the search key, * or -1 if no such key exists. This method throws a NullPointerException * if any parameter is null. */ public static int firstIndexOf(Key[] a, Key key, Comparator comparator) { } /** * Returns the index of the last key in a[] that equals the search key, * or -1 if no such key exists. This method throws a NullPointerException * if any parameter is null. */ public static int lastIndexOf(Key[] a, Key key, Comparator comparator) { } } Both methods must use the binary search algorithm and make on the order of log N comparisons to a[middle] where N is the number of elements in a[].
The Autocomplete class
Autocomplete uses the Term and BinarySearch class to provide complete autocomplete functionality for a given set of strings and weights.
public class Autocomplete { /** * Initializes a data structure from the given array of terms. * This method throws a NullPointerException if terms is null. */ public Autocomplete(Term[] terms) { } /** * Returns all terms that start with the given prefix, in descending order of weight. * This method throws a NullPointerException if prefix is null. */ public Term[] allMatches(String prefix) { } } The constructor must store the contents of terms in its own internal array and then sort this array in natural order (lexicographic order of query). The allMatches method must use the binary search methods to identify the range of methods that begin with the given prefix and return these elements in an array sorted in descending order of weight. This returned array represents the predicted completions for the given prefix.
Step by Step Solution
There are 3 Steps involved in it
Get step-by-step solutions from verified subject matter experts
