Question

1 Approved Answer

Posted on Sep 26, 2024

elements in either collection. If we build a collection for each team in question that contains the name or identifier of the opponent that was

elements in either collection. If we build a collection for each team in question that contains the name or identifier of the opponent that was faced in each game the team played, we end up with two separate collections of opponents. The goal is to generate a third collection that contains only the common opponents faced by both teams. For example, consider two teams from the AFC North Division from the 2011 season, the Pittsburgh Steelers and the Baltimore Ravens. Both teams finished the regular season with won-lost-tied records of 12-4-0. As it turns out, the Ravens won the tiebreaker not on the rule were describing in this assignment, but on the rule that the Ravens won both head-to-head matchups during the season. But lets assume the tiebreaker had to be resolved by the won-lost-tied percentage in common games. Table 1 shows the entire regular season schedules for both teams, and indicates common opponents in bold. Although in this example, most of the games are common;i.e., both teams faced common opponents, this is not a requisite condition for the problem. The result returned by an algorithm solving this problem should return a third collection containing only those teams which both the Steelers and the Ravens faced during the season. The order in which these common teams are present in this collection is not important, and duplicates should be included. Table 2 shows one possible correct result. Game Pittsburgh Steelers Baltimore Ravens 1 Baltimore Ravens Pittsburgh Steelers 2 Seattle Seahawks Tennessee Titans 3 Indianapolis Colts St. Louis Rams 4 Houston Texans New York Jets 5 Tennessee Titans bye week 6 Jacksonville Jaguars Houston Texans 7 Arizona Cardinals Jacksonville Jaguars 8 New England Patriots Arizona Cardinals 9 Baltimore Ravens Pittsburgh Steelers 10 Cincinnati Bengals Seattle Seahawks 11 bye week Cincinnati Bengals 12 Kansas City Chiefs San Francisco 49ers 13 Cincinnati Bengals Cleveland Browns 14 Cleveland Browns Indianapolis Colts 15 San Francisco 49ers San Diego Chargers 16 St. Louis Rams Cleveland Browns 17 Cleveland Browns Cincinnati Bengals Table 1. Regular season opponents of the Pittsburgh Steelers and Baltimore Ravens, listed in the order in which the games were played. Note that each team had a bye week, during which they did not play. Common opponents are indicated in bold. Seattle Seahawks Indianapolis Colts Houston Texans Tennessee Titans Jacksonville Jaguars Arizona Cardinals Cincinnati Bengals Cincinnati Bengals Cleveland Browns San Francisco 49ers St. Louis Rams Cleveland Browns Table 2. One possible correct result showing all common elements (duplicates included) between the two collections in Table 1. A Quadratic Solution This problem can be solved in a number of ways. The most straightforward approach to this problem would be to set up a nested loop structure, where one collection is chosen as the query collection. We will call this collection, Query_Collection. Query_Collection is traversed, and for each element in Query_ Collection, we traverse all the remaining collections. Each of the remaining collections, represented by the variable, current_collection, is traversed to see whether or not the current element from Query_Collection is present. Below is this algorithm, represented as pseudocode. for each element X in Query_Collection { for each other collection { for each element Y in current_collection { if X.compareTo(Y) == 0 { X is common between Query_Collection and current_collection break } } if X not in current_collection { X not common to all collections } } if X exists in all collections { add X to common_collection } } This algorithm bears some explanation. First, let us assume all the collections have an equal number of elements. The outermost loop, therefore, will iterate N times. The middle loop traverse the set of collections, not the elements of a collection. Since there are k collections, and the query collection is not compared to itself, the middle loop will iterate k minus 1 times. The innermost loop is similar to the outermost loop, and will iterate N times in the worst case. The total number of loop iterations is therefore given by N*(k - 1)*N, or the (k - 1) N2. The starting point for each loop is independent of the current positions of the traversals of the other loops, so no counting formulas apply here. The running time for this algorithm is thus quadratic. For 2 collections, both of length N, the worst case running time would be exactly N2. The number of comparisons increases linearly with each additional collection. It should be apparent that each additional collection effectively adds 1 to the coefcient of the formula above, assuming all collections are of the same length. For example, for 3 collections, all of length N, the maximum number of comparisons would be N*2N, or 2N2; for 4 collections, all of length N, the maximum number of comparisons would be 3N2; and so on. Note that the coefcient of the quadratic term is always equal to the number of collections minus 1. Note that there must be at least 2 collections in order to perform any comparisons. If there is only a single collection, then by default the collection of common elements is simply the collection itself. A more general formula, allowing for an arbitrary number of collections of differing lengths, is: Formula 1: For k collections C1, C2, C3, ..., Ck, of lengths N1, N2, N3, ..., Nk, respectively, where one collection of which is chosen as the query collection, Cq, of length Nq, the worst case number of comparisons for finding all elements common to all k collections is proportional to: Nq * ([N1, N2, N3, ..., Nk] - Nq). In Formula 1, the length, Nq, is the length of the collection chosen to be the query collection, which corresponds to the collection labeled C1 in the pseudocode. In the formula, the size of the query collection is subtracted from the sum of the sizes of all the collections, since the elements in the query collection are not compared against themselves. If all the collections have the same length, the general formula becomes: Formula 2: For k collections C1, C2, C3, ..., Ck, all of which have length = N, the maximum number of comparisons for finding all elements common to all k collections is proportional to: N * (kN - N)), or O((k - 1)N2. One thing that should be apparent is that the size of the query collection imposes an implicit upper bound on the number of elements common to all the collections. That is, the number of common elements must be less than or equal to the size of the query collection. Although any one of the collections can be designated as the query collection, the number of comparisons will be minimized if the smallest collection is used as the query collection. This is stated more generally as: For any k collections, where Nq represents the size of the smallest collection, the number of elements common to all collections will be less than or equal to Nq. C1 C2 Baltimore Ravens Pittsburgh Steelers Seattle Seahawks Tennessee Titans Indianapolis Colts St. Louis Rams Houston Texans New York Jets Tennessee Titans bye week Jacksonville Jaguars Houston Texans Arizona Cardinals Jacksonville Jaguars New England Patriots Arizona Cardinals Baltimore Ravens Pittsburgh Steelers Cincinnati Bengals Seattle Seahawks bye week Cincinnati Bengals Kansas City Chiefs San Francisco 49ers Cincinnati Bengals Cleveland Browns Cleveland Browns Indianapolis Colts San Francisco 49ers San Diego Chargers St. Louis Rams Cleveland Browns Cleveland Browns Cincinnati Bengals Table 3. Regular season opponents of the Steelers and Ravens, with the bye weeks eliminated, generically referred to as C1 and C2, respectively. What You Need to Do: If we were only interested in two relatively small collections, as shown in Table 3, the quadratic algorithm described above is adequate. However, if N is very large, and/or there are many collections (i.e., k is very large), and/or the complexity of the comparison operation is high (i.e., the data are more complex than integers), the quadratic algorithm may be too slow to be useful. There are a number of strategies that can be employed to reduce the complexity below quadratic, and at this point we have covered enough topics to design a more efficient algorithm. Based on the material covered thus far in this course, your goal is to design and implement a more efficient algorithm for finding the common elements of a set of collections. Ideally, the goal is to achieve an algorithm that will only need to perform at most on the order of (k - 1)N comparisons. This can only be achieved if each element in the non-query collections only participates in at most 1 comparison (with possibly a few exceptions). Your algorithm should satisfy the following criteria: 1. It should be able to accept as input 0 to k collections, stored as simple arrays. Were restricting the data structure to arrays since we havent covered higher order data structures yet. 2. The elements of the collections should all be of type Comparable, and they should all be derived from the same base class (not counting the Object class). Implementation of the Comparable interface is necessary since the elements must be compared to each other in order to determine commonality. They must all be derived from the same base class since comparisons between different data types is undefined. 3. Duplicate elements should be allowed; e.g., if there are M instances of the value, XYZ, in all the input collections, there should be M instances of the value, XYZ, in the collection of common elements. For example, suppose you have the following collections: banana quince plum apple raspberry pomegranate pear banana lime banana lemon banana pomegranate apple jujube pineapple banana blueberry cherry cherry apple jujube blueberry cherry cherry jujube grape orange mango banana The collection of common elements would be (order doesnt matter): banana apple banana cherry jujube 4. The collections should be allowed to be of varying lengths. 5. Your algorithm should designate one of the collections as the query collection, which is the collection that will be compared against the other collections. 7. Your algorithm may manipulate the input collections as needed to accomplish its purpose. 8. The total number of element comparisons performed should be less than the value for the quadratic solution described above. That is, the total number of comparisons in the worst case should be less than (k - 1)N2. Do not be concerned about average performance or best case performance. Also, the total number of comparisons is defined, for this assignment, to be only those comparisons that are performed once the traversal of the query collection begins, and the other collections are checked for the presence of the elements in the query collection. Any comparisons performed to manipulate the data prior to searching for the common elements should be ignored. The framework for your algorithm should satisfy the following criteria, for ease in testing: 1. Create a class called CommonElements, to contain your algorithm and associated methods and attributes. 2. In your CommonElements class, encapsulate your algorithm within a method called findCommonElements, that has the following signature: public Comparable[] findCommonElements(Comparable[][] collections). The argument to this method, collections, will be the set of k collections discussed earlier The type of the argument is an array of Comparable arrays. Note that in Java, a 2D array will support arrays of varying sizes provided it is initialized without first specifying the two dimensions. For example, the following statement: Comparable[][] collections = {{A}, {A, B}, {A, B, C}}; results in an array of 3 Comparable arrays of varying sizes. The following syntax also works: Comparable[] col_1 = {A}; Comparable[] col_2 = {A, B}; Comparable[] col_3 = {A, B, C}; Comparable[][] collections = {col_1, col_2, col_3}; 3. The value returned by your findCommonElements method should be a collection of Comparable elements that contains only the elements common to all the input collections. 4. Since you are being asked to evaluate your algorithm based on the number of comparisons performed, you will need to have your findCommonElements method maintain a running total of comparisons performed for each set of collections tested. You should create an attribute called comparisons in your CommonElements class to store the number of comparisons, and provide a getter method called getComparisons() to return this value. In order to keep a running total of comparisons, you will need to instrument your code by incrementing the comparisons attribute each time a comparison between two elements is made. Since element comparisons are typically performed in if statements, you may need to increment comparisons immediately before each comparison is actually performed. Although that may sound counter-intuitive, if you try to increment comparisons inside the if statement, after the element comparison has been made, you will miss all the comparisons that cause the condition inside the if statement to evaluate to false. It is important that you adhere to the framework specification above. To facilitate testing of your program, I will use a test harness that will do the following: 1. Creates an instance of your CommonElements class. 2. Calls your findCommonElements method with a set of test collections as input. 3. Verifies that the collection that is returned by findCommonElements correctly contains the elements common to all the input collections. 4. Retrieves the number of comparisons that were performed, via your getComparisons() method. 5. Compares the number of comparisons performed to the target value stated in criterion #5 above for the algorithm. Thus, it is essential that you name your class and methods as described above, or my test harness will not work, and it will take longer to test your program. If your code doesnt meet the above specifications, I will notify you to change your implementation so that it meets the specifications I have described above. What you need to submit: 1. Your CommonElements.java file, implemented as specified above. 2. The set of test collections you used to test your algorithm.