[Solved] This project will use actual Twitter data

Answered step by step

Verified Expert Solution

Link Copied!

Question

1 Approved Answer

Posted on Sep 09, 2024

This project will use actual Twitter data collected between July 1st and 7th, 2012. The data is available from the course website, called twitter_data.txt. It

This project will use actual Twitter data collected between July 1st and 7th, 2012. The data is

available from the course website, called twitter_data.txt. It contains information about who is

following whom. This information has been anonymized by replacing usernames with id

numbers (from 0 to 456630). The format of the file is int int, where the first int is the ID

of the user doing the following and the second int is the ID of the user being followed.

Your task is to create a TwitterUser class to represent the users in this dataset. The class should

meet the following requirements:

Store the users ID and the list of user IDs that user is following

Implement Comparable to sort TwitterUser objects based on their ID

Implement Cloneable to make a DEEP copy of the object

Write a recursive getNeighborhood method. The method should take an ID and a maximum

depth as arguments and return a List of the users that Twitter user is following, and who

those users are following, and so on, up to the requested depth

Include any other methods to the TwitterUser class that are necessary to meet the

project requirements

After you have developed the TwitterUser class, create the following:

1. Create an interface, called, TwitterUserManager and define the following methods on

the interface

a. int loadTwitterData()

i. Populates a data container with a TwitterUser instance for each twitter

user read from the twitter_data.txt file. Each TwitterUser instance

should contain a list of TwitterUsers that user is following

ii. Return the number of unique twitter users read from the twitter_data.txt file.

b. List getNeighborhood(TwitterUser user, int maximumDepth)

i. Returns a list of Twitter users that the supplied user is following, and

who those users are following, and so on, up to the supplied depth.

1. Accepts the twitter user to get the neighborhood from

2. Accepts the maximum level to traverse into the Twitter user

hierarchy. Supply zero to not enforce a maximum depth,

returning the entire hierarchy of users starting from the supplied

user.

ii. Returns the list of Twitter users. The list should not contain duplicate

users.

2. Create a class called, TwitterUserManagerImpl, and make the class implement the

TwitterUserManager interface. Implement the loadTwitterData() and

getNeighborhood() methods as described on the interface.

Create the appropriate number of unit tests to achieve a minimum of 90% coverage for the

TwitterUser and TwitterUserManagerImpl classes. Your tests should include the following:

Test the loading of the Twitter data from the twitter_data.txt and ensure the number of

users returned from your loadTwitterData() method is 456,631.

Test the getNeighborhood method with a supplied maximum depth and ensure there

are no duplicate users returned in the list.

Test the getNeighborhood method with a maximum depth of zero to test the case

where there is no maximum depth.

Test the clone method on your TwitterUser class and ensure a deep clone is acheived.

You should ensure that the cloned users Follow list is a different list than the Follow list

of the user that was cloned

Test the Comparables compareTo method on the TwitterUser class and ensure that the

class is sorting by the ID of the twitter user

Hints:

1. The twitter_data.txt file is large, so you will need to be patient. It takes approximately

60 seconds to read in the data file using a Scanner object and an average laptop.

Alternatively, you could use the Files.lines method, which is new to Java 8. It is much

more efficient and easier to use. It returns a Stream containing Strings, which each

String in the stream is a line read from the file. You can use a lambda expression (also

new to Java 8) to loop through the stream to process each line read from the file.

// uses try-with-resources syntax to ensure the stream is closed at the end

try (Stream stream = Files.lines(Paths.get(PATH + FILE_NAME))) {

stream.forEach(line -> { // <- lambda expresion

// will contain each line read from the file

});

} catch (IOException exception) {

exception.printStackTrace();

}

2. You could use the StringTokenizer class to extract the two user IDs read on each line.

// Assuming the data looks like this: 0 5

// tokenizes the string using a space character to separate each token in the

// string

StringTokenizer tokenizer = new StringTokenizer(line, " ");

// hasMoreTokens returns true if there are tokens left to be read

if (tokenizer.hasMoreTokens()) {

// would grab the 0 from the input line, which is the follower ID,

// and converts it to an int

int followerId = Integer.parseInt(tokenizer.nextToken().trim());

}

if (tokenizer.hasMoreElements()) {

// would grab the 5 from the input line, which is the user being followed,

// and converts it to an int

int following = Integer.parseInt(tokenizer.nextToken().trim());

}

Step by Step Solution

There are 3 Steps involved in it

Step: 1

Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

Step: 3

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

What Is A Database And How Do I Use It

Authors: Matt Anniss

1st Edition

★★★★★

Exposure to market forces has focused attention on the cost of human resources as a high proportion of operating costs.

Answered: 1 week ago

Previous Question Next Question