Answered step by step
Verified Expert Solution
Link Copied!

Question

1 Approved Answer

The purpose of these exercises is to get you started with network analysis and the SNAP software. For this homework, you need to install and

The purpose of these exercises is to get you started with network analysis and the SNAP software. For this homework, you need to install and try out the SNAP network analysis tools. We strongly encourage you to use Snap.py for Python (available from http://snap.stanford.edu/snappy/). However, you can also use SNAP for C++ (available from http://snap.stanford.edu/snap/ download.html) 1 Analyzing the Wikipedia voters network [27 points] Download the Wikipedia voting network wiki-Vote.txt.gz: http://snap.stanford.edu/ data/wiki-Vote.html. Using one of the network analysis tools above, load the Wikipedia voting network. Note that Wikipedia is a directed network. Formally, we consider the Wikipedia network as a directed graph G = (V, E), with node set V and edge set E V V where (edges are ordered pairs of nodes). An edge (a, b) E means that user a voted on user b. To make our questions clearer, we will use the following small graph as a running example: Gsmall = (Vsmall, Esmall), where Vsmall = {1, 2, 3} and Esmall = {(1, 2),(2, 1),(1, 3),(1, 1)}. 2 Compute and print out the following statistics for the wiki-Vote network: 1. The number of nodes in the network. (Gsmall has 3 nodes.) 2. The number of nodes with a self-edge (self-loop), i.e., the number of nodes a V where (a, a) E. (Gsmall has 1 self-edge.) 3. The number of directed edges in the network, i.e., the number of ordered pairs (a, b) E for which a 6= b. (Gsmall has 3 directed edges.) 4. The number of undirected edges in the network, i.e., the number of unique unordered pairs (a, b), a 6= b, for which (a, b) E or (b, a) E (or both). If both (a, b) and (b, a) are edges, this counts a single undirected edge. (Gsmall has 2 undirected edges.) 5. The number of reciprocated edges in the network, i.e., the number of unique unordered pairs of nodes (a, b), a 6= b, for which (a, b) E and (b, a) E. (Gsmall has 1 reciprocated edge.) 6. The number of nodes of zero out-degree. (Gsmall has 1 node with zero out-degree.) 7. The number of nodes of zero in-degree. (Gsmall has 0 nodes with zero in-degree.) 8. The number of nodes with more than 10 outgoing edges (out-degree > 10). 9. The number of nodes with fewer than 10 incoming edges (in-degree < 10). 2 Further Analyzing the Wikipedia voters network [6 points] For this problem, we use the Wikipedia voters network. If you are using Python, you might want to use NumPy, SciPy, and/or Matplotlib libraries. 1. Plot the distribution of out-degrees of nodes in the network on a log-log scale. Each data point is a pair (x, y) where x is a positive integer and y is the number of nodes in the network with out-degree equal to x. Restrict the range of x between the minimum and maximum out-degrees. You may filter out data points with a 0 entry. For the log-log scale, use base 10 for both x and y axes. 3 Finding Experts on the Java Programming Language on StackOverflow [12 points] Download the StackOverflow network stackoverflow-Java.txt.gz. An edge (a, b) i n the network means t hat person a endorsed an answer f rom person b on a Java-related question. 3 Using one of the network analysis tools above, load the StackOverflow network. Note that StackOverflow is a directed network. Compute and print out the following statistics for the stackoverflow-Java network: 1. The number of weakly connected components in the network. This value can be calculated in Snap.py via function GetWccs. 2. The number of edges and the number of nodes in the largest weakly connected component. The largest weakly connected component is calculated in Snap.py with function GetMxWcc. 3. The top 3 most central nodes in the network by PagePank scores. PageRank scores are calculated in Snap.py with function GetPageRank. 4. The top 3 hubs and top 3 authorities in the network by HITS scores. HITS scores are calculated in Snap.py with function GetHits.

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image

Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image

Step: 3

blur-text-image

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

Informix Database Administrators Survival Guide

Authors: Joe Lumbley

1st Edition

0131243144, 978-0131243149

More Books

Students also viewed these Databases questions