Question

Introduction

To understand how parallel computing works for data mining, we will imitate the work of computers in small groups, calculating simple statistical characteristics (mean and standard deviation) with each student acting as a node of a distributed computer cluster (1 student = 1 node).

Directions

Download the Unit 5 Group Assignment dataset.csv file, which contains the recorded values of a variable. The goal is to calculate the average and the standard deviation of that variable as a group.

In your initial post, describe the algorithm that the central node and the computing nodes need to follow to compute the average and the standard deviation of the dataset, given that each computing node can work only with its assigned fraction of the dataset. Explain which parallelization technique you will use, and why.

The first student to submit the initial post will serve as the central node, which splits the dataset and assigns one portion to each of the other students in the group (the central node assigns no data to itself).
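One possible way for the central node to split the data is sketched below. It assumes the values have already been loaded into a list and that contiguous, near-equal chunks are acceptable; the function name `split_dataset` and the sample values are hypothetical and are not taken from the assignment file.

```python
def split_dataset(values, num_nodes):
    """Split values into num_nodes contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(values), num_nodes)
    chunks = []
    start = 0
    for i in range(num_nodes):
        # The first `extra` nodes each take one additional value.
        size = base + (1 if i < extra else 0)
        chunks.append(values[start:start + size])
        start += size
    return chunks


# Illustrative data only: 10 values shared among 3 computing nodes.
chunks = split_dataset(list(range(10)), 3)
```

Near-equal chunk sizes keep the workload balanced, although the aggregation does not require equal sizes as long as each node reports how many values it processed.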

Then each of you should run the calculation on your fraction of the dataset and post the aggregated information that the central node needs on the discussion board. When all partial results are in, the student playing the central node should aggregate them and post the results for the whole dataset.
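As a minimal sketch of this exchange for the mean, each node could report only the count and the sum of its chunk, and the central node could combine them. The function names and the sample chunks are assumptions made for illustration; the assignment does not prescribe what the "needed aggregated information" is.

```python
def node_first_pass(chunk):
    """Run by a computing node: summarize its own chunk only."""
    return {"count": len(chunk), "total": sum(chunk)}


def central_mean(partials):
    """Run by the central node: combine the per-node summaries into the global mean."""
    n = sum(p["count"] for p in partials)
    total = sum(p["total"] for p in partials)
    return total / n


# Illustrative chunks, not the assignment data.
chunks = [[2.0, 4.0], [4.0, 4.0, 5.0], [5.0, 7.0, 9.0]]
partials = [node_first_pass(c) for c in chunks]
mean = central_mean(partials)
```

Reporting the count together with the sum matters: averaging the per-node means directly would give a wrong result whenever the chunks have different sizes.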

To calculate the standard deviation, you may need two iterations. Make sure to complete both of them by Sunday.
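The second iteration could look like the sketch below, assuming the central node has already computed the global mean in the first iteration and posted it to all nodes. Each node then reports the sum of squared deviations of its chunk from that mean. The names are hypothetical, and the `ddof` parameter is an assumption: the assignment does not say whether the population (`ddof=0`) or sample (`ddof=1`) standard deviation is expected.

```python
import math


def node_second_pass(chunk, global_mean):
    """Run by a computing node: squared deviations of its chunk from the global mean."""
    return {
        "count": len(chunk),
        "sq_dev": sum((x - global_mean) ** 2 for x in chunk),
    }


def central_std(partials, ddof=0):
    """Run by the central node: ddof=0 gives the population value, ddof=1 the sample value."""
    n = sum(p["count"] for p in partials)
    sq_dev = sum(p["sq_dev"] for p in partials)
    return math.sqrt(sq_dev / (n - ddof))


# Illustrative chunks, not the assignment data; their global mean is 5.0.
chunks = [[2.0, 4.0], [4.0, 4.0, 5.0], [5.0, 7.0, 9.0]]
global_mean = 5.0
partials = [node_second_pass(c, global_mean) for c in chunks]
population_std = central_std(partials, ddof=0)
sample_std = central_std(partials, ddof=1)
```

A single-iteration alternative exists in which nodes also report the sum of squares in the first pass, but it can lose numerical precision; the two-iteration form above follows the wording of the assignment.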

If one of the students does not submit partial results, the student playing the central node may decide how to distribute the missing fraction of the dataset among the other students (nodes).
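One way the central node might handle a missing node is sketched here, assuming a simple round-robin policy. The assignment leaves the choice to the central node, so the function `redistribute`, the node labels, and the values are all hypothetical.

```python
def redistribute(assignments, missing_node):
    """Spread the missing node's values round-robin over the remaining nodes.

    Returns only the extra values each remaining node must process, so nodes
    that have already posted results do not need to redo their own chunk.
    """
    orphaned = assignments[missing_node]
    remaining = [node for node in assignments if node != missing_node]
    extra = {node: [] for node in remaining}
    for i, value in enumerate(orphaned):
        extra[remaining[i % len(remaining)]].append(value)
    return extra


# Illustrative assignment: node "B" never posted its partial results.
assignments = {"A": [1, 2], "B": [3, 4, 5], "C": [6, 7]}
extra_work = redistribute(assignments, "B")
```

Because counts, sums, and sums of squared deviations are all additive, each remaining node can post a separate partial result for its extra values, and the central node simply adds it to the totals.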

All communication should be conducted within the discussion board.
