Answered step by step
Verified Expert Solution
Link Copied!

Question

1 Approved Answer

In C- You will write a program learn that uses a training data set to learn weights for a set of house attributes, and then

In C-

You will write a program learn that uses a training data set to learn weights for a set of house attributes, and then applies those weights to a set of input data to calculate prices for those houses. learn takes two arguments, which are the paths to files containing the training data and input data.

Training data format The first line will be the word train. The second line will contain an integer k, giving the number of attributes. The third line will contain an integer n, giving the number of houses. The next n lines will contain k + 1 floating-point numbers, separated by spaces. Each line gives data for a house. The first k numbers give the values x1 xk for that house, and the last number gives its price y.

For example, a file train.txt might contain:

train

4

7

3.000000 1.000000 1180.000000 1955.000000 221900.000000

3.000000 2.250000 2570.000000 1951.000000 538000.000000

2.000000 1.000000 770.000000 1933.000000 180000.000000

4.000000 3.000000 1960.000000 1965.000000 604000.000000

3.000000 2.000000 1680.000000 1987.000000 510000.000000

4.000000 4.500000 5420.000000 2001.000000 1230000.000000

3.000000 2.250000 1715.000000 1995.000000 257500.000000

This file contains data for 7 houses, with 4 attributes and a price for each house. The corresponding matrix X will be 7 5 and Y will be 7 1. (Recall that column 0 of X is all ones.)

Input data format The first line will be the word data. The second line will be an integer k, giving the number of attributes. The third line will be an ineteger m, giving the number of houses. The next m lines will contain k floating-point numbers, separated by spaces. Each line gives data for a house, not including its price.

For example, a file data.txt might contain:

data

4

2

3.000000 2.500000 3560.000000 1965.000000

2.000000 1.000000 1160.000000 1942.000000

This contains data for 2 houses, with 4 attributes for each house. The corresponding matrix X will be 2 5.

Output format Your program should output the prices computed for each house in the input data using the weights derived from the training data. Each house price will be printed on a line, rounded to the nearest integer.

To print a floating-point number rounded to the nearest integer, use the formatting code %.0f, as in: printf("%.0f ", price);

Usage Assuming the files train.txt and data.txt exist in the same directory as learn:

$ ./learn train.txt data.txt

737861

203060

Implementation notes You MUST use double to represent the attributes, weights, and prices. Using float may result in incorrect results due to rounding. To read double values from the training and input data files, you can use fscanf with the format code %lf.

If learn successfully completes, it MUST return exit code 0.

You MAY assume that the training and input data files are correctly formatted. You MAY assume that the first argument is a training data file and that the second argument is an input data file. However, checking that the training data file begins with train and that the input data file begins with data may be helpful if you accidentally give the wrong arguments to learn while you are testing it. To read a string containing up to 5 non-space characters, you can use the fscanf format code %5s.

learn SHOULD check that the training and input data files specify the same value for k.

If the training or input files do not exist, are not readable, are incorrectly formatted, or specify different values of k, learn MAY print error and return exit code 1. Your code will not be tested with these scenarios.

Algorithm: Given matrices X and Y , your program will compute (X^T X) 1X^T Y in order to learn W. This will require (1) multiplying, (2) transposing, and (3) inverting matrices.

Transposing an m n matrix produces an n m matrix. Each row of the X becomes a column of X^T .

To find the inverse of X^T X, you will use a simplified form of Gauss-Jordan elimination.

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image

Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image

Step: 3

blur-text-image

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

Database Theory Icdt 97 6th International Conference Delphi Greece January 8 10 1997 Proceedings Lncs 1186

Authors: Foto N. Afrati ,Phokion G. Kolaitis

1st Edition

3540622225, 978-3540622222

More Books

Students also viewed these Databases questions

Question

Develop clear policy statements.

Answered: 1 week ago

Question

Draft a business plan.

Answered: 1 week ago

Question

Describe the guidelines for appropriate use of the direct plan.

Answered: 1 week ago