Question

Your task is to analyze a non-functional property, performance, of a fictional tool, SECompress, a configurable command-line tool for compressing data. In addition, the data can be encrypted, signed, segmented, and/or timestamped.

a) Implement an algorithm that creates a CART from performance data. You must not use a prefabricated Python implementation. The calculation used for the CART here is based on computing the mean and the squared error loss for both the left and the right node, and then the sum of the squared error losses of the two nodes. The format of the sample data and the representation of the CART are described below.

Important: If two split options are equally good, we use the alphabetical ordering of feature names as a tie breaker.

We use the following internal data structure for a CART. We represent a CART as a Python dict with exactly the entries shown in the example below. This example CART has three nodes: the root node X and the two child nodes XL and XR. As you can see, the child nodes only have a name and a mean; all other fields are set to None. A parent node also has a name and a mean, but additionally a feature by which the split is performed, the error of the split, and two successors.
Example:

    cart = {
        "name": "X",
        "mean": 456,
        "split_by_feature": "aes",
        "error_of_split": 73,
        "successor_left": {
            "name": "XL",
            "mean": 1234,
            "split_by_feature": None,
            "error_of_split": None,
            "successor_left": None,
            "successor_right": None
        },
        "successor_right": {
            "name": "XR",
            "mean": 258,
            "split_by_feature": None,
            "error_of_split": None,
            "successor_left": None,
            "successor_right": None
        }
    }

The performance data, given in a CSV file, contains different configurations of SECompress with performance measurements:

    Id:secompress,encryption,aes,blowfish,algorithm,rar,zip,signature,timestamp,segmentation,onehundredmb,onegb,performance
    0:1,0,0,0,1,1,0,0,0,0,0,0,750
    1:1,0,0,0,1,1,0,0,0,1,1,0,773
    2:1,0,0,0,1,1,0,0,0,1,0,1,770
    3:1,0,0,0,1,1,0,0,1,0,0,0,750
    4:1,0,0,0,1,1,0,0,1,1,1,0,773

Also given is this Python template for the answer. Do not use any further libraries or imports.

    #Code Start
    import pandas as pd

    cart = {
        "name": "X",
        "mean": 456,
        "split_by_feature": "aes",
        "error_of_split": 0.0,
        "successor_left": {
            "name": "XL",
            "mean": 1234,
            "split_by_feature": None,
            "error_of_split": None,
            "successor_left": None,
            "successor_right": None
        },
        "successor_right": {
            "name": "XR",
            "mean": 258,
            "split_by_feature": None,
            "error_of_split": None,
            "successor_left": None,
            "successor_right": None
        }
    }

    features = ["secompress", "encryption", "aes", "blowfish", "algorithm", "rar",
                "zip", "signature", "timestamp", "segmentation", "onehundredmb", "onegb"]

    def get_cart(sample_set_csv):
        # sample_set_csv is a file path to CSV data; it can be imported into a dataframe
        df = pd.read_csv(sample_set_csv)
        # TODO: Write your code here. And change the return.
        return {"name": "X", "mean": 1234, "split_by_feature": "rar", "error_of_split": 42,
                "successor_left": None, "successor_right": None}

Note that the Python code needs to be optimized; I need it to run successfully within 10 minutes. Someone on Chegg answered this, but their code is slow. Kindly test it before submission. I'm not in a hurry.

Step by Step Solution
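
The following is a minimal sketch of one way to build the CART described in the question, not a verified reference solution. It works on plain lists of dicts so that it needs no imports. Several points are assumptions, because the question does not state them: the helper names `sse`, `best_split`, and `build_cart` are hypothetical; the left successor holds the rows where the feature is 1 and the right successor the rows where it is 0; child names are formed by appending "L" or "R" to the parent name; and a node stops splitting only when no remaining feature separates its rows.

```python
def sse(values):
    """Sum of squared deviations of the values from their own mean."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def best_split(rows, features, target):
    """Return (error, feature) for the cheapest split, or None if nothing splits."""
    best = None
    # Alphabetical order plus the strict '<' below keeps the first name on ties.
    for feature in sorted(features):
        left = [r[target] for r in rows if r[feature] == 1]   # ASSUMPTION: left = selected
        right = [r[target] for r in rows if r[feature] == 0]  # ASSUMPTION: right = deselected
        if not left or not right:
            continue  # same value in every row, so this feature cannot split
        error = sse(left) + sse(right)
        if best is None or error < best[0]:
            best = (error, feature)
    return best


def build_cart(rows, features, target="performance", name="X"):
    """Recursively build the nested dict representation of the CART."""
    values = [r[target] for r in rows]
    node = {
        "name": name,
        "mean": sum(values) / len(values),
        "split_by_feature": None,
        "error_of_split": None,
        "successor_left": None,
        "successor_right": None,
    }
    split = best_split(rows, features, target)
    if split is None:
        return node  # leaf: ASSUMED stopping rule, no feature separates the rows
    error, feature = split
    remaining = [f for f in features if f != feature]
    node["split_by_feature"] = feature
    node["error_of_split"] = error
    node["successor_left"] = build_cart(
        [r for r in rows if r[feature] == 1], remaining, target, name + "L")
    node["successor_right"] = build_cart(
        [r for r in rows if r[feature] == 0], remaining, target, name + "R")
    return node


# The five sample configurations from the question, without the row ids.
columns = ["secompress", "encryption", "aes", "blowfish", "algorithm", "rar", "zip",
           "signature", "timestamp", "segmentation", "onehundredmb", "onegb",
           "performance"]
sample = [
    (1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 750),
    (1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 773),
    (1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 770),
    (1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 750),
    (1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0, 773),
]
rows = [dict(zip(columns, values)) for values in sample]
cart = build_cart(rows, columns[:-1])
print(cart["split_by_feature"], cart["error_of_split"])
```

To wire this into the template, `get_cart` could read the file with `pd.read_csv` and pass `df.to_dict("records")` together with the `features` list to `build_cart`; how the id column has to be handled depends on the actual file layout. Each node scans every remaining feature once, so the work per node is roughly proportional to the number of rows times the number of features. If the left and right sides are meant the other way round, only the two comparisons marked as assumptions need to be swapped.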
