Tips: To work on this lab, you need some software packages, such as numpy and sklearn, installed on your computer.

In a Python environment (non-Anaconda), here are the installation steps (from an SSH client):

cp ~nyu/get-pip.py ~

python get-pip.py --user

pip install numpy --user

In Anaconda:

Run the Anaconda prompt first (search bar -> anaconda). Then type python to start the Python prompt. Type import sklearn to check for errors (use exit() to quit). If there is an error, quit the Python prompt first; you can then install the package from the Anaconda prompt (the prompt starts with (base)).

conda install pip

pip install scikit-learn

pip install mglearn

Please run the Python program my_python_package_test.py posted on Blackboard to verify your installation.

My_Python_Package:

import numpy as np

#%matplotlib inline

import matplotlib.pyplot as plt

from scipy import sparse

import mglearn

from IPython.display import display

import sys

print("Python version:", sys.version)

import pandas as pd

print("pandas version:", pd.__version__)

import matplotlib

print("matplotlib version:", matplotlib.__version__)

print("NumPy version:", np.__version__)

import scipy as sp

print("SciPy version:", sp.__version__)

import IPython

print("IPython version:", IPython.__version__)

import sklearn

print("scikit-learn version:", sklearn.__version__)

x = np.array([[1, 2, 3], [4, 5, 6]])

print("x: {}".format(x))

# Create a 2D NumPy array with a diagonal of ones, and zeros everywhere else

eye = np.eye(4)

print("NumPy array: ", eye)

# Convert the NumPy array to a SciPy sparse matrix in CSR format

# Only the nonzero entries are stored

sparse_matrix = sparse.csr_matrix(eye)

print(" SciPy sparse CSR matrix: ", sparse_matrix)

data = np.ones(4)

row_indices = np.arange(4)

col_indices = np.arange(4)

eye_coo = sparse.coo_matrix((data, (row_indices, col_indices)))

print("COO representation: ", eye_coo)

# Generate a sequence of numbers from -10 to 10 with 100 steps in between

x = np.linspace(-10, 10, 100)

# Create a second array using sine

y = np.sin(x)

# The plot function makes a line chart of one array against another

plt.plot(x, y, marker="^")

# Create a simple dataset of people

data = {'Name': ["John", "Anna", "Peter", "Linda"], 'Location': ["New York", "Paris", "Berlin", "London"], 'Age': [24, 13, 53, 33]}

data_pandas = pd.DataFrame(data)

# IPython.display allows "pretty printing" of dataframes

# in the Jupyter notebook

display(data_pandas)

# Select all rows that have an age column greater than 30

display(data_pandas[data_pandas.Age > 30])

plt.show()

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

1a) Create a Python file named mystat.py and add the following lines:

import numpy as np

from sklearn import preprocessing

input_data = np.array([[5, -2, 3], [-1, 7, -6],[3, 0, 2],[7, -9, -4]])

print(input_data)

1b) We use threshold=2.2 to convert input_data into Boolean (0/1) values. Add the following lines to the same Python file. Run the file and show the printout below.

data_binarized = preprocessing.Binarizer(threshold=2.2).transform(input_data)

print(" Binarized data: ", data_binarized)
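As a cross-check for 1b, here is a minimal NumPy-only sketch of the thresholding rule: values strictly greater than the threshold become 1, everything else becomes 0. The variable names threshold and binarized are illustrative and not part of the lab files.

```python
import numpy as np

input_data = np.array([[5, -2, 3], [-1, 7, -6], [3, 0, 2], [7, -9, -4]])

# Compare every entry with the threshold, then turn True/False into 1/0
threshold = 2.2
binarized = (input_data > threshold).astype(int)

print(binarized)
```

Note that the entry 2 in the third row maps to 0 because it is below 2.2.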

1c) Mean and variance. Add the following lines to the same Python file. Run the file and show the printout below.

print("axis=0")

print("Mean =", input_data.mean(axis=0))

print("variance =", input_data.var(axis=0))

print("Std deviation =", input_data.std(axis=0))

print("axis=1")

print("Mean =", input_data.mean(axis=1))

print("variance =", input_data.var(axis=1))

print("Std deviation =", input_data.std(axis=1))

1d) What do axis=0 and axis=1 mean, respectively? Write your answer below.

1e) A data set can be standardized so that each feature has mean = 0 and std = 1. Add the following lines to the same Python file. Run the file and show the printout below.
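One way to explore 1d is to look at the shape of each result. The sketch below (variable names col_means and row_means are illustrative) suggests that axis=0 collapses the rows and yields one value per column, while axis=1 collapses the columns and yields one value per row.

```python
import numpy as np

input_data = np.array([[5, -2, 3], [-1, 7, -6], [3, 0, 2], [7, -9, -4]])

# axis=0: aggregate down the rows, one result per column (3 values)
col_means = input_data.mean(axis=0)

# axis=1: aggregate across the columns, one result per row (4 values)
row_means = input_data.mean(axis=1)

print("axis=0 shape:", col_means.shape, "values:", col_means)
print("axis=1 shape:", row_means.shape, "values:", row_means)
```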

data_scaled = preprocessing.scale(input_data)

print(" AFTER:")

print("Mean =", data_scaled.mean(axis=0))

print("variance =", data_scaled.var(axis=0))

print("Std deviation =", data_scaled.std(axis=0))
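The scaling in 1e can be reproduced by hand. This is a sketch in plain NumPy, assuming column-wise standardization with the population standard deviation (NumPy's default, ddof=0); the name standardized is illustrative.

```python
import numpy as np

input_data = np.array([[5, -2, 3], [-1, 7, -6], [3, 0, 2], [7, -9, -4]])

# Subtract each column's mean and divide by each column's standard deviation
col_mean = input_data.mean(axis=0)
col_std = input_data.std(axis=0)  # population std (ddof=0)
standardized = (input_data - col_mean) / col_std

print("Mean =", standardized.mean(axis=0))
print("Std deviation =", standardized.std(axis=0))
```

The printed means may show tiny values such as 1e-17 instead of an exact 0 because of floating-point rounding.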

1f) A min-max scaler can scale the data set to the range [0, 1]. Add the following lines to the same Python file. Run the file and show the printout below.

minmax_scaler = preprocessing.MinMaxScaler(feature_range=(0, 1))

data_scaled_minmax = minmax_scaler.fit_transform(input_data)

print(" Min-max scaled data: ", data_scaled_minmax)

2a) Recall what we learned in class and explain how the min-max scaler works.

2b) The L1 norm and L2 norm are both commonly used in deep learning. Add the following lines to the same Python file. Run the file and show the printout below.
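As a starting point for 2a, the sketch below applies the usual min-max formula, (x - min) / (max - min), to each column with plain NumPy. It assumes the target range is [0, 1]; the name scaled is illustrative.

```python
import numpy as np

input_data = np.array([[5, -2, 3], [-1, 7, -6], [3, 0, 2], [7, -9, -4]])

# Per-column minimum and maximum
col_min = input_data.min(axis=0)
col_max = input_data.max(axis=0)

# Shift each column so its minimum is 0, then divide by its range
scaled = (input_data - col_min) / (col_max - col_min)

print(scaled)
```

In each column the smallest original value becomes 0 and the largest becomes 1; every other value lands proportionally in between.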

data_normalized_l1 = preprocessing.normalize(input_data, norm='l1')

data_normalized_l2 = preprocessing.normalize(input_data, norm='l2')

print(" L1 normalized data: ", data_normalized_l1)

print(" L2 normalized data: ", data_normalized_l2)

2c) Recall what we learned in class and explain the principles of the L1 and L2 norms.

2d) Encoding the labels. Create a new Python file named mylabel.py and add the following lines. Run the file and show the printout below.
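To make 2c concrete, here is a NumPy-only sketch of row-wise normalization. It assumes each row (sample) is divided by its own norm: the sum of absolute values for L1, and the square root of the sum of squares for L2. Variable names are illustrative.

```python
import numpy as np

input_data = np.array([[5, -2, 3], [-1, 7, -6], [3, 0, 2], [7, -9, -4]])

# L1 norm of each row: sum of absolute values
l1_norms = np.abs(input_data).sum(axis=1, keepdims=True)
normalized_l1 = input_data / l1_norms

# L2 norm of each row: square root of the sum of squares
l2_norms = np.sqrt((input_data ** 2).sum(axis=1, keepdims=True))
normalized_l2 = input_data / l2_norms

print("L1 normalized data:", normalized_l1)
print("L2 normalized data:", normalized_l2)
```

After L1 normalization the absolute values in each row sum to 1; after L2 normalization the squares in each row sum to 1.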

import numpy as np

from sklearn import preprocessing

input_labels = ['red', 'black', 'red', 'green', 'black', 'yellow', 'white']

# Create label encoder and fit the labels

encoder = preprocessing.LabelEncoder()

encoder.fit(input_labels)

print(" Label mapping:")

for i, item in enumerate(encoder.classes_):

    print(item, '-->', i)
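The mapping printed in 2d follows the sorted order of the distinct labels. A rough NumPy equivalent, offered only as an illustration of that ordering, uses np.unique; the names classes and codes are ours.

```python
import numpy as np

input_labels = ['red', 'black', 'red', 'green', 'black', 'yellow', 'white']

# np.unique returns the distinct labels in sorted order; return_inverse
# also gives, for each input label, its index in that sorted array
classes, codes = np.unique(input_labels, return_inverse=True)

for i, item in enumerate(classes):
    print(item, '-->', i)

print("Codes for input_labels:", codes.tolist())
```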

2e) Add the following lines to the same file. Run the file and show the printout below.

test_labels = ['green', 'red', 'black']

encoded_values = encoder.transform(test_labels)

print(" Labels =", test_labels)

print("Encoded values =", list(encoded_values))

2f) Add the following lines to the same file. Run the file and show the printout below.

encoded_values = [3, 0, 4, 1]

decoded_list = encoder.inverse_transform(encoded_values)

print(" Encoded values =", encoded_values)

print("Decoded labels =", list(decoded_list))
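The encode and decode steps in 2e and 2f amount to a lookup in each direction. The pure-Python sketch below assumes the codes are positions in the sorted list of distinct labels; label_to_code, encoded and decoded are hypothetical names used only for illustration.

```python
input_labels = ['red', 'black', 'red', 'green', 'black', 'yellow', 'white']

# Distinct labels in sorted order; a label's position is its code
classes = sorted(set(input_labels))
label_to_code = {label: code for code, label in enumerate(classes)}

# Encoding: look up each label's code
test_labels = ['green', 'red', 'black']
encoded = [label_to_code[label] for label in test_labels]

# Decoding: index back into the sorted label list
encoded_values = [3, 0, 4, 1]
decoded = [classes[code] for code in encoded_values]

print("Encoded values =", encoded)
print("Decoded labels =", decoded)
```

In this sketch, a label that was not in input_labels raises a KeyError during encoding, because the dictionary has no code for it.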
