Question
Tips: In order to work on this lab, you have to get some software packages such as numpy and sklearn installed on your computer.
In a Python environment (non-Anaconda), the installation steps (from an SSH client) are:
cp ~nyu/get-pip.py ~
python get-pip.py --user
pip install numpy --user
In Anaconda:
Run the Anaconda Prompt first (search bar -> anaconda). Then type python to start the Python prompt. Type import sklearn to check for errors (use exit() to quit). If there is an error, quit the Python prompt first, then install the packages from the Anaconda Prompt (the prompt starts with (base)):
conda install pip
pip install scikit-learn
pip install mglearn
Please run the Python program my_python_package_test.py posted on BlackBoard to verify your installation.
My_Python_Package:
import numpy as np
# %matplotlib inline
import matplotlib.pyplot as plt
from scipy import sparse
import mglearn
from IPython.display import display

import sys
print("Python version:", sys.version)

import pandas as pd
print("pandas version:", pd.__version__)

import matplotlib
print("matplotlib version:", matplotlib.__version__)

print("NumPy version:", np.__version__)

import scipy as sp
print("SciPy version:", sp.__version__)

import IPython
print("IPython version:", IPython.__version__)

import sklearn
print("scikit-learn version:", sklearn.__version__)

x = np.array([[1, 2, 3], [4, 5, 6]])
print("x: {}".format(x))

# Create a 2D NumPy array with a diagonal of ones, and zeros everywhere else
eye = np.eye(4)
print("NumPy array: ", eye)

# Convert the NumPy array to a SciPy sparse matrix in CSR format
# Only the nonzero entries are stored
sparse_matrix = sparse.csr_matrix(eye)
print(" SciPy sparse CSR matrix: ", sparse_matrix)

data = np.ones(4)
row_indices = np.arange(4)
col_indices = np.arange(4)
eye_coo = sparse.coo_matrix((data, (row_indices, col_indices)))
print("COO representation: ", eye_coo)

# Generate a sequence of numbers from -10 to 10 with 100 steps in between
x = np.linspace(-10, 10, 100)
# Create a second array using sine
y = np.sin(x)
# The plot function makes a line chart of one array against another
plt.plot(x, y, marker="^")

# create a simple dataset of people
data = {'Name': ["John", "Anna", "Peter", "Linda"],
        'Location': ["New York", "Paris", "Berlin", "London"],
        'Age': [24, 13, 53, 33]
        }

data_pandas = pd.DataFrame(data)
# IPython.display allows "pretty printing" of dataframes
# in the Jupyter notebook
display(data_pandas)

# Select all rows that have an age column greater than 30
display(data_pandas[data_pandas.Age > 30])

plt.show()
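If the test script stops at the first failed import, it can help to list which packages Python can locate at all. The following is a minimal sketch using only the standard library; the helper name check_packages and the package list are assumptions made for illustration and are not part of the lab files.

```python
import importlib.util


def check_packages(names):
    """Map each top-level package name to True if Python can locate it."""
    # find_spec returns None when a top-level module cannot be found
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Import names used by the test script (scikit-learn is imported as "sklearn")
    wanted = ["numpy", "scipy", "matplotlib", "pandas", "IPython", "sklearn", "mglearn"]
    for name, found in check_packages(wanted).items():
        print(f"{name:12s}", "OK" if found else "MISSING")
```

Any package reported as MISSING can then be installed with pip as described above.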
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1a) Create a Python file named mystat.py containing the following imports and input data:
import numpy as np
from sklearn import preprocessing
input_data = np.array([[5, -2, 3], [-1, 7, -6],[3, 0, 2],[7, -9, -4]])
print(input_data)
1b) Use threshold=2.2 to convert input_data to binary (0/1) values. Add the following lines to the same Python file. Run the file and show the printout below.
data_binarized = preprocessing.Binarizer(threshold=2.2).transform(input_data)
print(" Binarized data: ", data_binarized)
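As a cross-check on the printout, the thresholding rule can be reproduced with plain NumPy: values strictly greater than the threshold become 1 and everything else becomes 0. This is a sketch of the rule, not scikit-learn's implementation, and manual_binarized is a name introduced here for illustration.

```python
import numpy as np

input_data = np.array([[5, -2, 3], [-1, 7, -6], [3, 0, 2], [7, -9, -4]])
threshold = 2.2

# The comparison gives a Boolean array; casting turns True/False into 1/0
manual_binarized = (input_data > threshold).astype(int)
print(manual_binarized)
```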
1c) Mean and variance. Add the following lines to the same Python file. Run the file and show the printout below.
print("axis=0")
print("Mean =", input_data.mean(axis=0))
print("variance =", input_data.var(axis=0))
print("Std deviation =", input_data.std(axis=0))
print("axis=1")
print("Mean =", input_data.mean(axis=1))
print("variance =", input_data.var(axis=1))
print("Std deviation =", input_data.std(axis=1))
1d) What do axis=0 and axis=1 mean, respectively? Write your answer below.
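One way to see the difference is to compare the shapes of the results. In the sketch below (variable names are illustrative), axis=0 reduces over the rows and leaves one value per column, while axis=1 reduces over the columns and leaves one value per row.

```python
import numpy as np

input_data = np.array([[5, -2, 3], [-1, 7, -6], [3, 0, 2], [7, -9, -4]])

# axis=0 collapses the rows: one result per column (3 values here)
column_means = input_data.mean(axis=0)
# axis=1 collapses the columns: one result per row (4 values here)
row_means = input_data.mean(axis=1)

print(input_data.shape, column_means.shape, row_means.shape)
print("column means:", column_means)
print("row means:", row_means)
```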
1e) A data set can be standardized so that each feature (column) has mean = 0 and std = 1. Add the following lines to the same Python file. Run the file and show the printout below.
data_scaled = preprocessing.scale(input_data)
print(" AFTER:")
print("Mean =", data_scaled.mean(axis=0))
print("variance =", data_scaled.var(axis=0))
print("Std deviation =", data_scaled.std(axis=0))
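The same standardization can be written by hand, which makes the rule explicit: subtract each column's mean and divide by its standard deviation. This is a sketch assuming the population standard deviation (NumPy's default, ddof=0); manual_scaled is an illustrative name. Because of floating-point rounding, the printed means may be tiny numbers rather than exactly zero.

```python
import numpy as np

input_data = np.array([[5, -2, 3], [-1, 7, -6], [3, 0, 2], [7, -9, -4]])

# Subtract each column's mean, then divide by its population standard deviation
manual_scaled = (input_data - input_data.mean(axis=0)) / input_data.std(axis=0)

print("Mean =", manual_scaled.mean(axis=0))
print("Std deviation =", manual_scaled.std(axis=0))
```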
1f) A min-max scaler can scale the data set to the range [0, 1]. Add the following lines to the same Python file. Run the file and show the printout below.
minmax_scaler = preprocessing.MinMaxScaler(feature_range=(0, 1))
data_scaled_minmax = minmax_scaler.fit_transform(input_data)
print(" Min-max scaled data: ", data_scaled_minmax)
2a) Recall what we learned in class and explain how the min-max scaler works.
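For the range [0, 1], min-max scaling maps each value x in a column to (x - min) / (max - min), where min and max are taken over that column. The sketch below applies this formula with plain NumPy; it assumes no column is constant (otherwise the denominator is zero), and the variable names are illustrative.

```python
import numpy as np

input_data = np.array([[5, -2, 3], [-1, 7, -6], [3, 0, 2], [7, -9, -4]])

col_min = input_data.min(axis=0)
col_max = input_data.max(axis=0)

# (x - min) / (max - min), column by column; assumes no column is constant
manual_minmax = (input_data - col_min) / (col_max - col_min)
print(manual_minmax)
```

Each column's smallest value maps to 0 and its largest to 1, so every entry ends up in [0, 1].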
2b) The L1 norm and the L2 norm are both commonly used in deep learning. Add the following lines to the same Python file. Run the file and show the printout below.
data_normalized_l1 = preprocessing.normalize(input_data, norm='l1')
data_normalized_l2 = preprocessing.normalize(input_data, norm='l2')
print(" L1 normalized data: ", data_normalized_l1)
print(" L2 normalized data: ", data_normalized_l2)
2c) Recall what we learned in class and explain the principles of the L1 and L2 norms.
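The L1 norm of a vector is the sum of the absolute values of its entries, and the L2 norm is the square root of the sum of their squares. preprocessing.normalize works on each row (sample) by default, dividing the row by its norm. The following is a hand-written sketch of that per-row rule with illustrative variable names; it assumes no row is all zeros.

```python
import numpy as np

input_data = np.array([[5, -2, 3], [-1, 7, -6], [3, 0, 2], [7, -9, -4]])

# L1 norm of each row: sum of absolute values
l1_norms = np.abs(input_data).sum(axis=1, keepdims=True)
# L2 norm of each row: square root of the sum of squares
l2_norms = np.sqrt((input_data ** 2).sum(axis=1, keepdims=True))

manual_l1 = input_data / l1_norms
manual_l2 = input_data / l2_norms
print(manual_l1)
print(manual_l2)
```

After L1 normalization the absolute values in each row sum to 1; after L2 normalization each row has unit Euclidean length.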
2d) Encoding the labels. Create a new Python file named mylabel.py and add the following lines. Run the file and show the printout below.
import numpy as np
from sklearn import preprocessing
input_labels = ['red', 'black', 'red', 'green', 'black', 'yellow', 'white']
# Create label encoder and fit the labels
encoder = preprocessing.LabelEncoder()
encoder.fit(input_labels)
print(" Label mapping:")
for i, item in enumerate(encoder.classes_):
    print(item, '-->', i)
2e) Add the following lines to the same file. Run the file and show the printout below.
test_labels = ['green', 'red', 'black']
encoded_values = encoder.transform(test_labels)
print(" Labels =", test_labels)
print("Encoded values =", list(encoded_values))
2f) Add the following lines to the same file. Run the file and show the printout below.
encoded_values = [3, 0, 4, 1]
decoded_list = encoder.inverse_transform(encoded_values)
print(" Encoded values =", encoded_values)
print("Decoded labels =", list(decoded_list))
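The mapping printed in 2d follows from the fact that LabelEncoder stores the unique labels in sorted order and uses each label's position as its code. The sketch below imitates that behavior with plain NumPy to make the mapping easier to check by hand; it is not scikit-learn's implementation, and the variable names are illustrative.

```python
import numpy as np

input_labels = ['red', 'black', 'red', 'green', 'black', 'yellow', 'white']

# Sorted unique labels; a label's position in this array is its integer code
classes = np.unique(input_labels)

# Encoding: look up each label's position in the sorted array
test_labels = ['green', 'red', 'black']
encoded = np.searchsorted(classes, test_labels)

# Decoding: index back into the sorted array
decoded = classes[[3, 0, 4, 1]]

print(list(classes))
print(list(encoded), list(decoded))
```

Unlike LabelEncoder, this sketch does not raise an error for a label it has not seen, so it is only suitable for checking labels that appear in input_labels.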