Question

import numpy as np
import matplotlib.pyplot as plt

def getSamplar():
    mu = np.random.normal(0, 10)
    sd = abs(np.random.normal(5, 2))
    getSample = lambda: np.random.normal(mu, sd)
    return getSample

def e_greedy(Q, e):
    ##################################################
    # Your code here
    ##################################################
    return

Please finish the code in Python.

Task 1 - Make One-step Decision Using e-greedy

The formula is

$$
A \leftarrow
\begin{cases}
\arg\max_a Q(a), & \text{with probability } 1 - \varepsilon \text{ (break ties randomly)} \\
\text{a random action}, & \text{with probability } \varepsilon
\end{cases}
$$

You are supposed to complete the e_greedy function in the kBandit.py file. In this function you choose an action following the e-greedy algorithm. The function e_greedy has two input parameters:

(1) Q - a dictionary. The keys are the possible actions. The values are the average reward you got when taking each action.
(2) e - a scalar between 0 and 1 (the ε in the formula).

The return value of the function e_greedy is a scalar. It represents the action you take if you follow the e-greedy algorithm.
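The rule described above can be sketched roughly as follows. This is one possible way to fill in e_greedy, not the assignment's reference solution. It assumes Q is a non-empty dictionary, that "a random action" means one drawn uniformly from all keys of Q, and it uses Python's standard random module instead of NumPy so that the returned action keeps its original key type.

```python
import random

def e_greedy(Q, e):
    """Pick one action from Q following the e-greedy rule (illustrative sketch)."""
    actions = list(Q.keys())

    # Explore: with probability e, return any action chosen uniformly at random.
    # Assumption: the random action is drawn from all actions, greedy ones included.
    if random.random() < e:
        return random.choice(actions)

    # Exploit: otherwise return an action with the highest average reward,
    # breaking ties uniformly at random among the best actions.
    best_value = max(Q.values())
    best_actions = [a for a in actions if Q[a] == best_value]
    return random.choice(best_actions)


# Hypothetical example: three actions, two of them tied for the best estimate.
Q_example = {0: 1.0, 1: 5.0, 2: 5.0}
chosen = e_greedy(Q_example, 0.1)
```

With e = 0 the function is purely greedy and only ever returns one of the tied best actions; with e = 1 every call returns a uniformly random action.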
