Help with Python:
Average Path Length
In Part 4, you will apply value iteration to a relatively large Frozen Platform environment and will then study the average path length for successful and unsuccessful episodes run under the optimal policy.
A. Create Environment
Create a x instance of the FrozenPlatform environment with sprange a start position of with holes, and with randomstate. Display the environment with cells, set fill to shade the cells according to their slip probabilities, set size, and set shownums=False.
B. Value Iteration
Create an instance of the DPAgent class for the environment created in Step A with gamma and randomstate. Run value iteration with the default parameters.
Display the environment again. This time, set fill to shade the cells according to the state-value function for the optimal policy, set contents to display the optimal policy, set size, and set shownums=False.
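For readers who want to see what value iteration computes, here is a minimal NumPy sketch on a tiny hand-built MDP. It does not use the DPAgent class, whose internals are not shown in the question. The function value_iteration, the arrays P and R, the three-state chain, and gamma=0.9 are all illustrative assumptions rather than part of the assignment.

```python
import numpy as np


def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Tabular value iteration (illustrative sketch, not the DPAgent code).

    P[a, s, t] is the probability of moving from state s to state t under
    action a. R[a, s] is the expected immediate reward for taking action a
    in state s. Both arrays are hypothetical inputs.
    """
    n_states = P.shape[1]
    V = np.zeros(n_states)
    for _ in range(max_iter):
        # Bellman optimality backup: Q[a, s] = R[a, s] + gamma * sum_t P[a, s, t] * V[t]
        Q = R + gamma * (P @ V)
        V_new = Q.max(axis=0)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    # Greedy policy with respect to the final value estimate
    policy = (R + gamma * (P @ V)).argmax(axis=0)
    return V, policy


# Three-state chain: states 0 and 1 are ordinary cells, state 2 is an absorbing goal.
# Action 0 moves left, action 1 moves right; moves are deterministic in this toy case.
P = np.zeros((2, 3, 3))
P[0, 0, 0] = P[0, 1, 0] = P[0, 2, 2] = 1.0  # left
P[1, 0, 1] = P[1, 1, 2] = P[1, 2, 2] = 1.0  # right
R = np.zeros((2, 3))
R[1, 1] = 1.0  # reward for stepping from state 1 into the goal

V, policy = value_iteration(P, R, gamma=0.9)
print("State values:", V)
print("Greedy policy (0=left, 1=right):", policy)
```

In the assignment itself, the agent class performs this computation internally. The sketch only shows the backup that produces the state values and the greedy policy that the display step then draws.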
C. Average Performance
You will now study the average performance of an agent following the optimal policy found in B. You will estimate the agent's success rate, and will also determine the average path length for successful episodes as well as for unsuccessful episodes.
Starter code has been provided below. Fill in the blanks as required to accomplish the tasks described below.
The code should generate episodes following the optimal policy. After each episode, determine if the agent reached the goal. If so, increment the goal count and append the length of the resulting path to the list slengths. If the agent did not reach the goal, append the path length to the list flengths.
Then print messages regarding the success rate under the optimal policy, as well as the average path length for both successful and failed episodes.
    N = ____
    slengths = []
    flengths = []
    goals = 0
    np.random.seed(____)
    for i in range(N):
        ep = ____.generateepisode(policy=____.policy)
        pathlength = ____
        if ep.state == ep.____:
            goals += ____
            slengths.append(____)
        else:
            flengths.append(____)
    sr = ____
    print('When working under the optimal policy:')
    print(f"The agent's success rate was {____:f}")
    print(f"The average path length for successful episodes was {np.mean(____):f}")
    print(f"The average path length for unsuccessful episodes was {np.mean(____):f}")
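The tallying logic described in this step can be sketched independently of the environment. In the example below, simulate_episode is a hypothetical stand-in for the environment's episode generator, and the corridor layout, slip probability, seed, and episode count of 1000 are assumptions chosen only so the code runs on its own. The names slengths, flengths, goals, and sr follow the starter code.

```python
import numpy as np


def simulate_episode(rng, p_slip=0.2, n_cells=5):
    """Hypothetical stand-in for one episode under a fixed policy.

    The agent walks along a corridor of n_cells cells toward the goal in the
    last cell. On each step it falls into a hole with probability p_slip.
    Returns (reached_goal, path_length).
    """
    for step in range(1, n_cells):
        if rng.random() < p_slip:
            return False, step
    return True, n_cells - 1


N = 1000  # assumed episode count
rng = np.random.default_rng(1)  # assumed seed
slengths, flengths = [], []
goals = 0

for _ in range(N):
    reached_goal, pathlength = simulate_episode(rng)
    if reached_goal:
        goals += 1
        slengths.append(pathlength)
    else:
        flengths.append(pathlength)

sr = goals / N
print("When working under the optimal policy:")
print(f"The agent's success rate was {sr:.4f}")
# Guard against empty lists so np.mean never receives an empty sequence
if slengths:
    print(f"The average path length for successful episodes was {np.mean(slengths):.2f}")
if flengths:
    print(f"The average path length for unsuccessful episodes was {np.mean(flengths):.2f}")
```

With the real environment, only the call that produces the episode and the test for reaching the goal would change. The counting and averaging stay the same.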
D. Visualizing Results
Use Matplotlib to create a x grid of subplots. Each subplot should contain a histogram indicating the distribution of path lengths. One histogram should correspond to path lengths for successful episodes and the other to unsuccessful episodes. The figure should have the following characteristics:
Set the figure size to be
The subplots should be titled "Successful Episodes" and "Unsuccessful Episodes".
The x-axis of each subplot should be labeled "Path Length" and the y-axis should be labeled "Episode Count".
Use bins for each histogram.
Select a different color for each subplot. Set the edgecolor of the bars to 'black' or 'k'.
Set the xlim to be the same for both subplots. Select values that result in no bars getting cut off.
Set the ylim to be the same for both subplots. Select values that result in no bars getting cut off.
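One way to meet the requirements listed above is sketched below with Matplotlib. The data are random placeholders standing in for slengths and flengths. The side-by-side layout, figure size, bin count, and colors are assumptions, since the exact values are not given in the text.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
# Placeholder path lengths; replace with the lists collected in the previous step
slengths = rng.integers(20, 60, size=400)
flengths = rng.integers(1, 40, size=600)

# Shared bin edges keep the two histograms directly comparable (12 bins is an assumed count)
bins = np.linspace(0, 60, 13)

fig, axes = plt.subplots(1, 2, figsize=(10, 4))  # assumed layout and figure size
panels = [
    (axes[0], slengths, "Successful Episodes", "tab:green"),
    (axes[1], flengths, "Unsuccessful Episodes", "tab:red"),
]

max_count = 0
for ax, data, title, color in panels:
    counts, _, _ = ax.hist(data, bins=bins, color=color, edgecolor="k")
    max_count = max(max_count, counts.max())
    ax.set_title(title)
    ax.set_xlabel("Path Length")
    ax.set_ylabel("Episode Count")

# Identical limits on both panels, with headroom so no bar is cut off
for ax in axes:
    ax.set_xlim(bins[0], bins[-1])
    ax.set_ylim(0, max_count * 1.05)

fig.tight_layout()
```

Computing the y-limit from the tallest bar across both panels avoids hand-tuning the limits when the data change. In a notebook, drop the matplotlib.use("Agg") line and call plt.show() at the end.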
E. Successful Episode
Use the generateepisode method of the environment to simulate an episode following the optimal policy found by value iteration in B. Set showresult=True and set a value of your choice for randomstate.
Call the display method of the environment, setting the fill, contents, and showpath parameters so that cells are shaded to indicate the optimal state-value function, arrows for the optimal policy are displayed, and the path taken during the episode is shown.
Experiment with the value of randomstate to find one that results in the agent finding the goal. Use that value for your final submission.
F. Failed Episode
Repeat Step E, but this time find a value for randomstate that results in a failed episode.
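Rather than trying seeds by hand in Steps E and F, the search can be automated with a short loop. The sketch below uses a hypothetical run_episode function as a stand-in for the environment's episode generator. Its signature, the corridor model, and the search limit of 1000 seeds are assumptions for illustration only.

```python
import numpy as np


def run_episode(random_state, p_slip=0.2, n_cells=5):
    """Hypothetical stand-in: simulate one seeded episode.

    Returns (reached_goal, path_length). The same random_state always
    reproduces the same episode.
    """
    rng = np.random.default_rng(random_state)
    for step in range(1, n_cells):
        if rng.random() < p_slip:
            return False, step
    return True, n_cells - 1


def find_seed(want_success, max_seed=1000):
    """Return the first seed whose episode outcome matches want_success, or None."""
    for seed in range(max_seed):
        reached_goal, _ = run_episode(seed)
        if reached_goal == want_success:
            return seed
    return None


success_seed = find_seed(True)   # candidate seed for Step E
failure_seed = find_seed(False)  # candidate seed for Step F
print("Seed giving a successful episode:", success_seed)
print("Seed giving a failed episode:", failure_seed)
```

Because each episode is fully determined by its seed, the value found by the loop can be passed back to the episode generator to reproduce the same path for the final display.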