Answered step by step
Verified Expert Solution
Link Copied!

Question

1 Approved Answer

Use Python Scenario We are on the way to setting foot on Mars. To do so , we need to refuel our space capsule in

Use Python
Scenario
We are on the way to setting foot on Mars. To do so, we need to refuel our space capsule in a space station thousands of miles away from Earth.
We need to perform a docking operation similarly as shown in this video.
Goal
Build a reinforcement learning model to perfom the docking operation autonomously while respecting safety and time constraints.
Time constraint
The docking operation must be done in 5 minutes (300 s)
Safety constraint
For the docking operation to be performed safely, the space capsule needs to be aligned +/-15 cm from the docking center of the space station.
Actions
You only have 3 actions: go left, do nothing, go right.
Observations
Space capsule alignment with the space station. It is assumed the alignment can range from -2 m to 2 m, where 0 m means perfect alignment. We also assume that negative values describe a misalignment to the left and positive values represent a misalignment to the right.
Tasks
Create a new .py file called SpaceCapsuleEnvironment.py and build a model that keeps the space capsule in the safety range for as long as possible using SARSA algorithm.
1)[15 points] Create a SpaceEnv() class with the following function
[5 points]__init__(self), an initilization function that defines the initial values for the environment.
[5 points] step(self, action), a step function that provides the next observation (state) of the system, a reward, and stopping criteria (done) given an action.
[5 points] reset(self), a function to reset the values of the environment after each epsiode.
2)[5 points] Discretize the alignment range in 100 evenly spaced numbers over the interval -200 cm and 200 cm. You should use the function np.linspace. Also, update the function getState(observation).
3)[10 points] Run SARSA algorithm with the SpaceEnv() you've just created. You should save the plot containing the results.
4)[10 points] Change the learning rate parameter ALPHA to 0.2 and then to 0.5, and discuss your findings.

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image

Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image

Step: 3

blur-text-image

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

Concepts of Database Management

Authors: Philip J. Pratt, Joseph J. Adamski

7th edition

978-1111825911, 1111825912, 978-1133684374, 1133684378, 978-111182591

More Books

Students also viewed these Databases questions