Answered step by step
Verified Expert Solution
Link Copied!

Question

1 Approved Answer

Write python code to navigate in a maze, make robot to receive (-10) negative score if move into one of the bold colored trap, in

image text in transcribed

Write python code to navigate in a maze, make robot to receive (-10) negative score if move into one of the bold colored trap, in case the robot move in the grid#12 it will receive score(+20), robot received score (0) if not grid (#12, #7, #14). Make the robot to learn how to move from grid (#1) to grid (#12) and collect max score. let also the robot to have probability of 10% to move in grid #7, 10% to grid #5 and 80% to grid #10

In this project, you are required to develop a Python program using the reinforcement learning algorithm (Q-learning) to let a robot learn how to navigate in a maze. You may use ROS to show yuor simulation result to get bonus points, but it is not required In particular, the project requirements are below: 1. In the first line of your Python code, use a comment line to show all group members' names. 2. The 4x4 maze has 16 grids, numbered from =1 to =16. 3. The robot can move left, right, up, or down but not diagonally. 4. If the robot hits the wall, it will return to its current grid. 5. There are two traps in the maze. If the robot moves in one of them, it will receive a negative score (-10). 6. If the robot moves in the goal (Grid =12), it will receive a positive score (+20). 7. If the robot move in other grids (not Grid =12,=7 or =14 ), it will receive a score of 0 . 8. The robot needs to learn how to move from grid =1 (the start position) to grid =12 (Goal) and collect the maximum score. 9. A probability state transition function is assumed in this project. For example, if the robot is located in Grid =6 currently and the "move down" action is selected, after the action is executed, the robot has a probability of 80% to move in Grid =10; has a probability of 10% to move in Grid =5, and has a probability of 10% to move in Grid =7. 10. After the robot is trained for 5,000 steps, your Python program prints the path of the robot moving from Grid =1 to Grid =12. Meanwhile, it displays the curve of the history of four Q values (Q(9,1),Q(9,2),Q(9,3) and Q(9,4))

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image

Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image

Step: 3

blur-text-image

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

Database Design Application Development And Administration

Authors: Mannino Michael

5th Edition

0983332401, 978-0983332404

More Books

Students also viewed these Databases questions