Question
Deep Reinforcement Learning: Please answer all questions.
Assignment Problem Statement
Marks
Title: Propose a suitable title
Problem Statement: Define a problem statement of your own with a well-defined objective, gaming environment, and game controls. Mark
Concept Sketch: A pen-and-paper concept sketch illustrating the proposed gaming problem statement. Mark
Additional Information: Provide any necessary information assumed or considered for the game implementation.
Requirements and Deliverables:
Elaborate on how the described problem could be solved using a deep neural network, and explain the action plan for creating the gaming environment. Mark
Prepare a Colab notebook, with outputs saved, that satisfies the following requirements. The implementation should be in OpenAI Gym with Python. Develop a deep neural network architecture and training procedure that effectively learns the optimal policy for the spaceship to avoid collisions with asteroids and maximize its survival time in the game environment.
i. Environment Setup: Define the game environment, including the state space, action space, rewards, and terminal conditions. Mark
ii. Replay Buffer: Implement a replay buffer to store experiences (state, action, reward, next state, terminal flag). Mark
iii. Deep Q-Network Architecture: Design the neural network architecture for the DQN using Convolutional Neural Networks. The input to the network is the game state, and the output is the Q-values for each possible action. Marks
iv. Epsilon-Greedy Exploration: Implement an exploration strategy such as epsilon-greedy to balance exploration (trying new actions) and exploitation (using learned knowledge). Mark
v. Training Loop: Initialize the DQN and the target network (a separate network used to stabilize training). In each episode, reset the environment and observe the initial state. Marks
vi. Testing and Evaluation: After training, evaluate the DQN by running it in the environment without exploration (set epsilon to 0). Monitor metrics such as average reward per episode, survival time, etc., to assess the performance. Mark
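As one possible starting point for the environment setup and evaluation requirements, the sketch below defines a small grid world in which a ship on the bottom row dodges falling asteroids. The class name AsteroidDodgeEnv, the helper evaluate, the grid size, the spawn probability, and the reward values are all illustrative assumptions, not part of the assignment. To stay dependency-free it mirrors the older four-value reset()/step() interface of OpenAI Gym in plain Python; a real submission would presumably subclass gym.Env and declare observation and action spaces.

```python
import random


class AsteroidDodgeEnv:
    """Gym-style sketch: a ship on the bottom row dodges falling asteroids.

    Grid size, spawn probability, and rewards are illustrative assumptions.
    """

    ACTIONS = (-1, 0, 1)  # action 0 = move left, 1 = stay, 2 = move right

    def __init__(self, width=8, height=10, spawn_prob=0.3, max_steps=500, seed=None):
        self.width, self.height = width, height
        self.spawn_prob, self.max_steps = spawn_prob, max_steps
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        self.ship_x = self.width // 2
        self.asteroids = set()  # (row, col) pairs; row 0 is the top row
        self.steps = 0
        return self._observe()

    def _observe(self):
        # State: height x width grid with 0 = empty, 1 = asteroid, 2 = ship.
        grid = [[0] * self.width for _ in range(self.height)]
        for row, col in self.asteroids:
            grid[row][col] = 1
        grid[self.height - 1][self.ship_x] = 2
        return grid

    def step(self, action):
        # Move the ship, clamped to the grid edges.
        self.ship_x = min(self.width - 1, max(0, self.ship_x + self.ACTIONS[action]))
        # Asteroids fall one row; those leaving the grid disappear.
        self.asteroids = {(r + 1, c) for r, c in self.asteroids if r + 1 < self.height}
        if self.rng.random() < self.spawn_prob:
            self.asteroids.add((0, self.rng.randrange(self.width)))
        self.steps += 1
        # Terminal conditions: collision, or the step limit is reached.
        crashed = (self.height - 1, self.ship_x) in self.asteroids
        done = crashed or self.steps >= self.max_steps
        reward = -10.0 if crashed else 1.0  # assumed reward shaping
        return self._observe(), reward, done, {"survival_time": self.steps}


def evaluate(env, policy, episodes=5):
    """Run episodes with a fixed policy (no exploration).

    Returns (mean reward per episode, mean survival time in steps).
    """
    total_reward, total_steps = 0.0, 0
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            state, reward, done, info = env.step(policy(state))
            total_reward += reward
        total_steps += info["survival_time"]
    return total_reward / episodes, total_steps / episodes
```

For example, evaluate(AsteroidDodgeEnv(seed=0), lambda state: 1) scores a ship that never moves. After training, the policy argument would instead return the action with the highest predicted Q-value for the given state.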
Please provide the complete code-based solution.
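For the replay buffer and epsilon-greedy requirements, a minimal sketch using only the Python standard library might look like the following. The names ReplayBuffer, epsilon_by_step, and select_action, the buffer capacity, and the linear decay schedule are assumptions chosen for illustration; the assignment does not prescribe them. In a full solution, q_values would come from the DQN's forward pass on the current state.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity=10_000):
        # A bounded deque drops the oldest experience once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement breaks the temporal
        # correlation between consecutive experiences.
        batch = random.sample(self.buffer, batch_size)
        # Transpose into (states, actions, rewards, next_states, dones).
        return tuple(zip(*batch))

    def __len__(self):
        return len(self.buffer)


def epsilon_by_step(step, eps_start=1.0, eps_end=0.05, decay_steps=10_000):
    """Linearly anneal epsilon from eps_start to eps_end (assumed schedule)."""
    fraction = min(1.0, step / decay_steps)
    return eps_start + fraction * (eps_end - eps_start)


def select_action(q_values, epsilon, rng=random):
    """Epsilon-greedy: random action with probability epsilon, else greedy."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore
    return max(range(len(q_values)), key=lambda a: q_values[a])  # exploit
```

Calling select_action(q_values, 0.0) always returns the greedy action, which is the setting the evaluation step asks for.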