Question

Posted on Aug 09, 2024


Deep Reinforcement Learning

Please answer all questions.

Assignment 01

Problem Statement [13 Marks]

Title: Propose a suitable title.

Problem Statement: Define a problem statement of your own with a well-defined objective, gaming environment, and game controls. [1 Mark]

Concept Sketch: A pen-and-paper-based game concept sketch to illustrate the proposed gaming problem statement. [1 Mark]

Additional Information: Provide any necessary information assumed/considered for the game implementation.

Requirements and Deliverables:

Elaborate on how the described problem could be solved using a deep neural network, and explain the action plan to create a gaming environment. [1 Mark]
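One way to frame the deep-neural-network part is the standard DQN formulation, sketched here as an assumption since the assignment gives no formulas. A network with parameters θ approximates the action values Q(s, a; θ) and is regressed toward a bootstrapped target y computed with a periodically copied target network θ⁻. The symbols γ (discount factor), d (terminal flag, 1 if the episode ended), and D (the replay buffer) are notation introduced for this sketch.

```latex
y = r + \gamma \,(1 - d)\, \max_{a'} Q(s', a'; \theta^{-}), \qquad
L(\theta) = \mathbb{E}_{(s, a, r, s', d) \sim \mathcal{D}}
            \left[ \bigl( y - Q(s, a; \theta) \bigr)^{2} \right]
```

The factor (1 - d) removes the bootstrap term when the spaceship has collided, so a terminal transition contributes only its immediate reward.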

Prepare a Colab notebook with saved outputs that satisfies the following requirements. The implementation should be in OpenAI Gym with Python. Develop a deep neural network architecture and training procedure that effectively learns the optimal policy for the spaceship to avoid collisions with asteroids and maximize its survival time in the game environment.

i. Environment Setup: Define the game environment, including the state space, action space, rewards, and terminal conditions. [1.5 Marks]

ii. Replay Buffer: Implement a replay buffer to store experiences (state, action, reward, next state, terminal flag). [1.5 Marks]

iii. Deep Q-Network Architecture: Design the neural network architecture for the DQN using Convolutional Neural Networks. The input to the network is the game state, and the output is the Q-values for each possible action. [2 Marks]

iv. Epsilon-Greedy Exploration: Implement an exploration strategy such as epsilon-greedy to balance exploration (trying new actions) and exploitation (using learned knowledge). [1 Mark]

v. Training Loop: Initialize the DQN and the target network (a separate network used to stabilize training). In each episode, reset the environment and observe the initial state. [2 Marks]

vi. Testing and Evaluation: After training, evaluate the DQN by running it in the environment without exploration (set epsilon to 0). Monitor metrics such as average reward per episode, survival time, etc., to assess the performance. [2 Marks]
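The environment, replay buffer, and epsilon-greedy requirements above can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference solution: the class name `AsteroidDodgeEnv`, the grid size, the reward values (+1 per step survived, -10 on collision), and the spawn probability are all hypothetical choices. The class mirrors the classic Gym `reset`/`step` interface without importing the package so that it runs anywhere; note that newer Gymnasium releases return `(obs, reward, terminated, truncated, info)` from `step` rather than the four values used here.

```python
import random
from collections import deque, namedtuple

# Hypothetical grid world: all sizes, rewards, and names are assumptions.
WIDTH, HEIGHT = 5, 6
ACTIONS = (-1, 0, 1)  # move left, stay, move right

Transition = namedtuple("Transition", "state action reward next_state done")


class AsteroidDodgeEnv:
    """Gym-style interface (reset/step) without depending on the gym package."""

    def __init__(self, spawn_prob=0.3, max_steps=200, seed=None):
        self.rng = random.Random(seed)
        self.spawn_prob = spawn_prob
        self.max_steps = max_steps

    def reset(self):
        self.ship_x = WIDTH // 2
        self.asteroids = set()  # (x, y) cells; y grows toward the ship row
        self.steps = 0
        return self._observe()

    def _observe(self):
        # State: HEIGHT x WIDTH grid, 1 = asteroid, 2 = ship (bottom row).
        grid = [[0] * WIDTH for _ in range(HEIGHT)]
        for x, y in self.asteroids:
            grid[y][x] = 1
        grid[HEIGHT - 1][self.ship_x] = 2
        return tuple(tuple(row) for row in grid)

    def step(self, action):
        # Move the ship, clamped to the grid edges.
        self.ship_x = min(WIDTH - 1, max(0, self.ship_x + ACTIONS[action]))
        # Asteroids fall one row; those leaving the grid disappear.
        self.asteroids = {(x, y + 1) for x, y in self.asteroids if y + 1 < HEIGHT}
        if self.rng.random() < self.spawn_prob:
            self.asteroids.add((self.rng.randrange(WIDTH), 0))
        self.steps += 1
        crashed = (self.ship_x, HEIGHT - 1) in self.asteroids
        done = crashed or self.steps >= self.max_steps  # terminal conditions
        reward = -10.0 if crashed else 1.0
        return self._observe(), reward, done, {}


class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next state, terminal flag)."""

    def __init__(self, capacity):
        self.memory = deque(maxlen=capacity)  # oldest entries are evicted first

    def push(self, *args):
        self.memory.append(Transition(*args))

    def sample(self, batch_size):
        return random.sample(list(self.memory), batch_size)

    def __len__(self):
        return len(self.memory)


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the argmax action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)
```

The grid observation is image-like on purpose, so it could be fed to a convolutional network as a single-channel input; `q_values` would then be that network's output for the current state.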

Please provide the complete code-based solution.
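As a starting point rather than a complete solution, the control flow of the training loop and the evaluation step might look like the sketch below. To stay runnable without a deep learning library, it swaps the CNN for a plain lookup table: `q` plays the role of the online network and `target` the role of the target network. The three-lane toy environment, the hyperparameters, and every function name are assumptions made for illustration; a real submission would replace the table update with a gradient step on a convolutional network.

```python
import random
from collections import defaultdict, deque

# Toy stand-in environment (assumption): 3 lanes, one asteroid approaching.
LANES, START_DIST, MAX_STEPS = 3, 3, 50
MOVES = (-1, 0, 1)  # left, stay, right


def reset(rng):
    return (1, rng.randrange(LANES), START_DIST)  # (ship lane, asteroid lane, distance)


def step(state, action, rng):
    ship, rock, dist = state
    ship = min(LANES - 1, max(0, ship + MOVES[action]))
    dist -= 1
    if dist == 0:
        if ship == rock:
            return (ship, rock, 0), -10.0, True  # collision ends the episode
        rock, dist = rng.randrange(LANES), START_DIST  # next asteroid appears
    return (ship, rock, dist), 1.0, False  # +1 for each step survived


def greedy(q, state):
    return max(range(len(MOVES)), key=lambda a: q[state][a])


def new_q():
    return defaultdict(lambda: [0.0] * len(MOVES))


def train(episodes=300, gamma=0.95, lr=0.1, batch=16, sync_every=100, seed=0):
    rng = random.Random(seed)
    q = new_q()       # online Q (a table here; the CNN in a real DQN)
    target = new_q()  # target Q, refreshed only every `sync_every` updates
    buffer = deque(maxlen=5000)
    eps, updates = 1.0, 0
    for _ in range(episodes):
        state, done, t = reset(rng), False, 0
        while not done and t < MAX_STEPS:
            # Epsilon-greedy action selection.
            if rng.random() < eps:
                action = rng.randrange(len(MOVES))
            else:
                action = greedy(q, state)
            nxt, reward, done = step(state, action, rng)
            buffer.append((state, action, reward, nxt, done))
            state, t = nxt, t + 1
            if len(buffer) >= batch:
                for s, a, r, s2, d in rng.sample(list(buffer), batch):
                    # TD target from the frozen target table; no bootstrap when terminal.
                    y = r if d else r + gamma * max(target[s2])
                    q[s][a] += lr * (y - q[s][a])
                updates += 1
                if updates % sync_every == 0:
                    target = new_q()
                    target.update({k: list(v) for k, v in q.items()})
        eps = max(0.05, eps * 0.99)  # decay exploration once per episode
    return q


def evaluate(q, episodes=20, seed=1):
    """Run greedy episodes (epsilon = 0); return (mean reward, mean survival steps)."""
    rng = random.Random(seed)
    total_reward, total_steps = 0.0, 0
    for _ in range(episodes):
        state, done, t = reset(rng), False, 0
        while not done and t < MAX_STEPS:
            state, reward, done = step(state, greedy(q, state), rng)
            total_reward += reward
            t += 1
        total_steps += t
    return total_reward / episodes, total_steps / episodes


if __name__ == "__main__":
    mean_reward, mean_steps = evaluate(train())
    print(f"mean reward {mean_reward:.1f}, mean survival {mean_steps:.1f} steps")
```

Hitting the `MAX_STEPS` cap is deliberately not stored as a terminal transition, because the episode was cut short rather than ended by a collision, so the target still bootstraps from the next state in that case.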


