Question

Posted on Sep 10, 2024

Each student is asked to implement an assembler for a simple assembly language.

1.1 Objectives

- Begin to build familiarity with Assembly opcodes
- Process input from Standard Input (STDIN)
- Send output to Standard Output (STDOUT)
- Implement decisions and branching in code

2 Detailed Description

2.1 Introduction

Programs run on computers by having the hardware (or system software) execute basic operations called instructions. Many languages (such as Java) represent the program to be executed as byte codes which are very similar to machine instructions. In this assignment, you will build an assembler for a simple assembly/machine language.

Assembly languages are, by design, very close to the underlying machine code. Historically, early programmers would have hand-assembled their programs into machine code. Each assembly language instruction is composed of an opcode (representing an operation to perform) and some number of operands to the operation. The opcode and each operand would be converted into the appropriate number of bits of data and entered into memory.

2.2 The Assembly Language

Our assembly language targets a simulated 16-bit computer whose memory contains up to 2^16 (65536) words, each 16 bits long.

In addition to memory, the computer has 16 registers that can be used to hold values, called R0-R15. There is also another register (PC - the program counter) and a set of CPU flags that store some state information about the program execution.

The language supports the following opcodes.

NOTE: unnecessary details of the CPU execution are omitted from this table.

2.3 Description of Opcodes

Instruction  Binary opcode  Description
ADD          0000           Add, sets the value of the first operand to the sum of the second and third
SUB          0001           Subtract, sets the value of the first operand to the difference of the second and third
AND          0010           Bitwise AND, sets the value of the first operand to the bitwise AND of the second and third
ORR          0011           Bitwise OR, sets the value of the first operand to the bitwise OR of the second and third
EOR          0100           Bitwise XOR, sets the value of the first operand to the bitwise exclusive OR of the second and third
LSL          0101           Logical Shift Left, sets the value of the first operand to the value of the second, shifted left by the amount indicated in the third
LSR          0110           Logical Shift Right, sets the value of the first operand to the value of the second, shifted right by the amount indicated in the third. Zeros are added to the most significant bits of the first operand as necessary
ASR          0111           Arithmetic Shift Right, sets the value of the first operand to the value of the second, shifted right by the amount indicated in the third. Either zero or one is added to the most significant bits of the first operand as necessary to maintain the sign of the resulting value
LDR          1000           Load, sets the value of the first operand from the memory location specified by the second
STR          1001           Store, places the value stored in the first operand into a memory location specified by the second
CMP          1010           Compare, compares the values in the operands and sets CPU flags accordingly
MOV          1011           Move, sets the value of the first operand to the signed 8-bit value from the second operand
B            1100           Branch, changes the current program counter by adding the signed 12-bit operand to it
BEQ          1101           Branch if Equal, conditionally branches based on CPU flags
END          1110           Terminates CPU execution, denotes the end of a program
NOP          1111           No operation, this instruction does nothing
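Because the opcode values run consecutively from 0000 to 1111, the mapping from mnemonic to opcode can be held in a 16-entry array indexed by opcode value. The following is a minimal sketch of that idea, not the required design. The names MNEMONICS and lookup_opcode are hypothetical, and the placement of B at 1100 is taken from the worked example in section 2.4.3 (B 512 assembles to C200).

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Mnemonics indexed by opcode value, in the order given in the opcode table.
 * The array name is an assumption made for this sketch. */
static const char *const MNEMONICS[16] = {
    "ADD", "SUB", "AND", "ORR", "EOR", "LSL", "LSR", "ASR",
    "LDR", "STR", "CMP", "MOV", "B",   "BEQ", "END", "NOP"
};

/* Hypothetical helper: return the 4-bit opcode for a mnemonic written in
 * upper or lower case, or -1 if the word is not a known mnemonic. */
int lookup_opcode(const char *word)
{
    char upper[8];
    size_t len = strlen(word);

    if (len == 0 || len >= sizeof upper) {
        return -1;
    }
    /* Normalize to upper case so "mov" and "MOV" compare equal. */
    for (size_t i = 0; i < len; i++) {
        upper[i] = (char)toupper((unsigned char)word[i]);
    }
    upper[len] = '\0';

    for (int op = 0; op < 16; op++) {
        if (strcmp(upper, MNEMONICS[op]) == 0) {
            return op;
        }
    }
    return -1;
}
```

A linear search over 16 entries is enough here. Normalizing the case once up front keeps the comparison itself simple.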

2.4 What to do

2.4.1 Input Format

Create a C program called "asm" that acts as an assembler for a simple language. Your program should accept upper or lowercase input from STDIN (the default input location in C). The input stream will consist of a series of lines of code, each of which is one of the following:

- a blank line, containing just the newline character (\n)
- a comment line, which is any line with a pound character (#) as the initial character
- an instruction, consisting of an opcode followed by 0-3 operands

So a program listing in the assembly language might have the following (abstract) format:

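    # Comments have a pound-character as the first character of the line
    # Instructions contain from 0-3 operands
    <opcode> [<operand>] [<operand>] [<operand>]

    # Blank lines are allowed
    <opcode> [<operand>] [<operand>] [<operand>]

    <opcode> [<operand>] [<operand>] [<operand>]
    <opcode> [<operand>] [<operand>] [<operand>]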

NOTE: You can assume that the maximum program length is 1024 instructions.
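The three kinds of input line can be told apart before any parsing is attempted. Here is a minimal sketch of that decision. The enum and the function name classify_line are hypothetical, and treating a line that contains only spaces or tabs as blank is an assumption that goes slightly beyond the statement above.

```c
#include <ctype.h>

/* Hypothetical labels for the three kinds of input line. */
enum line_kind { LINE_BLANK, LINE_COMMENT, LINE_INSTRUCTION };

/* Classify one line of input (with or without its trailing newline). */
enum line_kind classify_line(const char *line)
{
    /* A comment has '#' as its very first character. */
    if (line[0] == '#') {
        return LINE_COMMENT;
    }
    /* Any non-whitespace character means there is an instruction to parse. */
    for (const char *p = line; *p != '\0'; p++) {
        if (!isspace((unsigned char)*p)) {
            return LINE_INSTRUCTION;
        }
    }
    /* Only whitespace (assumption: this counts as a blank line). */
    return LINE_BLANK;
}
```

Only lines classified as LINE_INSTRUCTION would produce output, so a counter of such lines should never exceed the stated maximum of 1024.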

EOF is signaled on Unix systems by typing Ctrl-D. All integer values in this project are to be input and output in hexadecimal format (e.g. 0x14 is the hexadecimal representation of the decimal number 20). The only exception is register names, which are specified as the letter R followed by a decimal integer (e.g. R15 is the 16th register, since we start counting at 0).
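Operand tokens therefore come in two shapes: register names and integer constants. The sketch below shows one way to parse each. The function names are hypothetical. Accepting a bracketed register such as [R10] is inferred from the LDR example in section 2.4.3. Parsing constants with strtol in base 0 is an assumption: it accepts 0x-prefixed hexadecimal, but it also accepts plain decimal, which the examples in section 2.4.3 (512 and -1) appear to use, and it reads a leading 0 as octal.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical helper: parse "R0".."R15" (either case), optionally written
 * in brackets as "[R10]". Returns the register number, or -1 on error. */
int parse_register(const char *tok)
{
    char *end;
    long n;
    int bracketed = (*tok == '[');

    if (bracketed) {
        tok++;
    }
    if (toupper((unsigned char)*tok) != 'R') {
        return -1;
    }
    tok++;
    if (!isdigit((unsigned char)*tok)) {
        return -1;
    }
    n = strtol(tok, &end, 10);   /* register numbers are decimal */
    if (bracketed) {
        if (*end != ']') {
            return -1;
        }
        end++;
    }
    if (*end != '\0' || n < 0 || n > 15) {
        return -1;
    }
    return (int)n;
}

/* Hypothetical helper: parse an integer constant into *out.
 * Returns 1 on success and 0 if the token is not a number. */
int parse_constant(const char *tok, long *out)
{
    char *end;
    long v = strtol(tok, &end, 0);   /* base 0: "0x14" is hex, "512" is decimal */

    if (end == tok || *end != '\0') {
        return 0;
    }
    *out = v;
    return 1;
}
```

If the grader expects every constant to be read as hexadecimal regardless of prefix, the base argument would be 16 instead of 0.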

2.4.2 Output Format

The output of your program will be a sequence of hexadecimal digits representing the machine code for our simulated CPU. Since each instruction is represented by a 16-bit value, each instruction line can be represented as four hexadecimal digits. So, the output should contain exactly four times as many hexadecimal digits as there are instruction lines in the input. For example, if the input is 100 instruction lines long, the output should consist of exactly 400 hexadecimal digits.
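Producing exactly four digits per instruction means small values must be zero-padded. A minimal sketch using the standard "%04X" conversion follows. The function name format_word is hypothetical. Uppercase digits are chosen to match the example machine code in section 2.4.3, and whether any separator should appear between words is not stated here, so none is written.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: write a 16-bit word into buf as exactly four
 * uppercase hexadecimal digits plus a terminating NUL (5 bytes total). */
void format_word(uint16_t word, char buf[5])
{
    /* %04X pads with leading zeros to a width of four digits. */
    snprintf(buf, 5, "%04X", (unsigned)word);
}
```

The buffer can then be sent to STDOUT with fputs(buf, stdout), or the same format string can be passed to printf directly.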

2.4.3 Converting instructions

Your assembler should take each line of input and convert it into our simulated machine code. Each instruction should be converted into a 16-bit value representing the opcode and any operands. Each instruction should take exactly 16 bits (padding instructions with 0 as necessary). The format for each instruction will partially depend on the number and type of its operands.

- The first 4 bits of each instruction should be the value of the opcode.
- Registers in an instruction should each be represented by a 4-bit value indicating the register number (R0 == 0000; R15 == 1111).
- For instructions with a single constant as the operand, the next 12 bits are used to encode the operand as a two's complement integer (-2048 to 2047).
- The MOV instruction, which takes a register and a constant value, should use 4 bits to indicate the register and 8 bits to indicate an 8-bit two's complement integer (-128 to 127).
- Any unused bits in an instruction should be cleared (i.e. set to 0).

Example:

Assembly Instruction  Machine Code (Hex)
ADD R1 R1 R2          0112
LDR R4 [R10]          84A0
CMP R0 R12            A0C0
MOV R15 -1            BFFF
B 512                 C200
END                   E000
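The encoding rules above reduce to shifting each field into place and masking it to its width. The following sketch shows the three layouts as separate functions. The function names are hypothetical, and range checking of constants is deliberately left out. Treating BEQ like B (a single 12-bit constant) would be an assumption, since its operand format is not spelled out.

```c
#include <stdint.h>

/* Layout: opcode (4 bits) | reg (4) | reg (4) | reg (4).
 * Unused register fields are passed as 0 so their bits stay cleared. */
uint16_t encode_registers(unsigned opcode, unsigned r1, unsigned r2, unsigned r3)
{
    return (uint16_t)(((opcode & 0xFu) << 12) | ((r1 & 0xFu) << 8) |
                      ((r2 & 0xFu) << 4) | (r3 & 0xFu));
}

/* Layout: opcode (4 bits) | 12-bit two's complement constant.
 * Converting to unsigned and masking keeps the low 12 bits, which is the
 * two's complement encoding for values in the range -2048 to 2047. */
uint16_t encode_constant12(unsigned opcode, int value)
{
    return (uint16_t)(((opcode & 0xFu) << 12) | ((unsigned)value & 0xFFFu));
}

/* Layout for MOV: opcode (4 bits) | reg (4) | 8-bit two's complement constant. */
uint16_t encode_mov(unsigned opcode, unsigned reg, int value)
{
    return (uint16_t)(((opcode & 0xFu) << 12) | ((reg & 0xFu) << 8) |
                      ((unsigned)value & 0xFFu));
}
```

Checked against the example table: encode_registers(0x0, 1, 1, 2) gives 0x0112, encode_registers(0x8, 4, 10, 0) gives 0x84A0, encode_mov(0xB, 15, -1) gives 0xBFFF, and encode_constant12(0xC, 512) gives 0xC200.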

Use good C programming style (style and documentation account for 10% of your project grade). Refer to the posted style guide for tips on C programming style.
