Answered step by step
Verified Expert Solution
Question
1 Approved Answer
I need help in this program. I don't understand how I should start working on this. 2 Machine details and background concepts This section explains
I need help in this program. I don't understand how I should start working on this.
2 Machine details and background concepts This section explains many background concepts, while the following one describes the functions to be written. 2.1 Hardware concepts For a program written in any computer language to be executed on a particular machine it must be translated to the instructions that the machine's processor can execute. These instructions are referred to as machine instructions, but we usually just say instructions. Instructions manipulate information in memory and also in the processor's registers. Memory is used to hold data, such as variables, and programs themselves are also stored in memory in modern systems. Registers are locations that store values like memory except they are internal to the CPU, so they are accessed much more quickly. Registers temporarily store values that instructions operate upon, as well as the results of instructions. The fundamental operation of any CPU is to get an instruction from memory, figure out what it says to do, perform that operation, and go on to get the next instruction; this is called the fetch-decode=execute cycle. (This is discussed in more detail in the Bryant & O'Hallaron text.) Although the instruction sets of different processors vary widely, instructions can be categorized by functionality, for example: computation instructions: These perform various arithmetic and logical operations. flow of control instructions: By default instructions are executed sequentially in memory, but there are instructions that affect which instruction is executed next, for implementing conditionals and repetition, as well as function calls. data movement instructions: These transfer data values between memory and registers, between different registers, and sometimes between different memory locations. invoking the operating system: These instructions are used to do things like halt a program or to access parts of the hardware that user programs are not allowed to directly access, for example, I/O devices. Some instructions allow user programs to make system calls to get the operating system to perform these types of tasks for them. 2.2 Machine specifications The hypothetical N2 processor has a 32-bit (4 byte) word length, meaning that instructions, registers, and memory words are all 32 bits. (As mentioned in lecture, processors manipulate information in multibyte quantities called words.) Machines that use the N2 CPU are quite small, with very limited system resources. They have only 32768 bytes of memory, grouped as 8192 four-byte words (8192 x 4 = 32768). The first byte in memory has address 0. Memory addresses always refer to bytes. Memory is only word-addressable, meaning that although each byte in memory has its own unique address, the CPU can only access memory in four-byte quantities, using the address of the first (or low-order) byte of a four-byte word. Consequently, the addresses that can be used to access memory are 0, 4, 8, etc., up to 32764. The value of a memory word can be interpreted as an instruction and executed, or treated as data. The N2 processor has 27 different instructions. It has seven registers, each of which as mentioned holds 32 bits. These are referred to using the names R through R6. The N2 has what is called a load/store architecture, which means that only some data movement instructions access memory. One of them loads a value from a memory location into a register and another one stores a value from a register into a memory location, while all other instructions, including computations, operate upon values in registers and put their result into a register. So computation typically involves loading operands from memory into registers, doing calculations with them, and storing the results from registers back into memory. One N2 register, R6, has a special purpose. It is the program counter, which always contains the memory address of the next instruction to be fetched and executed. A flow of control instruction can modify the program counter to cause execution to inmn to somewhere else in a program All other instructions cause the program counter to be incremented Onc N2 register, R6, has a special purposc. It is thc program counter, which always contains the memory address of the next instruction to be fetched and executed. A flow of control instruction can modify the program counter to cause execution to jump to somewhere else in a program. All other instructions cause the program counter to be incremented by 4. So unless specified by a flow of control instruction, the processor executes instructions sequentially in memory. R6 cannot be directly modified, for example, by storing the result of a computation in it. (Instructions can use the value in R6, but not directly modify it.) The other registers may be used or modified by the various instructions. 2.3 Hardware, components of instructions, and instruction format Since the N2s processor has a 32-bit word length, all instructions, just like all data, must fit into a 32-bit word. When a 32-bit word is interpreted as an instruction, it is considered to consist of fields representing(a) an "opcode, i.e., operation of the instruction, (b) an extension, whose use will be described below, (c) up to three registers used as operands by some instructions, and (d) a memory address or constant/immediate field that holds either a memory address or a constant (literal) numeric operand for some instructions. The instruction format on the N2 processor is as follows: 5 bits 3 bits 3 bits 3 bits 3 bits 15 bits opcode extension register register2 register memory address or constant/literal value The fields have the following uses and functionality: opcode: An opcode uniquely identifies the operation that an instruction performs. Short names are used in assembly language to refer to each type of instruction. (Recall that assembly instructions are just human-readable versions of machine instructions.) Since the N2 has 27 different opcodes, 5 bits are required to represent an opcode. Since 5 bits can store 32 different values but the N2 only has 27 opcodes, five five-bit values do not correspond to valid opcodes. regi or 2.4 Names, operands, and effects of the machine instructions You need to have a basic understanding of the effects of the N2s machine instructions to write the project, so after the table below, which indicates which operands are used by each instruction, their effects are briefly described. Each instruction is listed with the (decimal) number between 0 and 26 that represents its opcode, according to the enum Op_code defined in machine.h. (To save space on the page the table is in two columns="val is the decimal value of the instruction opcodes, "ext refers to the extension field of instructions, regi, reg2, and regz are the three register operand fields, and addr./imm." refers to the memory address/immediate operand field.) A checkmark means that an instruction uses that field. opcode val ext reg2 reg3 addr./imm. opcode val ext regi reg2 reg3 addr./imm. plus 0 14 minus 1 not 15 times 2 eq 16 div 3 neq 17 mod 4 le 18 neg 5 lt 19 abs ge 20 shl 7 gt 21 shr 8 22 band lw 23 bor 10 SW 24 bxor 11 li 25 bneg 12 sys 26 and 13 6 move 9 Note that the sys instruction uses the first register operand only when the value of the extension field is 0...3, not 4. Also note that different instructions use between 11 bits and 26 bits of their word (there are no instructions that use all 32 bits). Unuscd ficlds may just contain any bits at all. Also note that not cvery 32-bit word represents a valid N2 instruction. The plus instruction adds the contents of register and registerz and puts the result in register]. Similarly, minus does subtraction (computes the difference of register2 and register3), times does multiplication, div performs division, and mod extension: One instruction on the N2 has five different variants. For this instruction, a value in the extension field is used to indicate which variant an instruction represents. Not all values that can be stored in 3 bits represent valid extensions. register1, register2, registerz: As Section 2.4 says, some instructions operate upon one register, others operate on two registers, and some have three register operands. These three fields of instructions contain numbers indicating the registers that instructions use as operands. Since the N2 CPU has 7 registers, 3 bits can indicate any of them. Since 3 bits can store 8 different values but the N2 only has 7 registers, there is one value that can be in a register field that does not correspond to a valid register. address or constant: Some of the N2s instructions have the address of a memory location as an operand. For example, an instruction may copy the value in memory location 216 to one of the registers, so 216 would be stored in this field of the instruction. One instruction (li) has a constant or literal value, typically called an immediate in assembly language, as an operand. For example, an instruction may store the numeric value 250 in one of the registers, so 250 itself would be encoded in the instruction. This field is 15 bits, which is why the N2 has 32768 bytes of memory (with addresses 0...32767), because 215 32768. This also means that the N2 CPU only has 32768 different literal (constant) values that can appear in the li instruction, although values up to 4,294,967,295 can be stored in a 32-bit word. For example, say an instruction has opcode value 3 and is an instruction that uses all three register operands, and this particular instruction is going to operate upon registers R1, R2, and R4. Also suppose this instruction doesn't use the extension field or the address/constant field, and just has zeros in those 18 bits. Then this instruction would be represented in a 32-bit memory word as 0x182a000016, which is 40540569610. (Suggestion-convert the hexadecimal value to binary and match the bits against the diagram of the instruction format above.) Or suppose an instruction has opcode value 24 and uses one register operand, which is R5, and uses the memory address field to store 21610. If it also had Os in all of the unused fields it would be represented in a memory word as 0xc0a000d816 (which is 323171144810). computes the remainder of a division (modulus). These all apply the operation to their second and third register operands and put the result in their first register operand. In fact, all instructions that modify a register store their result into their first register operand, so this is not repeated below. Since a register can only store one value at a time, any instruction that modifies its first register operand's value has the effect of overwriting the previous value in that register. For instance, the instruction times R4 R3 R5 multiplies the values stored in registers R3 and R5, and puts the result in register R4. And minus R5 R2 R4 subtracts the contents of R4 from the contents of R2 and puts the result into R5. The neg instruction negates the value in its second register operand (if it is positive it becomes negative, and vice versa), and abs computes the absolute value of its second operand. For instance, if register R1 contains 216, then after neg R5 R1, the value in R5 would be -216. The shl instruction shifts a value to the left and shr shifts it to the right; these both shift their second operand's value by the number of bits stored in their third register operand. band, bor, bxor, and bneg perform bitwise and, bitwise or, bitwise exclusive or, and bitwise negation. The first three apply the operation to their second and third register operands while bneg is applied to its second register operand. and performs logical conjunction, or performs logical disjunction, and not performs logical negation. The first two of these operate upon their second and third register operands while not is applied to its second register operand. eq, neq, le, lt, ge, and gt compare their two register operands for equality, inequality, less than or equal, less than, greater than or equal, and greater than, and allow execution to go to another location in a program. If the relationship is true about the values in their register operands then execution jumps to the memory address in the address/constant field and continues from there. If not, execution just drops down to continue with the next sequential instruction in memory instead. move copies its second register operand's value to its first register operand. lw and sw are the only instructions that access or modify values in memory locations. lw copies a value from a memory location to a register. It uses the first register operand and a memory address, and its effect is to copy the value of the word in memory that has that address into the register specified by the register operand. For example, lw R3 200 would load the value in memory location 200 into register R3. (Assuming 200 is in decimal, this instruction would be stored in a register or memory word as Oxb86000c816, which is 309329940010.) sw does the opposite, copying a value from a register to the indicated memory location. li loads a constant or literal value (li stands for load immediate) into a register. For instance, li R4 236 would load the numeric value 236 (not the contents of memory location 236) into register number R4. (Assuming 236 is in decimal, this instruction would be stored in a word or register as 0xc08000ec16, which is 322961431610-) Lastly, sys allows a program to invoke functionality in the operating system, which as mentioned is referred to as a system call. The N2 operating system has five system calls, differentiated by the value of the extension field. System call O reads an integer, system call 1 writes an integer, system calls 2 and 3 read and write a character respectively, and system call 4 halts the executing program. These are indicated by the values 0 through 4 respectively in the extension field. The first four use the first register operand, while system call 4 (to halt a program) has no register operands. System calls 0 and 2, which read an integer and a character, modify their first register operand because whatever is read is stored there. System calls 1 and 3 print the value in their first register operand without modifying the register. All of the instructions are computation instructions except move, li, lw, and sw, which are data movement instructions, endog nogle + o andat which are flow of control instructions eve involves the onerating Wotem System calls 1 and 3 print the value in their first register operand without modifying the register. All of the instructions are computation instructions except move, li, lw, and sw, which are data movement instructions, and eq, neq, le, lt, ge, and gt, which are flow of control instructions. sys invokes the operating system. The same register may be used for multiple operands of an instruction, such as in add R3 R2 R3 or add R1 R1 R1. Any operand fields that aren't indicated in the table above are ignored by that instruction. For example, move has two register operands, therefore they will be contained in the first two register fields registerand register2; the extension, the third register field register3, and the address or immediate field are all not used by a move instruction. Of the instructions that use the address or constant field, only li uses that field to store an immediate (i.e., constant) operand; all others use that field to store a memory address. 3 Functions to be written The header file machine.h contains definitions and the prototypes of the four functions you must write. The functions use a typedef name Wrd to represent a N2 word. Wrd is an unsigned int, since on the Grace machines an int is four hytes. 3.1 unsigned short print_instruction(Wrd instruction) This function should interpret its parameter as a N2 instruction, use the bit operators to extract its fields (see the description and diagram in Section 2.4), and print the instruction's components on a single output line. However, if the instruction is invalid according to the criteria below, nothing at all should be printed and the function should just return 0. Otherwise, if the instruction is valid, the function should return 1 after printing the instruction, in this format: The instruction's opcode (the opcode field of the parameter, meaning its leftmost five bits) should be printed using its name in the table in Section 2.4 (plus, minus, times, div, etc.), spelled exactly as shown there. For example, if the value of the opcode field is the enum value PLUS (which has the value 0), then plus should be printed. If the opcode field is MINUS then minus should be printed, etc. Following the opcode, the extension and register operands that are actually used by the instruction should be printed, with the extension (if it is used) followed by the register operands that are used, in the order register), register2, and register3. For example, a not instruction uses the first two registers as operands, so those two should be printed, with the first register operand followed by the second one. A sys instruction uses the extension (usually- see the next sentence) and the first register as operands, so the extension should be printed first, then the register operand. (If the value of the extension field is 4 in a sys instruction then the register field should not be printed, because the system call to halt a program does not have a register operand.) Register names should be printed in decimal with an R immediately preceding the register number. For example, registers 0, 1, and 3 would be printed as RO, R1, and R3. For sys, the only instruction that uses the extension, the extension should just be printed as a single decimal digit between 0 and 4. If an instruction uses a memory address operand or an immediate operand that operand should be printed last, in decimal. If it is a memory address it should be printed using exactly five places, with addresses less than 1000010 printed using as many leading zeros as necessary so they occupy exactly four places. For example, the address 21610 should be printed as 00216. (This can trivially be performed with printf() formatting options covered in discussion.) Note that immediate operands li instructions should not be printed with any leading zeros, except zero itself should be printed as 0. Only memory address operands should have leading Os if needed. The printed fields must be separated with one or more whitespace characters (meaning tabs or spaces); the exact number of whitespace characters is up to you. The printed instruction should not have any whitespace before the opcode or following the last field printed. A newline should not be printed after the instruction. For example, the call print_instruction (0x38710000) should print something like shl R3 R4 R2, where the exact number of spaces or tabs separating the four things printed is up to you (one or more). As mentioned before, some values that can be stored in a 32-bit N2 word or register don't represent valid N2 instruc- tions, and if its parameter represents an invalid instruction this function should just return 0 without printing anything. An instruction would be invalid in any of these cases: The value in its opcode field is invalid. There are only 27 instructions (hence opcodes) on the N2 (with enum values between PLUS and SYS), so any value outside of that range doesn't represent an actual opcode. If a register operand that is actually used by the instruction has an invalid number. There are only seven registers If a register operand that is actually used by the instruction has an invalid number. There are only seven registers on the N2 with numbers 0-6 (and symbolic constant names R through R6 defined in machine.h), so the value 7 doesn't represent a valid register number. If it is an instruction that uses the address or constant field to store a memory address and the value of the field is not evenly divisible by 4. As will come up later, machines often have alignment requirements such that any data item must begin at a memory address that is a multiple of the size in bytes of that item. Since the N2 has this requirement, and everything on the N2 occupies a 4-byte word, everything must begin at an address that is divisible by 4. Note: if the parameters represent an instruction that uses the address or constant field for storing a memory address its value must be divisible by 4. But if the instruction is an li instruction, which is using the field to represent a immediate (constant) value, it does not have to be divisible by 4. If the parameters represent an instruction that has the effect of modifying (storing a new value into) its first register operand and reg1 is the special unmodifiable program counter register R6. The instructions that use and would modify the first register operand are all instructions except for eq, neq, le, lt, ge, gt, sw, as well as a sys instruction when the value of the extension field is 1, 3, or 4. (A sys instruction with extension 0 or 2 does modify its first register operand.) R6 can be used as the second or third register operand of any instructions that use two or three registers, but not as the first operand of instructions that would modify their first register operand's value. If the parameter represents a sys instruction and the extension field is anything other than 0 through 4 it is invalid. The N2 determines which operating system call should be invoked based on the value of the extension field, and there are only five possible system calls. Note: the values of fields that are not used by machine instructions have no effect on validity, and it's immaterial what they contain. For instance, a not instruction may have anything at all in its rightmost 18 bits (the third register operand and memory address fields) and in the 3 extension bits. As long as the opcode, register], and register2 are correct, a not instruction is valid. But instructions that do use fields, and have incorrect values in them, are invalid. Also note: in projects where output has to be produced, students sometimes lose credit for minor spelling or formatting mistakes. The submit server checks your output automatically, consequently even trivial typos would cause tests to fail. Due to the size of the course we will not be able to give back any credit lost for failed tests due to spelling or formatting errors, even if they are extremely minor, because that would necessitate manual checking of every output for hundreds of students' projects, which is not possible. Therefore it's essential for you to check carefully that your output is correct, register names and opcodes are spelled exactly, and that the output format described above is followed. 3.2 unsigned short disassemble(const Wrd program[], unsigned short num_instrs, unsigned short *const bad_instr) A disassembler converts machine language to assembly language. This function will do that for a N2 program stored in its first parameter program, which is an array of words representing N2 instructions. The parameter num_instrs gives the number of elements of the array (number of array elements, not number of bytes). The function's return value is described below. It should not produce any output at all (not even a newline) if either program or bad_instr are NULL, or if any of the program's instructions are invalid (see below). If the instructions in program are all valid the function should print them all, in this format (see some of the public test outputs for examples): Preceding each instruction in the array, its starting N2 memory address should be printed in hexadecimal using as many leading zeros as necessary so the address occupies exactly four places. A N2 program always begins at memory address 0. A colon and blank space should follow the address. In printing the memory address, remember that each array element corresponds to a four-byte word of the machine's memory. Each element of the array should be printed exactly how print_instruction () prints instructions, with each one followed by a newline (including the last one). If its parameters are valid and all the instructions in program are valid the function should return 1 after printing the array contents as described above. The function should not produce any output at all and just return 0 if its parameters are invalid, which would be when: The array memory is NULL, or the pointer bad_instr is NULL. num_instrs is larger than the number of 4-byte words in the N2 (meaning larger than the maximum number of 4-byte instructions that can be in memory; note there is a symbolic constant with this value in machine.h). The array memory is NULL, or the pointer bad_instr is NULL. num_instrs is larger than the number of 4-byte words in the N2 (meaning larger than the maximum number of 4-byte instructions that can be in memory; note there is a symbolic constant with this value in machine.h). Any of the instructions in the array program are invalid, according to the criteria described above in Section 3.1. In this case the function should also store the location (array index) of the first invalid instruction into the variable that bad_instr points to. (If the instructions are all valid the variable bad_instr points to should not be modified.) If the function's third parameter is non-NULL it may assume that it is a valid pointer that points to a Wrd (unsigned int) variable, in other words, it may assume that it is not an uninitialized or invalid pointer. This is something that the function cannot check, and is the caller's responsibility to ensure instead. 3.3 short where_set(const Wrd program[], unsigned short num_words, unsigned short reg_nbr) As you should know from CMSC 131 and 132, it is a syntax error in Java to use a local variable in a method before giving it a value. In C, a (nonstatic) local variable just has a garbage value if it is used before being set. It is commonly an indication of an error to use something in a program before it's given a value, and this could even apply in assembly programs to using the values of registers before putting a value in them. Some systems might initialize all registers to zero before a program starts, but in others, registers may have the equivalent of garbage values if they are used before being set. Since using anything before its given a value may be a bug, an assembly programmer might be helped if a tool could perform a similar type of analysis to what the Java compiler does, in detecting whether any register in an assembly program is used before it is given a value. This function and the next one will analyze an N2 program to do this. This function should return 1 if program is NULL, or if num_words is larger than the number of 4-byte words in the N2, or if reg_nbr is not a valid register number for the N2, or if register number reg_nbr is not given a value in any of the first num_words instructions of program (num_words indicates how many instructions of the array should be checked). Otherwise the function should return the index or position in the array program of the first instruction where register number reg_nbr is set or given a value. You should know from the explanation in Section 2.4 which instructions modify a register and which ones don't. For example, add modifies its first register operand but sw does not. (As mentioned, for the instructions that do modify a register, it is always the first register operand that is modified.) Notice that the sys instruction modifies its first register for two possible values of the extension (0 and 2), but not for the other three possible values. This function will always return -1 if reg_nbr is R6, since the program counter register can never be explicitly modified by any instructions. (This probably does not have to be handled as a special case.) 3.4 unsigned short is_safe(const Wrd program[], unsigned short pgm_size, unsigned short *const bad_instr) This function should return 1 if the program in program, which has pgm_size instructions, is valid. It should return 0 if program or bad_instr are NULL, or pgm_size is larger than the number of 4-byte words in the N2 or if the program is not safe. Safe means that none of the instructions in the array program are invalid, using the criteria described above in Section 3.1 and that every register used in instructions is set, or given a value, prior to the first instruction where it is used in the program. If the first and third parameters not NULL but the program is unsafe, meaning that there is some instruction that is using the value of a register that has not been given a value in any of the preceding instructions, then before returning O the function should also store the location of the first such instruction into the variable that bad_instr points to. (If the program is safe the function should not modify the variable that bad_instr points to.) What this function will have to do is to examine all of the instructions in the program, and determine which register operands each instruction uses the values of, based on its opcode. (Note this refers to which registers have their values used by instructions, not whether instructions modify their first register operand.) Then the function needs to see whether the register or registers that an instruction uses (if it is an instruction that uses the values of any registers) were set in any earlier instructions in the program. Of course your function where_set() can be used for this. 2 Machine details and background concepts This section explains many background concepts, while the following one describes the functions to be written. 2.1 Hardware concepts For a program written in any computer language to be executed on a particular machine it must be translated to the instructions that the machine's processor can execute. These instructions are referred to as machine instructions, but we usually just say instructions. Instructions manipulate information in memory and also in the processor's registers. Memory is used to hold data, such as variables, and programs themselves are also stored in memory in modern systems. Registers are locations that store values like memory except they are internal to the CPU, so they are accessed much more quickly. Registers temporarily store values that instructions operate upon, as well as the results of instructions. The fundamental operation of any CPU is to get an instruction from memory, figure out what it says to do, perform that operation, and go on to get the next instruction; this is called the fetch-decode=execute cycle. (This is discussed in more detail in the Bryant & O'Hallaron text.) Although the instruction sets of different processors vary widely, instructions can be categorized by functionality, for example: computation instructions: These perform various arithmetic and logical operations. flow of control instructions: By default instructions are executed sequentially in memory, but there are instructions that affect which instruction is executed next, for implementing conditionals and repetition, as well as function calls. data movement instructions: These transfer data values between memory and registers, between different registers, and sometimes between different memory locations. invoking the operating system: These instructions are used to do things like halt a program or to access parts of the hardware that user programs are not allowed to directly access, for example, I/O devices. Some instructions allow user programs to make system calls to get the operating system to perform these types of tasks for them. 2.2 Machine specifications The hypothetical N2 processor has a 32-bit (4 byte) word length, meaning that instructions, registers, and memory words are all 32 bits. (As mentioned in lecture, processors manipulate information in multibyte quantities called words.) Machines that use the N2 CPU are quite small, with very limited system resources. They have only 32768 bytes of memory, grouped as 8192 four-byte words (8192 x 4 = 32768). The first byte in memory has address 0. Memory addresses always refer to bytes. Memory is only word-addressable, meaning that although each byte in memory has its own unique address, the CPU can only access memory in four-byte quantities, using the address of the first (or low-order) byte of a four-byte word. Consequently, the addresses that can be used to access memory are 0, 4, 8, etc., up to 32764. The value of a memory word can be interpreted as an instruction and executed, or treated as data. The N2 processor has 27 different instructions. It has seven registers, each of which as mentioned holds 32 bits. These are referred to using the names R through R6. The N2 has what is called a load/store architecture, which means that only some data movement instructions access memory. One of them loads a value from a memory location into a register and another one stores a value from a register into a memory location, while all other instructions, including computations, operate upon values in registers and put their result into a register. So computation typically involves loading operands from memory into registers, doing calculations with them, and storing the results from registers back into memory. One N2 register, R6, has a special purpose. It is the program counter, which always contains the memory address of the next instruction to be fetched and executed. A flow of control instruction can modify the program counter to cause execution to inmn to somewhere else in a program All other instructions cause the program counter to be incremented Onc N2 register, R6, has a special purposc. It is thc program counter, which always contains the memory address of the next instruction to be fetched and executed. A flow of control instruction can modify the program counter to cause execution to jump to somewhere else in a program. All other instructions cause the program counter to be incremented by 4. So unless specified by a flow of control instruction, the processor executes instructions sequentially in memory. R6 cannot be directly modified, for example, by storing the result of a computation in it. (Instructions can use the value in R6, but not directly modify it.) The other registers may be used or modified by the various instructions. 2.3 Hardware, components of instructions, and instruction format Since the N2s processor has a 32-bit word length, all instructions, just like all data, must fit into a 32-bit word. When a 32-bit word is interpreted as an instruction, it is considered to consist of fields representing(a) an "opcode, i.e., operation of the instruction, (b) an extension, whose use will be described below, (c) up to three registers used as operands by some instructions, and (d) a memory address or constant/immediate field that holds either a memory address or a constant (literal) numeric operand for some instructions. The instruction format on the N2 processor is as follows: 5 bits 3 bits 3 bits 3 bits 3 bits 15 bits opcode extension register register2 register memory address or constant/literal value The fields have the following uses and functionality: opcode: An opcode uniquely identifies the operation that an instruction performs. Short names are used in assembly language to refer to each type of instruction. (Recall that assembly instructions are just human-readable versions of machine instructions.) Since the N2 has 27 different opcodes, 5 bits are required to represent an opcode. Since 5 bits can store 32 different values but the N2 only has 27 opcodes, five five-bit values do not correspond to valid opcodes. regi or 2.4 Names, operands, and effects of the machine instructions You need to have a basic understanding of the effects of the N2s machine instructions to write the project, so after the table below, which indicates which operands are used by each instruction, their effects are briefly described. Each instruction is listed with the (decimal) number between 0 and 26 that represents its opcode, according to the enum Op_code defined in machine.h. (To save space on the page the table is in two columns="val is the decimal value of the instruction opcodes, "ext refers to the extension field of instructions, regi, reg2, and regz are the three register operand fields, and addr./imm." refers to the memory address/immediate operand field.) A checkmark means that an instruction uses that field. opcode val ext reg2 reg3 addr./imm. opcode val ext regi reg2 reg3 addr./imm. plus 0 14 minus 1 not 15 times 2 eq 16 div 3 neq 17 mod 4 le 18 neg 5 lt 19 abs ge 20 shl 7 gt 21 shr 8 22 band lw 23 bor 10 SW 24 bxor 11 li 25 bneg 12 sys 26 and 13 6 move 9 Note that the sys instruction uses the first register operand only when the value of the extension field is 0...3, not 4. Also note that different instructions use between 11 bits and 26 bits of their word (there are no instructions that use all 32 bits). Unuscd ficlds may just contain any bits at all. Also note that not cvery 32-bit word represents a valid N2 instruction. The plus instruction adds the contents of register and registerz and puts the result in register]. Similarly, minus does subtraction (computes the difference of register2 and register3), times does multiplication, div performs division, and mod extension: One instruction on the N2 has five different variants. For this instruction, a value in the extension field is used to indicate which variant an instruction represents. Not all values that can be stored in 3 bits represent valid extensions. register1, register2, registerz: As Section 2.4 says, some instructions operate upon one register, others operate on two registers, and some have three register operands. These three fields of instructions contain numbers indicating the registers that instructions use as operands. Since the N2 CPU has 7 registers, 3 bits can indicate any of them. Since 3 bits can store 8 different values but the N2 only has 7 registers, there is one value that can be in a register field that does not correspond to a valid register. address or constant: Some of the N2s instructions have the address of a memory location as an operand. For example, an instruction may copy the value in memory location 216 to one of the registers, so 216 would be stored in this field of the instruction. One instruction (li) has a constant or literal value, typically called an immediate in assembly language, as an operand. For example, an instruction may store the numeric value 250 in one of the registers, so 250 itself would be encoded in the instruction. This field is 15 bits, which is why the N2 has 32768 bytes of memory (with addresses 0...32767), because 215 32768. This also means that the N2 CPU only has 32768 different literal (constant) values that can appear in the li instruction, although values up to 4,294,967,295 can be stored in a 32-bit word. For example, say an instruction has opcode value 3 and is an instruction that uses all three register operands, and this particular instruction is going to operate upon registers R1, R2, and R4. Also suppose this instruction doesn't use the extension field or the address/constant field, and just has zeros in those 18 bits. Then this instruction would be represented in a 32-bit memory word as 0x182a000016, which is 40540569610. (Suggestion-convert the hexadecimal value to binary and match the bits against the diagram of the instruction format above.) Or suppose an instruction has opcode value 24 and uses one register operand, which is R5, and uses the memory address field to store 21610. If it also had Os in all of the unused fields it would be represented in a memory word as 0xc0a000d816 (which is 323171144810). computes the remainder of a division (modulus). These all apply the operation to their second and third register operands and put the result in their first register operand. In fact, all instructions that modify a register store their result into their first register operand, so this is not repeated below. Since a register can only store one value at a time, any instruction that modifies its first register operand's value has the effect of overwriting the previous value in that register. For instance, the instruction times R4 R3 R5 multiplies the values stored in registers R3 and R5, and puts the result in register R4. And minus R5 R2 R4 subtracts the contents of R4 from the contents of R2 and puts the result into R5. The neg instruction negates the value in its second register operand (if it is positive it becomes negative, and vice versa), and abs computes the absolute value of its second operand. For instance, if register R1 contains 216, then after neg R5 R1, the value in R5 would be -216. The shl instruction shifts a value to the left and shr shifts it to the right; these both shift their second operand's value by the number of bits stored in their third register operand. band, bor, bxor, and bneg perform bitwise and, bitwise or, bitwise exclusive or, and bitwise negation. The first three apply the operation to their second and third register operands while bneg is applied to its second register operand. and performs logical conjunction, or performs logical disjunction, and not performs logical negation. The first two of these operate upon their second and third register operands while not is applied to its second register operand. eq, neq, le, lt, ge, and gt compare their two register operands for equality, inequality, less than or equal, less than, greater than or equal, and greater than, and allow execution to go to another location in a program. If the relationship is true about the values in their register operands then execution jumps to the memory address in the address/constant field and continues from there. If not, execution just drops down to continue with the next sequential instruction in memory instead. move copies its second register operand's value to its first register operand. lw and sw are the only instructions that access or modify values in memory locations. lw copies a value from a memory location to a register. It uses the first register operand and a memory address, and its effect is to copy the value of the word in memory that has that address into the register specified by the register operand. For example, lw R3 200 would load the value in memory location 200 into register R3. (Assuming 200 is in decimal, this instruction would be stored in a register or memory word as Oxb86000c816, which is 309329940010.) sw does the opposite, copying a value from a register to the indicated memory location. li loads a constant or literal value (li stands for load immediate) into a register. For instance, li R4 236 would load the numeric value 236 (not the contents of memory location 236) into register number R4. (Assuming 236 is in decimal, this instruction would be stored in a word or register as 0xc08000ec16, which is 322961431610-) Lastly, sys allows a program to invoke functionality in the operating system, which as mentioned is referred to as a system call. The N2 operating system has five system calls, differentiated by the value of the extension field. System call O reads an integer, system call 1 writes an integer, system calls 2 and 3 read and write a character respectively, and system call 4 halts the executing program. These are indicated by the values 0 through 4 respectively in the extension field. The first four use the first register operand, while system call 4 (to halt a program) has no register operands. System calls 0 and 2, which read an integer and a character, modify their first register operand because whatever is read is stored there. System calls 1 and 3 print the value in their first register operand without modifying the register. All of the instructions are computation instructions except move, li, lw, and sw, which are data movement instructions, endog nogle + o andat which are flow of control instructions eve involves the onerating Wotem System calls 1 and 3 print the value in their first register operand without modifying the register. All of the instructions are computation instructions except move, li, lw, and sw, which are data movement instructions, and eq, neq, le, lt, ge, and gt, which are flow of control instructions. sys invokes the operating system. The same register may be used for multiple operands of an instruction, such as in add R3 R2 R3 or add R1 R1 R1. Any operand fields that aren't indicated in the table above are ignored by that instruction. For example, move has two register operands, therefore they will be contained in the first two register fields registerand register2; the extension, the third register field register3, and the address or immediate field are all not used by a move instruction. Of the instructions that use the address or constant field, only li uses that field to store an immediate (i.e., constant) operand; all others use that field to store a memory address. 3 Functions to be written The header file machine.h contains definitions and the prototypes of the four functions you must write. The functions use a typedef name Wrd to represent a N2 word. Wrd is an unsigned int, since on the Grace machines an int is four hytes. 3.1 unsigned short print_instruction(Wrd instruction) This function should interpret its parameter as a N2 instruction, use the bit operators to extract its fields (see the description and diagram in Section 2.4), and print the instruction's components on a single output line. However, if the instruction is invalid according to the criteria below, nothing at all should be printed and the function should just return 0. Otherwise, if the instruction is valid, the function should return 1 after printing the instruction, in this format: The instruction's opcode (the opcode field of the parameter, meaning its leftmost five bits) should be printed using its name in the table in Section 2.4 (plus, minus, times, div, etc.), spelled exactly as shown there. For example, if the value of the opcode field is the enum value PLUS (which has the value 0), then plus should be printed. If the opcode field is MINUS then minus should be printed, etc. Following the opcode, the extension and register operands that are actually used by the instruction should be printed, with the extension (if it is used) followed by the register operands that are used, in the order register), register2, and register3. For example, a not instruction uses the first two registers as operands, so those two should be printed, with the first register operand followed by the second one. A sys instruction uses the extension (usually- see the next sentence) and the first register as operands, so the extension should be printed first, then the register operand. (If the value of the extension field is 4 in a sys instruction then the register field should not be printed, because the system call to halt a program does not have a register operand.) Register names should be printed in decimal with an R immediately preceding the register number. For example, registers 0, 1, and 3 would be printed as RO, R1, and R3. For sys, the only instruction that uses the extension, the extension should just be printed as a single decimal digit between 0 and 4. If an instruction uses a memory address operand or an immediate operand that operand should be printed last, in decimal. If it is a memory address it should be printed using exactly five places, with addresses less than 1000010 printed using as many leading zeros as necessary so they occupy exactly four places. For example, the address 21610 should be printed as 00216. (This can trivially be performed with printf() formatting options covered in discussion.) Note that immediate operands li instructions should not be printed with any leading zeros, except zero itself should be printed as 0. Only memory address operands should have leading Os if needed. The printed fields must be separated with one or more whitespace characters (meaning tabs or spaces); the exact number of whitespace characters is up to you. The printed instruction should not have any whitespace before the opcode or following the last field printed. A newline should not be printed after the instruction. For example, the call print_instruction (0x38710000) should print something like shl R3 R4 R2, where the exact number of spaces or tabs separating the four things printed is up to you (one or more). As mentioned before, some values that can be stored in a 32-bit N2 word or register don't represent valid N2 instruc- tions, and if its parameter represents an invalid instruction this function should just return 0 without printing anything. An instruction would be invalid in any of these cases: The value in its opcode field is invalid. There are only 27 instructions (hence opcodes) on the N2 (with enum values between PLUS and SYS), so any value outside of that range doesn't represent an actual opcode. If a register operand that is actually used by the instruction has an invalid number. There are only seven registers If a register operand that is actually used by the instruction has an invalid number. There are only seven registers on the N2 with numbers 0-6 (and symbolic constant names R through R6 defined in machine.h), so the value 7 doesn't represent a valid register number. If it is an instruction that uses the address or constant field to store a memory address and the value of the field is not evenly divisible by 4. As will come up later, machines often have alignment requirements such that any data item must begin at a memory address that is a multiple of the size in bytes of that item. Since the N2 has this requirement, and everything on the N2 occupies a 4-byte word, everything must begin at an address that is divisible by 4. Note: if the parameters represent an instruction that uses the address or constant field for storing a memory address its value must be divisible by 4. But if the instruction is an li instruction, which is using the field to represent a immediate (constant) value, it does not have to be divisible by 4. If the parameters represent an instruction that has the effect of modifying (storing a new value into) its first register operand and reg1 is the special unmodifiable program counter register R6. The instructions that use and would modify the first register operand are all instructions except for eq, neq, le, lt, ge, gt, sw, as well as a sys instruction when the value of the extension field is 1, 3, or 4. (A sys instruction with extension 0 or 2 does modify its first register operand.) R6 can be used as the second or third register operand of any instructions that use two or three registers, but not as the first operand of instructions that would modify their first register operand's value. If the parameter represents a sys instruction and the extension field is anything other than 0 through 4 it is invalid. The N2 determines which operating system call should be invoked based on the value of the extension field, and there are only five possible system calls. Note: the values of fields that are not used by machine instructions have no effect on validity, and it's immaterial what they contain. For instance, a not instruction may have anything at all in its rightmost 18 bits (the third register operand and memory address fields) and in the 3 extension bits. As long as the opcode, register], and register2 are correct, a not instruction is valid. But instructions that do use fields, and have incorrect values in them, are invalid. Also note: in projects where output has to be produced, students sometimes lose credit for minor spelling or formatting mistakes. The submit server checks your output automatically, consequently even trivial typos would cause tests to fail. Due to the size of the course we will not be able to give back any credit lost for failed tests due to spelling or formatting errors, even if they are extremely minor, because that would necessitate manual checking of every output for hundreds of students' projects, which is not possible. Therefore it's essential for you to check carefully that your output is correct, register names and opcodes are spelled exactly, and that the output format described above is followed. 3.2 unsigned short disassemble(const Wrd program[], unsigned short num_instrs, unsigned short *const bad_instr) A disassembler converts machine language to assembly language. This function will do that for a N2 program stored in its first parameter program, which is an array of words representing N2 instructions. The parameter num_instrs gives the number of elements of the array (number of array elements, not number of bytes). The function's return value is described below. It should not produce any output at all (not even a newline) if either program or bad_instr are NULL, or if any of the program's instructions are invalid (see below). If the instructions in program are all valid the function should print them all, in this format (see some of the public test outputs for examples): Preceding each instruction in the array, its starting N2 memory address should be printed in hexadecimal using as many leading zeros as necessary so the address occupies exactly four places. A N2 program always begins at memory address 0. A colon and blank space should follow the address. In printing the memory address, remember that each array element corresponds to a four-byte word of the machine's memory. Each element of the array should be printed exactly how print_instruction () prints instructions, with each one followed by a newline (including the last one). If its parameters are valid and all the instructions in program are valid the function should return 1 after printing the array contents as described above. The function should not produce any output at all and just return 0 if its parameters are invalid, which would be when: The array memory is NULL, or the pointer bad_instr is NULL. num_instrs is larger than the number of 4-byte words in the N2 (meaning larger than the maximum number of 4-byte instructions that can be in memory; note there is a symbolic constant with this value in machine.h). The array memory is NULL, or the pointer bad_instr is NULL. num_instrs is larger than the number of 4-byte words in the N2 (meaning larger than the maximum number of 4-byte instructions that can be in memory; note there is a symbolic constant with this value in machine.h). Any of the instructions in the array program are invalid, according to the criteria described above in Section 3.1. In this case the function should also store the location (array index) of the first invalid instruction into the variable that bad_instr points to. (If the instructions are all valid the variable bad_instr points to should not be modified.) If the function's third parameter is non-NULL it may assume that it is a valid pointer that points to a Wrd (unsigned int) variable, in other words, it may assume that it is not an uninitialized or invalid pointer. This is something that the function cannot check, and is the caller's responsibility to ensure instead. 3.3 short where_set(const Wrd program[], unsigned short num_words, unsigned short reg_nbr) As you should know from CMSC 131 and 132, it is a syntax error in Java to use a local variable in a method before giving it a value. In C, a (nonstatic) local variable just has a garbage value if it is used before being set. It is commonly an indication of an error to use something in a program before it's given a value, and this could even apply in assembly programs to using the values of registers before putting a value in them. Some systems might initialize all registers to zero before a program starts, but in others, registers may have the equivalent of garbage values if they are used before being set. Since using anything before its given a value may be a bug, an assembly programmer might be helped if a tool could perform a similar type of analysis to what the Java compiler does, in detecting whether any register in an assembly program is used before it is given a value. This function and the next one will analyze an N2 program to do this. This function should return 1 if program is NULL, or if num_words is larger than the number of 4-byte words in the N2, or if reg_nbr is not a valid register number for the N2, or if register number reg_nbr is not given a value in any of the first num_words instructions of program (num_words indicates how many instructions of the array should be checked). Otherwise the function should return the index or position in the array program of the first instruction where register number reg_nbr is set or given a value. You should know from the explanation in Section 2.4 which instructions modify a register and which ones don't. For example, add modifies its first register operand but sw does not. (As mentioned, for the instructions that do modify a register, it is always the first register operand that is modified.) Notice that the sys instruction modifies its first register for two possible values of the extension (0 and 2), but not for the other three possible values. This function will always return -1 if reg_nbr is R6, since the program counter register can never be explicitly modified by any instructions. (This probably does not have to be handled as a special case.) 3.4 unsigned short is_safe(const Wrd program[], unsigned short pgm_size, unsigned short *const bad_instr) This function should return 1 if the program in program, which has pgm_size instructions, is valid. It should return 0 if program or bad_instr are NULL, or pgm_size is larger than the number of 4-byte words in the N2 or if the program is not safe. Safe means that none of the instructions in the array program are invalid, using the criteria described above in Section 3.1 and that every register used in instructions is set, or given a value, prior to the first instruction where it is used in the program. If the first and third parameters not NULL but the program is unsafe, meaning that there is some instruction that is using the value of a register that has not been given a value in any of the preceding instructions, then before returning O the function should also store the location of the first such instruction into the variable that bad_instr points to. (If the program is safe the function should not modify the variable that bad_instr points to.) What this function will have to do is to examine all of the instructions in the program, and determine which register operands each instruction uses the values of, based on its opcode. (Note this refers to which registers have their values used by instructions, not whether instructions modify their first register operand.) Then the function needs to see whether the register or registers that an instruction uses (if it is an instruction that uses the values of any registers) were set in any earlier instructions in the program. Of course your function where_set() can be used for thisStep by Step Solution
There are 3 Steps involved in it
Step: 1
Get Instant Access to Expert-Tailored Solutions
See step-by-step solutions with expert insights and AI powered tools for academic success
Step: 2
Step: 3
Ace Your Homework with AI
Get the answers you need in no time with our AI-driven, step-by-step assistance
Get Started