Question
Introduction
The purpose of this laboratory is to practice register-transfer level (RTL) design using Verilog and an FPGA. Most of the code will be written using behavioral modeling. This lab also shows that whatever can be implemented in software can also be implemented in hardware when you need to accelerate the computation.
The Algorithm
A very simple algorithm to compute the square root of a number is shown below. To simplify the problem, we assume that the number is a perfect square (e.g., 4, 9, 16, 25, ...). There are three registers, Rn, Rdelta, and Rdata, in the data path to store the data. For example, to compute the square root of 25, the value 25 is loaded into register Rdata from input port num. Rn is initialized to 1 and Rdelta is initialized to 3, as shown in iteration 1.
Iteration  Rn  Rdelta  Rdata
1          1   3       25
2          4   5       25
3          9   7       25
4          16  9       25
5          25  11      25
Since Rn (i.e., 1) is not equal to 25, in the next iteration (iteration 2) we have Rn = Rn + Rdelta and then Rdelta = Rdelta + 2, as shown in the table above. The process is repeated until Rn is equal to Rdata, which happens in iteration 5. From the table, you can see that Rn steps through the successive square numbers while Rdelta is increased by 2 each time. The algorithm is so simple and so beautiful, right?
The next thing is how to get the square root out of the above table. Amazingly, it can be obtained by taking the quotient of Rdelta/2, since Rdelta = 2k + 1 whenever Rn = k^2. In this case, 11/2 has a quotient of 5. But how do we do the division?
It would be very bad to synthesize a multiplier or divider, since each of them uses a lot of hardware. A very simple way is to right-shift Rdelta by one bit. In this example, Rdelta = 4'b1011 (that is, 11 in decimal). Right-shifting Rdelta by one bit gives 4'b0101, which is 5 in decimal. Again, it is so beautiful. But how do we right-shift by one bit in Verilog? This can be found in Chapter 8 of the textbook: it is Rdelta >>> 1. (For an unsigned register, the logical shift Rdelta >> 1 gives the same result.)
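The iteration and the final shift can be tried out in simulation before any hardware is designed. The following is only a minimal behavioral sketch: the module name sqrt_model, the function name isqrt, and the register widths are assumptions made for illustration and are not part of the lab code. The loop uses Rn < n rather than Rn != n so that it also terminates if the input is not a perfect square.

```verilog
// Simulation-only sketch of the odd-number square-root algorithm.
// sqrt_model and isqrt are hypothetical names, not part of the lab code.
module sqrt_model;
  // Returns the square root of a perfect square n (1 <= n <= 121).
  function [3:0] isqrt;
    input [6:0] n;
    reg [7:0] Rn;      // running perfect square: 1, 4, 9, ...
    reg [4:0] Rdelta;  // next odd increment: 3, 5, 7, ...
    begin
      Rn = 1;
      Rdelta = 3;
      while (Rn < n) begin
        Rn = Rn + Rdelta;     // step to the next perfect square
        Rdelta = Rdelta + 2;  // step to the next odd number
      end
      isqrt = Rdelta >> 1;    // quotient of Rdelta / 2
    end
  endfunction

  initial begin
    $display("sqrt(25)  = %0d", isqrt(7'd25));
    $display("sqrt(121) = %0d", isqrt(7'd121));
  end
endmodule
```

For the two inputs shown, the function returns 5 and 11, which matches the table above and the largest value the lab has to handle.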
The Data Path
The data path is the machine that actually performs the data computation. The module head is given below to help you start. The right way to design the data path is: (1) based on the algorithm, design the block diagram, which contains adders, comparators, registers, etc., and the connections between them; (2) identify the control signals (e.g., load_data, incr_delta, find_next_sq) whose activation enables the data path to compute; (3) identify the flag signals (e.g., finish) that tell the controller what is happening in the data path.
module sqrt_data_path (num, load_data, incr_delta, find_next_sq, clk, finish, result
);
input load_data, incr_delta, find_next_sq, clk;
input [6:0] num;
output reg finish;
output [3:0] result;
We have 8 switches on the FPGA board, so we use 7 switches to input the square number (called num above). The one switch left is used to clear (reset) the registers. So, the largest square number that can be represented in this data representation is 121 and the sqrt is 11.
Control signal load_data is used to load the square number from the switches into register Rdata. Control signal incr_delta is used to increment Rdelta by 2. Control signal find_next_sq is used to load the new square number (obtained by adding Rdelta and Rn) into Rn. These three control signals are generated by the controller. Do we need to implement a clear signal to reset all registers in the data path? The answer is NO, because the registers are initialized when the data is loaded in iteration 1.
Flag signal finish is used to tell the controller (from the data path) that the computation is finished and the entire process must be stopped. Output result, which carries the computed sqrt, is presented to a bin2bcd module that converts the binary result to bcd digits. The bcd digits are finally displayed on the seven-segment displays.
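The description above can be turned into a behavioral body in more than one way. The following is a minimal sketch of one possibility, not the intended solution: the internal register widths, the choice to initialize Rn and Rdelta in the same clock as the load, and the combinational finish flag are all assumptions. It also assumes the controller asserts find_next_sq no later than incr_delta, so that Rn is updated with the old value of Rdelta.

```verilog
// Sketch of one possible data path body; widths and timing are assumptions.
module sqrt_data_path (num, load_data, incr_delta, find_next_sq, clk, finish, result);
  input load_data, incr_delta, find_next_sq, clk;
  input [6:0] num;
  output reg finish;
  output [3:0] result;

  reg [6:0] Rdata;   // the perfect square being tested
  reg [7:0] Rn;      // current square: 1, 4, 9, ...
  reg [4:0] Rdelta;  // next odd increment: 3, 5, 7, ... (23 when num = 121)

  always @(posedge clk) begin
    if (load_data) begin
      // Loading also initializes Rn and Rdelta, so no clear input is needed
      Rdata  <= num;
      Rn     <= 8'd1;
      Rdelta <= 5'd3;
    end else begin
      if (find_next_sq) Rn     <= Rn + Rdelta;    // next perfect square
      if (incr_delta)   Rdelta <= Rdelta + 5'd2;  // next odd number
    end
  end

  // Combinational flag (assumption): high whenever Rn equals Rdata
  always @(*) finish = (Rn == Rdata);

  // Quotient of Rdelta / 2, taken with a one-bit right shift
  assign result = Rdelta >> 1;
endmodule
```

With num = 9, one load followed by two find_next_sq/incr_delta pairs leaves Rn = 9 and Rdelta = 7, so finish is high and result is 3. If num is not a perfect square, Rn never equals Rdata and finish never rises, which is why the lab restricts the input to perfect squares.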
The Controller
The controller is used to generate control signals for the data path, and also control the entire process based on the flag signals from the data path and the signals from the user. The module head of an implementation of the controller is given below.
module sqrt_controller (start, finish, clk, clear, ready, load_data,
incr_delta, find_next_sq
);
input start, finish, clk, clear;
output reg ready, load_data,incr_delta, find_next_sq;
parameter S0=0, S1=1, S2=2, S3=3;
reg [1:0] state, next_state;
Here, the start signal is implemented by a push-button and starts the sqrt computation process. Signal clear, implemented by a switch, is used to reset all registers. Signal ready lets the user know that the sqrt generator is ready for use. All other signals have been discussed in Section 3.
The controller is in fact a sequential machine that mixes Mealy and Moore behavior, because some outputs (e.g., ready) depend on the state only, while other outputs (e.g., load_data) depend on both the state and the inputs. It should also be understood that the data path always lags the controller by one clock. For example, if the controller generates load_data in clock k, the data path actually loads the square number into register Rdata in clock k+1. Thus, the controller can check whether Rn == Rdata by examining the finish signal in clock k+2. Therefore, in clock k+1, the controller may use a dummy state to wait until finish becomes available. If you are lucky, instead of doing nothing in the dummy state, you can have the controller do something else there.
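As an illustration of the mixed Mealy/Moore structure, the sketch below fills in the module head given above with one possible state assignment. Everything beyond the head is an assumption: S0 is idle, S1 checks finish, S2 increments Rdelta, and S3 holds the result until start is released. It further assumes that finish is a combinational Rn == Rdata flag, so it is already valid in the state after the load and no separate dummy state appears; a registered finish would need the extra wait state described above.

```verilog
// Sketch of one possible controller; the state meanings are assumptions.
module sqrt_controller (start, finish, clk, clear, ready, load_data,
                        incr_delta, find_next_sq);
  input start, finish, clk, clear;
  output reg ready, load_data, incr_delta, find_next_sq;
  parameter S0=0, S1=1, S2=2, S3=3;
  reg [1:0] state, next_state;

  // State register with asynchronous clear
  always @(posedge clk or posedge clear)
    if (clear) state <= S0;
    else       state <= next_state;

  // Next-state and output logic
  always @(*) begin
    ready = 0; load_data = 0; incr_delta = 0; find_next_sq = 0;
    next_state = state;
    case (state)
      S0: begin                      // idle
        ready = 1;                   // Moore output: depends on state only
        if (start) begin
          load_data = 1;             // Mealy output: depends on state and start
          next_state = S1;
        end
      end
      S1: begin                      // Rn and Rdelta are stable: test finish
        if (finish) next_state = S3;
        else begin
          find_next_sq = 1;          // Rn <- Rn + Rdelta at the next clock edge
          next_state = S2;
        end
      end
      S2: begin
        incr_delta = 1;              // Rdelta <- Rdelta + 2 at the next clock edge
        next_state = S1;
      end
      S3: if (!start) next_state = S0;  // done: wait for the button release
    endcase
  end
endmodule
```

Waiting in S3 for start to be released is a design choice that keeps a held push-button from immediately reloading the data path, which matters when the controller runs on a clock as slow as clk1.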
The bin2bcd Module
Since the square root will finally be presented on seven-segment displays, we have to convert the binary result (the decimal values from 1 to 11 in this lab) to bcd (binary-coded decimal) data. For example, 4'b1011 (i.e., 11 in decimal) is converted into the eight bits 00010001 in bcd form. The following code achieves this purpose.
module bin2bcd(bin, bcd
);
input [3:0] bin;
output [7:0] bcd;
assign bcd = (bin > 9) ? bin+6 : bin;
endmodule
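The reason adding 6 works is that a 4-bit input above 9 always has a tens digit of 1, and in packed bcd that digit sits at weight 16 instead of weight 10. A small simulation sweep can confirm this for every 4-bit input. The sketch below is self-contained and therefore repeats the conversion expression instead of instantiating bin2bcd; the module name bin2bcd_check and the errors counter are hypothetical.

```verilog
// Simulation-only sweep of the bin2bcd expression; bin2bcd_check is a hypothetical name.
module bin2bcd_check;
  reg  [3:0] bin;
  wire [7:0] bcd = (bin > 9) ? bin + 6 : bin;  // same expression as bin2bcd
  integer i, errors;

  initial begin
    errors = 0;
    for (i = 0; i <= 15; i = i + 1) begin
      bin = i;
      #1;  // let the continuous assignment settle
      // Rebuild the decimal value from the two bcd digits and compare
      if (bcd[7:4] * 10 + bcd[3:0] != i) errors = errors + 1;
      $display("bin=%0d -> bcd=%h", bin, bcd);
    end
    $display("mismatches: %0d", errors);
  end
endmodule
```

The sweep covers 0 through 15, which is a superset of the 1 to 11 range this lab actually produces.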
The bcd7seg Module
The bcd7seg module given below generates the 7-segment code for each bcd digit. Note that clock is used to activate the display one bcd digit at a time. The clock frequency is a few hundred hertz (about 381 Hz when taken from bit 16 of a counter driven at 50 MHz), which is fast enough that all four digits appear to be lit continuously. Output signal seg is the seven-segment code for the bcd digit being displayed. Finally, output signal an selects which one of the four seven-segment displays to illuminate. More details will be discussed in class.
module bcd7seg(clock, bcd, seg, an);
input clock;
input [7:0] bcd;
output reg [6:0] seg;
output reg [3:0] an;
reg [1:0] step = 0;
integer digit = 4'b1111;
// Choose the slice of bcd to update
always @ (posedge clock) begin
case (step)
0: digit = bcd[3:0];
1: digit = bcd[7:4];
2: digit = 0;
3: digit = 0;
default: digit = bcd[3:0];
endcase
an = 4'b1111;
case(digit) // Pick the segment bits
0: seg = 7'b0000001;
1: seg = 7'b1001111;
2: seg = 7'b0010010;
3: seg = 7'b0000110;
4: seg = 7'b1001100;
5: seg = 7'b0100100;
6: seg = 7'b0100000;
7: seg = 7'b0001111;
8: seg = 7'b0000000;
9: seg = 7'b0000100;
default: seg = 7'b1111111;
endcase
if (an[step]==1) an[step]=0; // Pick the digit anode to drive (active low)
step = step + 1;
end
endmodule
The clock_down_converter Module
As discussed in Lab 8, the clock_down_converter shown below reduces the clock speed from 50 MHz to below 1 Hz so that you can synchronize your data entry with the clock. Details have been discussed in Lab 8.
module clock_down_converter(
input clock, // 50 MHz clock, built in function
input clear, // Reset down-converter
output clk1, // ~0.37 Hz clock (50 MHz / 2^27)
output clk200 // ~381 Hz clock (50 MHz / 2^17)
);
// 27-bit counter
reg[26:0] q;
always @ (posedge clock, posedge clear) begin
if (clear)
q <= 0;
else
q <= q+1;
end
// Uncomment and comment following statements as needed for
// simulation and synthesis
//assign clk1 = q[0]; //40 ns period (for simulation)
//assign clk200 = q[1]; //80 ns period (for simulation)
assign clk200 = q[16];//~381 Hz (for 7-seg display)
assign clk1 = q[26]; //~0.37 Hz (for the sequential ckt)
endmodule
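The rate of each tap follows from the counter itself: bit k of a free-running binary counter toggles once every 2^k input clocks, so its frequency is the input frequency divided by 2^(k+1). The sketch below evaluates this for the two taps used above with integer arithmetic. It assumes the 50 MHz input clock stated in the module comments; the module name divider_rates and its variable names are hypothetical.

```verilog
// Simulation-only arithmetic check; divider_rates is a hypothetical name.
module divider_rates;
  integer f_in;          // input clock in Hz (assumed 50 MHz)
  integer f_q16;         // frequency of q[16] in Hz, rounded down
  integer t_q26_ms;      // period of q[26] in ms, rounded down

  initial begin
    f_in     = 50000000;
    f_q16    = f_in / (1 << 17);           // bit 16 divides by 2^17
    t_q26_ms = (1 << 27) / (f_in / 1000);  // bit 26 has a period of 2^27 clocks
    $display("q[16] frequency: about %0d Hz", f_q16);
    $display("q[26] period: about %0d ms", t_q26_ms);
  end
endmodule
```

This prints about 381 Hz for q[16] and about 2684 ms for q[26], i.e. roughly 0.37 Hz for clk1.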
The Top Module
The top module is used to integrate all modules together. The code is shown below. Again, here, the start signal is implemented by a push-button to start the sqrt computation process. Signal clear implemented by a switch is used to reset all registers.
module sqrt_Top(start, clk, clear, num, ready, seg, an, clk1
);
input start, clk, clear;
input [6:0] num;
output ready, clk1;
//output [7:0] bcd;
output [6:0] seg;
output [3:0] an;
wire finish, load_data,incr_delta, find_next_sq, clk1, clk200;
wire [3:0] result;
wire [7:0] bcd;
sqrt_controller m1(start, finish, clk1, clear, ready, load_data,
incr_delta, find_next_sq);
sqrt_data_path m2(num, load_data, incr_delta, find_next_sq, clk1, finish, result);
bin2bcd m3 (result, bcd);
bcd7seg m4 (clk200, bcd, seg, an);
clock_down_converter m5(clk, clear, clk1, clk200);
endmodule
Test Benches
The following test benches can be used to debug your Verilog modules.
9.1. TestBench for Data Path
module sim;
// Inputs
reg [6:0] num;
reg load_data;
reg incr_delta;
reg find_next_sq;
reg clk;
// Outputs
wire finish;
wire [3:0] result;
// Instantiate the Unit Under Test (UUT)
sqrt_data_path uut (
.num(num),
.load_data(load_data),
.incr_delta(incr_delta),
.find_next_sq(find_next_sq),
.clk(clk),
.finish(finish),
.result(result)
);
initial begin
// Initialize Inputs
num = 0;
load_data = 0;
incr_delta = 0;
find_next_sq = 0;
clk = 0;
num = 9;
end
initial
begin
repeat (100) #5 clk = ~clk;
end
// Wait 100 ns for global reset to finish
//#100;
// Add stimulus here
initial fork
#8 load_data = 1;
#20 load_data = 0;
#20 find_next_sq = 1;
#30 find_next_sq = 0;
#30 incr_delta = 1;
#40 find_next_sq = 1;
#40 incr_delta = 0;
#50 find_next_sq = 0;
#50 incr_delta = 1;
#60 find_next_sq = 1;
#60 incr_delta = 0;
#70 find_next_sq = 0;
#70 incr_delta = 1;
join
endmodule
9.2. TestBench for Controller
module sim;
// Inputs
reg start;
reg finish;
reg clk;
reg clear;
// Outputs
wire ready;
wire load_data;
wire incr_delta;
wire find_next_sq;
// Instantiate the Unit Under Test (UUT)
sqrt_controller uut (
.start(start),
.finish(finish),
.clk(clk),
.clear(clear),
.ready(ready),
.load_data(load_data),
.incr_delta(incr_delta),
.find_next_sq(find_next_sq)
);
initial begin
// Initialize Inputs
start = 0;
finish = 0;
clk = 0;
clear = 0;
end
initial
begin
repeat (100) #5 clk = ~clk;
end
// Wait 100 ns for global reset to finish
//#100;
// Add stimulus here
initial fork
#2 clear = 1;
#4 clear = 0;
#12 start = 1;
#17 start = 0;
#62 finish = 1;
join
endmodule
9.3. TestBench for Top
module sim;
// Inputs
reg start;
reg clk;
reg clear;
reg [6:0] num;
// Outputs
//wire [3:0] result;
//wire [7:0] bcd;
wire ready;
wire [6:0] seg;
wire [3:0] an;
wire clk1;
// Instantiate the Unit Under Test (UUT)
sqrt_Top uut (
.start(start),
.clk(clk),
.clear(clear),
.num(num),
.ready(ready),
//.bcd(bcd)
.seg(seg),
.an(an),
.clk1(clk1)
);
initial begin
// Initialize Inputs
start = 0;
clk = 0;
clear = 0;
num = 100;
end
// Wait 100 ns for global reset to finish
//#100;
initial
begin
repeat (100) #5 clk = ~clk;
end
// Add stimulus here
initial fork
#2 clear = 1;
#7 clear = 0;
#12 start = 1;
#22 start = 0;
join
endmodule
Final Advice
Use the test benches provided above to fully simulate the data path module, the controller module, and the Top module before pin assignment, logic synthesis, bit file generation, and bit file downloading to the FPGA board. Display clk1 on an LED so that you can synchronize your start and clear (reset) inputs with clk1. You do NOT need to synchronize the loading of Rdata (the square number). Why?
Things Needed:
Design specification.
Block diagram (modules and their relationship) of the entire lab.
Block diagram of the data path.
Verilog code (your behavioral level design) for the Data Path module.
State diagram of the Controller.
Testbench (you can copy the testbench given above) for the Data Path module, and the simulation waveform produced by the testbench.