Digital IC Whiteboard Coding: Pipeline Handshake (Using Handshaking to Handle Upstream Stalls and Downstream Back-Pressure)

2023-09-14 09:15:33

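Preface:

This series records whiteboard coding questions that come up frequently in written tests and interviews, in preparation for autumn recruitment for digital front-end positions. Every article in the series provides an analysis of the principle, the code, and waveforms, and all code has been verified by the author.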

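The articles in the series are:

1. Digital IC Whiteboard Coding: Frequency Divider (Arbitrary Even Division)

2. Digital IC Whiteboard Coding: Frequency Divider (Arbitrary Odd Division)

3. Digital IC Whiteboard Coding: Frequency Divider (Arbitrary Fractional Division)

4. Digital IC Whiteboard Coding: Asynchronous Reset, Synchronous Release

5. Digital IC Whiteboard Coding: Edge Detection (Rising Edge, Falling Edge, Both Edges)

6. Digital IC Whiteboard Coding: Sequence Detection (State-Machine Style)

7. Digital IC Whiteboard Coding: Sequence Detection (Shift-Register Style)

8. Digital IC Whiteboard Coding: Half Adder and Full Adder

9. Digital IC Whiteboard Coding: Serial-to-Parallel and Parallel-to-Serial

10. Digital IC Whiteboard Coding: Data Width Converter (Wide-to-Narrow and Narrow-to-Wide)

11. Digital IC Whiteboard Coding: Finite State Machine (FSM), Drink Vending Machine

12. Digital IC Whiteboard Coding: Handshake Signals (READY-VALID)

13. Digital IC Whiteboard Coding: Pipeline Handshake (Using Handshaking to Handle Upstream Stalls and Downstream Back-Pressure)

14. Digital IC Whiteboard Coding: Telink Written-Test Question

15. Digital IC Whiteboard Coding: T-Head Final Technical Interview Question

16. Digital IC Whiteboard Coding: GigaDevice Written-Test Question

17. Digital IC Whiteboard Coding: Espressif Written-Test Question (4x Frequency Multiplication)

18. Digital IC Whiteboard Coding: Dual-Port RAM

...more to come

For more whiteboard coding problems, see Digital IC Whiteboard Coding: Problem Bank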


Contents

Pipeline handshake

Pipelining

Full code

Full testbench

Waveform


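Pipeline handshake

A previous post in this series covered the handshake between modules:

Digital IC Whiteboard Coding: Handshake Signals (READY-VALID)

This post covers pipelining, and how handshaking prevents the data loss and incorrect results that an upstream stall or downstream back-pressure could otherwise cause in a pipeline.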

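Pipelining

Pipelining means using registers to cut one block of combinational logic with a long delay into several blocks with shorter delays, which raises the clock frequency the whole module can run at.

For example, suppose a block of combinational logic between two registers computes the multiply-accumulate of four pairs of 8-bit numbers: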

        assign result = a1*b1 + a2*b2 + a3*b3 + a4*b4;

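The combinational delay of this statement is large: four multiplications are implemented in a single block of combinational logic, followed by a four-operand addition. Pipelining can split this logic so that the multiplications and the addition sit in separate stages that work in parallel on successive data, for example: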

module pipeline_handshake(
  input                 clk                     ,
  input                 rstn                    ,
  input         [7:0]   a1,a2,a3,a4,b1,b2,b3,b4 ,
  output  reg   [19:0]  result
);

reg [15:0] temp1,temp2,temp3,temp4;

//pipeline stage 1
always @(posedge clk)begin
  temp1 <= a1 * b1;
  temp2 <= a2 * b2;
  temp3 <= a3 * b3;
  temp4 <= a4 * b4;
end

//pipeline stage 2
always @(posedge clk)begin
  result <= temp1 + temp2 + temp3 + temp4;
end

endmodule

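As the code shows, the long multiply-add logic is cut into a two-stage pipeline that separates the multiplications from the additions. The benefit is clear: the original combinational delay is that of an 8-bit multiplication plus three 16-bit additions, whereas the longest combinational delay is now either one 8-bit multiplication or three 16-bit additions, so the maximum clock frequency of the module can be raised considerably. This is pipelining.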

[Figure: simulation waveform of the two-stage pipeline]

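The simulation result is shown above. Because the data passes through two pipeline stages, the result arrives one cycle later than it would with a single block of combinational logic, but once the pipeline is full the throughput is the same: one result per cycle.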
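For comparison with the latency claim above, here is a minimal sketch of the unpipelined version, with the whole multiply-add in front of a single output register. The module name `mac_single_stage` is hypothetical and does not come from the original article; the ports mirror the two-stage module.

```verilog
// Hypothetical reference module (not from the original article):
// the whole multiply-add sits in one combinational block in front of
// a single output register, so the result appears one cycle after the inputs.
module mac_single_stage(
  input                 clk                     ,
  input         [7:0]   a1,a2,a3,a4,b1,b2,b3,b4 ,
  output  reg   [19:0]  result
);

always @(posedge clk)begin
  // operands are extended to the 20-bit width of result, so nothing is truncated
  result <= a1*b1 + a2*b2 + a3*b3 + a4*b4;
end

endmodule
```

Driven with the same inputs, this version should produce each result one clock edge earlier than the two-stage module, at the cost of a longer critical path.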


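The previous post covered handshaking and this post has now covered pipelining; next we combine the two into a pipeline handshake. Two stages is rather short, so let us lengthen the pipeline a little: on top of the original design, let a1 = c1 + c2 and b1 = c3 + c4. (In the full code, a1 and b1 are declared 8 bits wide, so any carry out of these additions is discarded.)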

always @(posedge clk)begin
  a1 <= c1 + c2;
  b1 <= c3 + c4;
end

That is, one more pipeline stage is added, giving the following three stages.

//pipeline stage 1
always @(posedge clk)begin
  a1 <= c1 + c2;
  b1 <= c3 + c4;
end

//pipeline stage 2
always @(posedge clk)begin
  temp1 <= a1 * b1;
  temp2 <= a2 * b2;
  temp3 <= a3 * b3;
  temp4 <= a4 * b4;
end

//pipeline stage 3
always @(posedge clk)begin
  result <= temp1 + temp2 + temp3 + temp4;
end

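Once the number of stages and the work done in each stage have been decided, handshake signals can be added to each stage.

The handshake uses a pre-fetch scheme, which prevents data from getting stuck inside the pipeline: a stage can accept data if the next stage is ready, or if the stage's own valid flag is 0 (the stage is currently empty). In other words, ready_current = ~valid_current || ready_next. A stage advances only when it is ready and its upstream stage presents valid data, that is, when ready_current && valid_previous is true. If the upstream data is not valid or the stage is not ready, the stage holds its contents until both conditions are met.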
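The rule above can be sketched as one generic, reusable stage. This is only an illustration under assumptions: the module name `pipe_stage`, the `WIDTH` parameter and the port names are hypothetical, and the stage simply registers its input where a real stage would apply its own combinational logic.

```verilog
// Hypothetical generic pipeline stage (names are illustrative, not from the article).
module pipe_stage #(
  parameter WIDTH = 8
)(
  input                    clk     ,
  input                    rstn    ,
  input                    valid_i ,  // upstream presents valid data
  output                   ready_o ,  // this stage can accept data
  input       [WIDTH-1:0]  data_i  ,
  output reg               valid_o ,  // this stage holds valid data
  input                    ready_i ,  // downstream can accept data
  output reg  [WIDTH-1:0]  data_o
);

// ready when the stage is empty, or when downstream takes its content this cycle
assign ready_o = ~valid_o || ready_i;

always @(posedge clk)begin
  if(!rstn)begin
    valid_o <= 1'b0;
  end
  else if(ready_o)begin
    valid_o <= valid_i;      // valid moves forward only when the stage can advance
  end
end

always @(posedge clk)begin
  if(ready_o && valid_i)begin
    data_o <= data_i;        // a real stage would put its combinational logic here
  end
end

endmodule
```

Chaining several such stages, with each stage's ready_o driving the previous stage's ready_i, gives the same structure as the three hand-written stages in this article.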

//pipeline stage 1
assign ready_o = ~valid_r1 || ready_r1;   //ready when this stage is empty or the next stage is ready
always @(posedge clk)begin
  if(!rstn)begin
    valid_r1 <= 1'b0;
  end
  else if(ready_o)begin    //if this stage is ready, take in the upstream valid
    valid_r1 <= valid_i;
  end
end

always @(posedge clk)begin
  if(ready_o && valid_i)begin //data is accepted when ready and valid are both high
    a1 <= c1 + c2;
    b1 <= c3 + c4;
    a2_r1 <= a2; a3_r1 <= a3; a4_r1 <= a4; //delay the unused inputs one cycle for stage 2
    b2_r1 <= b2; b3_r1 <= b3; b4_r1 <= b4;
  end
end

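In the first stage, a2~a4 and b2~b4 are delayed by one cycle. The reason is that a2~a4 and b2~b4 arrive together with c1~c4, but the first stage does not operate on them; they are only used in the second stage. If these inputs were not buffered for one cycle and the input data were valid for only a single cycle, the second stage would use the wrong a2~a4 and b2~b4 values in the following cycle. Registering the data that the first stage does not process keeps everything that entered in the same cycle travelling down the pipeline together.

The second and third stages are written in the same way: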

//pipeline stage 2
assign ready_r1 = ~valid_r2 || ready_r2;
always @(posedge clk)begin
  if(!rstn)begin
    valid_r2 <= 1'b0;
  end
  else if(ready_r1)begin   //if this stage is ready, take in the upstream valid
    valid_r2 <= valid_r1;
  end
end
always @(posedge clk)begin
  if(ready_r1 && valid_r1)begin //update only when stage 1 holds valid data and this stage is ready
    temp1 <= a1    * b1;
    temp2 <= a2_r1 * b2_r1;
    temp3 <= a3_r1 * b3_r1;
    temp4 <= a4_r1 * b4_r1;
  end
end

//pipeline stage 3
assign ready_r2 = ~valid_r3 || ready_i;
always @(posedge clk)begin
  if(!rstn)begin
    valid_r3 <= 1'b0;
  end
  else if(ready_r2)begin
    valid_r3 <= valid_r2;
  end
end

always @(posedge clk)begin
  if(ready_r2 && valid_r2)begin
    result <= temp1 + temp2 + temp3 + temp4;
  end
end

assign valid_o = valid_r3;

That completes adding handshake signals to the three-stage pipeline. Now let us test it.

initial begin
  clk   <= 1'b0;
  rstn  <= 1'b0;
  ready_i <= 1'b1;
  valid_i <= 1'b0;
  #20
  rstn <= 1'b1;
  #45
  ready_i <= 1; valid_i <= 1;   //normal pipeline operation
  a2 <= 8'd2; a3 <= 8'd2; a4 <= 8'd2;
  b2 <= 8'd2; b3 <= 8'd2; b4 <= 8'd2;
  c1 <= 8'd2; c2 <= 8'd2; c3 <= 8'd2; c4 <= 8'd2;
  #10
  a2 <= 8'd3; a3 <= 8'd3; a4 <= 8'd3;
  b2 <= 8'd3; b3 <= 8'd3; b4 <= 8'd3;
  c1 <= 8'd3; c2 <= 8'd3; c3 <= 8'd3; c4 <= 8'd3;
  #10
  ready_i <= 1; valid_i <= 0;   //upstream stall (no valid input)
  a2 <= 8'd4; a3 <= 8'd4; a4 <= 8'd4;
  b2 <= 8'd4; b3 <= 8'd4; b4 <= 8'd4;
  c1 <= 8'd4; c2 <= 8'd4; c3 <= 8'd4; c4 <= 8'd4;
  #30
  a2 <= 8'd5; a3 <= 8'd5; a4 <= 8'd5;
  b2 <= 8'd5; b3 <= 8'd5; b4 <= 8'd5;
  c1 <= 8'd5; c2 <= 8'd5; c3 <= 8'd5; c4 <= 8'd5;
  #10
  ready_i <= 0; valid_i <= 1;   //downstream back-pressure
  a2 <= 8'd6; a3 <= 8'd6; a4 <= 8'd6;
  b2 <= 8'd6; b3 <= 8'd6; b4 <= 8'd6;
  c1 <= 8'd6; c2 <= 8'd6; c3 <= 8'd6; c4 <= 8'd6;
  #30
  a2 <= 8'd7; a3 <= 8'd7; a4 <= 8'd7;
  b2 <= 8'd7; b3 <= 8'd7; b4 <= 8'd7;
  c1 <= 8'd7; c2 <= 8'd7; c3 <= 8'd7; c4 <= 8'd7;
  #10
  ready_i <= 1; valid_i <= 0;   //data transfer finished
  a2 <= 8'd8; a3 <= 8'd8; a4 <= 8'd8;
  b2 <= 8'd8; b3 <= 8'd8; b4 <= 8'd8;
  c1 <= 8'd8; c2 <= 8'd8; c3 <= 8'd8; c4 <= 8'd8;
end

Full code

module pipeline_handshake(
  input                 clk                     ,
  input                 rstn                    ,
  input         [7:0]   a2,a3,a4,b2,b3,b4       ,
  input         [7:0]   c1,c2,c3,c4             ,
  output  reg   [19:0]  result                  ,

  input                 ready_i                 ,
  input                 valid_i                 ,
  output                ready_o                 ,
  output                valid_o                 
);

wire ready_r1,ready_r2;
reg valid_r1,valid_r2,valid_r3;
reg [7:0] a1,b1;
reg [7:0] a2_r1,a3_r1,a4_r1,b2_r1,b3_r1,b4_r1;

reg [15:0] temp1,temp2,temp3,temp4;

//pipeline stage 1
assign ready_o = ~valid_r1 || ready_r1;
always @(posedge clk)begin
  if(!rstn)begin
    valid_r1 <= 1'b0;
  end
  else if(ready_o)begin    //if this stage is ready, take in the upstream valid
    valid_r1 <= valid_i;
  end
end

always @(posedge clk)begin
  if(ready_o && valid_i)begin //data is accepted when ready and valid are both high
    a1 <= c1 + c2;
    b1 <= c3 + c4;
    a2_r1 <= a2; a3_r1 <= a3; a4_r1 <= a4; //delay the unused inputs one cycle for stage 2
    b2_r1 <= b2; b3_r1 <= b3; b4_r1 <= b4;
  end
end


//pipeline stage 2
assign ready_r1 = ~valid_r2 || ready_r2;
always @(posedge clk)begin
  if(!rstn)begin
    valid_r2 <= 1'b0;
  end
  else if(ready_r1)begin   //if this stage is ready, take in the upstream valid
    valid_r2 <= valid_r1;
  end
end
always @(posedge clk)begin
  if(ready_r1 && valid_r1)begin
    temp1 <= a1    * b1;
    temp2 <= a2_r1 * b2_r1;
    temp3 <= a3_r1 * b3_r1;
    temp4 <= a4_r1 * b4_r1;
  end
end

//pipeline stage 3
assign ready_r2 = ~valid_r3 || ready_i;
always @(posedge clk)begin
  if(!rstn)begin
    valid_r3 <= 1'b0;
  end
  else if(ready_r2)begin
    valid_r3 <= valid_r2;
  end
end

always @(posedge clk)begin
  if(ready_r2 && valid_r2)begin
    result <= temp1 + temp2 + temp3 + temp4;
  end
end

assign valid_o = valid_r3;

endmodule

Full testbench

module pipeline_handshake_tb();

reg clk,rstn;

always #5 clk = ~clk;

reg valid_i,ready_i;
wire ready_o,valid_o;

reg [7:0] a2,a3,a4,b2,b3,b4,c1,c2,c3,c4;
wire [19:0] result;

initial begin
  clk   <= 1'b0;
  rstn  <= 1'b0;
  ready_i <= 1'b1;
  valid_i <= 1'b0;
  #20
  rstn <= 1'b1;
  #45
  ready_i <= 1; valid_i <= 1;   //normal pipeline operation
  a2 <= 8'd2; a3 <= 8'd2; a4 <= 8'd2;
  b2 <= 8'd2; b3 <= 8'd2; b4 <= 8'd2;
  c1 <= 8'd2; c2 <= 8'd2; c3 <= 8'd2; c4 <= 8'd2;
  #10
  a2 <= 8'd3; a3 <= 8'd3; a4 <= 8'd3;
  b2 <= 8'd3; b3 <= 8'd3; b4 <= 8'd3;
  c1 <= 8'd3; c2 <= 8'd3; c3 <= 8'd3; c4 <= 8'd3;
  #10
  ready_i <= 1; valid_i <= 0;   //upstream stall (no valid input)
  a2 <= 8'd4; a3 <= 8'd4; a4 <= 8'd4;
  b2 <= 8'd4; b3 <= 8'd4; b4 <= 8'd4;
  c1 <= 8'd4; c2 <= 8'd4; c3 <= 8'd4; c4 <= 8'd4;
  #30
  a2 <= 8'd5; a3 <= 8'd5; a4 <= 8'd5;
  b2 <= 8'd5; b3 <= 8'd5; b4 <= 8'd5;
  c1 <= 8'd5; c2 <= 8'd5; c3 <= 8'd5; c4 <= 8'd5;
  #10
  ready_i <= 0; valid_i <= 1;   //downstream back-pressure
  a2 <= 8'd6; a3 <= 8'd6; a4 <= 8'd6;
  b2 <= 8'd6; b3 <= 8'd6; b4 <= 8'd6;
  c1 <= 8'd6; c2 <= 8'd6; c3 <= 8'd6; c4 <= 8'd6;
  #30
  a2 <= 8'd7; a3 <= 8'd7; a4 <= 8'd7;
  b2 <= 8'd7; b3 <= 8'd7; b4 <= 8'd7;
  c1 <= 8'd7; c2 <= 8'd7; c3 <= 8'd7; c4 <= 8'd7;
  #10
  ready_i <= 1; valid_i <= 0;   //data transfer finished
  a2 <= 8'd8; a3 <= 8'd8; a4 <= 8'd8;
  b2 <= 8'd8; b3 <= 8'd8; b4 <= 8'd8;
  c1 <= 8'd8; c2 <= 8'd8; c3 <= 8'd8; c4 <= 8'd8;
end

pipeline_handshake u_pipeline_handshake(
  .clk      (clk)     ,
  .rstn     (rstn)    ,
  .a2(a2),.a3(a3)     ,
  .a4(a4),.b2(b2)     ,
  .b3(b3),.b4(b4)     ,
  .c1(c1),.c2(c2)     ,
  .c3(c3),.c4(c4)     ,
  .result   (result)  ,
  .ready_i  (ready_i) ,
  .ready_o  (ready_o) ,
  .valid_i  (valid_i) ,
  .valid_o  (valid_o)

);

endmodule

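Waveform

The waveform is the key part here, so please study it carefully. Once mastered, the pipeline handshake is very useful in real projects!

The testbench emulates normal operation, an upstream stall, downstream back-pressure, and finally the state after all data has been sent, to check whether the pipeline corrupts or loses data in any of these situations.

Once wrapped in a module, the design is effectively an IP that pipelines a long-delay block of combinational logic. For this IP, input data is accepted when valid_i and ready_o are both high, so the inputs are valid when a2/b2 equal 2, 3, and 6. At the start the input is valid for two cycles, carrying 2 and then 3. Three cycles later valid_o goes high for two cycles. Because ready_i is 1, the downstream can accept data, so valid_o drops again after those two cycles, once both results have been transferred.

After two cycles of valid input the upstream stalls and the input is no longer valid, so after the two results have been output valid_o stays low until the upstream supplies valid data again.

The testbench then enters the downstream back-pressure state. Back-pressure does not reach the first stage immediately; it propagates upstream stage by stage as the stages fill up. An empty pipeline therefore does not stop at once under back-pressure, but only after incoming data has filled it. Here the pipeline takes in three cycles of the value 6, which fills it, and then stalls (ready_o goes low, telling the upstream that no more data can be accepted).

When the input stops, ready_i is raised again, so the downstream no longer applies back-pressure. By then the upstream has finished sending, so the pipeline accepts no new data and instead computes and outputs the three values of 6 it took in earlier. In the waveform, ready_i and valid_o are indeed both high for three cycles; after the three stages have been drained, valid_o goes low and the pipeline is empty.

This pipeline handshake is one of the tidier pieces of code I have written, and it has solved many data-transfer problems for me. Feel free to borrow it if you find it useful.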
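To cross-check the waveform reading in text form, a small monitor can log every output transfer. This is a hedged sketch: the module `handshake_monitor` and its port names are hypothetical additions, not part of the original testbench. With all inputs equal to k, the design computes (c1+c2)*(c3+c4) + a2*b2 + a3*b3 + a4*b4 = 7k^2, so with the stimulus in this article the log should show 28, 63, and then 252 three times.

```verilog
// Hypothetical helper (not part of the original article): prints one line
// for every output transfer, i.e. every rising clock edge on which
// valid and ready are both high.
module handshake_monitor #(
  parameter WIDTH = 20
)(
  input               clk   ,
  input               rstn  ,
  input               valid ,   // connect to valid_o of the pipeline
  input               ready ,   // connect to ready_i of the pipeline
  input  [WIDTH-1:0]  data      // connect to result
);

integer count;

initial count = 0;

always @(posedge clk)begin
  if(rstn && valid && ready)begin
    count = count + 1;
    $display("t=%0t transfer %0d: result=%0d", $time, count, data);
  end
end

endmodule
```

One way to use it, assuming it is instantiated inside pipeline_handshake_tb next to the design under test:

```verilog
handshake_monitor u_mon(
  .clk(clk), .rstn(rstn), .valid(valid_o), .ready(ready_i), .data(result)
);
```

The original testbench never calls $finish, so the simulation has to be stopped by the simulator's run-time limit or by a statement added for that purpose.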

