Sunday, 18 January 2015

How to extract clock and data from input bit stream in serial communication

Courtesy: http://www.arrowdevices.com/blog/beginners-guide-to-clock-data-recovery/

Different Techniques of Data Communication: 

Before starting on CDR, we will have a look at different techniques of data communication, which are:

1. Serial Data Communication

In serial communication data bits are transmitted sequentially one by one.
[Figure: serial data communication]

2. Parallel Data Communication

In parallel communication data bits are driven on multiple wires simultaneously.
[Figure: parallel data communication]
Looking at the figures above, one might conclude that parallel communication is much faster than serial communication.
Why, then, is serial communication preferred over parallel communication?
Because in practice parallel communication is not faster than serial communication, for the following reasons:
a)     Skew
The path length differs for every bit. As a result, some bits can arrive earlier than others, which may corrupt the information.
[Figure: parallel transmission with skew]
To solve this you can pad the bits, but at the cost of speed: every link is reduced to the speed of the slowest one.
b)     Inter symbol interference and Cross talk
Several links running in parallel introduce ISI and crosstalk into the system, and both get more severe as the link gets longer. This limits the length of a connection.
c)     Limitation of I/O pin count
Parallel data communication requires a lot more I/O pins than what is required by serial data communication.

What is Clock Data Recovery? 

Since most high-speed serial interfaces do not have an accompanying clock, the receiver needs to recover the clock in order to sample the data on the serial lines.
To recover the sampling clock, the receiver needs a reference clock of approximately the same frequency. To generate the recovered clock, the receiver phase-aligns the reference clock to the transitions on the incoming data stream. This is called clock recovery.
Sampling the incoming data signal with the recovered clock to generate a bit stream is called data recovery. Together, the two are called Clock Data Recovery, or CDR.
CDR is required to recover data from an incoming data stream in the absence of an accompanying clock signal, without bit errors due to over- or under-sampling.

How Does Clock Data Recovery Work?

The two main functions of CDR are frequency detection and phase alignment.

I)  Frequency Detection

Frequency detection is the process of locking onto a frequency that is retrieved from the incoming data stream. This is done by measuring the time difference between two consecutive edges on the data stream.
This locked frequency is used to regenerate the transmitted bit stream.
To make frequency detection more familiar, here is an analogy with punctuation in a sentence. You may have observed that whenever a road undergoes repair work, the construction company puts up a board with a message asking passing vehicles to slow down. The message is something like this:
                                               SLOW, MEN AT WORK
Now if the same message is written without proper punctuation, it may imply something totally different:
                                               SLOW MEN AT WORK
Punctuation is like the detected frequency: locking onto the wrong frequency leads to incorrect data sampling.
[Figure: frequency sampling]
So the question is: how can an incoming bit stream be sampled correctly?
One solution that comes to mind immediately is to sample the bit stream at the same frequency at which it was transmitted.
To do that, the receiver has to generate a clock with the same frequency at which the data was transmitted. But two different clock generators cannot produce clocks with exactly the same frequency, even if they have the same specifications.
Nor is it possible to generate a clock with a perfectly precise frequency.
At the same time, a minute difference in sampling frequency can lead to a bit error, as the following diagram shows:
[Figure: minor difference between TX and RX frequency]
As shown in the figure above, a single bit gets sampled twice because of a minute difference between the TX and RX frequencies.
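As a rough numerical illustration of this effect, the sketch below samples bits of period 100 with a receiver clock of period 90 and reports the first bit that is hit twice. Both periods and the module name are assumed values chosen for the example; they are not taken from the figure.

```systemverilog
// Illustration only: the periods are assumed values, not taken from the article.
module double_sample_sketch;
  localparam int TX_PERIOD = 100;  // bit period of the transmitter (time units)
  localparam int RX_PERIOD = 90;   // slightly shorter receiver clock period

  int prev_idx;  // index of the bit hit by the previous sample
  int t;         // current sampling instant
  int idx;       // index of the bit on the wire at time t

  initial begin
    prev_idx = -1;
    for (int k = 0; k < 10; k++) begin
      t   = RX_PERIOD / 2 + k * RX_PERIOD;  // k-th receiver sampling instant
      idx = t / TX_PERIOD;                  // which transmitted bit is on the wire
      if (idx == prev_idx) begin
        $display("sample %0d at t=%0d hits bit %0d again", k, t, idx);
        break;
      end
      prev_idx = idx;
    end
  end
endmodule
```

With these numbers the sixth sample (k = 5, t = 495) lands in bit 4 a second time, although each individual period is off by only 10%. A smaller mismatch just takes more bits to produce the same slip.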
How else can a clock with the same frequency as the TX clock be generated?
This can be done by checking the edges on the incoming bit stream. However, in this process the initial bits that are used to detect the frequency are lost. To solve this, a particular set of bit sequences is transmitted before the valid data. These are called training sequences. Training sequences possess a very high edge density, so that the receiver can easily lock onto a frequency by checking consecutive edges on the wire before the start of valid data. The figure below shows a sequence with high edge density.
[Figure: sequence with high edge density]
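One possible way to turn edge spacing into a period estimate is sketched below. It assumes the edge timestamps of the training sequence have already been captured into an array, and it takes the smallest gap between consecutive edges as the bit period. The timestamps, the function name and the choice of the minimum gap are all assumptions of this example, not something the original post prescribes.

```systemverilog
// Sketch only: estimates the bit period from edge timestamps of a
// training sequence. Taking the minimum gap is an assumption of this example.
module freq_detect_sketch;
  // Hypothetical edge timestamps (time units) of an alternating 1010... pattern.
  int edge_times[] = '{0, 10, 20, 30, 40, 50};

  function automatic int estimate_bit_period(int times[]);
    int best;
    int gap;
    best = 0;
    for (int i = 1; i < times.size(); i++) begin
      gap = times[i] - times[i-1];             // spacing of consecutive edges
      if (best == 0 || gap < best) best = gap; // keep the smallest spacing seen
    end
    return best;                               // 0 if fewer than two edges
  endfunction

  initial
    $display("estimated bit period = %0d", estimate_bit_period(edge_times));
endmodule
```

For the alternating pattern above every gap is one bit long, so the estimate is 10 time units. Using the minimum rather than the first gap keeps the estimate usable if the sequence contains an occasional pair of identical bits.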
The frequency has now been recovered from the incoming bit stream, and RX clock can be generated based on it.
This recovered frequency is fine for the ideal case, when no noise is introduced in transmission, i.e. the TX clock frequency is the same throughout and every data bit lasts an integral multiple of the TX clock period. In practice that is not true, as a number of effects disturb the data transmission and distort the uniformity of the clock.
The figure below depicts a real-world clock with variations in its period.
[Figure: clock with variable period]
Two effects mainly affect high-speed serial data communication:
A)    Jitter:
Jitter is a shift in the edges of a periodic signal. It breaks the periodicity of the signal.
[Figure: jitter]
Jitter is a short-term effect. It is modeled here as a Gaussian distribution with zero mean, so the cumulative effect of jitter averages out to zero.
[Figure: jitter distribution]
Since jitter shifts the edges of the clock signal, the question is: what is the optimum position to sample a bit?
A bit should be sampled at its centre. This is the optimum position, because it tolerates the maximum shift of the edges on either side. However, if the shift of an edge becomes greater than half of the bit period, there will be a bit error.
[Figure: jitter-free range]
B)    PPM (parts per million):
PPM is the inaccuracy of certain components in a circuit (the quartz crystal in the case of a clock generator), which leads to a signal with an inaccurate period. PPM does not break the periodicity of a signal. As the name states, it is a long-term effect that denotes the inaccuracy in the bit period over a million clock cycles. PPM is additive or subtractive in nature.
[Figure: PPM effect]
Only if the cumulative effect of jitter or PPM in TX CLK becomes more than half of the RX CLK period will there be errors due to over- or under-sampling.
The example below shows how ongoing variations in the incoming data stream can affect the sampling of data. The same example is used to resolve the issues as we progress.
[Figure: effect of variations on data sampling]
RX CLK(FD) is the clock whose frequency was locked during frequency detection. When the incoming data stream is sampled on RX CLK(FD), a single bit gets sampled twice, as depicted in the red box. This happens because of the variations in the incoming bit stream.
To counter these variations in the frequency of TX CLK, the second function of CDR, phase alignment, comes into the picture. It readjusts the RX CLK edges.

II) Phase Alignment

Phase alignment is the process of matching the phase of one signal to another. Here it means matching the phase of the clock recovered in frequency detection to the incoming bit stream.
Here is an analogy for phase alignment.
You may have seen an analog radio with two knobs, coarse tune and fine tune. To listen to a station, the coarse tune knob is used to lock onto a frequency where the signal is audible but with some disturbance. Coarse tuning corresponds to frequency detection, and the disturbances to jitter and PPM. To remove the disturbances and make the voice clear, the fine tune knob adjusts the pre-locked frequency slightly up or down until the signal is clean. Fine tuning is akin to phase alignment.
[Figure: radio tuning analogy for phase alignment]
The following rules are followed for phase alignment:
  1. If a transition is detected on the wire, set the level of RX CLK(FD+PD) to 1.
  2. If an RX CLK(FD) period has elapsed since a posedge on RX CLK(FD+PD) and no transition has been detected on the wire, assert a posedge on RX CLK(FD+PD).
Here RX CLK(FD) is the clock whose frequency was locked during frequency detection, and RX CLK(FD+PD) is the clock produced by phase alignment.
It is time to look into how phase alignment works. Let us take the same example considered earlier for PPM and jitter.
[Figure: phase alignment applied to the PPM/jitter example]
Here the clock period of RX CLK(FD) = 10.
Clocks that have not been assigned a period in the figure have the default period of 10.
Previously we saw that the last bit was sampled twice as a result of a continuous constant variation (from 10 to 12) in TX CLK (which can be caused by PPM). Now, as depicted in the red box, the bit is sampled correctly.
In the first TX clock cycle the period is 10 time units, which was locked during frequency detection and is also reflected on RX CLK(FD+PD). According to rule 1, the edge on DATA sets the level of RX CLK(FD+PD) to 1 (denoted by the first dotted arrow). A negedge is then asserted on RX CLK(FD+PD) after half of the RX CLK(FD) period. The next posedge is asserted on RX CLK(FD+PD) after another half of the RX CLK(FD) period or at a transition on DATA, whichever comes first (rule 2).
In the sixth and seventh TX clock cycles the DATA bits are 0 and 0, so there is no transition on the line. RX CLK(FD+PD) therefore follows rule 2 and keeps the period of RX CLK(FD), depicted in the first cycle of RX CLK(FD+PD) in the red box. Since the period of the seventh clock cycle is 12 time units, the transition on DATA occurs 2 time units later than expected. By then RX CLK(FD+PD) has already asserted its posedge and started waiting for half of the RX CLK(FD) period before asserting the negedge. After 2 time units, however, the transition is detected, which restarts the wait of half of the RX CLK(FD) period. This leads to a negedge on RX CLK(FD+PD) after 7 time units instead of 5 (i.e. half of the RX CLK(FD) period), followed by a posedge after 5 time units (second cycle of RX CLK(FD+PD) in the red box). In this way phase alignment adjusts the clock period to follow constant variations in the incoming data stream.
The figure below depicts the negative jitter case.
[Figure: negative jitter case]
The third TX clock period has changed from 10 time units to 7 time units due to negative jitter. Phase alignment handles this variation correctly, as depicted in the red box.
The posedge on RX CLK(FD+PD) occurs when a transition is detected on DATA. The negedge on RX CLK(FD+PD) occurs after half of the RX CLK(FD) period. A transition is then seen on DATA before the next half of the RX CLK(FD) period completes, which causes a level transition from 0 to 1 on RX CLK(FD+PD) (shown by the third curved dotted arrow).
The figure below depicts the positive jitter case.
[Figure: positive jitter case]
The fifth TX clock period has changed from 10 time units to 13 time units due to positive jitter. Phase alignment handles this variation correctly, as depicted in the red box.
RX CLK(FD+PD) follows rule 2 and keeps the period of RX CLK(FD), depicted in the first cycle of RX CLK(FD+PD) in the red box. Since the period of the fifth clock cycle is 13, the transition on DATA occurs 3 time units later than expected. By then RX CLK(FD+PD) has already asserted its posedge and started waiting for half of the RX CLK(FD) period before asserting the negedge. After 3 time units, however, the transition is detected, which restarts the wait of half of the RX CLK(FD) period. This leads to a negedge on RX CLK(FD+PD) after 8 time units instead of 5 (i.e. half of the RX CLK(FD) period), followed by a posedge after 5 time units (second cycle of RX CLK(FD+PD) in the red box).
That covers the working of phase alignment.
One caveat!
There is one issue that frequency detection and phase alignment do not cover. Phase alignment works on transitions in the incoming data stream, but the stream may contain a long run of identical bits with no transition in it. In that case, if the cumulative shift of an edge becomes more than half of the recovered clock period (RX CLK(FD)), the data is sampled incorrectly and bit errors result.
To solve this problem, bit sequences are encoded before they are transmitted on the wire. The encoding limits the number of consecutive identical bits, which reduces the probability that the cumulative shift reaches more than half of the RX CLK(FD) period.
For example, in USB 3.0 the data bits are processed with 8b/10b encoding before being transmitted on the wire.
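To get a feeling for the numbers, the sketch below estimates how far the sampling point drifts over a run of identical bits when only a constant frequency offset is present (jitter is ignored). The 100 ppm offset and both function names are assumed values for illustration. The run length of 5 is the longest run that 8b/10b encoding allows.

```systemverilog
// Back-of-the-envelope sketch; the 100 ppm figure is an assumed example value.
module run_length_drift_sketch;
  // Fraction of a bit period by which the sampling point drifts over
  // 'run_length' bits without a transition, for a frequency offset in ppm.
  function automatic real drift_fraction(real ppm, int run_length);
    return run_length * ppm * 1.0e-6;
  endfunction

  // Number of transition-free bits after which the drift reaches half a bit.
  function automatic real bits_until_half_bit(real ppm);
    return 0.5 / (ppm * 1.0e-6);
  endfunction

  initial begin
    $display("drift over 5 bits at 100 ppm: %g of a bit",
             drift_fraction(100.0, 5));
    $display("bits until half-bit drift at 100 ppm: %0.0f",
             bits_until_half_bit(100.0));
  end
endmodule
```

At 100 ppm the drift over five bits is about 0.0005 of a bit period, while it would take roughly 5000 transition-free bits to drift by half a bit. Bounding the run length therefore keeps the offset-induced drift far inside the half-period margin and leaves that margin available for jitter.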
The figure below shows the behavioral block diagram of CDR.
[Figure: behavioral block diagram of CDR]
The incoming data stream is passed as input to the FD (frequency detector), the ED (edge detector) and a D flip-flop. The frequency detector generates a frequency based on the training sequences. The edge detector gives an output whenever it detects a transition on the incoming data. The outputs of the frequency detector and the edge detector are passed to the clock generator block, which generates a clock to sample the data. This generated clock and the incoming data are passed to the D flip-flop to regenerate the bit stream.
The figure below depicts the flow chart for clock generation after phase alignment.
[Figure: flow chart for clock generation after phase alignment]
RX CLK(FD+PD) is initialized to either 1 or 0. After initialization two parallel processes start: a wait timer for half of the RX CLK(FD) period, and edge detection on the incoming data. Whichever of the two completes first disables the other. If the timer times out before any edge is detected, the current level of RX CLK(FD+PD) is checked. If the level is 1, go to the “RX CLK(FD+PD)=0” activity and reset RX CLK(FD+PD) to 0; otherwise go to the “RX CLK(FD+PD)=1” activity. Then restart both parallel processes and wait for either one to complete. If an edge is detected before the wait timer completes, move to the “RX CLK(FD+PD)=1” activity and set the level of RX CLK(FD+PD) to 1. Then restart both parallel processes and wait for either one to complete.
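The flow chart maps quite directly onto a fork/join_any construct. The following is a minimal behavioral sketch of it, not the original author's code: the module, parameter and signal names are made up, HALF_PERIOD stands for half of the RX CLK(FD) period (assumed to come from frequency detection), and the small testbench stimulus is invented to mimic a stretched bit.

```systemverilog
`timescale 1ns/1ns

// Behavioral sketch of the clock generator flow chart (names are hypothetical).
module cdr_clk_gen #(parameter int HALF_PERIOD = 5) (
  input  logic data,    // incoming serial data
  output logic rx_clk   // corresponds to RX CLK(FD+PD)
);
  bit edge_seen;

  initial begin
    rx_clk = 1'b0;                          // initialize to a known level
    forever begin
      edge_seen = 1'b0;
      fork
        #(HALF_PERIOD);                     // wait timer: half of RX CLK(FD)
        begin                               // edge detector on the data line
          @(data);
          edge_seen = 1'b1;
        end
      join_any
      disable fork;                         // the winner cancels the other process
      if (edge_seen) rx_clk = 1'b1;         // transition seen: force level to 1
      else           rx_clk = ~rx_clk;      // timer expired: toggle the level
    end
  end
endmodule

// Invented stimulus: one bit is stretched from 10 to 12 time units.
module cdr_clk_gen_tb;
  logic data = 1'b0;
  logic rx_clk;

  cdr_clk_gen #(.HALF_PERIOD(5)) dut (.data(data), .rx_clk(rx_clk));

  initial begin
    #3  data = 1'b1;   // first transition aligns the clock
    #12 data = 1'b0;   // arrives 2 units later than a nominal 10-unit bit
    #10 data = 1'b1;   // nominal bit
    #30 $finish;
  end

  initial $monitor("t=%0t data=%b rx_clk=%b", $time, data, rx_clk);
endmodule
```

In this run the late transition arrives 2 time units after a timer-generated posedge, so the following negedge comes 7 time units after that posedge instead of 5, as in the walkthrough above. Because a data transition always leaves the clock at 1, the falling edge lands near the middle of a bit, which makes it a natural sampling point. If a timer expiry and a data transition fall in the same time step, the order in which the two branches run is simulator dependent, so this sketch should not be relied on for that corner case.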
So this was all about the behavioral modeling of clock and data recovery! You could also view a Slideshare presentation on behavioural clock data recovery here.

Tuesday, 13 January 2015

How to start developing custom uvm libraries

Many companies use customized methodology (UVM/OVM) libraries to suit the requirements of their environment. If a company uses additional debug mechanisms, or has common functionality to be shared across multiple components, it builds its own library. The method below provides a way of doing that.

Assume UVM is the testbench methodology and that all components use a common method, "check_ports". You want this method to be present in every component you extend from uvm_component. The example below does that.

Create a base class as shown below and put in it all the common features that have to be shared across the uvm_component classes.

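// Mixin: adds check_ports() to whichever uvm_component subclass is passed as T.
class lib_check_ports #(type T = uvm_component) extends T;
  
  function new(string name, uvm_component parent);
    super.new(name, parent);
  endfunction
  
  function void start_of_simulation_phase(uvm_phase phase);
    // Keep any behavior the base class attaches to this phase.
    super.start_of_simulation_phase(phase);
    check_ports();
  endfunction
  
  // Placeholder for the common check; here it only reports a message.
  function void check_ports();
    `uvm_info("LIB", "I'm checking if all of my ports are connected", UVM_LOW)
  endfunction
  
endclass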


Then we can create our lib_* class family just by parameterizing lib_check_ports with the appropriate base class.

typedef lib_check_ports #(uvm_component) lib_uvm_component;
typedef lib_check_ports #(uvm_monitor) lib_uvm_monitor;
typedef lib_check_ports #(uvm_driver) lib_uvm_driver;

We can do this for any uvm_component subclass. In testbench code, inherit from lib_uvm_monitor instead of uvm_monitor, and so on.
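A hedged end-to-end sketch of how such a library might be used is shown below. It assumes a simulator with the UVM library available. The package name lib_demo_pkg, the lib_uvm_test typedef and the classes my_monitor and demo_test are hypothetical names introduced for this example, and the mixin is repeated in condensed form only so that the snippet compiles on its own.

```systemverilog
// Sketch only: assumes UVM is installed; all names except the UVM ones
// and lib_check_ports/lib_uvm_monitor are hypothetical.
package lib_demo_pkg;
  import uvm_pkg::*;
  `include "uvm_macros.svh"

  // Condensed copy of the mixin so that this example is self-contained.
  class lib_check_ports #(type T = uvm_component) extends T;
    function new(string name, uvm_component parent);
      super.new(name, parent);
    endfunction
    function void start_of_simulation_phase(uvm_phase phase);
      super.start_of_simulation_phase(phase);
      check_ports();
    endfunction
    function void check_ports();
      `uvm_info("LIB", "I'm checking if all of my ports are connected", UVM_LOW)
    endfunction
  endclass

  typedef lib_check_ports #(uvm_monitor) lib_uvm_monitor;
  typedef lib_check_ports #(uvm_test)    lib_uvm_test;

  // Project monitor: inherits check_ports() through the library typedef.
  class my_monitor extends lib_uvm_monitor;
    `uvm_component_utils(my_monitor)
    function new(string name, uvm_component parent);
      super.new(name, parent);
    endfunction
  endclass

  // Minimal test that instantiates the monitor.
  class demo_test extends lib_uvm_test;
    `uvm_component_utils(demo_test)
    my_monitor mon;
    function new(string name, uvm_component parent);
      super.new(name, parent);
    endfunction
    function void build_phase(uvm_phase phase);
      super.build_phase(phase);
      mon = my_monitor::type_id::create("mon", this);
    endfunction
  endclass
endpackage

module top;
  import uvm_pkg::*;
  import lib_demo_pkg::*;
  initial run_test("demo_test");
endmodule
```

Both demo_test and its child mon should report the "LIB" message at start of simulation, since each of them picks up start_of_simulation_phase() from the mixin.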

Interface class in SystemVerilog:

courtesy: http://blog.verificationgentleman.com/2014/08/systemverilog-2012-has-even-more-class.html

An interface class has nothing to do with the interface construct. It represents the same concept as an interface in Java (a lot of SystemVerilog's object oriented programming constructs are pretty similar to Java's). What does an interface class do? It's basically a collection of method declarations. Notice I've used the word 'declarations' and not 'definitions', as all methods of an interface class must be pure. Another class can implement an interface class, which requires it to implement all of the methods declared in that interface.
Why is this useful? I'll answer this question with the help of an example. Let's say I have my own library. In this library I expect to operate on a certain type of objects (by operating on objects I mean calling methods on them). Concretely, let's say I have the 'drivable' interface, which defines the capabilities of an object that can be driven (I don't want 'car' here and you'll see why in just a bit). What can a drivable object do? Well, it can accelerate, it can turn and it can brake, to name a few things. We model these as functions that a drivable object has:
interface class drivable_if;
  pure virtual function void accelerate();
  pure virtual function void turn_left();
  pure virtual function void turn_right();
  pure virtual function void brake();
endclass
A driver can use these methods to drive a drivable object:
class driver;
  protected drivable_if m_drivable;
  
  function new(drivable_if drivable);
    m_drivable = drivable;
  endfunction

  function void drive();
    m_drivable.accelerate();
    m_drivable.turn_right();
    m_drivable.accelerate();
    m_drivable.turn_left();
    m_drivable.brake();
  endfunction
endclass
Our driver class can operate on any object that provides the methods of the drivable_if interface, regardless of how these methods are implemented. (I'll use the term 'interface' instead of 'interface class' in this post, but just know that this is what I mean.)
In our code (outside of the library), we define the car class, that implements the drivable_if interface:
class car implements drivable_if;
  
  //----------------------------------------
  // methods of drivable_if  
  //----------------------------------------
  
  virtual function void accelerate();
    $display("I'm accelerating");
  endfunction

  virtual function void turn_left();
    $display("I'm turning left");
  endfunction

  virtual function void turn_right();
    $display("I'm turning right");
  endfunction

  virtual function void brake();
    $display("I'm braking");
  endfunction
endclass
We can now use an instance of this class, together with an instance of the driver class:
module top;
  initial begin
    static car the_car = new();
    static driver the_driver = new(the_car);
    the_driver.drive();
  end
endmodule
Remember, the driver class and the drivable_if interface are defined in their own package (that we downloaded, bought, etc.), which we'll assume we can't change. We could, however, let our own car object be driven by the driver object, even though the driver class did not know anything about the car class. This is because the car class provides the methods that the driver expects to be able to drive it. It doesn't matter how those methods were implemented, just that they were implemented.
What you're now probably going to ask is: "But Tudor, why didn't you just implement a virtual class? You can essentially get the same thing: you define the methods and you can't create any instances of that class." And you would be right, but what if we want our car class to implement another interface at the same time? If I use a virtual class, I'm in trouble, because you can only extend one base class. You can, however, implement as many interfaces as you want.
What else do you want to do with a car besides drive it? You want to insure it. I don't know how it is in other places, but in most (if not all) European countries, the insurance premium depends on the size of the car's engine. What it may also depend on is the accident history of the car (not technically true in the real world, but please bear with me on this one). Insuring a car is a different aspect than driving it, so it makes sense to have a separate library that handles this topic. Following the example from above, this is how the interface for an insurable object (notice I didn't say car) might look:
interface class insurable_if;
  pure virtual function int unsigned get_engine_size();
  pure virtual function int unsigned get_num_accidents();
  pure virtual function int unsigned get_damages(int unsigned accident_index);
endclass
Using these methods to query an object, an insurer could compute the premium for that object:
class insurer;
  virtual function int unsigned insure(insurable_if insurable);
    int engine_size = insurable.get_engine_size();
    int num_accidents = insurable.get_num_accidents();
    int damages;
    for (int i = 0; i < num_accidents; i++)
      damages += insurable.get_damages(i);

    // do some bogus calculation
    return engine_size * 10 + damages * 100;
  endfunction
endclass
Let's take our previous car class and expand it to be insurable. What we need to do is implement the insurable_if interface and define its methods:
class car implements drivable_if, insurable_if;
  protected int unsigned m_engine_size;
  protected int m_damages[];
  
  function new(int unsigned engine_size);
    m_engine_size = engine_size;
  endfunction

  function void crash(int damages);
    m_damages = new[m_damages.size() + 1] (m_damages);
    m_damages[m_damages.size() - 1] = damages;
  endfunction
  
  
  //----------------------------------------
  // methods of insurable_if  
  //----------------------------------------

  virtual function int unsigned get_engine_size();
    return m_engine_size;
  endfunction

  virtual function int unsigned get_num_accidents();
    return m_damages.size();
  endfunction

  virtual function int unsigned get_damages(int unsigned accident_index);
    assert (accident_index < get_num_accidents());
    return m_damages[accident_index];
  endfunction
  
  
  //----------------------------------------
  // methods of drivable_if  
  //----------------------------------------
  
  // ...
endclass
I've added a crash() method to simulate an accident. Let's insure our car:
module top;
  initial begin
    static car the_car = new(3);
    static driver the_driver = new(the_car);
    static insurer the_insurer = new();
    
    the_driver.drive();
    the_car.crash(500);
    $display("The insurance premium is ", the_insurer.insure(the_car));
  end
endmodule
What we can do now is drive the car, like before, but we can also insure it. We've managed to glue together two different behaviors into one single class (car) and then use them in objects that are each concerned with only one of these behaviors (driver and insurer). We also didn't mix in any information about insurability in the drivability package and vice versa. This wouldn't have been possible without interface classes.
If we were to use only inheritance, this would mean that we would need to have a base class that contained both the drivable_if and the insurable_if methods. Then, both of these libraries could operate on subclasses of this class. The biggest (and I really mean big) problem with this is that this creates tight coupling between the two libraries. What if we want to use a third library? Our base class would need to contain the methods this library uses to operate on objects as well. Throw a fourth library into the mix and it already becomes unmanageable. If we wanted to implement just one of these behaviors in a subclass, we would still be cluttered with methods from the others. Using only inheritance results in big class hierarchies, with a lot of duplication and parallel branches.
Look at the UVM for example. It tries to do everything, simply because it has to do as much as possible. The reason is that once you're inheriting from a UVM class, you're kind of stuck in that class hierarchy. You have to use libraries that can operate with UVM classes. By using interface classes, you can happily extend from any UVM class, but at the same time implement any number of interfaces you want. This means you can now work with libraries that are completely agnostic of UVM. With interface classes using UVM stops being an "either/or" proposition.
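As a small sketch of that idea, the code below lets a UVM sequence item be consumed by a library that knows nothing about UVM. It assumes a simulator with UVM and SystemVerilog-2012 interface class support. The packages, describable_if, reporter and my_item are all hypothetical names invented for this illustration.

```systemverilog
// Hypothetical UVM-agnostic library: it only knows about describable_if.
package agnostic_lib_pkg;
  interface class describable_if;
    pure virtual function string describe();
  endclass

  class reporter;
    static function void report(describable_if obj);
      $display("%s", obj.describe());
    endfunction
  endclass
endpackage

// Testbench code: the item lives in the UVM hierarchy and
// additionally implements the library's interface.
package tb_pkg;
  import uvm_pkg::*;
  `include "uvm_macros.svh"
  import agnostic_lib_pkg::*;

  class my_item extends uvm_sequence_item implements describable_if;
    rand bit [7:0] addr;

    `uvm_object_utils(my_item)

    function new(string name = "my_item");
      super.new(name);
    endfunction

    // Method required by describable_if.
    virtual function string describe();
      return $sformatf("my_item addr=0x%0h", addr);
    endfunction
  endclass
endpackage

module top;
  import tb_pkg::*;
  import agnostic_lib_pkg::*;

  initial begin
    my_item item;
    item = new("item");
    item.addr = 8'h2a;
    reporter::report(item);   // the library sees only a describable_if
  end
endmodule
```

The reporter class never imports uvm_pkg, yet it can operate on my_item because the handle is implicitly converted to the interface class type when it is passed in.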
The UVM BCL could also use a makeover. The current implementation of TLM is a mess in my opinion. It relies heavily on macros, with all TLM methods declared in all port types. The ones that are not supposed to be used in a certain port are blocked at run time by issuing an error. Ideally, calling a method not intended for a specific port should not make it past compile. Have a look at the implementation and tell me if that code is clear and maintainable to you (the files are uvm_tlm_ifs.svh, uvm_ports.svh and uvm_tlm_imps.svh).
TLM is implemented very cleanly in SystemC, using the interface concept. Dave Rich already touched on this subject in his DVCon Paper, "The Problems with Lack of Multiple Inheritance in SystemVerilog and a Solution". He already stated that interfaces would solve the problem of having to copy-paste a lot of code between classes. The paper was written in 2010, so there wasn't any interface class yet (though I suspect it was in the works). Here's a short example of how the TLM get interfaces could be implemented:
interface class uvm_blocking_get_if #(type T=int);
  pure virtual task get(output T t);
  pure virtual task peek(output T t);
endclass


interface class uvm_nonblocking_get_if #(type T=int);
  pure virtual function bit try_get(output T t);
  pure virtual function bit can_get();
  pure virtual function bit try_peek(output T t);
  pure virtual function bit can_peek();
endclass


interface class uvm_get_if #(type T=int) extends
  uvm_blocking_get_if #(T),
  uvm_nonblocking_get_if #(T);
endclass
Here we also see another cool fact: an interface class can extend as many interface classes as it wants. This means that the uvm_get_if will declare all of the methods of both the uvm_blocking_get_if and of the uvm_nonblocking_get_if. The family of get ports will implement these interfaces:
class uvm_blocking_get_port #(type T=int) implements
  uvm_blocking_get_if #(T);
  // ...
  
  virtual task get(output T t);
    // ...
  endtask
  
  // other uvm_blocking_get_if interface methods ...
endclass


class uvm_nonblocking_get_port #(type T=int) implements
  uvm_nonblocking_get_if #(T);
  // ...
  
  virtual function bit try_get(output T t);
    // ...
  endfunction
  
  // other uvm_nonblocking_get_if interface methods ...
endclass


class uvm_get_port #(type T=int) implements
  uvm_get_if #(T);
  // ...
  
  // uvm_blocking_get_if interface methods ...
  
  // uvm_nonblocking_get_if interface methods ...
endclass
Doing the following will now result in a compile error:
uvm_nonblocking_get_port #(some_item) some_port = new();
some_item item;

some_port.get(item);
The get(...) method is not declared in the uvm_nonblocking_get_if interface, so the compiler can immediately flag an error, something that isn't possible in the current release of the UVM library.
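The same effect can be tried out without UVM. The sketch below is a standalone illustration with hypothetical names (nb_get_if, int_source); it is not the UVM implementation. A handle of the interface class type only exposes the methods the interface declares.

```systemverilog
// Standalone illustration; nb_get_if and int_source are hypothetical names.
interface class nb_get_if #(type T = int);
  pure virtual function bit try_get(output T t);
endclass

class int_source implements nb_get_if #(int);
  protected int m_q[$];

  function void put(int v);
    m_q.push_back(v);
  endfunction

  // Returns 1 and an item if one is available, 0 otherwise.
  virtual function bit try_get(output int t);
    if (m_q.size() == 0) return 0;
    t = m_q.pop_front();
    return 1;
  endfunction
endclass

module top;
  initial begin
    int_source       src;
    nb_get_if #(int) port;
    int              v;

    src  = new();
    port = src;          // view the object through the interface only
    src.put(42);

    if (port.try_get(v)) $display("got %0d", v);

    // port.get(v);      // would not compile: get() is not declared in nb_get_if
    // port.put(1);      // would not compile either: put() is not part of nb_get_if
  end
endmodule
```

Running this prints "got 42". Uncommenting either of the last two calls should be rejected at compile time, which is the kind of early check the run-time error approach cannot give.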
Now, dare I say that the whole TLM aspect could be spun out into a standalone library that could be used by others that want to use TLM, but not the whole UVM? Yes I do dare, but whether this will happen is doubtful. Many more such examples could be found in the UVM BCL; to name one, the whole sequence mechanism is also a pretty unwieldy beast.
I hope this post inspires you to incorporate interface classes into your coding to enable the creation of reusable libraries that are orthogonal to each other, but can be used together. A great example of this is the Java standard library. I also hope that this new feature will lead to the creation of more open source packages that can accomplish various tasks. Great initiatives are svlib and cluelib. I don't know if they use interface classes as I didn't look at the code, but if they don't, then they should consider it.
If you want to learn more about interface classes, you can find more info in the LRM. You can also read up more on interfaces in various Java articles as the concepts are pretty much the same. You can find the code for the drivable and insurable interfaces on the blog repository.