11
Feb

In the last post of the series (http://www.vip-central.org/2012/10/a-strategy-to-verify-an-axi-ace-compliant-interconnect-part-2-of-4/) I wrote about basic coherent testing. In this post I will discuss some of the nuances of the specification relative to accesses to overlapping addresses. Since multiple masters may be sharing the same location and the data could be distributed across the caches of different masters, this is an important part of the verification of a coherent system. The interconnect plays a very important role in maintaining coherency for such accesses.

The Gory Details

There are three key aspects that the interconnect must take care of with respect to accesses to overlapping addresses.

  1. Sequencing transactions
  2. Timing of snoop accesses relative to responses to coherent transactions
  3. ‘Re-fetch’ of data from memory when the data originally read from memory may have become stale by the time all associated snoop transactions have completed.

Sequencing transactions

Consider the example below:

Here, Master 1 and Master 2 both want to write to the same location and store the line in their local caches at approximately the same time. To do so, each sends a MakeUnique transaction (represented in the figure by 1a and 2a).

For a moment, let us consider the effects of incorrect behavior by the interconnect. Suppose the interconnect sends MakeInvalid snoop transactions to both masters (represented by 1b and 2b), each corresponding to the MakeUnique transaction it received from the other master. Once the masters return their snoop responses (represented by 1c and 2c), the interconnect sends responses back to the masters (represented by 1d and 2d). When the transactions have completed, both masters update their caches to a Unique state. This violates the protocol, because a cacheline can be held in a Unique state by only one master. Moreover, each master may store a different value in its local cache, with each incorrectly “thinking” that it has the only copy of the cacheline. Clearly, the effect of not sequencing correctly is incoherency, as shown in the figure, where two masters have two different views of the data.

To deal with this, the specification requires that such accesses to overlapping addresses be sequenced. The specification states:

“It is the responsibility of the interconnect to ensure that there is a defined order in which transactions to the same cache line can occur, and that the defined order is the same for all components. In the case of two masters issuing transactions to the same cache line at approximately the same time, then the interconnect determines which of the transactions is sequenced first and which is sequenced last. The arbitration method used by the interconnect is not defined by the protocol. The interconnect indicates the order of transactions to the same cache line by sequencing transaction responses and snoop transactions to the masters. The ordering rules are:

• if a master issues a transaction to a cache line and it receives a snoop transaction to the same cache line before it receives a response to the transaction it has issued, then the snoop transaction is defined as ordered first.

• if a master issues a transaction to a cache line and it receives a response to the transaction before it receives a snoop transaction to the same cache line, then the transaction issued by the master is defined as ordered first.” [1]

In the above example, let us assume that the interconnect gives priority to Master 1. If so, it must send a snoop transaction (1b) to Master 2, wait for the snoop response (1c) and send the response back to Master 1 (1d). At the end of this sequence, Master 1 will have its cacheline in a unique state and may write a value in its cache. The interconnect may then sequence Master 2 and can send a snoop transaction (2b) to Master 1 which will invalidate the cacheline in Master 1, wait for a snoop response (2c) and send the response back to Master 2 (2d). At the end of this sequence, Master 1 has its cacheline invalidated and Master 2 will have its cacheline allocated to a Unique state.
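The two ordering rules quoted above can be turned into a small check. The following is only a sketch in plain SystemVerilog: the package, the function names and the idea of passing observation times as integers are all assumptions made for illustration, and are not part of the ACE specification or of any VIP.

```systemverilog
// Hypothetical helper package; all names are illustrative.
package ace_order_pkg;

  typedef enum { SNOOP_ORDERED_FIRST, OWN_TXN_ORDERED_FIRST } order_e;

  // View of one master for one cache line: compare when it saw the snoop
  // with when it saw the response to its own transaction.
  function automatic order_e classify_order(longint unsigned snoop_seen,
                                            longint unsigned resp_seen);
    if (snoop_seen < resp_seen)
      return SNOOP_ORDERED_FIRST;
    return OWN_TXN_ORDERED_FIRST;
  endfunction

  // Two masters racing for the same line agree on the order only if
  // exactly one of them saw its own transaction ordered first.
  function automatic bit views_agree(order_e m1_view, order_e m2_view);
    return (m1_view != m2_view);
  endfunction

endpackage

module ace_order_demo;
  import ace_order_pkg::*;

  initial begin
    order_e m1_view, m2_view;

    // Correct sequencing: Master 1 gets its response (1d) before it is
    // snooped (2b); Master 2 is snooped (1b) before its response (2d).
    m1_view = classify_order(.snoop_seen(40), .resp_seen(30));
    m2_view = classify_order(.snoop_seen(10), .resp_seen(60));
    $display("correct case agrees: %0b", views_agree(m1_view, m2_view));

    // Incorrect behavior: both masters are snooped before either response.
    m1_view = classify_order(.snoop_seen(10), .resp_seen(30));
    m2_view = classify_order(.snoop_seen(10), .resp_seen(30));
    $display("broken case agrees:  %0b", views_agree(m1_view, m2_view));
  end
endmodule
```

In the incorrect scenario described earlier, both masters see a snoop before their response, so each concludes that the other master was ordered first and the views disagree.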

Timing of Snoop Accesses Relative to Responses to Coherent Transactions

The specification lays down some rules on the ordering of responses to coherent transactions and snoop transactions to the same cacheline. These are given below:

“The interconnect must ensure the following:

• if the interconnect provides a master with a response to a transaction, it must not send that master a snoop transaction to the same cache line before it has received the associated RACK or WACK response from that master

• If the interconnect sends a snoop transaction to a master, it must not provide that master with a response to a transaction to the same cache line before it has received the associated CRRESP response from that master.” [1]

An important point to note is that this requirement does not apply to WriteBack and WriteClean transactions, although this is not explicitly stated in the specification. Applying the above rules to WriteBack and WriteClean transactions could lead to a deadlock. This is because a master that receives a snoop transaction to a cacheline is allowed to stall it until any pending WriteBack or WriteClean transactions that it has initiated, or is about to initiate, to the same cacheline are complete. In other words, this master must be allowed to receive a response to the WriteBack or WriteClean transaction before it allows an incoming snoop to proceed (that is, before it responds to it). If the above rule were applied to WriteBack or WriteClean transactions, the interconnect would not be able to send a response to the WriteBack or WriteClean transaction, since a snoop transaction has already been sent to the master. Therefore, it is important that this rule is not applied to WriteBack and WriteClean transactions.
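A monitor-side check for these two rules could look roughly like the sketch below. It is plain SystemVerilog with hypothetical names: the class, its hook methods and the `is_wb_or_wc` flag are assumptions for illustration. It tracks a single master port, assumes at most one outstanding transaction per cache line, and skips WriteBack and WriteClean entirely, following the interpretation given in this post.

```systemverilog
package ace_timing_chk_pkg;

  typedef bit [63:0] line_addr_t;

  // Tracks one master port. The methods are hypothetical hooks that a
  // monitor would call as it observes each event on that port.
  class port_timing_checker;
    bit awaiting_ack[line_addr_t];     // response sent, RACK/WACK pending
    bit awaiting_crresp[line_addr_t];  // snoop sent, CRRESP pending
    int unsigned violations;

    // A response to a coherent transaction is sent to the master.
    function void response_sent(line_addr_t line, bit is_wb_or_wc);
      if (is_wb_or_wc) return;  // exempt, per this post's interpretation
      if (awaiting_crresp.exists(line)) violations++;
      awaiting_ack[line] = 1;
    endfunction

    // RACK or WACK observed from the master.
    function void ack_seen(line_addr_t line);
      awaiting_ack.delete(line);
    endfunction

    // A snoop transaction is sent to the master.
    function void snoop_sent(line_addr_t line);
      if (awaiting_ack.exists(line)) violations++;
      awaiting_crresp[line] = 1;
    endfunction

    // CRRESP observed from the master.
    function void crresp_seen(line_addr_t line);
      awaiting_crresp.delete(line);
    endfunction
  endclass

endpackage

module ace_timing_chk_demo;
  import ace_timing_chk_pkg::*;

  initial begin
    port_timing_checker chk;
    chk = new();
    chk.response_sent(64'h1000, 0);
    chk.snoop_sent(64'h1000);    // too early: RACK not yet seen
    chk.ack_seen(64'h1000);
    chk.crresp_seen(64'h1000);
    chk.snoop_sent(64'h1000);    // legal now
    $display("violations = %0d", chk.violations);
  end
endmodule
```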

Re-fetching Data from Memory

In certain circumstances, data may have to be re-fetched from memory. For example, consider that Master 1 issues a ReadShared transaction and Master 2 which has a dirty copy of the cacheline issues a WriteBack transaction. Let us say that the interconnect issues a read from main memory for the ReadShared transaction. After the Read transaction sent to main memory is complete, let us assume that the WriteBack makes progress. After this, any snoop transaction sent by the interconnect will not return data because the WriteBack would have invalidated the cacheline in Master 2. However, if the interconnect uses the data received in the prior read to memory, it will be stale, because a WriteBack transaction has updated memory after the read to memory was issued. It is therefore necessary to re-fetch data from memory and use that data to respond to Master 1. How do we detect issues related to this? These can be detected through coherency checks. In the above example, the ReadShared transaction will be passed clean data and its contents should match that of memory. If it doesn’t, it probably means that the interconnect used stale data to respond to the ReadShared transaction.
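The coherency check described above can be reduced to a single comparison. The sketch below is illustrative only: the package, the function name and the 64-byte line width are assumptions, and `mem_data` is assumed to be the memory content sampled by the testbench when the read response is returned. `pass_dirty` models the PassDirty bit, RRESP[2], of the read response.

```systemverilog
package ace_refetch_chk_pkg;

  typedef bit [511:0] line_data_t;  // 64-byte cache line (assumption)

  // Returns 1 when the read data is acceptable.
  function automatic bit clean_read_ok(bit         pass_dirty,
                                       line_data_t rdata,
                                       line_data_t mem_data);
    // A dirty line may legitimately differ from memory.
    if (pass_dirty) return 1;
    // Clean data must match memory; a mismatch suggests stale data.
    return (rdata == mem_data);
  endfunction

endpackage

module ace_refetch_chk_demo;
  import ace_refetch_chk_pkg::*;

  initial begin
    // Stale case: memory was updated by the WriteBack after the first read.
    $display("stale read ok: %0b", clean_read_ok(0, 512'hA, 512'hB));
    // Re-fetched case: returned data matches current memory.
    $display("fresh read ok: %0b", clean_read_ok(0, 512'hB, 512'hB));
  end
endmodule
```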

Testing Accesses to Overlapping Addresses

Testing all the scenarios related to accesses to overlapping addresses can be overwhelming. In a given system, there are multiple ports of different interface types that can send transactions to overlapping addresses. However, not all combinations of masters accessing a given address are valid: some masters may be allowed to access only certain address spaces, and a group of masters that may access only a restricted part of the address space forms a shareability domain. Added to this, a master can initiate many different transaction types, with different initial states for the cacheline at a given address. Randomization and configuration-aware sequences can meet these requirements. A sequence that tests this could do the following:

  1. Based on the shareability domain given by a user, randomly choose two masters in that domain
  2. Based on the interface types of these masters, choose a random transaction type for each of these masters
  3. Initialize cachelines to valid, random states for a set of addresses
  4. Send transactions from both masters at the same time.
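The four steps above map naturally onto a constrained-random scenario object. The following is a minimal sketch in plain SystemVerilog, not the sequence used in any VIP: the class, the enum values and the constraints are assumptions for illustration, and the dependence of legal transaction types on interface type and initial line state is deliberately not modeled.

```systemverilog
package overlap_scenario_pkg;

  // Small, illustrative subsets of transaction types and line states.
  typedef enum { READ_SHARED, READ_UNIQUE, MAKE_UNIQUE } txn_e;
  typedef enum { INVALID, UNIQUE_CLEAN, UNIQUE_DIRTY,
                 SHARED_CLEAN, SHARED_DIRTY } line_state_e;

  class overlap_scenario;
    int unsigned domain_masters[$];   // masters in the domain, set by the user

    rand int unsigned master_a, master_b;
    rand txn_e        txn_a, txn_b;
    rand line_state_e state_a, state_b;
    rand bit [63:0]   addr;

    // Step 1: two different masters from the shareability domain.
    constraint c_masters {
      master_a inside {domain_masters};
      master_b inside {domain_masters};
      master_a != master_b;
    }

    // Both masters target the same 64-byte cache line (line size assumed).
    constraint c_aligned { addr[5:0] == 0; }

    // Step 3: only mutually valid initial states for the two caches.
    constraint c_states {
      (state_a inside {UNIQUE_CLEAN, UNIQUE_DIRTY}) -> (state_b == INVALID);
      (state_b inside {UNIQUE_CLEAN, UNIQUE_DIRTY}) -> (state_a == INVALID);
      !(state_a == SHARED_DIRTY && state_b == SHARED_DIRTY);
    }
  endclass

endpackage

module overlap_scenario_demo;
  import overlap_scenario_pkg::*;

  initial begin
    overlap_scenario sc;
    sc = new();
    sc.domain_masters = {0, 2, 3};
    repeat (3) begin
      if (!sc.randomize()) $fatal(1, "randomization failed");
      $display("addr=%0h  M%0d:%s(%s)  M%0d:%s(%s)", sc.addr,
               sc.master_a, sc.txn_a.name(), sc.state_a.name(),
               sc.master_b, sc.txn_b.name(), sc.state_b.name());
    end
  end
endmodule
```

A real sequence would then initialize the two caches to the chosen states and start both transactions in parallel (step 4), for example from a fork/join block.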

Key Verification Points

All the verification points mentioned in the previous blog are applicable here as well. In addition to this, the following need to be checked:

  • Sequencing of transactions: the order of transactions must be the same as seen by all masters.
  • Ordering requirements relative to coherent responses and snoop accesses
  • Ensuring coherency when data needs to be re-fetched from memory

In this post I have described the testing strategy and the key aspects of testing relative to accesses to overlapping addresses. In the next post I will write on testing of Barrier and DVM transactions.

References

[1] AMBA AXI and ACE Protocol Specification (http://infocenter.arm.com/help/topic/com.arm.doc.ihi0022-/)

Category : AMBA | AXI3/AXI4/ACE | IP Verification
10
Feb

Saurabh Shrivastava,  Synopsys

In recent times, serial data transfer has become more widely used than parallel data transfer because of its improved performance and data integrity. One example of this is the migration from PCI/PCI-X to PCI Express. A serial interface between two devices results in fewer pins per device package. This not only reduces chip and board design cost but also reduces board design complexity. As serial links can be clocked considerably faster than parallel links, they are also highly scalable in terms of performance.

To accelerate the verification of PCI Express based sub-systems and to shorten PCI Express endpoint development time, PIPE (PHY Interface for the PCI Express Architecture) was defined by Intel and published for industry review in 2002. PIPE is a standard interface between the PHY sub-layer, which handles the lower levels of serial signaling, and the Media Access Layer (MAC), which handles addressing and access control mechanisms. The following diagram illustrates the role PIPE plays in partitioning the PHY layer for PCI Express.

 

Partitioning the PHY Layer

(Source: PHY Interface for the PCI Express Architecture specification, Version 2.00)

With this interface, developers can validate their designs without having to worry about the analog circuitry associated with the PHY interface. For MAC core verification, the PHY Bus Functional Model (BFM) is connected directly to the MAC. Without PIPE, the PHY and SerDes (serializer/deserializer) combination would be required along with the Root Complex BFM. Additionally, the user would have to ensure the correctness of the PHY and SerDes behavior on the serial interface as well.

Given the value of the PIPE interface, it is now widely used. In our recent experience, we have observed that the different power states of the PIPE interface can cause some confusion about their interpretation. This blog post and the next will shed some light on the different power states of this interface, and hopefully lead to a better understanding of them. The assumption here is that the reader has a high-level understanding of the PCIe LTSSM.

Power states of PIPE

The power management signals allow the PHY to minimize power consumption. Four power states, P0, P0s, P1, and P2, are defined for this interface. P0 is the normal operational state for the PHY. Once it transitions from P0 to a lower power state, the PHY can immediately take appropriate power saving measures.

All power states are represented by the PowerDown[2:0] signals (MAC output). The bit representation is as follows:

[2]   [1]   [0]   Description
 0     0     0    P0, normal operation
 0     0     1    P0s, low recovery time latency, power saving state
 0     1     0    P1, longer recovery time latency, lower power state
 0     1     1    P2, lowest power state

PIPE interface power state can be correlated with power state of LTSSM as mentioned in Base specification (Refer to PCI_Express_Base_r3.0_10Nov10).

  1. P0 corresponds to the LTSSM states in which data and ordered sets can be transferred
  2. P0s corresponds to L0s of the LTSSM
  3. P1 corresponds to the Disabled, all Detect, and L1.Idle states of the LTSSM
  4. P2 corresponds to L2 of the LTSSM

 

Power state transitions in PIPE

In states P0, P0s and P1, the PHY is required to keep PCLK operational. For all state transitions between these three states, the PHY indicates successful transition into the designated power state by a single-cycle assertion of PhyStatus.

There is a limited set of legal power state transitions that a MAC can cause the PHY to make. Referencing the main state diagram of the LTSSM in the base specification and the mapping of LTSSM states to PHY power states described in the preceding paragraphs, those legal transitions are:

  1. P0 to P0s
  2. P0 to P1
  3. P0 to P2
  4. P0s to P0
  5. P1 to P0
  6. P2 to P1
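The PowerDown encoding and the transition list given in this post can be captured in a few lines of SystemVerilog, which is handy as the core of a checker or assertion. This is only a sketch: the package, type and function names are hypothetical, and the legal transitions are exactly the six listed above.

```systemverilog
package pipe_power_pkg;

  // PowerDown[2:0] encodings as tabulated in this post.
  typedef enum logic [2:0] {
    P0  = 3'b000,
    P0S = 3'b001,
    P1  = 3'b010,
    P2  = 3'b011
  } powerdown_e;

  // Returns 1 if the MAC may request a change from cur to nxt.
  function automatic bit is_legal_transition(powerdown_e cur, powerdown_e nxt);
    case (cur)
      P0:      return (nxt inside {P0S, P1, P2});
      P0S:     return (nxt == P0);
      P1:      return (nxt == P0);
      P2:      return (nxt == P1);
      default: return 0;
    endcase
  endfunction

endpackage

module pipe_power_demo;
  import pipe_power_pkg::*;

  initial begin
    $display("P0  -> P2 legal: %0b", is_legal_transition(P0,  P2));
    $display("P0s -> P1 legal: %0b", is_legal_transition(P0S, P1));
    $display("P2  -> P1 legal: %0b", is_legal_transition(P2,  P1));
  end
endmodule
```

A monitor could call `is_legal_transition` whenever it samples a change on PowerDown and flag an error when the function returns 0.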

Now that we understand the valid power state transitions, I will cover the individual power states and all possible transitions in more detail in my next blog post. Stay tuned.


Category : IP Verification | PCI-Express
17
Dec

Satyapriya Acharya, CAE, Synopsys

In my previous blog (AMBA based Subsystems: What does it take to verify them?), I talked about some of the key verification challenges in verifying complex SOCs built on AMBA based subsystems. We saw that it is useful to have an extensible AMBA based verification environment that needs only minimal tweaks to be reused for new systems or derivatives.

To enable SOC verification engineers to create a highly configurable AMBA fabric, the system environment should provide placeholders for hooking the DUT up to any of the essential AMBA VIP components, such as AXI3/4/ACE, AHB or APB. An AMBA system environment can be configured to instantiate any number of AXI/AHB/APB VIPs with minimal additional code. Such an environment would need to encapsulate the following, amongst others:

  • An AXI (3/4/ACE) system environment
  • An AHB system environment
  • An APB system environment
  • A virtual sequencer
  • An array of AMBA System Monitors
  • A configuration descriptor for the AMBA system environment, which can be used to configure the underlying AXI/AHB/APB system environments

 The following figure shows a representation of such a verification environment:

Let’s see what features in UVM can come in useful for creating a robust environment for some of the important ‘system’ level capabilities:

  • Layered virtual sequencers to achieve synchronization between various components: A System sequencer which manages synchronization across the bus fabric can be modeled as a virtual sequencer with references to the virtual sequencers within AXI System Env, AHB System Env and APB System Env.

 

  • Leveraging analysis ports for system level checks, scoreboarding and response handling: Each Port Monitor in the AXI, AHB and APB Master and Slave Agents would ideally have an analysis port. At the end of a transaction, the Master and Slave Agents write the completed transaction object to the analysis port. These upstream and downstream ports can then be used by the system monitor to track transformations and responses across the fabric, as well as to perform routing checks.

  • Using callbacks to enable user extensions and to extract coverage and throughput measurements: Callbacks are an access mechanism that enables the insertion of user-defined code and allows access to objects for performance analysis and throughput measurement in the AMBA system environment.

  • A comprehensive sequence library to be run on the virtual sequencer in the System environment: UVM allows a logical collection of sequences to be registered to a sequence library, and this collection can execute on an associated sequencer. A system level sequencer can then coordinate the execution of these collections of sequences across different sequencers to create an interesting mix of scenarios, while targeting maximum coverage from a system level stimulus perspective.
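To make the analysis-port bullet above more concrete, here is a minimal sketch of a system monitor that receives transactions from an upstream and a downstream port and performs a simple data integrity check. It assumes a UVM library is available; the `bus_txn` item, the `simple_system_monitor` class and the in-order, one-to-one matching are all assumptions for illustration and do not describe the actual AMBA System Monitor.

```systemverilog
package sys_mon_sketch_pkg;
  import uvm_pkg::*;
  `include "uvm_macros.svh"

  // Hypothetical transaction type carried on the analysis ports.
  class bus_txn extends uvm_sequence_item;
    bit [63:0] addr;
    bit [63:0] data;
    `uvm_object_utils(bus_txn)
    function new(string name = "bus_txn");
      super.new(name);
    endfunction
  endclass

  // One write_<suffix>() callback per analysis imp.
  `uvm_analysis_imp_decl(_upstream)
  `uvm_analysis_imp_decl(_downstream)

  class simple_system_monitor extends uvm_component;
    `uvm_component_utils(simple_system_monitor)

    uvm_analysis_imp_upstream   #(bus_txn, simple_system_monitor) upstream_export;
    uvm_analysis_imp_downstream #(bus_txn, simple_system_monitor) downstream_export;

    bus_txn expected[$];  // upstream transactions awaiting a downstream match

    function new(string name, uvm_component parent);
      super.new(name, parent);
      upstream_export   = new("upstream_export", this);
      downstream_export = new("downstream_export", this);
    endfunction

    // Called when a master-side port monitor publishes a transaction.
    function void write_upstream(bus_txn t);
      expected.push_back(t);
    endfunction

    // Called when a slave-side port monitor publishes a transaction.
    // Assumes in-order, one-to-one routing with no address translation.
    function void write_downstream(bus_txn t);
      bus_txn want;
      if (expected.size() == 0) begin
        `uvm_error("SYSMON", "downstream transaction with no upstream source")
        return;
      end
      want = expected.pop_front();
      if (want.addr != t.addr || want.data != t.data) begin
        `uvm_error("SYSMON", "data integrity mismatch across the fabric")
      end
    endfunction
  endclass

endpackage
```

In an environment's connect_phase, each port monitor's analysis port would be connected to `upstream_export` or `downstream_export`. A real fabric also reorders, splits and remaps transactions, which this sketch does not attempt to handle.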

 

From a verification perspective, system level checks are key. As mentioned earlier, they can include:

  •  Data Integrity checks across AXI, AHB and APB ports
  •  Transaction routing checks across AXI, AHB and APB ports

In my next blog, I will talk about this aspect in more detail. I will walk you through the capabilities that you need in your System monitor to easily perform the checks mentioned above.

Category : AHB/APB | AMBA | AXI3/AXI4/ACE | IP Verification
28
Nov

In my previous blog post, I talked about guidelines to create reusable sequences. Continuing on this thread, here I am going to talk about virtual sequences and the virtual sequencer. Common questions I hear from users include: why do we need a virtual sequence? How can we use it effectively?

Most UVM testbenches are composed of reusable verification components, unless we are working on block-level verification of a simple protocol like MIPI-CSI. Consider the scenario of verifying a simple protocol. In this case, we can live with just one sequencer sending the stimulus to the driver. The top-level test will use this sequencer to process the sequences (as described in the previous blog post). Here we may not need a virtual sequence (or a virtual sequencer).
But when we integrate this IP into our SOC (or top-level block), we surely want to reuse the testbench components that were used to verify these blocks. Let us consider a simple case where we are integrating two such blocks, with two sequencers driving them. From the top-level test, we need a way to control these two sequencers.

This can be achieved by using a virtual sequencer and virtual sequences. The other way of doing it is to call each sequence’s start method explicitly from the top-level test, passing the sequencer to the start method.

I am going to explain this usage with an example in which a USB host is integrated in an AXI environment. Let’s see how we can control the USB sequencer and the AXI sequencer from the top-level test. For this particular test, I want to configure the AXI registers and then send USB transfers. For configuring the AXI registers I am using a sequence, say axi_cfg_reg_sequence, and for sending USB transfers I am using the sequence (usb_complex_sequence) which I used in the previous blog post. Below is an example where multiple sequencers are controlled without using a virtual sequence.

//Top-level test where multiple sequencers are controlled from the
//phase method.
class axi_cfg_usb_bulk_test extends uvm_test;
  `uvm_component_utils(axi_cfg_usb_bulk_test)

  //Sequences which need to be exercised
  usb_reset_sequence    u_reset_seq;
  axi_reset_sequence    a_reset_seq;
  usb_complex_sequence   u_bulk_seq;
  axi_cfg_reg_sequence   a_cfg_reg_seq;

  function new (string name="axi_cfg_usb_bulk_test",
                uvm_component parent=null);
    …
  endfunction: new

  //Call the reset sequences in the reset_phase
  virtual task reset_phase (uvm_phase phase);
    phase.raise_objection(this);
    …
    //Executing sequences by calling the start method directly by passing the
    //corresponding sequencer
    a_reset_seq.start(env.axi_master_agent_obj.sequencer);
    u_reset_seq.start(env.usb_host_agent_obj.sequencer);
    …
    phase.drop_objection(this);
  endtask:reset_phase

  virtual task main_phase (uvm_phase phase);
    phase.raise_objection(this);
    …
    //Executing sequences by calling the start method directly by passing the
    //corresponding sequencer
    a_cfg_reg_seq.start(env.axi_master_agent_obj.sequencer);
    u_bulk_seq.start(env.usb_host_agent_obj.sequencer);
    …
    phase.drop_objection(this);
  endtask:main_phase
endclass: axi_cfg_usb_bulk_test

This is not an efficient way of controlling the sequencers, because we are using the simple sequences directly inside the test, which makes the test complex. Done this way, these complex scenarios cannot be reused to build even more complex scenarios. If we instead capture the scenario in a sequence and use that sequence in the test, we can reuse it in other tests (or sequences) as well. It is also easier to maintain and debug such sequences than to create the entire scenario in the top-level test.

Having understood why we need virtual sequence and virtual sequencer, let’s see how this can be achieved by taking the same example shown above.

The first thing we need to do is create a virtual sequencer. Note that virtual sequences can only be associated with a virtual sequencer (not with a non-virtual sequencer). A virtual sequencer is derived from uvm_sequencer like any non-virtual sequencer, but it is not attached to any driver. Instead, it holds references to the sequencers we are trying to control. These references are assigned to the non-virtual sequencers in the top-level environment.

//Virtual sequencer having references to non-virtual sequencers
class system_virtual_sequencer extends uvm_sequencer;
  //References to non-virtual sequencer
  usb_sequencer usb_seqr;
  axi_sequencer axi_seqr;

  function new (string name="system_virtual_sequencer",
                uvm_component parent=null);
    …
  endfunction: new

  `uvm_component_utils(system_virtual_sequencer)

endclass: system_virtual_sequencer

//Top level environment, where virtual sequencer’s references
//are connected to non-virtual sequencers
class system_env extends uvm_env;
  //Agents where the non-virtual sequencers are present
  usb_host_agent  usb_host_agent_obj;
  axi_master_agent  axi_master_agent_obj;
  //Virtual sequencer
  system_virtual_sequencer sys_vir_seqr;

  `uvm_component_utils(system_env)

  function new (string name="system_env", uvm_component parent=null);
    …
  endfunction: new

  function void connect_phase(uvm_phase phase);
    //Assigning the virtual sequencer’s references to non-virtual sequencers
    sys_vir_seqr.usb_seqr = usb_host_agent_obj.sequencer;
    sys_vir_seqr.axi_seqr = axi_master_agent_obj.sequencer;
  endfunction: connect_phase

endclass: system_env

Now that we have a virtual sequencer with references to the non-virtual sequencers we want to control, let’s see how we can control these non-virtual sequencers using virtual sequences.

A virtual sequence is the same as any other sequence, except that it is associated with a virtual sequencer. It therefore needs to indicate which non-virtual sequencer should execute each underlying sequence. Also note that a virtual sequence can only execute sequences or other virtual sequences, not items. Use `uvm_do_on/`uvm_do_on_with to execute non-virtual sequences and `uvm_do/`uvm_do_with to execute other virtual sequences.

//virtual sequence for reset operation
class axi_usb_reset_virtual_sequence extends uvm_sequence;

  `uvm_object_utils(axi_usb_reset_virtual_sequence)
  //Declares p_sequencer as a handle to the virtual sequencer
  `uvm_declare_p_sequencer(system_virtual_sequencer)

  //non-virtual reset sequences
  usb_reset_sequence    u_reset_seq;
  axi_reset_sequence    a_reset_seq;

  function new (string name="axi_usb_reset_virtual_sequence");
    …
  endfunction: new

  …
  …

  task body();
    …
    //executing non-virtual sequences on the corresponding
    //non-virtual sequencers using `uvm_do_on; each call returns
    //only after the sub-sequence has completed
    `uvm_do_on(a_reset_seq, p_sequencer.axi_seqr)
    `uvm_do_on(u_reset_seq, p_sequencer.usb_seqr)
  endtask: body

endclass: axi_usb_reset_virtual_sequence

//virtual sequence for doing axi register configuration
//followed by USB transfer
class axi_cfg_usb_bulk_virtual_sequence extends uvm_sequence;

  `uvm_object_utils(axi_cfg_usb_bulk_virtual_sequence)
  `uvm_declare_p_sequencer(system_virtual_sequencer)

  //Re-using the non-virtual sequences
  usb_complex_sequence   u_bulk_seq;
  axi_cfg_reg_sequence   a_cfg_reg_seq;

  function new (string name="axi_cfg_usb_bulk_virtual_sequence");
    …
  endfunction: new

  task body();
    …
    //executing non-virtual sequences on the corresponding
    //non-virtual sequencers using `uvm_do_on; each call returns
    //only after the sub-sequence has completed
    `uvm_do_on(a_cfg_reg_seq, p_sequencer.axi_seqr)
    `uvm_do_on(u_bulk_seq, p_sequencer.usb_seqr)
  endtask: body

endclass: axi_cfg_usb_bulk_virtual_sequence

In the above virtual sequence, we execute axi_cfg_reg_sequence and then usb_complex_sequence. Now that the virtual sequence and virtual sequencer are ready, let’s see how we can execute this virtual sequence from the top-level test.

//Top-level test where virtual sequence is set to virtual sequencer
class axi_cfg_usb_bulk_test extends uvm_test;
  …
  virtual function void build_phase(uvm_phase phase );
    …

    //Configuring variables in underlying sequences
    uvm_config_db#(int unsigned)::set(this,
      "env.sys_vir_seqr.axi_cfg_usb_bulk_virtual_sequence.u_bulk_sequence",
      "sequence_length",10);

    //Executing the virtual sequences in virtual sequencer’s
    //appropriate phase.
    //Executing reset virtual sequence in reset_phase
    uvm_config_db#(uvm_object_wrapper)::set(this,
             "env.sys_vir_seqr.reset_phase", "default_sequence",
             axi_usb_reset_virtual_sequence::type_id::get());

    //Executing the main virtual sequence in main_phase
    uvm_config_db#(uvm_object_wrapper)::set(this,
                     "env.sys_vir_seqr.main_phase", "default_sequence",
                     axi_cfg_usb_bulk_virtual_sequence::type_id::get());
    …
  endfunction : build_phase
endclass

So far we have seen why and how to use virtual sequences. There are also a few things to keep in mind while using virtual sequences and virtual sequencers, which can save a lot of debugging time.

1. While configuring variables in sequences that are executed through virtual sequences, we have to use the path through the virtual sequence. In the above example, using the non-virtual sequencer path to set the variables in the lower-level sequence will not work.

uvm_config_db#(int unsigned)::set(this,"env.usb_host_agent_obj.sequencer.u_bulk_sequence","sequence_length",10);

Even though u_bulk_sequence is running on usb_host_agent_obj.sequencer, this will not work, because the sequence is created by the virtual sequence, and hence the hierarchical path must go through the virtual sequence rather than the non-virtual sequencer. So the right way of setting variables is to use the virtual sequence path.

uvm_config_db#(int unsigned)::set(this,"env.sys_vir_seqr.axi_cfg_usb_bulk_virtual_sequence.u_bulk_sequence","sequence_length",10);

This is also true for factory overrides. For example, the factory override below will not work, for the same reason.

set_inst_override_by_type("env.usb_host_agent_obj.*",usb_transfer_item::get_type(), cust_usb_transfer_item::get_type());

In the above example we are trying to replace the underlying sequence item with a new derived type from the top-level test. To do this we need to use the virtual sequencer path.

set_inst_override_by_type("env.sys_vir_seqr.*",usb_transfer_item::get_type(), cust_usb_transfer_item::get_type());

Rule of thumb is:
• If the sequence is created by a virtual sequence directly or indirectly, then any hierarchical path in factory overrides or in configurations should use virtual sequencer’s hierarchical path.
• If the sequence is created by a non-virtual sequence, then any hierarchical path in factory overrides or configurations should use non-virtual sequencer’s hierarchical path.

2. Even though we have a virtual sequencer to control multiple sequencers, some tests may need just a single sequencer (for example, the USB sequencer alone). In such cases, we have to use the non-virtual sequencer’s hierarchical path directly (not the virtual sequencer’s reference path) for configuring variables or factory overrides. Using the virtual sequencer’s reference path will not work, because it is not the actual hierarchy of the non-virtual sequencer.

uvm_config_db#(uvm_object_wrapper)::set(this, "env.sys_vir_seqr.usb_seqr.main_phase", "default_sequence", usb_complex_sequence::type_id::get());

The above configuration will not work, because the non-virtual sequencer (usb_seqr/usb_host_agent_obj.sequencer) is actually created in the agent. Its parent is therefore the agent and not the virtual sequencer, even though the virtual sequencer holds a reference to it. Hence we should not use the virtual sequencer path when setting variables in the actual sequencer. Instead, we have to use the hierarchical path through the agent (the actual parent of the sequencer).

uvm_config_db#(uvm_object_wrapper)::set(this, "env.usb_host_agent_obj.sequencer.main_phase", "default_sequence", usb_complex_sequence::type_id::get());

3. Whenever we use a virtual sequencer and want to control the non-virtual sequencers from it, make sure to set the default_sequence of all the actual sequencers to null.

uvm_config_db#(uvm_object_wrapper)::set(this, "env.usb_host_agent_obj.sequencer.main_phase", "default_sequence", null);
uvm_config_db#(uvm_object_wrapper)::set(this, "env.axi_master_agent_obj.sequencer.main_phase", "default_sequence", null);

This is important because, if any default_sequence is set, the non-virtual sequencer will run both the default_sequence and the sequence started from the virtual sequence. To control the non-virtual sequencers solely from the virtual sequencer, we need to set their default_sequence to null.

I hope you find this post useful for understanding virtual sequences, and that the guidelines outlined here save you debugging time. I am sure there are other guidelines for using virtual sequences that we only learn the hard way, while debugging complex environments; please share any such guidelines with me.

Category : VIP Design
6
Nov

In this blog, I will be sharing the necessary steps one has to take while writing a sequence to make sure it can be reusable. Having developed sequences and tests in UVM, while using Verification IPs, I think writing sequences is the most challenging part in verifying any IP.  Careful planning is required to write sequences without which we end up writing one sequence for every scenario from scratch. This makes sequences hard to maintain and debug.

As we know, sequences are made up of several data items, which together form an interesting scenario. Sequences can be hierarchical, thereby creating more complex scenarios. In its simplest form, a sequence is derived from the uvm_sequence base class, specifying the request and response item types as parameters, and implements the body task with the specific scenario you want to execute.

class usb_simple_sequence extends uvm_sequence #(usb_transfer);

    rand int unsigned sequence_length;
    constraint reasonable_seq_len { sequence_length < 10; }

    //Constructor
    function new(string name="usb_simple_sequence");
        super.new(name);
    endfunction

    //Register with factory
    `uvm_object_utils(usb_simple_sequence)

    //the body() task is the actual logic of the sequence
    virtual task body();
        repeat(sequence_length)
            `uvm_do_with(req, {
                //Setting the device_id to 2
                req.device_id == 8'd2;
                //Setting transfer type to BULK
                req.type == usb_transfer::BULK_TRANSFER;
            })
    endtask : body
endclass

The above sequence sends USB bulk transfers to the device whose ID is 2. Test writers can invoke it by assigning this sequence as the default sequence of the sequencer in the top-level test.

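class usb_simple_bulk_test extends uvm_test;

    `uvm_component_utils(usb_simple_bulk_test)

    function new(string name="usb_simple_bulk_test", uvm_component parent=null);
        super.new(name, parent);
    endfunction

    virtual function void   build_phase(uvm_phase phase );
        super.build_phase(phase);
        uvm_config_db#(uvm_object_wrapper)::set(this, "sequencer_obj.main_phase",
            "default_sequence", usb_simple_sequence::type_id::get());
    endfunction : build_phase
endclass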

So far, things look simple and straightforward. To make sure the sequence is reusable for more complex scenarios, we have to follow a few more guidelines.

  • First off, it is important to manage the end of test by raising and dropping objections in the pre_start and post_start tasks of the sequence class. This way we raise and drop objections only in the topmost sequence instead of doing it for all the sub-sequences.
task pre_start();

    if(starting_phase != null)
        starting_phase.raise_objection(this);
endtask : pre_start


task post_start();

    if(starting_phase != null)
        starting_phase.drop_objection(this);
endtask : post_start

Note that starting_phase is defined only for the sequence which is started as the default sequence for a particular phase. If you have started it explicitly by calling the sequence’s start method then it is the user’s responsibility to set the starting_phase.

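class usb_simple_bulk_test extends uvm_test;

    usb_simple_sequence seq;
    //Hypothetical handle to the target sequencer; assumed to be
    //assigned from the environment before main_phase runs
    uvm_sequencer #(usb_transfer) usb_sqr;

    //main_phase must be a task because seq.start() consumes time
    virtual task main_phase(uvm_phase   phase );
        seq = usb_simple_sequence::type_id::create("seq");
        if(!seq.randomize())
            `uvm_error(get_type_name(), "seq randomization failed")
        //User needs to set the starting_phase as the sequence start method
        //is explicitly called to invoke the sequence
        seq.starting_phase = phase;
        seq.start(usb_sqr);
    endtask : main_phase

endclass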

  • Use UVM configurations to get values from the top-level test. In the above example, test writers have no control over the sequence, because it does not use configurations to take values from the top-level test or sequence (which will use this sequence to build a complex scenario). The sequence can be modified as follows to give more control to the top-level test or sequence that uses it:
class usb_simple_sequence extends uvm_sequence #(usb_transfer);

    rand int unsigned sequence_length;
    constraint reasonable_seq_len {   sequence_length < 10; }
    virtual task body();
        usb_transfer::type_enum local_type;
        bit[7:0] local_device_id;
        //Get the values for the variables in case toplevel
        //test/sequence sets it. The return status is ignored here.
        void'(uvm_config_db#(int   unsigned)::get(null, get_full_name(),
            "sequence_length", sequence_length));
        void'(uvm_config_db#(usb_transfer::type_enum)::get(null,
            get_full_name(), "local_type", local_type));
        void'(uvm_config_db#(bit[7:0])::get(null, get_full_name(),
            "local_device_id", local_device_id));


        repeat(sequence_length)
        `uvm_do_with(req,   {
            req.device_id   == local_device_id;
            req.type   == local_type;
        })
    endtask : body

endclass

With the above modifications we have given control to the top-level test or sequence to modify the device_id, sequence_length and type. A few things to note here: the parameter type and the field-name string (third argument) used in uvm_config_db#()::set must match those used in uvm_config_db#()::get. Make sure to ‘set’ and ‘get’ with exactly the same datatype. Otherwise the value will not be picked up, and debugging will become a nightmare.
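The datatype pitfall can be reproduced with a small standalone sketch. The module name and the instance path "top.seq" are invented for illustration; the point is only that `int` and `int unsigned` select different uvm_config_db specializations, so a value stored under one is not expected to be visible to the other.

```systemverilog
import uvm_pkg::*;

// Illustrative only: the instance path "top.seq" is a made-up name.
module config_db_type_match_demo;
  initial begin
    int unsigned len;

    // Set with 'int', get with 'int unsigned': the types differ, so the
    // lookup is expected to fail even though the field name matches.
    uvm_config_db#(int)::set(null, "*", "sequence_length", 5);
    if (!uvm_config_db#(int unsigned)::get(null, "top.seq", "sequence_length", len))
      $display("int vs int unsigned: value not found");

    // Set and get with the same type: the lookup is expected to succeed.
    uvm_config_db#(int unsigned)::set(null, "*", "sequence_length", 5);
    if (uvm_config_db#(int unsigned)::get(null, "top.seq", "sequence_length", len))
      $display("int unsigned on both sides: sequence_length = %0d", len);
  end
endmodule
```

Checking the return value of uvm_config_db::get, as done here, is a cheap way to catch this kind of mismatch early.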

One problem with the above sequence is that any constraints on device_id or type in the usb_transfer class still apply, so the top-level test or sequence is restricted to values that satisfy them.

For example, if the usb_transfer class constrains device_id to be below 10, then the top-level test or sequence must set it within this range. If the top-level test or sequence sets it to a value like 15 (which is over 10), you will see a constraint failure at runtime.
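The conflict can be seen in isolation with a stripped-down stand-in for the data item. The class `usb_transfer_demo` below is hypothetical and only mirrors the device_id constraint described above; it is not the real usb_transfer class.

```systemverilog
// Hypothetical, simplified stand-in for the usb_transfer item.
class usb_transfer_demo;
  rand bit [7:0] device_id;
  constraint valid_device_id { device_id < 10; }
endclass

module constraint_conflict_demo;
  initial begin
    usb_transfer_demo item;
    item = new();

    // In-range value: the inline constraint agrees with the class constraint.
    if (item.randomize() with { device_id == 8'd2; })
      $display("device_id == 2 solved, device_id = %0d", item.device_id);

    // Out-of-range value: the inline constraint is added on top of
    // valid_device_id, the solver cannot satisfy both, and randomize()
    // returns 0.
    if (!item.randomize() with { device_id == 8'd15; })
      $display("device_id == 15 conflicts with the class constraint");
  end
endmodule
```

`uvm_do_with behaves the same way, since it passes its constraint block to an inline randomize() call on the item.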

Sometimes the top-level test or sequence may need to take full control and may not want to enable the constraints defined inside the lower-level sequences or data items. One example where this is required is negative testing: the host wants to make sure devices do not respond to a transfer with a device_id greater than 10, and so wants to send a transfer with device_id 15. To give full control to the top-level test or sequence, we can modify the body task as shown below:

virtual task body();

    usb_transfer::type_enum local_type;
    bit[7:0] local_device_id;
    int status_seq_len = 0;
    int status_type = 0;
    int status_device_id = 0;


    status_seq_len = uvm_config_db#(int unsigned)::get(null,
        get_full_name(), "sequence_length", sequence_length);
    status_type = uvm_config_db#(usb_transfer::type_enum)::get(null,
        get_full_name(), "local_type", local_type);
    status_device_id = uvm_config_db#(bit[7:0])::get(null,
        get_full_name(), "local_device_id", local_device_id);

    //Always create and randomize req first, so that it is never null
    //when it is sent.
    `uvm_create(req)
    if(!req.randomize())
        `uvm_error(get_type_name(), "req randomization failed")

    //If status of uvm_config_db::get is true then use the values
    //set by toplevel test or sequence instead of the random value.
    if(status_type)
    begin
        req.type   = local_type;
    end
    if(status_device_id)
    begin
        req.device_id   = local_device_id;
    end
    repeat(sequence_length)
        `uvm_send(req)

endtask : body

It is always good to be cautious while using `uvm_do_with as it will add the constraints on top of any existing constraints in a lower level sequence or sequence item.

Also note that if you have many variables to ‘set’ and ‘get’, I recommend creating an object, setting the values in that object, and then passing the object through uvm_config_db from the top-level test/sequence (instead of setting each variable explicitly). This improves runtime performance, because a single uvm_config_db::get fetches all the variables in one shot instead of one lookup per variable.

virtual task body();

    usb_simple_sequence local_obj;
    int   status = 0;
    status = uvm_config_db#(usb_simple_sequence)::get(null,
        get_full_name(), "local_obj", local_obj);

    //If status of uvm_config_db::get is true then try to use
    //the values set in the object we received.
    if(status)
    begin
        `uvm_create(req)
        this.sequence_length   = local_obj.sequence_length;
        //Copy the entire req object inside the object which we
        //received from uvm_config_db to the local req.
        req.copy(local_obj.req);
    end
    else
    begin
        //If we did not get the object from top level sequence/test
        //then create one and randomize it.
        `uvm_create(req)
        if(!req.randomize())
            `uvm_error(get_type_name(), "req randomization failed")
    end
    repeat(sequence_length)
        `uvm_send(req)

endtask : body

  • Always try to reuse the simple sequences by creating a top-level sequence for complex scenarios. For example, in the sequence below I send a bulk transfer followed by an interrupt transfer to two different devices. For this scenario I will be using our usb_simple_sequence as shown below:
class usb_complex_sequence extends uvm_sequence #(usb_transfer);

    `uvm_object_utils(usb_complex_sequence)

    //Object of simple sequence used for sending bulk transfer
    usb_simple_sequence simp_seq_bulk;
    //Object of simple sequence used for sending interrupt transfer
    usb_simple_sequence simp_seq_int;

    function new(string name="usb_complex_sequence");
        super.new(name);
    endfunction

    virtual task body();
        //Variable for getting device_id for bulk transfer
        bit[7:0] local_device_id_bulk;
        //Variable for getting device_id for interrupt transfer
        bit[7:0] local_device_id_int;
        //Variable for getting sequence length for bulk
        int unsigned local_seq_len_bulk;
        //Variable for getting sequence length for interrupt
        int unsigned local_seq_len_int;

        //Get the values for the variables in case top level
        //test/sequence sets it.
        void'(uvm_config_db#(int unsigned)::get(null, get_full_name(),
            "local_seq_len_bulk", local_seq_len_bulk));
        void'(uvm_config_db#(int unsigned)::get(null, get_full_name(),
            "local_seq_len_int", local_seq_len_int));
        void'(uvm_config_db#(bit[7:0])::get(null, get_full_name(),
            "local_device_id_bulk", local_device_id_bulk));
        void'(uvm_config_db#(bit[7:0])::get(null, get_full_name(),
            "local_device_id_int", local_device_id_int));

        //Set the values for the variables to the lowerlevel
        //sequence/sequence item, which we got from
        //above uvm_config_db::get.
        //Setting the values for bulk sequence
        uvm_config_db#(int unsigned)::set(null, {get_full_name(), ".",
            "simp_seq_bulk"}, "sequence_length", local_seq_len_bulk);
        uvm_config_db#(usb_transfer::type_enum)::set(null, {get_full_name(),
            ".", "simp_seq_bulk"}, "local_type", usb_transfer::BULK_TRANSFER);
        uvm_config_db#(bit[7:0])::set(null, {get_full_name(), ".",
            "simp_seq_bulk"}, "local_device_id", local_device_id_bulk);

        //Setting the values for interrupt sequence
        uvm_config_db#(int unsigned)::set(null, {get_full_name(), ".",
            "simp_seq_int"}, "sequence_length", local_seq_len_int);
        uvm_config_db#(usb_transfer::type_enum)::set(null, {get_full_name(),
            ".", "simp_seq_int"}, "local_type", usb_transfer::INT_TRANSFER);
        uvm_config_db#(bit[7:0])::set(null, {get_full_name(), ".",
            "simp_seq_int"}, "local_device_id", local_device_id_int);

        //`uvm_do creates, randomizes and starts each sub-sequence and
        //returns once its body() has completed
        `uvm_do(simp_seq_bulk)
        `uvm_do(simp_seq_int)
    endtask : body

endclass

Note that in the above sequence, we get the values from the top-level test or sequence using uvm_config_db::get, and then pass them to the lower-level sequence using uvm_config_db::set. This is important: if we instead used `uvm_do_with and passed the values inside the constraint block, they would be applied as additional constraints rather than as plain assignments.

I came across these guidelines and learned them, at times the hard way. So I am sharing them here. I sure hope these will come in handy when you use sequences that come pre-packed with VIPs to build more complex scenarios, and also when you wish to write your own sequences from scratch. If you come across more such guidelines or rules of thumb for writing re-usable, maintainable and debuggable sequences, please share them with me.

Category : IP Verification
5
Nov

Over the past two years, several design and verification teams have begun using SystemVerilog testbench with UVM. They are moving to SystemVerilog because coverage, assertions and object-oriented programming concepts like inheritance and polymorphism allow them to reuse code much more efficiently.  This helps them in not only finding the bugs they expect, but also corner-case issues. Building testing frameworks that randomly exercise the stimulus state space of a design-under-test and analyzing completion through coverage metrics seems to be the most effective way to validate a large chip. UVM offers a standard method for abstraction, automation, encapsulation, and coding practice, allowing teams to build effective, reusable testbenches quickly that can be leveraged throughout their organizations.

However, for all of its value, UVM deployment has unique challenges, particularly in the realm of debugging. Some of these challenges are:

  • Phase management: objections and synchronization
  • Thread debugging
  • Tracing issues through automatically generated code, macro expansion, and parameterized classes
  • Default error messages that are verbose but often imprecise
  • Extended classes with methods that have implicit (and maybe unexpected) behavior
  • Object IDs that are distinct from object handles
  • Visualization of dynamic types and ephemeral classes

Debugging even simple issues can be an arduous task without UVM-aware tools. Here is a public webinar that reviews how to utilize VCS and DVE to most effectively deploy, debug and optimize UVM testbenches.

Link at http://www.synopsys.com/tools/verification/functionalverification/pages/Webinars.aspx

Category : IP Verification
28
Oct

In the last post of the series (http://www.vip-central.org/2012/08/a-strategy-to-verify-an-axi-ace-compliant-interconnect%E2%80%93-part-1-of-4/) I focused on the first level of testing required for verifying an AXI ACE Compliant Interconnect: Integration/Connectivity testing. In this post I will focus on basic ‘coherent transaction’ testing. I use the term ‘basic’ to signify something that is a prerequisite before we move on to more advanced testing. ‘Coherent transactions’ are a set of transactions used in the AXI ACE protocol to perform load and store operations. Each of these transactions has a different set of response requirements from the Interconnect. Further, each of these transactions can be used in multiple configurations. We need to verify that the Interconnect works correctly for each of these transaction types. I will first give an overview of the protocol before moving on to a testing strategy for these.

Overview of ACE Protocol

The ACE protocol provides a framework for system level coherency. It enables correctness to be maintained when sharing data across caches. It also enables maximum reuse of cached data. The protocol is designed to support different coherency protocols such as MESI, ESI, MEI and MOESI (where M stands for Modified, O for Owned, E for Exclusive, S for Shared and I for Invalid). The ACE protocol is realized using:

  • A five state cache model to define the state of any cache line in the coherent system as shown in the diagram below:

 

The defined states are:

  - Valid, Invalid: When invalid, the cacheline does not exist in the cache. When valid, the cacheline is present in the cache.

  - Unique, Shared: When unique, the cacheline exists only in one cache. When shared, the cacheline might exist in more than one cache.

  - Clean, Dirty: When clean, the cache does not have responsibility to update main memory. When dirty, the cacheline has been modified with respect to main memory and this cache must ensure that main memory is eventually updated.

  • Additional signaling on existing AXI4 channels that enables new transactions and information to be transmitted
  • Additional channels, known as snoop channels, that enable an Interconnect to access information that is stored in the caches of the masters connected to it
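The five-state model can be captured as a small enumeration, which is handy when writing checkers and sequences. This is only a sketch: the package, type and function names are my own, and the five states are the combinations of the three attributes listed above (Invalid, UniqueClean, UniqueDirty, SharedClean, SharedDirty).

```systemverilog
// Hypothetical helper package; all names are illustrative.
package ace_cache_state_pkg;

  // The five ACE cacheline states.
  typedef enum {
    INVALID,
    UNIQUE_CLEAN,
    UNIQUE_DIRTY,
    SHARED_CLEAN,
    SHARED_DIRTY
  } ace_state_e;

  // Valid: the cacheline is present in the cache.
  function automatic bit is_valid(ace_state_e s);
    return s != INVALID;
  endfunction

  // Unique: the cacheline exists in this cache only.
  function automatic bit is_unique(ace_state_e s);
    return s inside {UNIQUE_CLEAN, UNIQUE_DIRTY};
  endfunction

  // Dirty: this cache is responsible for updating main memory.
  function automatic bit is_dirty(ace_state_e s);
    return s inside {UNIQUE_DIRTY, SHARED_DIRTY};
  endfunction

endpackage
```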

I will try to shed more light on the ACE protocol with an example of a ‘load operation’ and a ‘store operation’ from a shareable location.

Performing a Load Operation

Consider the system given below with two masters connected to an Interconnect. Both the masters have a cache. The Interconnect is also connected to the main memory. Consider the scenario where Master 1 needs to read the value stored in a variable ‘u’. Also assume that this value is already stored in the cache of Master 2. The following sequence is used to retrieve the value of ‘u’:

  • Master 1 issues a read transaction on the read address channel (1)
  • The Interconnect issues a snoop transaction on the snoop address channel of Master 2 (2)
  • Master 2 returns the snoop response and data information (3a)
  • If Master 2 did not return data, the Interconnect reads it from main memory (3b). Note that it is permissible for the Interconnect to read from main memory even before receiving a response to the snoop transaction
  • Once the data is received, it is returned to Master 1 through its read data channel (4).

A ReadClean, ReadNotSharedDirty or ReadShared transaction is used for a load operation from a shareable location. A ReadClean transaction is used when an initiating master does not want to accept responsibility to update memory. A ReadNotSharedDirty transaction is used when a master wants to load data and can accept the cacheline in any state except the SharedDirty state. A ReadShared transaction is used when a master wants to load data and can accept the cacheline in any state. If no cached copy is required, a ReadOnce transaction is used. A ReadNoSnoop is used to read from a nonshareable location.
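The choice of read transaction described above can be written as a simple decision function, which a master-side sequence could use to pick a legal transaction type. This is a sketch only; the package, enum and argument names are assumptions of mine, not part of any VIP API.

```systemverilog
// Hypothetical helper package; all names are illustrative.
package ace_read_select_pkg;

  typedef enum {
    READ_NO_SNOOP,
    READ_ONCE,
    READ_CLEAN,
    READ_NOT_SHARED_DIRTY,
    READ_SHARED
  } ace_read_e;

  // shareable           : the location is in a shareable region
  // keep_copy           : the master wants to keep a cached copy
  // accept_dirty        : the master can take responsibility to update memory
  // accept_shared_dirty : the master can hold the line in SharedDirty
  function automatic ace_read_e pick_read(bit shareable,
                                          bit keep_copy,
                                          bit accept_dirty,
                                          bit accept_shared_dirty);
    if (!shareable)           return READ_NO_SNOOP;
    if (!keep_copy)           return READ_ONCE;
    if (!accept_dirty)        return READ_CLEAN;
    if (!accept_shared_dirty) return READ_NOT_SHARED_DIRTY;
    return READ_SHARED;
  endfunction

endpackage
```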

 

Performing a Store Operation

In the above system, consider that Master 1 wants to write a new value to the variable ‘u’. The following sequence is used to store the new value into Master 1’s cache:

  • Master 1 issues a transaction indicating that it would like a unique copy of the cacheline storing ‘u’. This is done by sending a MakeUnique transaction (1)
  • The Interconnect sends a snoop transaction to Master 2 to invalidate its cacheline. This is done by sending a MakeInvalid transaction (2).
  • Once the invalidation is complete, Master 2 responds on its snoop response channel (3).
  • The Interconnect now responds to Master 1, indicating that all other masters have invalidated the cacheline storing the value of variable ‘u’ (4).
  • Master 1 now writes the new value of ‘u’ in its cache. At this point the cacheline is in a unique state for Master 1 and does not exist in Master 2.

Depending on whether a full cacheline store or a partial cacheline store is required and whether the master already has a copy of the cacheline, a MakeUnique, CleanUnique or ReadUnique transaction is used for a store operation. If the master that is storing does not have a cache, but would like to write into a shareable memory location, a WriteUnique or WriteLineUnique transaction  is used. A WriteNoSnoop transaction is used to write into a nonshareable location.
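A similar sketch can be written for the store side. The mapping below is my reading of the ACE specification rather than something spelled out in this post: MakeUnique for a full cacheline store, CleanUnique for a partial store when the master already holds a copy, ReadUnique for a partial store when it does not, and WriteLineUnique or WriteUnique for a full or partial write from a master without a cache. All names are illustrative.

```systemverilog
// Hypothetical helper package; all names are illustrative and the
// mapping should be checked against the ACE specification.
package ace_store_select_pkg;

  typedef enum {
    WRITE_NO_SNOOP,
    WRITE_UNIQUE,
    WRITE_LINE_UNIQUE,
    MAKE_UNIQUE,
    CLEAN_UNIQUE,
    READ_UNIQUE
  } ace_store_e;

  // shareable : the location is in a shareable region
  // has_cache : the storing master has a cache
  // full_line : the store covers the whole cacheline
  // has_copy  : the master already holds a copy of the cacheline
  function automatic ace_store_e pick_store(bit shareable,
                                            bit has_cache,
                                            bit full_line,
                                            bit has_copy);
    if (!shareable) return WRITE_NO_SNOOP;
    if (!has_cache) return full_line ? WRITE_LINE_UNIQUE : WRITE_UNIQUE;
    if (full_line)  return MAKE_UNIQUE;
    if (has_copy)   return CLEAN_UNIQUE;
    return READ_UNIQUE;
  endfunction

endpackage
```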

Other transactions used in ACE

  • Memory Update Transactions which are used to write a dirty line into memory. A WriteBack or WriteClean is used for this
  • The Evict transaction is issued by a master to indicate the address of the cacheline being evicted from its local cache.
  • Cache maintenance transactions are used to access and maintain caches of other master components in a system. A CleanShared, CleanInvalid or MakeInvalid transaction is used for this
  • Barrier transactions are used to provide guarantees about the ordering and observation of transactions in a system. This is dealt with in detail in a subsequent post.
  • Distributed Virtual Memory (DVM) transactions are used for virtual memory system maintenance.

Basic ‘Coherent Transaction’ Testing

As described above, a number of different transactions are used in ACE to maintain coherency. Since each of these transaction types has different response and coherency requirements, it is good to test each of the transaction types individually to make sure that the Interconnect meets all the specification requirements. I will take the example of the ReadShared transaction to describe the verification requirements for these transaction types in general. Given below is a table from the specification showing the cacheline state changes for the ReadShared transaction:

In the above table, the ‘Start State’ refers to the state of the cacheline in the master before the transaction was issued. RRESP refers to the response given by the Interconnect to the master that initiated the transaction. The ‘Expected End State’ refers to the state of the cacheline after the transaction is complete. The last two columns refer to other possible end states based on whether a snoop filter is present or not, the details of which we will not get into in this post. The second table refers to ‘speculative read’. This represents a transaction that was issued even before the master could read the status of the cacheline. Basically, a Read transaction need not be sent out of the master if its cache already has an entry for that address; however, to improve performance a master might choose to send out a transaction even before it gets information on the status of a cacheline. If the transaction was sent out in such a state, it is represented in the second table.

As seen from the tables above, there is a fairly large verification space for a single transaction. An important aspect to take note of is that the stimulus requires traffic from multiple masters. This is because the state space to be covered demands that all the different response types and cache states are tested. The different response types can be created in the system only if the masters have cachelines in certain cacheline states relative to each other. For example, a response type (RRESP) of ‘10’, indicating that the cacheline is shared by another master, requires that the cacheline is present in a master that is snooped by the Interconnect. The diagram below summarizes the key requirements for a sequence testing this:

  • The sequence must initialize the system to a random, but valid state before a transaction of a certain type is initiated. This ensures that all the different response types and cacheline states are exercised.
  • Initialization must ensure that the rules governing cache states are adhered to. For example, a cacheline can be unique or dirty in only one cache. If a cacheline is present in two masters and both cachelines are clean, then their data should be the same. Similarly, if all the cachelines of a location are clean, then the contents of the cacheline must match that of the memory.
  • Sequences must be configuration aware. This means that the sequence is aware of the number of masters in the system, the interface types of these masters and so on. Making sequences configuration aware ensures that the sequences are portable across systems with varying topologies.
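The state rules for initialization can be checked with a small function before a sequence commits to a randomly chosen starting state. This is a sketch under my own naming; `states[i]` is assumed to hold the state of one cacheline in the cache of master i.

```systemverilog
// Hypothetical helper package; all names are illustrative.
package ace_init_rules_pkg;

  typedef enum {
    INVALID,
    UNIQUE_CLEAN,
    UNIQUE_DIRTY,
    SHARED_CLEAN,
    SHARED_DIRTY
  } ace_state_e;

  // Returns 1 if the per-master states of one cacheline form a legal
  // combination, 0 otherwise.
  function automatic bit is_legal_combination(ace_state_e states[]);
    int valid_cnt  = 0;
    int unique_cnt = 0;
    int dirty_cnt  = 0;

    foreach (states[i]) begin
      if (states[i] != INVALID)                           valid_cnt++;
      if (states[i] inside {UNIQUE_CLEAN, UNIQUE_DIRTY})  unique_cnt++;
      if (states[i] inside {UNIQUE_DIRTY, SHARED_DIRTY})  dirty_cnt++;
    end

    // A unique cacheline must be the only valid copy in the system.
    if (unique_cnt > 0 && valid_cnt > 1) return 0;
    // At most one cache may hold the cacheline in a dirty state.
    if (dirty_cnt > 1) return 0;
    return 1;
  endfunction

endpackage
```

A sequence could randomize the state array and retry, or use an equivalent constraint, until this function returns 1, and then initialize the caches accordingly.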

Key Verification Points

  • Coherency across master caches

At any given point of time all masters must have the same view of data.

  • Coherency across master cache and memory

If all cachelines are clean, then the contents of the cacheline must match that of memory

  • Snoop transactions

Each transaction initiated by a master has a corresponding snoop transaction that will be initiated by the Interconnect. We need to ensure that the snoop transactions issued by the Interconnect are correct

  • Data integrity between snoop and coherent transactions

If a snoop transaction returns data, the same data must be returned to the master that requested the data through its read data channel.

  • Sequencing transactions

Transactions that access the same location have specific sequencing requirements by the Interconnect. This is dealt with in detail in the next post.
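The first two verification points lend themselves to simple scoreboard-style checks. The sketch below assumes a hypothetical 64-bit data value per cacheline and per-master arrays of state and data; the names and the data width are mine, chosen only to keep the example short.

```systemverilog
// Hypothetical helper package; names and the 64-bit width are illustrative.
package ace_data_check_pkg;

  typedef enum {
    INVALID,
    UNIQUE_CLEAN,
    UNIQUE_DIRTY,
    SHARED_CLEAN,
    SHARED_DIRTY
  } ace_state_e;

  // Coherency across master caches: every valid copy of the cacheline
  // must hold the same data.
  function automatic bit valid_copies_agree(ace_state_e states[],
                                            bit [63:0]  data[]);
    bit        seen = 0;
    bit [63:0] ref_data;
    foreach (states[i]) begin
      if (states[i] == INVALID) continue;
      if (!seen) begin
        ref_data = data[i];
        seen     = 1;
      end
      else if (data[i] != ref_data) return 0;
    end
    return 1;
  endfunction

  // Coherency across master caches and memory: if no copy is dirty,
  // every valid copy must match the contents of memory.
  function automatic bit clean_copies_match_memory(ace_state_e states[],
                                                   bit [63:0]  data[],
                                                   bit [63:0]  mem_data);
    foreach (states[i])
      if (states[i] inside {UNIQUE_DIRTY, SHARED_DIRTY}) return 1;
    foreach (states[i])
      if (states[i] != INVALID && data[i] != mem_data) return 0;
    return 1;
  endfunction

endpackage
```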

In this post I have described the testing strategy and the key aspects of coherent transaction testing. In the next post I will focus on some of the details of the specification relative to accesses to overlapping addresses.

References

[1] AMBA AXI and ACE Protocol Specification (http://infocenter.arm.com/help/topic/com.arm.doc.ihi0022-/)

Category : AMBA | AXI3/AXI4/ACE | IP Verification