ACE Ordering, Deadlocks & Bottlenecks

Deep dive into how the Interconnect manages transaction sequences and prevents system freezes.

1. Transaction Sequencing

In a multi-core system, multiple masters can target the same cache line simultaneously. The **Interconnect** acts as the arbiter, deciding the "canonical" order of these transactions.

  • Snoop Before Response: If a master receives a snoop *before* the response to its own request for the same line, the snoop is ordered first. The master must process the snoop as if its request hasn't happened yet.
  • Response Before Snoop: If the master gets the response first, its transaction is ordered first. Any subsequent snoop is handled according to the new state of the line.
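
The two ordering rules above can be illustrated with a toy model. This is a minimal sketch, not ACE signalling: the `Master` class, the `replay` helper, and the simplified state names `"Invalid"` and `"Unique"` are all assumptions made for illustration. It shows only that a snoop is answered from whatever state the line is in when the snoop arrives.

```python
class Master:
    """Toy master tracking one cache line; state names are simplified assumptions."""

    def __init__(self):
        self.state = "Invalid"  # line state before the master's request is ordered

    def on_response(self, granted_state):
        # Response to the master's own request: its transaction is now ordered.
        self.state = granted_state

    def on_invalidating_snoop(self):
        # The snoop is answered from the state the line is in right now.
        seen = self.state
        self.state = "Invalid"
        return seen


def replay(events):
    """Feed 'snoop' / 'response' events to a fresh master.

    Returns (states seen by each snoop, final line state).
    """
    master = Master()
    seen = []
    for ev in events:
        if ev == "snoop":
            seen.append(master.on_invalidating_snoop())
        elif ev == "response":
            master.on_response("Unique")
        else:
            raise ValueError(f"unknown event: {ev}")
    return seen, master.state


# Snoop before response: the snoop sees the pre-request state.
print(replay(["snoop", "response"]))
# Response before snoop: the snoop sees the newly granted state.
print(replay(["response", "snoop"]))
```

In the first replay the snoop is handled as if the request had not happened yet; in the second it acts on the line state produced by the completed request.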

2. Deadlock Scenarios

Deadlocks usually arise from circular dependencies between channels: each side waits on a handshake that the other side cannot complete. Common causes in ACE:

| Scenario | The "Dead" Lock | Prevention |
| --- | --- | --- |
| Buffer Full | Master waits for AR response. Interconnect sends AC (Snoop). Master's AC buffer is full because it's waiting for AR data to clear space. | Master must always be able to accept AC snoop, even if AR/AW are pending. |
| Circular Handshake | Interconnect waits for ACREADY to drive ACVALID. Master waits for ACVALID to clear its internal state and accept other requests. | Protocol Rule: ACVALID must never depend on ACREADY. |
| RACK/WACK Stall | Interconnect waits for RACK to finish a transaction, but master stalls RACK waiting for another snoop to complete. | RACK/WACK must be generated solely based on channel handshakes, with no internal stalls. |
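
One way to reason about these scenarios is to draw a "waits-for" graph and look for a cycle. The sketch below is a generic depth-first cycle search, not part of any ACE tooling; the function name `find_cycle` and the node labels (`"AC accept"`, `"R data"`, `"snoop response"`) are hypothetical and only loosely model the Buffer Full scenario.

```python
def find_cycle(waits_for):
    """Return one dependency cycle as a list of nodes, or None if the graph is acyclic.

    `waits_for` maps each node to the nodes it cannot progress without.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, fully explored
    colour = {}
    path = []

    def visit(node):
        colour[node] = GREY
        path.append(node)
        for nxt in waits_for.get(node, ()):
            state = colour.get(nxt, WHITE)
            if state == GREY:
                # Back edge: nxt is already on the current path, so we found a cycle.
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        colour[node] = BLACK
        return None

    for node in list(waits_for):
        if colour.get(node, WHITE) == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# Hypothetical Buffer Full design: snoop acceptance waits on read data,
# read data waits on the snoop response, which waits on snoop acceptance.
broken = {
    "AC accept": ["R data"],
    "R data": ["snoop response"],
    "snoop response": ["AC accept"],
}

# Same design with the prevention rule applied: AC acceptance waits on nothing.
fixed = {
    "AC accept": [],
    "R data": ["snoop response"],
    "snoop response": ["AC accept"],
}

print(find_cycle(broken))
print(find_cycle(fixed))
```

Each prevention rule in the table removes one edge from such a graph, which is enough to break the loop.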

3. Integration Bottlenecks

Even if the system doesn't deadlock, it can slow down significantly due to these bottlenecks:

  • Snoop Multiplication: One request from a master can trigger 3-4 snoop transactions to other cores. In a 16-core system, where a broadcast can reach up to 15 other cores, this can flood the AC channel.
  • Snoop Filter Latency: If the Interconnect uses a snoop filter (to avoid snooping everyone), the filter lookup time adds latency to every coherent transaction.
  • Serialization Point: All transactions to the same line must be serialized at the Interconnect. This "chokepoint" limits the throughput for hot memory locations (like locks or semi-global variables).
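
The snoop multiplication effect is easy to estimate with back-of-the-envelope arithmetic. The sketch below assumes a simple broadcast model in which every coherent request snoops every other master unless a snoop filter narrows the set; the function names and the `filtered_sharers` parameter are hypothetical, and real traffic depends on the transaction mix.

```python
def snoop_fanout(num_masters, filtered_sharers=None):
    """Snoops generated by one coherent request.

    Without a snoop filter (filtered_sharers is None), every other master is snooped.
    With a filter, only the masters it reports as holding the line are snooped.
    """
    others = num_masters - 1
    if filtered_sharers is None:
        return others
    return min(filtered_sharers, others)


def total_snoops(num_masters, requests_per_master, filtered_sharers=None):
    """Total snoops on the AC channels when every master issues the same number of requests."""
    fanout = snoop_fanout(num_masters, filtered_sharers)
    return num_masters * requests_per_master * fanout


print(total_snoops(4, 100))                        # broadcast, 4 masters
print(total_snoops(16, 100))                       # broadcast, 16 masters
print(total_snoops(16, 100, filtered_sharers=2))   # 16 masters, filter finds 2 sharers
```

Under these assumptions the broadcast cost grows roughly with the square of the master count, which is why a snoop filter can pay for its lookup latency in larger systems.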

4. Livelocks

A livelock happens when cores keep retrying transactions but never make progress. In ACE, this can happen when two cores keep requesting Unique access (for example with MakeUnique) to the same line, each invalidating the other's work before it completes. A common mitigation is for masters to implement a "Back-off" delay or a "Starvation Counter".
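
The effect of a back-off can be seen in a small cycle-by-cycle simulation. This is a deliberately simplified sketch, not a model of any real master: the `simulate` function, the fixed back-off length, and the rule that a core needs `work_cycles` consecutive cycles of ownership are all assumptions chosen to make the livelock visible.

```python
def simulate(backoff, work_cycles=3, max_cycles=100):
    """Two cores fight over one line; return the cycle when both finish, or None.

    A core completes its update after `work_cycles` consecutive cycles of ownership.
    A core that loses the line restarts its work and waits `backoff` cycles before retrying.
    """
    owner = None          # core currently holding the line uniquely
    progress = [0, 0]     # consecutive ownership cycles accumulated per core
    wait = [0, 0]         # remaining back-off cycles per core
    done = [False, False]

    for cycle in range(max_cycles):
        # Cores requesting this cycle: unfinished, not the owner, not backing off.
        requesters = [c for c in (0, 1)
                      if not done[c] and c != owner and wait[c] == 0]
        # Tick down the back-off timers.
        wait = [max(0, w - 1) for w in wait]

        if requesters:
            winner = requesters[0]
            if owner is not None:
                progress[owner] = 0     # old owner's partial work is invalidated
                wait[owner] = backoff   # and it backs off before retrying
            owner = winner

        if owner is not None:
            progress[owner] += 1
            if progress[owner] == work_cycles:
                done[owner] = True
                owner = None            # line is released once the update completes

        if all(done):
            return cycle
    return None  # no completion within max_cycles: treated as livelock


print(simulate(backoff=0))  # cores steal the line from each other every cycle
print(simulate(backoff=3))  # the loser waits long enough for the winner to finish
```

In this toy model the back-off only helps when it is long enough for the current owner to finish, which hints at why the delay has to be tuned against the expected critical-section length.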