Universal Chiplet Interconnect Express (UCIe)
Specification
August 6, 2024
Revision 2.0, Version 1.0
LEGAL NOTICE FOR THIS PUBLICLY AVAILABLE SPECIFICATION FROM UNIVERSAL CHIPLET INTERCONNECT
EXPRESS, INC.
This Universal Chiplet Interconnect Express, Inc. Specification (this “UCIe Specification” or this “document”) is owned by and
is proprietary to Universal Chiplet Interconnect Express, Inc., a Delaware nonprofit corporation (sometimes referred to as “UCIe”
or the “UCIe Consortium” or the “Company”) and/or its successors and assigns.
NOTICE TO USERS WHO ARE MEMBERS OF UNIVERSAL CHIPLET INTERCONNECT EXPRESS, INC.:
If you are a Member of Universal Chiplet Interconnect Express, Inc. (herein referred to as a “UCIe Member”), and even if you
have received this publicly available version of this UCIe Specification after agreeing to the UCIe Consortium’s Evaluation Copy
Agreement (a copy of which is available at www.uciexpress.org/specifications), each such UCIe Member must also be in
compliance with all of the following UCIe Consortium documents, policies and/or procedures (collectively, the “UCIe Governing
Documents”) in order for such UCIe Member’s use and/or implementation of this UCIe Specification to receive and enjoy all of
the rights, benefits, privileges and protections of UCIe Consortium membership: (i) the UCIe Consortium’s Intellectual Property
Policy; (ii) the UCIe Consortium’s Bylaws; (iii) any and all other UCIe Consortium policies and procedures; and (iv) the UCIe
Member’s Participation Agreement.
If you are not a UCIe Member and have received this publicly available version of this UCIe Specification, your use of this
document is subject to your compliance with, and is limited by, all of the terms and conditions of the UCIe Consortium’s
Evaluation Copy Agreement (a copy of which is available at www.uciexpress.org/specifications).
In addition to the restrictions set forth in the UCIe Consortium’s Evaluation Copy Agreement, any references or citations to this
document must acknowledge Universal Chiplet Interconnect Express, Inc.’s sole and exclusive copyright ownership of this UCIe
Specification. The proper copyright citation or reference is as follows: “© 2022–2024 UNIVERSAL CHIPLET INTERCONNECT
EXPRESS, INC. ALL RIGHTS RESERVED.” When making any such citation or reference to this document you are not permitted
to revise, alter, modify, make any derivatives of, or otherwise amend the referenced portion of this document in any way without
the prior express written permission of Universal Chiplet Interconnect Express, Inc.
Except for the limited rights explicitly given to a non-UCIe Member pursuant to the explicit provisions of the UCIe Consortium’s
Evaluation Copy Agreement which governs the publicly available version of this UCIe Specification, nothing contained in this UCIe
Specification shall be deemed as granting (either expressly or impliedly) to any party that is not a UCIe Member: (i) any kind of
license to implement or use this UCIe Specification or any portion or content described or contained therein, or any kind of license
in or to any other intellectual property owned or controlled by the UCIe Consortium, including without limitation any trademarks
of the UCIe Consortium; or (ii) any benefits and/or rights as a UCIe Member under any UCIe Governing Documents. For clarity,
and without limiting the foregoing notice in any way, if you are not a UCIe Member but still elect to implement this UCIe
Specification or any portion described herein, you are hereby given notice that your election to do so does not give you any of the
rights, benefits, and/or protections of the UCIe Members, including without limitation any of the rights, benefits, privileges or
protections given to a UCIe Member under the UCIe Consortium’s Intellectual Property Policy.
THIS DOCUMENT AND ALL SPECIFICATIONS AND/OR OTHER CONTENT PROVIDED HEREIN IS PROVIDED ON AN “AS IS” BASIS.
TO THE MAXIMUM EXTENT PERMITTED BY APPLICABLE LAW, UNIVERSAL CHIPLET INTERCONNECT EXPRESS, INC. (ALONG WITH
THE CONTRIBUTORS TO THIS DOCUMENT) HEREBY DISCLAIM ALL REPRESENTATIONS, WARRANTIES AND/OR COVENANTS,
EITHER EXPRESS OR IMPLIED, STATUTORY OR AT COMMON LAW, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, TITLE, VALIDITY, AND/OR NON-INFRINGEMENT.
In the event this UCIe Specification makes any references (including without limitation any incorporation by reference) to another
standard’s setting organization’s or any other party’s (“Third Party”) content or work, including without limitation any
specifications or standards of any such Third Party (“Third Party Specification”), you are hereby notified that your use or
implementation of any Third Party Specification: (i) is not governed by any of the UCIe Governing Documents; (ii) may require
your use of a Third Party’s patents, copyrights or other intellectual property rights, which in turn may require you to
independently obtain a license or other consent from that Third Party in order to have full rights to implement or use that Third
Party Specification; and/or (iii) may be governed by the intellectual property policy or other policies or procedures of the Third
Party which owns the Third Party Specification. Any trademarks or service marks of any Third Party which may be referenced in
this UCIe Specification are owned by the respective owner of such marks.
The UCIe™ and UNIVERSAL CHIPLET INTERCONNECT EXPRESS™ trademarks and logos (the “UCIe Trademarks”) are
trademarks owned by the UCIe Consortium. The UCIe Consortium reserves all rights in and to all of its UCIe Trademarks.
NOTICE TO ALL PARTIES REGARDING THE PCI-SIG UNIQUE VALUE PROVIDED IN THIS SPECIFICATION:
NOTICE TO USERS: THE UNIQUE VALUE THAT IS PROVIDED IN THIS SPECIFICATION IS FOR USE IN VENDOR DEFINED MESSAGE
FIELDS, DESIGNATED VENDOR SPECIFIC EXTENDED CAPABILITIES, AND ALTERNATE PROTOCOL NEGOTIATION ONLY, AND MAY
NOT BE USED IN ANY OTHER MANNER, AND A USER OF THE UNIQUE VALUE MAY NOT USE THE UNIQUE VALUE IN A MANNER
THAT (A) ALTERS, MODIFIES, HARMS OR DAMAGES THE TECHNICAL FUNCTIONING, SAFETY OR SECURITY OF THE PCI-SIG
ECOSYSTEM OR ANY PORTION THEREOF, OR (B) COULD OR WOULD REASONABLY BE DETERMINED TO ALTER, MODIFY, HARM OR
DAMAGE THE TECHNICAL FUNCTIONING, SAFETY OR SECURITY OF THE PCI-SIG ECOSYSTEM OR ANY PORTION THEREOF (FOR
PURPOSES OF THIS NOTICE, “PCI-SIG ECOSYSTEM” MEANS THE PCI-SIG SPECIFICATIONS, MEMBERS OF PCI-SIG AND THEIR
ASSOCIATED PRODUCTS AND SERVICES THAT INCORPORATE ALL OR A PORTION OF A PCI-SIG SPECIFICATION AND EXTENDS
TO THOSE PRODUCTS AND SERVICES INTERFACING WITH PCI-SIG MEMBER PRODUCTS AND SERVICES).
Terminology
Reference Documents
Revision History
1.0 Introduction
1.1 UCIe Components
1.1.1 Protocol Layer
1.1.2 Die-to-Die (D2D) Adapter
1.1.3 Physical Layer
1.1.4 Interfaces
1.2 UCIe Configurations
1.2.1 Single Module Configuration
1.2.2 Multi-module Configurations
1.2.3 Sideband-only Configurations
1.3 UCIe Retimers
1.4 UCIe Key Performance Targets
1.5 Interoperability
2.0 Protocol Layer
2.1 PCIe
2.1.1 Raw Format
2.1.2 Standard 256B End Header Flit Format
2.1.3 68B Flit Format
2.1.4 Standard 256B Start Header Flit Format
2.1.5 Latency-Optimized 256B with Optional Bytes Flit Format
2.2 CXL 256B Flit Mode
2.2.1 Raw Format
2.2.2 Latency-Optimized 256B Flit Formats
2.2.3 Standard 256B Start Header Flit Format
2.3 CXL 68B Flit Mode
2.3.1 Raw Format
2.3.2 68B Flit Format
2.4 Streaming Protocol
2.4.1 Raw Format
2.4.2 68B Flit Format
2.4.3 Standard 256B Flit Formats
2.4.4 Latency-Optimized 256B Flit Formats
2.5 Management Transport Protocol
3.0 Die-to-Die Adapter
3.1 Stack Multiplexing
3.2 Link Initialization
3.2.1 Stage 3 of Link Initialization: Adapter Initialization
3.2.1.1 Part 1: Determine Local Capabilities
3.2.1.2 Part 2: Parameter Exchange with Remote Link Partner
3.2.1.3 Part 3: FDI bring up
3.3 Operation Formats
3.3.1 Raw Format for All Protocols
3.3.2 68B Flit Format
3.3.2.1 68B Flit Format Alignment and Padding Rules
3.3.3 Standard 256B Flit Formats
3.3.4 Latency-Optimized 256B Flit Formats
9.5.2.1 PCI Express Extended Capability Header (Offset 0h)
9.5.2.2 Designated Vendor Specific Header 1, 2 (Offsets 4h and 8h)
9.5.2.3 UCIe Switch Register Block (UiSRB) Base Address (Offset Ch)
9.5.3 D2D/PHY Register Block
9.5.3.1 UCIe Register Block Header
9.5.3.2 Uncorrectable Error Status Register (Offset 10h)
9.5.3.3 Uncorrectable Error Mask Register (Offset 14h)
9.5.3.4 Uncorrectable Error Severity Register (Offset 18h)
9.5.3.5 Correctable Error Status Register (Offset 1Ch)
9.5.3.6 Correctable Error Mask Register (Offset 20h)
9.5.3.7 Header Log 1 Register (Offset 24h)
9.5.3.8 Header Log 2 Register (Offset 2Ch)
9.5.3.9 Error and Link Testing Control Register (Offset 30h)
9.5.3.10 Runtime Link Testing Parity Log 0 (Offset 34h)
9.5.3.11 Runtime Link Testing Parity Log 1 (Offset 3Ch)
9.5.3.12 Runtime Link Testing Parity Log 2 (Offset 44h)
9.5.3.13 Runtime Link Testing Parity Log 3 (Offset 4Ch)
9.5.3.14 Advertised Adapter Capability Log (Offset 54h)
9.5.3.15 Finalized Adapter Capability Log (Offset 5Ch)
9.5.3.16 Advertised CXL Capability Log (Offset 64h)
9.5.3.17 Finalized CXL Capability Log (Offset 6Ch)
9.5.3.18 Advertised Multi-Protocol Capability Log Register (Offset 78h)
9.5.3.19 Finalized Multi-Protocol Capability Log Register (Offset 80h)
9.5.3.20 Advertised CXL Capability Log Register for Stack 1 (Offset 88h)
9.5.3.21 Finalized CXL Capability Log Register for Stack 1 (Offset 90h)
9.5.3.22 PHY Capability (Offset 1000h)
9.5.3.23 PHY Control (Offset 1004h)
9.5.3.24 PHY Status (Offset 1008h)
9.5.3.25 PHY Initialization and Debug (Offset 100Ch)
9.5.3.26 Training Setup 1 (Offset 1010h)
9.5.3.27 Training Setup 2 (Offset 1020h)
9.5.3.28 Training Setup 3 (Offset 1030h)
9.5.3.29 Training Setup 4 (Offset 1050h)
9.5.3.30 Current Lane Map Module 0 (Offset 1060h)
9.5.3.31 Current Lane Map Module 1 (Offset 1068h)
9.5.3.32 Current Lane Map Module 2 (Offset 1070h)
9.5.3.33 Current Lane Map Module 3 (Offset 1078h)
9.5.3.34 Error Log 0 (Offset 1080h)
9.5.3.35 Error Log 1 (Offset 1090h)
9.5.3.36 Runtime Link Test Control (Offset 1100h)
9.5.3.37 Runtime Link Test Status (Offset 1108h)
9.5.3.38 Mainband Data Repair (Offset 110Ch)
9.5.3.39 Clock, Track, Valid and Sideband Repair (Offset 1134h)
9.5.3.40 UCIe Link Health Monitor (UHM) DVSEC
9.5.3.40.1 UHM Status (Offset Eh)
9.5.3.40.2 Eye Margin (Starting Offset 18h)
9.5.4 Test/Compliance Register Block
9.5.4.1 UCIe Register Block Header
9.5.4.2 D2D Adapter Test/Compliance Register Block Offset
9.5.4.3 PHY Test/Compliance Register Block Offset
9.5.4.4 D2D Adapter Test/Compliance Register Block
9.5.4.4.1 Adapter Compliance Control
9.5.4.4.2 Flit Tx Injection Control
9.5.4.4.3 Adapter Test Status (Offset 30h from D2DOFF)
9.5.4.4.4 Link State Injection Control Stack 0 (Offset 34h from D2DOFF)
9.5.4.4.5 Link State Injection Control Stack 1 (Offset 38h from D2DOFF)
9.5.4.4.6 Retry Injection Control (Offset 40h from D2DOFF)
Term Definition
Ack: Acknowledge
Addr: Address
Advanced Package: This packaging technology is used for performance optimized applications and short reach interconnects.
Asset: Any data or mechanism used to access data that should be protected from illicit access, use, availability, disclosure, alteration, destruction, or theft.
B2B: Back-to-Back
BE: Byte Enable
bubble: Gap in data transfer and/or signal transitions. Measured in number of clock cycles.
bundle: Tx group or Rx group for UCIe-3D interconnects that contains data, clock, power, and ground. A 3D Module consists of a Tx bundle and an Rx bundle.
CA: Completer Abort
chiplet: Integrated circuit die that contains a well-defined subset of functionality that is designed to be combined with other chiplets in a package.
clear, cleared: If clear or reset is used and no value is provided for a bit, it is interpreted as 0b.
CXL 68B Flit Mode: This term is used to reference 68B Flit Mode related Protocol features defined in CXL Specification.
CXL 256B Flit Mode: This term is used to reference 256B Flit Mode related Protocol features defined in CXL Specification.
D2C: Data-to-Clock
D2D: Die-to-Die
DevID: Device ID
DLLP: Data Link Layer Packet (as defined in PCIe Base Specification)
DLP: In Flit modes, the Data Link Layer Payload within a flit (as defined in PCIe Base Specification).
DMH: DFx Management Hub. DFx entity that provides enumeration/global control/status of DFx capabilities in a chiplet.
DMS: DFx Management Spoke. DFx entity that implements a specific test/debug functionality within a DMH.
DMS-ID: Static design time ID assigned to a DMS for ID-routed messages within a DMH. Interchangeably used with the term Spoke-ID.
Domain Reset (domain reset): Used to refer to a hardware mechanism that sets or returns all UCIe registers and state machines associated with a given UCIe Link to their initialization values as specified in this document. It is required for both sides of the Link to have an overlapping time window such that they are both in domain reset concurrently.
DP: Downstream Port
DVSEC: Designated Vendor-Specific Extended Capability (as defined in PCIe Base Specification)
DWORD: Double Word. Four bytes. When used as an addressable quantity, a Double Word is four bytes of data that are aligned on a four-byte boundary (i.e., the least significant two bits of the address are 00b).
EM: Eye Margin
Encapsulated MTP, eMTP: Encapsulated Management Transport Packet. The resulting packet after Encapsulation.
Encapsulation: Process of splitting an MTP or Vendor defined messages (exchanged between Management Port Gateways on both ends of a link) into smaller pieces to meet any required payload length restrictions or for any other reasons like credit availability, adding a 2-DWORD header to each piece and, if required, adding a 1-DWORD data padding at the end of an MTP to transmit the MTP over sideband or mainband UCIe link. In the case of an MTP, the resulting packet after Encapsulation is called the Encapsulated MTP.
Endpoint, EP: As defined in PCIe Base Specification.
F2B: Face-to-Back
F2F: Face-to-Face
FH: Flit Header
Flit_Marker, FM: Flit Marker (as defined in PCIe Base Specification)
FW: Firmware
FW-CLK: Forwarded Clock over the UCIe Link for mainband data Lanes
HW: Hardware
IL: Insertion Loss
I/O: Input/Output
IP: Generic term used to refer to architecture blocks that are defined within the specification (e.g., D2D adapter, PHY, etc.).
Lane: A pair of signals mapped to physical bumps, one for Transmission, and one for Reception. A xN UCIe Link is composed of N Lanes.
LCLK: Refers to the clock at which the Logical Physical Layer, Adapter and RDI/FDI are operating.
Link, UCIe Link: A Link or UCIe Link refers to the set of two UCIe components and their interconnecting Lanes which forms a dual-simplex communications path between the two components.
LTSSM: Link Training and Status State Machine (as defined in PCIe Base Specification)
Mainband, MB: Connection that constitutes the main data path of UCIe. Consists of a forwarded clock, a data valid pin, and N Lanes of data per module.
Management Bridge: Type of Management Entity that bridges a Management Network within an SiP to another network that may be internal or external to the SiP.
Management Director: Management Element that is responsible for discovering, configuring, and coordinating the overall management of the SiP and acts as the manageability Root of Trust (RoT).
Management Domain: One or more chiplets in an SiP that are interconnected by a Management Network and support UCIe Manageability.
Management Element: Type of Management Entity that can perform one or more management functions.
Management Entity: Addressable entity on the Management Network that can send and/or receive UCIe Management Transport packets. A Management Element, a Management Port, and a Management Bridge are all a type of Management Entity.
Management Link Encapsulation Mechanism: Mechanism that defines how UCIe Management Transport packets are transferred across a point-to-point management link.
Management Network: Network within and between chiplets that is capable of transporting UCIe Management Transport packets.
Management Port: Management Entity that facilitates management communication between chiplets using a chiplet-to-chiplet management link.
Management Port Gateway (MPG): Entity that provides the bridging functionality when transporting an MTP from/to a local SoC management fabric (which is an SoC-specific implementation) to/from a UCIe link.
Management Reset: Type of reset that causes all UCIe manageability and manageability structures in a chiplet to be reset to their default state.
Module: UCIe main data path on the physical bumps is organized as a group of Lanes called a Module. For Standard Package, 16 Lanes constitute a single Module. For Advanced Package, 64 Lanes constitute a single Module.
NOP: No Operation
One-Time Programmable: Any data storage mechanism that is capable of being programmed only once (e.g., fuse).
PCIe (PCI Express): Peripheral Component Interconnect Express (defined in PCIe Base Specification)
PCIe Flit Mode: This term is used to reference Flit Mode related Protocol features defined in PCIe Base Specification.
PCIe non-Flit Mode: This term is used to reference non-Flit Mode related Protocol features defined in PCIe Base Specification.
PHY: Physical Layer (PHY and Physical Layer are used interchangeably throughout the Specification)
PI: Phase Interpolator
PM: Power Management states; used to refer to behavior and/or rules related to Power Management states (covers both L1 and L2).
QWORD: Quad Word. Eight bytes. When used as an addressable quantity, a Quad Word is eight bytes of data that are aligned on an eight-byte boundary (i.e., the least significant three bits of the address are 000b).
RCKN_P / RXCKN / rxckn: Physical Lane for Clock Receiver Phase-2
RCKP_P / RXCKP / rxckp: Physical Lane for Clock Receiver Phase-1
RD_P[N] / RD_PN / RXDATA[N] / rxdataN: Nth Physical Lane for Data Receiver
remote Link partner: This term is used throughout this specification to denote the logic associated with the far side of the UCIe Link; to denote actions or messages sent or received by the Link partner of a UCIe die.
Replay, Retry: Retry and Replay are used interchangeably to refer to the Link level reliability mechanisms.
Reserved: The contents, states, or information are not defined at this time. Using any Reserved area (for example, packet header bit-fields, configuration register bits) is not permitted. Reserved register fields must be read only and must return 0 (all 0s for multi-bit fields) when read. For packets transmitted and received over the UCIe Link (mainband or sideband), the Reserved bits must be cleared to 0b by the sender and ignored by the receiver. Reserved encodings for register and packet fields must not be used. Any implementation dependence on a Reserved field value or encoding will result in an implementation that is not UCIe-compliant. The functionality of such an implementation cannot be guaranteed in this or any future revision of this specification. For registers, UCIe uses the “RsvdP” or “RsvdZ” attributes for reserved fields, as well as Rsvd, and these follow the same definition as PCIe Base Specification for hardware as well as software.
reset: If reset or clear is used and no value is provided for a bit, it is interpreted as 0b.
RID: Revision ID
RL: Register Locator
Root Port, RP: As defined in PCIe Base Specification.
RRDCK_P / RXCKRD / rxckRD: Physical Lane for redundant Clock/Track Receiver
RRD_P[N] / RRD_PN / RXDATARD[N] / rxdataRD[N]: Nth Physical Lane for redundant Data Receiver
RRDVLD_P / RXVLDRD / rxvldRD: Physical Lane for redundant Valid Receiver
RTRK_P / RXTRK / rxtrk: Physical Lane for Track Receiver
RVLD_P / RXVLD / rxvld: Physical Lane for Valid Receiver
Rx: Receiver
RXCKSB / rxcksb: Physical Lane for sideband Clock Receiver
RXCKSBRD / rxcksbRD: Physical Lane for redundant sideband Clock Receiver
RXDATASB / rxdatasb: Physical Lane for sideband Data Receiver
RXDATASBRD / rxdatasbRD: Physical Lane for redundant sideband Data Receiver
{<SBMSG>}: Sideband message requests or responses are referred to by their names enclosed in curly brackets. See Chapter 7.0 for the mapping of sideband message names to relevant encodings. An asterisk in the <SBMSG> name is used to denote a group of messages with the same prefix or suffix in their name.
SC: Successful Completion
Segmentation: Process of taking a large MTP, splitting it into smaller “segments” and sending those segments on multiple sideband links or mainband stacks.
SERDES: Serializer/Deserializer
serial packet: A 64-bit serial packet is defined on the sideband I/O interface to the remote chiplet as shown in Figure 4-8.
set: If set is used and no value is provided for a bit, it is interpreted as 1b.
Sideband, SB: Connection used for parameter exchanges, register accesses for debug/compliance and coordination with remote partner for Link training and management. Consists of a forwarded clock pin and a data pin in each direction. The clock is fixed at 800 MHz regardless of the main data path speed. The sideband logic for the UCIe Physical Layer must be on auxiliary power and an “always on” domain. Each module has its own set of sideband pins.
SM: State Machine
SO: Sideband-only
Standard Package: This packaging technology is used for low cost and long reach interconnects using traces on organic package/substrate.
SW: Software
TC: Traffic Class
TCKN_P / TXCKN / txckn: Physical Lane for Clock Transmitter Phase-2
TCKP_P / TXCKP / txckp: Physical Lane for Clock Transmitter Phase-1
TD_P[N] / TD_PN / TXDATA[N] / txdataN: Nth Physical Lane for Data Transmitter
TRD_P[N] / TRD_PN / TXDATARD[N] / txdataRD[N]: Nth Physical Lane for redundant Data Transmitter
TRDCK_P / TXCKRD / txckRD: Physical Lane for redundant Clock/Track Transmitter
TRDVLD_P / TXVLDRD / txvldRD: Physical Lane for redundant Valid Transmitter
Trx: Transceiver
TTRK_P / TXTRK / txtrk: Physical Lane for Track Transmitter
TVLD_P / TXVLD / txvld: Physical Lane for Valid Transmitter
Tx: Transmitter
TXCKSB / txcksb: Physical Lane for sideband Clock Transmitter
TXCKSBRD / txcksbRD: Physical Lane for redundant sideband Clock Transmitter
TXDATASB / txdatasb: Physical Lane for sideband Data Transmitter
TXDATASBRD / txdatasbRD: Physical Lane for redundant sideband Data Transmitter
UCIe-A x32: Used to denote x32 Advanced Package module. See Chapter 5.0 for UCIe-A x32 Advanced Package bump matrices, and interoperability between x32 to x32 and x32 to x64 module configurations.
UCIe DFx Message, UDM: Generic term for all UCIe Management Transport packets with Protocol ID set to ‘Test and Debug Protocols’.
UCIe die: This term is used throughout this specification to denote the logic associated with the UCIe Link on any given chiplet with a UCIe Link connection. It is used as a common noun to denote actions or messages sent or received by an implementation of UCIe.
UCIe Flit Mode: Operating Mode in which CRC bytes are inserted and checked by the D2D Adapter. If applicable, Retry is also performed by the D2D Adapter.
UCIe Link: A UCIe connection between two chiplets. These chiplets are Link partners in the context of UCIe since they communicate with each other using a common UCIe Link.
UCIe Mainband Management Port: Chiplet port that implements the Management Link Encapsulation Mechanism and can transfer UCIe Management Transport packets across a point-to-point UCIe mainband link.
UCIe Management Transport Protocol: Protocol used to transfer UCIe Management Transport packets between management entities.
UCIe Raw Format: Operating format in which all the bytes of a Flit are populated by the Protocol Layer.
UCIe Sideband Management Port: Chiplet port that implements the Management Link Encapsulation Mechanism and can transfer UCIe Management Transport packets across a point-to-point UCIe sideband link.
Unit Interval, UI: Given a data stream of a repeating pattern of alternating 1 and 0 values, the Unit Interval is the value measured by averaging the time interval between voltage transitions, over a time interval sufficiently long to make all intentional frequency modulation of the source clock negligible. (A worked example follows the abbreviations below.)
UP: Upstream Port
UR: Unsupported Request
zero: Numerical value of 0 in a bit, field, or register, of appropriate width for that bit, field, or register.
b bit
B byte
dB decibel
fF femtofarad
GHz gigahertz
MHz megahertz
mm millimeter
ms millisecond
mV millivolt
um micrometer
us microsecond
ns nanosecond
pJ picojoule
pk peak
ps picosecond
s second
V volt
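As a worked illustration of the Unit Interval definition above (the numbers follow directly from the definition and are not additional requirements), the UI for an alternating 1/0 pattern is the reciprocal of the per-Lane data rate:

$$
\mathrm{UI} = \frac{1}{\text{data rate}}, \qquad
\mathrm{UI}_{32\,\mathrm{GT/s}} = \frac{1}{32 \times 10^{9}\,\mathrm{s}^{-1}} = 31.25\ \mathrm{ps}, \qquad
\mathrm{UI}_{4\,\mathrm{GT/s}} = \frac{1}{4 \times 10^{9}\,\mathrm{s}^{-1}} = 250\ \mathrm{ps}
$$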
Reference Documents
Table 3. Reference Documents^a
ACPI Specification (version 6.5 or later): www.uefi.org
a. References to these documents throughout this specification relate to the versions/revisions listed here.
Revision History
Table 4 lists the significant changes in different revisions.
§§
This chapter provides an overview of the Universal Chiplet Interconnect Express (UCIe) architecture.
UCIe is an open, multi-protocol capable, on-package interconnect standard for connecting multiple
dies on the same package. The primary motivation is to enable a vibrant ecosystem supporting
disaggregated die architectures which can be interconnected using UCIe. UCIe supports multiple
protocols (PCIe, CXL, Streaming, and a raw format that can be used to map any protocol of choice as
long as both ends support it) on top of a common physical and Link layer. It encompasses the
elements needed for SoC construction such as the application layer, as well as the form-factors
relevant to the package (e.g., bump location, power delivery, thermal solution, etc.).
The specification is defined to ensure interoperability across a wide range of devices having different
performance characteristics. A well-defined debug and compliance mechanism is provided to ensure
interoperability. It is expected that the specification will evolve in a backward compatible manner.
While UCIe supports a wide range of usage models, some are provided here as an illustration of the
type of capability and innovation it can unleash in the compute industry. The initial protocols being
mapped to UCIe are PCIe, CXL, and Streaming. The mappings for all protocols are done using a Flit
Format, including the Raw Format. Both PCIe and CXL are widely used and these protocol mappings
will enable more on-package integration by replacing the PCIe SERDES PHY and the PCIe/CXL LogPHY
along with the Link level Retry with a UCIe Adapter and PHY to improve the power and performance
characteristics. UCIe provisions for Streaming protocols to also leverage the Link Level Retry of the
UCIe Adapter, and this can be used to provide reliable transport for protocols other than PCIe or CXL.
UCIe also supports a Raw Format that is protocol-agnostic to enable other protocols to be mapped;
while allowing usages such as integrating a standalone SERDES/transceiver tile (e.g., ethernet) on-
package. When using Raw Format, the Protocol Layer is responsible for reliable transport across the
UCIe Link.
Figure 1-1 demonstrates an SoC package composed of CPU Dies, accelerator Die(s) and I/O Tile
Die(s) connected through UCIe. The accelerator or I/O Tile can use CXL transactions over UCIe when
connected to a CPU — leveraging the I/O, coherency, and memory protocols of CXL. The I/O tile can
provide the external CXL, PCIe, and DDR pins of the package. The accelerator can also use PCIe
transactions over UCIe when connected to a CPU. The CPU to CPU connectivity on-package can also
use the UCIe interconnect, running coherency protocols.
A UCIe Retimer may be used to extend the UCIe connectivity beyond the package using an Off-
Package Interconnect. Examples of Off-Package Interconnect include electrical cable or optical cable
or any other technology to connect packages at a Rack/Pod level as shown in Figure 1-2. The UCIe
specification requires the UCIe Retimer to implement the UCIe interface to the Die that it connects on
its local package and ensure that the Flits are delivered to the remote UCIe Die interface in the
separate package following UCIe protocol using the channel extension technology of its choice.
Figure 1-2 demonstrates a rack/pod-level disaggregation using CXL protocol. Figure 1-2a shows the
rack level view where multiple compute nodes (virtual hierarchy) from different compute chassis
connect to a CXL switch which connects to multiple CXL accelerators/Type-3 memory devices which
can be placed in one or more separate drawer. The logical view of this connectivity is shown in
Figure 1-2b, where each “host” (H1, H2,…) is a compute drawer. Each compute drawer connects to
the switch using an Off-Package Interconnect running CXL protocol through a UCIe Retimer, as shown
in Figure 1-2c. The switch also has co-package Retimers where the Retimer tiles connect to the main
switch die using UCIe and on the other side are the PCIe/CXL physical interconnects to connect to the
accelerators/memory devices, as shown in Figure 1-2c.
UCIe permits three different packaging options: Standard Package (2D), Advanced Package (2.5D), and UCIe-3D. These cover the spectrum from the lowest-cost to the highest-performance interconnects.
1. Standard Package — This packaging technology is used for low cost and long reach (10 mm to
25 mm, when measured from a bump on one Die to the connecting bump of the remote Die)
interconnects using traces on organic package/substrate, while still providing significantly better
BER characteristics compared to off-package SERDES. Figure 1-3 shows an example application
using the Standard Package option. Table 1-1 shows a summary of the characteristics of the
Standard Package option with UCIe.
Table 1-1. Standard Package characteristics (fragment)
Supported speeds (per Lane): 4 GT/s, 8 GT/s, 12 GT/s, 16 GT/s, 24 GT/s, 32 GT/s

2. Advanced Package: This packaging technology is used for performance optimized applications and short reach interconnects. Table 1-2 shows a summary of the characteristics of the Advanced Package option with UCIe.

Table 1-2. Advanced Package characteristics (fragment)
Supported speeds (per Lane): 4 GT/s, 8 GT/s, 12 GT/s, 16 GT/s, 24 GT/s, 32 GT/s
Bump pitch: 25 um to 55 um
Channel reach: 2 mm
Raw Bit Error Rate (BER): 1e-27 (<= 12 GT/s); 1e-15 (>= 16 GT/s)
3. UCIe-3D: This packaging technology uses a two-dimensional array of interconnect bumps for data transmission between dies where one die is stacked on top of another. A menu of design options is provided for vendors to develop standard building blocks.
Table 1-3 shows a summary of the main characteristics of UCIe-3D. Figure 1-7 shows an example of UCIe-3D. See Chapter 6.0 for a detailed description of UCIe-3D.

Table 1-3. UCIe-3D characteristics (fragment)
Bump pitch: < 10 um (optimized^a); 10 to 25 um (functional^a)
Channel: 3D vertical
a. Circuit Architecture is optimized for < 10 um bump pitches. 10 to 25 um are supported functionally.
b. See Chapter 6.0 for details about BER characteristics.
[Figure: UCIe components and functionality. The Protocol Layer sits above the Physical Layer, which comprises PHY Logic (Link Training, Lane Repair (when applicable), Lane Reversal (when applicable), Scrambling/De-scrambling, and Sideband Initialization and Transfers) and the Sideband/Electrical/AFE [k slices] with Clock Forwarding.]
• UCIe Management Transport protocol^a: This is an end-to-end, media-independent protocol for management communication on the UCIe Management Network within the UCIe Manageability Architecture.
For each protocol, different optimizations and associated Flit transfers are available for transfer over
UCIe. Chapter 2.0 and Chapter 3.0 cover the relevant details of different modes and Flit Formats.
a. UCIe Management Transport protocol can be encapsulated for transport over the UCIe sideband
or the UCIe mainband. Section 8.1 covers the details of this protocol. Section 8.2 covers the
details around encapsulation of this protocol over the UCIe sideband and the UCIe mainband.
The D2D Adapter is responsible for coordinating the higher-level Link state machine and bring-up, parameter exchanges with the remote Link partner for protocol options, and, when supported, power management coordination with the remote Link partner. Chapter 3.0 covers the relevant details for the D2D Adapter.
[Figure: Physical Layer functionality. PHY Logic: Link Training, Lane Repair (when applicable), Lane Reversal (when applicable), Scrambling/De-scrambling, and Sideband Initialization and Transfers. Sideband/Electrical/AFE [k slices]: analog front end with Clock Forwarding.]
The UCIe main data path on the physical bumps is organized as a group of Lanes called a Module. A
Module forms the atomic granularity for the structural design implementation of the UCIe AFE. The
number of Lanes per Module for Standard and Advanced Packages is specified in Chapter 4.0. A given
instance of Protocol Layer or D2D adapter can send data over multiple modules where bandwidth
scaling is required.
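As an illustration of this scaling arithmetic, the sketch below computes raw per-direction bandwidth from the module widths and data rates given in this chapter; the helper function and its name are illustrative only, not part of the specification.

```python
# Illustrative sketch: raw per-direction bandwidth of a UCIe module.
# Lane counts (x64/x32 Advanced, x16/x8 Standard) and data rates are
# from this chapter; the helper itself is not part of the specification.

def module_bandwidth_gb_s(lanes: int, gt_s: float) -> float:
    """Raw one-direction bandwidth in GB/s: lanes x GT/s / 8 bits per byte."""
    return lanes * gt_s / 8.0

print(module_bandwidth_gb_s(64, 32))      # Advanced x64 @ 32 GT/s -> 256.0 GB/s
print(module_bandwidth_gb_s(16, 16))      # Standard x16 @ 16 GT/s -> 32.0 GB/s
print(4 * module_bandwidth_gb_s(16, 32))  # four-module scaling   -> 256.0 GB/s
```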
For the Standard Package option, N=16 (also referred to as x16) or N=8 (also referred to as x8), and no extra pins for repair are provided.
The Logical Physical Layer coordinates the different functions and their relative sequencing for proper
Link bring up and management (e.g., sideband transfers, mainband training and repair, etc.).
Chapter 4.0 and Chapter 5.0 cover the details on Physical Layer operation.
1.1.4 Interfaces
UCIe defines the interfaces between the Physical Layer and the D2D Adapter (Raw D2D Interface),
and the D2D Adapter and the Protocol Layer (Flit-aware D2D Interface) in Chapter 10.0. A reference
list of signals is also provided to cover the interactions and rules of the Management Transport
protocol between the SoC and the UCIe Stack.
[Figure: Single-module configurations. Advanced Package: a Die-to-Die Adapter over PHY Logic and Sideband/Electrical/AFE, with a x64 or x32 module (mainband data, Valid, Track, forwarded clock (FW-CLK), and sideband). Standard Package: the same stack with a x16 or x8 module.]
[Figure: Multi-module configurations. A single Die-to-Die Adapter with Multi-Module PHY Logic coordinating two or four PHY Logic instances, each with its own Sideband and Electrical/AFE, Valid, Track, and FW-CLK.]
[Figure: UCIe Retimer connectivity. On Package 0, UCIe Die 0 connects to UCIe Retimer 0; on Package 1, UCIe Retimer 1 connects to UCIe Die 1. Each Retimer implements a sideband (SB) interface and a Receiver Buffer, with the Off-Package Interconnect between the two Retimers.]
partner. For this scenario, protocols are permitted to use any of the applicable Flit Formats for
transport over the UCIe Link.
— The Retimer provides its own FEC by replacing the native PCIe or CXL defined FEC with its
own, or adding its FEC in addition to the native PCIe or CXL defined FEC, but takes advantage
of the built in CRC and Replay mechanisms of the underlying protocol. In this scenario, the
queue sizes (Protocol Layer buffers, Retry buffers) must be adjusted on the UCIe Dies to meet
the underlying round trip latency.
• Resolution of Link and Protocol Parameters with remote Retimer partner to ensure interoperability
between UCIe Dies end-to-end (E2E). For example, Retimers are permitted to force the same Link
width, speed, protocol (including any relevant protocol specific parameters) and Flit Formats on
both Package 0 and Package 1 in Figure 1-18. The specific mechanism of resolution including
message transfer for parameter exchanges across the Off Package Interconnect is
implementation specific for the Retimers and they must ensure a consistent operational mode
taking into account their own capabilities along with the UCIe Die capabilities on both Package 0
and Package 1. However, for robustness of the UCIe Links to avoid unnecessary timeouts in case
the external interconnect requires a longer time to Link up or resolution of parameters with
remote Retimer partner, UCIe Specification defines a “Stall” response to the relevant sideband
messages that can potentially get delayed. The Retimers must respond with the “Stall” response
within the rules of UCIe Specification to avoid such unnecessary timeouts while waiting for, or
negotiating with remote Retimer partner. It is the responsibility of the Retimer to ensure the UCIe
Link is not stalled indefinitely.
• Resolution of Link States for Adapter Link State Machine (LSM) or the RDI states with remote
Retimer partner to ensure correct E2E operation. See Chapter 3.0 for more details.
• Flow control and back-pressure:
— Data transmitted from a UCIe Die to a UCIe Retimer is flow-controlled using credits. These
credits are on top of any underlying protocol credit mechanism (such as PH, PD credits in
PCIe). These UCIe D2D credits must be for flow control across the two UCIe Retimers and any
data transmitted to the UCIe Retimer must eventually be consumed by the remote UCIe die
without any other dependency. Every UCIe Retimer must implement a Receiver Buffer for Flits
that it receives from the UCIe die within its package. The Receiver buffer credits are
advertised to the UCIe die during initial parameter exchanges for the D2D Adapter, and the
UCIe die must not send any data to the UCIe Retimer if it does not have a credit for it. One
credit corresponds to 256B of data (including any FEC, CRC, etc.). Credit returns are
overloaded on the Valid framing (see Section 4.1.2). Credit counters at the UCIe Die are re-
assigned to their initial advertised value whenever RDI states transition away from Active.
UCIe Retimer must drain or dump (as applicable) the data in its receiver buffer before re-
entering Active state. (A credit-counter sketch illustrating these rules follows this list.)
— Data transmitted from a UCIe Retimer to a UCIe die is not flow-controlled at the D2D adapter
level. The UCIe Retimer may have its independent flow-control with the other UCIe Retimer if
needed, which is beyond the scope of this specification.
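The following is a minimal sketch of the credit rules in the first bullet above, assuming a simple saturating integer counter; the class and method names are hypothetical, and the normative behavior (credit advertisement, Valid-framing returns, RDI state interactions) is defined by this specification and Section 4.1.2.

```python
# Minimal sketch of the UCIe Die -> UCIe Retimer credit rule described
# in the list above. Class and method names are illustrative only.

class RetimerTxCredits:
    def __init__(self, advertised: int):
        # Advertised by the Retimer during initial D2D parameter exchange;
        # 1 credit = 256 B of data, including any FEC, CRC, etc.
        self.advertised = advertised
        self.available = advertised

    def try_send(self, flit_bytes: int) -> bool:
        needed = -(-flit_bytes // 256)        # ceil(flit_bytes / 256)
        if self.available < needed:
            return False                      # must not send without a credit
        self.available -= needed
        return True

    def credit_return(self, n: int) -> None:
        # Credit returns are overloaded on the Valid framing (Section 4.1.2).
        self.available = min(self.available + n, self.advertised)

    def rdi_left_active(self) -> None:
        # Counters return to the initial advertised value whenever RDI
        # transitions away from Active; the Retimer drains/dumps its buffer.
        self.available = self.advertised

credits = RetimerTxCredits(advertised=8)      # 8 x 256 B receiver buffer
assert credits.try_send(256) and credits.available == 7
```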
Table 1-4. Key Performance Targets (fragment)
Die Edge Bandwidth Density^a (GB/s per mm), Advanced Package / Standard Package:
4 GT/s: 165 / 28
8 GT/s: 329 / 56
12 GT/s: 494 / 84
16 GT/s: 658 / 112
a. Die edge bandwidth density is defined as total I/O bandwidth in GB per sec per mm silicon die edge, with 45-um (Advanced Package) and 110-um (Standard Package) bump pitch. For a x32 Advanced Package module, the Die Edge Bandwidth Density is 50% of the corresponding value for x64.
b. Energy Efficiency (energy consumed per bit to traverse from FDI to bump and back to FDI) includes all the
Adapter and Physical Layer-related circuitry including, but not limited to, Tx, Rx, PLL, Clock Distribution, etc.
Channel reach and termination are discussed in Chapter 5.0.
c. Latency includes the latency of the Adapter and the Physical Layer (FDI to bump delay) on Tx and Rx. See
Chapter 5.0 for details of Physical Layer latency. Latency target is based on 16 GT/s. Latency at other data
rates may differ due to data rate-dependent aspects such as data accumulation and transfer time. Note that
the latency target does not include the accumulation of bits required for processing; either within or across
Flits.
Bandwidth Density^a (GB/s/mm2) (fragment): 4 GT/s: 4000
1.5 Interoperability
Package designers need to ensure that Dies that are connected on a package can inter-operate. This
includes compatible package interconnect (e.g., Advanced vs. Standard), protocols, voltage levels,
etc. It is strongly recommended that a Die adopts Transmitter voltage of less than 0.85 V so that the
Die can inter-operate with a wide range of process nodes in the foreseeable future.
This specification comprehends interoperability across a wide range of bump pitch for Advanced
Packaging options. It is expected that over time, the smaller bump pitches will be predominantly
used. With smaller bump pitch, we expect designs to reduce the maximum advertised frequency
(even though they could otherwise reach 32 GT/s) to optimize for area and to address the power delivery and thermal
constraints of high bandwidth with reduced area. Table 1-6 summarizes these bump pitches across
four groups. Interoperability is guaranteed within each group as well as across groups, based on
the PHY dimension specified in Chapter 5.0. The performance targets provided in Table 1-4 are
with the 45 um bump pitch, based on the technology widely deployed at the time of publication of
UCIe 1.0 and UCIe 1.1 Specifications (2022 – 2023).
Group 1 (25 - 30 um bump pitch): 4 GT/s minimum, 12 GT/s maximum
Group 2 (31 - 37 um bump pitch): 4 GT/s minimum, 16 GT/s maximum
Group 3 (38 - 44 um bump pitch): 4 GT/s minimum, 24 GT/s maximum
Group 4 (45 - 55 um bump pitch): 4 GT/s minimum, 32 GT/s maximum
§§
Universal Chiplet Interconnect Express (UCIe) maps PCIe and CXL, as well as any Streaming protocol.
Throughout the UCIe Specification, Protocol-related features are kept separate from Flit Formats and
packetization. This is because UCIe provides different transport mechanisms that are not necessarily
tied to protocol features (e.g., PCIe non-Flit mode packets are transported using CXL.io 68B Flit
Format). Protocol features include the definitions of Transaction Layer and higher layers, as well as
Link Layer features not related to Flit packing/Retry (e.g., Flow Control negotiations etc.).
The following terminology is used throughout this specification to identify Protocol-level features:
• PCIe Flit mode: To reference Flit mode-related Protocol features defined in PCIe Base Specification
• PCIe non-Flit mode: To reference non-Flit mode-related Protocol features defined in PCIe Base
Specification
• CXL 68B Flit mode: To reference 68B Flit mode-related Protocol features defined in CXL
Specification
• CXL 256B Flit mode: To reference 256B Flit mode-related Protocol features defined in CXL
Specification
The following protocol mappings are supported over the UCIe mainband:
• PCIe Flit mode
• CXL 68B Flit mode, CXL 256B Flit Mode: If CXL is negotiated, each of CXL.io, CXL.cache, and
CXL.mem protocols are negotiated independently.
• Streaming protocol: This offers generic modes for a user defined protocol to be transmitted using
UCIe.
• Management Transport protocol: This allows transport of manageability packets.
Note: RCD/RCH/eRCD/eRCH are not supported. PCIe non-Flit Mode is supported using CXL.io
68B Flit Format as the transport mechanism.
IMPLEMENTATION NOTE
Table 2-1 summarizes the mapping of the above rules from a specification version to a protocol
mode.
Native Specification Supported^a | PCIe Non-Flit Mode | CXL 68B Flit Mode | CXL 256B Flit Mode | PCIe Flit Mode
CXL 2.0 | Mandatory (for CXL.io) | Mandatory | N/A | N/A
CXL 3.0 | Mandatory (for CXL.io) | Mandatory | Mandatory | Mandatory (for CXL.io)
a. The same table applies to derivative version numbers for the specifications.
The Die-to-Die (D2D) Adapter negotiates the protocol with the remote Link partner and
communicates it to the Protocol Layer(s). For each protocol, UCIe supports multiple modes of
operation (that must be negotiated with the remote Link partner depending on the advertised
capabilities, Physical Layer Status as well as usage models). These modes have different Flit Formats
and are defined to enable different trade-offs around efficiency, bandwidth and interoperability. The
spectrum of supported protocols, advertised modes and Flit Formats must be determined at SoC
integration time or during the Die-specific reset bring up flow. The Die-to-Die Adapter uses this
information to negotiate the operational mode as a part of Link Training and informs the Protocol
Layer over the Flit-aware Die-to-Die Interface (FDI). See Section 3.2 for parameter exchange rules in
the Adapter.
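Purely as an illustration of this negotiation idea (the actual sideband parameter-exchange mechanism and resolution priorities are specified in Section 3.2), the sketch below intersects a local, preference-ordered capability list with the remote Link partner's advertised set; all names and structures are hypothetical.

```python
# Hypothetical illustration of operational-mode resolution in the D2D
# Adapter. The real mechanism (sideband parameter exchanges, priority
# rules) is defined in Section 3.2; this only shows the basic idea that
# a mode is usable only if both Link partners advertise it.

local_caps = [  # (protocol, Flit Format), most-preferred first
    ("CXL 256B Flit Mode", "Latency-Optimized 256B"),
    ("CXL 256B Flit Mode", "Standard 256B"),
    ("CXL 68B Flit Mode", "68B"),
]
remote_caps = {  # advertised by the remote Link partner
    ("CXL 256B Flit Mode", "Standard 256B"),
    ("CXL 68B Flit Mode", "68B"),
}

def resolve_mode(local, remote):
    for cap in local:          # walk local preference order
        if cap in remote:      # remote Link partner must also advertise it
            return cap
    return None                # no common mode: Link does not come up

print(resolve_mode(local_caps, remote_caps))
# -> ('CXL 256B Flit Mode', 'Standard 256B')
```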
The subsequent sections provide an overview of the different modes from the Protocol Layer’s
perspective, hence they cover the supported formats of operation as subsections per protocol. The
Protocol Layer is responsible for transmitting data over FDI in accordance with the negotiated mode
and Flit Format. The illustrations of the Flit Formats in this chapter show an example configuration of
a 64B data path in the Protocol Layer mapped to a 64-Lane module of Advanced Package
configuration on the Physical Link of UCIe. Certain Flit Formats have dedicated bit positions filled in by
the Adapter, and details associated with these are illustrated separately in Chapter 3.0. For other Link
widths, see the Byte to Lane mappings defined in Section 4.1.1. Figure 2-1 shows the legend for
color-coding convention used when showing bytes within a Flit in the Flit Format examples in the UCIe
Specification.
[Figure 2-1 legend entry: some bits populated by the Protocol Layer, some bits populated by the Adapter.]
2.1 PCIe
UCIe supports the Flit Mode defined in PCIe Base Specification. See PCIe Base Specification for the
protocol definition. UCIe supports the non-Flit Mode using the CXL.io 68B Flit Formats as the transport
mechanism. There are five UCIe operating formats supported for PCIe, and these are defined in the
subsections that follow.
The Latency-Optimized formats enable the Protocol Layer to consume the Flit at 128B boundary,
reducing the accumulation latency significantly. When this format is negotiated, the Protocol Layer
must follow this Flit Format for Flit transfer on FDI, driving 0 on the fields reserved for Die-to-Die
Adapter.
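To illustrate why 128B-boundary consumption reduces accumulation latency, here is a hypothetical receive-side sketch that assumes each 128B half of the Flit carries its own CRC in its final two bytes (see Section 3.3.4 for the actual layouts); the crc16() stand-in is not the CRC defined by this specification.

```python
import zlib

def crc16(data: bytes) -> int:
    # Stand-in check value (CRC-32 truncated to 16 bits); NOT the CRC
    # polynomial defined by the UCIe specification.
    return zlib.crc32(data) & 0xFFFF

def consume_latency_optimized_flit(flit: bytes, deliver) -> None:
    # Each 128B half carries its own CRC (layout per Section 3.3.4), so a
    # half can be handed off as soon as it checks, instead of waiting for
    # the full 256B Flit to accumulate.
    assert len(flit) == 256
    for half in (flit[:128], flit[128:]):
        payload, rx_crc = half[:-2], int.from_bytes(half[-2:], "little")
        if crc16(payload) != rx_crc:
            raise ValueError("CRC error: Flit is replayed via Adapter Retry")
        deliver(payload)  # consumed at the 128B boundary
```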
The Ack, Nak, PM, and Link Management DLLPs are not used over UCIe for CXL.io for any of the 256B
Flit Modes. The other DLLPs and Flit_Marker definitions follow the same rules as defined in CXL
Specification. Portions of the DLP bytes must be driven by the Protocol Layer for Flit_Marker
assignment; see Section 3.3.3 for details on how DLP bytes are driven.
For CXL.cachemem in this mode, FDI provides an lp_corrupt_crc signal to help optimize for
latency while guaranteeing Viral containment. See Chapter 10.0 for details of interface rules for Viral
containment.
For CXL.cachemem in this format, FDI provides an lp_corrupt_crc signal to help optimize for
latency while guaranteeing Viral containment. See Section 10.2 for details of interface rules for Viral
containment.
The Ack, Nak, and PM DLLPs are not used for CXL.io in this mode. Credit updates and other remaining
DLLPs for CXL.io are transmitted in the Flits as defined in CXL Specification. For CXL.io, the
Transmitter must not implement Retry in the Protocol Layer (because Retry is handled in the
Adapter). To keep the framing rules consistent, Protocol Layer for CXL.io must still drive the LCRC
bytes with a fixed value of 0, and the Receiver must ignore these bytes and never send any Ack or
Nak DLLPs. Framing tokens are applied as defined for CXL.io 68B Flit Mode operation. It is
recommended for the transmitter to drive the sequence number, DLLP CRC, Frame CRC and Frame
parity in STP to 0. The receiver must ignore these fields. Given that UCIe Adapter provides reliable Flit
transport, framing errors, if detected by the Protocol Layer are likely due to uncorrectable internal
errors and it is permitted to treat them as such.
For CXL.cachemem, the “Ak” field defined by CXL Specification in the Flit is reserved, and the Retry
Flits are not used (because Retry is handled in the Adapter). Link Initialization begins with sending the
INIT.Param Flit without waiting for any received Flits. Viral containment (if applicable) must be
handled within the Protocol Layer for the 68B Flit Mode. CXL Specification introduced Error Isolation
as a way to reduce the blast radius of downstream component fatal errors compared to CXL Viral
Handling and provide a scalable way to handle device failures across a network of switches shared
between multiple Hosts. Specifically, Viral relies on a complete host reset to recover whereas Error
Isolation may recover by resetting the virtual hierarchy below the root port. Because CXL-defined
Retry Flits (which carry the viral notification for 68B Flits in CXL) are not used in 68B Flit mode in
UCIe, it is recommended for implementations to rely on error isolation at the CXL Root Port for fatal
errors on CXL.cachemem downstream components in 68B Flit mode (similar to Downstream Port
Containment for CXL.io).
The Protocol Layer presents 64B per Flit on FDI, and the Die-to-Die Adapter inserts a 2B Flit Header
and 2B CRC and performs the byte shifting required to arrange the Flits in the format shown in
Figure 3-11. On the receive data path, the Adapter strips out the Flit Header and CRC bytes to only
present the 64B per Flit to the Protocol Layer on FDI.
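A simplified sketch of this framing follows: 2B Flit Header plus 64B payload plus 2B CRC equals 68B. The linear byte layout and the crc16() stand-in are illustrative only; the actual byte positions (including the byte shifting) follow Figure 3-11, and the CRC is defined elsewhere in this specification.

```python
import zlib

def crc16(data: bytes) -> int:
    # Stand-in check value; NOT the CRC defined by the UCIe specification.
    return zlib.crc32(data) & 0xFFFF

def adapter_tx_68b(payload_64b: bytes, flit_header: bytes) -> bytes:
    # Wrap a 64B Protocol Layer Flit with a 2B Flit Header and 2B CRC.
    assert len(payload_64b) == 64 and len(flit_header) == 2
    crc = crc16(flit_header + payload_64b).to_bytes(2, "little")
    return flit_header + payload_64b + crc        # 2 + 64 + 2 = 68 bytes

def adapter_rx_68b(flit_68b: bytes) -> bytes:
    # Strip the Flit Header and CRC so only the 64B payload reaches FDI.
    assert len(flit_68b) == 68
    header, payload, crc = flit_68b[:2], flit_68b[2:66], flit_68b[66:]
    if crc16(header + payload).to_bytes(2, "little") != crc:
        raise ValueError("CRC error: trigger Retry")
    return payload

assert adapter_rx_68b(adapter_tx_68b(bytes(64), b"\x00\x01")) == bytes(64)
```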
Format from the Adapter for Streaming protocols. See Section 3.3.3 for details of the Flit Format and
to see which of the reserved fields in the Flit Header are driven by the Protocol Layer. The Protocol
Layer presents 256B per Flit on FDI, driving 0b on the bits reserved for the Adapter. The Adapter fills
in the applicable Flit Header and CRC bytes. On the Rx datapath, the Adapter forwards the Flit
received from the Link as it is, and the Protocol Layer must ignore the bits reserved for the Adapter
(for example the CRC bits).
See Section 8.2.5.2.3 for details of mapping the Management Transport Packets (MTPs) over
Management Flits.
§§
Figure 3-1 shows a high level description of the functionality of the Adapter.
[Figure 3-1: The Die-to-Die Adapter between the Protocol Layer, which connects to it through the Flit-aware D2D interface (FDI), and the Physical Layer (PHY Logic plus Sideband/Electrical/AFE [k slices]), which connects to it through the Raw D2D interface (RDI).]
The Adapter interfaces to the Protocol Layer using one or more instances of the Flit-aware Die-to-Die
interface (FDI), and it interfaces to the Physical Layer using the raw Die-to-Die interface (RDI). See
Chapter 10.0 for interface details and operation.
The D2D Adapter must follow the same rules as the Protocol Layer for protocol interoperability
requirements. Figure 3-2 shows example configurations for the Protocol Layer and the Adapter, where
the Protocol identifiers (e.g., PCIe) only signify the protocol, and not the Flit Formats. To provide cost
and efficiency trade-offs, UCIe allows configurations in which two protocol stacks are multiplexed onto
the same physical Link.
When Multi_Protocol_Enable is supported and negotiated, the Adapter must guarantee that it will not
send consecutive flits from the same protocol stack on the Link. This applies in all cases including
when Flits are sourced from FDI, from Retry Buffer, and when the data stream is paused and
restarted. Adapter is permitted to insert NOP Flits to guarantee this (these Flits bypass the Tx Retry
buffer, and are not forwarded to the Protocol Layer on the receiver). When Flits are transmitted from
the Retry Buffer, it is required to insert NOP Flits as needed to avoid sending consecutive Flits from
the same Protocol stack. When Management Transport protocol is negotiated for mainband with
Multi_Protocol_Enable, the Management Flit carries the same stack identifier as the Protocol Layer it
is multiplexed with. From the Adapter perspective, for the purposes of throttling and interleaving, it is
treated the same as flits received from the corresponding Protocol Layer. Note that there is no fixed
pattern of Flits alternating from different Protocol Layers. For example, a Flit from Protocol Stack 0
followed by a NOP Flit, followed by a Flit from Protocol Stack 0 is a valid transmit pattern. A NOP Flit
is defined as a Flit where the protocol identifier in the Flit Header corresponds to the D2D Adapter,
and the body of the Flit is filled with all 0 data (the NOP Flit is defined for all Flit Formats supported by
the Adapter, except when it is operating in Raw Format). It is permitted for NOP flits to bypass
the Retry buffer, as long as the Adapter guarantees that it is not sending consecutive Flits for any of
the Protocol Layers. On the receiving side, the Adapter must not forward these NOP flits to the
Protocol Layer. The receiving Protocol Layer must be capable of receiving consecutive chunks of the
same Flit at the maximum Link speed, but it will not receive consecutive Flits. In addition to the
transfer rate, both protocol stacks must operate with the same protocol and Flit Formats.
Multi_Protocol_Enable and Raw Format are mutually exclusive. Each stack is given a single bit stack
identifier that is carried along with the Flit header for de-multiplexing of Flits on the Receiver. The
Stack Mux shown maintains independent Link state machines for each protocol stack. Link State
transition-related sideband messages have unique message codes to identify which stack’s Link State
Management is affected by that message.
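IMPLEMENTATION NOTE
The following non-normative C sketch illustrates the interleaving rule above: a per-Flit transmit selector that never issues consecutive Flits from the same protocol stack, inserting a NOP Flit when only one stack has data. All names (select_next_flit, stack_has_flit, STACK_NOP) are illustrative and not defined by this specification.

    enum { STACK0 = 0, STACK1 = 1, STACK_NOP = 2 };

    /* stack_has_flit[s] indicates stack s has a Flit ready (from FDI or
     * the Retry Buffer); last_stack identifies the previously sent Flit. */
    int select_next_flit(const int stack_has_flit[2], int last_stack)
    {
        for (int s = 0; s < 2; s++) {
            if (stack_has_flit[s] && s != last_stack)
                return s;        /* a different stack is always legal */
        }
        /* Either only the previously used stack has data (a NOP Flit must
         * separate its back-to-back Flits), or the Link is idle (NOP vs.
         * pausing the stream is an implementation choice). */
        return STACK_NOP;
    }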
IMPLEMENTATION NOTE
The primary motivation for enabling the Multi_Protocol_Enable parameter is to allow
implementations to take advantage of the higher bandwidth provided by the UCIe
Link for lower-bandwidth individual Protocol Layers, without the need to make a lot of
changes to the UCIe Link. For example, two Protocol Layers that support the
maximum bandwidth for CXL 68B Flit Mode (i.e., the equivalent of 32 GT/s CXL
SERDES bandwidth) can be multiplexed over a UCIe Link that supports their
aggregate bandwidth.
When Enhanced Multi_Protocol_Enable and Management Transport protocol are negotiated, each stack can have
different protocols with or without MPG mux. For example, in Figure 8-27, the Enhanced
Multi_Protocol_Enable parameter must be negotiated for configs e, f, and h. The parameter is also
negotiated for configs b and d if the two stacks have different protocol pairs. Both protocol stacks and
the Adapter must support a common Flit Format for this feature to be enabled. “Enhanced
Multi_Protocol_Enable” and Raw Format are mutually exclusive. The Adapter must advertise the
maximum percentage of bandwidth that the receiver for each Protocol Layer can accept. The Adapter
transmitter must support 100% (no throttling) and throttling one or both Protocol Layer(s) to 50% of
maximum bandwidth. When 50% of the maximum bandwidth is advertised for a stack by an Adapter,
the remote Link partner must guarantee that it will not send consecutive Flits for the same stack on
the Link. This applies in all cases including when Flits are sourced from FDI, from Retry Buffer, and
when the data stream is paused and restarted. The Adapter is permitted to insert NOP Flits to guarantee
this (these Flits bypass the Tx Retry buffer, and are not forwarded to the Protocol Layer on the
receiver). When Flits are transmitted from the Retry Buffer, it is required to insert NOP Flits as needed
to avoid exceeding the negotiated maximum bandwidth. The receiving Protocol Layer must be capable
of sinking Flits at the advertised maximum bandwidth percentage; in addition, the Protocol Layer must be
able to receive consecutive chunks of the same Flit at the maximum advertised Link speed. When this
capability is supported, the Adapter must be capable of allowing each Protocol Layer to independently
utilize 100% of the Link bandwidth. Furthermore, the arbitration is per Flit, and the Adapter must
support round robin arbitration between the Protocol Layers if both of them are permitted to use
100% of the Link bandwidth. Additional implementation specific arbitration schemes are permitted as
long as they are per Flit and do not violate the maximum bandwidth percentage advertised by the
remote Adapter for a given stack. The Flit header has a single bit stack identifier to identify the
destination stack for the flit. The Stack Mux maintains independent Link state machines for each
protocol stack. Link State transition-related sideband messages have unique message codes to
identify which stack’s Link State Management is affected by that message.
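IMPLEMENTATION NOTE
As a minimal illustration of the bandwidth-limit rule above, the following non-normative C sketch checks whether a stack may transmit the next Flit when the remote Link partner has advertised a 50% limit for it. The function and variable names are illustrative only.

    /* limit50[s] reflects the remote "Stack s Maximum Bandwidth_Limit"
     * advertisement; last_stack is the stack of the previous Flit. */
    int tx_legal(int stack, int last_stack, const int limit50[2])
    {
        if (limit50[stack] && stack == last_stack)
            return 0;   /* back-to-back Flits would exceed 50%:
                           a NOP Flit must be inserted instead */
        return 1;       /* 100% permitted, or a different stack than last */
    }

When both stacks are permitted 100% of the Link bandwidth, the arbiter wrapped around this check would alternate between them per Flit (round robin), as required above.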
[Figure 3-2: Example configurations — CXL.io and CXL.cachemem Protocol Layers over Arb/Mux into Die-to-Die Adapters (FDI above, RDI below); Protocol Layer A and Protocol Layer B over a Stack Mux into a single Die-to-Die Adapter; panel (d): Two Stacks Multiplexed with Enhanced Multi_Protocol_Enable Negotiated.]
[Figure: Link Initialization flow between Die 0 and Die 1 (Protocol Layer, D2D Adapter, and Physical Layer on each side of the channel), beginning with sideband initialization in Stage 1; time progresses downward.]
Link partner during Parameter Exchanges. For UCIe Retimers, the Adapter must also determine the
credits to be advertised for the Retimer Receiver Buffer. Each credit corresponds to 256B of Mainband
data storage.
Table 3-1. Capabilities that Must Be Negotiated between Link Partners

"Raw Format": This parameter is advertised if the corresponding bit in the UCIe Link Control register is 1b. Software/Firmware enables this based on system usage scenario. If the PCIe or CXL protocols are not supported, and Streaming protocol is to be negotiated without any vendor-specific extensions and without Streaming Flit Format capability support, "Raw Format" must be 1b and advertised. If Streaming Flit Format capability or Enhanced Multi-Protocol capability is supported, then this must be advertised as 1b only if Raw Format is the intended format of operation. Software/firmware-based control on setting the corresponding UCIe Link Control is permitted to enable this.

"68B Flit Mode": This is a protocol parameter. This must be advertised if the Adapter and Protocol Layer support CXL 68B Flit mode (mandatory for CXL) or PCIe Non-Flit mode (mandatory for PCIe). If PCIe Non-Flit mode is the final negotiated protocol, it will use the CXL.io 68B Flit mode formats as defined in CXL Specification. This is an advertised Protocol for Stack 0 if "Enhanced Multi_Protocol_Enable" is supported and enabled.

"CXL 256B Flit Mode": This is a protocol parameter. This must be advertised if the Adapter and Protocol Layer support CXL 256B Flit mode. This is an advertised Protocol for Stack 0 if Enhanced Multi-Protocol capability is supported and enabled.

"PCIe Flit Mode": This is a protocol parameter. This must be advertised if the Adapter and Protocol Layer support PCIe Flit mode. This is an advertised Protocol for Stack 0 if Enhanced Multi-Protocol capability is supported and enabled.

"Streaming": This is a protocol parameter. This must be advertised if the Adapter and Protocol Layer support Streaming protocol in Raw Format or Streaming Flit Format capability is supported and the corresponding capabilities are enabled. This is an advertised Protocol for Stack 0 if Enhanced Multi-Protocol capability is supported and enabled. PCIe or CXL protocol must not be advertised if Streaming is advertised for a given Protocol Layer.

"Retry": This must be advertised if the Adapter supports Retry. With the exception of the Link operating in Raw Format, the Link cannot be operational if the Adapter has determined Retry is needed, but "Retry" is not advertised or negotiated. See also Section 3.8.

"Multi_Protocol_Enable": This must only be advertised if the Adapter is connected to multiple FDI instances corresponding to two sets of Protocol Layers. It must only be advertised if the Adapter (or SoC firmware in Stage 0 of Link Initialization) has determined that the UCIe Link must be operated in this mode. Both "Stack0_Enable" and "Stack1_Enable" must be 1b if this bit is advertised.

"Stack0_Enable": This must be advertised if the Protocol Layer corresponding to Stack 0 exists and is enabled for operation with support for the advertised protocols.

"Stack1_Enable": This must be advertised if the Protocol Layer corresponding to Stack 1 exists and is enabled for operation with support for the advertised protocols.

"CXL_LatOpt_Fmt5": This must be advertised if the Adapter and Protocol Layer support Format 5 defined in Section 3.3.4. The Protocol Layer does not take advantage of the spare bytes in this Flit Format. This must not be advertised if CXL protocol and CXL 256B Flit mode are not supported or enabled.

"CXL_LatOpt_Fmt6": This must be advertised if the Adapter and Protocol Layer support Format 6 defined in Section 3.3.4. The Protocol Layer takes advantage of the spare bytes in this Flit Format. This must not be advertised if CXL protocol and CXL 256B Flit mode are not supported or enabled.

"Retimer": This must be advertised if the Adapter of a UCIe Retimer is performing Parameter Exchanges with a UCIe Die within its package.

"Retimer_Credits": This is a 9-bit value advertising the total credits available for Retimer's Receiver Buffer. Each credit corresponds to 256B data. This parameter is applicable only when "Retimer" is 1b.

"DP": This is set by Downstream Ports to inform the remote Link partner that it is a Downstream Port. It is useful for Retimers to identify whether they are connected to a Downstream Port UCIe die. It is currently only applicable for PCIe and CXL protocols; however, Streaming protocols are not precluded from utilizing this bit. If Enhanced Multi-Protocol capability is supported, this bit is applicable if either of the Protocol Layers is PCIe or CXL. This bit must be set to 0b if "Retimer" is set to 1b.

"UP": This is set by Upstream Ports to inform the remote Link partner that it is an Upstream Port. It is useful for Retimers to identify whether they are connected to an Upstream Port UCIe die. It is currently only applicable for PCIe and CXL protocols; however, Streaming protocols are not precluded from utilizing this bit. If Enhanced Multi-Protocol capability is supported, this bit is applicable if either of the Protocol Layers is PCIe or CXL. This bit must be set to 0b if "Retimer" is set to 1b.

"Enhanced Multi_Protocol_Enable": This must only be advertised if the Adapter is connected to multiple FDI instances corresponding to two sets of Protocol Layers. The two sets of Protocol Layers are permitted to be different protocols, but must support at least one common Flit Format. This must only be advertised if the Enhanced Multi-Protocol capability is supported and enabled; otherwise, it must be set to 0b. Both "Stack0_Enable" and "Stack1_Enable" must be 1b if this bit is advertised.

"Stack 0 Maximum Bandwidth_Limit": This must be advertised if Enhanced Multi_Protocol_Enable is advertised and the Stack 0 protocol Receiver is limited to 50% of the maximum bandwidth; otherwise, it must be set to 0b.

"Stack 1 Maximum Bandwidth_Limit": This must be advertised if Enhanced Multi_Protocol_Enable is advertised and the Stack 1 protocol Receiver is limited to 50% of the maximum bandwidth; otherwise, it must be set to 0b.

"Management Transport Protocol": This bit must be set to 1 if the Protocol Layer and Adapter both support Management Transport protocol (either as the only protocol or multiplexed with one of CXL.io, PCIe, or Streaming). The mechanism by which this bit is set to 1 is implementation-specific.
Once local capabilities are established, the Adapter sends the {AdvCap.Adapter} sideband message
advertising its capabilities to the remote Link partner.
If PCIe or CXL protocol support is going to be advertised, the Upstream Port (UP) Adapter must wait
for the first {AdvCap.Adapter} message from the Downstream Port (DP) Adapter, review the
capabilities advertised by DP and then send its own sideband message of advertised capabilities. UP is
permitted to change its advertised capabilities based on DP capabilities. Once the DP receives the
capability advertisement message from the UP, the DP responds with the Finalized Configuration using
{FinCap.Adapter} sideband message to the UP as shown in Figure 3-4. See Section 7.1.2.3 to see the
message format for the relevant sideband messages.
If “68B Flit Mode” or “CXL 256B Flit Mode” is set in the {FinCap.Adapter} message, there must be
another handshake of Parameter Exchanges using the {AdvCap.CXL} and the {FinCap.CXL}
messages to determine the details associated with this mode. Note that because CXL 68B Flit Mode
protocol is mandatory for CXL, and because PCIe Non-Flit Mode protocol is mandatory for PCIe, the
“68B Flit Mode” parameter is always set to 1 for CXL or PCIe protocols. This additional handshake is
shown in Figure 3-4. The combination of {FinCap.CXL} and {FinCap.Adapter} determine the Protocol
and Flit Format. See Section 7.1.2.3 for the message format of the relevant sideband messages. See
Section 3.4 for how Protocol and Flit Formats are determined.
Figure 3-4. Parameter Exchange for CXL or PCIe (i.e., “68B Flit Mode”
or “CXL 256B Flit Mode” is 1 in {FinCap.Adapter})
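IMPLEMENTATION NOTE
The following non-normative C sketch captures only the ordering of the DP-side exchange described above (DP advertises first, UP reviews and advertises, DP finalizes). The caps_t type, the resolution rule, and the helper callbacks are all illustrative assumptions, not definitions from this specification.

    #include <stdint.h>

    typedef uint32_t caps_t;

    /* Illustrative resolution: a finalized capability must be supported
     * by both sides (actual resolution rules are per Section 3.4). */
    static caps_t resolve(caps_t dp, caps_t up) { return dp & up; }

    caps_t dp_param_exchange(caps_t dp_caps,
                             caps_t (*recv_advcap)(void),
                             void (*send_msg)(const char *name, caps_t v))
    {
        send_msg("{AdvCap.Adapter}", dp_caps);   /* DP advertises first  */
        caps_t up_caps = recv_advcap();          /* UP replies after
                                                    reviewing DP's caps  */
        caps_t fin = resolve(dp_caps, up_caps);
        send_msg("{FinCap.Adapter}", fin);       /* DP sends Finalized
                                                    Configuration        */
        return fin;
    }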
If Streaming protocol is negotiated, there is no notion of DP and UP for parameter exchanges and
each side independently advertises its capabilities. Additional Vendor Defined sideband messages are
permitted to be exchanged to negotiate vendor-specific extensions. See Table 7-8 and Table 7-10 for
additional descriptions of Vendor Defined sideband messages. Similarly, if Management Transport
protocol is negotiated on a stack without “Streaming protocol,” “CXL 256B Flit mode,” or “PCIe Flit
mode,” there is no notion of DP and UP for parameter exchanges and each side independently
advertises its capabilities.
{FinCap.*} messages are not sent for Streaming protocol. The Adapter must determine vendor-specific requirements in an implementation-specific manner.
If “68B Flit Mode” or “CXL 256B Flit Mode” is set in the {MultiProtFinCap.Adapter} message, there
must be another handshake of Parameter Exchanges using the {AdvCap.CXL} and the {FinCap.CXL}
messages to determine the details associated with this mode. The non-Stall {*.CXL} messages are
sent with a MsgInfo encoding of 0001h indicating that these messages are for Stack 1 negotiation.
Figure 3-5 to Figure 3-9 represent examples of different scenarios where Stack 0 and Stack 1 are of
different protocols.
The Adapter must implement a timeout of 8 ms (-0%/+50%) for successful Parameter Exchange
completion. For the purposes of measuring a timeout for Parameter Exchange completion, all steps in
Part 1 and Part 2 of Stage 3 of Link Initialization are included. The timer only increments while RDI is
in Active state. The timer must reset if the Adapter receives an {AdvCap.*.Stall}, {FinCap.*.Stall},
{MultiProtAdvCap.*.Stall}, or {MultiProtFinCap.*.Stall} message from the remote Link partner. The
8-ms timeouts for Parameter Exchanges or Link State Machine transitions are treated as UIE and the
Adapter must take the RDI to LinkError state. UCIe Retimers must ensure that they resolve the
capability advertisement with remote Retimer partner (and merge with their own capabilities) before
responding/initiating parameter exchanges with the UCIe die within its package. While resolution is in
progress, they must send the corresponding stall message once every 4 ms to ensure that a timeout
does not occur on the UCIe die within its package.
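IMPLEMENTATION NOTE
A minimal non-normative sketch of the timeout rule above is shown below: the timer advances only while RDI is Active, resets on any received Stall message, and treats expiry as UIE (the Adapter then takes RDI to LinkError). The tick granularity and structure are illustrative.

    #include <stdbool.h>

    typedef struct { unsigned elapsed_us; } px_timer_t;

    void px_timer_tick(px_timer_t *t, bool rdi_active, bool stall_received,
                       unsigned tick_us, bool *take_rdi_to_linkerror)
    {
        if (stall_received) { t->elapsed_us = 0; return; } /* Stall resets */
        if (!rdi_active)
            return;                   /* timer increments only in Active  */
        t->elapsed_us += tick_us;
        /* 8 ms with -0%/+50% tolerance: must not expire before 8 ms and
         * must expire by 12 ms. */
        if (t->elapsed_us >= 8000)
            *take_rdi_to_linkerror = true;   /* treated as UIE */
    }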
The data width on FDI is a function of the frequency of operation of the UCIe stack as well as the total
bandwidth being transferred across the UCIe physical Link (which in turn depends on the number of
Lanes and the speed at which the Lanes are operating). The data width on RDI is fixed to at least one
byte per physical Lane per module that is controlled by the Adapter. The illustrations of the formats in this chapter show an example configuration of RDI mapped to a 64-Lane module of the Advanced Package configuration on the Physical Layer of UCIe.
The Protocol Layer sends 64B of protocol information. The Adapter adds a two byte prefix of Flit
Header and a two byte suffix of CRC. Table 3-3 gives the Flit Header format for Format 2 when Retry
from the Adapter is required. If Retry from the Adapter is not required, then the Flit Header format is
as provided in Table 3-2.
Even if Retry is not required, the Adapter still computes and drives CRC bytes — the Receiver is
strongly recommended to treat a CRC error as an Uncorrectable Internal Error in this situation. For
CRC computation, Flit Byte 0 (i.e., Flit Header Byte 0) is assigned to CRC message Byte 0, Flit Byte 1
(i.e., Flit Header Byte 1) is assigned to CRC message Byte 1 and so on until Flit Byte 65 is assigned to
CRC message Byte 65.
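IMPLEMENTATION NOTE
The following non-normative C sketch assembles a Format 2 (68B) Flit per the description above: a 2B Flit Header prefix, the 64B from the Protocol Layer, and a 2B CRC suffix computed over Flit Bytes 0 to 65 zero-extended to a 128B message. crc16_128B() stands in for the CRC defined by crc_gen.vs (see Section 3.7); the placement of the two CRC bytes is an assumption for illustration.

    #include <stdint.h>
    #include <string.h>

    extern uint16_t crc16_128B(const uint8_t msg[128]); /* per crc_gen.vs */

    void pack_68B_flit(const uint8_t hdr[2], const uint8_t payload[64],
                       uint8_t out[68])
    {
        uint8_t msg[128] = {0};          /* untransmitted bytes are 0b */
        out[0] = hdr[0];                 /* Flit Header Byte 0         */
        out[1] = hdr[1];                 /* Flit Header Byte 1         */
        memcpy(&out[2], payload, 64);    /* 64B presented on FDI       */
        memcpy(msg, out, 66);            /* Flit Bytes 0..65 map to CRC
                                            message Bytes 0..65        */
        uint16_t crc = crc16_128B(msg);
        out[66] = (uint8_t)(crc & 0xFF); /* CRC Byte 0 (assumed order) */
        out[67] = (uint8_t)(crc >> 8);   /* CRC Byte 1 (assumed order) */
    }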
Table 3-2. Flit Header for Format 2 without Retry (columns: Byte, Bit, Description for PCIe or CXL and for Streaming Protocol)
  Bits [3:0]: Reserved
  Bits [6:0]: Reserved
  a. For a Test Flit, bits [7:6] of Byte 1 are 01b. See Section 11.2 for more details.

Table 3-3. Flit Header for Format 2 with Retry (columns: Byte, Bit, Description for PCIe or CXL and for Streaming Protocol)
  Bits [3:0]: The upper four bits of Sequence number "S" (i.e., S[7:4])
  a. For a Test Flit, bits [7:6] of Byte 1 are 01b. See Section 11.2 for more details.
The Adapter may optionally insert continuous NOPs instead of terminating the data stream with a PDS
when no other flits are available to transmit. There is a trade-off between the longer idle latency for a
new flit to be transmitted after a PDS vs. the power consumption of continuously transmitting NOP
flits. It is the responsibility of the transmitting Adapter to make the determination between
transmitting NOP flits vs. inserting a PDS in an implementation-specific manner.
If Retry is enabled, the Receiver must interpret this Flit header as PDS if any two of the above four
conditions are true. If Retry is disabled, the Receiver must interpret this Flit header as a PDS if
conditions (1) and (2) are true.
A PDS must be inserted when Retry is triggered or RDI state goes through Retrain. The transmitter
must insert PDS Flit Header and corresponding padding of 0s as it would for an actual PDS and start
the replayed Flit from fresh alignment (i.e., flit begins from a 256B-aligned boundary). Note that for
Retry, this should occur before the Transmitter begins replaying the Flits from the Retry buffer; and
for Retrain entry, this should occur before asserting lp_stallack to the Physical Layer.
For Retry and Retrain scenarios, the Receiver must also look for the expected sequence number in
Byte 0 and Byte 1 of the received data bus with a corresponding valid Flit (i.e., CRC passes). Note that
for a Retrain scenario, a PDS might not be received at the receiver before the RDI state changes to
Retrain, and the Adapter must discard any partially received 68B Flits after state change.
When resuming the data stream after a PDS token (i.e., a PDS Flit Header and the corresponding
padding of 0s), the first Flit is always 256B aligned; any valid Flit transfer after a PDS token will
resume the data stream. After a PDS Flit Header has been transmitted, the corresponding padding of
0b to satisfy the PDS token padding requirements must be finished before resuming the data stream
with new Flits.
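IMPLEMENTATION NOTE
The padding requirement above reduces to simple modular arithmetic; a non-normative sketch follows. The function name and its argument are illustrative.

    /* Number of 0 bytes to pad after a PDS Flit Header so that the data
     * transfer completes at a 256B-aligned boundary; the next Flit then
     * starts a fresh 256B-aligned transfer. */
    unsigned pds_pad_bytes(unsigned bytes_sent_in_block)
    {
        /* bytes_sent_in_block counts bytes already sent within the current
         * 256B-aligned block, including the 2B PDS Flit Header. */
        return (256u - (bytes_sent_in_block % 256u)) % 256u;
    }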
When Retry is enabled, the BER is 1E-15 or lower, which makes the probability of
two or more bit errors within the Flit Header very low. However, implementations
must consider the following two scenarios:
• PDS Flit Header aliasing to a regular Flit Header: Checking for two out of the
four conditions guarantees that at least three bit errors must occur within the two
bytes of the PDS Flit Header for it to alias to a regular Flit Header. Even for three
bit errors, there will be a CRC error, which will result in a retry and will be handled
seamlessly through the retry rules.
• Regular Flit Header aliasing to a PDS Flit Header: It is possible for two bit
errors to cause a Regular Flit Header to alias to a PDS Flit Header. This will likely
result in a CRC error for future Flits. However, to reduce the probability of a data
corruption that escapes CRC even further, it is strongly recommended that if a
PDS Flit Header was detected without all four conditions being satisfied (i.e., two
out of four or three out of four were satisfied), the receiver checks for an explicit
sequence number Flit with the expected sequence number in Byte 0 and Byte 1 of
the first received data transfer and that it is a valid Flit (i.e., CRC passes) after
the PDS (including the PDS token and the corresponding padding) have
completed; and triggers a Retry if it does not pass the check. Note that this is the
same check a Receiver performs after a Retry or Retrain.
Figure 3-11 shows the 68B Flit Format. Figure 3-12 and Figure 3-13 provide examples of PDS
insertion.
[Figure 3-11: Format 2 68B Flit layout — each 64B from the Protocol Layer is carried with a 2B Flit Header prefix (e.g., F1H B0/B1) and a 2B CRC suffix (e.g., C2 B0/B1); Flits are byte-shifted across 64B transfer rows, so a row can carry, for example, the tail of Flit 1, 6B of Flit 2, and 54B of Flit 3 (the next Flit) starting at Byte 128.]
[Figure 3-12: Format 2 68B Flit PDS example — the Flit Header (FH B0/B1) and 62B from the Protocol Layer occupy Byte 0 onward; the remaining 2B from the Protocol Layer, the 2B CRC, and a 2B PDS Flit Header follow, with 58B of all-0 data completing the 256B-aligned transfer at Byte 64.]
Figure 3-13. Format 2: 68B Flit Format PDS Example 2 — Extra 0s Padded
to Make the Data Transfer a Multiple of 256Ba
[Figure content: the tails of Flit 1 (F1H B0/B1, F1 B62) and Flit 2 (F2 B58 through B63) complete with Flit 2's CRC (C2 B0/B1), followed by a PDS Flit Header (PDS B0/B1) and extra 0s padded so that the data transfer is a multiple of 256B.]
The Protocol Layer sends data in 256B Flits, but it drives 0 on the bytes reserved for the Adapter
(shown in light orange in Figure 3-14 through Figure 3-19). The 6B of DLP defined in PCIe Base
Specification exist in Format 3 and Format 4 as well for PCIe and CXL.io protocols. However, since
DLLPs are required to bypass the Tx Retry buffer in PCIe and CXL.io protocols, the DLP bytes end up
being unique since they are partially filled by the Protocol Layer and partially by the Adapter. DLP0
and DLP1 are replaced with the Flit Header for UCIe and are driven by UCIe Adapter. However, if the
Flit carries a Flit Marker, the Protocol Layer must populate bit 4 of Flit Header Byte 0 to 1b, as well as
the relevant information in the Flit_Marker bits (these are driven as defined in PCIe Base
Specification). Protocol Layer must also populate the Protocol Identifier bits in the Flit Header for the
Flits it generates.
For Streaming protocols, Figure 3-17 shows the applicable Flit Format. Protocol Layer only populates
bits [7:6] of Byte 0 of the Flit Header, and it must never set 00b for bits [7:6].
Standard 256B Start Header Flit Format is optional for PCIe Flit Mode protocol. Figure 3-18 shows the
Flit Format example.
FDI provides a separate interface for DLLP transfer from the Protocol Layer to the Adapter and vice-
versa. The Adapter is responsible for inserting DLLP into DLP Bytes 2:5 if a Flit Marker is not present.
The credit update information is transferred as regular Update_FC DLLPs over FDI from the Protocol
Layer to the Adapter. The Adapter is also responsible for formatting these updates as
Optimized_Update_FC format when possible and driving them on the relevant DLP bytes. The Adapter
is also responsible for adhering to all the DLLP rules defined for Flit Mode in PCIe Base Specification.
On the receive path, the Adapter is responsible for extracting the DLLPs or Optimized_Update_FC
from the Flit and driving it on the dedicated DLLP interface provided on FDI.
Two sets of CRC are computed (CRC0 and CRC1). The same 2B over 128B CRC computation as
previous formats is used.
If Retry is not required, the Adapter still computes and drives CRC bytes — the Receiver is strongly
recommended to treat a CRC error as an Uncorrectable Internal Error (UIE) in this situation.
The Flit Header byte formats are shown in Table 3-5 when Retry is required; otherwise, it is as shown
in Table 3-4.
The Protocol Layer must drive bits [7:6] in Byte 1 of Flit Header to 00b for CXL/PCIe/Streaming
protocol Flits and to 10b for Management Flits (when successfully negotiated).
For Management Flits, Bytes 238 to 241 are driven from the Protocol Layer with Management
Transport Credit Return DWORD (CRD) Bytes 0 to 3 (see Section 8.2.5.2.2 for CRD format). Bytes
232 to 235 in Format 3 and Bytes 234 to 237 in Format 4 are driven from the Protocol Layer with 0s
for Management Flits. See Figure 3-16 and Figure 3-19 for details of Format 3 and Format 4 for
Management Flits, respectively.
If PCIe/CXL.io is negotiated along with Management Transport protocol on the same stack:
• If bits [7:6] of Byte 1 are 10b, the Adapter passes through Bytes 238 to 241 from the Protocol
Layer to the Link
• If bits [7:6] of Byte 1 are 00b, Bytes 238 to 241 are treated per PCIe/CXL.io DLP rules for this flit
format
Figure 3-14. Format 3: Standard 256B End Header Flit Format for PCIea
[Figure content: Flit Header Bytes FH B0/B1 and DLP Bytes B2 through B5 sit with the CRC0 and CRC1 Bytes at the end of the Flit; Flit Chunk 3 carries 44B from the Protocol Layer starting at Byte 192, followed by 10B Reserved.]
Figure 3-15. Format 3: Standard 256B End Header Flit Format for Streaming Protocola
[Figure content: as the PCIe layout, but the DLP Bytes are replaced by 4B from the Protocol Layer; Flit Chunk 3 carries 44B from the Protocol Layer starting at Byte 192, with CRC0/CRC1 Bytes and 10B Reserved.]
[Figure 3-16: Format 3 Standard 256B End Header Flit Format for Management Flits — as Figure 3-15, with a 4B CRD field from the Protocol Layer and 4B Rsvd; Flit Chunk 3 carries 40B from the Protocol Layer starting at Byte 192, with CRC0/CRC1 Bytes and 10B Reserved.]
[Figure 3-17: Format 4 Standard 256B Start Header Flit Format for Streaming Protocol — Flit Header Bytes FH B0/B1 at Bytes 0 and 1 followed by 62B of Flit Chunk 0; CRC0/CRC1 Bytes and 10B Reserved follow Flit Chunk 3 (50B from the Protocol Layer starting at Byte 192).]
Figure 3-18. Format 4: Standard 256B Start Header Flit Format for CXL.io or PCIea
[Figure content: Flit Header Bytes at the start of the Flit, with DLP Bytes B2 through B5, CRC0/CRC1 Bytes, and 10B Reserved after Flit Chunk 3 (46B from the Protocol Layer starting at Byte 192).]
a. See Figure 2-1 for color mapping.
b. Flit Header Byte 0 and Byte 1, respectively.
c. DLP Byte 2, Byte 3, Byte 4, and Byte 5, respectively.
d. CRC0 Byte 0, CRC0 Byte 1, CRC1 Byte 0, and CRC1 Byte 1, respectively.
[Figure 3-19: Format 4 Standard 256B Start Header Flit Format for Management Flits — Flit Header Bytes at the start, with a 4B CRD field from the Protocol Layer and 4B Rsvd; Flit Chunk 3 carries 42B from the Protocol Layer starting at Byte 192, with CRC0/CRC1 Bytes and 10B Reserved.]
Table 3-4. Flit Header for Format 3, Format 4, Format 5, and Format 6 without Retry
  Byte 0, bits [3:0]: Reserved
  Byte 1, bits [7:6]: Flit Type —
    00b: CXL/PCIe/Streaming Flit/D2D Adapter NOP Flit
    01b: Test Flit (see Section 11.2 for details)
    10b: Management Flit
    11b: Reserved
  Byte 1, bits [5:0]: Reserved

Table 3-5. Flit Header for Format 3, Format 4, Format 5, and Format 6 with Retry
  Byte 0, bits [3:0]: The upper four bits of Sequence number "S" (i.e., S[7:4])
  Byte 1, bits [7:6]: Flit Type —
    00b: CXL/PCIe/Streaming Flit/D2D Adapter NOP Flit
    01b: Test Flit (see Section 11.2 for details)
    10b: Management Flit
    11b: Reserved
Both formats look the same from the Adapter perspective; the only difference is whether the Protocol Layer fills in the optional bytes of protocol information. The Latency-Optimized 256B without Optional Bytes Flit Format (or Format 5) is used when the Protocol Layer does not fill in the optional bytes, whereas the Latency-Optimized 256B with Optional Bytes Flit Format (or Format 6) is used when the Protocol Layer fills in the optional bytes.
Latency-Optimized 256B Flit Formats (with Optional bytes or without Optional bytes) support is
optional with Streaming protocols. Protocol Layer only populates bits [7:6] of the Flit Header, and it
must never set 00b for bits [7:6].
Latency-Optimized Flit with Optional Bytes Flit Format is optional for PCIe Flit Mode protocol.
Figure 3-23 shows the Flit Format example.
Two sets of CRC are computed. CRC0 is computed using Flit Bytes 0 to 125 assigned to the
corresponding bytes of the CRC message input (including the Flit Header bits and if applicable, the
DLP bits inserted by the Adapter). CRC1 is computed using Flit Bytes 128 to 253 as the message input
with Flit Byte 128 assigned to CRC message Byte 0, Flit Byte 129 assigned to CRC message Byte 1
and so on until Flit Byte 253 assigned to CRC message Byte 125. If Retry is not required, the Adapter
still computes and drives CRC bytes — the Receiver is strongly recommended to treat a CRC error as
UIE in this situation.
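IMPLEMENTATION NOTE
The two CRC computations above can be sketched as follows (non-normative C): each half of the Flit is zero-extended to a 128B CRC message, with crc16_128B() again standing in for the crc_gen.vs reference implementation.

    #include <stdint.h>
    #include <string.h>

    extern uint16_t crc16_128B(const uint8_t msg[128]); /* per crc_gen.vs */

    void latopt_crc(const uint8_t flit[256], uint16_t *crc0, uint16_t *crc1)
    {
        uint8_t msg[128] = {0};
        memcpy(msg, &flit[0], 126);    /* Flit Bytes 0..125 -> msg 0..125;
                                          msg Bytes 126..127 stay 0       */
        *crc0 = crc16_128B(msg);
        memset(msg, 0, sizeof msg);
        memcpy(msg, &flit[128], 126);  /* Flit Bytes 128..253 -> msg 0..125 */
        *crc1 = crc16_128B(msg);
    }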
The Protocol Layer must drive bits [7:6] in Byte 1 of the Flit Header to 00b for Protocol Flits and to 10b for Management Flits (when successfully negotiated).
For Management Flits using Format 5, Bytes 240 to 243 are driven from the Protocol Layer with
Management Transport Credit Return DWORD (CRD) Bytes 0 to 3 (see Section 8.2.5.2.2 for CRD
format). See Figure 3-22 for details.
If CXL.io is negotiated along with Management Transport protocol on the same stack for Format 5:
• If bits [7:6] of Byte 1 are 10b, the Adapter drives 0 on Bytes 122 to 125 and 244 to 253
• If bits [7:6] of Byte 1 are 00b, then Bytes 122 to 125 are treated per the CXL.io DLP rules of this
flit format and Bytes 250 to 253 are treated per the CXL.io FM rules of this flit format
For Management Flits using Format 6, Bytes 250 to 253 are driven from the Protocol Layer with
Management Transport Credit Return DWORD (CRD) Bytes 0 to 3 (see Section 8.2.5.2.2 for CRD
format). Similarly, Bytes 244 to 249 are driven from the Protocol Layer as 0. See Figure 3-26 for
details.
If PCIe/CXL.io is negotiated along with Management Transport protocol on the same stack for Format
6:
• If bits [7:6] of Byte 1 are 10b, the Adapter passes through Bytes 122 to 125 and 248 to 253
• If bits [7:6] of Byte 1 are 00b, then Bytes 122 to 125 are treated per the PCIe/CXL.io DLP rules of
this flit format, Bytes 250 to 253 are treated per the PCIe/CXL.io FM rules of this flit format, and
the Adapter drives 0 on Bytes 248 and 249
Figure 3-20. Format 5: Latency-Optimized 256B without Optional Bytes Flit Format for CXL.ioa
[Figure content: Flit Header Bytes FH B0/B1 at Bytes 0 and 1; CRC0, CRC1, and FM bytes with 6B Reserved follow Flit Chunk 3 (52B from the Protocol Layer starting at Byte 192).]
Figure 3-21. Format 5: Latency-Optimized 256B without Optional Bytes Flit Format
for CXL.cachemem and Streaming Protocola
[Figure content: Flit Header Bytes followed by 62B of Flit Chunk 0; CRC0 Bytes and 4B Reserved after 58B of Flit Chunk 1; Flit Chunk 2 carries a full 64B; CRC1 Bytes and 10B Reserved after Flit Chunk 3 (52B from the Protocol Layer starting at Byte 192).]
a. See Figure 2-1 for color mapping.
b. Flit Header Byte 0 and Byte 1, respectively.
c. CRC0 Byte 0, CRC0 Byte 1, CRC1 Byte 0, and CRC1 Byte 1, respectively.
Figure 3-22. Format 5: Latency-Optimized 256B without Optional Bytes Flit Format
for Management Transport Protocola
[Figure content: as Figure 3-21, with a 4B CRD field from the Protocol Layer in Flit Chunk 3 (48B from the Protocol Layer starting at Byte 192), ahead of the CRC1 Bytes and 10B Reserved.]
Figure 3-23. Format 6: Latency-Optimized 256B with Optional Bytes Flit Format
for CXL.io or PCIea
[Figure content: Flit Header Bytes with DLP Bytes B2 through B5, CRC0/CRC1 Bytes, FM bytes, and 2B Rsvd after Flit Chunk 3 (56B from the Protocol Layer starting at Byte 192).]
Figure 3-24. Format 6: Latency-Optimized 256B with Optional Bytes Flit Format
for CXL.cachemema
[Figure content: Flit Header Bytes followed by 62B of Flit Chunk 0; CRC0 Bytes and H-slot Bytes 0 through 3 after 58B of Flit Chunk 1; Flit Chunk 2 carries a full 64B; CRC1 Bytes and H-slot Bytes 4 through 13 after Flit Chunk 3 (52B from the Protocol Layer starting at Byte 192).]
a. See Figure 2-1 for color mapping.
b. Flit Header Byte 0 and Byte 1, respectively.
c. H-slot Byte 0 through Byte 13, respectively (from Protocol Layer).
d. CRC0 Byte 0, CRC0 Byte 1, CRC1 Byte 0, and CRC1 Byte 1, respectively.
Figure 3-25. Format 6: Latency-Optimized 256B with Optional Bytes Flit Format
for Streaming Protocola
[Figure content: Flit Header Bytes followed by Flit Chunk 0; CRC0 Bytes after 62B of Flit Chunk 1; Flit Chunk 2 carries a full 64B; CRC1 Bytes after 62B of Flit Chunk 3.]
a. See Figure 2-1 for color mapping.
b. Flit Header Byte 0 and Byte 1, respectively.
c. CRC0 Byte 0, CRC0 Byte 1, CRC1 Byte 0, and CRC1 Byte 1, respectively.
Figure 3-26. Format 6: Latency-Optimized 256B with Optional Bytes Flit Format
for Management Transport Protocola
[Figure content: as Figure 3-25, with a 4B CRD field from the Protocol Layer and 6B Rsvd after Flit Chunk 3 (52B from the Protocol Layer starting at Byte 192), alongside the CRC0/CRC1 Bytes.]
The Flit Header byte formats are the same as Table 3-5 when Retry is required; otherwise, they are
the same as Table 3-4. The DLP rules are also the same as defined in Section 3.3.3 for CXL protocol,
except that Flit_Marker/Optimized_Update_FC has dedicated space in the Flit (i.e., bit [4] of Byte 0
corresponds to the Flit_Marker bytes, and not the DLP bytes). If Optimized_Update_FC is sent, the
DLP Bytes 2:5 shown in Figure 3-20 must be reserved. If bit [4] of Byte 0 in the Flit Header is 0b,
then the Flit_Marker bytes are reserved.
Format 1 (Raw): Protocol Layer populates all the bytes on FDI. Adapter passes them to RDI without modifications or additions. (See Section 3.3.1 and Figure 3-10.)

Format 2 (68B Flit): Protocol Layer transmits 64B per Flit on FDI. Adapter inserts two bytes of Flit header and two bytes of CRC and performs the required barrel shifting of bytes before transmitting on RDI. On the Rx, Adapter strips out the Flit header and CRC, sending only the 64B per Flit to the Protocol Layer on FDI. (See Section 3.3.2, Figure 3-11, and Figure 3-12.)
Table 3-7 gives the implementation requirements and Protocol Mapping for the different Flit Formats.
For PCIe and CXL protocols, the implementation requirements must be followed by the Protocol Layer
as well as the Adapter implementations. For Streaming protocols, the implementation requirements
are for the Adapter only; Protocol Layer interoperability and implementation requirements are vendor
specific.
Table 3-7 (columns: PCIe Non-Flit Mode; PCIe Flit Mode; CXL 68B Flit Mode; CXL 256B Flit Mode; Streaming Protocol; Management Transport Protocol):

Format 3 (Standard 256B End Header): N/A; Mandatory; N/A; N/A; Optional (a); Optional
Format 4 (Standard 256B Start Header): N/A; Optional (b); N/A; Mandatory; Optional (a); Optional
Format 5 (Latency-Optimized 256B without Optional Bytes): N/A; N/A; N/A; Optional; Optional (a); Optional
Format 6 (Latency-Optimized 256B with Optional Bytes): N/A; Strongly Recommended (c); N/A; Strongly Recommended; Strongly Recommended (a); Optional

[Table fragment: for the Management Transport protocol (h) row, all protocol columns (68B Flit Mode, CXL 256B Flit Mode, PCIe Flit Mode, Streaming Protocol, Management Transport Protocol, PCIe, CXL.io) are N/A.]
IMPLEMENTATION NOTE
The “68B Flit Mode” parameter is advertised as set to 1 for both the CXL and PCIe
protocols in {AdvCap.Adapter} sideband messages. As seen in Table 3-8, this
parameter is set to 1 in {FinCap.Adapter} sideband messages whenever the CXL OR
PCIe protocols are negotiated.
The “CXL.io” and “PCIe” bits in the {AdvCap.CXL} sideband message disambiguate
between CXL support vs. PCIe support. It is permitted to set both to 1 in
{AdvCap.CXL} sideband messages. However, as seen in Table 3-8, only one of these
must be set in the {FinCap.CXL} sideband message to reflect the final negotiated
protocol for the corresponding stack. For example:
• If the DP and UP both support CXL and PCIe protocols, then both “CXL.io” and
“PCIe” will be set to 1 in the {AdvCap.CXL} sideband message
• If the DP decides to operate in CXL, the DP will set “CXL.io” to 1 and clear “PCIe”
to 0 in the {FinCap.CXL} sideband message, in which case the remaining CXL-
related bits in the {FinCap.CXL} sideband message are also applicable and are
assigned as per the negotiation
Table 3-9 (Truth Table 1) shows the truth table for deciding the Flit format in which to operate if PCIe
or CXL protocols are negotiated (with or without Management Transport protocol), and none of the
following are negotiated:
• Enhanced Multi_Protocol_Enable
• Standard 256B Start Header for PCIe protocol capability
• Latency-Optimized Flit with Optional Bytes for PCIe protocol capability
Table 3-10 (Truth Table 2) provides the Truth Table for determining the Flit Format for Streaming
protocols if Streaming Flit Format capability is negotiated or if Management Transport protocol is
negotiated without CXL or PCIe or Streaming protocols on the same stack. Note that for Streaming
protocol negotiation or for Management Transport protocol negotiation without CXL or PCIe protocol
multiplexed on the same stack, there are no {FinCap.*} messages exchanged. Each side of the UCIe
Link advertises its own capabilities in the {AdvCap.Adapter} message it sends. The bits in Table 3-10
represent the logical AND of the corresponding bits in the sent and received {AdvCap.Adapter}
messages. Truth Table 2 must be followed for determining the Flit Format if any of the following capabilities are supported and enabled on both sides of the Link:
• Enhanced Multi-Protocol Capability
• Standard Start Header Flit for PCIe protocol capability
• Latency-Optimized Flit with Optional Bytes for PCIe protocol capability
For situations where {FinCap.Adapter} messages are sent, the bits in the truth table represent the
bits set in the {FinCap.Adapter} message.
It is permitted for the Adapter OR the Protocol Layer to take the Link down to LinkError if the desired
Flit Format is not negotiated or the negotiated Flit format and protocol combination is illegal (e.g., 68B
Flit Format 2 and Management Transport protocol combination).
Flit Format decision based on the {FinCap.Adapter} bits (a), in column order Raw Format; 68B Flit Mode; CXL 256B Flit Mode; PCIe Flit Mode; CXL_LatOpt_Fmt5; CXL_LatOpt_Fmt6:

0, 0, 0, 0, 1, 0 → Format 5: Latency-Optimized 256B without Optional Bytes Flit Format
0, x, x, x, x, 1 → Format 6: Latency-Optimized 256B with Optional Bytes Flit Format
a. Format 6 is the highest priority format when Raw Format is not advertised because it has the best performance characteristics.
Between Format 4 and Format 3, Format 4 is higher priority because it enables lower latency through the D2D Adapter when
multiplexing different protocols. Format 5 has the highest overhead and therefore has the lowest priority relative to other
formats.
b. Raw Format is always explicitly enabled through UCIe Link Control register and advertised only when it is the required format of
operation to ensure interoperability, and therefore appears as a higher priority in the decision table.
c. x indicates don’t care.
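IMPLEMENTATION NOTE
The rows of the truth table that are visible above decode as in the following non-normative C sketch, ordered by the priorities given in the footnotes (Raw Format first, Format 6 next, Format 5 last). Rows of Table 3-9/Table 3-10 not reproduced above return -1; the struct and field names are illustrative.

    typedef struct {
        int raw, flit68, cxl256, pcie_flit, fmt5, fmt6; /* {FinCap.Adapter} */
    } fincap_t;

    int decode_flit_format(fincap_t f)
    {
        if (f.raw)
            return 1;    /* Raw Format: explicitly enabled, highest priority */
        if (f.fmt6)
            return 6;    /* row "0 x x x x 1": Format 6, best performance    */
        if (!f.flit68 && !f.cxl256 && !f.pcie_flit && f.fmt5)
            return 5;    /* row "0 0 0 0 1 0": Format 5, lowest priority     */
        return -1;       /* remaining rows are not reproduced in this sketch */
    }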
Figure 3-27 shows examples of state machine hierarchy for different configurations. For CXL, the
ARB/MUX vLSMs are exposed on FDI pl_state_sts. The Adapter LSM is used to coordinate Link
states with remote Link Partner and is required for all configurations. Each protocol stack has its
corresponding Adapter LSM. For PCIe or Streaming protocols, the Adapter LSM is exposed on FDI
pl_state_sts.
The RDI state machine (SM) is used to abstract the Physical Layer states for the upper layers. The
Adapter data path and RDI data width can be extended for multi-module configurations; however,
there is a single RDI state machine for this configuration. The Multi-module PHY Logic creates the
abstraction and coordinates between the RDI state and individual modules. The following rules apply:
• vLSM state transitions are coordinated with remote Link partner using ALMPs on mainband data
path. The rules for state transitions follow the CXL 256B Flit Mode rules in the CXL Specification.
• Adapter LSM state transitions are coordinated with remote Link partner using
{LinkMgmt.Adapter*} sideband messages. These messages are originated and received by the
D2D Adapter.
• RDI SM state transitions are coordinated with the remote Link partner using {LinkMgmt.RDI*}
sideband messages. These messages are originated and received by the Physical Layer.
[Figure 3-27: State machine hierarchy — (a) CXL example: ARB/MUX vLSMs above the Adapter LSM (FDI), with the RDI SM below (RDI); (b) PCIe or Streaming example: Adapter LSM (FDI) above the RDI SM (RDI).]
General rules for State transition hierarchy are captured below. For specific sequencing, see the rules
outlined in Chapter 10.0.
• Active State transitions: RDI SM must be in Active before Adapter LSM can begin negotiation to
transition to Active. Adapter LSM must be in Active before vLSMs can begin negotiations to
transition to Active.
• Retrain State transitions: RDI SM must be in Retrain before propagating Retrain to Adapter LSMs.
If RDI SM is in Retrain, Retrain must be propagated to all Adapter LSMs that are in Active state.
Adapter must not request Retrain exit on RDI before all the relevant Adapter LSMs have
transitioned to Retrain.
• PM State transitions (both L1 and L2): Both CXL.io and CXL.cachemem vLSMs (if CXL), must
transition to PM before the corresponding Adapter LSM can transition to PM. All Adapter LSMs
(if multiple stacks are enabled on the same Adapter) must be in PM before RDI SM is transitioned
to PM.
• LinkError State transitions: RDI SM must be in LinkError before Adapter LSM can transition to
LinkError. RDI SMs coordinate LinkError transition with remote Link partner using sideband, and
each RDI SM propagates LinkError to all enabled Adapter LSMs. Adapter LSM must be in LinkError
before propagating LinkError to both vLSMs if CXL. LinkError transition takes priority over
LinkReset or Disabled transitions. Adapter must not request LinkError exit on RDI before all the
relevant Adapter LSMs and CXL vLSMs have transitioned to LinkError.
• LinkReset or Disabled State transitions: Adapter LSM negotiates LinkReset or Disabled transition
with its remote Link partner using sideband messages. LinkReset or Disabled is propagated to RDI
SM only if all the Adapter LSMs associated with it transition to LinkReset or Disabled. Disabled
transition takes priority over LinkReset transition. If RDI SM moves to LinkReset or Disabled, it
must be propagated to all Adapter LSMs. If Adapter LSM moves to LinkReset or Disabled, it must
propagate it to both vLSMs for CXL protocol.
For UCIe Retimers, it is the responsibility of the Retimer die to negotiate state transitions with the
remote Retimer partner and make sure the different UCIe Die are in sync and do not time out waiting
for a response. As an example, referring to Figure 1-18, if UCIe Die 0 sends an Active Request
message for the Adapter LSM to UCIe Retimer 0, UCIe Retimer 0 must resolve with UCIe Retimer 1
that an Active Request message has been forwarded to UCIe Die 1 and that UCIe Die 1 has responded
with an Active Status message before responding to UCIe Die 0 with an Active Status message. The
Off Package Interconnect cannot be taken to a low power state unless all the relevant states on UCIe
Die 0 AND UCIe Die 1 have reached the low power state. UCIe Retimers must respond with “Stall”
encoding every 4 ms while completing resolution with the remote Retimer partner.
since all ALMPs also go through the Retry buffer in UCIe). Even CXL 68B Flit Mode over UCIe uses
the “CXL 256B Flit Mode” ALMP formats and flows (but the Flit is truncated to 64B and two bytes
of Flit header and two bytes of CRC are added by the Adapter to make a 68B Flit). For PCIe
protocol in UCIe Flit Mode, PM DLLP handshakes are NOT used. Protocol Layer requests PM entry
on FDI based on Link idle time. The specific algorithm and hysteresis for determining Link idle
time is implementation specific.
2. Adapter Link State Machine PM entry: The PM transition for this is coordinated over sideband
with remote Link partner. In scenarios where the Adapter is multiplexing between two protocol
stacks, each stack’s Link State Machine must transition to PM independently.
3. PM entry on RDI: Once all the Adapter’s LSMs are in a PM state, the Adapter initiates PM entry
on the RDI as defined in Section 10.2.9.
4. Physical Layer moves to a deeper PM state and takes the necessary actions for power
management. Note that the sideband Link must remain active because the sideband Link is used
to initiate PM exit.
[Figure: PM entry flow between Die 0 and Die 1 (ARB/MUX vLSM, Adapter LSM, and Physical Layer SM on each side of the channel) — once both vLSMs are in PM and the Retry buffer is empty, the Adapter LSM handshake is performed over sideband, after which PM entry is complete.]
The CRC is always computed over 128 bytes of the message. For smaller messages, the message is
zero extended in the MSB. Any bytes which are part of the 128B CRC message but are not
transmitted over the Link are assigned to 0b. Whenever non-CRC bytes of the Flit populated by the
Adapter are included for CRC computation (e.g., the Flit Header or DLP bytes), CRC is computed after
the Adapter has assigned those bytes the values that will be sent over the UCIe Link. Any reserved
bits which are part of the Flit are assigned 0b for the purpose of CRC computation.
The initial value of CRC bits for CRC LFSR computation is 0000h. The CRC calculation starts with bit 0
of byte 0 of the message, and proceeds from bit 0 to bit 7 of each byte as shown in Figure 3-29. In
the figure, C[15] is bit 7 of CRC Byte 1, C[14] is bit 6 of CRC Byte 1 and so on; C[7] is bit 7 of CRC
Byte 0, C[6] is bit 6 of CRC Byte 0 and so on.
The Verilog code for CRC code generation is provided in crc_gen.vs (attached to the PDF copy of this
Specification). This Verilog code must be used as the golden reference for implementing the CRC
during encode or decode. The code is provided for the Transmit side. It takes 1024 bits (bit 1023 is bit
7 of message Byte 127, 1022 is bit 6 of message Byte 127 and so on; bit 1015 is bit 7 of message
Byte 126 and so on until bit 0 is bit 0 of message Byte 0) as an input message and outputs 16 bits of
CRC. On the Receiver, the CRC is computed using the received Flit bytes with appropriate zero
padding in the MSB to form a 128B message. If the received CRC does not match the computed CRC,
the flit is declared Invalid and a replay must be requested.
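IMPLEMENTATION NOTE
A non-normative C sketch of the bit ordering described above follows: bit 0 of message Byte 0 enters the LFSR first, and the LFSR is seeded with 0000h. The feedback tap positions (the polynomial) are defined only by crc_gen.vs, which remains the golden reference; CRC_POLY below is a placeholder, not the actual UCIe polynomial, and the Galois-form LFSR structure is likewise an assumption.

    #include <stdint.h>

    #define CRC_POLY 0x0000u   /* placeholder: take the taps from crc_gen.vs */

    uint16_t crc16_128B(const uint8_t msg[128])
    {
        uint16_t crc = 0x0000;                    /* initial LFSR value */
        for (int byte = 0; byte < 128; byte++) {
            for (int bit = 0; bit < 8; bit++) {   /* bit 0 first ... bit 7 */
                int fb = ((msg[byte] >> bit) & 1) ^ ((crc >> 15) & 1);
                crc = (uint16_t)(crc << 1);
                if (fb)
                    crc ^= CRC_POLY;              /* assumed Galois form */
            }
        }
        return crc;
    }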
[Figure 3-29: CRC bit and byte ordering — message Bytes 0 through 127 are processed bit 0 first through bit 7, feeding the XOR network that produces C[0] through C[15].]
The Retry scheme on UCIe is a simplified version of the Retry mechanism for Flit Mode defined in PCIe
Base Specification. The rules that differ from PCIe are as follows:
• Selective Nak and associated rules are not applicable and must not be implemented. Rx Retry
Buffer-related rules are also not applicable and must not be implemented.
• Throughout the duration of Link operation, when not conflicting with PCIe rules of replay, Explicit
Sequence number Flits and Ack/Nak Flits alternate. This allows for faster Ack turnaround and thus
smaller Retry buffer sizes. It is permitted to send consecutive Explicit Sequence number Flits if
there are no pending Ack/Nak Flits to send (see also the Implementation Note below). To meet
this requirement, all Explicit Sequence Number Flit transmissions described by the PCIe rules of
replay that require the condition “CONSECUTIVE_TX_EXPLICIT_SEQ_NUM_FLIT < 3” to be met
require “CONSECUTIVE_TX_EXPLICIT_SEQ_NUM_FLIT < 1” to be met instead, and it is not
required to send three consecutive Flits with Explicit Sequence Number.
• All 10-bit retry related counters are replaced with 8-bit counters, and the maximum-permitted
sequence number is 255 (hence 1023 in all calculations is replaced with 255 and any variables
defined in the “Flit Sequence Number and Retry Mechanism” section of PCIe Base Specification
which had an initial value of 1023 instead have an initial value of 255).
• REPLAY_TIMEOUT_FLIT_COUNT is a 9-bit counter that saturates at 1FFh.
— In addition to incrementing REPLAY_TIMEOUT_FLIT_COUNT as described in PCIe Base
Specification, the count must also be incremented when in Active state and a Flit Time
(Number of Adapter clock cycles (lclk) that are required to transfer 256B of data at the
current Link speed and width) has elapsed since the last flit was sent and neither a Payload
Flit nor a NOP flit was transmitted. The counter must be incremented for every Flit Time in
which a flit was not sent (this could lead to it being incremented several times in-between flits
or prior to the limit being met). The added requirement compensates for the noncontinuous
transfer of NOP flits. For 68B Flit Format, data transfers are also in 256B granularity (including
the PDS bytes), and thus this counter increments every time 256B of data are transmitted,
OR during idle conditions in Active state, it must be incremented according to the time that is
required to transfer 256B of data at the current Link speed and width.
— Replay Schedule Rule 0 of PCIe Base Specification must check for
REPLAY_TIMEOUT_FLIT_COUNT ≥ 375. A Replay Timer Timeout error is logged as a Correctable Internal Error in the Adapter for UCIe.
• For the FLIT_REPLAY_NUM counter, it is strongly recommended to follow the rules provided in
PCIe Base Specification for speeds ≤ 32.0 GT/s. This counter tracks the number of times that a
Replay has occurred without making forward progress. Given the significantly lower probability of
Replay for UCIe Links, the rules associated with ≤ 32.0 GT/s PCIe speeds are sufficient for UCIe.
• NAK_WITHDRAWAL_ALLOWED is always cleared to 0. Note that this requires implementations to
set the flag NAK_SCHEDULED=1 in the “Nak Schedule 0” set of rules.
• IDLE Flit Handshake Phase is not applicable. This is because the transition to Link Active
(equivalent to LTSSM being in L0 for PCIe) is managed via handshakes on sideband, and there is
no requirement for IDLE Flits to be exchanged. As per PCIe rules, any Flits received with all 0s in
the Flit Header bytes are discarded by the Adapter. Any variables that are initialized during the
IDLE Flit Handshake Phase in PCIe Base Specification are initialized to the corresponding value
whenever the RDI is in Reset state or Retrain state. Similarly, PCIe rules that indicate relation to
“last entry to IDLE Flit Handshake Phase” would instead apply for UCIe to “last exit from Reset or
Retrain state on RDI”.
• Variables applicable to Flit Sequence number and Retry mechanism that are initialized during
DL_Inactive, as with PCIe, would be initialized to their corresponding values when RDI is in Reset
state for UCIe.
• Sequence Number Handshake Phase must be performed on every entry of the RDI to Active state
from Reset state or Retrain state (after Flit transfers are permitted). Sequence Number
Handshake Phase timeout and exit to Link Retrain is 128 Flits transmitted without exiting
Sequence Number Handshake Phase. As with PCIe, either NOP flits or Payload flits are permitted to
be used to complete the Sequence Number Handshake Phase. If there are no Payload flits to
send, the Adapter must generate NOP flits to complete the Sequence Number Handshake Phase.
• The variable “Prior Flit was Payload” is always set to 1. This bit does not exist in the Flit Header,
and thus from the Retry perspective, implementations must assume that it is always set to 1.
• MAX_UNACKNOWLEDGED_FLITS is set to the lesser of:
— Number of Flits that can be stored in the Tx Retry Buffer, or
— 127
• Flit Discard 2 rule from PCIe does not result in a Data Link Protocol Error condition in UCIe.
Receiving an invalid Flit Sequence number in a received Ack or Nak flit (see the corresponding
conditions in PCIe Base Specification with the adjusted variable widths and values) OR a Payload
Flit with an Explicit Sequence number of 0 results in an Uncorrectable Internal Error in UCIe
(instead of a Data Link Protocol Error).
• Conditions from the “Flit Sequence Number and Retry Mechanism” section in PCIe Base
Specification that led to Recovery for the Port must result in the Adapter initiating Retrain on the
RDI for UCIe.
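IMPLEMENTATION NOTE
Two of the width adjustments above can be captured in a short non-normative C sketch. The wrap from 255 back to 1 is an assumption consistent with the rule that a Payload Flit with an Explicit Sequence number of 0 is invalid; retry_buffer_flits is an illustrative name.

    #include <stdint.h>

    /* 8-bit sequence numbers with a maximum of 255 (1023 -> 255). */
    static inline uint8_t next_seq(uint8_t s)
    {
        return (uint8_t)((s == 255) ? 1 : s + 1);
    }

    /* MAX_UNACKNOWLEDGED_FLITS: lesser of Tx Retry Buffer capacity or 127. */
    static inline unsigned max_unacked(unsigned retry_buffer_flits)
    {
        return (retry_buffer_flits < 127) ? retry_buffer_flits : 127;
    }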
IMPLEMENTATION NOTE
In UCIe, to encourage power savings through dynamic clock gating, it is not required
to continuously transmit NOP flits during periods in which there are no Payload flits or
any Ack/Nak pending. Consider an example in which an Adapter’s Tx Retry Buffer is
empty and it transmitted a NOP flit with an Ack as the last flit before it stopped
sending additional flits to the Physical Layer. Let’s say this flit had a CRC error and
hence the remote Link partner never receives this Ack. Moreover, because the remote
Link partner received a flit with a CRC error, it would transmit a Nak to the original
sender. If the Ack is never re-sent and the remote Link partner has a corresponding
Payload flit in its Tx Retry Buffer, eventually a Replay Timeout will trigger from the
remote Link partner and resolve this scenario. However, rather than always relying on
Replay Timeout for these kind of scenarios, it is recommended for implementations to
ensure they have transmitted at least two flits with an Ack (these need not be
consecutive Ack flits) before stopping flit transfer whenever a Nak is received and the
transmitter has completed all the requirements of received Nak processing, including
any Replay related transfers. If no new Payload Flits were received from the remote
Link partner, as per PCIe rules, it is permitted to re-send the last transmitted Ack on
a NOP flit as well to meet this condition.
When this mechanism is enabled, the Adapter inserts 64*N Bytes every 256*256*N Bytes of data,
where N is obtained from the Error and Link Testing Control register (Field name: Number of 64 Byte
Inserts). Software must set N=4 when this feature is enabled during regular Link operation for UCIe
Flit mode because that makes the parity bytes also a multiple of 256B and is more consistent with the
granularity of data transfer. Only bit 0 of the inserted byte has the parity information which is
computed as follows:
The Transmitter and Receiver in the Adapter must independently keep track of the number of data
bytes elapsed to compute or check the parity information. If the RDI state moves away from Active
state, the data count and parity is reset, and both sides must renegotiate the enabling of the Parity
insertion before next entry to Active from Retrain (if the mechanism is still enabled in the Error and
Link Testing Control register). When entering Active state with Parity insertion enabled, the number of
data bytes elapsed begins counting from 0. On the transmitter, following the insertion of the parity
information, the counter for the number of bytes elapsed to compute the parity information is reset.
On the Receiver, following the receipt and check of parity bytes, the counter for the number of bytes
elapsed to check the parity information is reset.
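IMPLEMENTATION NOTE
The insertion cadence above reduces to a byte counter, as the non-normative C sketch below shows. With the required N=4, parity bytes are due after every 256*256*4 = 262,144 data bytes, and the 64*4 = 256 inserted bytes form one 256B-aligned block. Names and structure are illustrative.

    #include <stdbool.h>
    #include <stdint.h>

    typedef struct { uint32_t bytes; uint32_t n; } parity_ctr_t;

    /* Returns true when 64*N parity bytes are due. The counter is also
     * reset whenever RDI leaves Active (not shown). */
    bool parity_due(parity_ctr_t *c, uint32_t bytes_sent)
    {
        c->bytes += bytes_sent;        /* PDS and padding bytes count too */
        if (c->bytes >= 256u * 256u * c->n) {
            c->bytes = 0;              /* reset after the insertion point */
            return true;
        }
        return false;
    }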
This mechanism is enabled by Software writing 1b to the enable bit in the register located in both
Adapters across a UCIe Link (see Section 9.5.3.9 for register details). Software must trigger UCIe
Link Retrain after writing to the enable bit on both the Adapters. Support for this feature in Raw
Format is beyond the scope of this specification and is implementation-dependent. The Adapters
exchange sideband messages while the Adapter LSMs are in Retrain to ensure the remote Link
partner’s receiver is prepared to receive the extra parity bytes in the data stream once the states
transition to Active. The Adapter must not request Retrain exit to local RDI until the Parity Feature
exchanges are completed. It is permitted to enable this feature during initial Link bring-up by using sideband to access the remote Link partner's registers or other implementation-specific means; however, software must trigger Link Retrain for the feature to take effect.
Adapter sends a {ParityFeature.Req} sideband message to remote Link Partner if its Transmitter is
enabled to send parity bytes (“Runtime Link Testing Tx Enable” bit in Section 9.5.3.9). Remote
Adapter responds with a {ParityFeature.Ack} sideband message if its receiver is enabled and ready to
accept parity bytes (“Runtime Link Testing Rx Enable” bit in Section 9.5.3.9). Figure 3-30 shows an
example of a successful negotiation. If Die 0 Adapter Transmitter is enabled to insert parity bytes, it
must send a {ParityFeature.Req} from Die 0 to Die 1.
The Adapter responds with a {ParityFeature.Nak} if it is not ready to accept parity bytes, or if the feature
has not been enabled for it yet. The requesting Adapter must log the Nak in a status register so
that Software can determine that a Nak had occurred. Figure 3-31 shows an example of an
unsuccessful negotiation.
Note: The Adapters are permitted to transition to a higher latency data path if the Parity
Feature is enabled. The explicit Ack/Nak handshake is provided to ensure that both sides
have sufficient time to transition to the alternate data path for this mechanism.
The Parity bytes do not consume Retimer receiver buffer credits. The Retimer receiver must not
write the Parity bytes into its receiver buffer or forward them to the remote Retimer partner over the Off
Package Interconnect. This mechanism is to help characterize local UCIe Links only.
Figure 3-30. Successful Parity Feature negotiation between Die 1 Tx and Die 0 Rx
[Figure content: Die 1 sends {ParityFeature.Req} if it wants to enable Parity insertion on its Tx; Die 0 sends {ParityFeature.Ack} in response to accept the request from Die 1, and its Rx must be ready to receive the extra parity bytes before the response is sent. This is followed by the RDI Active Entry Handshake and the Adapter LSM Active Entry Handshake.]
Figure 3-31. Unsuccessful Parity Feature negotiation between Die 1 Tx and Die 0 Rx
[Figure content: Die 1 sends {ParityFeature.Req} if it wants to enable Parity insertion on its Tx; Die 0 sends {ParityFeature.Nak} in response to reject the request from Die 1. This is followed by the RDI Active Entry Handshake and the Adapter LSM Active Entry Handshake.]
If a parity error is detected by a chiplet, the error is treated as a Correctable error and reported via
the correctable error reporting mechanism. By enabling an interrupt on correctable errors, software can
implement a BER counter, if so desired.
When a Pause Data Stream occurs, the Pause Data Stream and corresponding padding bytes are
included in the count of bytes elapsed before parity insertion, as well as in the parity computation.
§§
[Waveform: Clock and Valid timing; each data transfer occupies 8 UI.]
Each Byte is transmitted on a separate Lane. Byte 0 (B0) is transmitted on Lane 0, Byte 1 is
transmitted on Lane 1 and so on.
Figure 4-2 shows an example of a 256B Flit transmitted over a x64 interface (one x64 Advanced
Package module or two x32 Advanced Package modules or four Standard Package modules). If the
I/O width changes to a x32 or x16 interface (Standard Package), transmission of one Byte per Lane is
preserved as shown in Figure 4-3 and Figure 4-4 respectively.
Figure 4-5 shows an example for a width degraded Standard Package module.
Figure 4-2. Byte to Lane mapping for a x64 interface
LANE
UI 0 1 2 3 4 5 6 7 … 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63
0-7 B00 B01 B02 B03 B04 B05 B06 B07 … B48 B49 B50 B51 B52 B53 B54 B55 B56 B57 B58 B59 B60 B61 B62 B63
8-15 B64 B65 B66 B67 B68 B69 B70 B71 … B112 B113 B114 B115 B116 B117 B118 B119 B120 B121 B122 B123 B124 B125 B126 B127
16-23 B128 B129 B130 B131 B132 B133 B134 B135 … B176 B177 B178 B179 B180 B181 B182 B183 B184 B185 B186 B187 B188 B189 B190 B191
24-31 B192 B193 B194 B195 B196 B197 B198 B199 … B240 B241 B242 B243 B244 B245 B246 B247 B248 B249 B250 B251 B252 B253 B254 B255
Figure 4-5. Byte to Lane mapping for Standard package x16 degraded to x8
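As a non-normative aid, the one-Byte-per-Lane mapping illustrated above can be computed as follows (the function name is assumed):

def byte_to_lane(byte_index, width):
    # One Byte per Lane, 8 UI per Byte: Byte k of the Flit is sent on
    # Lane (k mod width), starting at UI 8*(k div width).
    lane = byte_index % width
    first_ui = (byte_index // width) * 8
    return lane, first_ui

# Example from the x64 mapping: Byte 64 is carried on Lane 0 during UI 8-15.
assert byte_to_lane(64, 64) == (0, 8)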
Note: An 8-UI block assertion is enforced by the Transmitter and tracked by the Receiver
during Active state. This means that following the first valid transfer of data over
mainband in Active state, each subsequent transfer is after an integer multiple of 8 UI
from the rising edge of Valid of the first transfer. Note that for Retimers, this means
that the first transfer after entering the Active state cannot be a ‘No Flit data transfer +
1 credit release’ encoding; this is acceptable because the Retimer-advertised credits
are replenished or readvertised whenever the state moves away from Active.
[Waveform: Clock and Valid with consecutive data transfers at 8-UI granularity.]
a. Note that the bits above are transmitted on the Link in order from right to left (i.e., bit 0 is transmitted on the
Link first, followed by bit 1 and so on until bit 7).
Note that the clock postamble is required any time that the clock can toggle with Valid assertion, and
the clock needs to stop toggling, regardless of LTSM state.
As shown in Section 7.1.2, the sideband message formats are defined as a 64-bit header with no
data, with 32 bits of data, or with 64 bits of data. A 64-bit serial packet is defined on the I/O interface
to the remote die as shown in Figure 4-8. 32-bit data is sent using the 64-bit serial packet with MSBs
padded with 0b. Two sideband serial packets on the I/O interface are separated by a minimum of 32
bits low as shown in Figure 4-9. A sideband message with data would be transferred as a 64-bit
header followed by 32 bits of low followed by 64-bit data followed by 32 bits of low.
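The framing above can be sketched as a simple UI-timeline generator (non-normative; the function name is assumed, and the trailing 32-UI low models the minimum inter-packet gap):

def sb_serial_packets(has_data):
    # Yields (segment, duration_in_UI) for one sideband message.
    yield ("64-bit header packet", 64)
    if has_data:
        yield ("low", 32)                 # minimum gap between serial packets
        yield ("64-bit data packet", 64)  # 32-bit data is 0b-padded in the MSBs
    yield ("low", 32)                     # minimum gap before the next packet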
[Waveforms (Figure 4-8 and Figure 4-9): a 64-UI sideband serial packet, and two packets separated by at least 32 UI of low, with SB Clock.]
When Sideband PMO capability is enabled, the 32-UI dead time between the 64-UI data transfers on
the sideband is no longer applicable and the sideband can transmit 64-UI data back-to-back with no
gaps. See Figure 4-10 and Figure 4-11 for illustration. The transmitter must follow this new mode
after the transmitter has sent and received the {MBINIT.PARAM SBFE resp} sideband message with
the Sideband PMO bit set to 1, across all modules. The receiver must be ready to accept packets in
this mode after the receiver has transmitted the {MBINIT.PARAM SBFE resp} sideband message with
the Sideband PMO bit set to 1. After Sideband PMO is enabled, the transmitter operates in Performant
Mode in all states until entry into the RESET state with the SB_MGMT_UP flag cleared to 0.
Additionally, PMO can then only be renegotiated on a training sequence with the SB_MGMT_UP flag
cleared to 0. Note that from a receiver perspective, due to timing differences, packets might be
received without the Sideband Performant Mode even after the chiplet has transmitted an
{MBINIT.PARAM SBFE resp} sideband message with the PMO bit set to 1. The sideband receiver must
be backward compatible and be able to handle 32 UI of gaps between consecutive 64-UI transfers
over the sideband Link.
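The mode-switch rules above reduce to a simple per-direction predicate, sketched here (non-normative; names are assumed, and inputs are per-module booleans for the Sideband PMO bit in the {MBINIT.PARAM SBFE resp} messages):

def tx_may_use_pmo(resp_sent_pmo_per_module, resp_received_pmo_per_module):
    # Transmitter follows Performant Mode only after it has both sent and
    # received the SBFE resp with the PMO bit set, across all modules.
    return all(resp_sent_pmo_per_module) and all(resp_received_pmo_per_module)

def rx_must_accept_pmo(resp_transmitted_with_pmo):
    # Receiver must be ready as soon as it has transmitted the SBFE resp
    # with the PMO bit set (and must still tolerate 32-UI gaps afterward).
    return resp_transmitted_with_pmo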
Figure 4-10. Example 64-bit Sideband Serial Packet Transfer in Sideband Performant Mode
[Waveforms: a single 64-UI serial packet (Figure 4-10), and back-to-back 64-UI serial packets with no gap (Figure 4-11), with SB Clock.]
Devices must support Lane reversal of data Lanes within a Module. An example of Lane reversal is
when physical Data Lane 0 on local die is connected to physical Data Lane (N-1) on the remote die
(physical Data Lane 1 is connected to physical Data Lane N-2 and so on) where N = 8 for a x8
Standard Package, N = 16 for a x16 Standard Package, N = 32 for a x32 Advanced Package, and N =
64 for a x64 Advanced Package. Redundant Lanes, in case of Advanced Package, are also reversed.
Lane reversal must be implemented on the Transmitter only. The Transmitter reverses the logical Lane
order on Data and Redundant Lanes.
Lane reversal is discovered and applied during initialization and training (see Section 4.5.3.3.5).
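As a non-normative illustration (function name assumed), the connectivity under Lane reversal is:

def reversed_partner_lane(physical_lane, n):
    # Physical Data Lane k on the local die connects to physical Data
    # Lane (N-1)-k on the remote die; N is the module width (8, 16, 32, or 64).
    return (n - 1) - physical_lane

# Example: for a x16 Standard Package module, local Lane 0 connects to
# remote Lane 15.
assert reversed_partner_lane(0, 16) == 15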
4.2.1 Lane ID
To allow Lane reversal discovery, each logical Data and redundant Lane within a module is assigned a
unique Lane ID. The assigned Lane IDs are shown in Table 4-2 and Table 4-3 for Advanced and
Standard Package modules, respectively. Note that logical Lane numbers in Table 4-2 and Table 4-3
represent the logical Transmitter and Receiver Lanes. For example, Logical Lane Number = 0
represents TD_L[0]/RD_L[0] and so on.
In Table 4-2, for a x64 Advanced Package module, logical Lane numbers 64, 65, 66, and 67 represent
Logical redundant Lanes TRD_L[0]/RRD_L[0], TRD_L[1]/RRD_L[1], TRD_L[2]/RRD_L[2],
TRD_L[3]/RRD_L[3], respectively. For a x32 Advanced Package module, the Lane ID for
TD_L[0:31]/RD_L[0:31], TRD_L[0]/RRD_L[0] and TRD_L[1]/RRD_L[1] will be represented
by the set of Lane ID {0…31, 64, 65} respectively.
In Table 4-3, for a x16 Standard Package module, the Lane ID for TD_L[0:15]/RD_L[0:15] will be
represented by the set of Lane ID {0…15} respectively. For a x8 Standard Package module, the Lane
ID for TD_L[0:7]/RD_L[0:7] will be represented by the set of Lane ID {0…7} respectively.
Table 4-2. Lane ID assignment for Advanced Package modules (Logical Lane Number | Lane ID)
0 00000000b 34 00100010b
1 00000001b 35 00100011b
2 00000010b 36 00100100b
3 00000011b 37 00100101b
4 00000100b 38 00100110b
5 00000101b 39 00100111b
6 00000110b 40 00101000b
7 00000111b 41 00101001b
8 00001000b 42 00101010b
9 00001001b 43 00101011b
10 00001010b 44 00101100b
11 00001011b 45 00101101b
12 00001100b 46 00101110b
13 00001101b 47 00101111b
14 00001110b 48 00110000b
15 00001111b 49 00110001b
16 00010000b 50 00110010b
17 00010001b 51 00110011b
18 00010010b 52 00110100b
19 00010011b 53 00110101b
20 00010100b 54 00110110b
21 00010101b 55 00110111b
22 00010110b 56 00111000b
23 00010111b 57 00111001b
24 00011000b 58 00111010b
25 00011001b 59 00111011b
26 00011010b 60 00111100b
27 00011011b 61 00111101b
28 00011100b 62 00111110b
29 00011101b 63 00111111b
30 00011110b 64 01000000b
31 00011111b 65 01000001b
32 00100000b 66 01000010b
33 00100001b 67 01000011b
Table 4-3. Lane ID assignment for Standard Package modules (Logical Lane Number | Lane ID)
0 00000000b 8 00001000b
1 00000001b 9 00001001b
2 00000010b 10 00001010b
3 00000011b 11 00001011b
4 00000100b 12 00001100b
5 00000101b 13 00001101b
6 00000110b 14 00001110b
7 00000111b 15 00001111b
Lane remapping is accomplished by a “shift left” or “shift right” operation. A “shift left” is when the data
traffic of logical Lane TD_L[n] on TD_P[n] is multiplexed onto TD_P[n-1]. A shift left puts TD_L[0]
onto TRD_P[0] or TD_L[32] onto TRD_P[2]. A “shift right” operation is when data traffic TD_L[n] is
multiplexed onto TD_P[n+1]. A shift right puts TD_L[31] onto TRD_P[1] or TD_L[63] onto TRD_P[3].
See the pseudo code in Section 4.3.3.1 and Section 4.3.3.2, which shows the changes in mapping post
repair.
Note: If the lower index redundant Lane (TRD_P[0] or TRD_P[2]) is faulty, no data lanes
can be repaired for its group. Note that if the higher index redundant lane (TRD_P[1]
or TRD_P[3]) is faulty, one data lane can be repaired for its group.
After a data Lane is remapped, the Transmitter associated with the faulty physical Lane is tri-stated
and the Receiver is disabled. The Transmitter and the Receiver of the redundant Lane used for the
repair are enabled.
Figure 4-12 shows the transmit bump side of data Lane remapping for the first group of 32 Lanes. Both
“shift left” and “shift right” remapping are needed to optimally repair up to any two Lanes within the
group. Figure 4-13 shows details of the mux structure used for data Lane repair.
Note: Example repair implementations are shown for TD_P[31:0] for clarity. It should be
noted that the same schemes are also applicable to TD_P[63:32].
Figure 4-12. Data Lane remapping possibilities to fix potential defects
[Diagram: Die-1 Tx Lanes TD_L[n-1], TD_L[n], TD_L[n+1], TD_L[n+2] with Shift_left and Shift_right mux paths to Die-2 Rx Lanes RD_L[n-1], RD_L[n], RD_L[n+2].]
TRD_P[0](RRD_P[0]) must be used as the redundant Lane to remap any single physical Lane
failure for TD_P[31:0](RD_P[31:0]). TRD[2](RRD[2]) must be used as the redundant Lane to
remap any single Lane failure for TD_P[63:32] (RD_P[63:32]).
Pseudo code for repair in TD_P[63:32](RD_P[63:32]) (32<= x <=63) (this does not apply to x32
Advanced Package Link):
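The normative pseudo code is defined by this specification; as a non-normative sketch of the shift-left repair described above (function name assumed), the post-repair mapping for a single faulty physical Lane x in the upper group is:

def remap_single_repair_upper(x):
    # Repair faulty TD_P[x]/RD_P[x], 32 <= x <= 63, using TRD_P[2]/RRD_P[2]:
    # logical Lanes 32..x shift left by one physical position.
    assert 32 <= x <= 63
    mapping = {}
    for n in range(32, 64):
        if n <= x:
            mapping[n] = "TRD_P[2]" if n == 32 else "TD_P[%d]" % (n - 1)
        else:
            mapping[n] = "TD_P[%d]" % n   # Lanes above the fault are unchanged
    return mapping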
As shown in Figure 4-14, TD_P[29] is remapped in the direction that uses TRD_P[0] as the repair
resource. Figure 4-15 shows the circuit implementation.
[Diagram: single-Lane repair example; the logical Lanes at and below the fault shift toward TRD_P[0], shown as Die-1 Tx to Die-2 Rx mux paths for TD_L[n-1] through TD_L[n+2] and RD_L[n].]
Pseudo code for two Lane repair in TD_P[31:0](RD_P[31:0]) (0<= x,y <=31):
Pseudo code for two Lane repair in TD_P[63:32] (RD_P[63:32]) (32<= x,y <=63) (this does not
apply to x32 Advanced Package Link):
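As a non-normative sketch of the two-Lane repair described above (function name assumed), combining one shift-left and one shift-right:

def remap_two_repair_lower(x, y):
    # Repair faulty TD_P[x] and TD_P[y], 0 <= x < y <= 31: logical Lanes
    # 0..x shift left toward TRD_P[0]; logical Lanes y..31 shift right
    # toward TRD_P[1]; Lanes in between keep their physical positions.
    assert 0 <= x < y <= 31
    mapping = {}
    for n in range(32):
        if n <= x:
            mapping[n] = "TRD_P[0]" if n == 0 else "TD_P[%d]" % (n - 1)
        elif n >= y:
            mapping[n] = "TRD_P[1]" if n == 31 else "TD_P[%d]" % (n + 1)
        else:
            mapping[n] = "TD_P[%d]" % n
    return mapping

# Example matching Figure 4-16: faulty physical Lanes 25 and 26.
m = remap_two_repair_lower(25, 26)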
Shown in Figure 4-16 is an example of two (physical Lanes 25 and 26) Lane remapping. Figure 4-17
shows the circuit implementation. Both Transmitter and Receiver must apply the required remapping.
[Diagram: two-Lane repair example; faulty physical Lanes TD_P[n] and TD_P[n+1] (BAD LANES) are bypassed by remapping logical Lanes TD_L[n-1] through TD_L[n+2] onto the surrounding physical Lanes, shown as Die-1 Tx to Die-2 Rx mux paths.]
After a Lane is remapped, the Transmitter is tri-stated. The Receiver of the physical redundant
(RRDCK_P) Lane is disabled.
[Diagram: Clock and Track Lane repair structure. TCKP_P/RCKP_P and TCKN_P/RCKN_P feed the differential clock Receiver (RCK_L) through normal and repair mux paths; TRDCK_P/RRDCK_P provides the redundant clock path (RRDCK_L); TTRK_P/RTRK_P carries the Track signal (RTRK_L) with its own normal and repair paths.]
The implementation of Clock and Track Lane remapping is shown in Figure 4-21(a), Figure 4-21(b)
and Figure 4-21(c) respectively. The corresponding circuit level details of remapping implementation
are shown in Figure 4-22, Figure 4-23 and Figure 4-24.
Note that both the Transmitter and Receiver on the CKRD Lane are required during the detection phase
and can be tri-stated and turned off if not used for repair.
Figure 4-25 shows the normal path for Valid and redundant valid Lanes. Figure 4-26 shows the repair
path for Valid Lane failure.
[Figure 4-25 diagram: normal path. TVLD_L is carried on TVLD_P/RVLD_P and TRDVLD_L on TRDVLD_P/RRDVLD_P, each through Tx and Rx Lane Repair muxes between Die-1 and Die-2.]
[Figure 4-26 diagram: repair path for a Valid Lane failure. TVLD_L is steered through the redundant Lane TRDVLD_P/RRDVLD_P via the Tx and Rx Lane Repair muxes between Die-1 and Die-2.]
4.3.7 Width Degrade in Standard Package Interfaces
In the case of x16 Standard Package modules where Lane repair is not supported, resilience against
faulty Lanes is provided by configuring the Link to a x8 width (Logical Lanes 0 to 7 or Logical Lanes 8
to 15, which exclude the faulty Lanes). For example, if one or more faulty Lanes are in logical Lane 0
to 7, the Link is configured to x8 width using logical Lanes 8 to 15. The configuration is done during
Link initialization or retraining. Transmitters of the disabled Lanes are tri-stated and Receivers are
disabled.
In the case of x8 Standard Package modules, resilience against faulty Lanes is provided by
configuring the Link to a x4 width (Logical Lanes 0 to 3 or Logical Lanes 4 to 7, which exclude the
faulty Lanes). The configuration is done during Link initialization or retraining. Transmitters of the
disabled Lanes are tri-stated and Receivers are disabled.
Figure 4-5 shows the byte to Lane mapping for a width degraded x8 interface.
Figure 4-27 shows the infrastructure for interface training and testing. The Transmit Die and Receive
Die implement the same Linear Feedback Shift Register (LFSR) described in Section 4.4.1. The
pattern sent from the Transmitter along with the forwarded clock and Valid is compared with a locally
generated reference pattern. Both transmit and receive pattern generators must start and advance in
sync. The compare circuitry checks for matching data each UI. Any mismatch between the received
pattern and pattern predicted by the local pattern generator is detected as an error.
[Figure 4-27 diagram: training and testing infrastructure. A transmit pattern generator drives Lanes 0 through N-1 along with Clock and Valid; on the receive die, a reference pattern generator and per-UI compare logic detect mismatches against the remote pattern, which are accumulated in an error counter per Lane.]
The LFSR uses the same polynomial as PCIe: G(X) = X^23 + X^21 + X^16 + X^8 + X^5 + X^2 + 1. Each
Transmitter is permitted to implement a separate LFSR for scrambling and pattern generation. Each
Receiver is permitted to implement a separate LFSR using the same polynomial for de-scrambling and
pattern comparison. The implementation is shown in Figure 4-30. The seed of the LFSR is Lane
dependent: the seed value is determined by the Logical Lane number modulo 8, as shown in Table 4-4.
Alternatively, implementations can choose to implement one LFSR with different tap points for
multiple Lanes as shown in Figure 4-31. This is equivalent to individual LFSR per-Lane with different
seeds.
Table 4-4. LFSR seed per Lane (indexed by Logical Lane number modulo 8)
Lane Seed
0 23'h1DBFBC
1 23'h0607BB
2 23'h1EC760
3 23'h18C0DB
4 23'h010F12
5 23'h19CFC9
6 23'h0277CE
7 23'h1BB807
[Figure 4-30 diagram: 23-bit LFSR (D0 through D22) with per-bit Seed load, producing Data_Out on the Transmitter and consuming Data_In on the Receiver. Reset Value = ‘1.]
[Figure 4-31 diagram: single shared LFSR with per-Lane tap equations (Data_Out_Lane_i/Data_In_Lane_i formed from Tap_Eqn_Lane_i):
For i = 2, 10, 18, 26, 34, 42, 50, 58: Tap_Eqn_Lane_i = D13^D22
For i = 3, 11, 19, 27, 35, 43, 51, 59: Tap_Eqn_Lane_i = D1^D22
For i = 4, 12, 20, 28, 36, 44, 52, 60: Tap_Eqn_Lane_i = D3^D22
For i = 5, 13, 21, 29, 37, 45, 53, 61: Tap_Eqn_Lane_i = D1^D3
For i = 6, 14, 22, 30, 38, 46, 54, 62: Tap_Eqn_Lane_i = D3^D9
For i = 7, 15, 23, 31, 39, 47, 55, 63: Tap_Eqn_Lane_i = D1^D9]
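As a non-normative behavioral sketch of the LFSR (the exact bit ordering and serialization follow Figure 4-30; the Fibonacci-style feedback below is one conventional realization of the stated polynomial):

SEEDS = [0x1DBFBC, 0x0607BB, 0x1EC760, 0x18C0DB,    # Table 4-4 seeds,
         0x010F12, 0x19CFC9, 0x0277CE, 0x1BB807]    # indexed by Lane mod 8

def lfsr_bits(lane, n):
    # Generate n scrambler bits for a Lane, one bit per UI (sketch only).
    state = SEEDS[lane % 8]                 # 23-bit state, D22..D0
    out = []
    for _ in range(n):
        bit = (state >> 22) & 1             # output taken from D22
        # Feedback per G(X) = X^23 + X^21 + X^16 + X^8 + X^5 + X^2 + 1
        fb = (bit ^ ((state >> 20) & 1) ^ ((state >> 15) & 1)
              ^ ((state >> 7) & 1) ^ ((state >> 4) & 1) ^ ((state >> 1) & 1))
        state = ((state << 1) | fb) & 0x7FFFFF
        out.append(bit)
    return out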
When training during Link Initialization (i.e., Physical Layer transitions out of RESET state), hardware
is permitted to attempt training multiple times:
• Triggers for initiating Link Training for Management Transport path setup on the sideband are:
— Software writes 1 to the Retrain Link bit in the Sideband Management Port Structure register
(see Section 8.1.3.6.2.1)
— HW-autonomous trigger by the Management Port Gateway to automatically train the
Management Transport path on the sideband without SW intervention, as occurs after
Management Reset
— SBINIT pattern (two consecutive iterations of 64-UI clock pattern and 32-UI low) is observed
on any sideband Receiver clock/data pair, when the SB_MGMT_UP flag is cleared to 0
• The triggers for initiating Link Training for the mainband are:
— Software writes 1 to Start UCIe Link Training bit in UCIe Link Control register in the UCIe Link
DVSEC (see Section 9.5.1.5)
— When Management Transport protocol is supported on the UCIe link, software writes 1 to the
Retrain Link bit in the Management Port Structure register of either the Sideband or Mainband
Management Port that is associated with the UCIe link (see Section 8.1.3.6.2.1)
— Adapter triggers Link Training on the RDI (RDI status is Reset and there is a NOP to Active
transition on the state request)
If hardware fails training after an implementation-specific number of attempts, the hardware must
transition to RESET and wait for a subsequent Link Training trigger. Physical Layer must escalate a
fatal error to the D2D Adapter on the RDI if mainband Software-triggered or RDI-triggered Link
training fails or there is a Link-up-to-Link-down transition due to a Physical Layer timeout.
Throughout this section, references to mainband transmitter and receiver behavior are called out in
various state machine states. These references do not apply when the port does not have a mainband
(i.e., the port is a sideband-only port without a physically present mainband, or the port is a UCIe-S
port with only the sideband used). In the latter scenario, the mainband transmitters are held in tri-
state throughout and the mainband receivers are disabled.
The sequence of steps for this test are as follows for each UCIe Module of the UCIe Link:
1. The UCIe Module sets up the Transmitter parameters (shown in Table 4-5), sends a {Start Tx Init
D to C point test req} sideband message to its UCIe Module Partner, and waits for a response. The
data field of this message includes the required parameters, shown in Table 4-5. The Receiver on
the UCIe Module Partner must enable the pattern comparison circuits to compare incoming
mainband data to the locally generated expected pattern. Once the data to clock training
parameters for its Receiver are set up, the UCIe Module Partner responds with a {Start Tx Init D to
C point test resp} sideband message.
2. The UCIe Module resets the LFSR (scrambler) on its mainband Transmitters and sends the {LFSR
clear error req} sideband message. The UCIe Module Partner resets the LFSR and clears any prior
compare results on its mainband Receivers and responds with {LFSR clear error resp} sideband
message.
3. The UCIe Module sends the pattern (selected through “Tx Pattern Generator Setup”) for the
selected number of cycles (“Tx Pattern Mode Setup”) on its mainband Transmitter.
4. The UCIe Module Partner performs the comparison on its Receivers for each UI during the pattern
transmission based on “Rx compare setup” and logs the results.
5. The UCIe Module requests its UCIe Module Partner for the logged results in Step 4 by sending
{Tx Init D to C results req} sideband message. The UCIe Module Partner stops comparison on its
mainband Receivers and responds with the logged results {Tx Init D to C results resp} sideband
message.
6. The UCIe Module stops sending the pattern on its Transmitters and sends the {End Tx Init D to C
point test req} sideband message and the UCIe Module Partner responds with {End Tx Init D to C
point test resp}. When a UCIe Module has received the {End Tx Init D to C point test resp}
sideband message, the corresponding sequence has completed.
Valid Pattern (for Valid Lanes): VALTRAIN pattern (four 1s followed by four 0s)
a. See Table 7-11 for the encodings. See also the registers in Section 9.5.3.26 and Section 9.5.3.27. See also the Implementation
Note below this table.
IMPLEMENTATION NOTE
The Training Setup 1 and Training Setup 2 registers (see Section 9.5.3.26 and Section 9.5.3.27,
respectively) are applicable for compliance or debug. For regular operation, implementations must
follow the comparison mode, iteration or UI count specified in the corresponding states (for
example, Section 4.5.3.4.8 specifies 4K UI of continuous mode LFSR pattern and total (aggregate)
error count); and because these patterns are fixed, Rx is permitted to ignore Burst Count/Idle
Count/Iteration Count values in the sideband messages for regular operation. Training Setup 3 and
Training Setup 4 registers (see Section 9.5.3.28 and Section 9.5.3.29, respectively) are applicable
for regular operation as well as compliance and debug.
3. The UCIe Module Partner sends the pattern (selected through “Tx Pattern Generator Setup”) for
the selected number of cycles (“Tx Pattern Mode Setup”) on its mainband Transmitter.
4. The UCIe Module performs the comparison on its mainband Receivers for each UI during the
pattern transmission based on “Rx compare setup” and logs the results.
5. The UCIe Module Partner sends an {Rx Init D to C Tx count done req} sideband message
once the pattern count is complete. The UCIe Module stops comparison and responds
with the {Rx Init D to C Tx count done resp} sideband message. The UCIe Module can now use
the logged data for its Receiver Lanes.
6. The UCIe Module sends an {End Rx Init D to C point test req} sideband message and the UCIe
Module Partner responds with an {End Rx Init D to C point test resp} sideband message. When a
UCIe Module has received the {End Rx Init D to C point test resp} sideband message, the
corresponding sequence has completed.
7. The UCIe Module Partner sends an {Rx Init D to C sweep done with results} sideband message
with results for its mainband Transmitter. The UCIe Module can use the sweep results for its
mainband Receivers.
8. The UCIe Module sends an {End Rx Init D to C eye sweep req} sideband message and the UCIe
Module Partner responds with an {End Rx Init D to C eye sweep resp} sideband message. When a
UCIe Module has received the {End Rx Init D to C eye sweep resp} sideband message, the
corresponding sequence has completed.
State Description
RESET This is the state following primary reset or exit from TRAINERROR.
MBTRAIN Mainband (Data, Clock, and Valid signals) speed of operation is set to the highest negotiated data rate. Die-to-Die training of the mainband is performed to center the clock with respect to Data.
LINKINIT This state is used to exchange Adapter and Link management messages.
ACTIVE This is the state in which transactions are sent and received.
PHYRETRAIN This state is used to begin the retrain flow for the Link during runtime.
TRAINERROR State is entered when a fatal or non-fatal event occurs at any point during Link Training or operation.
[Link Training State Machine diagram: RESET → SBINIT → MBINIT → MBTRAIN (speed/width resolution) → LINKINIT → ACTIVE (functional data), with PHYRETRAIN reachable during runtime; TRAINERROR is reachable from any state except RESET; exit from L1 returns to MBTRAIN, and exit from L2 returns to SBINIT.]
4.5.3.1 RESET
Physical Layer must remain in RESET for a minimum of 4 ms upon every entry to RESET, to allow PLLs
to stabilize and any other Link Training initialization requirements to be met. The minimum conditions
necessary to exit RESET are as follows (collected into a sketch after this list):
• Power supplies are stable
• Sideband clock is available and running at 800 MHz
• If {
— Physical Layer and Die to Die Adapter internal clocks are stable and available
— Mainband clock speed is set to the slowest I/O data rate (2 GHz for 4 GT/s)
— Local SoC/Firmware not keeping the Physical Layer in RESET
— Link training trigger for the mainband has occurred (triggers are defined in the beginning of
Section 4.5)
} OR
{
— Sideband Management Transport protocol is supported
— SB_MGMT_UP = 0
— Local SoC/Firmware is not keeping the sideband in RESET
— Management Port Gateway indicates that it is ready for Management Transport path
initialization
— Link Training trigger for the Management Transport path has occurred (triggers are defined in
the beginning of Section 4.5)
}
• Data, Valid, Clock, and Track Transmitters are tri-stated
• Data, Valid, Clock, and Track Receivers are permitted to be disabled
• If [Management Transport protocol is not supported] OR [SB_MGMT_UP=0], Sideband
Transmitters are held low
• Sideband Receivers are enabled
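Collecting the principal conditions above into a single predicate (non-normative; the field names are assumed):

def can_exit_reset(s):
    common = (s.power_supplies_stable and s.sb_clock_running_800mhz
              and s.mainband_tx_tristated and s.sb_rx_enabled)
    mainband_path = (s.internal_clocks_stable
                     and s.mb_clock_at_lowest_rate      # 2 GHz for 4 GT/s
                     and not s.soc_holds_phy_in_reset
                     and s.mainband_training_trigger)
    mgmt_path = (s.sb_mgmt_transport_supported
                 and not s.sb_mgmt_up                   # SB_MGMT_UP = 0
                 and not s.soc_holds_sb_in_reset
                 and s.mgmt_gateway_ready
                 and s.mgmt_training_trigger)
    return common and (mainband_path or mgmt_path)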
4.5.3.2 SBINIT
The SB initialization procedure is performed at 800 MT/s with an 800-MHz sideband clock.
Advanced Package has redundant SB clock and SB data Lanes (DATASBRD, CKSBRD) in addition to
DATASB and CKSB. SBINIT sequence for Advanced Package where interconnect repair may be
needed is as follows:
1. The UCIe Module must start and continue to send iterations of a 64-UI clock pattern (a clock
pattern is defined as starting with 1 and toggling every UI of transmission, i.e., 1010...) and 32-
UI low on both sideband data Transmitters (TXDATASB and TXDATASBRD). The UCIe Module must
send strobes on both TXCKSB and TXCKSBRD during the 64-UI clock pattern transmission and be
gated (held low) otherwise.
2. UCIe Module Partner must sample each incoming data pattern on its sideband Receivers with
both incoming sideband clocks (this forms four Receiver/clock combinations).
3. Detection on a sideband data-clock Receiver combination is considered successful if a 128-UI
clock pattern is detected.
4. If a UCIe Module Partner detects the pattern successfully on at least one of its sideband data-
clock Receiver combination, it must stop sending data and clock on its sideband Transmitters after
four more iterations of 64-UI clock pattern and 32-UI low. This will allow for any time differences
in both UCIe Module and UCIe Module Partner coming out of RESET state. The sideband
Transmitter and Receiver on the UCIe Module must now be enabled to send and receive sideband
messages.
5. If pattern is not detected on its sideband Receiver, the UCIe Module must continue to alternate
between sending the pattern on its sideband Transmitters for 1 ms, and holding low for 1 ms, for
a total of 8 ms. The sideband Receiver of the UCIe Module must remain enabled during this time.
Timeout occurs after 8 ms. If a timeout occurs, the UCIe Module enters TRAINERROR state.
6. If detection is successful on more than one sideband data/clock combination, the device can pick
a combination based on a priority order. Pseudocode for sideband assignment (a runnable rendering appears after this sequence):
IF (Result[3:0] == XXX1):
    Sideband = (DATASB/CKSB)
ELSE IF (Result[3:0] == XX10):
    Sideband = (DATASB/CKSBRD)
ELSE IF (Result[3:0] == X100):
    Sideband = (DATASBRD/CKSB)
ELSE IF (Result[3:0] == 1000):
    Sideband = (DATASBRD/CKSBRD)
ELSE:
    Sideband is not functional
7. If the sideband on the UCIe Module is enabled to send and receive sideband messages (Step 4),
the UCIe Module must start and continue to send {SBINIT Out of Reset} sideband message on
both TXDATASB and TXDATASBRD while sending both TXCKSB and TXCKSBRD until it detects the
same message in its sideband Receivers or a timeout occurs at 8 ms.
8. If {SBINIT Out of Reset} sideband message detection is successful on its sideband Receivers, the
UCIe Module stops sending the sideband message. Before sending any further sideband
messages, both UCIe Module and UCIe Module Partner must apply Sideband Data/Clock
assignment (called the functional sideband) based on the information included in the {SBINIT Out
of Reset} sideband message.
9. Any further sideband messages must be sent and received on the functional sideband. Any
sideband message exchange can now be performed.
10. The UCIe Module sends the {SBINIT done req} sideband message and waits for a response. If
this message is received successfully, UCIe Module Partner responds with {SBINIT done resp}
sideband message. When a UCIe Module has sent and received the {SBINIT done resp} sideband
message, the UCIe Module must exit to MBINIT. The following additional rules apply when the
transmitting/receiving chiplet supports Management Transport protocol. These additional rules
are required to initiate mainband link training for the scenario in which the Management Transport
path has already been established prior (i.e., SB_MGMT_UP flag is set to 1) and one of the
mainband link training triggers (see Section 4.5) occurred with the Link Training State Machine in
RESET state.
— If the Module partner that is receiving the {SBINIT done req} sideband message is in RESET
state and is ready to proceed with mainband initialization, the module partner must transition
to SBINIT state, respond with an {SBINIT done resp} sideband message, and then send a
{SBINIT done req} sideband message.
— Module partner must ignore a received {SBINIT done req} sideband message if the module
partner is EITHER [in RESET state but not yet ready to proceed with mainband initialization]
OR [in a state machine state other than RESET/SBINIT].
— UCIe Module that is transmitting the {SBINIT done req} sideband message must transition to
RESET state (by way of TRAINERROR state) if the UCIe Module did not receive a response
within the regular 8-ms time window. In that scenario, the UCIe Module can choose to re-
issue the {SBINIT done req} sideband message after transitioning to SBINIT state again from
RESET state. The UCIe Module can repeat this process N number of times before waiting in
RESET for a new training trigger. (The value of N is implementation-dependent.) The UCIe
Module partner must collapse multiple outstanding {SBINIT done req} sideband messages
and respond only with a single {SBINIT done resp} sideband message.
The next state is mainband initialization (MBINIT) if sideband message exchange is successful.
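For illustration, the assignment pseudocode in Step 6 can be rendered as a runnable sketch (Result[3:0] is packed as an integer; the bit-to-combination mapping below is inferred from the priority order):

def pick_sideband(result):
    # bit 0: DATASB/CKSB, bit 1: DATASB/CKSBRD,
    # bit 2: DATASBRD/CKSB, bit 3: DATASBRD/CKSBRD
    if result & 0b0001:
        return ("DATASB", "CKSB")
    if result & 0b0010:
        return ("DATASB", "CKSBRD")
    if result & 0b0100:
        return ("DATASBRD", "CKSB")
    if result & 0b1000:
        return ("DATASBRD", "CKSBRD")
    return None    # sideband is not functional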
SBINIT sequence for Standard Package where interconnect Lane redundancy and repair are not
supported is as follows:
1. The UCIe Module must start and continue to send iterations of a 64-UI clock pattern (a clock pattern
is defined as starting with 1b and toggling every UI of transmission, i.e., 1010...) and 32-UI low
on its sideband Transmitter (TXDATASB). The UCIe Module must send the strobe on its sideband clock
(TXCKSB) during the 64-UI clock pattern duration and gate it (hold it low) otherwise.
2. The UCIe Module Partner must sample incoming data pattern with incoming clock.
3. Sideband pattern detection is considered successful if a 128-UI clock pattern is detected.
4. If the UCIe Module successfully detects the pattern, it stops sending data and clock on its
sideband Transmitters after four more iterations of pattern in Step 1. This will allow for any time
differences in both UCIe Modules coming out of RESET. The UCIe Module sideband Transmitter
and Receiver must now be enabled to send and receive sideband messages, respectively.
5. If pattern is not detected on its sideband Receiver, the UCIe Module continues to alternate
between sending the pattern on its Transmitters for 1 ms, and holding low for 1 ms, for a total of
8 ms. The sideband Receiver must be enabled during this time. Timeout occurs after 8 ms. If a
timeout occurs, the UCIe Module must exit to TRAINERROR. If a pattern is detected successfully
at any time, as described in Step 3, the UCIe Module enables sideband message transmission as
described in Step 4 and continues to Step 6.
6. Once sideband detection is successful (Step 5), the UCIe Module must start and continue to send
{SBINIT Out of Reset} sideband message on TXDATASB while sending TXCKSB until it detects the
same message in its sideband Receivers or a timeout occurs.
7. If {SBINIT Out of Reset} sideband message detection is successful, the UCIe Module must stop
sending the message. Any sideband message exchange can now be performed.
8. The UCIe Module must send the {SBINIT done req} sideband message. If this message is
received successfully, the UCIe Module Partner responds with the {SBINIT done resp} sideband
message. When the UCIe Module has sent and received the {SBINIT done resp} sideband
message, the UCIe Module must exit to MBINIT. The following additional rules apply when the
transmitting/receiving chiplet supports Management Transport protocol. These additional rules
are required to initiate mainband link training for the scenario in which the Management Transport
path has already been established prior (i.e., SB_MGMT_UP flag is set to 1) and one of the
mainband link training triggers (see Section 4.5) occurred with the Link Training State Machine in
RESET state.
— If the Module partner that is receiving the {SBINIT done req} sideband message is in RESET
state and is ready to proceed with mainband initialization, the module partner must transition
to SBINIT state, respond with an {SBINIT done resp} sideband message, and then send a
{SBINIT done req} sideband message.
— Module partner must ignore a received {SBINIT done req} sideband message if the module
partner is EITHER [in RESET state but not yet ready to proceed with mainband initialization]
OR [in a state machine state other than RESET/SBINIT].
— UCIe Module that is transmitting the {SBINIT done req} sideband message must transition to
RESET state (by way of TRAINERROR state) if the UCIe Module did not receive a response
within the regular 8-ms time window. In that scenario, the UCIe Module can choose to re-
issue the {SBINIT done req} sideband message after transitioning to SBINIT state again from
RESET state. The UCIe Module can repeat this process N number of times before waiting in
RESET for a new training trigger. (The value of N is implementation-dependent.) The UCIe
Module partner must collapse multiple outstanding {SBINIT done req} sideband messages
and respond only with a single {SBINIT done resp} sideband message.
The next state is mainband initialization (MBINIT) if sideband message exchange is successful. For
the remainder of initialization and operations, when not transmitting sideband packets, sideband
Transmitters are held Low and sideband Receivers are enabled.
4.5.3.3 MBINIT
In this state, the mainband (MB) interface is initialized and repaired or width degraded (when
applicable). The data rate on the mainband is set to the lowest supported data rate (4 GT/s).
For Advanced Package, interconnect repair may be needed. Sub-states in MBINIT allow detection
and repair of data, clock, track, and valid Lanes. For Standard Package, where no Lane repair is
needed, sub-states are used to check functionality at the lowest data rate and to degrade width if needed.
[MBINIT sub-state flow: From SBINIT → PARAM → CAL → REPAIRCLK → REPAIRVAL → REVERSALMB → REPAIRMB → To MBTRAIN]
4.5.3.3.1 MBINIT.PARAM
This state is used to perform exchange of parameters that are required to set up the maximum
negotiated speed and other PHY settings. Mainband Transmitters remain tri-stated and mainband
Receivers are permitted to be disabled. The following parameters are exchanged over sideband with
UCIe Module Partner:
• “Voltage swing”: This five-bit value indicates the Transmitter voltage swing to the UCIe Module
Partner. The UCIe Module Partner must use this value and its Receiver termination information to
set the reference voltage (Vref) for its Receivers. The corresponding bits in the {MBINIT.PARAM
configuration resp} sideband message are reserved.
• “Maximum Data Rate”: This four-bit value indicates the maximum supported data rate to the UCIe
Module Partner. This value must take into consideration all the required features at the data rate
(BER, CRC/Retry, quadrature clock phase support, etc.). The UCIe Module Partner must compare
this value with its supported maximum data rate and must respond with the maximum common
data rate encoding in the {MBINIT.PARAM configuration resp} sideband message (a sketch follows
this list). For example, if a UCIe Module is 8 GT/s capable while the UCIe Module Partner advertises
16 GT/s, the UCIe Module must pick 8 GT/s and send it back in the response.
• “Clock Mode”: This one-bit value indicates the UCIe Module’s request to the UCIe Module Partner
for a strobe or continuous clock. The UCIe Module Partner must use this information to set up the
clock mode on its clock Transmitter. The {MBINIT.PARAM configuration resp} sideband message
must reflect the same value. Continuous clock mode requires the clock to be free running and is
enforced after receiving the {MBTRAIN.RXCLKCAL start req} sideband message from the UCIe
Module Partner. The clock remains free running through the remainder of MBTRAIN (unless
MBTRAIN.LINKSPEED is exited due to errors) and in ACTIVE.
• “Clock Phase”: This one-bit value indicates the UCIe Module’s request to the UCIe Module Partner for
the Clock Phase support on the UCIe Module’s forwarded clock. This should only be set to 1b if the
maximum data rate advertised is permitted to do so (see Table 5-8). The corresponding bit in the
{MBINIT.PARAM configuration resp} sideband message must be set to 1b if this was requested
and the operational data rate allows it.
• “Module ID”: The UCIe Module sends its “Module ID”. This can be used by the UCIe Module
Partner if in a multi-module configuration for Byte mapping, Module enable/disable information
etc. The corresponding bits in the {MBINIT.PARAM configuration resp} sideband message are
reserved.
• “UCIe-A x32”: This bit is set to 1b when APMW bit in DVSEC UCIe Link Capability register (see
Section 9.5.1.4) is set to 1b (OR) ‘Force x32 width mode in x64 Module’ in the PHY Control
register (see Section 9.5.3.23) is set; otherwise, the bit is set to 0b. If a x64 Advanced Package
module supports width reduction to interoperate with a x32 Advanced Package Module, it uses
this information from its link partner to condition the results during MBINIT.REVERSALMB. The
corresponding bits in the {MBINIT.PARAM configuration resp} sideband message are reserved.
• “UCIe-S x8”: This bit is set to 1 in message {MBINIT.PARAM configuration req} when bit 20,
SPMW, in the DVSEC UCIe Link Capability register (see Section 9.5.1.4) is set to 1 (OR) ‘Force x8
Width’ bit is set to 1 in the PHY Control register (see Section 9.5.3.23). Otherwise, this bit is set
to 0. See Section 4.5.3.3.6 for how this bit is used. The corresponding bit in the {MBINIT.PARAM
configuration resp} sideband message is reserved.
• “Sideband Feature Extensions”: This bit is set to 1 if the transmitter supports sideband feature
extensions (see Section 4.5.3.3.1.1).
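The Maximum Data Rate resolution referenced in the list above can be sketched as follows (non-normative; plain GT/s values are used instead of the four-bit encodings):

def resolve_max_data_rate(local_max_gts, partner_advertised_gts):
    # The responder picks the highest data rate common to both sides and
    # returns it in the {MBINIT.PARAM configuration resp} message.
    return min(local_max_gts, partner_advertised_gts)

assert resolve_max_data_rate(8, 16) == 8    # example from the text above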
It is strongly recommended that if interoperable parameters are not negotiated, hardware maps
this scenario to an Internal Error in the Error Log 1 register, transitions the LTSM to TRAINERROR and
the RDI to LINKERROR, and asserts pl_trainerror on the RDI. For a multi-module Link, all the parameters
except “Module ID” must be the same for all the modules, and if this is not the case, it is strongly
recommended that hardware maps this scenario to the same error escalation path.
When management transport is supported, the additional conditions required for the Link training
state machine to exit the MBINIT.PARAM state to MBINIT.CAL are:
• Management Transport was not negotiated,
• (OR) Management Transport was negotiated and it was either initialized successfully or an error
was detected during initialization,
• (OR) the SB_MGMT_UP flag is already set on entry into MBINIT state.
Management Transport protocol over the sideband is optional and chiplets use the mechanism
described in this section to negotiate support for it. A sample negotiation flow is shown in Figure 4-35
for a single module design. Sideband Feature Extensions Supported (SFES) is bit 14 in the
{MBINIT.PARAM configuration req} sideband message (see Table 7-11). Note that the MBINIT.PARAM
state handshake relating to management path negotiation described in this section is performed on
all transitions through the MBINIT state. If the SB_MGMT_UP flag is set (see Section 8.2.3.1.2 for when
this happens) at entry into MBINIT state, management transport traffic continues without interruption
in the MBINIT.PARAM state.
[Figure 4-35 diagram: sample negotiation flow for a single-module design between Chiplet 0 and Chiplet 1; each chiplet’s Negotiation Phase completes (i.e., that chiplet can start the Initialization Phase) once the SBFE handshake finishes.]
[Companion diagram for a two-module design, showing the same per-module flow. Solid lines are for Module 0. Dashed lines are for Module 1.]
• Unless otherwise specified, a chiplet can optionally check for violation of any Negotiation Phase
rules (discussed in the subsequent bullets), and when a violation is detected, the chiplet initiates
a TRAINERROR handshake (see Section 4.5.3.8) to return the LTSM to RESET state.
• Sideband Feature Extensions Supported (SFES) bit in the {MBINIT.PARAM configuration req}
sideband message (see Table 7-11, it is bit [14] in the message) is defined to indicate support for
extended sideband features (of which Management Transport is one), during MBINIT.PARAM state
of link training.
— 0 => Sideband Feature Extensions are not supported, 1 => Sideband Feature Extensions are
supported.
— All modules in a multi-module design must have the same value for this bit in the Req
message.
— After the SB_MGMT_UP flag is set, the value of this bit must remain the same on all
subsequent transitions through the MBINIT.PARAM state, until that flag is cleared.
• If the Remote link partner supports sideband feature extensions and it received the SFES bit set
to 1 in the {MBINIT.PARAM configuration req} sideband message, the Remote link partner will set
SFES bit in the {MBINIT.PARAM configuration resp} sideband message that it sends out;
otherwise, the bit is cleared to 0 in the resp message.
— All modules in a multi-module design must have the same value for this bit in the resp
message.
• If the SFES bit in the {MBINIT.PARAM configuration resp} sideband message is received as set to
1 across all modules, then the chiplet negotiates the next level of details of extended sideband
features supported with remote link partner. If the SFES bit in the {MBINIT.PARAM configuration
resp} sideband message is received as cleared to 0 in any module, then MBINIT.PARAM state is
exited to MBINIT.CAL.
— {MBINIT.PARAM SBFE req} sideband message (see Table 7-11) is sent to the remote link
partner on all modules which then sends back an {MBINIT.PARAM SBFE resp} sideband
message on all modules. This handshake happens independently in each direction.
o SBFE Req[0]/Resp[0] (also referred to as the MTP bit) indicates support for transmission/
reception of Management Port Messages. Remote link partner must set the MTP bit in the
{MBINIT.PARAM SBFE resp} sideband message if it was set in the {MBINIT.PARAM SBFE
req} sideband message, and it supports receiving Management Encapsulation messages.
o After the SB_MGMT_UP flag is set to 1, the value of this bit must remain the same on all
subsequent transitions through the MBINIT.PARAM state, until that flag is cleared to 0.
• When negotiating SFES in {MBINIT.PARAM configuration req/resp}, if a chiplet advertised SFES
support in the Req message, the chiplet must also advertise that support in the Resp message
provided the associated Req message had that capability advertised. If the chiplet did not
advertise SFES support in the Req message, then the chiplet must not advertise that support in
the Resp message.
• For a multi-module UCIe Link, the negotiation is performed independently per module.
— A Physical Layer implementation may advertise MTP bit in the SBFE Req message only on a
subset of the modules.
Note: A message sent on a given Module ID could be received on a different Module ID on the
partner sideband Receiver. Hence all sideband links in a multi-module design must be
capable of receiving MPMs even if they are limited to only supporting transmit of these
messages on a subset of sideband links. See Figure 4-37 and Figure 4-38 for examples
of multi-module transmit/receive scenarios that illustrate this point.
• After the {MBINIT.PARAM SBFE resp} sideband message has been transmitted to the remote link
partner and the {MBINIT.PARAM SBFE resp} sideband message has been received from the remote
link partner (successfully or unsuccessfully) during MBINIT.PARAM across all
modules, the PHY informs the Management Port Gateway of the following (a sketch follows this list):
— Negotiated link count with management transport support on the transmit side, using the
pm_param_local_count[N-1:0] signals (see Section 10.1 for more details) at the end of
the negotiation phase. This is the value RxQ-Local. A link is considered to have negotiated
management transport support on the transmit side if the link transmitted the
{MBINIT.PARAM SBFE req} sideband message with the MTP bit set to 1 and received the
corresponding {MBINIT.PARAM SBFE resp} sideband message with its MTP bit also set to 1.
— Negotiated link count with management transport support on the receive side, using the
pm_param_remote_count[N-1:0] signals (see Section 10.1 for more details). This is the
value RxQ-Remote. A link is considered to have negotiated management transport support on
the receive side if the link received the {MBINIT.PARAM SBFE req} sideband message with the
MTP bit set to 1 and transmitted the corresponding {MBINIT.PARAM SBFE resp} sideband
message with its MTP bit also set to 1.
• A module must be able to receive initialization phase-related messages (see Section 8.2.3.1.2)
once it has transmitted {MBINIT.PARAM SBFE resp}.
• Negotiation phase ends when {MBINIT.PARAM SBFE resp} has been sent and received across all
modules.
• While in SBINIT state, if the SB_MGMT_UP flag transitioned from 1 to 0, the chiplet must move
the LTSM to TRAINERROR state -> RESET state.
• While in MBINIT state, if the SB_MGMT_UP flag transitioned from 1 to 0, the chiplet must perform
a TRAINERROR handshake and move the LTSM to TRAINERROR state -> RESET state.
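A non-normative sketch of the count derivation referenced above (the per-module field names are assumed; the signal names come from Section 10.1):

def rxq_counts(modules):
    # RxQ-Local: modules that transmitted SBFE req with MTP = 1 and received
    # the corresponding SBFE resp with MTP = 1 (transmit-side negotiation).
    local = sum(1 for m in modules if m["req_tx_mtp"] and m["resp_rx_mtp"])
    # RxQ-Remote: modules that received SBFE req with MTP = 1 and transmitted
    # the corresponding SBFE resp with MTP = 1 (receive-side negotiation).
    remote = sum(1 for m in modules if m["req_rx_mtp"] and m["resp_tx_mtp"])
    return local, remote   # drive pm_param_local_count / pm_param_remote_count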
Figure 4-37. Example Sideband MPM Logical Flow with Two Modules and No Module Reversal
[Diagram: on each chiplet, Module 0’s Tx for far-end RxQ-ID=0 connects through Module 0 SB to the partner’s Module 0 (RxQ-ID=0), and Module 1’s Tx for far-end RxQ-ID=1 connects through Module 1 SB to the partner’s Module 1 (RxQ-ID=1), in both directions between Chiplet 0 and Chiplet 1.]
Figure 4-38. Example Sideband MPM Logical Flow with Two Modules and Module Reversal
[Diagram: with module reversal, Chiplet 0 Module 0’s Tx for far-end RxQ-ID=0 connects through the sideband to Chiplet 1 Module 1 (local RxQ-ID=1), and Chiplet 0 Module 1’s Tx for far-end RxQ-ID=1 connects to Chiplet 1 Module 0 (local RxQ-ID=0); a message sent on a given Module ID is thus received on a different Module ID.]
To support firmware download and other functionality that might have to be configured before the
mainband link can start training, a capability is provided to “pause” the MBINIT state machine after
the PARAM sub-state has completed.
This is an optional capability that is enabled only when both chiplets have indicated support for
Sideband Feature Extensions as described in Section 4.5.3.3.1.1.
To “pause” link training after MBINIT.PARAM, either side can send a “Stall” encoding of FFFFh in the
MsgInfo field of the {MBINIT.PARAM SBFE resp} sideband message. For example, if a chiplet needs to
download firmware by way of the partner port before the chiplet can bring up the mainband, the
partner port can respond with “stall” encoding as stated above. Stall encoding instructs the other side
to pause and not move beyond the MBINIT.PARAM state. The payload in the {MBINIT.PARAM SBFE
resp} sideband message with stall encoding is still valid and must be accurate to the responder’s
capabilities. An {MBINIT.PARAM SBFE resp} sideband message with “Stall” encoded must be sent
once every 4 ms until the sender determines that it no longer needs to stall, at which time the sender
either sends the {MBINIT.PARAM SBFE resp} message without the stall encoding (in which case the
state machine advances to MBINIT.CAL state if other conditions allow (see Section 8.2.3.1.2)), OR the
sender does not send an {MBINIT.PARAM SBFE resp} sideband message, the link times out, and the
link transitions from TRAINERROR state to RESET state and starts over again.
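A non-normative sketch of the responder-side stall loop (function and callback names are assumed; the 4-ms resend interval and the FFFFh “Stall” encoding come from the text):

import time

STALL = 0xFFFF    # "Stall" encoding in the MsgInfo field

def stall_until_ready(send_sbfe_resp, init_done, capabilities):
    while not init_done():
        # The payload must still accurately reflect the responder's
        # capabilities even while stalling.
        send_sbfe_resp(msg_info=STALL, payload=capabilities)
        time.sleep(0.004)              # resend once every 4 ms
    # Release the stall: send the resp without the stall encoding so the
    # state machine can advance to MBINIT.CAL (if other conditions allow).
    send_sbfe_resp(msg_info=0x0000, payload=capabilities)  # non-stall value assumed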
It is legal for one side to indicate a stall and the other side to not indicate a stall. In that case, both
sides are stalled. It is also valid for either side to explicitly request an entry to TRAINERROR state
while in MBINIT.PARAM state. This can occur if either side is not yet ready to train the mainband.
See Section 4.5.3.3.1.3 for details of what happens at the end of MBINIT.PARAM when either end is a
sideband-only port.
Support for receiving an {MBINIT.PARAM SBFE resp} sideband message with “stall” encoding is
required in all ports that advertise the SFES bit set to 1 in the {MBINIT.PARAM configuration req}
sideband message. Support for transmitting the {MBINIT.PARAM SBFE resp} sideband message with
“stall” encoding is implementation-dependent. For example, if a design needs to support the firmware
download feature, the design can support this capability if the design cannot complete firmware
download within 8 ms.
Figure 4-39 shows a scenario in which the link training state machine initially moves through
RESET -> SBINIT -> MBINIT.PARAM with “stall” -> TRAINERROR -> RESET. During this phase, a
chiplet “stalls” in MBINIT.PARAM for additional Init time, such as for downloading chiplet firmware.
When the chiplet Init is complete, the chiplet either initiates entry to TRAINERROR state with a
TRAINERROR handshake message, OR the chiplet can stop sending the {MBINIT.PARAM SBFE resp}
sideband message with “stall” encoding, which would eventually trigger entry to TRAINERROR state
initiated by the partner chiplet. In the second phase, the state machine moves through SBINIT ->
MBINIT.PARAM without “stall” -> MBINIT.CAL, thus training the mainband. This flow is useful for
scenarios in which a chiplet potentially needs to change the advertised parameters for Link training
after chiplet Init. Note that the transition to TRAINERROR in this case does not escalate to RDI
transitioning to LinkError (or pl_trainerror assertion on RDI).
Figure 4-39. [Sideband Management path setup occurs first, followed by mainband training, with an entry to RESET in between. Phase 1: RESET → SBINIT → MBINIT.PARAM, where the Management Transport path on the sideband is negotiated/initialized and the chiplet “stalls” for additional “Init” time (the MBINIT.Stall mechanism), then TRAINERROR → RESET. Phase 2: SBINIT → MBINIT.PARAM (no stall) → MBINIT.CAL → ..., and the mainband is trained.]
Figure 4-40 shows a scenario in which the link training state machine initially moves through
RESET -> SBINIT -> MBINIT.PARAM with “stall”. The chiplet “stalls” in MBINIT.PARAM for additional
Init time. After the chiplet Init is complete, the chiplet sends an {MBINIT.PARAM SBFE resp} sideband
message without a “stall” encoding that triggers state machine entry to MBINIT.CAL. Mainband
training resumes from that point.
Figure 4-40. [Single-pass flow: RESET → SBINIT → MBINIT.PARAM, where the Management Transport path on the sideband is negotiated/initialized and the chiplet “stalls” for additional “Init” time (Phase 1, the MBINIT.Stall mechanism), then directly → MBINIT.CAL → ... as the mainband is trained (Phase 2), with no intermediate entry to TRAINERROR or RESET.]
A UCIe-S Sideband-only port will advertise the SFES bit in an {MBINIT.PARAM configuration req}
sideband message. If that negotiation is successful, the port advertises bit 2 (SO bit) in an
{MBINIT.PARAM SBFE req} sideband message to indicate that the port is a “Sideband-only port”. If
the port receives an {MBINIT.PARAM SBFE resp} sideband message with the SO bit set to 1, training
pauses on both sides at the exit of the MBINIT.PARAM phase until the next Management Reset or a
transition to the TRAINERROR state is triggered (see Section 4.5.3.8). State residency timeout is
disabled in MBINIT.PARAM. If the remote link partner did not set the SO bit in the {MBINIT.PARAM
SBFE resp} sideband message, the link goes to the TRAINERROR state. This is an SiP integration error
and is fatal. All links that support management transport over sideband (i.e., those links that set bit 0
to 1 in the {MBINIT.PARAM SBFE req} sideband message that they transmit) must set the SO bit to 1
in the {MBINIT.PARAM SBFE resp} sideband message that they transmit, provided that the
corresponding bit was set to 1 in the {MBINIT.PARAM SBFE req} sideband message that they
received.
4.5.3.3.2 MBINIT.CAL
This state is used to perform any calibration needed (e.g., Tx Duty Cycle correction, Receiver offset
and Vref calibration). This state is common for both Advanced Package and Standard Package.
1. The UCIe Module maintains tri-state on all its mainband Transmitters, and mainband Receivers
are permitted to be disabled in this state. The UCIe Module is permitted to perform
implementation specific steps for Transmitter and Receiver calibration.
2. The UCIe Module must send the {MBINIT.CAL Done req} sideband message, and then wait for a
response. If this message is received successfully, the UCIe Module Partner responds with the
{MBINIT.CAL Done resp} sideband message. Once the UCIe Module has sent and received
{MBINIT.CAL Done resp}, it must exit to MBINIT.REPAIRCLK.
4.5.3.3.3 MBINIT.REPAIRCLK
This sub-state is used to detect and apply repair (if needed) to clock and track Lanes for Advanced
Package and for functional check of clock and track Lanes for Standard Package.
Clock repair mapping is described in Section 4.3. Each clock, track and their redundant physical Lanes
(TCKP_P/RCKP_P, TCKN_P/RCKN_P, TTRK_P/RTRK_P, and TRDCK_P/RRDCK_P) are independently
checked to detect possible electrical opens or electrical shorts between the two clock pins. Single-
ended clock Receivers or an independent detection mechanism is required to ensure clock repair. The
UCIe Module must enable Transmitters and Receivers on Clock, Track and their redundant Lanes. All
other Transmitters are maintained in tri-state and Receivers are permitted to be disabled.
1. The UCIe Module sends the {MBINIT.REPAIRCLK init req} sideband message and waits for a
response. The UCIe Module Partner when ready to receive pattern on RCKP_L, RCKN_L, RTRK_L,
and RRDCK_L responds with {MBINIT.REPAIRCLK init resp}.
2. The UCIe Module must now send 128 iterations of clock repair pattern (16 clock cycles followed
by 8 cycles of low) on its TCKP_L only (TCKN_L, TTRK_L, and TRDCK_L are tri-stated). Clock
repair pattern must not be scrambled.
3. The UCIe Module Partner detects this pattern on RCKP_L, RCKN_L, RTRK_L, and RRDCK_L.
Detection is considered successful if at least 16 consecutive iterations of clock repair pattern are
detected. The UCIe Module Partner logs the detection result for RCKP_L, RCKN_L, RTRK_L, and
RRDCK_L.
4. The UCIe Module after completing pattern transmission sends {MBINIT.REPAIRCLK result req}
sideband message to get the logged result and waits for a response.
5. The UCIe Module Partner responds with {MBINIT.REPAIRCLK result resp} sideband message with
log result of RCKP_L, RCKN_L, RTRK_L, and RRDCK_L. If detection is successful on RCKP_L only
and not on any of RCKN_L, RTRK_L, and/or RRDCK_L, no repair is needed. Else if detection is
unsuccessful on any of RCKP_L, RCKN_L, RTRK_L, and RRDCK_L, repair is needed on the physical
Lane TCKP_P/RCKP_P. Else an electrical short is implied.
6. After receiving the {MBINIT.REPAIRCLK result resp} sideband message, the UCIe Module sends
the sideband message {MBINIT.REPAIRCLK init req} and waits for a response. The UCIe Module
Partner when ready to receive pattern on RCKP_L, RCKN_L, RTRK_L, RRDCK_L responds with
{MBINIT.REPAIRCLK init resp}.
7. After receiving the {MBINIT.REPAIRCLK init resp} sideband message, the UCIe Module must send
128 iterations of clock repair pattern (16 clock cycles followed by 8 cycles of low) on its TCKN_L
only. (TCKP_L, TTRK_L, and TRDCK_L are tri-stated)
8. The UCIe Module Partner detects this pattern on all RCKP_L, RCKN_L, RTRK_L, and RRDCK_L.
Detection is considered successful if at least 16 consecutive cycles of clock repair pattern are
detected. The UCIe Module Partner logs the detection result for RCKP_L, RCKN_L, RTRK_L, and
RRDCK_L.
9. The UCIe Module after completing the pattern transmission, sends {MBINIT.REPAIRCLK result
req} sideband message to get the logged result.
10. The UCIe Module Partner on receiving it responds with {MBINIT.REPAIRCLK result resp} sideband
message with logged result of RCKP_L, RCKN_L, RTRK_L, and RRDCK_L. If detection is successful
on RCKN_L and not on RCKP_L, RTRK_L, RRDCK_L, no repair is needed. Else if detection is
unsuccessful on any of RCKP_L, RCKN_L, RTRK_L, and RRDCK_L, repair is needed on the physical
Lane TCKN_P/RCKN_P. Else an electrical short is implied.
11. After receiving the {MBINIT.REPAIRCLK result resp} sideband message, the UCIe Module sends
the sideband message {MBINIT.REPAIRCLK init req}. The UCIe Module Partner when ready to
receive pattern on RCKP_L, RCKN_L, RTRK_L, RRDCK_L responds with {MBINIT.REPAIRCLK init
resp}.
12. After receiving the {MBINIT.REPAIRCLK init resp} sideband message, the UCIe Module sends 128
iterations of clock repair pattern (16 clock cycles followed by 8 cycles of low) on TRDCK_L only.
(TCKP_L, TTRK_L, and TCKN_L tri-stated)
13. The UCIe Module Partner detects this pattern on all RCKP_L, RCKN_L, RTRK_L, and RRDCK_L.
Detection is considered successful if at least 16 consecutive cycles of clock repair pattern are
detected. The UCIe Module Partner logs the detection result for RCKP_L, RCKN_L, RTRK_L, and
RRDCK_L.
14. The UCIe Module, after completing the pattern transmission, sends the {MBINIT.REPAIRCLK
result req} sideband message to get the logged result.
15. The UCIe Module Partner responds with {MBINIT.REPAIRCLK result resp} sideband message with
the logged result of RCKP_L, RCKN_L, RTRK_L, and RRDCK_L. If detection is successful only on
RRDCK_L and not on RCKP_L, RTRK_L, or RCKN_L, the physical Lane TRDCK_P/RRDCK_P is available
as a repair resource. Else if detection is unsuccessful on any of RCKP_L, RCKN_L, RTRK_L, and
RRDCK_L, the physical Lane TRDCK_P/RRDCK_P is not available as a repair resource. Else an electrical short
is implied.
16. After receiving the {MBINIT.REPAIRCLK result resp} sideband message, the UCIe Module sends
the sideband message {MBINIT.REPAIRCLK init req} and waits for a response. The UCIe Module
Partner when ready to receive pattern on RCKP_L, RCKN_L, RTRK_L, RRDCK_L responds with
{MBINIT.REPAIRCLK init resp}.
17. After receiving the {MBINIT.REPAIRCLK init resp} sideband message, the UCIe Module sends 128
iterations of clock repair pattern (16 clock cycles followed by 8 cycles of low) on TTRK_L only.
(TCKP_L, TCKN_L, and TRDCK_L are tri-stated).
18. The UCIe Module Partner detects this pattern on all RCKP_L, RCKN_L, RTRK_L, and RRDCK_L.
Detection is considered successful if at least 16 consecutive iterations of clock repair pattern are
detected. The UCIe Module Partner logs the detection result for RCKP_L, RCKN_L, RTRK_L, and
RRDCK_L.
19. The UCIe Module after completing pattern transmission sends {MBINIT.REPAIRCLK result req}
sideband message to get the logged result and waits for a response.
20. The UCIe Module Partner stops comparison and responds with {MBINIT.REPAIRCLK result resp}
sideband message with logged result of RCKP_L, RCKN_L, RTRK_L, and RRDCK_L. If detection is
successful only on RTRK_L and not on RCKP_L, RCKN_L, RRDCK_L, no repair is needed. Else if
detection is unsuccessful on any of RCKP_L, RCKN_L, RTRK_L, and RRDCK_L, repair is needed on
the physical Lane TTRK_P/RTRK_P. Else an electrical short is implied.
21. Clock or Track is unrepairable if any of the following are true:
— Repair is needed on any two of RCKP_L, RCKN_L, and RTRK_L
— Electrical short is detected
— RRDCK_L is unavailable for repair when repair is needed
If the clock or track is unrepairable, the UCIe Module and UCIe Module Partner must exit to
TRAINERROR after performing TRAINERROR handshake (see Section 4.5.3.8).
If repair is required on only one of the clock or track lanes and a repair resource is available, then
the UCIe Module applies repair on its clock/track Transmitter and sends the {MBINIT.REPAIRCLK
apply repair req} sideband message with repair information. If a repair is needed for one of the
clock or track pins (CKP, CKN, or TRK) and a repair resource is available, repair is applied as
described in Section 4.3. The UCIe Module Partner applies repair and sends {MBINIT.REPAIRCLK
apply repair resp} sideband message.
22. If a repair is applied, the UCIe Module must check that the repair succeeded by transmitting the
clock repair pattern and checking detection at the Receiver.
a. The UCIe Module sends sideband message {MBINIT.REPAIRCLK check repair init req} to
initiate check repair and waits for a response. The UCIe Module Partner responds with sideband
message {MBINIT.REPAIRCLK check repair init resp} and is ready to receive and check clock
repair pattern.
b. After receiving the {MBINIT.REPAIRCLK check repair init resp} sideband message, the UCIe
Module sends 128 iterations of clock repair pattern (16 clock cycles followed by 8 cycles of low)
on TCKP_L. The UCIe Module Partner detects this pattern on RCKN_L, RCKP_L, and RTRK_L.
Detection is considered successful if at least 16 consecutive iterations of clock repair pattern are
detected. The UCIe Module requests the check result using the sideband message
{MBINIT.REPAIRCLK check results req} and the UCIe Module Partner responds with the
sideband message {MBINIT.REPAIRCLK check results resp}. Repair is considered successful if
pattern is detected only on RCKP_L. If repair is unsuccessful, the UCIe Module and UCIe
Module Partner must exit to TRAINERROR after performing TRAINERROR handshake (see
Section 4.5.3.8).
c. Step a and Step b are repeated for TCKN_L and TTRK_L.
23. If repair is successful or repair is not required, the UCIe Module sends {MBINIT.REPAIRCLK done
req} sideband message and the UCIe Module Partner responds with {MBINIT.REPAIRCLK done
resp} sideband message. When the UCIe Module has sent and received {MBINIT.REPAIRCLK done
resp}, it must exit to MBINIT.REPAIRVAL.
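The result classification in Steps 5, 10, and 20 is compactly worded; the following C sketch captures one straightforward reading of it, together with the Step 21 unrepairability check (Step 15 is analogous, but decides repair-resource availability instead). This is illustrative code under stated assumptions, not specification text; the helper names and data layout are hypothetical.

```c
#include <stdbool.h>

/* det[] holds the logged detection results for RCKP_L, RCKN_L, RTRK_L, and
 * RRDCK_L after the pattern was driven on a single transmit Lane; expected
 * is the index of the one Receiver Lane that should have seen the pattern. */
enum clk_verdict { CLK_NO_REPAIR, CLK_REPAIR_NEEDED, CLK_SHORT_IMPLIED };

static enum clk_verdict classify_clock_lane(const bool det[4], int expected)
{
    bool leaked = false;
    for (int i = 0; i < 4; i++)
        if (i != expected && det[i])
            leaked = true;              /* pattern seen on another Lane */

    if (det[expected] && !leaked)
        return CLK_NO_REPAIR;           /* seen only where expected */
    if (!det[expected] && !leaked)
        return CLK_REPAIR_NEEDED;       /* seen nowhere: broken Lane */
    return CLK_SHORT_IMPLIED;           /* electrical short implied */
}

/* Step 21 unrepairability check: v[] holds the verdicts for CKP, CKN, and
 * TRK; rdck_available reflects the Step 15 outcome for TRDCK_P/RRDCK_P. */
static bool clock_unrepairable(const enum clk_verdict v[3], bool rdck_available)
{
    int repairs = 0;
    for (int i = 0; i < 3; i++) {
        if (v[i] == CLK_SHORT_IMPLIED)
            return true;                /* electrical short detected */
        if (v[i] == CLK_REPAIR_NEEDED)
            repairs++;
    }
    if (repairs >= 2)
        return true;                    /* two of CKP/CKN/TRK need repair */
    return (repairs == 1) && !rdck_available;
}
```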
For Standard Package, clock and track Lanes are checked for functional operation at the lowest data
rate. The sequence is as follows:
1. The UCIe Module sends the sideband message {MBINIT.REPAIRCLK init req} and waits for a
response. When ready to receive pattern on RCKP_L, RCKN_L, and RTRK_L, the UCIe Module
Partner responds with {MBINIT.REPAIRCLK init resp}. On receiving the sideband message
{MBINIT.REPAIRCLK init resp}, the UCIe Module sends 128 iterations of clock repair pattern
(16 clock cycles followed by 8 cycles of low) on TCKP_L, TCKN_L, and TTRK_L. Clock repair
pattern must not be scrambled.
2. The UCIe Module Partner detects this pattern on RCKP_L, RCKN_L, and RTRK_L. Detection is
considered successful if at least 16 consecutive iterations of clock repair pattern are detected. The
UCIe Module Partner logs the detection result for RCKP_L, RCKN_L, and RTRK_L.
3. After completing pattern transmission, the UCIe Module sends {MBINIT.REPAIRCLK result req}
sideband message to get the logged result.
4. The UCIe Module Partner stops comparison and responds with {MBINIT.REPAIRCLK result resp}
sideband message with logged result of RCKP_L, RCKN_L, and RTRK_L.
5. If detection is unsuccessful on any one of RCKP_L, RCKN_L, and RTRK_L, the UCIe Module and
UCIe Module Partner must exit to TRAINERROR after performing TRAINERROR handshake.
6. If detection is successful, the UCIe Module sends {MBINIT.REPAIRCLK done req} sideband
message and the UCIe Module Partner responds with {MBINIT.REPAIRCLK done resp} sideband
message. When the UCIe Module has sent and received the sideband message
{MBINIT.REPAIRCLK done resp}, it must exit to MBINIT.REPAIRVAL.
4.5.3.3.4 MBINIT.REPAIRVAL
The UCIe Module sets the clock phase at the center of the data UI. The UCIe Module Partner must
sample the received Valid with the received forwarded Clock. All Data Lanes must be held low during
this state. Track and Data Receivers are permitted to be disabled. When not performing the actions
relevant to this state:
• Clock Transmitters are held differential low (for differential clocking) or simultaneous low (for
Quadrature clocking)
• Clock Receivers are enabled
• When not transmitting the VALTRAIN pattern, the transmitters for TVLD_L and TRDVLD_L are
disabled (tri-stated)
• The receivers for RVLD_L and RRDVLD_L are enabled
This state can be used to detect and apply repair (if needed) to the Valid Lane for Advanced Package,
and for a functional check of the Valid Lane for Standard Package.
For Standard Package, Valid Lane is checked for functional operation at the lowest data rate.
Following is the flow:
1. The UCIe Module must send the sideband message {MBINIT.REPAIRVAL init req} and wait for
a response. The UCIe Module Partner when ready to receive pattern on RVLD_L, must respond
with {MBINIT.REPAIRVAL init resp}.
2. After receiving the sideband message {MBINIT.REPAIRVAL init resp}, the UCIe Module sends
128 iterations of VALTRAIN pattern (four 1’s followed by four 0’s) on TVLD_L along with the
forwarded clock.
3. The UCIe Module Partner detects this pattern on RVLD_L. Detection is considered successful
if at least 16 consecutive iterations of valid repair pattern are detected. The Receiver logs the
detection result for RVLD_L.
4. After completing pattern transmission, the UCIe Module must send {MBINIT.REPAIRVAL result
req} sideband message and wait to get the logged result.
5. The UCIe Module Partner must stop comparison and respond with {MBINIT.REPAIRVAL result
resp} sideband message with the result logged in the previous step.
6. If detection fails, the UCIe Module must exit to TRAINERROR after completing the TRAINERROR
handshake.
7. If detection is successful, the UCIe Module must send {MBINIT.REPAIRVAL done req} sideband
message and the UCIe Module Partner responds with {MBINIT.REPAIRVAL done resp}. When a
UCIe Module has sent and received {MBINIT.REPAIRVAL done resp} sideband message, it must
exit to MBINIT.REVERSALMB.
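As an illustration of the detection rule, the following C sketch is a byte-wise VALTRAIN detector, assuming the first transmitted bit of each iteration lands in bit 0 of a received byte, so that one iteration of four 1s followed by four 0s is the byte 0x0F. The helper and the byte packing are assumptions, not specification text.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

static bool valtrain_detected(const uint8_t *rx, size_t n)
{
    int run = 0;
    for (size_t i = 0; i < n; i++) {
        run = (rx[i] == 0x0Fu) ? run + 1 : 0;  /* consecutive iterations */
        if (run >= 16)
            return true;                       /* detection successful */
    }
    return false;
}
```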
4.5.3.3.5 MBINIT.REVERSALMB
This state is entered only if Clock and Valid Lanes are functional. In this state, Data Lane reversal is
detected. All the Transmitters and Receivers are enabled. The UCIe Module sets the forwarded clock
phase at the center of the data UI. The UCIe Module Partner must sample the incoming Data with the
incoming forwarded clock. The Track Transmitter is held low and the Track Receiver is permitted to be
disabled. When not performing the actions relevant to this state:
• Clock Transmitters are held differential low (for differential clocking) or simultaneous low (for
Quadrature clocking)
• Clock Receivers are enabled
• Data and Valid Transmitters are held low
• Data and Valid Receivers are enabled
A 16-bit “Per Lane ID” pattern, shown in Table 4-7, is a Lane-specific pattern using the Lane ID
described in Section 4.2.1. Examples of the “Per Lane ID” pattern for Lane 1 and Lane 31 are shown in
Table 4-8. When the “Per Lane ID” pattern is used, it must not be scrambled.
Table 4-7. “Per Lane ID” Pattern
  Bit:      0  1  2  3  | 4 through 11                   | 12 13 14 15
  Pattern:  0  1  0  1  | Lane ID[0] through Lane ID[7]a | 0  1  0  1
a. Note that bit 0 of Lane ID maps to bit 4 in the Per Lane ID pattern, bit 1 to bit 5 and so on until bit 7 to bit 11.

Table 4-8. “Per Lane ID” Pattern Examples for Lane 1 and Lane 31
  Bit:      0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
  Lane 1:   0 1 0 1 1 0 0 0 0 0 0  0  0  1  0  1
  Lane 31:  0 1 0 1 1 1 1 1 1 0 0  0  0  1  0  1
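The mapping in Table 4-7 can be expressed compactly in code. The following C sketch (illustrative only; the function name is hypothetical) builds the 16-bit pattern with bit 0 as the first transmitted bit, and reproduces the two rows of Table 4-8:

```c
#include <stdint.h>
#include <stdio.h>

/* Bits 0-3 and 12-15 carry 0101 (bit 0 transmitted first); bits 4-11
 * carry Lane ID[0] through Lane ID[7], per the footnote to Table 4-7. */
static uint16_t per_lane_id_pattern(uint8_t lane_id)
{
    uint16_t p = (1u << 1) | (1u << 3);       /* bits 0-3: 0 1 0 1 */
    for (int i = 0; i < 8; i++)               /* Lane ID[i] -> bit 4+i */
        p |= (uint16_t)(((lane_id >> i) & 1u) << (4 + i));
    p |= (1u << 13) | (1u << 15);             /* bits 12-15: 0 1 0 1 */
    return p;
}

int main(void)
{
    const uint8_t lanes[] = { 1u, 31u };
    for (int n = 0; n < 2; n++) {
        printf("Lane %2u:", (unsigned)lanes[n]);
        for (int bit = 0; bit < 16; bit++)    /* print bit 0 first */
            printf(" %u", (per_lane_id_pattern(lanes[n]) >> bit) & 1u);
        printf("\n");
    }
    return 0;
}
```

Running this prints exactly the Lane 1 and Lane 31 rows shown in Table 4-8.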
Following is the Reversal MB sequence for Advanced Package and Standard Package:
1. The UCIe Module must send the {MBINIT.REVERSALMB init req} sideband message. When ready
to receive “Per Lane ID” pattern and perform per-Lane pattern comparison, the UCIe Module
Partner must respond with {MBINIT.REVERSALMB init resp}.
2. On receiving the {MBINIT.REVERSALMB init resp} sideband message or entering from Step 8, the
UCIe Module must send {MBINIT.REVERSALMB clear error req} sideband message. Upon
receiving this message, the UCIe Module Partner clears any prior errors and responds with
{MBINIT.REVERSALMB clear error resp}. After receiving {MBINIT.REVERSALMB clear error resp},
the UCIe Module sends 128 iterations of Per Lane ID pattern (see Table 4-7) on all N data Lanes
with correct Valid framing on the Valid Lane (see Section 5.11 and Section 4.1.2) along with the
forwarded clock. Table 4-7 and Table 4-8 show examples of the Per Lane ID pattern. N is 68 (64
Data + 4 RD) for a x64 Advanced Package. N is 34 (32 Data + 2 RD) for a x32 Advanced Package.
N is 16 for a x16 Standard Package. N is 8 for a x8 Standard Package.
3. The UCIe Module Partner must perform a per-Lane compare on its Receivers on all N Lanes (see
Section 4.4). Detection on a Lane is considered successful if at least 16 consecutive iterations of
“Per Lane ID” pattern are detected. The UCIe Module Partner logs the detection result for its
Receiver Lanes to be used for Lane reversal Detection.
4. After sending 128 iterations of “Per Lane ID” pattern, the UCIe Module stops sending the pattern
and sends {MBINIT.REVERSALMB result req} sideband message to get the logged result.
5. The UCIe Module Partner stops comparison and must respond with {MBINIT.REVERSALMB result
resp} sideband message with per Lane result (see Table 7-11 for the message format).
6. If the majority of the Lanes show success (a majority rather than all, because some Lanes may
need repair), Lane reversal is not needed. Skip to Step 11. Note that if exactly 50% of the Lanes
show success, Lane reversal is applied (a decision sketch follows this list).
7. The UCIe Module applies Lane reversal on its Transmitters (see Section 4.2).
8. Following the Lane reversal application on its Transmitters, the UCIe Module repeats Step 2
through Step 5.
9. If the majority of Lanes show success, Lane reversal is needed. The Lane reversal setting is
preserved for the rest of the device operation. Skip to Step 11.
10. The UCIe Module must exit to TRAINERROR after completing the TRAINERROR handshake.
11. The UCIe Module must send {MBINIT.REVERSALMB done req} sideband message and the UCIe
Module Partner responds with {MBINIT.REVERSALMB done resp}. When the UCIe Module has sent
and received {MBINIT.REVERSALMB done resp} sideband message, it must exit to MBINIT.REPAIRMB.
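A minimal sketch of the Step 6/Step 9 decision, under the stated majority rule and the exact-50% tie-break (hypothetical helper, not specification text):

```c
#include <stdbool.h>

/* pass[] holds the per-Lane results from {MBINIT.REVERSALMB result resp}.
 * A strict majority of passing Lanes means reversal is not needed; per the
 * note in Step 6, an exact 50% split also selects reversal. */
static bool reversal_needed(const bool pass[], int n_lanes)
{
    int ok = 0;
    for (int i = 0; i < n_lanes; i++)
        if (pass[i])
            ok++;
    return !(2 * ok > n_lanes);  /* not a strict majority: apply reversal */
}
```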
If a x64 Advanced Package Module that supports interoperation with a x32 Advanced Package module
had received “UCIe-A x32” as 1b during parameter exchanges, it must recognize that it is connected
to a x32 Advanced Package Module and appropriately interpret the received {MBINIT.REVERSALMB
result resp} sideband message. In this scenario, the x64 Module applies Step 6 through Step 9 to the
lower 32-data-Lane set and 2-repair-Lane set. The x64 Module applies Lane reversal (if required)
within the lower 32-data-Lane set and 2-repair-Lane set.
If a x32 Advanced Package Module had received “UCIe-A x32” as 0b during parameter exchanges, it
must recognize that it is connected to a x64 Advanced Package Module and appropriately interpret the
received {MBINIT.REVERSALMB result resp} sideband message, looking for majority of success in the
lower 32 data lane set and 2 repair lane set of the x64 module (the x64 module will always place the
results of its receiver on the lower half of its data/repair lane set).
If a x16 Standard Package Module that supports interoperation with a x8 Standard Package Module
had its SPMW bit set to 1b OR has transmitted or received “UCIe-S x8” as 1b during parameter
exchanges, the x16 Standard Package Module must recognize that it needs to operate in x8 mode and
appropriately interpret the received {MBINIT.REVERSALMB result resp} sideband message. In this
scenario, the x16 Standard Package Module applies Step 6 through Step 9 to the lower-8 data-lane
set. Additionally, the x16 Standard Package Module applies Lane reversal (if required) within the
lower-8 data-lane set.
When a x8 Standard Package Module receives the {MBINIT.REVERSALMB result resp} sideband
message, the module must look for majority of success in the bits that correspond to the lower-8
data-lane set only.
4.5.3.3.6 MBINIT.REPAIRMB
This state is entered only after Lane reversal detection and application is successful. All the
Transmitters and Receivers on a UCIe Module are enabled. The UCIe Module sets the clock phase at
the center of the data UI on its mainband Transmitter for data Lanes (including the redundant Lanes
for Advanced Package). The UCIe Module Partner must sample the incoming Data with the incoming
forwarded clock on its mainband Receivers for data Lanes (including the redundant Lanes for
Advanced Package). The Track Transmitter is held low and the Track Receiver is permitted to be
disabled. When not performing the actions relevant to this state:
• Clock Transmitters are held differential low (for differential clocking) or simultaneous low (for
Quadrature clocking)
• Clock Receivers are enabled
• Data and Valid Transmitters are held low
• Data and Valid Receivers are enabled
In this state, the mainband Lanes are detected and repaired (if needed) for Advanced Package. In
this state, functional checks and width degrade (if needed) are performed for Standard Package.
4. If Lane repair is required and the necessary repair resources are available, the UCIe Module
applies repair on its mainband Transmitters for data Lanes as described in Section 4.3.1, and
sends {MBINIT.REPAIRMB Apply repair req} sideband message. Upon receiving this sideband
message, the UCIe Module Partner applies repair on its mainband Receivers for data Lanes as
described in Section 4.3.1, and sends {MBINIT.REPAIRMB Apply repair resp} sideband message.
Otherwise, if the number of Lane failures is more than the repair capability (see Section 4.3), the
mainband is unrepairable and the UCIe Module must exit to TRAINERROR after performing the
TRAINERROR handshake.
5. If repair is not required, perform Step 7.
6. If Lane repair is applied (Step 4), the applied repair is checked by the UCIe Module by repeating
Step 2 and Step 3. If post-repair Lane errors are logged in Step 3, the UCIe Module must exit to
TRAINERROR after performing the TRAINERROR handshake. If repair is successful, perform
Step 7.
7. The UCIe Module sends {MBINIT.REPAIRMB end req} sideband message and the UCIe Module
Partner responds with {MBINIT.REPAIRMB end resp}. When UCIe Module has sent and received
{MBINIT.REPAIRMB end resp}, it must exit to MBTRAIN.
For Standard Package, mainband is checked for functional operation at the lowest data rate.
Following is the sequence of steps:
1. The UCIe Module sends the {MBINIT.REPAIRMB start req} sideband message and waits for a
response. The UCIe Module Partner responds with the {MBINIT.REPAIRMB start resp} sideband
message when ready to receive the pattern on its mainband Receivers for data Lanes.
2. The UCIe Module performs Transmitter-initiated Data-to-Clock point test as described in
Section 4.5.
a. The transmit pattern must be set up to send 128 iterations of continuous mode “Per Lane ID”
Pattern. The Receiver must be set up to perform per Lane comparison.
b. Detection on a Receiver Lane is considered successful if at least 16 consecutive iterations of
“Per Lane ID” pattern are detected.
c. LFSR Reset has no impact in MBINIT.REPAIRMB.
3. The UCIe Module must send the {MBINIT.REPAIRMB apply degrade req} sideband message
indicating the functional Lanes on its Transmitter using one of the logical Lane map encodings
from Table 4-9. Encodings 100b and 101b can be used only when the SPMW bit is set to 1 or the
UCIe-S x8 bit was set to 1 in the transmitted or received {MBINIT.PARAM configuration req}
sideband message. If the remote Link partner indicated a width degrade in the functional Lanes,
the UCIe Module must apply the corresponding width degrade to its Receiver. If the remote Link
partner indicated all Lanes are functional (i.e. a Lane map code of 011b), the UCIe Module sets its
Transmitter and Receiver to the width corresponding to the functional Lane encoding determined
on its Transmitter. The UCIe Module sends the {MBINIT.REPAIRMB apply degrade resp} sideband
message after setting its Transmitter and Receiver lanes to the relevant width. If the width on the
Transmitter or Receiver has changed, both Link partners must repeat Step 2. If the width on the
Transmitter or Receiver has not changed, proceed to Step 4. If a “Degrade not possible” encoding
is sent or received in the {MBINIT.REPAIRMB apply degrade req} sideband message, the UCIe
Module must exit to TRAINERROR after performing the TRAINERROR handshake.
4. The UCIe Module sends the {MBINIT.REPAIRMB end req} sideband message and the UCIe Module
Partner responds with the {MBINIT.REPAIRMB end resp} sideband message. When the UCIe
Module has sent and received the {MBINIT.REPAIRMB end resp} sideband message, it must exit
to MBTRAIN.
Table 4-9. Logical Lane Map Encodings
  000b: Degrade not possible
  001b: Lane 0 - 7
  010b: Lane 8 - 15
  011b: Lane 0 - 15
  100b: Lane 0 - 3
  101b: Lane 4 - 7
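Assuming the per-Lane results of Step 2 are collected into an error mask, the Step 3 encoding choice for a x16 Standard Package Module can be sketched as follows (illustrative C; the names and the mask representation are assumptions, not specification text):

```c
#include <stdint.h>

enum lane_map {
    MAP_DEGRADE_NOT_POSSIBLE = 0x0,  /* 000b */
    MAP_LANES_0_7            = 0x1,  /* 001b */
    MAP_LANES_8_15           = 0x2,  /* 010b */
    MAP_LANES_0_15           = 0x3   /* 011b */
};

/* err_mask: bit i set means Lane i logged errors in the Step 2 point test. */
static enum lane_map choose_lane_map_x16(uint16_t err_mask)
{
    if (err_mask == 0)
        return MAP_LANES_0_15;           /* all Lanes functional */
    if ((err_mask & 0x00FFu) == 0)
        return MAP_LANES_0_7;            /* errors confined to Lanes 8-15 */
    if ((err_mask & 0xFF00u) == 0)
        return MAP_LANES_8_15;           /* errors confined to Lanes 0-7 */
    return MAP_DEGRADE_NOT_POSSIBLE;     /* errors in both halves */
}
```

The implementation note below exercises exactly this logic: Die A, with an error on Lane 1, advertises 010b, and Die B, with an error on Lane 10, advertises 001b.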
IMPLEMENTATION NOTE
Consider an example in which Die A is communicating with Die B over a Standard
Package UCIe Link.
During the first iteration of Step 2 of MBINIT.REPAIRMB, let’s say that Tx on Die A
detects errors on Lane ID 1 and not on any other Lanes, but Tx on Die B detects
errors on Lane ID 10 and not on any other Lanes. Thus, as per the rules in Step 3, Die
A sends {MBINIT.REPAIRMB apply degrade req} with a Lane map code of 010b.
Similarly, in Step 3, Die B sends {MBINIT.REPAIRMB apply degrade req} with a Lane
map code of 001b. The Rx on Die B disables Lanes 0 to 7, and the Tx on Die B tri-
states Lanes 8 to 15. The Rx on Die A disables Lanes 8 to 15, and the Tx on Die A tri-
states Lanes 0 to 7. Following the rules in Step 3, each die goes back and repeats
Step 2.
In this second iteration of Step 2, both Die know that some of the Lanes are disabled,
and they will ignore the information related to the disabled Lanes in {Tx Init D to C
results resp} (e.g., Die A will ignore the information related to Lanes 0 to 7 and
perform only the Transmitter-initiated Data to Clock point test on Lanes 8 to 15).
Let’s say that in the second iteration of Step 2, no errors are reported on the enabled
Lanes. In Step 3, Die A sends {MBINIT.REPAIRMB apply degrade req} with a Lane
map code of 010b. Similarly, in Step 3, Die B sends {MBINIT.REPAIRMB apply
degrade req} with a Lane map code of 001b. Because the width of the Tx and Rx
have not changed, both Die proceed to Step 4.
4.5.3.4 MBTRAIN
The MBTRAIN state is used to set up the operational speed and perform clock-to-data centering. At
higher speeds, additional calibrations such as Rx clock correction and Tx and Rx deskew may be
needed to ensure Link performance. MBTRAIN uses sub-states to perform all the required calibration and training. UCIe
Modules must enter each sub-state and the exit from each sub-state is coordinated between UCIe
Module Partners through sideband handshakes. If a particular action within a sub-state is not needed,
the UCIe Module is permitted to exit the sub-state through the relevant sideband handshake without
performing the described operations in that sub-state.
Devices enter this state once the MBINIT is completed. This state is common for Advanced and
Standard Packages.
4.5.3.4.1 MBTRAIN.VALVREF
Receiver reference voltage (Vref) to sample the incoming Valid is optimized in this state. The data
rate on the mainband continues to be at the lowest supported data rate (4 GT/s). The UCIe Module
Partner must set the forwarded clock phase at the center of the data UI on its mainband Transmitters.
The UCIe Module must sample the pattern on Valid signal with the forwarded clock. All data Lanes and
Track must be held low during Valid Lane reference voltage training. Track Receivers are permitted to
be disabled. When not performing the actions relevant to this state:
• Clock Transmitters are held differential low (for differential clocking) or simultaneous low (for
Quadrature clocking)
• Clock Receivers are enabled
2. UCIe Module optimizes Vref on its Valid Receiver by adjusting Receiver reference voltage and
performing one or more Receiver-initiated Data-to-Clock point tests (see Section 4.5.1.3) (AND/
OR) one or more Receiver-initiated Data-to-Clock eye width sweeps (see Section 4.5.1.4).
a. The transmit pattern must be set to send 128 iterations of continuous mode “VALTRAIN” (four
1s and four 0s) pattern (see Table 4-5). This pattern must not be scrambled. The Receiver must
be set up to perform comparison on the Valid Lane.
b. Detection on a Receiver Lane is considered successful if “VALTRAIN” pattern detection errors
are less than the set threshold (per Lane comparison threshold in Section 9.5.3.29).
c. It should be noted that LFSR RESET has no impact in MBTRAIN.VALVREF.
3. The UCIe Module must send {MBTRAIN.VALVREF end req} sideband message after the Vref
optimization is complete (one way to perform Vref optimization is to step through Vref settings and
perform Step 2 at each setting; see the sketch below). When {MBTRAIN.VALVREF end req} is received, the UCIe Module Partner must
respond with {MBTRAIN.VALVREF end resp}. When the UCIe Module has sent and received the
sideband message {MBTRAIN.VALVREF end resp}, it must exit to MBTRAIN.DATAVREF.
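One possible realization of the parenthetical in Step 3 is sketched below in C: sweep the reference voltage, run the Step 2 test at each setting, and keep the midpoint of the widest passing window. The algorithm is implementation-specific, and every name here is a hypothetical platform hook rather than a specification API.

```c
#include <stdint.h>

extern void     set_valid_rx_vref(unsigned code);  /* hypothetical hook */
extern uint32_t point_test_errors(void);           /* runs the Step 2 test */

#define VREF_CODES 64u  /* assumed resolution of the Vref setting */

static unsigned optimize_valid_vref(uint32_t err_threshold)
{
    unsigned start = 0, len = 0, best_start = 0, best_len = 0;
    for (unsigned code = 0; code < VREF_CODES; code++) {
        set_valid_rx_vref(code);
        if (point_test_errors() < err_threshold) { /* this Vref passes */
            if (len == 0)
                start = code;                      /* new window begins */
            len++;
            if (len > best_len) {
                best_len = len;
                best_start = start;
            }
        } else {
            len = 0;                               /* window broken */
        }
    }
    return best_start + best_len / 2u;             /* center of best window */
}
```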
4.5.3.4.2 MBTRAIN.DATAVREF
Receiver reference voltage (Vref) to sample the incoming data is optimized in this state. The data rate
on the UCIe Module mainband continues to be at the lowest supported data rate (4 GT/s). The
Transmitter sets the forwarded clock phase at the center of the data UI. The Track Transmitter is held
low and the Track Receiver is permitted to be disabled. When not performing the actions relevant to
this state:
• Clock Transmitters are held differential low (for differential clocking) or simultaneous low (for
Quadrature clocking)
• Clock Receivers are enabled
• Data and Valid Transmitters are held low
• Data and Valid Receivers are enabled
4.5.3.4.3 MBTRAIN.SPEEDIDLE
This is an electrical idle state to allow frequency change. Clock Transmitters are held differential low
(for differential clocking) or simultaneous low (for Quadrature clocking). Clock Receivers are enabled.
Data, Valid, and Track Transmitters are held low.
4.5.3.4.4 MBTRAIN.TXSELFCAL
The UCIe Module calibrates its circuit parameters independently of the UCIe Module Partner. Data,
Clock, Valid, and Track Transmitters are tri-stated. Data, Clock, Valid, and Track Receivers are
permitted to be disabled.
1. UCIe Module is permitted to perform implementation specific Transmitter-related calibration.
2. Upon completion of calibration, the UCIe Module must send the {MBTRAIN.TXSELFCAL Done req}
sideband message. When {MBTRAIN.TXSELFCAL Done req} sideband message is received, the
UCIe Module Partner must respond with {MBTRAIN.TXSELFCAL Done resp}. When the UCIe
Module has sent and received the {MBTRAIN.TXSELFCAL Done resp} sideband message, it must
exit to MBTRAIN.RXCLKCAL.
4.5.3.4.5 MBTRAIN.RXCLKCAL
In this state, Data and Valid Transmitters are held low (Data and Valid Receivers are permitted to be
disabled). When not performing the actions relevant to this state, if Strobe mode was advertised by
the UCIe Module partner, the Clock Transmitters are held differential low (for differential clocking) or
simultaneous low (for Quadrature clocking). When not performing the actions relevant to this state, if
continuous clock mode was advertised by the UCIe Module partner and the {MBTRAIN.RXCLKCAL
start req} sideband message has been received, then the Clock Transmitters are providing the free-
running forwarded clock.
1. The UCIe Module, when ready to perform calibration on its Clock receive path, sends the
{MBTRAIN.RXCLKCAL start req} sideband message. When the {MBTRAIN.RXCLKCAL start req}
sideband message is received, the UCIe Module Partner starts sending the forwarded clock and
Track. Subsequently, the UCIe Module Partner sends the {MBTRAIN.RXCLKCAL start resp}
sideband message. The Transmitter clock must be free running and all Data Lanes and Valid must
be held low. The UCIe Module is permitted to use the forwarded clock to perform any clock path-
related and Clock-to-Track-related calibration. The UCIe Module Partner must not adjust any
circuit or PI phase parameters on its Transmitters within this state.
2. When the required calibration (if any) is performed, the UCIe Module sends {MBTRAIN.RXCLKCAL
done req} sideband message. When {MBTRAIN.RXCLKCAL done req} is received, the UCIe
Module Partner stops sending forwarded clock and responds by sending {MBTRAIN.RXCLKCAL
done resp} sideband message. When a UCIe Module has sent and received {MBTRAIN.RXCLKCAL
done resp} sideband message, it must exit to MBTRAIN.VALTRAINCENTER.
4.5.3.4.6 MBTRAIN.VALTRAINCENTER
To ensure the valid signal is functional, valid to clock training is performed before the data Lane
training. The Receiver samples the pattern on Valid with the forwarded clock. Receiver reference
voltage is set to the optimized value achieved through Vref training (see Section 4.5.3.4.1 and
Section 4.5.3.4.2). All data and Track Transmitters are held low during valid to clock training. When
not performing the actions relevant to this state, if Strobe mode was advertised by the UCIe Module
partner, then the Clock Transmitters are held differential low (for differential clocking) or simultaneous
low (for Quadrature clocking). When not performing the actions relevant to this state, if continuous
clock mode was advertised by the UCIe Module partner, then the Clock Transmitters are providing the
free-running forwarded clock.
4.5.3.4.7 MBTRAIN.VALTRAINVREF
UCIe Module is permitted to optionally optimize the reference voltage (Vref) to sample the incoming
Valid at the operating data rate. All Data and Track Transmitters are held low during Valid-to-Clock
training. When not performing the actions relevant to this state, if Strobe mode was advertised by the
UCIe Module partner, then the Clock Transmitters are held differential low (for differential clocking) or
simultaneous low (for Quadrature clocking). When not performing the actions relevant to this state, if
continuous clock mode was advertised by the UCIe Module partner, then the Clock Transmitters are
providing the free-running forwarded clock.
1. The UCIe Module must send the sideband message {MBTRAIN.VALTRAINVREF start req}. When
{MBTRAIN.VALTRAINVREF start req} sideband message is received, the UCIe Module Partner
responds with {MBTRAIN.VALTRAINVREF start resp}.
2. UCIe Module optionally optimizes Vref by adjusting Receiver reference voltage on its Valid
Receiver and performing one or more Receiver-initiated Data-to-Clock eye width sweeps (see
Section 4.5.1.4) (AND/OR) one or more Receiver-initiated Data-to-Clock point tests (see
Section 4.5.1.3). Step 2 is optional and implementation-specific.
a. If Valid centering is performed, the transmit pattern must be set to send 128 iterations of
continuous mode “VALTRAIN” (four 1s and four 0s) pattern (see Table 4-5). This pattern must
not be scrambled. The Receiver must be set up to perform comparison on the Valid Lane.
b. Detection on a Receiver Lane is considered successful if “VALTRAIN” pattern detection errors
are less than set threshold (per Lane comparison threshold in Section 9.5.3.29).
c. It should be noted that LFSR RESET has no impact in MBTRAIN.VALTRAINVREF.
3. The UCIe Module must send {MBTRAIN.VALTRAINVREF end req} sideband message after the Vref
optimization is complete. When {MBTRAIN.VALTRAINVREF end req} is received, the UCIe Module
Partner must respond with {MBTRAIN.VALTRAINVREF end resp}. Once the UCIe Module has sent
and received the sideband message {MBTRAIN.VALTRAINVREF end resp}, it must exit to
MBTRAIN.DATATRAINCENTER1.
4.5.3.4.8 MBTRAIN.DATATRAINCENTER1
In this state, the UCIe Module performs Data-to-Clock training (including valid). LFSR patterns
described in Section 4.4.1 must be used in this state. The Track Transmitter is held Low. When not
performing the actions relevant to this state:
• Clock Receivers are enabled
• Data and Valid Transmitters are held low
• Data and Valid Receivers are enabled
• If Strobe mode was advertised by the UCIe Module partner, then the Clock Transmitters are held
differential low (for differential clocking) or simultaneous low (for Quadrature clocking)
• If continuous clock mode was advertised by the UCIe Module partner, then the Clock Transmitters
are providing the free-running forwarded clock
3. If the test is a success, the UCIe Module must set the clock phase to sample the data eye
at the optimal point to maximize eye margins. The UCIe Module must send
{MBTRAIN.DATATRAINCENTER1 end req} sideband message. When
{MBTRAIN.DATATRAINCENTER1 end req} sideband message is received, the UCIe Module Partner
responds with {MBTRAIN.DATATRAINCENTER1 end resp}. Once the UCIe Module has sent and
received the {MBTRAIN.DATATRAINCENTER1 end resp} sideband message, it must exit to
MBTRAIN.DATATRAINVREF.
4.5.3.4.9 MBTRAIN.DATATRAINVREF
UCIe Module is permitted to optionally optimize the reference voltage (Vref) on its data Receivers to
optimize sampling of the incoming Data at the operating data rate. The Track Transmitter is held Low.
When not performing the actions relevant to this state:
• Clock Receivers are enabled
• Data and Valid Transmitters are held low
• Data and Valid Receivers are enabled
• If Strobe mode was advertised by the UCIe Module partner, then the Clock Transmitters are held
differential low (for differential clocking) or simultaneous low (for Quadrature clocking)
• If continuous clock mode was advertised by the UCIe Module partner, then the Clock Transmitters
are providing the free-running forwarded clock
Note: It is possible that the eye opening in this step is insufficient (test fails) and a per-bit
deskew may be needed on the Receiver. Thus, the UCIe Module must exit to
MBTRAIN.RXDESKEW.
4.5.3.4.10 MBTRAIN.RXDESKEW
The UCIe Module is permitted to optionally perform per Lane deskew on its Receivers to improve
timing margin in this state. The Track Transmitter is held Low. When not performing the actions
relevant to this state:
• Clock Receivers are enabled
4.5.3.4.11 MBTRAIN.DATATRAINCENTER2
This state is needed for the UCIe Module to re-center the clock to the aggregate data in case the UCIe
Module Partner’s Receiver performed a per-Lane deskew. The Track Transmitter is held Low. When not
performing the actions relevant to this state:
• Clock Receivers are enabled
• Data and Valid Transmitters are held low
• Data and Valid Receivers are enabled
• If Strobe mode was advertised by the UCIe Module partner, then the Clock Transmitters are held
differential low (for differential clocking) or simultaneous low (for Quadrature clocking)
• If continuous clock mode was advertised by the UCIe Module partner, then the Clock Transmitters
are providing the free-running forwarded clock
a. The Transmitter must be set up to send 4K UI of continuous mode LFSR pattern described in
Section 4.4.1. The LFSR pattern on data must be accompanied by correct valid framing on the
Valid Lane as described in Section 4.1.2. The Receiver must be set up to perform per Lane
comparison.
b. Detection on a Receiver Lane is considered successful if total error count is less than the set
error threshold (see Section 9.5.3.29).
3. The UCIe Module uses the received training results to calculate the final eye center and set the
clock phase to sample the data eye at the optimal point to maximize eye margins. The UCIe
Module must send the {MBTRAIN.DATATRAINCENTER2 end req} sideband message. When
{MBTRAIN.DATATRAINCENTER2 end req} sideband message is received, the UCIe Module Partner
responds with {MBTRAIN.DATATRAINCENTER2 end resp}. Once UCIe Module has sent and
received {MBTRAIN.DATATRAINCENTER2 end resp} sideband message, it must exit to
MBTRAIN.LINKSPEED.
4.5.3.4.12 MBTRAIN.LINKSPEED
In this state, the UCIe Module checks Link stability at the operating data rate. The Track Transmitter is
held Low. When not performing the actions relevant to this state:
• Clock Receivers are enabled
• Data and Valid Transmitters are held low
• Data and Valid Receivers are enabled
• If Strobe mode was advertised by the UCIe Module partner, then the Clock Transmitters are held
differential low (for differential clocking) or simultaneous low (for Quadrature clocking)
• If continuous clock mode was advertised by the UCIe Module partner, then the Clock Transmitters
are providing the free-running forwarded clock
3. For single-Module instantiations, if errors are encountered, the UCIe Module sets its Transmitters
to an electrical idle state and sends {MBTRAIN.LINKSPEED error req} sideband message. If an
{MBTRAIN.LINKSPEED error req} sideband message is received, the UCIe Module Partner must
complete Step 1 and Step 2, evaluate the results and if not initiating an {MBTRAIN.LINKSPEED
exit to phy retrain req} sideband message, the UCIe Module Partner enters electrical idle on its
Receiver and sends the {MBTRAIN.LINKSPEED error resp} sideband message. If an
{MBTRAIN.LINKSPEED exit to phy retrain req} sideband message is received, the UCIe Module
must exit to PHYRETRAIN and send an {MBTRAIN.LINKSPEED exit to PHY retrain resp} sideband
message; any outstanding messages are abandoned. Otherwise, after the {MBTRAIN.LINKSPEED
error resp} sideband message is received, the PHY_IN_RETRAIN flag is cleared and the following
rules apply:
a. Based on the number of Lanes encountering errors, the UCIe Module checks whether the failing
Lanes can be repaired (for Advanced Package) or Width degraded (for Standard Package).
If Lanes can be repaired (for Advanced Package) or Width degraded (for Standard
Package), the UCIe Module must send {MBTRAIN.LINKSPEED exit to repair req} to the UCIe
Module Partner. The UCIe Module Partner, if not initiating a speed degrade, enters
MBTRAIN.REPAIR and sends the sideband message {MBTRAIN.LINKSPEED exit to repair resp}.
If {MBTRAIN.LINKSPEED exit to repair resp} is received in response to a
{MBTRAIN.LINKSPEED exit to repair req}, the UCIe Module must exit to MBTRAIN.REPAIR. If
a UCIe Module is initiating a speed degrade, it must not respond to {MBTRAIN.LINKSPEED exit
to repair req}.
b. If the Lanes cannot be repaired (for Advanced Package) or width degraded (for Standard
Package), the speed must be degraded. The UCIe Module sends {MBTRAIN.LINKSPEED exit to
speed degrade req} sideband message and waits for a response from the remote Link partner.
The UCIe Module Partner must respond with {MBTRAIN.LINKSPEED exit to speed degrade
resp}. Following this handshake, the UCIe Module must exit to MBTRAIN.SPEEDIDLE to set
data rate to next lower speed.
c. If the UCIe Module receives an {MBTRAIN.LINKSPEED exit to speed degrade req} any
outstanding {MBTRAIN.LINKSPEED exit to repair req} must be abandoned and the UCIe
Module must respond to {MBTRAIN.LINKSPEED exit to speed degrade req}.
d. Any outstanding {MBTRAIN.LINKSPEED done req} must be abandoned if a UCIe Module has
received a {MBTRAIN.LINKSPEED error req}.
4. For single- or multi-module instantiations, if no errors are encountered, the UCIe Module must set
the clock phase on its Transmitter to sample the data eye at the optimal point to maximize eye
margins. If PHY_IN_RETRAIN is not set, then for single-module instantiations, proceed to Step 6;
for multi-module instantiations, the UCIe Module must send the
{MBTRAIN.LINKSPEED done req} (if not waiting for Link match criteria as a Retimer) to the
remote UCIe Module Partner and wait for multi-module PHY Logic (MMPL) resolution in Step 5c. If
the PHY_IN_RETRAIN variable is set, the following actions must be taken:
a. If a change is detected in Runtime Link Testing Control register relative to the values at
previous PHYRETRAIN entry, the UCIe Module must send {MBTRAIN.LINKSPEED exit to phy
retrain req} and wait for a response. Upon receiving this message, the UCIe Module Partner
must exit to PHY retrain and send {MBTRAIN.LINKSPEED exit to PHY retrain resp}. Once this
sideband message is received, the UCIe Module must exit to PHY retrain.
b. Else if no change is detected in the Runtime Link Testing Control register relative to the values
at previous PHYRETRAIN entry, Busy bit in Runtime Link Testing Status and PHY_IN_RETRAIN
variable must be cleared and the UCIe Module must proceed to Step 6.
5. For multi-module instantiations, if errors are encountered, the UCIe Module sets its Transmitters
to an electrical idle state and sends {MBTRAIN.LINKSPEED error req} sideband message. If an
{MBTRAIN.LINKSPEED error req} sideband message is received, the UCIe Module Partner must
complete Step 1 and Step 2, evaluate the results and if not initiating an {MBTRAIN.LINKSPEED
exit to phy retrain req} sideband message, the UCIe Module Partner enters electrical idle on its
Receiver, and sends the {MBTRAIN.LINKSPEED error resp} sideband message. If an
{MBTRAIN.LINKSPEED exit to phy retrain req} sideband message is received, the UCIe Module
must exit to PHYRETRAIN and send an {MBTRAIN.LINKSPEED exit to PHY retrain resp} sideband
message; any outstanding messages are abandoned. Otherwise, after the {MBTRAIN.LINKSPEED
error resp} sideband message is received, the PHY_IN_RETRAIN flag is cleared and the following
rules apply:
a. Based on the number of Lanes encountering errors, the UCIe Module checks whether the failing
Lanes can be repaired (for Advanced Package) or Width degraded (for Standard Package). If
Lanes can be repaired (for Advanced Package) or Width degraded (for Standard
Package), the UCIe Module must send {MBTRAIN.LINKSPEED exit to repair req} to the UCIe
Module Partner.
b. If the Lanes cannot be repaired (for Advanced Package) or width degraded (for Standard
Package), the speed must be degraded. The UCIe Module sends {MBTRAIN.LINKSPEED exit to
speed degrade req}.
c. The UCIe Module informs MMPL of local and remote error requests, done requests, or speed
degrade requests, and waits for resolution. It also informs MMPL of any prior width degrade
(for example in MBINIT.REPAIRMB), and MMPL treats this as the corresponding module
requesting width degrade from the full operational width.
d. Based on the resolution flow chart in Section 4.7, MMPL directs each Module to send either the
{MBTRAIN.LINKSPEED exit to repair resp} (indicating next state is REPAIR),
{MBTRAIN.LINKSPEED exit to speed degrade resp} (indicating next state is SPEEDIDLE with
target speed to next-lower speed), {MBTRAIN.LINKSPEED multi-module disable module resp}
(indicating next state is TRAINERROR and eventually RESET), or {MBTRAIN.LINKSPEED done
resp} (indicating next state is LINKINIT). This is done regardless of the module’s original error
request or done request, and indicates the result of the resolution and next state to each
module. The UCIe Module transitions to next state once it has sent and received the sideband
response message that matches the expected resolution. Any mismatch on received message
vs. expected resolution must take all modules to TRAINERROR. For Retimer dies, the resolution
must take into account any Link match requirements, and while resolving the target
configuration with remote Retimer partner, each UCIe Module from the Retimer die must send
{MBTRAIN.LINKSPEED done resp} with stall encoding every 4 ms. The UCIe Retimer must
ensure that this stall is not perpetual, and an implementation-specific timeout must be included
in the Retimer. If {MBTRAIN.LINKSPEED done resp} with stall encoding is received, it must
reset timers for state transition as well as any outstanding handshakes for multi-module
resolution.
6. If the UCIe die is not a Retimer, proceed to Step 7. If the UCIe die is a Retimer, the following rules
apply to achieve Link match (if required):
a. A Retimer must not send {MBTRAIN.LINKSPEED done req} unless the target Link speed and
width of the remote Retimer partner resolve to the current Link speed and width. Proceed to
Step 7 if Link match is achieved or if it is not required.
b. While resolving the target Link speed and width with the remote Retimer partner, if a Retimer
has received an {MBTRAIN.LINKSPEED done req}, it must send {MBTRAIN.LINKSPEED done
resp} with stall encoding every 4 ms. UCIe Retimer must ensure that this stall is not perpetual,
and an implementation specific timeout must be included in the Retimer.
c. If the local UCIe Link speed or width is greater than that of the remote Retimer UCIe Link, the
Retimer must treat this as an error condition, and perform Step 3 or Step 5 with repair or speed
degrade (whichever is applicable).
7. The UCIe Module must send {MBTRAIN.LINKSPEED done req} sideband message. When
{MBTRAIN.LINKSPEED done req} is received, the UCIe Module must respond with
{MBTRAIN.LINKSPEED done resp} and when a UCIe Module has sent and received the
{MBTRAIN.LINKSPEED done resp} sideband message, both Transmitters and Receivers are now
enabled and idle and both devices exit to LINKINIT.
4.5.3.4.13 MBTRAIN.REPAIR
This state can be entered from PHYRETRAIN or from MBTRAIN.LINKSPEED. For Advanced Package, this
state is used to apply repair, and for Standard Package, this state is used for Link width
degrade. Track, Data, and Valid Transmitters are held low. Clock Transmitters are held differential low
(for differential clocking) or simultaneous low (for Quadrature clocking).
For Advanced Package, if the number of repair resources currently available is greater than the
number of Lanes encountering errors, repair must be applied:
1. The UCIe Module sends the sideband message {MBTRAIN.REPAIR init req} for its Transmitter and
the UCIe Module Partner responds with {MBTRAIN.REPAIR init resp}.
2. If Lane repair is possible, the UCIe Module applies repair on its Transmitter Lanes as described in
Section 4.3.1 and sends {MBTRAIN.REPAIR Apply repair req} sideband message. The UCIe
Module Partner applies repair as described in Section 4.3.1 and responds with {MBTRAIN.REPAIR
Apply repair resp} sideband message once the required repair is applied.
3. The UCIe Module must send {MBTRAIN.REPAIR end req} sideband message and wait for a
response. The UCIe Module Partner must then respond with {MBTRAIN.REPAIR end resp}. When
a UCIe Module has sent and received {MBTRAIN.REPAIR end resp}, it must exit to
MBTRAIN.TXSELFCAL.
For a x16 Standard package, if the Lanes encountering errors are all contained within the set of
Lane 0 through Lane 7 or Lane 8 through Lane 15, the width must be degraded to a x8 Link using the
set of Lanes without errors (Lane 0 ... Lane 7 OR Lane 8 ... Lane 15). Likewise, for a x8 Standard
package, if the Lanes encountering errors are all contained within the set of Lane 0 through Lane 3
or Lane 4 through Lane 7, the width must be degraded to a x4 Link using the set of Lanes without
errors (Lane 0 through Lane 3 or Lane 4 through Lane 7).
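Before the sequence below applies the degrade, the eligibility rule above can be sketched in C, assuming per-Lane errors are collected into a mask (hypothetical helper, not specification text):

```c
#include <stdbool.h>
#include <stdint.h>

/* err_mask: bit i set means Lane i encountered errors. width is the current
 * module width (16 or 8). Degrade is possible only if all errors are
 * confined to one half of the current width. */
static bool degrade_possible(uint16_t err_mask, unsigned width)
{
    uint16_t lower = (uint16_t)((1u << (width / 2u)) - 1u); /* e.g., 0x00FF */
    uint16_t upper = (uint16_t)(lower << (width / 2u));     /* e.g., 0xFF00 */
    return ((err_mask & lower) == 0) || ((err_mask & upper) == 0);
}
```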
1. The UCIe Module sends the sideband message {MBTRAIN.REPAIR init req} and the UCIe Module
Partner responds with {MBTRAIN.REPAIR init resp}.
2. The UCIe Module must send the {MBTRAIN.REPAIR apply degrade req} sideband message,
indicating the functional Lanes on its Transmitter using one of the logical Lane map encodings
from Table 4-9. Encodings 100b and 101b can be used only when the SPMW bit is set to 1 or the
UCIe-S x8 bit was set to 1 in the transmitted or received {MBINIT.PARAM configuration req}
sideband message. If the remote Link partner indicated a width degrade in the functional Lanes,
the UCIe Module must apply the corresponding width degrade to its Receiver. If the remote Link
partner indicated all Lanes are functional, the UCIe Module sets its Transmitter and Receiver to
the logical lane map corresponding to the functional Lane encoding determined on its Transmitter.
The UCIe Module sends the {MBTRAIN.REPAIR apply degrade resp} sideband message after
setting its Transmitter and Receiver lanes to the relevant logical lane map and proceeds to Step 3
if a degrade is possible or if all Lanes are functional. If a “Degrade not possible” encoding is sent
or received in the {MBTRAIN.REPAIR apply degrade req} sideband message, the UCIe Module
must exit to TRAINERROR after performing the TRAINERROR handshake.
3. The UCIe Module must send the {MBTRAIN.REPAIR end req} sideband message and wait for a
response. The UCIe Module Partner must then respond with the {MBTRAIN.REPAIR end resp}
sideband message. When UCIe Module has sent and received the {MBTRAIN.REPAIR end resp}
sideband message, the UCIe Module must exit to MBTRAIN.TXSELFCAL.
4.5.3.5 LINKINIT
This state is used to allow the Die-to-Die Adapter to complete initial Link management before entering
Active state on RDI. See Section 10.1.6 for more details on the RDI bring-up flow. Track, Data, and Valid
Transmitters are held low. If Strobe mode was advertised by the UCIe Module partner, then the Clock
Transmitters are held differential low (for differential clocking) or simultaneous low (for Quadrature
clocking). If continuous clock mode was advertised by the UCIe Module partner, then the Clock
Transmitters are providing the free-running forwarded clock. Clock Receivers are enabled.
Once RDI is in Active state, the PHY will clear its copy of “Start UCIe Link training” bit from UCIe Link
control register.
This state is common for Advanced Package and Standard Package configurations.
4.5.3.6 ACTIVE
Physical layer initialization is complete, RDI is in Active state and packets from upper layers can be
exchanged between the two dies.
All data in this state is scrambled using the scrambler LFSR described in Section 4.4.1. Clock gating
rules as described in Section 5.11 apply.
This state is common for Advanced Package and Standard Package configurations.
4.5.3.7 PHYRETRAIN
A die can enter PHY retrain for a number of reasons. Track, Data, and Valid Transmitters are held low.
Clock Transmitters are held differential low (for differential clocking) or simultaneous low (for
Quadrature clocking). The trigger for PHY to enter PHY retrain is one of the following scenarios:
• Adapter directed PHY retrain: Adapter can direct the PHY to retrain for any reason it deems
necessary (see Section 10.3.3.4 Retrain State rules for more details and examples of Adapter-
initiated Retrain requests).
• PHY initiated PHY retrain: Local PHY must initiate retrain on detecting a Valid framing error.
• Remote die requested PHY retrain: Local PHY must enter PHY retrain on receiving a request from
the remote die.
• If a change is detected in Runtime Link Testing Control register during MBTRAIN.LINKSPEED.
Table 4-10. Runtime Link Test Status Register based Retrain encoding
(All listed retrain causes, including No Repair, No Lane errors, and Valid framing errors detected by
the PHY, re-enter training at MBTRAIN.TXSELFCAL.)
4.5.3.8 TRAINERROR
This state is used as a transitional state due to any fatal or non-fatal events that need to bring the state
machine back to RESET state. This can happen during initialization and training or if “Start UCIe Link
training” bit from UCIe Link control register is set when state machine is not in RESET. It is also used
for any events that transition the Link from a Link Up to a Link Down condition. Data, Valid, Clock,
and Track transmitters are tri-stated, and their receivers are permitted to be disabled.
The exit from TRAINERROR to RESET is implementation specific. For cases when there is no error
escalation (i.e., RDI is not in LinkError), it is recommended to exit TRAINERROR as soon as possible.
For cases when there is error escalation (i.e., RDI is in LinkError), it is required for Physical Layer to
be in TRAINERROR as long as RDI is in LinkError. To avoid problems with entering RESET while
transmitting sideband packets, any in-progress sideband packets must finish transmission before
entering RESET state.
See Chapter 10.0 for correctable, non-fatal, and fatal error escalation on RDI.
This state is common for Advanced Package and Standard Package configurations.
If sideband is Active, a sideband handshake must be performed to enter TRAINERROR state from any
state other than SBINIT. The following is defined as the TRAINERROR handshake:
• The UCIe Module requesting exit to TRAINERROR must send {TRAINERROR Entry req} sideband
message and wait for a response. The UCIe Module Partner must exit to TRAINERROR and
respond with {TRAINERROR Entry resp}. Once {TRAINERROR Entry resp} sideband message is
received, the UCIe Module must exit to TRAINERROR. If no response is received for 8 ms, the
LTSM transitions to TRAINERROR.
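The handshake and its 8 ms timeout can be summarized in a short sketch; all three functions below are hypothetical platform hooks, not specification APIs.

```c
#include <stdbool.h>

extern void sb_send_trainerror_entry_req(void);
extern bool sb_wait_trainerror_entry_resp(unsigned timeout_us);
extern void ltsm_enter_trainerror(void);

static void request_trainerror_entry(void)
{
    sb_send_trainerror_entry_req();           /* {TRAINERROR Entry req} */
    /* Wait for {TRAINERROR Entry resp}; whether it arrives or the 8 ms
     * timeout expires, the LTSM transitions to TRAINERROR. */
    (void)sb_wait_trainerror_entry_resp(8000u);
    ltsm_enter_trainerror();
}
```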
4.5.3.9 L1/L2
These PM states allow a lower power state than dynamic clock gating in ACTIVE. Data, Valid, Clock,
Track transmitters are tri-stated, and their receivers are permitted to be disabled.
• This state is entered when RDI has transitioned to PM state as described in Chapter 10.0. The PHY
power saving features in this state are implementation specific.
• When the local Adapter requests Active on RDI or the remote Link partner requests L1 exit, the
PHY must exit to MBTRAIN.SPEEDIDLE. L1 exit is coordinated with the corresponding L1 state exit
transitions on RDI.
• When the local Adapter requests Active on RDI or the remote Link partner requests L2 exit, the
PHY must exit to RESET. L2 exit is coordinated with the corresponding L2 state exit transitions on RDI.
Figure 4-44. Example of Width Degradation with Byte Mapping for Differing Module IDs
Each module in a multi-module Link must operate at the same width and speed. During initialization
or retraining, if any module failed to train, the MMPL must ensure that the multi-module configuration
degrades to the next permitted configuration for width or speed degrade (see Figure 4-46 and
Figure 4-47). Subsequently, any differences in width and speed between the different modules must
be resolved using the following rules:
1. For Standard package multi-module configuration, if width degrade is reported for any of the
modules:
a. If less than or equal to half the number of modules report width degrade at the current Link
speed, the corresponding Modules must be disabled. The MMPL must ensure that the multi-
module configuration degrades to the next permitted configuration for width or speed degrade
(see Figure 4-46 and Figure 4-47). For example, if three out of four modules are active, the
MMPL must degrade the Link to a two-module configuration.
b. If the majority of modules report width degrade at the current Link speed, see the pseudo code
below:
2. For Advanced or Standard package multi-module configuration, if any Module reports speed
difference, see the pseudo code below:
IF modules report speed difference:
    CMLS: Common Maximum Link Speed
    HMLS: Highest Maximum Link Speed of next lower configuration
    IF HMLS/2 > CMLS:
        Modules degrade to next lower configuration
    ELSE:
        Speed for all modules degrades to CMLS
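A literal transcription of this rule in C (computing CMLS and HMLS is left to the MMPL; the helper is illustrative only, not specification text):

```c
#include <stdbool.h>

/* Speeds in GT/s. Returns true when the modules must drop to the next lower
 * module-count configuration, false when all modules instead degrade their
 * speed to CMLS. */
static bool degrade_to_lower_module_count(int hmls, int cmls)
{
    return hmls / 2 > cmls;
}
```

For example, with HMLS = 16 GT/s and CMLS = 12 GT/s (the values in Example 2 later in this section), 8 > 12 is false, so all modules degrade their speed to 12 GT/s.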
Figure 4-46 and Figure 4-47 provide a consolidated view of the above two rules as a flow chart that
Advanced Package and Standard Package implementations, respectively, must follow. Note that the
“Yes” condition for HMLS/2 > CMLS question is there to cover the base case of 4 GT/s. In other words,
if some module(s) passed MBINIT but failed 4 GT/s in LinkSpeed, then the “Yes” arc will result in
module disable instead of TrainError (because CMLS will be 0 for that) and provide the opportunity to
remain operational at 4 GT/s for the modules that were still operational at 4 GT/s.
Figure 4-46 and Figure 4-47. Resolution flow charts for Advanced Package and Standard Package
multi-module configurations, respectively. The charts branch on whether any enabled module reports
errors in LINKSPEED, whether more than half the modules report errors with all erroring modules
reporting width degrade, whether BW(M, (CLS-1)) > BW(M/2, CLS), and whether HMLS/2 > CMLS; the
possible outcomes are transition to LINKINIT, Repair, Width Degrade, Speedidle, disabling the
modules reporting speed degrade to reach the next lower module-count configuration, or TRAINERROR.
Other sideband packets are sent over a single sideband interface. These include Register Access
packets (requests as well as completions), all the non-vendor-defined messages in Table 7-8
and Table 7-10, and any vendor-defined messages that were defined as such. A device must send
these sideband packets on the sideband interface of the numerically least Module ID whose LTSM is
not in RESET or SBINIT. A packet sent on a given Module ID could be received on a different Module
ID on the sideband Receiver.
Similarly, Retimer credits are returned on the Valid signal of the numerically least Module ID whose
LTSM is in Active state. Credits sent on a given Module ID could be received on a different Module ID
on the remote Link partner.
The synchronization orchestration by MMPL happens in the MBTRAIN.LINKSPEED state based on the
rules outlined in Section 4.7.1. As outlined in Section 4.5.3.4.12, after Step 2 of MBTRAIN.LINKSPEED
has completed and PHY_IN_RETRAIN is not set:
• If no errors are encountered for a module, an {MBTRAIN.LINKSPEED done req} is sent to the
remote Link partner module
• If errors are encountered for a module, an {MBTRAIN.LINKSPEED error req}/
{MBTRAIN.LINKSPEED error resp} handshake is performed followed by sending an
{MBTRAIN.LINKSPEED exit to repair req} or an {MBTRAIN.LINKSPEED exit to speed degrade req}
on that module’s sideband.
The individual modules notify the MMPL of the sent and received information from these sideband
messages. MMPL collects this information from all the modules which are operational in the Link and
determines the next state based on resolution of the rules outlined in Section 4.7.1. The simplest
case, without errors, is when all modules have sent and received the {MBTRAIN.LINKSPEED done req}
message with no change to Link width. In this scenario, the MMPL directs them to proceed to Step 6
of MBTRAIN.LINKSPEED.
The following sections cover a few examples of this resolution for a Link with four modules and
Standard Package configuration where errors were encountered.
In this example, the four modules of the UCIe Link are in MBTRAIN.LINKSPEED at 8 GT/s. Table 4-13
shows the exchanged messages for one die (in the case where errors are encountered, it is assumed
that the {MBTRAIN.LINKSPEED error req}/ {MBTRAIN.LINKSPEED error resp} has completed before
the messages shown). Because the resolution is consistent in using the sent and received messages,
both die of the Link will reach the same resolution.
Table 4-13. Messages exchanged that are used to determine resolution for Example 1
In this example, 3 out of the 4 modules have either sent or received the {MBTRAIN.LINKSPEED exit
to repair req} message, which indicates a width degrade for Standard Package configurations. The
value of “CLS” (Current Link Speed) is 8 GT/s, and the value of “CLS-1” is 4 GT/s. The value of “M”
(Number of Active Modules) is 4. Because BW (4 Links at 4 GT/s) is not greater than BW (2 Links at 8
GT/s), the flow chart in Figure 4-46 would result in the MMPL notifying all the modules to proceed
with a width degrade by moving to MBTRAIN.REPAIR as the next state (i.e., {MBTRAIN.LINKSPEED
exit to repair resp} will be sent on each Module). Note that “CLS” and “M” are re-computed using the updated information every time the Link is in MBTRAIN.LINKSPEED and there is a corresponding MMPL resolution. Width degrade is applied per module following the steps in MBTRAIN.REPAIR for every
module in this UCIe Link. For the module where no errors were encountered, the transmitter is
permitted to pick either of Lanes 0 to 7 or Lanes 8 to 15 as the operational Lanes when transmitting
the {MBTRAIN.REPAIR apply degrade req} to the remote Link partner module. Following the exit from
MBTRAIN.REPAIR, the training continues through the substates of MBTRAIN and in the next iteration
of MBTRAIN.LINKSPEED, if no errors are encountered, MMPL will direct the modules to proceed to
Step 6 of MBTRAIN.LINKSPEED.
Note that this example covers the case where errors occurred during the LINKSPEED state. If the width of a module is already lower than that of the rest of the operational modules that are part of a multi-module Link (e.g., if a module had degraded width during MBINIT.REPAIRMB itself), it may have sent and received {MBTRAIN.LINKSPEED done req} during LINKSPEED. However, from an MMPL resolution perspective, MMPL must treat this as a Module reporting errors requiring width degrade. This is because a multi-module Link requires all modules to operate at the same width and speed.
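The width-versus-speed comparison used in this example can be written compactly. The following is a minimal illustrative sketch of the Figure 4-46 bandwidth check (the function and variable names are assumptions, not part of this specification):

    def speed_degrade_preferred(m, cls, cls_minus_1):
        """Figure 4-46 check: BW(M, (CLS-1)) > BW(M/2, CLS)?
        True  -> degrade speed, keeping all M modules at full width.
        False -> degrade width (MBTRAIN.REPAIR), keeping the current speed."""
        return m * cls_minus_1 > (m / 2) * cls

    # Example 1: M = 4 modules, CLS = 8 GT/s, CLS-1 = 4 GT/s.
    # BW(4 modules at 4 GT/s) = 16 is not greater than BW(2 modules at 8 GT/s) = 16,
    # so the MMPL resolves to a width degrade via MBTRAIN.REPAIR.
    assert not speed_degrade_preferred(4, 8, 4)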
In this example, the four modules of the UCIe Link are in MBTRAIN.LINKSPEED at 16 GT/s. Table 4-14
shows the exchanged messages for one die (in the case where errors are encountered, it is assumed
that the {MBTRAIN.LINKSPEED error req}/ {MBTRAIN.LINKSPEED error resp} has completed before
the messages shown). Because the resolution is consistent in using the sent and received messages,
both die of the Link will reach the same resolution.
Table 4-14. Messages exchanged that are used to determine resolution for Example 2
In this example, Module 3 has received a message indicating that the remote partner wants to speed
degrade. “CMLS” always maps to the next degraded Link speed, and so in this case “CMLS” is 12 GT/s. “HMLS” always maps to the current Link speed, and so in this case it is 16 GT/s. Because
Module 3 received a speed degrade request, following the flow chart in Figure 4-47, this would result
in MMPL notifying all the modules to proceed with a speed degrade by moving to
MBTRAIN.SPEEDIDLE (i.e. {MBTRAIN.LINKSPEED exit to speed degrade resp} will be sent on each
Module). Following the exit from MBTRAIN.SPEEDIDLE, the training continues through the substates
of MBTRAIN and in the next iteration of MBTRAIN.LINKSPEED, if no errors are encountered, MMPL will
direct the modules to proceed to Step 6 of MBTRAIN.LINKSPEED. Note that “CMLS” and “HMLS” are re-computed using the updated information every time the Link is in MBTRAIN.LINKSPEED and there is a corresponding MMPL resolution. In this example, for the next iteration, CMLS will be 8 GT/s and HMLS will be 12 GT/s.
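The CMLS/HMLS progression in this example can be sketched as follows (a minimal illustration; the speed-ladder list and function names are assumptions, not part of this specification):

    SUPPORTED_SPEEDS_GTS = [4, 8, 12, 16, 24, 32]  # assumed speed ladder

    def next_degrade(current_gts):
        """One speed-degrade iteration: HMLS maps to the current Link speed
        and CMLS to the next lower supported speed."""
        i = SUPPORTED_SPEEDS_GTS.index(current_gts)
        if i == 0:
            raise ValueError("already at the lowest supported speed")
        return current_gts, SUPPORTED_SPEEDS_GTS[i - 1]  # (HMLS, CMLS)

    # Example 2: at 16 GT/s, (HMLS, CMLS) = (16, 12); after degrading to
    # 12 GT/s, the next iteration yields (12, 8).
    assert next_degrade(16) == (16, 12)
    assert next_degrade(12) == (12, 8)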
In this example, the four modules of the UCIe Link are in MBTRAIN.LINKSPEED at 16 GT/s. Table 4-15
shows the exchanged messages for one die (in the case where errors are encountered, it is assumed
that the {MBTRAIN.LINKSPEED error req}/ {MBTRAIN.LINKSPEED error resp} has completed before
the messages shown). Because the resolution is consistent in using the sent and received messages,
both die of the Link will reach the same resolution.
Table 4-15. Messages exchanged that are used to determine resolution for Example 3
Because less than half of the modules are reporting errors and requesting a width degrade, as per the flow chart in Figure 4-47, MMPL would take the configuration to a two-module configuration. As per
the rules in Section 5.7.3.4.1, Module 0 and Module 1 would send the {MBTRAIN.LINKSPEED multi-
module disable module resp} to take these modules to TRAINERROR and RESET. Module 2 and
Module 3 would send the {MBTRAIN.LINKSPEED done resp} to take them to LINKINIT.
One example of interoperation between UCIe-A x64 and UCIe-A x32 is when the UCIe-A x64 Stack
(including RDI and FDI maximum throughput) is bandwidth-matched (Full Width Mode) by the remote
Link partner’s maximum throughput for a given interface. Figure 4-48 shows an example of two x64
modules that are capable of operating as two independent UCIe stacks with independent Adapters and Protocol Layers (bypassing the MMPL logic, configuration (a) in Figure 4-48) and are also capable of operating as a multi-module configuration when connected to a corresponding multi-module configuration of x32 Advanced Package to achieve the equivalent bandwidth of a single x64 module (configuration (b) in Figure 4-48). In the latter configuration, one of the Adapters (shown in gray) is disabled.
Software and firmware are permitted to use the UCIe DVSEC Link Capability and Control registers to determine which configuration to train the Link in.
[Figure 4-48: (a) Two D2D stacks enabled, MMPL bypassed; each stack uses one x64 module Physical Layer and AFE over its own RDI and operates in Full Width Mode. For each stack: UCIe Link DVSEC Max Link Width = x64, Capability Register APMW = 0, Link Training Parameter UCIe-A x32 = 0. (b) Only one D2D stack enabled; uses Multi-module PHY Logic with two x64 modules (Module 0 and Module 1 AFE/PHY Logic) operating in x32 mode in Full Width Mode. UCIe Link DVSEC Max Link Width = x64, Capability Register APMW = 1, Link Training Parameter UCIe-A x32 = 1.]
Another example of interoperation between UCIe-A x64 and UCIe-A x32 is when the UCIe-A x64
Stack degrades bandwidth (Degraded Width Mode) to match the remote Link partner’s maximum
throughput. Figure 4-49 shows an example of RDI byte-to-module assignments for a four-module set
of x64 Advanced Package modules interoperating with a four-module set of x32 Advanced Package
modules. The example is for a 256B RDI width on the x64 set, and a 128B RDI on the x32 set. On the
Transmitter side of x64 modules, the MMPL throttles RDI, as required, because the MMPL can only
send half the bytes over 8 UI; and on the Receiver side, the MMPL accumulates 16 UI worth of data
before forwarding it over RDI (this assumes that data transfers are in chunks of 256B, or that appropriate pause-of-data-stream indications are applied and detected by the MMPLs/Adapters within the data stream).
Figure 4-49. RDI Byte-to-Module Assignment Example for x64 Interop with x32
For the example shown in Figure 4-49, a single Adapter is operating with all four Modules. The D2D
stack uses MMPL with 4 modules, with each of the x64 modules operating in Degraded Width Mode,
and only 32 lanes routed per module. The corresponding values in the capability register and Link
Training parameter are as listed in Table 4-16.
See Section 5.7.2.4 for comprehensive rules of interoperation between x64 and x32 Advanced
Package modules.
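To illustrate the bandwidth matching described above, the following minimal sketch (illustrative only; the function name and chunking assumptions are not from this specification) computes how many UI the Receiver-side MMPL must accumulate before forwarding one full RDI transfer in Degraded Width Mode:

    def ui_to_accumulate(bytes_per_8ui=128, rdi_transfer_bytes=256):
        """Four x64 modules degraded to x32 deliver 128B per 8 UI, so one
        256B RDI transfer needs 16 UI of accumulation (assuming 256B chunks
        or detectable pause-of-data-stream indications)."""
        return 8 * (rdi_transfer_bytes // bytes_per_8ui)

    assert ui_to_accumulate() == 16  # matches the 16 UI accumulation in the text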
Operation is not negotiated). See Section 4.1.5 for details of sideband transmission UI notation. See
Section 4.1.5.1 for Performant Mode Operation (PMO) details.
Figure 4-50 and Figure 4-51 show the arbitration at the PHY.
[Figure 4-50: timing diagram of sideband packet arbitration in alternating 64 UI and 32 UI intervals.]
Figure 4-51. Example of a Large Management Packet Split into Two Encapsulated MTPs, with No Segmentation, No Sideband PMO, and with Two Link Management Packets between the Two Encapsulated MTPs
[Figure 4-51: timing diagram showing Encapsulated MTP Header and Payload QWORDs (QWORD 0 through QWORD 6) interleaved with IDLE intervals and Link Management packets (Msg Header, MsgD Header, MsgD Payload) in alternating 64 UI and 32 UI intervals.]
5.1 Interoperability
IMPLEMENTATION NOTE
In typical implementations, the LCLK for the UCIe Link Transmitter and the LCLK for the corresponding Link partner Receiver are both generated from a common reference clock. In the example implementation of Figure 5-1, the LCLK for the Transmitter in Die-1 can be generated from the TX PLL, and the LCLK for the Receiver in Die-2 can be generated from the RX PLL.
5.2 Overview
The x16 or x8 “Standard Package Module” uses traditional standard packaging with a larger bump pitch. A Standard Package Module consists of a pair of clocks, 16 or 8 single-ended data Lanes, a data Valid Lane, and a Track Lane in each direction (transmit and receive). There is a low-speed sideband bus for initialization, Link training, and configuration reads/writes. The sideband consists of a single-ended sideband data Lane and a single-ended sideband clock Lane in each direction (transmit and receive). For some applications, multiple modules (2 or 4) can be aggregated to deliver additional bandwidth.
To avoid reliability issues, it is recommended to limit the Transmitter output high (VOH) to a maximum
of 100 mV above the receiving chiplet’s Receiver front-end circuit power supply rail. An over-stress
protection circuit may be implemented in the Receiver when VOH is more than 100 mV above the
Receiver power supply rail.
[Figure: Advanced Package interface diagram. Each die's module connects to its partner through x64/x32 Data, 2 Clock, 1 Valid, and 1 Track Lanes in each direction (Tx and Rx), plus a sideband consisting of SB Data and SB Clock in each direction.]
[Figure: Standard Package interface diagram. Each die's module connects to its partner through x16/x8 Data, 2 Clock, 1 Valid, and 1 Track Lanes in each direction (Tx and Rx), plus a sideband consisting of SB Data and SB Clock in each direction.]
[Table: PHY key metrics for Advanced Package (x64) and Standard Package configurations (excerpt). Idle Power (% of peak power, target upper bound): 15 in every configuration column. PHY dimension Depth (um)e: 1043, 1043, 1320, 1320, 1320, and 1540 across the configuration columns.]
a. Electrical PHY latency target. For overall latency target, see Table 1-4.
b. See Table 1-4.
c. For compatibility, PHY dimension width must match the spec for Advanced Package. The tolerance of PHY dimension width for Standard Package can be higher because there is more routing flexibility. For best channel performance, it is recommended that the width be close to spec.
d. Standard Package PHY dimension width is the effective width of one (x16) module based on x32 interface (see Figure 5-42 and
Figure 5-43).
e. PHY dimension depth is an informative parameter and depends on bump pitch. Number in the table is based on 45-um bump
pitch for 10-column x64 Advanced Package and 100-um bump pitch for Standard Package. See Section 5.7.2 for informative
values of PHY dimension depth for combinations of the x64 and x32 Advanced Package modules in 10-column, 16-column, and
8-column bump matrix construction.
f. Reference (Industry Council on ESD Target Levels): White Paper 2: A Case for Lowering Component-level CDM ESD
Specifications and Requirements.
[Figure: Transmitter block diagram. N-wide Data + Valid enter a FIFO and an N:1 Serializer driving TXD; the clock path (PLL, DLL + DCC buffer, phase interpolator) produces Phase-1 and Phase-2 of TXCK with per-Lane Deskew; the Track Transmitter (TXD) reuses the Data Lane circuit.]
The Valid signal is used to gate the clock distribution to all data Lanes to enable fast idle exit and entry. The signal also serves the purpose of Valid framing; see Section 4.1.2 for details. The Transmitter implementation for the Valid signal is expected to be the same as for regular Data.
The Track signal can be used by the PHY to compensate for slow-changing variables such as voltage or temperature. Track is a unidirectional signal similar to a data bit. The UCIe Module sends a clock pattern (1010…) aligned with Phase-1 of the forwarded clock signal on its Track Transmitter when requested over the sideband by the UCIe Module Partner for its Track Receiver. See Section 4.6 for more details on Runtime Recalibration steps and Section 5.5.1 for Track usage.
A control loop or training is recommended to adjust the output impedance to compensate for process, voltage, and temperature variations. The control loop and training are implementation-specific and beyond the scope of this specification. In low-power states, the implementation must be capable of tri-stating the output.
It is recommended to optimize the ESD network to minimize pad capacitance. An inductive peaking technique such as a T-coil may be needed at higher data rates.
[Figure: Segmented Driver. A pre-driver drives the pull-up (M1) and pull-down (M2) devices of each segment through series resistors (RS) from VCCIO to the Pad, which connects to the ESD Network.]
Tx equalization coefficients for 24 GT/s and 32 GT/s are based on the FIR filter shown in Figure 5-6. The equalization coefficients are subject to a maximum unity-swing constraint.
The Transmitter must support the equalization settings shown in Table 5-4. Determination of the de-emphasis setting is based on initial configuration or on the training sequence, where the value with the larger eye opening is selected.
[Figure 5-6: 2-tap FIR filter. Vin(n) feeds tap C0 directly and tap C+1 through a 1 UI delay; the tap outputs (Va, Vb) are summed to produce Vout(n), i.e., Vout(n) = C0·Vin(n) + C+1·Vin(n−1).]
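A minimal behavioral sketch of this 2-tap filter follows (illustrative only; the coefficient values below are examples, not the Table 5-4 settings):

    def tx_fir(samples, c0, c_post):
        """2-tap Tx FIR of Figure 5-6: Vout(n) = C0*Vin(n) + C+1*Vin(n-1),
        with the taps subject to the unity-swing constraint |C0| + |C+1| <= 1."""
        assert abs(c0) + abs(c_post) <= 1.0
        out, prev = [], 0.0
        for v in samples:
            out.append(c0 * v + c_post * prev)
            prev = v
        return out

    # A 1100... pattern with mild de-emphasis (negative post-cursor tap):
    # transitions keep full amplitude while repeated symbols are attenuated.
    print(tx_fir([1, 1, -1, -1], c0=0.9, c_post=-0.1))  # ~[0.9, 0.8, -1.0, -0.8]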
The received clock is used to sample the incoming data. The Receiver must match the delays between
the clock path and the data/valid path to the sampler. This is to minimize the impact of power supply
noise-induced jitter. The data Receivers may be implemented as 2-way or 4-way interleaved. For a 4-way interleaved implementation, the Receiver needs to generate the required phases internally from the two phases of the forwarded clock. This may require duty-cycle correction capability on the Receiver.
The supported forwarded clock frequencies and phases are described in Section 5.5.
At higher data rates, deskew capability may be needed in the receiver to achieve the matching
requirements between the data Lanes. Receiver Deskew, when applicable, can be performed during
mainband training. More details are provided in Section 4.5.
The UCIe Module, upon requesting the Track signal, receives a clock pattern (1010…) aligned with Phase-1 of the forwarded clock signal on its Track Receiver from the UCIe Module Partner's Track Transmitter, and may use the Track signal to track the impact of slowly varying voltage and temperature changes on the sampling phase.
[Figure: Receiver block diagram. RXD enters the RX data receiver, is captured by 2/4-phase flip-flops with per-Lane Deskew, and is written to a FIFO; RXCK (Phase-1/Phase-2) feeds an optional Phase Gen that produces the internal sampling phases; the Track Receiver (RXD) mirrors the Data path.]
[Table: Receiver electrical parameters (excerpt). Rx Voltage sensitivity: Min -, Rec -, Max 40 mV.]
a. Standard Package mode with termination. Impedance step size is an informative parameter and can be
implementation specific to meet Rx Input Impedance.
b. Based on matched architecture.
c. Includes absolute random jitter and untracked deterministic jitter of the divergent path due to delay mismatch
(in the matched architecture).
d. Rx per-Lane deskew is required if this limit is exceeded.
e. Residual error post training and correction.
f. When applicable (informative).
g. Expected output (informative). Measured 20% to 80%.
h. Advanced Package.
i. Effective Pad capacitance.
5.4.2 Rx Termination
Rx termination is applicable only to Standard Package modules. All Receivers on Advanced Package
modules must be unterminated.
Receiver termination on Standard Package is data rate and channel dependent. Table 5-6 shows the
maximum data rate and channel reach combinations for which the Receivers in Standard Package
Modules are recommended to remain unterminated for a minimally compliant Transmitter. Figure 5-9 shows an alternate representation of the termination requirement. The area below the curve in Figure 5-9
shows the speed and channel-reach combinations for which the Receivers in Standard Package
Modules are recommended to remain unterminated. Termination is required for all other
combinations. Receivers must be ground-terminated when applicable, as shown in Figure 5-10.
Table 5-6. Maximum channel reach for unterminated Receiver (Tx Swing = 0.4 V)
Data Rate (GT/s)   Maximum Channel Reach (mm)
12                 3
8                  5
4                  10
Figure 5-9. Receiver Termination Map for Table 5-6 (Tx Swing = 0.4 V)
[Plot: Data Rate (GT/s) versus Channel Reach (mm), 0 to 25 mm; Receivers are recommended to remain unterminated for combinations below the curve.]
[Figure 5-10: Ground-terminated Receiver. Data (RXD), Clock Phase-1 and Phase-2 (RXCK), and Track (RXD) Receivers with termination to ground.]
For a higher Transmitter swing, unterminated Receiver operation can be extended to longer channels and higher data rates. Table 5-7 shows the maximum data rate and channel reach combinations for a Transmitter swing of 0.85 V (the maximum recommended swing). Figure 5-11 shows an alternate representation of the termination requirement. The area below the curve in Figure 5-11 shows the speed and channel reach combinations for which the Receivers in Standard Package Modules are recommended to remain unterminated.
Table 5-7. Maximum channel reach for unterminated Receiver (Tx Swing = 0.85 V)
Data Rate (GT/s)   Maximum Channel Reach (mm)
16                 5
12                 10
Figure 5-11. Receiver Termination Map for Table 5-7 (Tx Swing = 0.85 V)
[Plot: Data Rate (GT/s) versus Channel Reach (mm), 0 to 40 mm; Receivers are recommended to remain unterminated for combinations below the curve.]
IMPLEMENTATION NOTE
When the Transmitter is tri-stated and the Receiver is not required to be enabled
(e.g., SBINIT, and some MBINIT states):
• Disabled Receivers must be tolerant of a floating input pad
• Receivers are permitted to enable weak-termination directly on the input pad to
prevent crowbar current in the receiver and to lower noise sensitivity at the
receiver trip point
When the Transmitter is tri-stated and the Receiver is required to be enabled (e.g.,
REPAIRCLK and REPAIRVAL states for Advanced Package):
• Enabled Receivers for (CLKP, CLKN, CLKRD, TRK, VLD, VLDRD) must be tolerant
of a floating input signal on the pad
• Receivers are permitted to enable weak-termination directly on the input pad to
prevent crowbar current in the receiver and to lower noise sensitivity at the
receiver trip point
where ωp2 = 2π*DataRate, ωp1 = 2π*DataRate/4, and ADC is the DC gain.
5.5 Clocking
Figure 5-13 shows the forwarded clocking architecture. Each module supports a two-phase forwarded
clock. It is critical to maintain matching between all data Lanes and valid signal within the module.
The Receiver must provide matched delays between the Receiver clock distribution and Data/Valid
Receiver path. This is to minimize the impact of power supply noise-induced jitter on Link
performance. Phase adjustment is performed on the Transmitter as shown in Figure 5-13. Link
training is required to set the position of phase adjustment to maximize the Link margin.
At higher data rates, Receiver eye margins may be small and any skew between the data Lanes
(including Valid) may further degrade Link performance. Per-Lane deskew must be supported on the
Transmitter at high data rates.
This specification supports quarter-rate clock frequencies at the highest data rates (24 GT/s and 32 GT/s). The forwarded-clock Transmitter must support quadrature phases in addition to the differential clock at these data rates (to enable either quarter-rate or half-rate Receiver implementations). Table 5-8 shows the clock frequencies and phases that must be supported at different data rates. The Forwarded Clock Phase is negotiated during Link Initialization and Training (see Section 4.5.3.3.1). At 24 GT/s and 32 GT/s, the Receiver has the option to support a differential clock or a quadrature clock. The capability register is defined in Table 9-47 and advertised at the beginning of Link negotiation. Note that to achieve interoperability with designs of a lower max data rate, the differential clock must always be used at 16 GT/s and below, independent of the choice at 24 GT/s and 32 GT/s.
5.5.1 Track
Track signal can be used to perform Runtime Recalibration to adjust the Receiver clock path against
slow varying voltage, temperature and transistor aging conditions.
When requested by the UCIe Module, the UCIe Module Partner sends a clock pattern (1010…) aligned
with Phase-1 of the forwarded clock on its Track Transmitter, as shown in Figure 5-13.
[Figure 5-13: Forwarded clocking architecture. On Die-1 (Transmitter): Data + Valid pass through a FIFO and 16x/64x Serializer to TXD with per-Lane Deskew; the clock path (PLL, DLL + DCC buffer, phase interpolator with Phase Control) generates Phase-1 and Phase-2 of TXCK; the Track Transmitter reuses the Data path. On Die-2 (Receiver): RXD is captured by 2/4-phase flip-flops with Deskew into a FIFO; RXCK feeds an optional Phase Gen; the Track Receiver mirrors the Data path.]
Table 5-8. Forwarded Clock Frequencies and Phases
Data Rate (GT/s)   Clock Frequency (GHz)   Clock Phases (degrees)   Tx Per-Lane Deskew
32                 16                      90, 270                  Required
32                 8                       45, 135                  Required
24                 12                      90, 270                  Required
24                 6                       45, 135                  Required
16                 8                       90, 270                  Required
12                 6                       90, 270                  Required
8                  4                       90, 270                  Optional
4                  2                       90, 270                  Optional
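As a companion to the negotiation rule above, here is a minimal illustrative sketch (the function name and return values are assumptions, not from this specification) of the clock-phase selection: differential clocking is mandatory at 16 GT/s and below, while at 24 GT/s and 32 GT/s the Receiver's advertised capability selects differential or quadrature operation:

    def forwarded_clock_phases(data_rate_gts, rx_supports_quadrature):
        """Differential two-phase clock is mandatory at <= 16 GT/s; at
        24/32 GT/s the Receiver capability (advertised at the start of
        Link negotiation) selects differential or quadrature operation."""
        if data_rate_gts <= 16:
            return "differential"
        return "quadrature" if rx_supports_quadrature else "differential"

    assert forwarded_clock_phases(16, True) == "differential"
    assert forwarded_clock_phases(32, True) == "quadrature"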
IMPLEMENTATION NOTE
This implementation note provides an example usage of the Track signal to calibrate out slowly varying temperature- and voltage-related delay drift between Data and Clock on the Receiver.
Track uses the same type of Tx driver and Rx receiver as Data (see Figure 5-13). A
clock pattern aligned with Phase-1 of the forwarded clock is sent from Track
Transmitter and received on the Track Receiver. Any initial skew can be calibrated out
during initialization and training (MBTRAIN.RXCLKCAL) on the Receiver side. During
run-time, any drift between Data and the forwarded clock can be detected. One
method for detecting the drift is to sample Track with the forwarded clock. An
implementation-specific number of samples can be collected, averaged if needed, and
used for drift detection. This drift can then be corrected on the forwarded clock
(if needed).
[Figure: Track and forwarded clock waveforms. The first pair shows Track aligned with the forwarded clock; the second pair shows drift between Track and the forwarded clock; the third pair shows the forwarded clock post correction, realigned with Track.]
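A minimal sketch of the drift-detection step described in this note (the sample count, threshold, and sign convention for the correction are illustrative assumptions, not from this specification):

    def detect_track_drift(samples, calibrated_mean=0.5, tolerance=0.05):
        """Average an implementation-specific number of Track samples
        captured with the forwarded clock; a deviation of the average from
        the calibrated point indicates drift, and its sign gives the
        direction in which to correct the forwarded-clock phase (if needed)."""
        avg = sum(samples) / len(samples)
        if avg > calibrated_mean + tolerance:
            return +1  # drift in one direction: advance the clock phase
        if avg < calibrated_mean - tolerance:
            return -1  # drift in the other direction: retard the clock phase
        return 0       # within tolerance: no correction needed

    # e.g., 64 captures of the 1010... Track pattern at the Receiver:
    print(detect_track_drift([1, 0, 1, 1, 0, 1, 0, 1] * 8))  # prints 1 here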
a. I/O VCC noise includes all noise at the I/O supply bumps relative to VSS bumps. This noise includes all DC and
AC fluctuations at all applicable frequencies.
b. Applies only to multi-module instantiations.
IMPLEMENTATION NOTE
Due to different micro-bump max current capacity and power delivery requirements, the PHY in an Advanced Package may have the TX providing the I/O power supply to the RX circuits.
[Figure 5-15: Statistical eye mask. Axes: Voltage (mV) versus Time (UI), with time ticks at -0.8, 0, and +0.8 UI.]
a. Rectangular mask.
b. With equalization enabled.
c. Based on minimum Tx swing specification.
IMPLEMENTATION NOTE
Figure 5-16 shows an example circuit setup that can be used to generate the
statistical eye diagram shown in Figure 5-15. RTX is the Transmitter impedance and
RRX represents the Receiver termination. CTX and CRX represent the effective Transmitter and Receiver capacitance, respectively. For crosstalk, the 19 largest aggressors need to be included. Transmitter equalization (TXEQ) is enabled at 24 GT/s and 32 GT/s.
[Figure 5-16: Circuit setup for statistical eye generation. The victim path drives the channel through RTX (with CTX at the Transmitter) into CRX and RRX at the Receiver, producing Vo_vic; each aggressor (Agg #1 onward) uses the same RTX/CTX source and CRX/RRX load, producing Vo_agg outputs.]
VTF loss is defined as the ratio of the Receiver voltage and the Source voltage, as shown in
Equation 5-1 and Equation 5-2.
Equation 5-1.
$L(f) = 20 \log_{10} \frac{V_r(f)}{V_s(f)}$
Equation 5-2.
$L(0) = 20 \log_{10} \frac{R_{rx}}{R_{tx} + R_{channel} + R_{rx}}$
L(f) is the frequency-dependent loss and L(0) is the DC loss. For an unterminated channel, L(0) is effectively 0.
VTF crosstalk is defined as the power sum of the ratios of the aggressor Receiver voltage to the
source voltage. 19 aggressors are included in the calculation. Based on crosstalk reciprocity,
VTF crosstalk can be expressed as shown in Equation 5-3.
Equation 5-3.
$XT(f) = 10 \log_{10} \sum_{i=1}^{19} \left( \frac{V_{a_i}(f)}{V_s(f)} \right)^2$
a. Based on Voltage Transfer Function Method (Tx: 25 ohm / 0.25 pF; Rx: 0.2 pF).
fN is the Nyquist frequency. The equations in the table form a segmented line in the loss-crosstalk
coordinate plane, defining the pass/fail region.
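As a worked illustration of Equation 5-1 through Equation 5-3, the following minimal sketch evaluates the VTF quantities (the function names are illustrative, not from this specification):

    import math

    def vtf_loss_db(vr, vs):
        # Equation 5-1: L(f) = 20*log10(Vr(f)/Vs(f))
        return 20 * math.log10(vr / vs)

    def vtf_dc_loss_db(rtx, rchannel, rrx):
        # Equation 5-2: L(0) = 20*log10(Rrx / (Rtx + Rchannel + Rrx))
        return 20 * math.log10(rrx / (rtx + rchannel + rrx))

    def vtf_crosstalk_db(aggressor_voltages, vs):
        # Equation 5-3: power sum over the 19 largest aggressors
        return 10 * math.log10(sum((va / vs) ** 2 for va in aggressor_voltages))

    # e.g., a terminated Standard Package channel with Rtx = 30 ohm,
    # Rchannel = 5 ohm, Rrx = 50 ohm: L(0) = 20*log10(50/85), about -4.6 dB.
    print(vtf_dc_loss_db(30, 5, 50))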
Table 5-12. x64 Advanced Package Module Signal List (Data and Sideband signal groups)a
a. For a x32 Advanced Package module, the TXDATA[63:32], TXRD[3:2], RXDATA[63:32], and RXRD[3:2] signals do not apply. All other signals are the same as the x64 Advanced Package Module signals.
Figure 5-19. Viewer Orientation Looking at the Defined UCIe Bump Matrix
Figure 5-20, Figure 5-21, and Figure 5-22 show the reference bump matrix for the 10-column, 16-
column, and 8-column x64 Advanced Package Modules, respectively. The lower left corner of the
bump map will be considered “origin” of a bump matrix and the leftmost column is Column 0.
It is strongly recommended to follow the bump matrices provided in Figure 5-20, Figure 5-21, and
Figure 5-22 for x64 Advanced Package interfaces.
The 10-column bump matrix is optimal for bump pitch range of 38 to 50 um. To achieve optimal area
scaling with different bump pitches, the optional 16-column and 8-column bump matrices are defined
for bump ranges of 25 to 37 um and 51 to 55 um, respectively, which will result in optimal Module
depth while maintaining Module width of 388.8 um, as shown in Figure 5-21 and Figure 5-22,
respectively.
The following rule must be followed for the 10-column x64 Advanced Package bump matrix:
• The signal order within a column must be preserved. For example, Column 0 must contain the
signals: txdataRD0, txdata0, txdata1, txdata2, txdata3, txdata4, …, rxdata59,
rxdata60, rxdata61, rxdata62, rxdata63, rxdataRD3, and txdatasbRD. Similarly, 16-
column and 8-column x64 Advanced Packages must preserve the signal order within a column of
the respective bump matrices.
It is strongly recommended to follow the supply and ground pattern shown in the bump matrices. It
must be ensured that sufficient supply and ground bumps are provided to meet channel
characteristics (FEXT and NEXT) and power-delivery requirements.
The following rules must be followed when instantiating multiple modules of Advanced Package bump
matrix:
• Modules must be stepped in the same orientation and abutted.
• Horizontal or vertical mirroring is not permitted.
• Module stacking is not permitted.
Mirror die implementation may necessitate a jog or additional metal layers for proper connectivity.
Figure 5-20. 10-column x64 Advanced Package Bump Map
Column0 Column1 Column2 Column3 Column4 Column5 Column6 Column7 Column8 Column9
vss vss vccio vccio vss
vss vccio vccio vss vss
vss vss vccio vccio vss
rxcksbRD rxcksb vccio rxdatasb rxdatasbRD
txdatasbRD txdatasb vccio txcksb txcksbRD
rxdata50 rxdata35 rxdata29 rxdata14 rxdataRD0
rxdataRD3 rxdata49 rxdata34 rxdata28 rxdata13
rxdata51 rxdata36 rxdata30 rxdata15 vss
rxdata63 vccio rxdata33 vccio rxdata12
rxdata52 vss rxdata31 vss rxdata0
vss rxdata48 rxdata32 rxdata27 rxdata11
rxdata53 rxdata37 rxdataRD1 rxdata16 rxdata1
rxdata62 rxdata47 rxdataRD2 rxdata26 rxdata10
rxdata54 rxdata38 vss rxdata17 vss
rxdata61 rxdata46 vccio rxdata25 rxdata9
rxdata55 rxdata39 rxckRD rxdata18 rxdata2
vss rxdata45 rxvldRD rxdata24 rxdata8
rxdata56 vss rxckn rxdata19 rxdata3
rxdata60 rxdata44 rxvld vss rxdata7
rxdata57 rxdata40 rxckp rxdata20 vss
rxdata59 rxdata43 rxtrk rxdata23 rxdata6
rxdata58 rxdata41 vss rxdata21 rxdata4
vss rxdata42 vccio rxdata22 rxdata5
vccfwdio vccfwdio vccfwdio vccfwdio vccfwdio
vccio txdata21 vccio txdata41 txdata58
txdata5 txdata22 vss txdata42 vss
txdata4 txdata20 txckp txdata40 txdata57
txdata6 txdata23 txtrk txdata43 txdata59
vss txdata19 txckn vss txdata56
txdata7 vss txvld txdata44 txdata60
txdata3 txdata18 txckRD txdata39 txdata55
txdata8 txdata24 txvldRD txdata45 vss
txdata2 txdata17 vccio txdata38 txdata54
txdata9 txdata25 vss txdata46 txdata61
vccio vccio vccio vccio vccio
txdata10 txdata26 txdataRD2 txdata47 txdata62
txdata1 txdata16 txdataRD1 txdata37 txdata53
txdata11 txdata27 txdata32 txdata48 vss
txdata0 vss txdata31 vss txdata52
txdata12 vss txdata33 vss txdata63
vss txdata15 txdata30 txdata36 txdata51
txdata13 txdata28 txdata34 txdata49 txdataRD3
txdataRD0 txdata14 txdata29 txdata35 txdata50
vccio vccio vccio vccio vccio
vccio vccio vccio vccio vccio
Die Edge
Note: In Figure 5-20, at 45-um pitch, the module depth of the 10-column reference bump
matrix as shown is approximately 1043 um.
Figure 5-21. 16-column x64 Advanced Package Bump Map
Column0 Column1 Column2 Column3 Column4 Column5 Column6 Column7 Column8 Column9 Column10 Column11 Column12 Column13 Column14 Column15
vss vss vccio vccio vccio vccio vss vss
vss vss vccio vccio vccio vccio vss vss
vss rxcksbRD rxcksb vss rxdatasb rxdatasbRD vss vss
vss txdatasbRD txdatasb vss vss txcksb txcksbRD vss
rxdata54 rxdata50 rxdata35 vss rxdata29 rxdata14 rxdata11 rxdataRD0
vss rxdata52 rxdata49 rxdata34 rxdataRD1 rxdata28 rxdata13 rxdata9
rxdata55 rxdata51 rxdata36 rxdataRD2 rxdata30 rxdata15 rxdata10 vss
rxdataRD3 rxdata53 rxdata48 rxdata33 vss rxdata27 rxdata12 rxdata8
rxdata61 vss rxdata37 vss rxdata31 rxdata16 vss rxdata0
rxdata63 rxdata56 rxdata47 rxdata32 rxckRD rxdata26 rxdata17 rxdata2
rxdata60 rxdata46 vss rxvldRD rxdata25 vss rxdata7 vss
vss rxdata57 rxdata43 rxdata38 rxckn rxdata23 rxdata18 rxdata3
rxdata59 rxdata45 rxdata40 rxvld vss rxdata20 rxdata6 rxdata1
rxdata62 rxdata58 rxdata42 rxdata39 rxckp rxdata22 rxdata19 rxdata4
vss rxdata44 rxdata41 rxtrk rxdata24 rxdata21 rxdata5 vss
vccfwdio vccfwdio vccfwdio vccfwdio vccfwdio vccfwdio vccfwdio vccfwdio
vccio vccio vccio vccio vccio vccio vccio vccio
vss txdata5 txdata21 txdata24 txtrk txdata41 txdata44 vss
txdata4 txdata19 txdata22 txckp txdata39 txdata42 txdata58 txdata62
txdata1 txdata6 txdata20 vss txvld txdata40 txdata45 txdata59
txdata3 txdata18 txdata23 txckn txdata38 txdata43 txdata57 vss
vss txdata7 vss txdata25 txvldRD vss txdata46 txdata60
txdata2 txdata17 txdata26 txckRD txdata32 txdata47 txdata56 txdata63
txdata0 vss txdata16 txdata31 vss txdata37 vss txdata61
txdata8 txdata12 txdata27 vss txdata33 txdata48 txdata53 txdataRD3
vss txdata10 txdata15 txdata30 txdataRD2 txdata36 txdata51 txdata55
txdata9 txdata13 txdata28 txdataRD1 txdata34 txdata49 txdata52 vss
txdataRD0 txdata11 txdata14 txdata29 vss txdata35 txdata50 txdata54
vccio vccio vccio vccio vccio vccio vccio vccio
vccio vccio vccio vccio vccio vccio vccio vccio
Die Edge
Note: In Figure 5-21, at 25-um pitch, the module depth of the 16-column reference bump
matrix as shown is approximately 388 um.
Note: In Figure 5-22, at 55-um pitch, the module depth of the 8-column reference bump
matrix as shown is approximately 1,585 um.
Figure 5-23 shows the signal exit order for the 10-column x64 Advanced Package bump map.
Figure 5-23. 10-column x64 Advanced Package Bump map: Signal exit order
Left to Right
txdataRD0 txdata0 txdata1 txdata2 txdata3 txdata4 txdata5 txdata6 txdata7 txdata8 txdata9 txdata10 txdata11 txdata12 txdata13 Cont…
Tx Cont… txdata14 txdata15 txdata16 txdata17 txdata18 txdata19 txdata20 txdata21 txdata22 txdata23 txdata24 txdata25 txdata26 txdata27 txdata28 Cont1…
Breakout Cont1… txdata29 txdata30 txdata31 txdataRD1 txckRD txckn txckp txtrk txvld txvldRD txdataRD2 txdata32 txdata33 txdata34 txdata35 Cont2…
Cont2… txdata36 txdata37 txdata38 txdata39 txdata40 txdata41 txdata42 txdata43 txdata44 txdata45 txdata46 txdata47 txdata48 txdata49 txdata50 Cont3…
Cont3… txdata51 txdata52 txdata53 txdata54 txdata55 txdata56 txdata57 txdata58 txdata59 txdata60 txdata61 txdata62 txdata63 txdataRD3
Left to Right
rxdataRD3 rxdata63 rxdata62 rxdata61 rxdata60 rxdata59 rxdata58 rxdata57 rxdata56 rxdata55 rxdata54 rxdata53 rxdata52 rxdata51 rxdata50 Cont…
Rx Cont… rxdata49 rxdata48 rxdata47 rxdata46 rxdata45 rxdata44 rxdata43 rxdata42 rxdata41 rxdata40 rxdata39 rxdata38 rxdata37 rxdata36 rxdata35 Cont1…
Breakout Cont1… rxdata34 rxdata33 rxdata32 rxdataRD2 rxvldRD rxvld rxtrk rxckp rxckn rxckRD rxdataRD1 rxdata31 rxdata30 rxdata29 rxdata28 Cont2…
Cont2… rxdata27 rxdata26 rxdata25 rxdata24 rxdata23 rxdata22 rxdata21 rxdata20 rxdata19 rxdata18 rxdata17 rxdata16 rxdata15 rxdata14 rxdata13 Cont3…
Cont3… rxdata12 rxdata11 rxdata10 rxdata9 rxdata8 rxdata7 rxdata6 rxdata5 rxdata4 rxdata3 rxdata2 rxdata1 rxdata0 rxdataRD0
At higher speeds, the PHY circuits draw larger current through the bumps and require better signal
and power integrity of the packaging solution. This typically requires adding power and ground bumps
and optimizing the distribution of them, but the implementation also needs to minimize the lane-to-
lane length skew and preserve the assignment and relative order of the signals in each column to
comply with the bump matrix rules in Section 5.7.2.2.
Table 5-13. Bump Map Options and the Recommended Bump Pitch Range and Max Speed
Bump Map Option   Bump Pitch (um)   Max Speed (GT/s)
16 column         25-30             12
16 column         31-37             16
10 column         38-44             24
10 column         45-50             32
8 column          51-55             32
This Implementation Note is formulated to provide PHY implementations a set of reference x64 bump
maps to encompass the max speed specified. Table 5-13 summarizes the corresponding max speed
for these bump map options and their recommended bump pitch ranges.
Bump maps in Figure 5-24, Figure 5-25, and Figure 5-26 are the x64 implementation references for
the corresponding max speed with an enhancement of the power and ground bumps. They all comply
with the bump matrix rules in Section 5.7.2.2, and they maintain the backward compatibility in terms
of signal exit order. These reference examples have been optimized for signal integrity, power
integrity, lane-to-lane skew, electro-migration stress and bump area based on most of the advanced
packaging technologies in the industry. Please note that technology requirements vary, and it is still
required to verify the bump map with the technology provider for actual implementation requirements
and performance targets.
Figure 5-24. 10-column x64 Advanced Package Bump Map (reference with enhanced power and ground bumps)
1 2 3 4 5 6 7 8 9 10
1 vss vss vccio vccio vss
2 vss vccio vccio vss vss
3 vss vss vccio vccio vss
4 rxcksbRD rxcksb vccio rxdatasb rxdatasbRD
5 txdatasbRD txdatasb vss txcksb txcksbRD
6 rxdata50 rxdata35 rxdata29 rxdata14 rxdataRD0
7 rxdataRD3 rxdata49 rxdata34 rxdata28 rxdata13
8 rxdata51 vccio vccio vccio vccio
9 vccio vss rxdata33 vss rxdata12
10 rxdata52 rxdata36 rxdata30 rxdata15 vss
11 vss rxdata48 vss rxdata27 rxdata11
12 rxdata53 rxdata37 rxdata31 rxdata16 rxdata0
13 rxdata63 rxdata47 rxdata32 rxdata26 rxdata10
14 vccio vccio vccio vccio vccio
15 rxdata62 rxdata46 rxdataRD2 rxdata25 rxdata9
16 rxdata54 rxdata38 rxdataRD1 rxdata17 rxdata1
17 vss vss vss vss vss
18 rxdata55 rxdata39 vccio rxdata18 rxdata2
19 rxdata61 rxdata45 rxvldRD rxdata24 rxdata8
20 vccio vccio vccio vccio vccio
21 rxdata60 rxdata44 rxvld rxdata23 rxdata7
22 rxdata56 rxdata40 rxckRD rxdata19 rxdata3
23 vss vss vss vss vss
24 rxdata57 rxdata41 rxckn rxdata20 rxdata4
25 rxdata59 rxdata43 rxtrk rxdata22 rxdata6
26 rxdata58 vss rxckp rxdata21 vss
27 vss rxdata42 vss vss rxdata5
28 vccfwdio vccfwdio vccfwdio vccfwdio vccfwdio
29 vss vss vss vss vss
30 txdata5 vccio vccio txdata42 vccio
31 vccio txdata21 txckp vccio txdata58
32 txdata6 txdata22 txtrk txdata43 txdata59
33 txdata4 txdata20 txckn txdata41 txdata57
34 vccio vccio vccio vccio vccio
35 txdata3 txdata19 txckRD txdata40 txdata56
36 txdata7 txdata23 txvld txdata44 txdata60
37 vss vss vss vss vss
38 txdata8 txdata24 txvldRD txdata45 txdata61
39 txdata2 txdata18 vss txdata39 txdata55
40 vccio vccio vccio vccio vccio
41 txdata1 txdata17 txdataRD1 txdata38 txdata54
42 txdata9 txdata25 txdataRD2 txdata46 txdata62
43 vss vss vss vss vss
44 txdata10 txdata26 txdata32 txdata47 txdata63
45 txdata0 txdata16 txdata31 txdata37 txdata53
46 txdata11 txdata27 vccio txdata48 vccio
47 vccio txdata15 txdata30 txdata36 txdata52
48 txdata12 vccio txdata33 vccio vss
49 vss vss vss vss txdata51
50 txdata13 txdata28 txdata34 txdata49 txdataRD3
51 txdataRD0 txdata14 txdata29 txdata35 txdata50
52 vccio vss vss vccio vccio
53 vccio vccio vccio vccio vccio
Die Edge
Note: In Figure 5-24, at 45-um pitch, the module depth of the 10-column bump map as
shown is approximately 1225 um. Rows 1, 2, and 53 are required for packaging
solutions using floating bridges without through-silicon vias (TSVs). They can be
optional for packaging solutions with TSVs.
Note: In Figure 5-25, at 25-um pitch, the module depth of the 16-column bump map as
shown is approximately 400 um. Rows 1 and 31 are required for packaging solutions
using floating bridges without TSVs. They can be optional for packaging solutions with
TSVs.
Figure 5-26. 8-column x64 Advanced Package Bump Map (reference with enhanced power and ground bumps)
1 2 3 4 5 6 7 8
1 vss vccio vccio vss
2 vss vccio vccio vss
3 vss vccio vccio vss
4 rxcksbRD rxcksb rxdatasb rxdatasbRD
5 txdatasbRD txdatasb txcksb txcksbRD
6 rxdata50 vss rxdata14 rxdataRD0
7 rxdataRD3 rxdata49 rxdata27 rxdata13
8 vccio rxdata36 vss vccio
9 rxdata63 rxdata48 rxdata28 rxdata12
10 rxdata51 rxdata35 rxdata15 rxdata0
11 vss vss vss vss
12 rxdata52 rxdata34 rxdata16 rxdata1
13 rxdata62 rxdata47 rxdata29 rxdata11
14 vccio vccio vccio vccio
15 rxdata61 rxdata46 rxdata30 rxdata10
16 rxdata53 rxdata33 rxdata17 rxdata2
17 rxdata60 rxdata37 rxdata31 rxdata9
18 rxdata54 rxdata32 rxdata26 rxdata3
19 vss vss vss vss
20 rxdata55 rxdataRD2 rxdata25 rxdata4
21 rxdata59 rxdata38 rxdataRD1 rxdata8
22 vccio vccio rxdata24 vccio
23 rxdata56 rxdata39 vccio rxdata18
24 rxdata45 rxvldRD rxdata23 rxdata7
25 vss rxdata40 vss vss
26 vccio rxvld rxdata22 rxdata6
27 rxdata57 vss rxckRD rxdata19
28 rxdata44 vccio vccio vccio
29 rxdata58 rxdata41 rxckn rxdata20
30 rxdata43 rxtrk rxdata21 rxdata5
31 vss rxdata42 rxckp vss
32 vccfwdio vccfwdio vccfwdio vccfwdio
33 vss vss vss vss
34 vccio txckp txdata42 vccio
35 txdata5 txdata21 txtrk txdata43
36 txdata20 txckn txdata41 txdata58
37 vss vss vss txdata44
38 txdata19 txckRD vccio txdata57
39 txdata6 txdata22 txvld vss
40 vccio vccio txdata40 vccio
41 txdata7 txdata23 txvldRD txdata45
42 txdata18 vss txdata39 txdata56
43 vss txdata24 vss vss
44 txdata8 txdataRD1 txdata38 txdata59
45 txdata4 txdata25 txdataRD2 txdata55
46 vccio vccio vccio vccio
47 txdata3 txdata26 txdata32 txdata54
48 txdata9 txdata31 txdata37 txdata60
49 txdata2 txdata17 txdata33 txdata53
50 txdata10 txdata30 txdata46 txdata61
51 vss vss vss vss
52 txdata11 txdata29 txdata47 txdata62
53 txdata1 txdata16 txdata34 txdata52
54 vccio vccio vccio vccio
55 txdata0 txdata15 txdata35 txdata51
56 txdata12 txdata28 txdata48 txdata63
57 vss vss txdata36 vss
58 txdata13 txdata27 txdata49 txdataRD3
59 txdataRD0 txdata14 vss txdata50
60 vccio vccio vccio vccio
61 vccio vccio vccio vccio
Die Edge
Note: In Figure 5-26, at 55-um pitch, the module depth of the 8-column bump map as shown
is approximately 1705 um. Rows 1, 2, and 61 are required for packaging solutions
using floating bridges without TSVs. They can be optional for packaging solutions with
TSVs.
Figure 5-27, Figure 5-28, and Figure 5-29 show the reference bump matrix for the 10-column, 16-
column, and 8-column x32 Advanced Package Modules, respectively. The lower left corner of the
bump map will be considered “origin” of a bump matrix and the leftmost column is Column 0.
It is strongly recommended to follow the bump matrices provided in Figure 5-27, Figure 5-28, and
Figure 5-29 for x32 Advanced Package Modules.
The following rule must be followed for the 10-column x32 Advanced Package bump matrix:
• The signal order within a column must be preserved. For example, Column 0 must contain the signals: txdataRD0, txdata0, txdata1, txdata2, txdata3, txdata4, and txdatasbRD. Similarly, 16-column and 8-column x32 Advanced Packages must preserve the signal order within a column of the respective bump matrices.
It is strongly recommended to follow the supply and ground pattern shown in the bump matrices. It
must be ensured that sufficient supply and ground bumps are provided to meet channel
characteristics (FEXT and NEXT) and power-delivery requirements.
When instantiating multiple x32 Advanced Package Modules, the same rules as defined in
Section 5.7.2.2 must be followed.
Figure 5-27. 10-column x32 Advanced Package Bump Map
Column0 Column1 Column2 Column3 Column4 Column5 Column6 Column7 Column8 Column9
vss vss vccio vccio vss
vss vccio vccio vss vss
vss vss vccio vccio vss
rxcksbRD rxcksb vccio rxdatasb rxdatasbRD
txdatasbRD txdatasb vss txcksb txcksbRD
vss txdata22 rxdata31 vccio vccio
vss txdata21 txckp rxdata30 rxdata13
txdata5 txdata23 vss rxdata14 vccio
vccio txdata20 txckn rxdata29 rxdata12
txdata6 vss rxdataRD1 rxdata15 rxdataRD0
txdata4 vss txckRD rxdata28 rxdata11
txdata7 txdata24 rxvldRD vss vss
vss txdata19 txtrk rxdata27 rxdata10
txdata8 txdata25 rxvld rxdata16 rxdata0
txdata3 txdata18 vss rxdata26 vss
vss txdata26 vss rxdata17 rxdata1
txdata2 txdata17 txvld rxdata25 rxdata9
txdata9 vss rxtrk rxdata18 vss
vccio vccio vccio vccfwdio vccfwdio
txdata10 txdata27 rxckRD rxdata19 rxdata2
txdata1 txdata16 txvldRD rxdata24 rxdata8
txdata11 txdata28 rxckn rxdata20 rxdata3
txdata0 vss vss vss rxdata7
txdata12 txdata29 rxckp vss vss
vss txdata15 txdataRD1 rxdata23 rxdata6
txdata13 txdata30 vss rxdata21 rxdata4
txdataRD0 txdata14 txdata31 rxdata22 rxdata5
vccio vccio vccfwdio vccfwdio vccfwdio
vccio vccio vccio vccfwdio vccfwdio
Die Edge
Note: In Figure 5-27, at 45-um pitch, the module depth of the 10-column reference bump
matrix as shown is approximately 680.5 um.
Figure 5-28. 16-column x32 Advanced Package Bump Map
Column0 Column1 Column2 Column3 Column4 Column5 Column6 Column7 Column8 Column9 Column10 Column11 Column12 Column13 Column14 Column15
vss vss vss vss vss vss vss vss
vccio vccio vccio vccio vccio vccio vccio vccio
txdatasbRD rxcksbRD txdatasb rxcksb txcksb rxdatasb txcksbRD rxdatasbRD
vss txdata5 vss vss vss vss vss vss
txdata4 txdata20 txdata22 txtrk rxdata30 rxdata26 rxdata13 rxdataRD0
txdata2 vss txdata21 txckp rxdata31 rxdata27 rxdata14 rxdata11
txdata3 txdata17 txdata23 vss rxdata29 vss rxdata12 vss
txdata1 txdata6 txdata19 txckn vss rxdata28 rxdata15 rxdata10
txdata8 vss vss txvld rxdataRD1 rxdata25 vss rxdata0
vss txdata7 txdata18 txckRD rxvldRD rxdata24 rxdata16 rxdata9
txdata9 txdata16 txdata24 txvldRD rxckRD rxdata18 rxdata7 vss
txdata0 vss txdata25 txdataRD1 rxvld vss rxdata17 rxdata8
txdata10 txdata15 txdata28 vss rxckn rxdata19 rxdata6 rxdata1
vss txdata12 vss txdata29 vss rxdata23 vss rxdata3
txdata11 txdata14 txdata27 txdata31 rxckp rxdata21 rxdata5 rxdata2
txdataRD0 txdata13 txdata26 txdata30 rxtrk rxdata22 rxdata20 rxdata4
vccio vccio vccio vccio vccfwdio vccfwdio vccfwdio vccfwdio
vccio vccio vccio vccio vccfwdio vccfwdio vccfwdio vccfwdio
Die Edge
Note: In Figure 5-28, at 25-um pitch, the module depth of the 16-column reference bump
matrix as shown is approximately 237.5 um.
Note: In Figure 5-29, at 55-um pitch, the module depth of the 8-column reference bump
matrix as shown is approximately 962.5 um.
Figure 5-30 shows the signal exit order for the 10-column x32 Advanced Package bump map.
Figure 5-30. 10-column x32 Advanced Package Bump Map: Signal Exit Order
Left to Right
txdataRD0 txdata0 txdata1 txdata2 txdata3 txdata4 txdata5 txdata6 txdata7 txdata8 Cont…
Tx
Cont… txdata9 txdata10 txdata11 txdata12 txdata13 txdata14 txdata15 txdata16 txdata17 txdata18 Cont1…
Breakout
Cont1… txdata19 txdata20 txdata21 txdata22 txdata23 txdata24 txdata25 txdata26 txdata27 txdata28 Cont2…
Cont2… txdata29 txdata30 txdata31 txdataRD1 txvldRD txvld txtrk txckRD txckn txckp
Left to Right
rxckp rxckn rxckRD rxtrk rxvld rxvldRD rxdataRD1 rxdata31 rxdata30 rxdata29 Cont…
Rx
Cont… rxdata28 rxdata27 rxdata26 rxdata25 rxdata24 rxdata23 rxdata22 rxdata21 rxdata20 rxdata19 Cont1…
Breakout
Cont1… rxdata18 rxdata17 rxdata16 rxdata15 rxdata14 rxdata13 rxdata12 rxdata11 rxdata10 rxdata9 Cont2…
Cont2… rxdata8 rxdata7 rxdata6 rxdata5 rxdata4 rxdata3 rxdata2 rxdata1 rxdata0 rxdataRD0
Figure 5-31. 10-column x32 Advanced Package Bump Map (reference with enhanced power and ground bumps)
1 2 3 4 5 6 7 8 9 10
1 vss vss vccio vccio vss
2 vss vccio vccio vss vss
3 vss vss vccio vccio vss
4 rxcksbRD rxcksb vccio rxdatasb rxdatasbRD
5 txdatasbRD txdatasb vss txcksb txcksbRD
6 txdata5 vccio vss rxdata14 vccio
7 txdata4 txdata21 txckp rxdata30 rxdata13
8 txdata6 txdata22 vccio vccio rxdataRD0
9 vss vss txckn rxdata29 rxdata12
10 txdata7 txdata23 rxdata31 rxdata15 vccio
11 vccio txdata20 txckRD rxdata28 vss
12 vccio vccio vccio rxdata16 rxdata0
13 txdata3 txdata19 txtrk rxdata27 rxdata11
14 txdata8 txdata24 rxdataRD1 vccio rxdata1
15 vss vss vss vss rxdata10
16 txdata9 txdata25 rxvldRD rxdata17 vss
17 txdata2 txdata18 txvld rxdata26 vss
18 vccio txdata26 rxvld rxdata18 rxdata2
19 vccio txdata17 txvldRD rxdata25 rxdata9
20 txdata10 vccio vccio vccio vccio
21 txdata1 vss txdataRD1 rxdata24 rxdata8
22 txdata11 txdata27 rxtrk rxdata19 rxdata3
23 txdata0 txdata16 vss vccfwdio vccfwdio
24 vccio txdata28 rxckRD rxdata20 vss
25 vss txdata15 txdata31 rxdata23 rxdata7
26 txdata12 txdata29 rxckn vss vss
27 txdataRD0 vss vss rxdata22 rxdata6
28 txdata13 txdata30 rxckp rxdata21 rxdata4
29 vss txdata14 vss vss rxdata5
30 vccio vccio vccfwdio vccfwdio vccfwdio
31 vccio vccio vccio vccfwdio vccfwdio
Die Edge
Note: In Figure 5-31, at 45-um pitch, the module depth of the 10-column bump map as
shown is approximately 725 um. Rows 1, 2, and 31 are required for packaging
solutions using floating bridges without through-silicon vias (TSVs). They can be
optional for packaging solutions with TSVs. The vccfwdio bumps are required for the
tightly coupled mode up to 16 GT/s. For higher speeds, the vccfwdio bumps may be
connected to the vccio bumps in package.
Note: In Figure 5-32, at 25-um pitch, the module depth of the 16-column bump map as
shown is approximately 250 um. Rows 1 and 19 are required for packaging solutions
using floating bridges without TSVs. They can be optional for packaging solutions with
TSVs. The vccfwdio bumps are required for the tightly coupled mode up to 16 GT/s. For
higher speeds, the vccfwdio bumps may be connected to the vccio bumps in package.
Figure 5-33. 8-column x32 Advanced Package Bump Map (reference with enhanced power and ground bumps)
1 2 3 4 5 6 7 8
1 vss vccio vccio vss
2 vss vccio vccio vss
3 vss vccio vccio vss
4 txdatasbRD txdatasb txcksb txcksbRD
5 rxcksbRD rxcksb rxdatasb rxdatasbRD
6 txdata22 txckp rxdata14 rxdataRD0
7 vss txdata23 vss rxdata13
8 txdata21 txckn vccio vccio
9 txdata5 vss rxdata30 rxdata12
10 vccio vccio rxdata15 rxdata0
11 txdata6 txdata24 rxdata31 rxdata11
12 txdata20 txckRD rxdata16 rxdata1
13 vss vss vss vss
14 txdata19 txtrk rxdata29 rxdata2
15 txdata7 txdata25 rxdataRD1 rxdata10
16 txdata18 vccio rxdata28 rxdata3
17 vccio txdata26 vss rxdata9
18 txdata17 txvld rxdata27 vccio
19 txdata4 vss rxvldRD rxdata8
20 txdata8 txvldRD vccio rxdata4
21 vss txdata27 rxvld rxdata17
22 txdata9 vccio rxdata26 vss
23 txdata3 txdata28 vss rxdata18
24 txdata10 txdataRD1 rxdata25 rxdata7
25 txdata2 txdata29 rxtrk rxdata19
26 vccio vccio vccio vccfwdio
27 txdata1 txdata16 rxckRD rxdata20
28 txdata11 txdata31 rxdata24 rxdata6
29 txdata0 txdata15 vss vss
30 txdata12 txdata30 vccfwdio rxdata5
31 vss vss rxckn rxdata21
32 txdata13 vss rxdata23 vss
33 txdataRD0 txdata14 rxckp rxdata22
34 vccio vccio vccfwdio vccfwdio
35 vccio vccio vccfwdio vccfwdio
Die Edge
Note: In Figure 5-33, at 55-um pitch, the module depth of the 8-column bump map as shown
is approximately 990 um. Rows 1, 2, and 35 are required for packaging solutions using
floating bridges without TSVs. They can be optional for packaging solutions with TSVs.
The vccfwdio bumps are required for the tightly coupled mode up to 16 GT/s. For
higher speeds, the vccfwdio bumps may be connected to the vccio bumps in package.
These bump maps have been optimized to minimize the lane-to-lane routing mismatch, which is unavoidable when two different bump maps at different bump pitches interoperate. Table 5-14 summarizes the max skew due to bump locations for the representative cases. As a rule of thumb, each 150-um mismatch causes about 1 ps of timing skew. This skew can be reduced or eliminated by length matching in the package channel layout design.
Table 5-14. Max Skew Due to Bump Locations (um; blank cells mirror across the diagonal)
Tx \ Rx                   16-col x64   16-col x32   10-col x64   10-col x32   8-col x64   8-col x32
                          at 25 um     at 25 um     at 45 um     at 45 um     at 55 um    at 55 um
16-column x64 at 25 um    0            125          351          399          560         605
16-column x32 at 25 um                 0            351          393          563         618
10-column x64 at 45 um                              0            159          351         463
10-column x32 at 45 um                                           0            428         398
8-column x64 at 55 um                                                         0           468
8-column x32 at 55 um                                                                     0
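As a worked example of the rule of thumb above, the largest entry in Table 5-14, the 618-um mismatch between the 16-column x32 map at 25 um and the 8-column x32 map at 55 um, corresponds to roughly 618/150 ≈ 4.1 ps of skew before any package-level length matching.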
However, if x64-to-x32 modules or x32-to-x32 modules have Normal and Mirrored orientation as shown in Figure 5-34 and Figure 5-35, respectively, signal traces between the TX half and RX half will crisscross and require a swizzling technique. Swizzling refers to rearranging the physical connections between the signal bumps of two chiplets to optimize the layout and routing on the interposer or substrate; it involves changing the order of the connections, or routing on different layers, without altering the netlist or the electrical functionality of the design. Moreover, connections between 8-column, 16-column, and 10-column modules may need to be routed to adjacent columns (swizzle and go across). In all cases, the electrical spec must be met for all these connections.
It is optional for a x64 Module to support interoperability with a x32 Module. The following requirements apply when a x64 module supports x32 interoperability:
• When a x64 module connects to a x32 module, the connection shall always be contained to the lower half of the x64 module. This must be followed even with the x32 lane reversal described below.
• Electrical specifications must be met for combinations that require signal-routing swizzling.
• Lane reversal is not permitted on CKP-, CKN-, CKRD-, VLD-, VLDRD-, TRK-, and sideband-related pins. These pins need to be connected appropriately. Swizzling for these connections is acceptable.
• A x64 module must support a lane-reversal mode in a x32 manner (i.e., TD_P[31:0] = TD_L[0:31]; see the sketch after this list). When a x64 module is connected to a x32 module, in either Normal or Mirrored orientation, the upper 32 bits are not used and should be disabled.
• It is not permitted for a single module of larger width to simultaneously interoperate with two or more modules of a lower width. For example, a x64 Advanced Package module physically connected to two x32 Advanced Package modules is prohibited.
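The following is a minimal sketch of the lane-reversal mapping above (illustrative only; the function name and dictionary representation are assumptions, not part of this specification):

    def reversed_lane_map(width):
        """Lane reversal: physical Lane p carries logical Lane (width - 1 - p),
        i.e., TD_P[width-1:0] = TD_L[0:width-1]."""
        return {p: width - 1 - p for p in range(width)}

    # The x32-manner reversal required of a x64 module: TD_P[31:0] = TD_L[0:31].
    # The upper 32 Lanes are unused and disabled when connected to a x32 module.
    m = reversed_lane_map(32)
    assert m[0] == 31 and m[31] == 0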
Table 5-15 summarizes the connections between combinations of x64 and x32 modules in both
Normal-to-Normal and Normal-to-Mirrored module orientations. The table applies to all combinations
of 10-column, 16-column, or 8-column modules on either side of the Link.
Table 5-15. Connections Between Combinations of x64 and x32 Modules
Tx Module (Normal)   Rx x64 (Normal)        Rx x32 (Normal)        Rx x64 (Mirrored)        Rx x32 (Mirrored)
x64                  TX[63:0] – RX[63:0]a   TX[31:0] – RX[31:0]b   rTX[63:0] – RX[0:63]d    rTX[31:0] – RX[0:31]c e
x32                  TX[31:0] – RX[31:0]b   TX[31:0] – RX[31:0]b   rTX[31:0] – RX[0:31]c e  rTX[31:0] – RX[0:31]c e
a. Entry “TX[63:0] – RX[63:0]” is for Normal Module connections between two x64 modules without lane reversal.
This applies to x64–to–x64 combination.
b. Entry “TX[31:0] – RX[31:0]” is for Normal Module connections between lower 32-bit half without lane reversal.
This applies to x64-to-x32, x32-to-x64, and x32-to-x32 combinations.
c. The prefix “r” means lane reversal is enabled on the Transmitter lanes, and:
• “rTX[63:0]” means TD_P[63:0] = TD_L[0:63], to be connected with RD_P[0:63]
• “rTX[31:0]” means TD_P[31:0] = TD_L[0:31], to be connected with RD_P[0:31].
d. Entry “rTX[63:0] – RX[0:63]” = Normal-to-Mirrored Module connections between two x64 modules with TX lane reversal.
This applies to x64-to-x64 Normal-to-Mirrored combinations.
e. Entry “rTX[31:0] – RX[0:31]” = Normal-to-Mirrored Module connections between lower 32-bit half with TX lane reversal. This
applies to x64-to-x32, x32-to-x64, and x32-to-x32 Normal-to-Mirrored combinations.
The defined bump matrices can achieve optimal skew between bump matrices of differing depths,
and the worst-case trace-reach skews are expected to be within the maximum lane-to-lane skew limit
for the corresponding data rates as defined in Section 5.3 and Section 5.4.
Figure 5-34 and Figure 5-35 show examples of normal and mirrored x64-to-x32 and x32-to-x32
Advanced Package Module connections, respectively.
[Figure 5-34 / Figure 5-35: Examples of Normal and Mirrored x64-to-x32 and x32-to-x32 Advanced Package Module connections; each example shows two 10-column x32 bump maps (per-column signal assignment as in Figure 5-27) placed side by side with their Die Edges facing the connection.]
The Module naming is defined to help with connecting the Modules deterministically which, in turn,
will help minimize the multiplexing requirements in the Multi-module PHY Logic (MMPL).
The naming of M0, M1, M2, and M3 will apply to 1, 2, or 4 Advanced Package modules that are
aggregated through the MMPL.
Figure 5-36 shows the naming convention for 1, 2, or 4 Advanced Package Modules when they are connected to their “Standard Die Rotate” Module counterparts that have the same number of Advanced Package Modules.
Note: The double-ended arrows in Figure 5-36 through Figure 5-39 indicate Module-to-
Module connections.
Figure 5-36. Naming Convention for One-, Two-, and Four-module Advanced Package
Paired with “Standard Die Rotate” Configurations
Figure 5-37 shows the naming convention for 1, 2, or 4 Advanced Package modules when they are
connected to their “Mirrored Die Rotate” counterparts with the same number of Advanced Package
modules.
Figure 5-37. Naming Convention for One-, Two-, and Four-module Advanced Package
Paired with “Mirrored Die Rotate” Configurations
Table 5-16 summarizes the connections between the combinations shown in Figure 5-36 and
Figure 5-37.
Table 5-16. Advanced Package Module Connections (Same # of Modules on Both Sides)
x1 – x1:
• Standard Die Rotate Counterpart: M0 – M0
• Mirrored Die Rotate Counterpart: M0 – M0
x2 – x2:
• Standard Die Rotate Counterpart: M0 – M1, M1 – M0
• Mirrored Die Rotate Counterpart: M0 – M0, M1 – M1
x4 – x4:
• Standard Die Rotate Counterpart: M0 – M2, M1 – M3, M3 – M1, M2 – M0
• Mirrored Die Rotate Counterpart: M0 – M0, M1 – M1, M2 – M2, M3 – M3
Figure 5-38 shows the naming convention for 1, 2, or 4 Advanced Package modules when they are
connected to their “Standard Die Rotate” counterparts that have a different number of Advanced
Package modules.
Figure 5-39 shows the naming convention for 1, 2, or 4 Advanced Package modules when they are
connected to their “Mirrored Die Rotate” counterparts that have a different number of Advanced
Package modules.
[Figure: Module naming example for a single-module connection; the M0 (x64) Tx/Rx pair on one die connects to the M0 (x64) Rx/Tx pair on the partner die.]
Table 5-17 summarizes the connections between the combinations shown in Figure 5-38 and
Figure 5-39.
Table 5-17. Advanced Package Module Connections (Different # of Modules on Both Sides)
x2 – x1:
• Standard Die Rotate Counterparta: M0 – M0, M1 – NC
• Mirrored Die Rotate Counterparta: M0 – M0, M1 – NC
x4 – x2:
• Standard Die Rotate Counterparta: M0 – M0, M1 – M1, M3 – NC, M2 – NC
• Mirrored Die Rotate Counterparta: M0 – M1, M1 – M0, M2 – NC, M3 – NC
x4 – x1:
• Standard Die Rotate Counterparta: M0 – M0, M1 – NC, M3 – NC, M2 – NC
• Mirrored Die Rotate Counterparta: M0 – M0, M1 – NC, M2 – NC, M3 – NC
a. NC indicates no connection.
Table 5-18. IL and Crosstalk for Standard Package: With Receiver Termination Enabled
VTF Crosstalk (dB):
• 4 GT/s and 8 GT/sa: XT(fN) < 3 * L(fN) - 11.5 and XT(fN) < -25
• 12 GT/s and 16 GT/sb: XT(fN) < 3 * L(fN) - 11.5 and XT(fN) < -25
• 24 GT/s and 32 GT/sc: XT(fN) < 2.5 * L(fN) - 10 and XT(fN) < -26
a. Voltage Transfer Function for 4 GT/s and 8 GT/s (Tx: 30 ohm / 0.3pF; Rx: 50 ohm / 0.3pF).
b. Voltage Transfer Function for 12 GT/s and 16 GT/s (Tx: 30 ohm / 0.2pF; Rx: 50 ohm / 0.2pF).
c. Voltage Transfer Function for 24 GT/s and 32 GT/s (Tx: 30 ohm / 0.125pF; Rx: 50 ohm / 0.125pF).
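A minimal sketch of checking these crosstalk limits (the function name and argument conventions are illustrative assumptions, not from this specification):

    def standard_pkg_xtalk_ok(l_fn_db, xt_fn_db, data_rate_gts):
        """Evaluate the Table 5-18 VTF crosstalk limits (Receiver termination
        enabled); l_fn_db and xt_fn_db are the VTF loss and crosstalk in dB
        at the Nyquist frequency fN (both normally negative)."""
        if data_rate_gts in (24, 32):
            return xt_fn_db < 2.5 * l_fn_db - 10 and xt_fn_db < -26
        return xt_fn_db < 3 * l_fn_db - 11.5 and xt_fn_db < -25

    # e.g., L(fN) = -5 dB and XT(fN) = -30 dB at 16 GT/s:
    # -30 < 3*(-5) - 11.5 = -26.5 and -30 < -25, so the channel passes.
    assert standard_pkg_xtalk_ok(-5, -30, 16)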
The IL and crosstalk requirement at the Nyquist frequency without Receiver termination is defined by Table 5-19. Loss and crosstalk specifications between DC and the Nyquist frequency fN follow the same methodology defined in Section 5.7.2.1.
a. Voltage Transfer Function for 4 GT/s and 8 GT/s (Tx: 30 ohm / 0.3pF; Rx: 0.2 pF).
b. Voltage Transfer Function for 12 GT/s and 16 GT/s (Tx: 30 ohm / 0.2pF; Rx: 0.2 pF).
[Table: Standard Package Module signal list (Data and Sideband signal groups).]
It is strongly recommended to follow the bump matrices provided in Figure 5-40 for one module and
Figure 5-42 for two module Standard Packages. The lower left corner of the bump map will be
considered “origin” of a bump matrix.
The signal exit order for the x16 and x32 Standard Package bump matrices is shown in Figure 5-41 and Figure 5-43, respectively.
The following rules must be followed for Standard Package bump matrices:
• The signal order within a column must be preserved. For example, for a x16 (one-module Standard Package) shown in Figure 5-40, Column 1 must contain the signals: txdata0, txdata1, txdata4, txdata5, and txdatasb.
• The signals must exit the bump field in the order shown in Figure 5-41. Layer 1 and Layer 2 are
two different signal routing layers in a Standard Package.
It is strongly recommended to follow the supply and ground pattern shown in the bump matrices. It
must be ensured that sufficient supply and ground bumps are provided to meet channel
characteristics (FEXT and NEXT) and power-delivery requirements.
The following rules must be followed for instantiating multiple modules of Standard Package bump
matrix:
• When looking at a die such that the UCIe Modules are on the south side, Tx should always
precede Rx within a module along the die’s edge when going from left to right.
• When instantiating multiple modules, the modules must be stepped in the same orientation and
abutted. Horizontal or vertical mirroring is not permitted.
If more Die Edge Bandwidth density is required, it is permitted to stack two modules before abutting.
If two modules are stacked, the package may need to support at least four routing layers for UCIe
signal routing. An example of stacked Standard Package Module instantiations is shown in
Figure 5-42.
• If only one stacked module is instantiated, when looking at a die such that the UCIe Modules are
on the south side, Tx should always precede Rx within a module along the die’s edge when going
from left to right.
• When instantiating multiple stacked modules, the modules must be stepped in the same
orientation and abutted. Horizontal or vertical mirroring is not permitted.
Note: An example of signal routing for stacked module is shown in Figure 5-44.
IMPLEMENTATION NOTE
Figure 5-45 shows a breakout design reference with the Standard Package channel based on the
bump pitch and on routing design rules.
Figure 5-45. Standard Package reference configuration
(Figure content: breakout dimensions D, L, and S, and bump pitches Py and Px.)
• 4-row deep breakout per routing layer
• Example 1: Py = 190.5 um, Px ≈ 111.5 um, P ≈ 110 um
• Example 2: Py = 190.5 um, Px ≈ 177 um, P ≈ 130 um
UCIe-S x8 support is limited to a single-module configuration. When a UCIe-S x8 port is connected to
a multi-module UCIe-S x16 port, it is always connected to Module 0 of that port.
Figure 5-46 shows the reference bump matrix for a x8 Standard Package.
It is strongly recommended to follow the bump matrix provided in Figure 5-46. The lower left corner
of the bump map is considered the "origin" of the bump matrix.
The same rules as mentioned for x16 and x32 Standard Package bump matrices in Section 5.7.3.1
must be followed for the x8 bump matrix.
The naming of M0, M1, M2, and M3 will apply to 1, 2, or 4 Standard Package modules that are
aggregated through MMPL, in stacked and unstacked configuration combinations.
Figure 5-47 shows the naming convention for 1, 2, or 4 Standard Package modules when they are
connected to their “Standard Die Rotate” module counterparts with the same number of Standard
Package modules, with either same stack or same unstacked configuration.
Note: The double-ended arrows in Figure 5-47 through Figure 5-51 indicate Module-to-
Module connections.
Figure 5-47. Naming Convention for One-, Two-, and Four-module Standard Package
Paired with “Standard Die Rotate” Configurations
Figure 5-48 shows the naming convention for 1, 2, or 4 Standard Package modules when they are
connected to their “Mirrored Die Rotate” counterparts that have same number of Standard Package
modules, with either same stack or same unstacked configuration.
Figure 5-48. Naming Convention for One-, Two-, and Four-module Standard Package
Paired with “Mirrored Die Rotate” Configurations
Table 5-21 summarizes the connections between the combinations shown in Figure 5-47 and
Figure 5-48.
Standard Package Module Connections   Standard Die Rotate   Mirrored Die Rotate      Mirrored Die Rotate
(Same # of Modules on Both Sides)     Counterpart           Counterpart (Option 1)   Counterpart (Option 2)(c)
x1 – x1                               • M0 – M0             • M0 – M0
x2 Unstacked – x2 Unstacked           • M0 – M1             • M0 – M0
                                      • M1 – M0             • M1 – M1
x2 Stacked – x2 Stacked               • M0 – M0             • M0 – M0                • M0 – M1
                                      • M1 – M1             • M1 – M1                • M1 – M0
x4 Unstacked – x4 Unstacked           • M0 – M2             • M0 – M0
                                      • M1 – M3             • M1 – M1
                                      • M3 – M1             • M2 – M2
                                      • M2 – M0             • M3 – M3
x4 Stacked – x4 Stacked               • M0 – M2             • M0 – M0                • M0 – M1
                                      • M1 – M3             • M1 – M1                • M1 – M0
                                      • M3 – M1             • M2 – M2                • M2 – M3
                                      • M2 – M0             • M3 – M3                • M3 – M2
c. For some mirrored cases, there are possible alternative connections to allow design choices between more routing
   layers vs. max data rates, shown as Option 1 and Option 2 in Table 5-21. For x2 – x2 Stacked and x4 – x4 Stacked
   cases, Option 1 typically requires 2x the routing layers and enables nominal data rates, while Option 2 enables
   the same layer count but at reduced max data rates due to potential crosstalk. See Figure 5-51 for Option 2
   connection illustrations.
Figure 5-49 shows the naming convention for 1, 2, or 4 Standard Package modules when they are
connected to their “Standard Die Rotate” counterparts that have a different number of Standard
Package modules.
Figure 5-50 shows the naming convention for 1, 2, or 4 Standard Package Modules when they are
connected to their “Mirrored Die Rotate” counterparts that have a different number of Standard
Package Modules.
Figure 5-51 illustrates the possible alternative connections for some mirrored cases to allow design
choices between more routing layers vs. reduced max data rates due to potential crosstalk, shown as
Option 2 in Table 5-21 and Table 5-22.
Figure 5-51. Additional Examples for Standard Package Configurations Paired with
“Mirrored Die Rotate” Counterparts, with a Different Number of Modules
(Figure panels: x2 Stacked – x2 Stacked; x4 Stacked – x4 Stacked; x4 Stacked – x2 Stacked; x4 Stacked – x2 Unstacked.)
Table 5-22 summarizes the connections between the combinations shown in Figure 5-49, Figure 5-50,
and Figure 5-51.
Standard Package Module Connections     Standard Die Rotate   Mirrored Die Rotate      Mirrored Die Rotate
(Different # of Modules on Both Sides)  Counterpart(a)        Counterpart (Option 1)   Counterpart (Option 2)
x4 Stacked – x4 Unstacked               • M0 – M2             • M0 – M0
                                        • M1 – M3             • M1 – M1
                                        • M3 – M1             • M2 – M2
                                        • M2 – M0             • M3 – M3
x4 Stacked – x2 Stacked                 • M0 – M0             • M0 – M0                • M0 – M1
                                        • M1 – M1             • M1 – M1                • M1 – M0
                                        • M3 – NC             • M2 – NC                • M2 – NC
                                        • M2 – NC             • M3 – NC                • M3 – NC
x4 Stacked – x2 Unstacked               • M0 – M0             • M0 – M1                • M0 – NC
                                        • M1 – NC             • M1 – NC                • M1 – M1
                                        • M3 – NC             • M2 – M0                • M2 – NC
                                        • M2 – M1             • M3 – NC                • M3 – M0
x4 Stacked – x1                         • M0 – M0             • M0 – M0
                                        • M1 – NC             • M1 – NC
                                        • M3 – NC             • M3 – NC
                                        • M2 – NC             • M2 – NC
x4 Unstacked – x2 Unstacked             • M0 – M1             • M0 – M0
                                        • M1 – M0             • M1 – M1
                                        • M3 – NC             • M2 – NC
                                        • M2 – NC             • M3 – NC
x4 Unstacked – x2 Stacked               • M0 – M1             • M0 – M1
                                        • M1 – M0             • M1 – M0
                                        • M3 – NC             • M2 – NC
                                        • M2 – NC             • M3 – NC
x4 Unstacked – x1                       • M0 – M0             • M0 – M0
                                        • M1 – NC             • M1 – NC
                                        • M3 – NC             • M3 – NC
                                        • M2 – NC             • M2 – NC
x2 Stacked – x2 Unstacked               • M0 – M1             • M0 – M0
                                        • M1 – M0             • M1 – M1
x2 Stacked – x1                         • M0 – M0             • M0 – M0
                                        • M1 – NC             • M1 – NC
x2 Unstacked – x1                       • M0 – M0             • M0 – M0
                                        • M1 – NC             • M1 – NC
a. NC indicates no connection.
On a 2-module or 4-module link, if one or more module-pairs have failed, the link will be degraded
and shall comply with the following rules:
1. The degraded link shall be either one or two modules, and shall not be three modules.
a. For a 4-module link:
i. If any one module-pair has failed, the link shall be degraded to a 2-module link.
Figure 5-52 illustrates an example with a x4 Unstacked connected to a x4 Unstacked “Standard Die
Rotate” counterpart with one M0 – M2 pair failed. The M1 – M3 pair on its left shall be disabled
accordingly to comply with the rules defined above, which will be denoted as “x (d)” in Table 5-23.
Table 5-23 summarizes the resulting degraded link if there are one, two, or three failed module-pairs
for the x4 Unstacked to x4 Unstacked configuration.
Table 5-23. Summary of Degraded Links when Standard Package Module-pairs Fail
M0 – M2 x x (d) x x x x x x
M1 – M3 x (d) x x x x x x x
M3 – M1 x x (d) x x x x x x
M2 – M0 x (d) x x x x x x x
All other module configurations shall follow the same Module Degrade rules as defined above.
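IMPLEMENTATION NOTE
The Module Degrade rules above lend themselves to a short sketch. The following Python fragment is
illustrative only; the grouping of module-pairs into adjacent halves is an assumption that matches the
x4 Unstacked example in Figure 5-52, not a normative algorithm.

    # Illustrative module-degrade sketch: a degraded link must be 1 or 2
    # modules, never 3 (assumed grouping of module-pairs into adjacent halves).
    def degrade(pairs_ok):
        """pairs_ok: 4 booleans, True if that module-pair trained successfully.
        Returns the indices of the module-pairs kept active."""
        halves = [(0, 1), (2, 3)]
        good = [h for h in halves if all(pairs_ok[i] for i in h)]
        if len(good) == 2:
            return [0, 1, 2, 3]          # no failure: full 4-module link
        if len(good) == 1:
            return list(good[0])         # one bad half: 2-module link
        survivors = [i for i, ok in enumerate(pairs_ok) if ok]
        return survivors[:1]             # both halves hit: at most a 1-module link

    # Figure 5-52 example: pair M0-M2 (index 0) fails, so its neighbor
    # M1-M3 (index 1) is also disabled, leaving a 2-module link.
    assert degrade([False, True, True, True]) == [2, 3]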
Figure 5-53 shows the bump map for a UCIe-S sideband-only port. Figure 5-54 shows the supported
configurations for a UCIe-S sideband-only port.
(Figure 5-53 content: a single sideband bump column with, from top to bottom, vss, m1rxdatasb,
m1rxcksb, vccaon, vss, m1txcksb, and m1txdatasb.)
(Figure 5-54 content: Config 1 pairs a UCIe-S SB-only port with another UCIe-S SB-only port; Config 2
pairs a UCIe-S SB-only port (SB) with a full UCIe-S x16/x8 port (SB and MB).)
In this mode, there is no Receiver termination and the Transmitter must provide a full-swing output.
Further optimization of the PHY circuit and power reduction are possible in this mode. For example, a
tuned inverter can potentially be used instead of a front-end amplifier, and training complexity can be
reduced (e.g., voltage-reference training can be simplified).
Eye Width (rectangular eye mask with specified eye height) 0.7 UI
Loss and crosstalk requirements follow the same VTF method, adjusted to the eye mask defined in
Table 5-24. Table 5-25 shows the specification at the Nyquist frequency.
a. Based on Voltage Transfer Function (Tx: 25 ohm / 0.25 pF; Rx: 0.2 pF).
Loss and crosstalk specifications between DC and Nyquist fN follow the same methodology defined in
Section 5.7.2.1.
Although this mode is intended primarily for Advanced Package, it may also be used for Standard
Package when two dies are near one another and the Receiver is unterminated.
For x64 Advanced Package modules, the four redundant bumps for data repair are divided into two
groups of two. Figure 5-55 shows an illustration of x64 Advanced package module redundant bump
assignment for data signals. TRD_P0 and TRD_P1 are allocated to the lower 32 data Lanes and
TRD_P2 and TRD_P3 are allocated to the upper 32 data Lanes. Each group is permitted to remap up
to two Lanes. For example, TD_P15 is a broken Lane in the lower half, and TD_P32 and TD_P40 are
broken Lanes in the upper 32 Lanes. Figure 5-56 illustrates Lane remapping for the broken Lanes.
For x32 Advanced Package modules, only the lower 32 data lanes and TRD_P0 and TRD_P1 apply in
Figure 5-55 and Figure 5-56.
Details and implementation of Lane remapping for Data, Clock, Track, and Valid are provided in
Section 4.3.
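IMPLEMENTATION NOTE
As a simplified illustration of the redundant-bump grouping described above, the sketch below performs
a direct substitution of broken Lanes onto the redundant bumps. The function name is hypothetical, and
the actual remap procedure in Section 4.3, which shifts Lanes toward the redundant bumps, differs in
detail.

    def assign_spares(broken_lanes):
        # x64 module: TRD_P0/TRD_P1 serve Lanes 0-31; TRD_P2/TRD_P3 serve Lanes 32-63.
        groups = (((0, 32), ["TRD_P0", "TRD_P1"]), ((32, 64), ["TRD_P2", "TRD_P3"]))
        remap = {}
        for (lo, hi), spares in groups:
            bad = [l for l in broken_lanes if lo <= l < hi]
            if len(bad) > 2:
                raise ValueError("more than two broken Lanes in one half: not repairable")
            for lane, spare in zip(bad, spares):
                remap[f"TD_P{lane}"] = spare
        return remap

    # The example above: TD_P15 broken in the lower half, TD_P32 and TD_P40 in the upper half.
    print(assign_spares([15, 32, 40]))
    # {'TD_P15': 'TRD_P0', 'TD_P32': 'TRD_P2', 'TD_P40': 'TRD_P3'}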
(Figures 5-55 and 5-56 content: the x64 module data-bump columns TD_P0 through TD_P63, with redundant
bumps TRD_P0/TRD_P1 serving the lower 32 Lanes and TRD_P2/TRD_P3 serving the upper 32 Lanes; Figure 5-56
additionally shows the remapped assignments for the broken Lanes in the example.)
(Table content: supported data rates of 4, 8, 12, 16, 24, and 32 GT/s per Package Type.)
(Figure content: Clock, Valid, and Data timing with 8 UI Valid framing windows.)
As described in Section 4.1.3, clock must be gated only after Valid signal remains low for 16 UI
(8 cycles) of postamble clock for half-rate clocking and 32 UI (8 cycles) of postamble clock for
quarter-rate clocking, unless free running clock mode is negotiated.
Idle state is when there is no data transmission on the mainband. During Idle state, Data, Clock, and
Valid Lanes must hold values as follows:
• If the Link is unterminated (all Advanced Package and unterminated Standard Package Links),
some Data Lane Transmitters are permitted to remain toggling up to the same transition density
as the scrambled data without advancing the scrambler state. The remaining Data Lane
Transmitters must hold the data of the last transmitted bit. Valid Lane must be held low until the
next normal transmission.
— In Strobe mode, the clock level in a clock-gated state for half-rate clocking (after meeting
postamble requirement) must alternate between differential high and differential low during
consecutive clock-gating events. For quarter-rate clocking, the clock level in a clock-gated
state must alternate between high and low for both phases (Phase-1 and Phase-2)
simultaneously. Clock must drive a differential (simultaneous) low for half- (quarter-) rate
clocking for at least 1 UI or a maximum of 8 UI before normal operation. The total clock-gated
period must be an integer multiple of 8 UI. Example shown in Figure 5-58 and Figure 5-59.
— In Continuous mode, the clock remains free running (examples shown in Figure 5-60). Total
idle period must be an integer multiple of 8 UI.
(Figure content for Figures 5-58 through 5-60: Valid held low during idle; clock gated after a 16 UI
postamble for half-rate clocking or a 32 UI postamble for quarter-rate clocking, with a 1 UI to 8 UI
drive period before normal operation resumes.)
• If the Link is terminated (Standard Package terminated Links), some Data Lane Transmitters are
permitted to remain toggling up to the same transition density as the scrambled data without
advancing the scrambler state. The remaining Data Lane Transmitters must hold the data of the last-
transmitted bit. The Valid Lane must be held low until the next normal transmission. Note that keeping
the Transmitter toggling incurs an extra power penalty and should be applied with discretion.
— In Strobe mode, the clock level in a clock-gated state for half-rate clocking (after meeting
postamble requirement) must alternate between differential high and differential low during
consecutive clock-gating events. For quarter-rate clocking, the clock level in a clock-gated
state must alternate between high and low for both phases (Phase-1 and Phase-2)
simultaneously. Transmitters must precondition the Data Lanes to a 0 or 1 (V) and clock must
drive a differential low for at least 1 UI or up to a maximum of 8 UIs for half- (quarter-) rate
clocking before the normal transmission. The total clock-gated period must be an integer
multiple of 8 UI. Example shown in Figure 5-61 and Figure 5-63.
— In Continuous mode, the clock remains free running (examples shown in Figure 5-64).
Transmitters must precondition the Data Lanes to a 0 or 1 (V) for at least 1 UI or up to a
maximum of 8 UI. Total idle period must be an integer multiple of 8 UI.
Note: Entry into and Exit from Hi-Z state are analog transitions. Hi-Z represents Transmitter
state and the actual voltage during this period will be pulled Low due to termination to
ground at the Receiver.
Figure 5-61. Data, Clock, Valid Gated Levels for Half-rate Clocking:
Terminated Link
(Figure content: parked clock level alternating between gating events; Valid held low; Data[n] driven
through D0..D7 and then Hi-Z; 16 UI postamble followed by a 1 UI to 8 UI precondition period.)
Figure 5-62. Data, Clock, Valid Gated Levels for Quarter-rate Clocking:
Terminated Link
(Figure content: as in Figure 5-61, but with a 32 UI postamble for quarter-rate clocking.)
Figure 5-63. Data, Clock, Valid Gated Levels for Half-rate Clocking:
Continuous Clock Terminated Link
(Figure content: free-running clock; Valid held low; Data[n] driven through D0..D7 and then Hi-Z;
16 UI postamble followed by a 1 UI to 8 UI precondition period.)
Sideband data is sent edge aligned with the positive edge of the strobe. The Receiver must sample
the incoming data with the strobe. The negative edge of the strobe is used to sample the data as the
data uses single data rate signaling as shown in Figure 5-64. Sideband transmission is described in
Section 4.1.5.
For Advanced Package modules, redundancy is supported for the sideband interface. Sideband
initialization and repair are described in Section 4.5.3.2. There is no redundancy and no Lane repair
support on Standard Package modules.
(Figure content: SB Message data edge-aligned with the SB Clock strobe.)
It is strongly recommended that the two sides of the sideband I/O Link share the same power supply
rail.
                                   Limits
Symbol   Description    Min          Rec   Max   Unit   Notes
         TX Swing       0.8*VCCAON   -     -     V
a. Always On power supply. The guidelines for maximum Voltage presented in Section 1.5 apply to sideband
   signaling.
b. 20 to 80% of VCCAON level with Advanced Package reference channel load.
c. 20 to 80% of VCCAON level with Standard Package reference channel load.
§§
6.1 Introduction
Three-dimensional heterogeneously integrated technologies present an opportunity for the
development of new electronic systems with advantages of higher bandwidth and lower power as
compared to 2D and 2.5D architectures. 3D will enable applications where the scale of data
movement is impractical for monolithic, 2D, or 2.5D approaches.
• All debug/testability hooks are located within a common block (across all UCIe-3D Links) that is
connected to the SoC Logic network inside the chiplet.
• Lane repair becomes a bundle-wide repair that is orchestrated by the SoC Logic.
(a) Chiplets connected across each SoC with UCIe-3D. A failure in UCIe-3D in either die results in the
remaining SoCs routing around the failure.
(b) Each chiplet has its own system controller logic, I/Os, etc. Each SoC Logic connects to one or more
UCIe-3D PHYs. The common test, debug, pattern, and infrastructure (TDPI) block orchestrates training,
testing, debug, etc. across chiplets.
Table 6-1 summarizes the key performance indicators of the proposed UCIe-3D.
Characteristics          Standard Package        Advanced Package   UCIe-3D              Comments
Width (each cluster)     16                      64                 80                   • Options of reduced width to 70, 60, ...
BW Die Edge (GB/s/mm)    28 to 224               165 to 1,317       N/A (vertical)
BW Density (GB/s/mm2)    22 to 125               188 to 1,350       4,000 at 9 um        • 4 TB/s/mm2 at 9 um
                                                                                         • Approximately 12 TB/s/mm2 at 5 um
                                                                                         • Approximately 35 TB/s/mm2 at 3 um
                                                                                         • Approximately 300 TB/s/mm2 at 1 um
Low-power Entry/Exit     0.5 ns at ≤ 16 GT/s;                       0 ns                 • No preamble or postamble.
                         0.5 ns to 1 ns at ≥ 24 GT/s
Latency (Tx + Rx)        < 2 ns (PHY + Adapter)                     0.125 ns at 4 GT/s   • 0.5 UI, half of flop to flop.
Reliability (FIT)        0 < FIT << 1                                                    • BER < 1E-27.
(Figure content: 80 Data Tx-to-Data Rx connections per channel, with matched clock-tree delays D1 and
D2 and a forwarded CLK from Clock Tx to Clock Rx.)
It is important to highlight that UCIe-3D uses a rise-to-fall timing approach, differing from the typical
on-die logic design that uses a rise-to-rise timing approach. The primary distinction between these
two scenarios is that on-die logic must factor in the delay caused by combinational logic, whereas
UCIe-3D features matched data and clock buffer delays, resulting in a near-zero differential. As
depicted in Figure 6-4, rise-to-fall timing yields the optimal timing margin for a zero delay
differential.
(Figure 6-4 content: a 1 UI Rx data eye with 0.5 UI clock delay and data delay; optimal sampling occurs
when Data delay = Clock delay.)
Specification Name                            Symbol   Min   Typ   Max                Note (UI = 250 ps at 4 GT/s)
Eye Closure due to Channel(a)                 Ch                   0.1 UI             25 ps
Pulse-width Deviation from 50% Clock Period   Jpw                  0.08 UI pk-to-pk   20 ps
a. Eye closure due to channel includes inter-symbol interference (ISI) and crosstalk.
b. Defined as clock to mean data, min/typ/max values are shown below.
c. Alpha factor is defined as follows for Tx and Rx, respectively:
       αTx = (dDtx / Dtx) / (dVcc / Vcc)        αRx = (dDrx / Drx) / (dVcc / Vcc)
d. This is equivalent to a variation of ±5% in Vcc. Careful mitigation is particularly needed when disturbances external to UCIe
occur, such as electromagnetic coupling from through-silicon vias (TSVs).
Parameters Dtx and Drx are Vcc-dependent functions. Equation 6-1 defines their typical values.
Equation 6-1.
    Dtx_typ = Drx_typ = Vcc / (0.0153 * Vcc^2 + 0.0188 * Vcc - 0.0084)
Equation 6-2 and Equation 6-3 define the minimum spec curve of Dtx and Drx, respectively.
Equation 6-2.
Equation 6-3.
Equation 6-4 and Equation 6-5 define the maximum spec curve of Dtx and Drx, respectively.
Equation 6-4.
Equation 6-5.
(Figure content: Dtx and Drx (ps) plotted versus Vcc (V) from 0.5 V to 0.85 V, showing the typical
curve and the spec limit curves over a range of roughly 0 to 120 ps.)
The equation for delay time, derived from the general theory of buffer chain, incorporates a term
proportional to Vcc and a quadratic Vcc dependence in the denominator. This equation is fitted to a
specific process and design. A typical design is expected to have the same trend, and remain within
the boundaries of the upper and lower curves. It is not required to align with the central curve.
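IMPLEMENTATION NOTE
As a worked example, the following sketch evaluates the typical curve of Equation 6-1 and numerically
estimates the alpha factor from footnote c. The delay units (ps) and Vcc range are assumed from the
figure above; the code is illustrative only.

    def d_typ(vcc):
        # Equation 6-1: typical Tx/Rx buffer delay (ps, assumed) vs. Vcc (V)
        return vcc / (0.0153 * vcc**2 + 0.0188 * vcc - 0.0084)

    def alpha(vcc, dv=1e-4):
        # Alpha factor: fractional delay change per fractional Vcc change
        dd = d_typ(vcc + dv) - d_typ(vcc - dv)
        return (dd / d_typ(vcc)) / (2 * dv / vcc)

    for v in (0.5, 0.6, 0.7, 0.8):
        print(f"Vcc = {v:.2f} V : Dtyp = {d_typ(v):6.1f} ps, alpha = {alpha(v):+.2f}")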
Equation 6-6 is essential in closing the timing budget, subsequently leading to the defined
specification limit.
Equation 6-6.
When there is a change in Vcc, as in the case of a dynamic voltage frequency scaling (DVFS) scenario,
the specification range for Dtx and Drx adjusts correspondingly. This offers a degree of design
flexibility because the delay does not need to conform to a fixed band across the entire Vcc range.
Given that the range from maximum to minimum remains constant, the timing margin remains
unaffected.
The feasibility of 0-V ESD should be explored for the special case of wafer-to-wafer hybrid bonding.
For more details, see the Industry Council on ESD Targets white papers.
For > 10 um to < 25 um bump pitches, higher ESD can be permitted. The exact target will be
published in a future revision of the specification.
Bump Pitch (um)   Minimum ESD (kV)
9                 0.05
3                 0.02
1                 0.01
21 R00 R01 R02 R03 R04 VSS R05 R06 R07 R08 R09
20 R10 R11 R12 R13 R14 VSS R15 R16 R17 R18 R19
19 VDD VDD VDD VDD VDD VDD VDD VDD VDD VDD VDD
18 R20 R21 R22 R23 R24 VSS R25 R26 R27 R28 R29
17 R30 R31 R32 R33 R34 VSS R35 R36 R37 R38 R39
16 R40 R41 R42 R43 R44 RXCK R45 R46 R47 R48 R49
15 VSS VSS VSS VSS VSS VSS VSS VSS VSS VSS VSS
14 R50 R51 R52 R53 R54 VSS R55 R56 R57 R58 R59
13 R60 R61 R62 R63 R64 VSS R65 R66 R67 R68 R69
12 R70 R71 R72 R73 R74 VSS R75 R76 R77 R78 R79
11 VDD VDD VDD VDD VDD VDD VDD VDD VDD VDD VDD
10 T70 T71 T72 T73 T74 VSS T75 T76 T77 T78 T79
9 T60 T61 T62 T63 T64 VSS T65 T66 T67 T68 T69
8 T50 T51 T52 T53 T54 VSS T55 T56 T57 T58 T59
7 VSS VSS VSS VSS VSS VSS VSS VSS VSS VSS VSS
6 T40 T41 T42 T43 T44 TXCK T45 T46 T47 T48 T49
5 T30 T31 T32 T33 T34 VSS T35 T36 T37 T38 T39
4 T20 T21 T22 T23 T24 VSS T25 T26 T27 T28 T29
3 VDD VDD VDD VDD VDD VDD VDD VDD VDD VDD VDD
2 T10 T11 T12 T13 T14 VSS T15 T16 T17 T18 T19
1 T00 T01 T02 T03 T04 VSS T05 T06 T07 T08 T09
1 2 3 4 5 6 7 8 9 10 11
The UCIe-3D standard does not prescribe a mandatory bump pitch; however, a 9-um pitch is
recommended at introduction. As the technology advances, additional specific recommended pitch
values will be established.
Although UCIe-3D does not inherently predefine an adapter, users have the flexibility to allocate some
data lanes within the module for adapter functions as required, such as Valid, Data Mask, Parity, and
ECC. UCIe-3D does not necessitate a sideband for initialization. If a low-bandwidth data link similar to
sideband is required, it is up to the implementation to determine how to assign a group of lanes for
the purpose. Bit replication or other forms of redundancy can be used to guarantee link reliability.
If modules are physically adjacent, extra VDDs can be added between them to provide physical
separation, shielding, and additional power delivery.
Along with x80, the bump map of the x70 Module is depicted in Figure 6-7. Bump maps of additional
Module widths may be incorporated in a future update to this specification if needed, using a similar
layout.
Given these considerations, a bundle repair strategy is proposed for UCIe-3D. This involves reserving
bundles within the SoC for repair purposes, which can be rerouted to serve as backup in the event of
a failure, as illustrated in Figure 6-8. The figure shows the cases of no repair, 1-bundle repair, and 4-
bundle repairs. For a densely packed 2D UCIe Module array, it is recommended to reserve two full
Modules (equivalent to four bundles) to repair a single failure. This assumes an alternating
arrangement of Tx and Rx bundles in at least one direction. Each Module is equipped with one Tx
bundle (comprising a x80 Tx + Clock) and one Rx bundle (comprising a x80 Rx + Clock).
For the general case of a large number of UCIe-3D Links, the following mathematical model can be used
to compute the repair requirements:
Parameters:
• D0 represents the defect density of the interconnect, expressed in terms of the number of failures
per unit area
• A signifies the total UCIe-3D area of the chip
• δ denotes the acceptable yield loss
The model suggests reserving 2k full Modules, where k is determined by the subsequent equation.
Equation 6-7.
    1 - Σ (i = 0 to k) Pi(A * D0) < δ
Equation 6-8.
    Pi(x) = (x^i / i!) * e^(-x)
The calculations in Equation 6-7 and Equation 6-8 assume that large interconnect defects that are
comparable to bundle size are relatively rare. More spare bundles may be needed if density of large
defects exceeds a limit such that Equation 6-9 does not hold.
Equation 6-9.
    1 - e^(-A * D1) < δ
where D1 is the density of defects with diameter greater than the bundle dimension. The exact
amount can be determined by simulation.
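IMPLEMENTATION NOTE
A minimal sketch of this sizing calculation, assuming the Poisson model of Equation 6-7 and
Equation 6-8; the input numbers are purely illustrative and not taken from this specification.

    import math

    def poisson_cdf(k, mean):
        # Sum of Pi(mean) for i = 0..k, with Pi as in Equation 6-8
        return sum(math.exp(-mean) * mean**i / math.factorial(i) for i in range(k + 1))

    def reserved_modules(area, d0, delta):
        """Smallest k satisfying Equation 6-7 (1 - CDF < delta); reserve 2k full Modules."""
        mean = area * d0
        k = 0
        while 1.0 - poisson_cdf(k, mean) >= delta:
            k += 1
        return 2 * k

    # Illustrative values: 20 mm^2 of UCIe-3D area, 0.01 defects/mm^2,
    # 0.1% acceptable yield loss.
    print(reserved_modules(area=20.0, d0=0.01, delta=1e-3))   # -> 6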
When UCIe-3D links are not densely packed, strategic placement of spacing between bundles can
effectively reduce the number of repair bundles required. For example, with sufficient spacing
between rows, the occurrence of a single defect eliminating four bundles can be prevented. However,
the precise determination of this spacing is highly dependent on the specific technology in use, and
thus, falls beyond the scope of this specification. The specification merely highlights this as a potential
option.
The initiation of repair is anticipated to originate from the SoC Logic, which is external to the UCIe-3D
PHY, and therefore is not elaborated on in this context. The implementation can be specific to the
system.
§§
The same protocol is also used for local die sideband accesses over FDI and RDI. When relevant, FDI
specific rules are pointed out using “FDI sideband:”. When relevant, RDI specific rules are pointed out
using “RDI sideband:”. When relevant, UCIe Link specific rules are pointed out using “UCIe Link
sideband:”. If no prefix is mentioned, it is a common rule across FDI, RDI and UCIe Link.
The Physical Layer is responsible for framing and transporting sideband packets over the UCIe Link.
Direct sideband access to remote die can originate from the Adapter or the Physical Layer. The
Adapter forwards a remote die sideband access over RDI to the Physical Layer for framing and
transport. These include register access requests, completions or messages.
The Protocol Layer has indirect access to remote die registers using the sideband mailbox mechanism.
The mailbox registers reside in the Adapter, and it is the responsibility of the Adapter to initiate
remote die register access requests when it receives the corresponding access trigger for the mailbox
register over FDI.
FDI sideband: In the case of multi-protocol stacks, the Adapter must track which protocol stack sent
the original request and route the completion back to the appropriate protocol stack.
FDI sideband: Because the Protocol Layer is only permitted indirect access to remote die registers,
and direct access to local die registers, currently only Register Access requests and completions are
permitted on the FDI sideband.
All sideband requests that expect a response have an 8-ms timeout. A "Stall" encoding is provided for
the relevant packets for Retimers, to prevent timeouts if the Retimer needs extra time to respond to
the request. When stalling to prevent timeouts, it is the responsibility of the Retimer to send the
corresponding Stall response once every 4 ms. The Retimer must also ensure that it does not Stall
indefinitely, and must escalate a Link down event after a reasonable attempt to complete the resolution
that required stalling the requester. If a requester receives a response with a "Stall" encoding, it
resets the timeout counter.
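IMPLEMENTATION NOTE
A sketch of the requester-side timeout rule; the class and names are illustrative, not normative.

    import time

    REQ_TIMEOUT = 0.008     # 8 ms sideband request timeout
    STALL_PERIOD = 0.004    # completer sends a Stall every 4 ms while busy

    class SidebandRequestTimer:
        def __init__(self):
            self.deadline = time.monotonic() + REQ_TIMEOUT

        def on_response(self, is_stall):
            if is_stall:
                # A Stall response resets the timeout counter at the requester
                self.deadline = time.monotonic() + REQ_TIMEOUT

        def timed_out(self):
            return time.monotonic() > self.deadline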
In certain cases, it is necessary for registers to be fragmented between the different layers; i.e.,
certain bits of a given register physically reside in the Protocol Layer, other bits reside in the
Adapter, and other bits reside in the Physical Layer. UCIe uses hierarchical decoding for these
registers. For fragmented registers, if a bit does not physically reside in a given Layer, that Layer
implements the bit as Read Only and tied to 0. Hence, reads return 0 for those bits from that Layer,
and writes have no effect on those bits. As an example, for reads, the Protocol Layer forwards the
request to the Adapter on FDI and ORs the data returned by the Adapter with its local register value
before responding to software. The Adapter must do the same for any bits of that register that reside
in the Physical Layer before responding to the Protocol Layer.
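IMPLEMENTATION NOTE
A minimal sketch of the hierarchical OR-based read, using hypothetical fragment values and bit
positions chosen purely for illustration.

    # Each layer implements only its own bits of the fragmented register;
    # unimplemented bits are Read Only and tied to 0.
    PHY_FRAGMENT      = 0x0000_0003   # e.g., bits [1:0] live in the Physical Layer
    ADAPTER_FRAGMENT  = 0x0000_0F00   # e.g., bits [11:8] live in the Adapter
    PROTOCOL_FRAGMENT = 0x00A0_0000   # e.g., bits [23:20] live in the Protocol Layer

    def adapter_read():
        # The Adapter ORs in the Physical Layer fragment before responding
        return ADAPTER_FRAGMENT | PHY_FRAGMENT

    def protocol_read():
        # The Protocol Layer ORs the Adapter's response with its local fragment
        return PROTOCOL_FRAGMENT | adapter_read()

    assert protocol_read() == 0x00A00F03   # software sees the assembled value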
Every packet carries a 5-bit opcode, a 3-bit source identifier (srcid), and a 3-bit destination identifier
(dstid). The 5-bit opcode indicates the packet type, as well as whether the packet carries no data, 32b
of data or 64b of data.
Table 7-2, Table 7-3, and Table 7-4 give the encodings of source and destination identifiers. It is
not permitted for the Protocol Layer on one side of the Link to directly access the Protocol Layer of
the remote Link partner over sideband (this should be done via mainband).
Field(a) Description
a. srcid and dstid are Reserved for completion messages transferred over FDI. The Protocol Layer must correlate
the completions to original requests using the Tag field. Currently, no requests are permitted from Adapter to
Protocol Layer over FDI sideband.
Field(a) Description
a. srcid and dstid are Reserved for completion messages transferred over RDI for local Register Access
completions. For Register Access completions, the Adapter must correlate the completions to original requests
using the Tag field regardless of dstid field. Both local and remote Register Access requests are mastered by
the Adapter with unique Tag encodings.
Table 7-4. UCIe Link sideband: srcid and dstid encodings for UCIe Link
Field Description
Note that the sideband packet format figures provided in this chapter show the packet format over
multiple 32-bit Phases. This is for representation purposes only. For transport over the UCIe sideband
bumps (serial interface), the transfer occurs as a 64-bit serial packet at a time. For headers, the
transmission order is bit 0 of Phase 0 as bit 0 of the serial packet (D0 in Figure 4-8), bit 1 of Phase 0
as bit 1 of the serial packet, etc., followed by bit 0 of Phase 1 as bit 32 of the serial packet, bit 1 of
Phase 1 as bit 33 of the serial packet, etc., until bit 31 of Phase 1 as bit 63 of the serial packet.
Data (if present) is sent as a subsequent serial packet, with bit 0 of Phase 2 as bit 0 of the serial
packet (D0 in Figure 4-8), bit 1 of Phase 2 as bit 1 of the serial packet, etc., followed by bit 0 of Phase
3 as bit 32 of the serial packet, bit 1 of Phase 3 as bit 33 of the serial packet, etc., until bit 31 of Phase
3 as bit 63 of the serial packet.
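IMPLEMENTATION NOTE
The bit ordering described above can be expressed compactly. The helper below is a sketch that packs
Phase 0 into serial bits 0 through 31 and Phase 1 into bits 32 through 63; names are illustrative.

    def header_to_serial(phase0, phase1):
        """Pack two 32-bit header Phases into one 64-bit serial packet:
        Phase 0 bit 0 is serial bit 0 (D0, sent first), ...,
        Phase 1 bit 31 is serial bit 63 (sent last)."""
        return (phase0 & 0xFFFFFFFF) | ((phase1 & 0xFFFFFFFF) << 32)

    def serial_bit(packet, n):
        # Value of serial bit n of the 64-bit packet
        return (packet >> n) & 1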
Field Description
CP          Control Parity (CP) is the even parity of all the header bits excluding DP.
DP          Data Parity is the even parity of all bits in the data payload. If there is no data
            payload, this bit is set to 0b.
Cr          If 1b, indicates one credit return for credited sideband messages. This field is only used
            by the Adapter for remote Link partner's credit returns for E2E credits. It is not used
            for local FDI or RDI credit loops.
Addr[23:0]  Address of the request. Different opcodes use this field differently. See Table 7-6 for
            details. The following rules apply for the address field:
            For 64-bit request, Addr[2:0] is reserved.
            For 32-bit request, Addr[1:0] is reserved.
Tag[4:0]    Tag is a 5-bit field generated by the requester, and it must be unique for all outstanding
            requests that require a completion. The original requester uses the Tag to associate
            returning completions with the original request.
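IMPLEMENTATION NOTE
The CP and DP definitions above amount to even parity over the protected bits. The sketch below
illustrates this; the DP bit position used for masking is illustrative, and the actual layout is given
in the packet format figures.

    def even_parity64(value):
        # Returns 1 if the number of 1s in the 64-bit value is odd, so that
        # appending this parity bit makes the overall count of 1s even.
        return bin(value & (2**64 - 1)).count("1") & 1

    def compute_dp(data):
        # DP covers all bits of the data payload; 0b if there is no payload
        return even_parity64(data) if data is not None else 0

    def compute_cp(header, dp_bit=63):
        # CP covers all header bits excluding DP (dp_bit position is illustrative)
        return even_parity64(header & ~(1 << dp_bit))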
Opcode                      Addr[23:0] Usage
Memory Reads/Writes         {RL[3:0], Offset[19:0]}
                            Offset is the Byte Offset.
                            RL[3:0] encodings are as follows:
                            0h: Register Locator 0
                            1h: Register Locator 1
                            2h: Register Locator 2
                            3h: Register Locator 3
                            Fh: Accesses for Protocol-specific MMIO registers that are shadowed in the
                            Adapter (e.g., ARB/MUX registers defined in the CXL Specification). The
                            offsets for these registers are implementation specific, and the Protocol
                            Layer must translate accesses to match the offsets implemented in the
                            Adapter.
                            Other encodings are reserved.
                            For accesses to Reserved RL encodings, the completer must respond with a UR.
DMS Register Reads/Writes   These allow for accessing the DMS registers implemented in UCIe Spoke Type
                            0, 1, or 2.
                            Addr[21:0] provides the register offset in DMS register space, relative to
                            the start of the Spoke's register space, that corresponds to the DevID. A
                            maximum of 4 MB of address space is possible for UCIe D2D/PHY Spokes. These
                            opcodes are always targeted at the local D2D or PHY registers (i.e., these
                            opcodes never target the remote Link partner).
                            Addr[23:22] encodings are as follows:
                            00b: Spoke registers.
                            01b: Reserved.
                            10b: Reserved.
                            11b: Used for other chiplet UMAP registers that are shadowed in the D2D or
                            PHY, if any. The definitions of these registers and offsets are
                            implementation-specific.
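IMPLEMENTATION NOTE
A decoding sketch for the two address layouts above; the helper names are hypothetical.

    def decode_memory_addr(addr):
        # Memory Reads/Writes: Addr[23:0] = {RL[3:0], Offset[19:0]}
        rl, offset = (addr >> 20) & 0xF, addr & 0xFFFFF
        if rl <= 3:
            return f"Register Locator {rl}", offset
        if rl == 0xF:
            return "Adapter-shadowed protocol MMIO", offset
        return "Reserved (completer must respond with UR)", offset

    def decode_dms_addr(addr):
        # DMS Register Reads/Writes: Addr[23:22] selects the space,
        # Addr[21:0] is the offset within the Spoke's DMS register space
        spaces = {0b00: "Spoke registers", 0b11: "Shadowed chiplet UMAP registers"}
        return spaces.get((addr >> 22) & 0x3, "Reserved"), addr & 0x3FFFFF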
Field       Description
Tag[4:0]    Completion Tag associated with the corresponding Request. The requester uses this to
            associate the completion with the original request.
CP          Control Parity. All fields other than "DP" and "CP" in the Header are protected by Control
            Parity, and the parity scheme is even (including reserved bits).
DP          Data Parity. All fields in data are protected by data parity, and the parity scheme is
            even.
Cr          If 1b, indicates one credit return for credited sideband messages. This field is only used
            by the Adapter for remote Link partner's credit returns for E2E credits. It is not used
            for local FDI or RDI credit loops.
EP          Data Poison. If poison forwarding is enabled, the completer can poison the data on
            internal errors. Setting the EP bit is optional; the conditions for setting it to 1 are
            implementation-specific. Typical usages involve giving additional FIT protection against
            data integrity errors on internal data buffers. A Receiver must not modify the contents of
            the target location for requests with data payload that have the EP bit set. It must
            return UR for the completion status of requests with an EP bit set.
BE[7:0]     Byte Enables for the Request. The completer returns the same value that the original
            request had (this avoids the requester having to save off the BE value). BE[7:4] are
            reserved if the opcode is for a 32-bit request.
Status[2:0]  Completion Status
             000b - Successful Completion (SC). This can be a completion with or without data,
             depending on the original request (it must set the appropriate Opcode). If the original
             request was a write, it is a completion without data. If the original request was a read,
             it is a completion with data.
             001b - Unsupported Request (UR). On UCIe, this is a completion with 64b Data when a
             request is aborted by the completer, and the Data carries the original request header
             that resulted in UR. This enables easier header logging at the requester. Register Access
             requests that timeout must also return UR status, but for those the completion is without
             Data.
             100b - Completer Abort (CA). On UCIe, this is a completion with 64b Data, and the Data
             carries the original request header that resulted in CA. This enables easier header
             logging at the requester.
             111b - Stall. Receiving a completion with Stall encoding must reset the timeout at the
             requester. The Completer must send a Stall once every 4 ms if it is not ready to respond
             to the original request.
             Other encodings are reserved.
             An error is logged in the Sideband Mailbox Status if a CA was received or if the number
             of timeouts exceeds the programmed threshold. For timeouts below the programmed
             threshold, a UR is returned to the requester.
The definitions of opcode, srcid, dstid, dp, and cp fields are the same as Register Access packets.
Table 7-8 and Table 7-9 give the encodings of the different messages without data that are sent on
UCIe. Some notes on the different message categories are listed below:
• {NOP.Crd} — These are used for E2E Credit returns. The destination must be D2D Adapter.
• {LinkMgmt.RDI.*} — These are used to coordinate RDI state transitions, the source and
destination is Physical Layer.
• {LinkMgmt.Adapter0.*} — These are used to coordinate Adapter LSM state transitions for the
Adapter LSM corresponding to Stack 0 Protocol Layer. The source and destination is D2D Adapter.
• {LinkMgmt.Adapter1.*} — These are used to coordinate Adapter LSM state transitions for the
Adapter LSM corresponding to Stack 1 Protocol Layer. The source and destination is D2D Adapter.
• {ParityFeature.*} — This is used to coordinate enabling of the Parity insertion feature. The source
and destination for this must be the D2D Adapter.
• {ErrMsg} — This is used for error reporting and escalation from the remote Link Partner. This is
sent from the Retimer or Device die to the Host, and the destination must be the D2D Adapter.
Name                               MsgCode   MsgSubcode   MsgInfo                     Description
{NOP.Crd}                          00h       00h          0000h: Reserved             Explicit Credit return from Remote
                                                          0001h: 1 Credit return      Link partner for credited messages.
                                                          0002h: 2 Credit returns
                                                          0003h: 3 Credit returns
                                                          0004h: 4 Credit returns
{LinkMgmt.RDI.Req.Active}          01h       01h          Reserved                    Active Request for RDI SM.
{LinkMgmt.RDI.Req.L1}              01h       04h          Reserved                    L1 Request for RDI SM.
{LinkMgmt.RDI.Req.L2}              01h       08h          Reserved                    L2 Request for RDI SM.
{LinkMgmt.RDI.Req.LinkReset}       01h       09h          Reserved                    LinkReset Request for RDI SM.
{LinkMgmt.RDI.Req.LinkError}       01h       0Ah          Reserved                    LinkError Request for RDI SM.
{LinkMgmt.RDI.Req.Retrain}         01h       0Bh          Reserved                    Retrain Request for RDI SM.
{LinkMgmt.RDI.Req.Disable}         01h       0Ch          Reserved                    Disable Request for RDI SM.
{LinkMgmt.RDI.Rsp.Active}          02h       01h          0000h: Regular Response     Active Response for RDI SM.
{LinkMgmt.RDI.Rsp.PMNAK}           02h       02h          FFFFh: Stall Response       PMNAK Response for RDI SM.
{LinkMgmt.RDI.Rsp.L1}              02h       04h          (for all 02h Responses)     L1 Response for RDI SM.
{LinkMgmt.RDI.Rsp.L2}              02h       08h                                      L2 Response for RDI SM.
{LinkMgmt.RDI.Rsp.LinkReset}       02h       09h                                      LinkReset Response for RDI SM.
{LinkMgmt.RDI.Rsp.LinkError}       02h       0Ah                                      LinkError Response for RDI SM.
{LinkMgmt.RDI.Rsp.Retrain}         02h       0Bh                                      Retrain Response for RDI SM.
{LinkMgmt.RDI.Rsp.Disable}         02h       0Ch                                      Disable Response for RDI SM.
{LinkMgmt.Adapter0.Req.L1}         03h       04h          Reserved                    L1 Request for Stack 0 Adapter LSM.
{LinkMgmt.Adapter0.Req.L2}         03h       08h          Reserved                    L2 Request for Stack 0 Adapter LSM.
{LinkMgmt.Adapter0.Req.LinkReset}  03h       09h          Reserved                    LinkReset Request for Stack 0 Adapter LSM.
{LinkMgmt.Adapter0.Rsp.L1}         04h       04h          0000h: Regular Response     L1 Response for Stack 0 Adapter LSM.
{LinkMgmt.Adapter0.Rsp.L2}         04h       08h          FFFFh: Stall Response       L2 Response for Stack 0 Adapter LSM.
{LinkMgmt.Adapter1.Req.L1}         05h       04h          Reserved                    L1 Request for Stack 1 Adapter LSM.
{LinkMgmt.Adapter1.Req.L2}         05h       08h          Reserved                    L2 Request for Stack 1 Adapter LSM.
{LinkMgmt.Adapter1.Req.LinkReset}  05h       09h          Reserved                    LinkReset Request for Stack 1 Adapter LSM.
{LinkMgmt.Adapter1.Rsp.L1}         06h       04h          0000h: Regular Response     L1 Response for Stack 1 Adapter LSM.
{LinkMgmt.Adapter1.Rsp.L2}         06h       08h          FFFFh: Stall Response       L2 Response for Stack 1 Adapter LSM.
{SBINIT out of Reset}                  MsgCode 91h   MsgSubcode 00h
    MsgInfo: [15:4]: Reserved; [3:0]: Result(a)
{MBINIT.REPAIRCLK result resp}         MsgCode AAh   MsgSubcode 04h
    MsgInfo: [15:4]: Reserved; [3]: Compare Results from RRDCK_L; [2]: Compare Results from RTRK_L;
    [1]: Compare Results from RCKN_L; [0]: Compare Results from RCKP_L
{MBINIT.REPAIRCLK apply repair req}    MsgCode A5h   MsgSubcode 05h
    MsgInfo: [15:4]: Reserved; [3:0]: Repair Encoding - 0h: Repair RCLKP_L; 1h: Repair RCLKN_L;
    2h: Repair RTRK_L; 7h: Reserved; Fh: Reserved
{MBINIT.REPAIRCLK check results resp}  MsgCode AAh   MsgSubcode 07h
    MsgInfo: [15:4]: Reserved; [3]: Compare Results from RRDCK_L; [2]: Compare Results from RTRK_L;
    [1]: Compare Results from RCKN_L; [0]: Compare Results from RCKP_L
{MBINIT.REPAIRCLK done req}            MsgCode A5h   MsgSubcode 08h
    MsgInfo: 0000h
{MBINIT.REPAIRVAL result resp}         MsgCode AAh   MsgSubcode 0Ah
    MsgInfo: [15:2]: Reserved; [1]: Compare Results from RRDVLD_L; [0]: Compare Results from RVLD_L
{MBINIT.REPAIRVAL apply repair req}    MsgCode A5h   MsgSubcode 0Bh
    MsgInfo: [15:2]: Reserved; [1:0]: Repair Encoding - 0h: Repair RVLD_L; 1h: Reserved; 3h: Reserved
{MBINIT.REPAIRMB apply degrade req}    MsgCode A5h   MsgSubcode 14h
    MsgInfo: [15:3]: Reserved; [2:0]: Standard Package logical Lane map
{MBTRAIN.REPAIR Apply degrade req}     MsgCode B5h   MsgSubcode 1Eh
    MsgInfo: [15:3]: Reserved; [2:0]: Standard Package logical Lane map(b)
{PHYRETRAIN.retrain start req}         MsgCode C5h   MsgSubcode 01h
    MsgInfo: [15:3]: Reserved; [2:0]: Retrain Encoding
{PHYRETRAIN.retrain start resp}        MsgCode CAh   MsgSubcode 01h
    MsgInfo: [15:3]: Reserved; [2:0]: Retrain Encoding
Name   MsgCode   MsgSubcode   MsgInfo   Data Bit Encodings   Description
{AdvCap.CXL}                MsgCode 01h   MsgSubcode 01h
    MsgInfo: 0000h: Post negotiation, if Enhanced_Multi_Protocol_Enable is 0b, or it is 1b and the
             message is for Stack 0.
             0001h: Post negotiation, if Enhanced_Multi_Protocol_Enable is 1b and the message is for
             Stack 1.
             FFFFh: Stall Message
    Description: Advertised Capabilities for CXL protocol.
{FinCap.CXL}                MsgCode 02h   MsgSubcode 01h
    MsgInfo: Same encodings as {AdvCap.CXL}.
    Description: Finalized Capabilities for CXL protocol.
Data Bit Encodings for {AdvCap.CXL} and {FinCap.CXL}:
    [23:0]: Flexbus Mode negotiation usage bits as defined for Symbols 12-14 of Modified TS1/TS2
    Ordered Set in the CXL Specification, with the following additional rules:
    • [0]: PCIe capable/enable - this must be 1b for PCIe Non-Flit Mode.
    • [1]: CXL.io capable/enable - this must be 0b for PCIe Non-Flit Mode.
    • [2]: CXL.mem capable/enable - this must be 0b for PCIe Non-Flit Mode.
    • [3]: CXL.cache capable/enable - this must be 0b for PCIe Non-Flit Mode.
    • [4]: CXL 68B Flit and VH capable; must be set for ports that support CXL protocols, as specified
      in the Protocol Layer interoperability requirements.
    • [8]: Multi-Logical Device - must be set to 0b for PCIe Non-Flit Mode.
    • [9]: Reserved.
    • [12:10]: these bits do not apply for UCIe, must be 0b.
    • [14]: Retimer 2 - does not apply for UCIe, must be 0b.
    • [15]: CXL.io Throttle - must be 0b for PCIe Non-Flit Mode.
    • [17:16]: NOP Hint Info - does not apply for UCIe, and must be 0.
{MultiProtFinCap.Adapter}   MsgCode 02h   MsgSubcode 02h
    MsgInfo: 0000h: Reserved
             FFFFh: Stall Message
    Data Bit Encodings: [0]: "68B Flit Mode"; [1]: "CXL 256B Flit Mode"; [2]: "PCIe Flit Mode";
    [3]: Reserved; [4]: "Management Transport Protocol"; [63:5]: Reserved
    Description: Finalized Capability for Protocol negotiation when Enhanced Multi_Protocol_Enable is
    negotiated and Stack 1 is PCIe or CXL.
{Vendor Defined Message}    MsgCode FFh   MsgSubcode --   MsgInfo: Vendor ID
    Description: Vendor Defined Messages. These can be exchanged at any time after sideband is
    functional post SBINIT. Interoperability is vendor defined. Unsupported vendor defined messages
    must be discarded by the receiver. Note that the Vendor ID here is NOT the UCIe Vendor ID, but
    rather the unique identifier of the chiplet vendor that is defining and using these messages.
Message   MsgCode[7:0]   MsgSubcode[7:0]   MsgInfo[15:0]   Data Field[63:0]
{Start Tx Init D to C point test req}   MsgCode 85h   MsgSubcode 01h
    MsgInfo[15:0]: Maximum comparison error threshold
    Data Field[63:0]:
    [63:60]: Reserved
    [59]: Comparison Mode (0: Per Lane; 1: Aggregate)
    [58:43]: Iteration Count Settings
    [42:27]: Idle Count settings
    [26:11]: Burst Count settings
    [10]: Pattern Mode (0: Continuous Mode; 1: Burst Mode)
    [9:6]: Clock Phase control at Tx Device (0h: Clock PI Center; 1h: Left Edge; 2h: Right Edge)
    [5:3]: Valid Pattern (0h: Functional pattern)
    [2:0]: Data pattern (0h: LFSR; 1h: Per Lane ID)
{Tx Init D to C results resp}           MsgCode 8Ah   MsgSubcode 03h
    MsgInfo[15:0]:
    [15:6]: Reserved
    [5]: Valid Lane comparison results
    [4]: Cumulative Results of all Lanes (0: Fail (Errors > Max Error Threshold); 1: Pass
         (Errors <= Max Error Threshold))
    [3:0]: UCIe-A: Compare results from Redundant Lanes (0h: Fail (Errors > Max Error Threshold);
           1h: Pass (Errors <= Max Error Threshold)) {RRD_L[3], RRD_L[2], RRD_L[1], RRD_L[0]}
           UCIe-S: Reserved
    Data Field[63:0]: Compare Results of individual Data Lanes (0h: Fail (Errors > Max Error
    Threshold); 1h: Pass (Errors <= Max Error Threshold))
    UCIe-A:     {RD_L[63], RD_L[62], ..., RD_L[1], RD_L[0]}
    UCIe-S:     {48'h0, RD_L[15], RD_L[14], ..., RD_L[1], RD_L[0]}
    UCIe-A x32: {32'h0, RD_L[31], RD_L[30], ..., RD_L[0]}
    UCIe-S x8:  {56'h0, RD_L[7], RD_L[6], ..., RD_L[1], RD_L[0]}
{Start Tx Init D to C eye sweep req}    MsgCode 85h   MsgSubcode 05h
    MsgInfo[15:0]: Maximum comparison error threshold
    Data Field[63:0]: Same encodings as {Start Tx Init D to C point test req}.
{Start Rx Init D to C point test req}   MsgCode 85h   MsgSubcode 07h
    MsgInfo[15:0]: Maximum comparison error threshold
    Data Field[63:0]:
    [63:60]: Reserved
    [59]: Comparison Mode (0: Per Lane; 1: Aggregate)
    [58:43]: Iteration Count Settings
    [42:27]: Idle Count settings
    [26:11]: Burst Count settings
    [10]: Pattern Mode (0: Continuous Mode; 1: Burst Mode)
    [9:6]: Clock Phase control at Transmitter (0h: Clock PI Center; 1h: Left Edge; 2h: Right Edge)
    [5:3]: Valid Pattern (0h: Functional pattern)
    [2:0]: Data pattern (0h: LFSR; 1h: Per Lane ID)
{Start Rx Init D to C eye sweep req}    MsgCode 85h   MsgSubcode 0Ah
    MsgInfo[15:0]: Maximum comparison error threshold
    Data Field[63:0]: Same encodings as {Start Rx Init D to C point test req}.
{Rx Init D to C results resp}           MsgCode 8Ah   MsgSubcode 0Bh
    MsgInfo[15:0] and Data Field[63:0]: Same encodings as {Tx Init D to C results resp}.
{MBINIT.PARAM configuration req}   MsgCode A5h   MsgSubcode 00h   MsgInfo: 0000h
    Data Field[63:0]:
    [63:15]: Reserved
    [14]: Sideband feature extensions is supported (1) or not supported (0)
    [13]: UCIe-A x32
    [12:11]: Module ID - 0h: 0; 1h: 1; 2h: 2; 3h: 3
    [10]: Clock Phase - 0b: Differential clock; 1b: Quadrature phase
    [9]: Clock Mode - 0b: Strobe mode; 1b: Continuous mode
    [8:4]: Voltage Swing - The encodings are the same as the "Supported Tx Vswing encodings" field of
           the PHY Capability register
    [3:0]: Max IO Link Speed - The encodings are the same as the "Max Link Speeds" field of the UCIe
           Link Capability register
{MBINIT.PARAM configuration resp}  MsgCode AAh   MsgSubcode 00h   MsgInfo: 0000h
    Data Field[63:0]:
    [63:11]: Reserved
    [10]: Clock Phase - 0b: Differential clock; 1b: Quadrature phase
    [9]: Clock Mode - 0b: Strobe mode; 1b: Continuous mode
    [8:4]: Reserved
    [3:0]: Max IO Link Speed - The encodings are the same as the "Max Link Speeds" field of the UCIe
           Link Capability register
{MBINIT.PARAM SBFE req}    MsgCode A5h   MsgSubcode 01h   MsgInfo: 0000h: Regular Message
    Data Field[63:0]:
    [63:3]: Reserved
    [2]: Sideband-only (SO) port (1), full UCIe port (0)
    [1]: Sideband Performant Mode Operation (PMO) is supported (1) or not supported (0)
    [0]: Management Transport protocol is supported (1) or not supported (0)
{MBINIT.PARAM SBFE resp}   MsgCode AAh   MsgSubcode 01h   MsgInfo: 0000h: Regular Message;
                                                                   FFFFh: Stall Message
    Data Field[63:0]:
    [63:3]: Reserved
    [2]: Sideband-only (SO) port (1), full UCIe port (0)
    [1]: Sideband Performant Mode Operation (PMO) is negotiated (1) or not supported (0)
    [0]: Management Transport protocol is supported (1) or not supported (0)
Bits [21:14] in the first DW of the MPM Header of an MPM with Data message form an 8-bit msgcode that
denotes a specific MPM with Data message. Table 7-12 summarizes the supported MPM with Data
messages over sideband.
Support for these messages is optional and negotiated as described in Section 8.2.3.1.
msgcode Message
Others Reserved
7.1.2.4.1 Common Fields in MPM Header of MPM with Data Messages on Sideband
Figure 7-5 shows and Table 7-13 describes the common fields in the MPM header of MPM with data
messages on the sideband.
Figure 7-5. Common Fields in MPM Header of all MPM with Data Messages on Sideband
(Figure content: the first DW carries opcode = 11000b, length, msgcode, vc, resp, rsvd, and
srcid = 011b; the second DW carries rxqid, rsvd, msgcode-specific fields, dstid = 111b, rsvd, and cp.)
Table 7-13. Common Fields in MPM Header of all MPM with Data Messages
on Sideband
Field Description
length    MPM Payload length (i.e., 0h for 1 QWORD, 1h for 2 QWORDs, 2h for 3 QWORDs, etc.).
resp      0: Request MPM.
          1: Response MPM.
          For a Vendor-defined Management Port Gateway Message with Data, this bit is always 0
          (see Section 7.1.2.4.3).
rxqid     RxQ-ID to which this packet is destined, and RxQ-ID associated with any credits returned in
          the packet (see Section 8.2.3.1.2 for RxQ details).
Encapsulated MTP on sideband is an MPM with Data message with a msgcode of 01h.
(Figure 7-6 content: an MPM Header (srcid = 011b, resp, vc, msgcode = 01h, length, opcode = 11000b in
the first DW; cp, dstid = 111b, cr_ret_resp, cr_ret_vc, cr_ret (in QWORDs), s, and rxqid in the second
DW) followed by the MPM Payload carrying the Management Transport Packet (MTP); the payload length is
given in the MPM Header, with 1 DWORD of all-0s padding appended if required.)
Field          Description
s              Segmented MTP (see Section 8.2.4.2). The first and middle segments in a segmented MTP
               have this bit set to 1. The last segment in a segmented MTP will have this bit cleared
               to 0. An unsegmented MTP also has this bit cleared to 0.
cr_ret         Value of RxQ credits being returned to the MPG receiving this message, indicated by the
               rxqid value and its VC:Resp channel indicated via cr_ret_vc/cr_ret_resp fields.
               000h indicates 0 credits returned.
               001h indicates 1 credit returned.
               ...
               3FEh indicates 1022 credits returned.
               3FFh is reserved.
               If there is no credit being returned, cr_ret fields must be set to 0h.
cr_ret_resp    Resp value associated with the credit returned. 0 = Request channel credit;
               1 = Response channel credit.
MPM Payload    See Section 8.2 for details. Note that DWORDx:Bytey in Figure 7-6 refers to the
               corresponding DWORD, Byte defined in the Management Transport Packet in Figure 8-5.
(The s, cr_ret, and cr_ret_resp fields reside in the MPM Header.(a))
a. See Section 7.1.2.4.1 for details of header fields common to all MPMs with data on the sideband.
The Vendor-defined Management Port Gateway message with data is defined for custom
communication between MPGs on the two ends of a UCIe sideband link. These messages are not part
of the Management transport protocol, and these messages start at an MPG and terminate at the MPG
on the other end of the UCIe sideband link. These messages share the same RxQ-ID request buffers
and credits as encapsulated MTP messages. If an MPG does not support these messages or does not
support vendor-defined messages from a given vendor (identified by the UCIe Vendor ID in the
header), the MPG silently drops those messages. Length of these Vendor defined messages is subject
to the same rules stated in Section 8.2.5.1.2. Ordering of these messages sent over multiple
sideband links is subject to the same rules presented in Section 8.2.4.3 for encapsulated MTPs.
Figure 7-7. Vendor-defined Management Port Gateway Message with Data on Sideband
(Figure content: an MPM Header (srcid = 011b, resp = 0, vc, msgcode = FFh, length, opcode = 11000b in
the first DW; cp, dstid = 111b, UCIe Vendor ID, rsvd, and rxqid in the second DW) followed by the
Vendor-defined payload; the payload length is given in the MPM Header.)

Field            Description
UCIe Vendor ID   UCIe Consortium-assigned unique ID for each vendor. This field resides in the MPM
                 Header.(a)
a. See Section 7.1.2.4.1 for details of header fields common to all MPMs with data on the sideband.
msgcode Message
04h PM Message
Others Reserved
Figure 7-8 shows and Table 7-17 describes the common fields in the MPM header of MPM without data
messages on the sideband.
Figure 7-8. Common Fields in MPM Header of all MPM without Data Messages on Sideband
(Figure content: the first DW carries opcode = 10111b, msgcode-specific bits, msgcode, rsvd, and
srcid = 011b; the second DW carries cp, rsvd, dstid = 111b, msgcode-specific fields, and rsvd.)
Table 7-17. Common Fields in MPM Header of all MPM without Data Messages on Sideband
Field Description
See Section 8.2.3.1.2 for usage of this message during sideband management transport path
initialization.
Figure 7-9 shows and Table 7-18 describes the Management Port Gateway Capabilities message
format on the sideband.
(Figure 7-9 content: MPM Header bit layout for the Management Port Gateway Capabilities message; see
Table 7-18 for the field definitions.)
Table 7-18. Management Port Gateway Capabilities MPM Header Fields on Sideband(a)
Field Description
NumVC     Number of VCs supported by the Management Port Gateway that is transmitting the message.
Port ID   Port ID number value of the Management port associated with the Management Port Gateway that
          is issuing the message (see Section 8.1.3.6.2.1).
a. See Table 7-17 for details of header fields common to all MPMs without data on the sideband.
See Section 8.2.3.1.2 for usage of this message during sideband management transport path
initialization.
Figure 7-10 shows and Table 7-19 describes the Credit Return message format on the sideband.
If credit returns a and b carry the same vc:resp fields, then the total credit returned for that
rxqid:vc:resp credit type is the sum of cr_ret_a and cr_ret_b.
(Figure 7-10 content: MPM Header with srcid = 011b, msgcode = 02h, cr_ret_resp_a, cr_ret_vc_a,
cr_ret_resp_b, cr_ret_vc_b, and opcode = 10111b in the first DW; cp, dstid = 111b, cr_ret_a, cr_ret_b,
rsvd, and rxqid in the second DW.)
Field Description
cr_ret_a(b)   Value of credits returned for the RxQ (in the Management Port Gateway transmitting this
              message) indicated by the rxqid field and the associated VC:Resp channel indicated via
              cr_ret_vc_a(b)/cr_ret_resp_a(b) fields.
              000h indicates 0 credits returned.
              001h indicates 1 credit returned.
              ...
              3FEh indicates 1022 credits returned.
              3FFh indicates infinite credits.
              The 3FFh value is legal only on credit returns that happen during VC initialization
              (i.e., before the Init Done message is sent) and cannot be used after initialization
              until the transport path is renegotiated/initialized again. If a receiver detects
              infinite credit returns after VC initialization and during runtime, it silently ignores
              them.
rxqid         RxQ-ID of the receiver queue for which the credits are being returned (see
              Section 8.2.3.1.2 for RxQ details).
a. See Table 7-17 for details of header fields common to all MPMs without data on the sideband.
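IMPLEMENTATION NOTE
A sketch of how a receiver might accumulate the two credit-return halves, including the same-channel
summing rule stated earlier; infinite-credit handling is simplified and the names are illustrative.

    def apply_credit_return(counters, rxqid, returns):
        """counters: dict keyed by (rxqid, vc, resp); returns: list of
        (cr_ret, vc, resp) tuples decoded from the a and b fields."""
        for cr, vc, resp in returns:
            if cr == 0x3FF:
                counters[(rxqid, vc, resp)] = float("inf")   # legal only during init
            else:
                counters[(rxqid, vc, resp)] = counters.get((rxqid, vc, resp), 0) + cr
        return counters

    # Both halves targeting the same rxqid:vc:resp simply add up:
    print(apply_credit_return({}, 0, [(5, 1, 0), (3, 1, 0)]))   # {(0, 1, 0): 8}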
See Section 8.2.5.1.4 for usage of this message during sideband management transport path
initialization.
Figure 7-11 shows and Table 7-20 describes the Init Done message format on the sideband.
(Figure 7-11 content: MPM Header bit layout for the Init Done message; see Table 7-20 for the field
definitions.)
Field Description
rxqid     RxQ-ID of the receiver queue that has completed initializing credits (see Section 8.2.3.1.2
          for RxQ details).
a. See Table 7-17 for details of header fields common to all MPMs without data on the sideband.
7.1.2.5.5 PM Message
See Section 8.2.5.1.4 for usage of this message during sideband management transport PM flows.
Figure 7-12 shows and Table 7-21 describes the PM message format on the sideband.
(Figure 7-12 content: MPM Header bit layout for the PM message; see Table 7-21 for the field
definitions.)
Field Description
rxqid     RxQ-ID of the receiver queue to which the message applies (see Section 8.2.3.1.2 for RxQ
          details).
a. See Table 7-17 for details of header fields common to all MPMs without data on the sideband.
The Vendor-defined Management Port Gateway message without data is defined for custom
communication between the MPGs on both ends of a UCIe sideband link. These messages are not part
of the management transport protocol, and these messages start at an MPG and terminate at the
MPG on the other end of the UCIe sideband link. These messages share the same RxQ-ID request
buffers as encapsulated MTP messages. If an MPG does not support these messages or does not
support these messages from a given vendor (identified by the UCIe Vendor ID in the header), the
MPG silently drops those messages.
The Vendor-defined Management Port Gateway message without data on the sideband has the format
shown in Figure 7-13.
Figure 7-13. Vendor-defined Management Port Gateway MPM without Data on Sideband
(Figure content: MPM Header with srcid = 011b, resp = 0, vc, msgcode = FFh, Vendor-defined bits, and
opcode = 10111b in the first DW; cp, rsvd, dstid = 111b, UCIe Vendor ID, rsvd, and rxqid in the second
DW.)
Field     Description
resp      Vendor-defined Management Port Gateway message without data always uses the Request channel.
          The value must be 0.
rxqid     RxQ-ID of the receiver queue to which the message belongs (see Section 8.2.3.1.2 for RxQ
          details).
a. See Table 7-17 for details of header fields common to all MPMs without data on the sideband.
7.1.3.1 Flow Control and Data Integrity over FDI and RDI
For each Transmitter associated with FDI or RDI, a design time parameter of the interface is used to
determine the number of credits advertised by the Receiver, with a maximum of 32 credits. Each
credit corresponds to 64 bits of header and 64 bits of potentially associated data. Thus, there is only
one type of credit for all sideband packets, regardless of how much data they carry. Every
Transmitter/Receiver pair has an independent credit loop. For example, on RDI, credits are advertised
from Physical Layer to Adapter for sideband packets transmitted from the Adapter to the Physical
Layer; and credits are also advertised from Adapter to the Physical Layer for sideband packets
transmitted from the Physical Layer to the Adapter.
The Transmitter must check for available credits before sending Register Access requests and
Messages. The Transmitter must not check for credits before sending Register Access Completions,
and the Receiver must guarantee unconditional sinking for any Register Access Completion packets.
Messages carrying requests or responses consume a credit on FDI and RDI, but they must be
guaranteed to make forward progress by the Receiver and not get blocked behind Register Access
requests. Both RDI and FDI provide a dedicated signal for sideband credit returns across those
interfaces.
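IMPLEMENTATION NOTE
As a non-normative illustration of this credit discipline, the following C sketch (hypothetical type and function names) models one Transmitter/Receiver credit loop in which Register Access requests and Messages consume a credit while Register Access Completions bypass the check:

#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one sideband credit loop on FDI or RDI.
 * Each credit covers a 64-bit header plus 64 bits of optional data. */
typedef struct {
    uint8_t credits;     /* credits currently available to the Transmitter */
    uint8_t advertised;  /* design time parameter, at most 32 */
} sb_credit_loop;

typedef enum { PKT_REQUEST, PKT_MESSAGE, PKT_COMPLETION } sb_pkt_kind;

/* Returns true if the packet may be sent now. Completions never
 * consume a credit and are never blocked; the Receiver must sink
 * them unconditionally. */
bool sb_try_send(sb_credit_loop *loop, sb_pkt_kind kind)
{
    if (kind == PKT_COMPLETION)
        return true;              /* no credit check for completions */
    if (loop->credits == 0)
        return false;             /* requests and messages wait for a credit */
    loop->credits--;              /* one credit regardless of data size */
    return true;
}

/* Called when a credit return arrives on the dedicated
 * credit-return signal of the interface. */
void sb_credit_return(sb_credit_loop *loop)
{
    if (loop->credits < loop->advertised)
        loop->credits++;
}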
All Receivers associated with RDI and FDI must check received messages for data or control parity
errors, and these errors must be mapped to Uncorrectable Internal Errors (UIE) and transition RDI to
LinkError state.
When supporting Management Port Messages over sideband, the Physical Layer maintains separate
credited buffers (whose depth is a design time parameter) for each RxQ-ID it supports, into which it
can receive Management Port Messages from the Management Port Gateway over the RDI
configuration bus. Whether received over FDI or RDI, Management Port Messages are always sunk
unconditionally in the Management Port Gateway.
The remote Link partner's Adapter or Physical Layer registers are accessible only via the indirect
mailbox mechanism, and the number of outstanding transactions is limited to four at a time.
Although four credits are provisioned, there is only a single mailbox register, which limits the
number of outstanding requests that can use this mechanism to one at a time. The extra credits allow
additional debug-related register access requests in case of register access timeouts. These credits
are separate from local FDI or RDI accesses, and thus the Physical Layer must provision for sinking at
least one register access request and completion each from remote die and local Adapter in addition
to other sideband request credits (see Implementation Note below). The Adapter provisions for at
least four remote register access requests from remote die Adapter. Each credit corresponds to 64b of
header and 64b of data. Even requests that send no data or only send 32b of data consume one
credit. Register Access completions do not consume a credit and must always sink.
If Management Transport Protocol is not supported, the Adapter credit counters for register access
request are initialized to 4 on Domain Reset exit OR whenever RDI transitions from Reset to Active.
If Management Transport Protocol is supported, the Adapter credit counters for register access
request are initialized to 4 on [Domain Reset exit] OR whenever [RDI transitions from Reset to Active
AND SB_MGMT_UP=0].
It is permitted to send (N-4) extra credit returns to the remote Link partner if a UCIe implementation
is capable of sinking a total of N requests once RDI has transitioned to Active state. The Adapter must
implement a saturating credit counter capable of accumulating at least 4 credits, and hence prevent
excess credit returns from overflowing the counter.
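IMPLEMENTATION NOTE
A minimal sketch of this saturating behavior, assuming hypothetical names: the counter is initialized to 4 and silently drops credit returns beyond its capacity, so the (N-4) extra returns described above cannot overflow it:

#include <stdint.h>

#define REMOTE_REQ_INIT_CREDITS 4u

typedef struct {
    uint8_t count;  /* available remote register access request credits */
    uint8_t max;    /* saturation limit; must be at least 4 */
} sat_credit_counter;

void remote_req_credits_init(sat_credit_counter *c, uint8_t max)
{
    c->max = (max < REMOTE_REQ_INIT_CREDITS) ? REMOTE_REQ_INIT_CREDITS : max;
    c->count = REMOTE_REQ_INIT_CREDITS;  /* e.g., on Domain Reset exit */
}

void remote_req_credit_return(sat_credit_counter *c)
{
    if (c->count < c->max)
        c->count++;                      /* saturate instead of overflowing */
}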
All other messages except Vendor Defined messages must always sink and make forward progress,
and not block any messages on the sideband interface behind them. All Link Management message
requests have an associated response, and the source of these messages must only have one
outstanding request at a time (i.e., one outstanding message per “Link Management Request”
MsgCode encoding).
For vendor defined messages, there must be a vendor defined cap on the number of outstanding
messages, and the Receiver must guarantee sufficient space so as to not block any messages behind
the vendor defined messages on any of the interfaces.
IMPLEMENTATION NOTE
Figure 7-14 shows an example of an end-to-end register access request to remote die
and the corresponding completion returning back.
[Figure 7-14: End-to-end register access to a remote die. Labeled steps visible in the figure include: 3. Remote die request over RDI cfg; 4. Remote die request over UCIe sideband; 8. Response for remote die over UCIe sideband; 9. Response from remote die over RDI cfg.]
In Step 1 shown in Figure 7-14, the Protocol Layer checks for FDI credits before
sending the request to Adapter Die 0. Adapter Die 0 completes the mailbox request
as soon as the mailbox register is updated (shown in Step 1a). FDI credits are
returned once its internal buffer space is free. In Step 2, Adapter Die 0 checks credits
for remote Adapter as well as credits for local RDI before sending the remote die
request to Physical Layer Die 0 in Step 3. Physical Layer schedules the request over
UCIe sideband and returns the RDI credit to Adapter Die 0 once it has freed up its
internal buffer space.
In Step 5, Physical Layer Die 1 checks for Adapter Die 1 credits on RDI before sending
the request over RDI. Adapter Die 1 decodes the request to see that it must access a
register on Physical Layer Die 1; Adapter Die 1 checks for RDI credits of Physical
Layer Die 1 before sending the request over RDI in Step 6. Adapter Die 1 must remap
the tag for this request, if required, and save off the original tag of the remote die
request as well as pre-allocate a space for the completion. Physical Layer Die 1
completes the register access request and responds with the corresponding
completion. Because a completion is sent over RDI, no RDI credits need to be
checked or consumed. Adapter Die 1 generates the completion for the remote die
request and sends it over RDI (no credits are checked or consumed for completion
over RDI) in Step 7. The completion is transferred across the different hops as shown
in Figure 7-14 and finally sunk in Adapter Die 0 to update the mailbox information. No
RDI credits need to be checked for completions at the different hops.
For forward progress to occur, the Adapters and Physical Layers on both die must
ensure that they can sink sufficient requests, completions, and messages to
guarantee that there is no Link Layer level dependency between the different types of
sideband packets (i.e., remote register access requests, remote register access
completions, Link state transition messages for Adapter LSM(s), Link state transition
messages for RDI, and Link Training related messages). In all cases, because at most
one or two outstanding messages are permitted for each operation, it is relatively
easy to provide at least as many buffers as are needed to sink them from RDI. For
example, in the scenario shown in Figure 7-14, Physical Layer Die 1 must ensure that
it has dedicated space to sink the request in Step 6 independent of any ongoing
remote register access request or completion from Die 1 to Die 0, or any other
sideband message for state transition, etc. Similarly, Physical Layer Die 1 must have
dedicated space for remote die register access completion in Step 7.
Implementations of the Physical Layer and Adapter must ensure that there is no
receiver buffer overflow for messages being sent over the UCIe sideband Link. This
can be done by either ensuring that the time to exit clock gating is upper bounded
and less than the time to transmit a sideband packet over the UCIe sideband Link, OR
that the Physical Layer has sufficient storage to account for the worst-case backup of
each sideband message function (i.e., remote register access requests, remote
register access completions, Link state transition messages for Adapter LSM(s), Link
state transition messages for RDI, and Link Training related messages). The latter
offers more general interoperability at the cost of buffers.
§§
8.1.1 Overview
UCIe Manageability is optional and defines mechanisms, independent of the UCIe mainband
protocols, to manage a UCIe-based SiP. This accelerates the construction of a UCIe-based SiP by
allowing a common manageability architecture and hardware/software infrastructure to be leveraged
across implementations.
Examples of functions that may be performed using UCIe Manageability include the following:
• Discovery of chiplets that make up an SiP and their configuration,
• Initialization of chiplet structures, and parameters (i.e., serial EEPROM replacement),
• Firmware download,
• Power and thermal management,
• Error reporting,
• Performance monitoring and telemetry,
• Retrieval of log and crash dump information,
• Initiation and reporting of self-test status,
• Test and debug, and
• Various aspects of chiplet security.
Manageability capabilities are discoverable and configurable, allowing a common firmware base to
be rapidly used across SiPs. UCIe manageability builds on top of applicable industry standards.
A UCIe Chiplet that supports manageability contains a Management Fabric and one or more
Management Entities. A Management Entity is a Management Element, a Management Port, or a
Management Bridge. An example UCIe chiplet that supports manageability is shown in Figure 8-1 and
an example SiP that supports manageability consisting of four UCIe chiplets is shown in Figure 8-2.
[Figure 8-1: Example UCIe chiplet that supports manageability. A Management Fabric interconnects Management Elements, a Management Bridge, and a Management Port.]
[Figure 8-2: Example SiP composed of four UCIe chiplets. Per-chiplet Management Fabrics are interconnected via Management Ports, with Management Elements, Management Bridges, and a Management Director.]
[Figure 8-3: UCIe Manageability Protocol Hierarchy. Management Protocol(s) are carried over the UCIe Management Transport, alongside UCIe Protocol(s), over the UCIe link.]
A Management Port is a Management Entity that acts as the interface between the Management
Fabric within a chiplet and a point-to-point management link that interconnects two chiplets. As
shown in Figure 8-3 (UCIe Manageability Protocol Hierarchy), below the UCIe Management Transport is
a Management Link Encapsulation Mechanism that defines how UCIe Management Transport packets
are transferred across a point-to-point management link. Two Management Link Encapsulation
Mechanisms are defined, one for the UCIe sideband and one for the UCIe mainband. See Section 8.2
for additional details of Management Link Encapsulation Mechanisms. Whether a specific UCIe
sideband or mainband link in a chiplet may function as a point-to-point management link is
implementation specific.
A chiplet that supports manageability should support at least one UCIe sideband Management Port. To
enable broad interoperability, it is strongly recommended that a chiplet support enough UCIe
sideband Management Ports to enable construction of an SiP with a single management domain using
only UCIe sideband. If a chiplet supports management applications that require high bandwidth, such
as test, debug, and telemetry, then it is strongly recommended that the chiplet support UCIe
mainband Management Ports.
A Management Fabric within a UCIe chiplet facilitates communication between Management Entities
inside the chiplet. A Management Entity is a Management Element, a Management Port, or a
Management Bridge. The Management Fabric may be realized using one or more on-die fabrics and
the implementation of the Management Fabric is beyond the scope of this specification.
One of the Management Elements within an SiP is designated as the Management Director. An SiP
may contain multiple Management Elements that may act as a Management Director; however, there
can only be one active Management Director at a time. How the Management Director is selected in
such SiPs is beyond the scope of this specification. The roles of the Management Director include the
following:
• Discovering chiplets and configuring Chiplet IDs,
• Discovering and configuring Management Elements,
One or more of the Management Elements within an SiP may function as a Security Director. A
Security Director is responsible for configuring security parameters.
The relationship between the various types of Management Entities is shown in Figure 8-4.
[Figure 8-4: Types of Management Entities. A Management Element may be a Management Director, a Security Director, a DFx Management Hub, or another type of Management Element.]
Unless otherwise specified, UCIe manageability, Management Entities, and all associated
manageability structures in a chiplet (e.g., those in a Management Capability Structure) are reset on
a Management Reset. A Management Reset occurs on initial application of power to a chiplet. Other
conditions that cause a Management Reset are implementation specific.
Reserved fields in a UCIe Management Transport packet must be filled with 0s when the packet is
formed. Reserved fields must be forwarded unmodified on the Management Network and ignored by
receivers. An implementation that relies on the value of a reserved field in a packet is non-compliant.
[Figure: UCIe Management Transport packet format. A two-DWORD Management Transport header (DWORD 0: Destination ID, Mgmt. Protocol ID, TC, Resp, PIPP, Reserved, Ver, Packet ID; DWORD 1: Source ID, Security Clearance Group, Length) is followed by the Management Protocol Specific field in DWORDs 2 through M.]
Table 8-1 defines the fields of a UCIe Management Transport packet. All fields in the table have little
endian bit ordering (e.g., Destination ID bits 0 through 7 are in Byte 1 with bit 0 of the Destination ID
in Byte 1 bit 0, and Destination ID bits 8 through 15 are in Byte 0 with Destination ID bit 8 in Byte 0
bit 0).
Table 8-1. UCIe Management Transport packet fields:
• Destination ID (16 bits): This field specifies the Management Network ID of the entity on the Management Network that is to receive the packet.
• Mgmt. Protocol ID (3 bits): This field contains an ID that corresponds to a Management Protocol and specifies the type of payload contained in the Management Protocol Specific field. See Section 8.1.3.1.3 for ID values.
• TC, Traffic Class (3 bits): This field is a packet label used to enable different packet servicing policies. Each Traffic Class is a unique ordering domain with no ordering guarantees between packets in different Traffic Classes. See Section 8.1.3.1.1.
• Resp, Request or Response (1 bit): This field indicates whether the packet is a request or a response. 0: Request packet; 1: Response packet.
• Source ID (16 bits): This field indicates the Management Network ID of the entity on the Management Network that originated the packet.
• Length (9 bits): This field indicates the length of the entire packet in DWORDs, including the UCIe Management Network Header, the Management Protocol Specific field, and the Packet Integrity Protection field if present. The length of the packet in DWORDs is equal to the value of this field plus 1 (e.g., a value of 000h in this field indicates a packet length of one DWORD, a value of 001h in this field indicates a packet length of two DWORDs, and so on).
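IMPLEMENTATION NOTE
The byte placement and the plus-one Length encoding above can be captured in a few lines of C (a non-normative sketch with hypothetical helper names; hdr points at the first byte of the header):

#include <stdint.h>

/* Destination ID bits 0..7 are in Byte 1 and bits 8..15 in Byte 0,
 * per the bit-ordering example in Table 8-1. */
uint16_t mtp_destination_id(const uint8_t *hdr)
{
    return (uint16_t)hdr[1] | ((uint16_t)hdr[0] << 8);
}

/* The 9-bit Length field encodes (length in DWORDs - 1):
 * 000h -> 1 DWORD, 001h -> 2 DWORDs, ..., 1FFh -> 512 DWORDs. */
uint32_t mtp_length_dwords(uint16_t length_field)
{
    return (uint32_t)(length_field & 0x1FFu) + 1u;
}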
A request packet is a packet with the Resp field in the UCIe Management Transport packet header
assigned to 0. A requester is a Management Entity that first introduces a request packet into a
Management Fabric. A responder is the Management Entity that performs the actions associated with
a request that consists of one or more request packets and is the ultimate destination of these
request packet(s). A response packet is a packet with the Resp field in the UCIe Management
Transport packet header assigned to 1. As a result of performing the actions associated with a
request, the responder may introduce one or more response packets into the Management Fabric
destined to the requester.
A Management Entity that receives a malformed packet must discard the packet; the Implementation Note after this list illustrates representative checks.
• A packet with an incorrect length in the Length field is malformed.
— An example of a malformed packet with an incorrect length in the Length field is one where
the Length field in the UCIe Management Transport packet header indicates a length of 65
DWORDs but the actual length of the packet is 64 DWORDs.
• A packet whose size exceeds the Management Transport Packet size supported by a chiplet is
malformed.
• A packet that violates a requirement outlined in this specification is malformed.
— An example of a malformed packet due to a requirement violation is a response packet with a
nonzero value in the Security Clearance Group field.
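IMPLEMENTATION NOTE
As a non-normative illustration, the following C sketch (hypothetical types and field names) applies the malformation checks enumerated above to a received packet:

#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t actual_dwords;  /* DWORDs actually received */
    uint32_t length_dwords;  /* decoded Length field value + 1 */
    bool     resp;           /* Resp bit */
    uint8_t  scg;            /* Security Clearance Group field */
} mtp_packet;

/* Returns true if the packet must be discarded as malformed.
 * chiplet_mps is the chiplet's Maximum Packet Size in DWORDs.
 * This covers only the example rules given in this section and
 * is not exhaustive. */
bool mtp_is_malformed(const mtp_packet *p, uint32_t chiplet_mps)
{
    if (p->length_dwords != p->actual_dwords)
        return true;  /* Length field disagrees with the actual length */
    if (p->actual_dwords > chiplet_mps)
        return true;  /* exceeds the supported Management Transport Packet size */
    if (p->resp && p->scg != 0)
        return true;  /* response with nonzero Security Clearance Group */
    return false;
}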
Traffic classes (TCs) are used to enable different packet servicing policies. The UCIe Management
Transport supports eight traffic classes, and the characteristics of each traffic class are described in
Table 8-2. The Management Director configures management fabric routes and may configure
different routes for different traffic classes (e.g., TC0 traffic may be routed to UCIe sideband
Management Port A and TC2 traffic may be routed to UCIe mainband Management Port B).
[Table 8-2: Traffic Class characteristics. TC values 5 to 6 are Reserved.]
A request packet and an associated response packet need not have the same traffic class. Which
traffic classes are allowed and how they are mapped between a request and response are
Management Protocol specific.
An implementation shall ensure forward progress on all traffic classes. Quality of service between
traffic classes is implementation specific and beyond the scope of this specification.
Each traffic class is a unique ordering domain and there are no ordering guarantees for packets in
different traffic classes and there are no ordering guarantees between request and response packets
in the same traffic class. Regardless of the traffic class, a response packet associated with any traffic
class must be able to bypass a blocked request packet associated with any traffic class.
Within an ordered traffic class, request packets are delivered in-order from a requester to a responder
and response packets are delivered in-order from a responder to the requester. There are no ordering
guarantees between requests to different responders and there are no ordering guarantees between
responses from different responders to a requester.
Within an unordered traffic class there are no ordering guarantees between packets of any type and
the chiplet’s Management Fabric is free to arbitrarily reorder packets. While packets may be reordered
on an unordered traffic class, there is no requirement that they be reordered (i.e., an implementation
is free to maintain ordering as in an ordered traffic class).
Packets associated with a lossy traffic class may be dropped during normal operation. The policy used
to determine when a lossy traffic class packet is dropped is vendor defined and beyond the scope of
this specification (e.g., due to exceeding a vendor defined congestion threshold). Lossless traffic
classes are reliable, and packets associated with a lossless traffic class are not dropped during normal
operation; however, packets associated with a lossless traffic class may be dropped due to an error
condition. The detection of lost packets and recovery from lost packets is the responsibility of a
Management Protocol.
To maintain forward compatibility, a UCIe Management Transport packet with a reserved traffic class
value is treated in the same manner as a packet with a traffic class value of zero (i.e., Default Ordered
Lossless Traffic Class).
To maintain interoperability, all implementations are required to support all traffic classes; however,
an implementation is only required to maintain the ordered and lossless characteristics of a traffic
class. All other traffic class characteristics may be ignored. For example, an implementation may treat
all traffic classes in the same manner as the Default Ordered Lossless Traffic Class. Under no
circumstances may packets in an ordered traffic class be reordered between a requester and a
responder. Similarly, under no circumstances may a packet in a lossless traffic class be dropped in a
properly functioning system without errors.
The length of a packet is indicated by the Length field, is an integral number of DWORDs, and covers
the entire packet (i.e., the UCIe Management Network Header, the Management Protocol Specific
field, and the Packet Integrity Protection field if present).
The maximum packet length architecturally supported by the UCIe Management Transport is 512
DWORDs (i.e., 2048B). A chiplet may support a maximum packet length that is less than the
architectural limit. The Maximum Packet Size (MPS) field in the Chiplet Capability Structure indicates
the maximum packet size supported by the chiplet. If a packet larger than that advertised by MPS is
received on a Management Port or Management Bridge, then it is silently discarded and not emitted
on the chiplet’s Management Fabric.
The Configured Maximum Packet Size (CMPS) field in the Chiplet Capability Structure controls the
maximum UCIe Management Transport packet size generated by a Management Entity within the
chiplet. The initial value of this field corresponds to 8 DWORDs (i.e., 64B). The CMPS field does not
affect UCIe Management Transport packets emitted by Management Ports and Management Bridges
when forwarding packets into a chiplet’s Management Fabric. This allows packets to be routed through
the chiplet (e.g., between Management Ports) that are larger than the size of packets generated by
Management Entities within the chiplet.
The Management Protocol field in a packet contains a Management Protocol ID that indicates the
format of the Management Protocol Specific field. Table 8-3 lists the Management Protocols supported
by the UCIe Management Transport.
[Table 8-3: Management Protocol IDs. 0: Reserved; 3 to 6: Reserved; 7: Vendor defined.]
As shown in Figure 8-6, a Management Network ID is a 16-bit field that consists of the concatenation
of a Chiplet ID and an Entity ID. The Chiplet ID uniquely identifies a chiplet within an SiP and the
Entity ID is a fixed value that uniquely identifies a Management Entity within a chiplet. Together the
Chiplet ID and Entity ID uniquely identify each Management Entity in an SiP.
[Figure 8-6: Management Network ID format. Chiplet ID in bits 15:N, Entity ID in bits N-1:0.]
The size of the Chiplet ID in bits is chiplet implementation specific and may be 2-bits to 15-bits in
size. The size of the Entity ID in bits is also chiplet implementation specific and may be 1-bit to 14-
bits in size. For each chiplet, the sum of the size of the Chiplet ID and the Entity ID must be 16-bits.
The size of the Chiplet ID and Entity ID fields may be different in different chiplets in an SiP (i.e., it is
not a requirement that all chiplets of an SiP have the same Chiplet ID field size). To facilitate
interoperability an implementation should make the Chiplet ID field as large as possible (i.e., make
the Entity ID field only as large as needed to address Management Entities in the chiplet).
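IMPLEMENTATION NOTE
For illustration, two hypothetical C helpers that split a 16-bit Management Network ID at a chiplet-specific Entity ID width; with a 4-bit Entity ID, for example, the Management Network ID ABC3h decodes to Chiplet ID ABCh and Entity ID 3h:

#include <stdint.h>

/* entity_bits is the chiplet-specific Entity ID width (1..14). */
static inline uint16_t mnid_chiplet_id(uint16_t mnid, unsigned entity_bits)
{
    return mnid >> entity_bits;                         /* upper bits */
}

static inline uint16_t mnid_entity_id(uint16_t mnid, unsigned entity_bits)
{
    return mnid & (uint16_t)((1u << entity_bits) - 1u); /* lower bits */
}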
The Management Director initializes the Chiplet ID value associated with each chiplet during SiP
initialization. This is done by writing the Chiplet ID value to the Chiplet field in the Chiplet Capability
Structure and setting the Chiplet ID Valid bit in that structure. The Management Director may
determine the size of the Chiplet ID associated with a chiplet by examining the initial value of the
Chiplet ID field in the Chiplet Capability Structure or by determining which bits are read-write in this
field.
Each Management Entity within a chiplet has a fixed Entity ID. Entity ID zero within each chiplet must
be a Management Element that is used to initialize and configure the chiplet. Entity IDs within a
chiplet that map to Management Entities may be sparse. For example, a chiplet may contain
Management Entities at Management Entity IDs zero, one, and three. The other Entity IDs are
unused. A UCIe Management Packet that targets an unused Entity ID within a chiplet is silently
discarded by the chiplet’s Management Fabric.
A System-in-Package (SiP) may be composed of chiplets with varying sizes for the
Chiplet ID portion of the Management Network ID. To establish routing within such an
SiP, one approach is to identify the smallest Chiplet ID size among all chiplets in the
SiP. Subsequently, Route Entries (as described in Section 8.1.3.6.2.2) are configured
as if each chiplet possessed a Chiplet ID size equivalent to this minimum. For chiplets
with a Chiplet ID size exceeding the minimum, any unused Chiplet ID bits are
assigned 0 in the Base ID field of a Route Entry and set to 1 in the Limit ID field of a
Route Entry.
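IMPLEMENTATION NOTE
A minimal sketch of this Base ID/Limit ID construction (hypothetical names), for a routing chiplet whose own Chiplet ID is wider than the SiP-wide minimum: unused Chiplet ID bits are assigned 0 in the Base ID and 1 in the Limit ID, and Entity ID bits remain 0 in both fields:

#include <stdint.h>

typedef struct { uint16_t base_id, limit_id; } route_range;

/* min_chiplet_id is a Chiplet ID of min_chiplet_bits width (the
 * SiP-wide minimum); entity_bits is the routing chiplet's Entity ID
 * width, so this chiplet has (16 - entity_bits) Chiplet ID bits. */
route_range route_range_for(uint16_t min_chiplet_id,
                            unsigned min_chiplet_bits,
                            unsigned entity_bits)
{
    unsigned extra_bits = 16u - min_chiplet_bits - entity_bits;
    route_range r;
    r.base_id = (uint16_t)((uint32_t)min_chiplet_id
                           << (extra_bits + entity_bits));
    r.limit_id = (uint16_t)(r.base_id |
                 (((1u << extra_bits) - 1u) << entity_bits));
    return r;
}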
8.1.3.3 Routing
UCIe Management Transport packets are routed based on the Destination ID field in the UCIe
Management Network Header.
The routing of UCIe Management Transport packets differs depending on whether the Chiplet ID value
is initialized or uninitialized. The Chiplet ID value is initialized when the Chiplet ID Valid bit is set to 1
in the Chiplet Capability Structure. The Chiplet ID value is uninitialized when the Chiplet ID Valid bit is
cleared to 0 in the Chiplet Capability Structure.
A UCIe Management Transport request packet is one where the Resp field is cleared to 0. A UCIe
Management Transport response packet is one where the Resp field is set to 1.
While the Chiplet ID and Entity ID size of chiplets may vary in an SiP, all packet routing associated
with a chiplet is performed using the Chiplet ID size of the chiplet performing the routing.
The method used to configure UCIe Management Transport routing within an SiP is beyond the scope
of this specification. The routing may be pre-determined during SiP design and this static
configuration may be used by the Management Director to configure routing. Alternatively, the
Management Director may discover the SiP configuration (i.e., chiplets, Management Network
topology, and Chiplet ID size of each chiplet) and use this information to dynamically configure
routing.
Because the Management Network may contain redundant management links between chiplets as
well as links that form cycles, care must be exercised to ensure that the UCIe Management Transport
routing is acyclic and deadlock free. In the absence of faults, request packets and response packets
are delivered in order; however, it is possible to configure UCIe Management Transport routing such
that the path used by a request packet from a requester to a responder is different from the path
used by a response packet from the responder back to the requester.
This section describes routing of a UCIe Management Transport packet generated by a Management
Entity within the chiplet.
• If the chiplet’s Chiplet ID value is initialized, then the packet is routed as follows.
— If the Chiplet ID portion of the packet’s Destination ID matches the Chiplet ID value of the
chiplet performing the routing, the packet is routed within the chiplet based on the Entity ID
portion of the packet’s Destination ID.
o If the Entity ID portion of the packet’s Destination ID matches that of a Management
Entity within the chiplet, then the packet is routed to that Management Entity.
o If the Entity ID portion of the packet’s Destination ID does not match that of any
Management Entity within the chiplet, then the packet is discarded.
— If the Chiplet ID portion of the packet’s Destination ID does not match the Chiplet ID value of
the chiplet, then the packet is routed based on Management Port Route Entries.
o If the packet’s Destination ID matches a Route Entry associated with a Management Port,
then the packet is routed out that Management Port. See Section 8.1.3.6.2.2 for Route
Entry matching rules.
o If the packet’s Destination ID matches multiple Route Entries within the chiplet and the
packet is associated with an ordered traffic class, then the packet is discarded.
o If the packet’s Destination ID matches multiple Route Entries within the chiplet and the
packet is associated with an unordered traffic class, then the packet is routed out one of
the Management Ports with a matching Route Entry. Which matching Route Entry is
selected in the unordered traffic class case is vendor defined.
o If the packet’s Destination ID does not match any Route Entries within the chiplet, then
the packet is discarded.
• If the chiplet’s Chiplet ID value is uninitialized, then packet is routed as follows.
— If the packet is a UCIe Management Transport Response packet and the corresponding UCIe
Management Transport Request packet was received from a Management Port, then the
packet is routed as follows.
o If the link associated with the Management Port on which the corresponding UCIe
Management Transport Request packet was received is up, then the response packet is routed
out that same Management Port on virtual channel zero (VC0). This allows a Management
Entity outside the chiplet to configure the chiplet before the chiplet's Chiplet ID value is
initialized.
o If the link associated with the Management Port on which the corresponding UCIe
Management Transport Request packet was received is not up, then the response packet
is discarded.
— Otherwise, the packet is routed within the chiplet based on the Entity ID portion of the
packet’s Destination ID.
o If the Entity ID portion of the packet’s Destination ID matches that of a Management
Entity within the chiplet, then the packet is routed to that Management Entity.
o If the Entity ID portion of the packet’s Destination ID does not match that of any
Management Entity within the chiplet, then the packet is discarded.
When a response packet is routed out the Management Port on which the
corresponding request was received, it is routed out the Management Port on virtual
channel zero. While there is no requirement that requests to uninitialized chiplets use
virtual channel zero (VC0), it is strongly encouraged that virtual channel zero be
used.
This section describes routing of a UCIe Management Transport packet received on a chiplet's
Management Port.
• If the chiplet's Chiplet ID value is initialized, then the packet is routed in the same manner as a
packet generated by a Management Entity within the chiplet. See Section 8.1.3.3.1 for how such
a packet is routed.
• If the chiplet’s Chiplet ID value is uninitialized, then the packet is routed within the chiplet based
on the Entity ID portion of the packet’s Destination ID.
— If the Entity ID portion of the packet’s Destination ID matches that of a Management Entity
within the chiplet, then the packet is routed to that Management Entity.
— If the Entity ID portion of the packet’s Destination ID does not match that of any Management
Entity within the chiplet, then the packet is discarded.
CRC protection, defined in Section 8.1.3.4.1, is the only packet integrity protection currently defined
and is one DWORD in size.
When the PIPP field in a packet is set to 11b, then the Packet Integrity Protection field in the packet is
one DWORD in size and contains a 32-bit CRC computed over the preceding contents of the packet
(i.e., the UCIe Management Network Header and Management Protocol Specific field). Each bit of the
Packet Integrity Protection field is set to the corresponding bit of the computed CRC (e.g., bit 31 of
the computed CRC corresponds to bit 31 of the Packet Integrity Protection field).
The 32-bit CRC required by this specification is CRC-32C (Castagnoli) which uses the generator
polynomial 1EDC6F41h. The CRC is calculated using the following Rocksoft™ Model CRC Algorithm
parameters:
Name : "CRC-32C"
Width : 32
Poly : 1EDC6F41h
Init : FFFFFFFFh
RefIn : True
RefOut : True
XorOut : FFFFFFFFh
Check : E3069283h
When the PIPP field in a packet is set to 11b and the CRC contained in the Packet Integrity Protection
field is incorrect, then the packet is discarded.
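IMPLEMENTATION NOTE
The following non-normative C routine implements CRC-32C with exactly these Rocksoft parameters. Because RefIn and RefOut are True, it uses right shifts with the reflected polynomial 82F63B78h (the bit-reversal of 1EDC6F41h); applied to the 9-byte ASCII string "123456789" it returns E3069283h, matching the Check value above:

#include <stddef.h>
#include <stdint.h>

uint32_t crc32c(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;                  /* Init */
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];                          /* LSB-first (RefIn) */
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;                    /* XorOut */
}

/* crc32c((const uint8_t *)"123456789", 9) == 0xE3069283u (Check) */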
Some Management Protocols have their own security architecture, so the use of the access control
mechanism is Management Protocol specific. Table 8-4 shows which Management Protocols use the
access control mechanism.
[Table 8-4: Management Protocol use of the Access Control Mechanism. Note: the UCIe Test and Debug Protocol uses the UCIe Memory Access Protocol for configuration and status field accesses and as a result uses the Access Control Mechanism.]
The access control mechanism is based on a 7-bit security clearance group value contained in request
packets. When a Management Entity emits a request packet on the Management Network, it
populates the Security Clearance Group field in the packet with a value that corresponds to the
security clearance group associated with the requester. When a Management Entity receives a request
packet, it determines whether the asset(s) accessed by the request are allowed or prohibited by that
security clearance group.
A Management Entity may emit packets with different security clearance group values. How the
security clearance group values that a Management Entity emits are configured or selected is beyond
the scope of this specification (see Section 8.1.3.6.4.1).
Although the security clearance group value is seven bits in size, an implementation may choose to
restrict the number of security clearance groups. When an implementation restricts the number of
security clearance groups to a value N, then security clearance group values from 0 to (N-1) are
allowed, and security clearance group values from N to 127 are not allowed. Security Clearance Group
0 is dedicated for use by Security Directors (see Section 8.1.3.5.2).
The method a Management Entity uses to determine whether a request packet is allowed access to an
asset is shown in Figure 8-7 and consists of the following steps.
1. The standard asset class ID or the vendor defined asset class ID of the asset being accessed by
the request packet is determined.
— UCIe defines standard asset classes (see Section 8.1.3.5.1) and supports vendor defined
asset classes. Each asset must map to a standard asset class, a vendor defined asset class, or
both.
— Associated with each asset class is an asset class ID. The mapping associated with this step
produces a standard asset class ID, a vendor defined asset class ID, or both.
— The manner and granularity in which an implementation maps assets to asset class IDs are
beyond the scope of this specification and may be done as part of address decoding, tagging
of assets, or some other mechanism.
2. Each asset class ID determined in the previous step is mapped to a 256-bit access control
structure.
— Associated with each asset class ID is a 128-bit Read Access Control (RAC) structure and a
128-bit Write Access Control (WAC) structure. If an asset’s state is being read, then the RAC
structure is selected as the access control structure. If an asset’s state is being written/
modified, then the WAC structure is selected as the access control structure.
o RAC and WAC structures are contained in the Access Control Capability Structure (see
Section 8.1.3.6.3).
o If a Management Entity does not have any assets that correspond to an asset class ID,
then the RAC and WAC structures associated with that asset class ID must be hardwired
to 0 (i.e., all the bits on both the RAC and WAC structures are read-only with a value of 0)
so no accesses would be allowed.
o If an implementation restricts the number of security clearance groups, then RAC and
WAC structure bits associated with security clearance groups that are not supported must
be hardwired to 0. For example, if an implementation only supports Security Clearance
Groups 0 through 3, then bits 4 through 127 must be read-only with a value of 0 in all
RAC and WAC structures.
3. The bit corresponding to the security clearance group value in the request packet is examined in
each access control structure determined in the previous step to determine whether access to the
asset is allowed.
— The 7-bit security clearance group value in the request packet is a value from 0 to 127 and
this value maps to a corresponding bit in a 128-bit access structure (e.g., security clearance
group value 27 maps to access control structure bit 27).
— If a bit corresponding to the security clearance group value in an access control structure is
set to one, then access to that asset is allowed by that security clearance group. If a bit
corresponding to the security clearance group value in an access control structure is set to 0,
then access to that asset is prohibited by that security clearance group.
4. If access to the asset is allowed by either the standard asset class or the vendor defined asset
class, then access to the asset is granted and the request is processed. If access to the asset is
prohibited by both the standard asset class and the vendor defined asset class, then access to the
asset is prohibited and the request is not processed.
— How a request that attempts to access a prohibited asset is handled is Management Protocol
specific.
If a request packet requires access to multiple assets for the request to be serviced, then Step 1
through Step 3 are performed and the request is processed only if access is granted to all assets. If
access is prohibited to any asset associated with the request, then no asset is accessed by the
request. The Implementation Note below sketches the per-asset check in Step 3 and Step 4.
[Figure 8-7: Access control check at a responder Management Entity. The entity selects the RACy or WACy structure for the accessed asset class y and tests the bit corresponding to the requester's security clearance group.]
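IMPLEMENTATION NOTE
As a non-normative illustration (hypothetical types; the caller selects the RAC structure for reads or the WAC structure for writes, per Step 2), the per-group bit test of Step 3 and the OR-combining rule of Step 4 can be sketched in C as follows:

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A 128-bit access control structure (RAC or WAC) modeled as four
 * DWORDs, with DWORD 0 holding bits 0..31. One bit per security
 * clearance group; hardwired-0 structures naturally deny everything. */
typedef struct { uint32_t dw[4]; } access_ctrl_128;

/* Step 3: test the bit that corresponds to the 7-bit security
 * clearance group value carried in the request packet. */
static bool acs_allows(const access_ctrl_128 *acs, uint8_t scg)
{
    scg &= 0x7Fu;                       /* values 0..127 */
    return (acs->dw[scg >> 5] >> (scg & 0x1Fu)) & 1u;
}

/* Step 4: access is granted if either the standard asset class or
 * the vendor defined asset class allows it. Pass NULL for a class
 * kind the asset does not map to. */
bool access_granted(const access_ctrl_128 *std_acs,
                    const access_ctrl_128 *vendor_acs,
                    uint8_t scg)
{
    bool std_ok    = (std_acs    != NULL) && acs_allows(std_acs, scg);
    bool vendor_ok = (vendor_acs != NULL) && acs_allows(vendor_acs, scg);
    return std_ok || vendor_ok;
}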
The objective of the standard asset classes is to provide a consistent classification of assets across
chiplet implementations and applications for access control. To achieve this, it must be possible for a
chiplet implementer to map chiplet assets that are accessible over the Management Network into
asset classes without understanding the underlying architecture, roles, or applications associated with
an SiP that uses the chiplet.
Standard asset class 0 is for SiP security configuration (e.g., reading and writing the RAC and WAC
structures). The remaining standard asset classes are constructed by taking the Cartesian product of
a set of asset types and asset contexts and removing elements that are not applicable in practice.
Asset types used to construct the standard asset class are shown in Table 8-5 and asset contexts used
to construct the standard asset class are shown in Table 8-6. The standard asset classes are shown in
Table 8-7.
In some cases, an asset may logically map to two or more standard asset classes. For example, a
memory region may contain both SiP data and chiplet data. When this occurs, the asset should be
mapped to the standard asset class with the lowest ID value.
In some cases, further access control granularity may be desired beyond what is offered by the
standard asset classes. This granularity may be accomplished by separating the assets into different
Management Elements within the chiplet. For example, a chiplet may contain two global secrets and
the chiplet implementor may wish to allow one set of requesters access to the first global secret and a
separate set of requesters access to the second global secret. By putting the two global assets into
two different Management Elements, different security clearance groups may be given access to each
global secret. In another example, a chiplet may contain an I/O controller and a memory controller
and the chiplet implementor may wish to allow one security clearance group to configure the I/O
controller and a different security clearance group to configure the memory controller. This granularity
may be achieved by putting the I/O controller and memory controller in different Management
Elements.
Asset type definitions (Table 8-5) include the following:
• Secret: A secret, data or status that may compromise a secret, or configuration that may control the exposure of a secret. Example: security key.
• Sensitive: Volatile or persistent data that should have limited exposure, status that could expose information that should have limited exposure, or configuration that may control exposure or modification of sensitive information. Examples: error injection capabilities, sensitive state machines, private memory space.
• Status: General status that cannot be used to expose user, sensitive, or secret information. Example: boot status.
Asset context definitions (Table 8-6) include the following:
• Partition: Asset associated with or which affects a partition. The definition of a partition is vendor defined but is broadly defined as a collection of hardware resources that act as an independent unit.
Table 8-7. Standard asset classes (Standard Security Asset Class ID: Asset Context / Asset Type):
• 1: Global / Secret
• 3: SiP / Secret
• 5: SiP / Sensitive
• 6: SiP / Permanent
• 7: SiP / Data
• 8: SiP / Configuration
• 9: SiP / Status
• 11: Chiplet / Secret
• 13: Chiplet / Sensitive
• 14: Chiplet / Permanent
• 15: Chiplet / Data
• 16: Chiplet / Configuration
• 17: Chiplet / Status
• 19: Partition / Secret
• 21: Partition / Sensitive
• 22: Partition / Permanent
• 23: Partition / Data
• 24: Partition / Configuration
• 25: Partition / Status
A Management Element within an SiP that may configure security parameters is designated as a
Security Director. An SiP may contain multiple Security Directors. When an SiP contains multiple
Security Directors, coordination between security directors is beyond the scope of this specification.
The Security Clearance Group value of 0 is reserved for Security Directors and must not be used for
any other purpose.
The Management Director may also be a Security Director. While it is not recommended that the
Management Director operate using the Security Clearance Group value reserved for Security
Directors (i.e., 0) during normal operation, it is required to operate with this value during initial
configuration. When and how a Management Director changes the Security Clearance Group used for
transactions is beyond the scope of this specification.
Unless otherwise specified, Management Capability Structures and any sub-structures must be read
or written as single DWORD quantities (i.e., the Length field in the UCIe Memory Access Request must
be 0h which represents a data length of one DWORD). All fields in a Management Capability Structure
and any sub-structures have little endian bit ordering.
A Management Entity may support the UCIe Memory Access protocol (see Section 8.1.4) which
exposes a 64-bit address space associated with the Management Entity containing fields that allow
configuration by another Management Entity, such as the Management Director.
Each chiplet must support Management Element 0. Chiplet initialization and configuration are
performed through Management Element 0 using the Chiplet Capability Structure and as a result
Management Element 0 must support the UCIe Memory Access protocol. A chiplet may contain other
Management Entities and the number of such Management Entities is implementation specific. These
additional Management Entities may support the UCIe Memory Access protocol.
Figure 8-8 shows the UCIe Memory Access protocol memory map associated with a Management
Entity that supports the UCIe Memory Access Protocol. The contents and organization of the memory
map are implementation specific except for a 64-bit Capability Directory Pointer value located at
address 0. If the Management Entity implements any Management Capability Structures, then the
Management Capability Directory Pointer contains the address of a Management Capability Directory.
If the Management Entity does not implement any Management Capability Structures, then the
Management Capability Directory Pointer contains a value of 0.
The Management Capability Directory, described in Section 8.1.3.6.1, contains a list of pointers (i.e.,
64-bit UCIe Memory Access protocol addresses) to Management Capability Structures supported by
the Management Entity and contains a pointer (i.e., the Element ID) of the next Management Entity in
the chiplet if one exists.
[Figure 8-8: UCIe Memory Access protocol memory map. A 64-bit address space (0000_0000_0000_0000h through FFFF_FFFF_FFFF_FFFFh) with the Capability Directory Pointer at address 0 pointing to a Capability Directory, which in turn points to the Capability Structures.]
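IMPLEMENTATION NOTE
The discovery flow described above can be sketched in C. In this non-normative sketch, mem_read32() stands in for a single-DWORD UCIe Memory Access protocol read and is not defined by this specification; the pointer packing after the directory header is likewise an assumption for illustration (see Figure 8-11 and Table 8-9 for the authoritative directory layout):

#include <stdint.h>

extern uint32_t mem_read32(uint16_t entity_id, uint64_t addr);

/* Structures are accessed as single DWORDs, so a 64-bit pointer is
 * composed from two DWORD reads. */
static uint64_t read_ptr64(uint16_t entity_id, uint64_t addr)
{
    return (uint64_t)mem_read32(entity_id, addr) |
           ((uint64_t)mem_read32(entity_id, addr + 4) << 32);
}

void walk_capability_structures(uint16_t entity_id, uint32_t num_pointers)
{
    /* The Capability Directory Pointer lives at address 0; a value of
     * 0 means the entity implements no Management Capability Structures. */
    uint64_t dir = read_ptr64(entity_id, 0);
    if (dir == 0)
        return;

    for (uint32_t i = 0; i < num_pointers; i++) {
        /* Assumption for this sketch: 64-bit Capability Pointers are
         * packed consecutively after a two-DWORD directory header. */
        uint64_t cap = read_ptr64(entity_id, dir + 8 + 8ull * i);
        uint32_t dw0 = mem_read32(entity_id, cap);
        uint16_t cap_id = (uint16_t)((dw0 >> 16) & 0x3FFFu); /* DWORD 0 [29:16] */
        /* ... dispatch on cap_id, e.g., 000h Chiplet Capability
         * Structure, 001h Access Control Capability Structure ... */
        (void)cap_id;
    }
}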
The organization that all Management Capability Structures follow is shown in Figure 8-9.
A Management Capability Structure is at least two DWORDs in size and may be larger. The size
of a Management Capability Structure is Management Capability Structure specific. Associated with
each Management Capability Structure is a Management Capability ID that identifies the capability.
The Management Capability IDs defined by UCIe are listed in Table 8-8.
[Figure 8-9: Management Capability Structure organization]
Table 8-8. Management Capability IDs (Management Capability ID: Management Capability Structure Name):
• 5 to 12,287: Reserved
• 12,288 to 16,383: Vendor defined (see Figure 8-10)
The top 4096 Management Capability IDs are available for vendor-defined use. The organization of a
Vendor Defined Management Capability Structure is shown in Figure 8-10. Bits 0 through 15 of
DWORD 1 contain the UCIe-assigned identifier of the vendor that specified the Management
Capability Structure.
[Figure 8-10: Vendor Defined Management Capability Structure organization]
The Management Capability Directory provides a method for discovery of Management Capability
Structures associated with a Management Entity. The structure of the Management Capability
Directory is shown in Figure 8-11 and described in Table 8-9.
[Figure 8-11: Management Capability Directory. A directory header is followed by 64-bit Capability Pointer 0 through Capability Pointer N.]
[Table 8-9: Management Capability Directory fields (Field Name, DWORD & Bit Location, Standard Security Asset Class ID, Attribute, Description). See Table 8-7 for a description of Standard Security Asset Class IDs.]
The Chiplet Capability Structure must be implemented in Management Element 0 of each chiplet and
must not be implemented in any other Management Entity in a chiplet.
Figure 8-13 shows the organization of the Chiplet Capability Structure. The Chiplet Capability
Structure describes the basic characteristics of the chiplet. It points to a list of Management Port
Structures that describe the characteristics of chiplet Management Ports. Each Management Port
Structure contains one or more Route Entries that control the routing of UCIe Management Transport
packets from the Management Fabric of the chiplet out the corresponding Management Port to an
adjacent chiplet in the SiP.
[Figure 8-13: Chiplet Capability Structure pointing to a list of Management Port Structures, e.g., a Port 0 Management Port Structure with Route Entries 0 through 3 and a Port 1 Management Port Structure with Route Entries 0 and 1]
[Figure: Chiplet Capability Structure register layout]
Chiplet Capability Structure fields (Field Name (DWORD & Bit Location; Standard Security Asset Class ID; Attribute): Description):
• Management Capability ID (DWORD 0 [29:16]; Asset Class 17; RO): This field specifies the Capability ID of this Management Capability Structure. The Chiplet Capability Structure has a Management Capability ID of 000h.
• Chiplet ID (DWORD 1 [15:0]; Asset Class 8; RW/RO): This field is used to configure the Chiplet ID portion of the 16-bit Management Network ID associated with Management Element zero in the chiplet. A Management Network ID is partitioned into a Chiplet ID field in the upper bits and an Entity ID field in the lower bits (see Section 8.1.3.2). The lower bits of this field associated with the Entity ID portion of the Management Network ID are hardwired to 0 (i.e., RO). Since bits 0 and 1 are only associated with an Entity ID, they are always hardwired to zero. The upper bits of this field associated with the Chiplet ID portion of the Management Network ID may be read and written (i.e., RW). These upper bits must be initialized with the Chiplet ID value associated with the chiplet. The initial value of these upper bits is all 1s. The value of the Chiplet ID portion of the Management Network ID is used for UCIe Management Transport packet routing only when the Chiplet ID Valid (CIV) field is set to 1.
• CIV, Chiplet ID Valid (DWORD 1 [16]; Asset Class 8; RW): When this bit is set to 1, the Chiplet ID value in the Chiplet ID field is used for UCIe Management Transport packet routing.
• Vendor ID (DWORD 2 [15:0]; Asset Class 17; RO): UCIe assigned identifier of the vendor that produced the chiplet.
• Device ID (DWORD 2 [31:16]; Asset Class 17; RO): Vendor assigned identifier that identifies the type of chiplet produced by that vendor. The tuple (Vendor ID, Device ID) uniquely identifies a type of chiplet.
See Table 8-7 for a description of Standard Security Asset Class IDs.
The Management Port Structure provides a mechanism to discover and configure the characteristics
of a chiplet Management Port. The structure contains Route Entries associated with the port and
points to the next Management Port if one exists.
[Figure: Management Port Structure layout. DWORD 2 contains Reserved, NumVCs, Reserved, RMT, Reserved, HT, IDT, Reserved, RLD, LNU, LU, and PS; Route Entries begin at DWORD 8.]
Management Port Structure fields (Field Name (DWORD & Bit Location; Standard Security Asset Class ID; Attribute): Description):
• RL, Retrain Link (DWORD 1 [0]; Asset Class 8; RW): Writing a 1 to this bit initiates retraining of the link associated with the Management Port. Because the UCIe Management Transport may be multiplexed with other protocols on the link, retraining a port link may affect SiP operation. Retraining a UCIe sideband link also retrains the corresponding UCIe mainband link if one exists. Retraining a link may take time to complete after this bit is written; status in this structure does not reflect the result of a link retraining until the operation completes. The Retrain Link Done (RLD) field may be used to determine when the operation has completed.
• PS, Port Status (DWORD 2 [0]; Asset Class 17; RO): This field indicates the current Management Port status. 0: Link Not Up; 1: Link Up.
• LU, Link Up (DWORD 2 [1]; Asset Class 17; RW1C): This bit is set to 1 when the link transitions from a link not up to a link up state. Writing to this field has no effect on the link.
• LNU, Link Not Up (DWORD 2 [2]; Asset Class 17; RW1C): This bit is set to 1 when the link transitions from a link up to a link not up state. Writing to this field has no effect on the link.
• HT, Heartbeat Timeout (DWORD 2 [9]; Asset Class 17; RW1C): This bit is set to 1 when a Heartbeat Timeout is detected (see Section 8.2.5.1.3 for details). Heartbeat Timeout is implemented only on the UCIe sideband.
• Port ID, Port Identifier (DWORD 3 [15:0]; Asset Class 17; RO): This field indicates the chiplet's unique 16-bit identifier associated with the corresponding Management Port. Port identifiers are statically assigned by the chiplet manufacturer, never change, and need not be assigned sequentially (i.e., their assignment may be sparse) except as outlined below. UCIe mainband and sideband ports associated with the same physical connection share all port ID bits in common except bit 0. Bit 0 has a value of 0 in the mainband port identifier and a value of 1 in the corresponding sideband port identifier. For example, if a UCIe mainband port has a port identifier of N, then N is even and the UCIe sideband port associated with that mainband port is odd and has a port identifier of (N+1). The port identifier FFFFh is reserved.
• Port Entity ID (DWORD 4 [13:0]; Asset Class 17; RO): This field indicates the Entity ID associated with the Management Port (see Section 8.1.3.2).
See Table 8-7 for a description of Standard Security Asset Class IDs.
A Route Entry is used to specify a route from the Management Fabric within a chiplet out the
Management Port associated with the Route Entry.
The TC Select field selects the traffic classes that are eligible to match a Route Entry.
A Route Entry may specify a normal route or a default route. The type of route is determined by the
RT field.
While the Chiplet ID and Entity ID size of chiplets may vary in an SiP, all Route Entry matching
associated with a chiplet is performed using the Chiplet ID size of that chiplet (the Implementation
Note later in this section sketches this matching).
• If a Route Entry has the Route Type field set to Normal Route, then a packet matches the route
when all the following are true:
— The link is up,
— The packet is associated with a traffic class that has the corresponding bit of the TC Select
field in the Route Entry set to 1,
— The Chiplet ID portion (using the Chiplet ID width for this chiplet) of the packet's Destination
ID field is greater than or equal to the value in the Base ID field, and
— The Chiplet ID portion (using the Chiplet ID width for this chiplet) of the packet's Destination
ID field is less than or equal to the value in the Limit ID field.
• If a Route Entry has the Route Type field set to Default Route, then a packet matches the route
when all the following are true:
— The link is up,
— The packet is associated with a traffic class that has the corresponding bit of the TC Select
field in the Route Entry set to 1, and
— The packet does not match any other Route Entry within the chiplet.
If a packet on the Management Fabric of a chiplet matches the route specified by the Route Entry,
then the packet is transmitted out the Management Port associated with the Route Entry. The virtual
channel used by the packet is specified by the VC ID field in the matching Route Entry.
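IMPLEMENTATION NOTE
As a non-normative illustration, the Normal Route matching rules above can be sketched in C (hypothetical types; Default Route matching is handled by the caller, which must first establish that no other Route Entry matched):

#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint16_t base_id, limit_id;  /* Base ID and Limit ID fields */
    uint8_t  tc_select;          /* one bit per traffic class */
    bool     default_route;      /* RT field */
    bool     link_up;
    uint8_t  vc_id;              /* VC used by packets matching this entry */
} route_entry;

static uint16_t chiplet_part(uint16_t id, unsigned entity_bits)
{
    return id >> entity_bits;    /* Chiplet ID portion only */
}

bool normal_route_matches(const route_entry *re, uint16_t dest_id,
                          uint8_t tc, unsigned entity_bits)
{
    uint16_t dest  = chiplet_part(dest_id, entity_bits);
    uint16_t base  = chiplet_part(re->base_id, entity_bits);
    uint16_t limit = chiplet_part(re->limit_id, entity_bits);

    return re->link_up &&
           !re->default_route &&
           ((re->tc_select >> tc) & 1u) &&
           base <= dest && dest <= limit;  /* base > limit disables the entry */
}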
Route Entries associated with the Management Ports on either side of a point-to-point link that
interconnects two chiplets may be configured differently. This means that the TC-to-VC mapping in
each direction on the link may be different.
[Figure: Route Entry layout]
Route Entry fields (Field Name (DWORD & Bit Location; Standard Security Asset Class ID; Attribute): Description):
• VC ID, Virtual Channel ID (DWORD 0 [10:8]; Asset Class 8; RW): This field selects the Virtual Channel (VC) used by packets that match this Route Entry. The default value of this field is 0, which maps all selected traffic classes onto VC0.
• RT, Route Type (DWORD 0 [15]; Asset Class 8; RW): This field selects the routing type of this Route Entry. 0: Normal Route; 1: Default Route. The default value of this field is 0.
• Base ID (DWORD 1 [15:0]; Asset Class 8; RW/RO): This field contains the Base ID value of the Chiplet ID associated with this Route Entry. This field contains a 16-bit Management Network ID. The Management Network ID is partitioned into a Chiplet ID field in the upper bits and an Entity ID field in the lower bits (see Section 8.1.3.2). The lower bits of this field associated with the Entity ID portion of the Management Network ID are hardwired to 0 (i.e., RO). Since bits 0 and 1 are only associated with an Entity ID, they are always hardwired to zero. The upper bits of this field associated with the Chiplet ID portion of the Management Network ID may be read and written (i.e., RW). These upper bits must be initialized with the Chiplet ID value associated with the Base ID. The initial value of these upper bits is all 1s.
• Limit ID (DWORD 1 [31:16]; Asset Class 8; RW): This field contains the Limit ID value of the Chiplet ID associated with this Route Entry. This field contains a 16-bit Management Network ID. The Management Network ID is partitioned into a Chiplet ID field in the upper bits and an Entity ID field in the lower bits (see Section 8.1.3.2). The lower bits of this field associated with the Entity ID portion of the Management Network ID are hardwired to 0 (i.e., RO). The upper bits of this field associated with the Chiplet ID portion of the Management Network ID may be read and written (i.e., RW). These upper bits must be initialized with the Chiplet ID value associated with the Limit ID. The initial value of these upper bits is all 0s. If the Base ID is greater than the Limit ID, then the Route Entry is disabled.
See Table 8-7 for a description of Standard Security Asset Class IDs.
A Management Entity must support the access control mechanism outlined in Section 8.1.3.5 and
must implement the Access Control Capability Structure described in this section. The Access Control
Capability Structure provides access to the Read Access Control (RAC) and Write Access Control
(WAC) structures associated with asset classes contained in the Management Entity.
The organization of the Access Control Capability Structure is shown in Figure 8-17. It consists of a
10-DWORD header that contains pointers to the standard asset class access table and the vendor
defined asset class access table.
The standard and vendor defined asset class access tables consist of a sequence of 128-bit (4-
DWORD) RAC and 128-bit (4-DWORD) WAC structure pairs. The number of RAC/WAC structure pairs
in the standard asset class access table is equal to 26 (i.e., the number of standard asset class IDs)
and the number of RAC/WAC structure pairs in the vendor defined asset class access table
corresponds to the number of vendor defined asset classes. The fields in the 4-DWORD RAC and WAC
structures are described in Table 8-15 and Table 8-16, respectively.
[Figure 8-17: Access Control Capability Structure header. DWORD 0: Rsvd, Management Capability ID, Max Security Clearance Group Supported, Rsvd, Ver; DWORDs 2, 3, and 5: Reserved.]
Access Control Capability Structure fields (Field Name (DWORD & Bit Location; Standard Security Asset Class ID; Attribute): Description):
• Management Capability ID (DWORD 0 [29:16]; Asset Class 17; RO): This field specifies the Capability ID of this Management Capability Structure. The Access Control Capability Structure has a Management Capability ID of 001h.
See Table 8-7 for a description of Standard Security Asset Class IDs.
[Figure: Standard asset class access table. 26 consecutive RAC/WAC structure pairs: RAC0 at DWORDs 0 through 3 and WAC0 at DWORDs 4 through 7, continuing through RAC25 at DWORDs 200 through 203 and WAC25 at DWORDs 204 through 207.]
[Figure: Vendor defined asset class access table. Consecutive RAC/WAC structure pairs: RAC0 at DWORDs 0 through 3 and WAC0 at DWORDs 4 through 7, continuing through RACx at DWORDs 8x through 8x+3 and WACx at DWORDs 8x+4 through 8x+7. The x in RACx/WACx corresponds to the number of vendor specific asset classes minus one (i.e., the first vendor specific asset class corresponds to RAC0/WAC0, the second vendor specific asset class corresponds to RAC1/WAC1, and so on).]
Table 8-15 describes the fields of the 4-DWORD RAC structure. DWORD offsets in Table 8-15 are relative to the start of the asset class access table for RACx; for example, the 128-bit RAC2 structure is at DWORD offsets 16, 17, 18, and 19. See Table 8-7 for a description of Standard Security Asset Class IDs.
Table 8-16 describes the fields of the 4-DWORD WAC structure. DWORD offsets in Table 8-16 are relative to the start of the asset class access table for WACx; for example, the 128-bit WAC2 structure is at DWORD offsets 20, 21, 22, and 23. See Table 8-7 for a description of Standard Security Asset Class IDs.
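As an informative, non-normative aid, the following Python sketch computes the DWORD offsets implied by the 8-DWORD RAC/WAC pair layout described above; the function name is illustrative only.

def rac_wac_dword_offsets(x: int) -> tuple:
    """DWORD offsets of the 128-bit RACx and WACx structures within an
    asset class access table (each RAC/WAC pair occupies 8 DWORDs)."""
    rac = list(range(8 * x, 8 * x + 4))      # e.g., RAC2 -> DWORDs 16..19
    wac = list(range(8 * x + 4, 8 * x + 8))  # e.g., WAC2 -> DWORDs 20..23
    return rac, wac

assert rac_wac_dword_offsets(2) == ([16, 17, 18, 19], [20, 21, 22, 23])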
The Security Clearance Group Capability Structure allows a Security Director to configure the Security
Clearance Group value used by a Management Entity when issuing Management Transport requests.
This capability structure must be implemented by a Management Entity that is the ultimate source of
UCIe Management Transport request packets (i.e., emits request packets) and is not required to be
implemented by any other Management Entity.
In some cases, a Management Entity may need to issue requests with different Security Clearance
Group values when operating in different contexts. The Security Clearance Group Capability Structure
supports multiple Security Clearance Group Contexts to allow a Security Director to configure a
Security Clearance Group value for each context. How a Security Director determines the
manageability functions provided by these contexts and what Security Clearance Group value should
be used is beyond the scope of this specification.
Security Clearance Group Capability Structure layout: DWORD 3 is Reserved; the Security Clearance Group Contexts begin at DWORD 4 and continue for each supported context.
Management Capability ID
(DWORD 0, bits [29:16]; Standard Security Asset Class ID 17, see Table 8-7; attribute RO)
This field specifies the Capability ID of this Management Capability Structure. The Security Clearance Group Capability Structure has a Management Capability ID of 004h.
Security Clearance Group Context layout: DWORD 1 is Reserved.
En (Request Enable)
(DWORD 0, bit [31]; Standard Security Asset Class ID 0, see Table 8-7; attribute RW)
When this field is set to 1, the Management Entity may issue requests associated with this security clearance group context. When this field is set to 0, the Management Entity must not issue requests associated with this security clearance group context. The initial value of this field is 0 if this context is not associated with a Security Director. The initial value of this field is 1 if this context is associated with a Security Director.
The address space associated with a Management Entity may be local to that Management Entity or
shared across one or more Management Entities in a chiplet. For example, the same address in two
Management Entities may reference the same memory location or different memory locations (e.g., a
memory location associated with each Management Entity). A Management Entity may have some
addresses that are local and some that are shared. For shared addresses, how concurrent accesses,
security, and mutual exclusion are handled is beyond the scope of this specification.
The UCIe Memory Access Protocol utilizes the UCIe Management Transport access control mechanism
(see Section 8.1.3.5).
Memory request packets are issued by a Management Entity to read or write memory mapped
structures or memory in another Management Entity. The Opcode field indicates the type of operation.
When a UCIe Management Transport packet carries a UCIe Memory Request, the Resp field is set to 0
corresponding to a request packet.
UCIe Memory Request packet operations are non-posted. If a UCIe Management Transport packet
that carries a UCIe Memory Request packet is not discarded, then a UCIe Memory Response packet is
sent in response.
A Management Entity may issue requests on an ordered or unordered traffic class when the
Unordered Traffic Class Enable (UE) bit is set to 1 in the UCIe Memory Access Protocol capability
structure. When the UE bit is cleared to 0, then the Management Entity may only issue requests on an
ordered traffic class and must not issue requests on an unordered traffic class. Whether a
Management Entity utilizes an unordered traffic class is implementation specific.
The Tag field in a UCIe Memory Request packet is an 8-bit field populated by the requester, carried in
a request packet, and returned by the responder in the corresponding response packet if one is
generated. A requester may have multiple outstanding requests with the same Tag field value to the
same or different responders. The responder must not assume that Tag field values are unique and
must not in any way interpret the Tag field value. The use of the Tag field is requester implementation
specific and may be used for applications such as mapping responses to previously issued requests;
determining the responder associated with a response packet; and detecting lost, dropped, or
discarded packets.
The maximum number of requests that a requester may have outstanding is requester
implementation specific.
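As a non-normative illustration of one possible requester-side use of the Tag field, the Python sketch below keys outstanding requests by (responder, tag); all names are hypothetical. The tag-allocation policy shown (unique tags per responder) is one implementation choice among many, since the specification itself permits duplicate tag values.

class Requester:
    """Illustrative requester-side bookkeeping that keys outstanding
    requests by (responder, tag) so responses can be matched back to
    previously issued requests."""
    TAG_SPACE = 256  # 8-bit Tag field

    def __init__(self, max_outstanding: int = 32):  # implementation specific
        self.max_outstanding = max_outstanding
        self.outstanding = {}  # (responder_id, tag) -> request context
        self._next = 0

    def issue(self, responder_id, context):
        assert len(self.outstanding) < self.max_outstanding
        tag = self._next
        self._next = (self._next + 1) % self.TAG_SPACE
        self.outstanding[(responder_id, tag)] = context
        return tag  # placed in the request packet's Tag field

    def on_response(self, responder_id, tag):
        # The responder returned the Tag unmodified; recover the context.
        return self.outstanding.pop((responder_id, tag))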
Figure 8-22 shows the fields of a UCIe Memory Access Request packet. Reserved fields (i.e., ones
labeled as Rsvd) must be filled with all 0s when the packet is formed. Reserved fields must be
forwarded unmodified on the Management Network and ignored by receivers. An implementation that
relies on the value of a reserved field in a packet is non-compliant.
Figure 8-22 layout: DWORD 3 carries Address[63:32]; DWORD 4 carries Address[31:2], with the IPA bit and a reserved bit in the remaining bit positions; protocol-specific content occupies DWORD 5 through DWORD M.
DWORD M
Table 8-19 defines the fields of a UCIe Memory Access Request packet. The packet starts at DWORD 2
because DWORDs 0 and 1 contain the UCIe Management Transport packet header. All fields in the
table have little endian bit ordering, similar to Figure 8-5 (e.g., Address bits 32 through 39 are in Byte
3 bits 7 through 0 of DWORD 3, and Address bits 40 through 47 are in Byte 2 bits 7 through 0 of
DWORD 3, and so on).
Tag
Tag 8 bits This field contains the tag value supplied by the requester (see the description of the Tag field earlier in this section).
Data Length
Length 8 bits This field indicates the length of the data referenced, in DWORDs. The data length in DWORDs is equal to the value of this field plus 1 (e.g., a value of 00h in this field indicates a data length of one DWORD, a value of 01h in this field indicates a data length of two DWORDs, and so on).
Opcode
Opcode 4 bits This field indicates the memory access request operation.
0000b: Reserved (used for responses)
0001b: Memory Read (MemRd)
0010b: Memory Write (MemWr)
Others: Reserved
Address
Address 62 bits This field contains the DWORD address being referenced in the Management Entity.
Data
Data Varies This field is present in Memory Write requests and contains the data being written. This field is not present in Memory Read requests.
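The following non-normative Python sketch illustrates the Length and Address field arithmetic described in Table 8-19 (the DWORD-count-minus-1 encoding and the 62-bit DWORD address); the helper names are illustrative only.

MEM_RD = 0b0001  # Memory Read (MemRd)
MEM_WR = 0b0010  # Memory Write (MemWr)

def encode_length_field(num_dwords: int) -> int:
    """The Length field encodes the data length in DWORDs minus 1
    (00h = one DWORD, 01h = two DWORDs, and so on)."""
    assert 1 <= num_dwords <= 256
    return num_dwords - 1

def dword_address(byte_address: int) -> int:
    """The 62-bit Address field carries the DWORD address, i.e.,
    Address[63:2] of a DWORD-aligned byte address."""
    assert byte_address % 4 == 0  # UCIe-defined structures are DWORD-accessed
    return byte_address >> 2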
A UCIe Memory Access Response packet is generated by a Management Entity when the processing
associated with a UCIe Memory Access Request packet completes. When a UCIe Management
Transport packet carries a UCIe Memory Response, the Resp field is set to 1 corresponding to a
response packet.
A UCIe Memory Access Protocol responder must always support UCIe Memory Request packets on all
traffic classes (TC). The traffic class of a UCIe Memory Access Response packet is the same as the
traffic class used in the corresponding UCIe Memory Access Request packet.
As described in Section 8.1.3.1.1, each traffic class is a unique ordering domain. There are no
ordering guarantees for UCIe Memory Request packets in different traffic classes.
Within an ordered traffic class, UCIe Memory Request packets are delivered in-order between a
requester and a responder and UCIe Memory Response packets are delivered in-order between a
responder and the requester. There are no ordering guarantees between requests to different
responders and there are no ordering guarantees between responses from different responders to a
requester. Within an unordered traffic class there are no packet ordering guarantees and the packets
may be delivered in any order.
A Management Entity may process received UCIe Memory Request packets sequentially (i.e., one at a
time) or concurrently (i.e., two or more at a time). There are no ordering requirements between
requests in different traffic classes; however, the result of processing these requests must be
equivalent to some sequential processing of requests performed in an atomic manner.
Regardless of whether UCIe Memory Request packets are associated with an ordered or an unordered
traffic class, a responder may send UCIe Memory Access Response Packets out-of-order (i.e., a
responder is not required to send response packets in the same order that the corresponding request
packets were received by the responder). This means that responses may be received by a requester
in an order different from the order in which the requests were sent by the requester.
The Status field in a UCIe Memory Access Response packet indicates the status associated with
processing the corresponding UCIe Memory Access Request packet. If a UCIe Memory Access Request
packet is processed successfully, then a UCIe Memory Access Response packet is generated with
status Success. If the request requires response data, then all the data associated with the successful
response is contained in a single response packet.
If a Management Entity receives a well formed UCIe Management Transport packet, but the UCIe
Memory Access Request packet is malformed, then no processing of the request occurs and a
response with no data and status Packet Error is returned.
• Examples of a malformed UCIe Memory Access Request packet:
— Receipt of a UCIe Memory Access Request packet with a reserved value in the Opcode field.
— Receipt of a UCIe Memory Access Request packet with the Length field set to zero and the
Last DW BE field set to a nonzero value.
If a request violates the programming model of a Management Entity, then the request is not
performed and a response with no data and status Programming Model Violation is returned.
• Examples of programming model violations:
— Accessing a UCIe defined structure at a width other than a DWORD (unless otherwise specified, all UCIe defined structures must be accessed as DWORDs).
If a Management Entity receives a request, is not capable of processing the request, but will be able
to process the request at some point in the future, then a response with no data and status Retry
Request is returned. The Retry Request status should not be used during normal operation and
implementations are strongly encouraged to only use the Retry Status when absolutely necessary.
How long a requester waits after receiving a response with status Retry Request before reissuing the
request is implementation specific. The Max Retry Time Units and Max Retry Time Value fields in the
UCIe Memory Access Protocol Capability Structure report the maximum duration of time during which
a Management Entity may return a response with status Retry Request. A requester may use this
time duration to determine how long to poll a responder before declaring that the responder has
malfunctioned.
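As a non-normative sketch of requester behavior, the Python fragment below bounds retry polling by a duration assumed to be derived from the Max Retry Time Units and Max Retry Time Value fields (the decode of those fields is not shown); 'send' and the status strings are hypothetical placeholders.

import time

def issue_with_retry(send, request, max_retry_seconds: float):
    """Reissue a request while the responder returns Retry Request (RR),
    bounded by the duration derived from the Max Retry Time fields.
    'send' is a hypothetical blocking request/response helper."""
    deadline = time.monotonic() + max_retry_seconds
    while True:
        response = send(request)
        if response.status != "RR":  # SUCCESS, PMV, AD, PERR, ...
            return response
        if time.monotonic() >= deadline:
            raise TimeoutError("responder busy beyond Max Retry Time")
        time.sleep(0.001)  # implementation-specific polling interval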
If the Management Entity can process a request, the request does not contain an error, and the
request attempts to access an asset that is prohibited, then the asset is not accessed, and no
processing associated with the request occurs.
• If the Ignore Prohibited Access (IPA) bit in the request is cleared to 0, then a response with no
data and a status of Access Denied is returned.
• If the Ignore Prohibited Access (IPA) bit in the request is set to 1, then the required response data
with all values set to zero and status Success is returned. The purpose of this is to allow an
address range to be probed without returning errors.
The read of a byte whose corresponding byte enable is 0 in the First DW BE or Last DW BE field should
return a value of FFh.
If a Management Entity receives a UCIe Memory Access Request packet with a byte
enable value of 0 in the First DW BE or Last DW BE field and does not return a value
of FFh for the byte in the corresponding response, then care must be exercised to
ensure that the data returned in unused bytes does not create a security issue.
Implementations are strongly encouraged to align secure information on DWORD or
larger boundaries.
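The following non-normative Python sketch pulls together the prohibited-access and byte-enable rules above for a responder handling a one-DWORD Memory Read; 'req', 'access_allowed' (a RAC lookup), and 'read_dword' are hypothetical.

def handle_single_dword_read(req, access_allowed, read_dword):
    """Illustrative responder-side handling of a one-DWORD MemRd."""
    if not access_allowed(req.address):
        if req.ipa:  # IPA=1: probe mode, report Success with all-0 data
            return ("SUCCESS", b"\x00" * 4)
        return ("AD", b"")  # IPA=0: Access Denied, no data
    data = bytearray(read_dword(req.address))
    for i in range(4):
        if not (req.first_dw_be >> i) & 1:  # byte enable is 0
            data[i] = 0xFF  # disabled bytes should read as FFh
    return ("SUCCESS", bytes(data))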
Figure 8-23 shows the fields of a UCIe Memory Access Response packet. Reserved fields (i.e., ones
labeled as Rsvd) must be filled with 0s when the packet is formed. Reserved fields must be forwarded
unmodified on the Management Network and ignored by receivers. An implementation that relies on
the value of a reserved field in a packet is non-compliant.
Figure 8-23 layout: DWORD 3 carries the protocol-specific response fields; the optional Data field follows through DWORD M.
Table 8-20 defines the fields of a UCIe Memory Access Response packet. The packet starts at DWORD
2 because DWORDs 0 and 1 contain the UCIe Management Transport packet header. All fields in the
table have little endian bit ordering (e.g., Tag bit 0 is in DWORD 3 bit 0, and Tag bit 7 is in DWORD 3
bit 7).
Opcode
Opcode 4 bits This field must be set to 0000b.
Response Status
Status 3 bits This field indicates the memory access response status.
000b: Success (SUCCESS)
001b: Programming Model Violation (PMV)
010b: Retry Request (RR)
011b: Access Denied (AD)
100b: Packet Error (PERR)
Others: Reserved
Tag
Tag 8 bits This field contains the value of the tag field of the corresponding memory access request.
Data
Data Varies If the memory access request was a Memory Read that was processed successfully (i.e., the Response Status field contains Success), then this field contains the data read. This field is not present in Memory Write completions.
The Max Buffered Requests field reports the maximum number of requests that the Management
Entity is guaranteed to buffer. Issuing more outstanding requests to the Management Entity than this
maximum may result in head-of-line blocking in the chiplet Management Fabric and/or a VC
associated with a Management Port between chiplets.
The Request Buffer Size field reports the sum of the size of the requests that the Management Entity
is guaranteed to buffer. Issuing more outstanding requests to the Management Entity than will fit in
this buffer may result in head-of-line blocking in the chiplet Management Fabric and/or a VC
associated with a Management Port between chiplets.
The Max Response Time Units and Max Response Time Value fields report the expected maximum
time that the Management Entity requires to process a request. This is the expected maximum time
with no other outstanding requests from receipt of a UCIe Memory Request packet at the
Management Entity to the Management Entity emitting a corresponding UCIe Memory Response
packet.
The UCIe Memory Access Protocol does not define an architected completion timeout mechanism
to detect lost packets or hardware failures; however, a requester may use the time reported in the Max
Response Time Units and Max Response Time Value fields in this capability structure to implement a
vendor defined completion timeout mechanism. When a completion timeout mechanism is
implemented, the requester must not declare a completion timeout sooner than the expected
maximum response time reported by the Max Response Time Units and Max Response Time Value fields.
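As a non-normative illustration, a vendor-defined completion timeout could be derived as sketched below; the units-to-seconds mapping shown is a hypothetical placeholder, since the actual encoding is given by the field definitions in Table 8-21.

# Hypothetical units-to-seconds decode; the actual encoding of the Max
# Response Time Units field is defined in Table 8-21.
UNITS_TO_SECONDS = {0: 1e-6, 1: 1e-3, 2: 1.0}

def min_completion_timeout_seconds(units: int, value: int) -> float:
    """A vendor-defined completion timeout must not be shorter than the
    expected maximum response time reported by these fields."""
    return UNITS_TO_SECONDS[units] * value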
Table 8-21. UCIe Memory Access Protocol Capability Structure Fields
Management Capability ID
(DWORD 0, bits [29:16]; Standard Security Asset Class ID 17, see Table 8-7; attribute RO)
This field specifies the Capability ID of this Management Capability Structure. The UCIe Memory Access Protocol Capability Structure has a Management Capability ID of 002h.
See Figure 8-25 and Figure 8-26 for a high-level view of the MTP transport path over UCIe sideband
and mainband respectively. In this architecture, MTPs are transported between Management Port
Gateways (MPGs) on each end of the UCIe link using either the sideband or the mainband path. In
this context, an MPG is an entity that provides the bridging functionality when transporting an MTP
from/to a local SoC management fabric (which is an SoC-specific implementation) to/from a UCIe
link. The MPG has credited buffers for receiving MTPs (called RxQ) from the link and their sizes are
exchanged during initial link training. These credited buffers are separately maintained for Sideband
and Mainband paths when management transport is supported on both. Up to eight VCs can be
supported on a Management Port. Dedicated buffering is required for each VC negotiated. Support for
VC0 is mandatory when management transport is negotiated, and all other VCs are optional.
Figure 8-25 (Management Transport over UCIe sideband, reference diagram): on each die (Die A and Die B), the Management Port Gateway with its MTP Rx/Tx Queues connects through the MPG Mux and the D2D Adapter/RDI to the PHY modules (Module 0 through Module 3). Separate credits are maintained per Sideband/VC/Req/Resp, and any, some, or all of the sideband links in a multi-module configuration can be used for Management Transport. Cfg1 denotes the configuration interface (i.e., pl_cfg_* and lp_cfg_* signals) described in Table 10-1; Cfg2 denotes that interface plus the extensions described in Table 10-3.
Figure 8-26 (Management Transport over UCIe mainband, reference diagram): on each die (Die A and Die B), the protocol layer (CXL.io, PCIe, Streaming) and the Management Port Gateway's MTP Rx/Tx Queues connect through the MPG Mux to the FDI, with the D2D Adapter connecting to the PHY modules (Module 0 through Module 3) over the RDI. Separate credits are maintained per Stack/VC/Req/Rsp.
In Figure 8-25 and Figure 8-26, the location of the Management Port Gateway mux is shown for
reference purposes only. Implementations can choose to locate the mux elsewhere (e.g., above FDI
for sideband path) and details of such implementations are beyond the scope of this specification.
Interface definitions for this architecture, seen in Chapter 10.0, and other details discussed around
Management Port Gateway integration are with respect to this reference Management Port Gateway
mux placement.
The Management Port Gateway interfaces to the D2D Adapter by way of the FDI for mainband
transport as shown in Figure 8-26, and FDI is described in Chapter 10.0. The Management Port
Gateway can also connect directly to D2D by way of the FDI. Supported configurations of
Management Port Gateway connectivity to D2D are shown in Figure 8-27.
The terminology used throughout this chapter will be in reference to the concepts shown in
Figure 8-25 and Figure 8-26. In the case of the CXL protocol, the Management Port Gateway mux is on the CXL.io FDI.
Figure 8-27 (supported configurations of Management Port Gateway connectivity to D2D): each configuration shows the MTP Rx/Tx Queues and MPG Mux connecting over FDI to one or two protocol stacks (Stack X, Stack Y, Stack 0), with the D2D Adapter below, including an ARB/MUX between stacks where applicable.
8.2.2.1 Sideband
There are currently two MPM opcodes defined as shown in Table 7-1, “Opcode Encodings Mapped to
Packet Types”. See Section 7.1.2.4 for more information.
8.2.2.2 Mainband
All MPMs on mainband carry a 2-DWORD header referred to as “MPM Header” (see Figure 8-28 and
Figure 8-31). In that Header, bits [4:0] in the first DWORD carry the MPM opcode and are defined in
Table 8-22. The remainder of this section discusses the format of these opcode messages. See
Section 8.2.5.2.3 for details of how these messages are packed in the Management Flit when
transmitting over the mainband.
Opcode Message
10111b Management Port Message without Data
11000b Management Port Message with Data
Others Reserved
Bits [21:14] in the first DWORD of the MPM header (see Figure 8-28) of an MPM with Data message
form an 8b msgcode that denotes a specific MPM with Data message. Supported MPM with data
messages on the mainband are shown in Table 8-23.
msgcode Message
01h Encapsulated MTP
FFh Vendor-defined Management Port Gateway Message with Data
Others Reserved
The term “MPM payload” is used in the remainder of this section to refer to the payload in the MPM
with Data messages.
8.2.2.2.1.1 Common Fields in MPM Header of MPM with Data Messages on Mainband
Figure 8-28 shows and Table 8-24 describes the common fields in the MPM header of MPM with data
messages on the mainband.
Figure 8-28. Common Fields in MPM Header of all MPM with Data Messages on Mainband
First DWORD: opcode = 11000b in bits [4:0], length, msgcode in bits [21:14], vc, resp, and reserved bits. Second DWORD: rxqid, msgcode-specific fields, and reserved bits.
Table 8-24. Common Fields in MPM Header of all MPM with Data Messages on Mainband
Field Description
length MPM Payload length (i.e., 0h for 1 QWORD, 1h for 2 QWORDs, 2h for 3 QWORDs, etc.).
resp 0: Request MPM. 1: Response MPM. For a Vendor-defined Management Port Gateway Message with Data, this bit is always 0 (see Section 8.2.2.2.1.3).
rxqid RxQ-ID to which this packet is destined. See Section 8.2.3.2.2 for RxQ details. rxqid=0 corresponds to Stack 0; rxqid=1 corresponds to Stack 1.
Encapsulated MTP on the mainband is an MPM with Data with a msgcode of 01h.
Figure 8-29 layout (Encapsulated MTP on mainband): the MPM Header carries opcode = 11000b, msgcode = 01h, length, vc, resp, the s and p bits, and rxqid; the MPM Payload that follows carries the Management Transport Packet (MTP), whose size in QWORDs is given by the length field in the MPM Header, followed by 1 DWORD of padding of all 0s (if required).
s Segmented MTP (see Section 8.2.4.2). The first and middle segments in a segmented MTP have this bit set to 1. The last segment in a segmented MTP has this bit cleared to 0. An unsegmented MTP also has this bit cleared to 0. This bit is carried in the MPM Header; see Section 8.2.2.2.1.1 for details of header fields common to all MPMs with data on the mainband.
MPM Payload See Section 8.1.3.1 for details. Note that DWORDx:Bytey in Figure 8-29 refers to the corresponding DWORD, Byte defined in the Management Transport Packet in Figure 8-5.
The Vendor-defined Management Port Gateway message with data is defined for custom
communication between MPGs on the two ends of a UCIe mainband link. These messages are not part
of the Management transport protocol, and these messages start at an MPG and terminate at the MPG
on the other end of the UCIe mainband link. These messages share the same rxqid buffers as
encapsulated MTP messages. If an MPG does not support these messages or does not support
vendor-defined messages from a given vendor (identified by the UCIe Vendor ID in the header), the
MPG silently drops those messages. Ordering of these messages sent over multiple mainband stacks
is subject to the same rules presented in Section 8.2.4.3 for encapsulated MTPs.
Figure 8-30. Vendor-defined Management Port Gateway Message with Data on Mainband
The MPM Header carries opcode = 11000b, msgcode = FFh, length, vc, resp = 0, and rxqid, with the UCIe Vendor ID in the second DWORD; the vendor-defined payload, whose size is given by the length field in the MPM Header, follows as the MPM Payload.
UCIe Vendor ID (carried in the MPM Header): UCIe Consortium-assigned unique ID for each vendor. See Section 8.2.2.2.1.1 for details of header fields common to all MPMs with data on the mainband.
Bits [21:14] in the first DWORD of the MPM header of an MPM without Data message form an
8b msgcode that denotes a specific MPM without Data message. Table 8-27 lists the supported
msgcodes.
Table 8-27 lists the supported msgcodes, including the Management Port Gateway Capabilities message, the Init Done message, and the Vendor-defined Management Port Gateway Message without Data (msgcode FFh); all other msgcode values are Reserved.
Figure 8-31 shows and Table 8-28 describes the common fields in the MPM header of MPM without
data messages on the mainband.
Figure 8-31. Common Fields in MPM Header of all MPM without Data Messages on Mainband
First DWORD: opcode = 10111b in bits [4:0], msgcode in bits [21:14], msgcode-specific fields, and reserved bits. Second DWORD: msgcode-specific fields and reserved bits.
Table 8-28. Common Fields in MPM Header of all MPM without Data Messages on Mainband: the opcode (10111b) and msgcode fields are common to all of these messages; the remaining fields are msgcode-specific or reserved.
See Section 8.2.3.2.2 for details of how this message is used during mainband management path
initialization.
Figure 8-32 shows and Table 8-29 describes the Management Port Gateway Capabilities message
format on the mainband.
The fields below are carried in the MPM Header.
Table 8-29. Management Port Gateway Capabilities MPM Header Fields on Mainband
Field Description
NumVC Number of VCs supported by the Management Port Gateway that is transmitting the message.
Port ID Port ID number value of the Management port associated with the Management Port Gateway that is issuing the message (see Section 8.1.3.6.2.1).
See Section 8.2.2.2.2.1 for details of header fields common to all MPMs without data on the mainband.
See Section 8.2.3.2.2 for details of how this message is used during mainband management path
initialization.
Figure 8-33 shows and Table 8-30 describes the Init Done message format on the mainband.
The rxqid field is carried in the MPM Header.
Table 8-30. Init Done MPM Header Fields on Mainband
Field Description
rxqid RxQ-ID associated with the message. See Section 8.2.3.2.2 for RxQ details.
See Section 8.2.2.2.2.1 for details of header fields common to all MPMs without data on the mainband.
The Vendor-defined Management Port Gateway message without data is defined for custom
communication between the MPGs on both ends of a UCIe mainband link. These messages are not
part of the management transport protocol, and these messages start at an MPG and terminate at the
MPG on the other end of the UCIe mainband link. These messages share the same rxqid buffers as
encapsulated MTP messages. If an MPG does not support these messages or does not support these
messages from a given vendor (identified by the UCIe Vendor ID in the header), the MPG silently
drops those messages.
The Vendor-defined Management Port Gateway message without data on the mainband has the
format shown in Figure 8-34.
Figure 8-34. Vendor-defined Management Port Gateway Message without Data on Mainband
The MPM Header carries opcode = 10111b, msgcode = FFh, resp = 0, Vendor-defined bits, and rxqid, with the UCIe Vendor ID in the second DWORD.
Field Description
resp A Vendor-defined Management Port Gateway message without data always uses the Request channel.
rxqid RxQ-ID to which this packet is destined. See Section 8.2.3.2.2 for RxQ details.
See Section 8.2.2.2.2.1 for details of header fields common to all MPMs without data on the mainband.
Section 8.2.3.1 describes the setup process for the sideband. Section 8.2.3.2 describes the setup
process for the mainband.
8.2.3.1 Sideband
Sideband Management Transport path setup occurs after a Management Reset or when Software
writes 1 to the ‘Retrain Link’ bit in the Sideband Management Port Structure register (see
Section 8.1.3.6.2.1). After setup is complete, management transport path over sideband remains
active until the next Management Reset or until a ‘Heartbeat timeout’ is detected (as described in
Section 8.2.5.1.3).
Negotiation occurs in the MBINIT.PARAM state. See Section 4.5.3.3.1.1 for details.
If the Negotiation phase indicates support for Management transport and the SB_MGMT_UP flag (see
Section 4.5) is cleared, Initialization phase steps are performed as indicated in this section.
A few general rules for RxQs that are initialized in this phase:
• Management Port Gateway maintains separate Rx queues for each sideband link over which it can
receive MPMs. The Management Port Gateway can limit the number of Rx queues to be the same
or smaller than the number of modules in the design. For example, in a design with four modules,
a Management Port Gateway can choose to limit Rx queues to three or two or one.
• Each Rx queue in the Management Port Gateway is assigned a separate RxQ-ID and it is relevant
for maintaining ordering when interleaving MTPs across multiple sideband links. See
Section 8.2.4.3.
• See Section 8.2.4.1 for details of credit buffers that are required in each Rx queue.
• The number of RxQs finalized for transmitting and receiving MPMs is MIN{RxQ-Local, RxQ-Remote}, with RxQ-IDs ranging from 0 to MIN{RxQ-Local, RxQ-Remote}-1, where RxQ-Local and RxQ-Remote are defined in Section 4.5.3.3.1.1.
• Transmission of MPMs with a given RxQ-ID is always associated with a specific local module that is
design-specific. For example, an MPM with an RxQ-ID of 0 can be sent on any Module’s sideband
and that choice is design-specific. However, the choice is static and cannot change after the first
MPM with that RxQ-ID is sent.
• Credits associated with each RxQ-ID are exchanged with a remote link partner by way of Credit
Return messages as discussed below.
After pm_param_done is asserted and a module count greater than 0 has been negotiated for
management transport on both the local and remote sides, the Management Port Gateway begins the
initialization process with the remote MPG for each RxQ-ID that the MPG needs to enable.
1. The initialization phase starts (shown in Figure 8-35, Figure 8-36, and Figure 8-37) with each
Management Port Gateway sending the Management Port Gateway Capabilities message (see
Figure 7-9 for message format).
— Message can be sent on any RxQ-ID path, but sent only once per initialization phase from a
chiplet to the partner chiplet.
— Port ID value in the transmitted message is the value in Port ID field (see Table 8-12).
— Port ID value in the received message is recorded in the “Remote Port ID” field (see
Table 8-12).
— NumVC field is the number of VCs supported by the transmitting Management Port Gateway.
The number of VCs supported is the value in the NumVC field + 1. For example, if only one VC
(VC0) is supported, NumVC is 0h. If two VCs are supported (VC0, VC1), then NumVC is 1h,
etc.
— MIN{Transmitted NumVC, Received NumVC}+1 number of VCs is enabled by each
Management Port Gateway in the subsequent steps. The value of the enabled VCs starts from
0 and increments by 1 for each enabled VC up to MIN{Transmitted NumVC, Received
NumVC}.
2. The Management Port Gateway then sends Credit Return messages for each enabled VC for each type
(requests and responses), across all enabled RxQ-IDs. The Management Port Gateway is
permitted to send this message. Figure 8-35 shows the flow for the case of only a single RxQ
(RxQ-ID=0) and single VC (VC0) negotiated during the negotiation phase. Figure 8-36 shows the
flow for the case of two RxQs (RxQ-ID=0, 1) and single VC (VC0) negotiated during the
negotiation phase. Figure 8-37 shows the flow for the case of only a single RxQ (RxQ-ID=0) and
two VCs (VC0, VC1) negotiated during the negotiation phase.
— Credit Return message (see Figure 7-10) contains an “RxQ-ID” field. The field must be
assigned starting from 0 to MIN{RxQ-Local, RxQ-Remote}-1.
— Infinite credits are permitted to be advertised. This is performed by sending a value of 3FFh in
the “Rx Credit return in QWORDs” field for that VC and Type before the “Init Done”.
3. After the last Credit Return message for a given RxQ-ID, the Management Port Gateway must
send an “Init Done” message (see Figure 7-11) for the corresponding RxQ-ID. This informs the
remote Link partner that a receiver has finished advertising credits for enabled VCs for the given
RxQ-ID.
— After “Init Done” has been transmitted and received by a Management Port Gateway for all
available RxQ-ID paths, the MPG is ready for sending Management Transport packets.
o Sideband should be able to send/receive management transport packets at this point
without any dependency on the mainband link status.
o Management Port Gateway asserts the mp_mgmt_up and mp_mgmt_init_done signals
to PHY to indicate that the Management Transport Path was successfully initialized. PHY
sets the SB_MGMT_UP flag when both mp_mgmt_up and mp_mgmt_init_done are
asserted. The flag remains set until the management path goes down. In case of any fatal
error (e.g., credit return messages were received for an RxQ-ID that is not expected, a
timeout occurred while waiting for the Init Done message, etc.) during RxQ credit
exchange, the mp_mgmt_up signal will remain de-asserted with the
mp_mgmt_init_done signal asserted.
o Note that the Management Port Gateway is unaware of PHY states and thus, after the
mp_mgmt_up signal is asserted, the Management Port Gateway assumes that the
management path through the sideband is available unless there is a Management Reset
or the MPG detects an error through the mechanism described in Section 8.2.5.1.3.
o After the SB_MGMT_UP flag is set, sideband link is available for sideband packet (MPMs or
any other sideband packets) transmission/reception in all state machine states including
RESET/SBINIT.
— After the Management Port Gateway receives the “Init Done” message for a given RxQ-ID, the
MPG must be ready to receive MTPs with that RxQ-ID.
The PHY Layer routes a message with a given RxQ-ID (specified by the mp_rxqid signal) to a specific
module’s sideband link and that association is design-specific. Note that because RxQ-ID association
to a module sideband is design-specific, on the same sideband link, messages with different RxQ-IDs
in each direction are possible.
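The VC and RxQ negotiation arithmetic described above can be summarized by the following non-normative Python sketch; the function name is illustrative only.

def negotiated_resources(numvc_tx, numvc_rx, rxq_local, rxq_remote):
    """Enabled VCs are 0..MIN{Transmitted NumVC, Received NumVC};
    enabled RxQ-IDs are 0..MIN{RxQ-Local, RxQ-Remote}-1."""
    vcs = list(range(min(numvc_tx, numvc_rx) + 1))
    rxq_ids = list(range(min(rxq_local, rxq_remote)))
    return vcs, rxq_ids

# Example: local NumVC=1 (VC0, VC1), remote NumVC=0 (VC0 only); two local
# RxQs and three remote RxQs -> VC0 enabled on RxQ-IDs 0 and 1.
assert negotiated_resources(1, 0, 2, 3) == ([0], [0, 1])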
8.2.3.2 Mainband
Mainband Management Transport path setup occurs when a link trains up. After the setup is complete,
the management transport path remains active until a Domain Reset or until the link or the associated
stack(s) goes down.
Mainband Management Transport path negotiation occurs on every mainband link training, thereby
leveraging the existing D2D adapter protocol negotiation messages/flows. Support for Management
Transport protocol within a stack is explicitly indicated with a new bit in the negotiation flow (see
Table 3-1).
Section 3.1 and Section 3.2 provide Management Transport protocol negotiation details. At the end of
protocol negotiation, the D2D adapter indicates the number of D2D stacks that negotiated
Management Transport protocol by signals discussed in Table 10-3.
A few general rules for the RxQs that are initialized in this phase:
• Management Port Gateway maintains separate Rx queues for receiving MTPs over each
negotiated stack.
• Each Rx queue within the Management Port Gateway is assigned a separate RxQ-ID, which is
necessary for maintaining ordering when interleaving packets across multiple stacks. See
Section 8.2.4.3.
• See Section 8.2.4.1 for details of credit buffers that are required in each Rx queue.
• RxQ-ID values are either 0 or 1. A value of 0 is used if only one stack is negotiated for
management transport (regardless of the stack-id value negotiated) and values of 0 and 1 are
used if two stacks are negotiated for management transport. In the latter case, an RxQ-ID value
of 0 is used for Stack 0, and an RxQ-ID of 1 is used for Stack 1.
The initialization flow follows a sequence similar to the sideband flow, and some example flows are illustrated
below. The credit exchange is not by way of an explicit message as in sideband, but rather by way of
a dedicated DWORD, ‘CRD’, in management flits whose format is shown in Figure 8-45 and further
explained in Chapter 3.0. Management Port Gateway Capabilities and Init Done Message formats for
the mainband can be seen in Section 8.2.2.2.2. Note that during initialization, the transmitter can
return valid credits in the same Management Flit that carries the Init Done message. All protocol layer
bytes in the management flit (minus the CRD and Rsvd bytes) carrying the ‘Init Done’ MPM are driven
with NOPs after the ‘Init Done’ MPM.
In the example flow figures, a Management Flit annotated with an MPM and a set of credit types
refers to a Management Flit that carries the specified MPM along with credit returns for the indicated
credit types. For example, a flit annotated with the Init Done message and the VC0 request and
response credit types for RxQ-ID=0 indicates a Management Flit that carries the Init Done message
along with credit returns for the VC0 request and response credit types for RxQ-ID=0.
The example flow figures show the mainband initialization exchange between Chiplet 0 and Chiplet 1; after the exchange completes, Chiplet 0 can transmit Management Transport Packets to Chiplet 1.
The following rules relate to Management Port Gateways and mainband Management Transport:
• During runtime, if the FDI status on any stack that has management traffic negotiated moves to a
Link Status=down state, the Management Port Gateway behaves the same as in the ‘Init Done’
timeout scenario (see Section 8.2.4.4) across both stacks, if more than one stack had
management transport negotiated.
• Arbitration between Management Flits and regular Protocol Layer Flits is implementation-specific.
• When management Software writes 1 to the ‘Retrain link’ bit in the Management Port Structure
register that corresponds to the mainband link, the mainband is retrained, similar to when SW
writes 1 to the ‘Start UCIe Link Training’ bit in the UCIe Link Control register in the UCIe link
DVSEC. Note that this retraining of the mainband does not affect the management path on the
sideband (if that path had been negotiated), if the path was already set up and active.
• Although the number of VCs supported in both directions is the same, TC-to-VC mapping can be
different in each direction. See the Route Entry description in Section 8.1.3.6.2.2 for how SW
controls mapping of TC to VC.
• For each RxQ-ID in the Management Port Gateway (a credit-accounting sketch follows this list):
— Independent credit management is required for each resp type (Requests vs. Responses), and
each supported VC.
— Credits are in QWORD (64-bit) granularity (i.e., one credit corresponds to one QWORD of
storage space at the receiver buffer).
— Minimum three credits are required for each credit type when nonzero credits are advertised.
— Header and Data portions of an Encapsulated Management Transport packet and Vendor-
defined Management Port Gateway Messages use the same type of credit.
— Receiver implements separate buffers for Requests and Responses per supported VC and
advertises the corresponding credits to the remote Management Port Gateway during
initialization. Credits are returned when space is freed up in the receiver buffers.
— Up to eight VCs are permitted — different VC counts are permitted on sideband vs. mainband.
o Support for VC0 is mandatory for all implementations.
o For every VC supported, it is mandatory to initialize credits for Request types and
Response types.
o Credits advertised for a VC:Resp credit type during the initialization phase can be either 0
across all RxQ-IDs or nonzero across all RxQ-IDs.
o If a VC is initialized, credits for that VC must be advertised on all enabled RxQ-IDs and
Resp types. For example, it is NOT permitted to have a configuration where VC1 is
supported on RxQ-ID 0 but not on RxQ-ID 1. However, it is not required to advertise the
same number of credits on all enabled Paths. This rule is important to simplify
Transmitter/Receiver implementations at Management Port Gateways for interleaving
MTPs across multiple Links while maintaining ordering across them (see Section 1.4.3 for the concept
of interleaving).
— During management transport initialization and before the Init Done message is received, if
multiple credit returns are received for the same VC:Resp credit type, the value from the
latest credit return overwrites the previous value.
• The number of RxQs (in the partner chiplet’s Management Port Gateway) to which a Management Port
Gateway can transmit management messages is always the same as the number of RxQs on which
the MPG can receive these messages (from the partner chiplet’s Management Port Gateway). For
example, if two RxQs were negotiated, both transmission and reception of management traffic must
use two RxQs.
• If the initial credit advertised was infinite for a credit type, there cannot be any credits returned
for that type at run time (i.e., after the Init Done message has been sent), with one exception for
the “VC0 request infinite credit” scenario for which a runtime credit return of 0 is permitted.
• Credits advertised during the initialization phase are the maximum number of credits that can be
outstanding at the transmitter at any point during runtime.
• See Section 8.2.4.3 for the rules for maintaining ordering when interleaving MTPs/MTP Segments
across different RxQ-IDs.
• Chiplets can optionally check for the following error conditions during management path
initialization flow, and abort the flow when these conditions are detected:
— Receiving credit returns for more RxQ-IDs than what was negotiated in the Negotiation Phase.
— Receiving credit returns for more VCs than what was implicitly negotiated by the Management
Port Gateway Capabilities message.
— Not receiving credit returns or receiving incomplete credit returns for any of the negotiated
RxQ-IDs prior to receiving the ‘Init Done’ message for the RxQ-ID.
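The credit-accounting rules in the preceding list can be summarized by the following non-normative Python sketch of a transmitter-side counter for one RxQ-ID:VC:Resp credit type; all names are illustrative only.

class CreditCounter:
    """Illustrative transmitter-side accounting for one RxQ-ID:VC:Resp
    credit type. Credits are in QWORD granularity; advertising 3FFh at
    initialization means infinite credits."""
    INFINITE = 0x3FF

    def __init__(self, advertised: int):
        # When nonzero credits are advertised, at least three are required.
        assert advertised == 0 or advertised >= 3
        self.infinite = (advertised == self.INFINITE)
        self.maximum = advertised  # also the runtime outstanding maximum
        self.available = advertised

    def consume(self, qwords: int) -> bool:
        if self.infinite:
            return True
        if qwords > self.available:
            return False  # hold the MPM until credits are returned
        self.available -= qwords
        return True

    def on_credit_return(self, qwords: int):
        # No runtime returns for an infinite credit type (a 0-value return
        # is permitted for the VC0 request infinite-credit scenario).
        assert not self.infinite or qwords == 0
        self.available = min(self.maximum, self.available + qwords)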
8.2.4.2 Segmentation
The Management Port Gateway is permitted to break up (i.e., segment) one large MTP and send the
individual segments across multiple RxQ-IDs (i.e., interleave; see Figure 8-41 for an example). This is
useful for cases in which the MTP message sizes are asymmetric. When segmenting:
• Management Port Gateway sets the s bit in the Encapsulated MTP message header within each
individual segment except the last segment that completes the MTP transfer. If an MTP is not
segmented, the s bit is 0. Segments with the s bit set to 1 must not also have the p bit set to 1.
• The Transmitter must ensure that no other Encapsulated MTP or other credited MPM packet (e.g.,
Vendor-defined Management Port Gateway messages) from the same VC:Resp credit type is
interleaved until the segmented management packet completes.
Note that segmentation is visible only from Management Port Gateway-to-Management Port Gateway
and is not end-to-end on the UCIe Management Fabric.
See Section 8.2.4.3 for the rules for reassembling the segments and maintaining ordering when
interleaving Segments across different RxQ-IDs.
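As a non-normative illustration of the segmentation and RxQ-ID rotation rules (including the wraparound described in Section 8.2.4.3), the Python sketch below splits one MTP into per-RxQ-ID segments; the one-QWORD-per-segment sizing is purely illustrative, since segment sizing is implementation specific.

def segment_mtp(mtp_qwords: list, num_rxqs: int, start_rxq: int = 0):
    """Split one MTP into per-RxQ-ID segments. Every segment except the
    last carries s=1 (and must not also set p=1)."""
    segments = []
    rxq = start_rxq
    for i, qword in enumerate(mtp_qwords):
        is_last = (i == len(mtp_qwords) - 1)
        segments.append({"rxqid": rxq, "s": 0 if is_last else 1,
                         "payload": [qword]})
        rxq = (rxq + 1) % num_rxqs  # wrap after the maximum-negotiated RxQ-ID
    return segments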
Figure 8-41. Example Illustration of a Large MTP Transmitted over Multiple RxQ-IDs
on Sideband with Segmentation
For each VCy:Respz credit type, the transmitter observes the following scheme when interleaving across RxQ-IDs:
— When the MTP is not segmented, the MTP is fully transmitted to the associated credit buffers
and this could take multiple Encapsulated MTPs. In that scenario, each Encapsulated MTP
carries the same MPM header but with the length field adjusted for the data length in that
message. cr_ret_* fields are also refreshed in every Encapsulated MTP (on the sideband) and
indicate 0 if there is no new credit to return. On the mainband path, credits can be refreshed
every management flit.
— RxQ-ID is incremented by 1 for transmitting the next MTP of the same VCy:Respz credit type
(i.e., the next MTP of VCy:Respz credit type is sent to RxQ-ID1:VCy:Respz credit buffers).
— When the MTP is segmented, a single Encapsulated MTP belonging to the MTP is transmitted
to the associated buffers with the “s” bit set to 1. RxQ-ID is incremented by 1 (with
wraparound as indicated later in this section) for transmitting each subsequent segment of
the same MTP until the MTP is fully sent. After the MTP is fully sent, the RxQ-ID is
incremented by 1 again (with wraparound as indicated later in this section) for transmitting
the next MTP of the same VCy:Respz credit type.
• The above scheme is repeated independently for traffic within each VCy:Respz credit type.
Transmission of packets on different VCy:Respz queues have no dependencies between them.
• RxQ-ID value wraps around after the maximum-negotiated RxQ-ID.
• Transmission to multiple RxQ-ID buffers can occur in parallel on sideband links or mainband
stacks.
Figure 8-42 illustrates the ordering mechanism for an example scenario with three RXQ-IDs, and y
VCs (where y=0-7) on the sideband. For the purposes of this illustration — TC0 management port
traffic is mapped to VC0 on the sideband management path. TCy management port traffic is mapped
to VCy on the sideband management path. Note that in the figure, TC0 Req Pkt 1 and TC0 Resp Pkt 2
are segmented to two segments, to show the impact of segmentation on interleaving and ordering.
Other MTPs are not segmented. Similar ordering applies for packets that are interleaved over multiple
stacks on the mainband.
Vendor-defined Management Port Gateway messages also use the same credited buffers as MTPs.
Transmitter and receiver interleaving rules for these messages are the same as discussed earlier for
Encapsulated MTPs.
Figure 8-42. Conceptual Illustration of Sideband Multi-module Ordering with Three RxQs
The figure shows TC0 and TCy request and response MTPs being placed, as Encapsulated MTPs (eMTPs), into the VC0 and VCy request and response credit buffers and distributed round-robin across SB Link 0 through SB Link 2; TC0 Req Pkt 1 and TC0 Resp Pkt 2 each appear as two segments (1st Segment and 2nd Segment).
8.2.5.1 Sideband
When supporting MPMs with Data (see Section 7.1.2.4) over the sideband, to prevent these messages
from occupying the sideband interface for extended periods of time (and thus blocking its usage for
mainband link management packets), the following rules must be observed:
• An MPM with Data (e.g., Encapsulated MTP) can have a maximum length field value of seven
QWORDs
• Receivers must not check for violation of this transmit rule.
• If the original MTP was larger than seven QWORDs, multiple Encapsulated MTPs are sent until the
full MTP is transmitted. It is also permitted to send Encapsulated MTPs smaller than seven
QWORDs even when the original MTP is larger than seven QWORDs. This can occur because of
credit availability for transmitting the Encapsulated MTP.
• The above rules allow for the link to be arbitrated for any pending Link management packet or
any pending higher-priority MPM packet of a different VC:Resp credit type (waiting behind an MPM
with Data that is in transmission) with an upper bound on the delay to transmit them. An example
of a higher-priority MPM packet that needs to be serviced in a time-bound fashion is a TC1 MTP
(see Section 8.1.3.1.1).
• Segmentation, when performed, must follow the rules described above for each individual
Segment of the MTP. See Section 8.2.4.2 for description of segmentation.
Figure 8-43 provides a pictorial representation of splitting a large MTP into multiple smaller
Encapsulated MTPs (based on the length rules stated above) and how the Encapsulated MTPs are sent
on the sideband link. If the MTP is also segmented, then each Encapsulated MTP is sent on a different
RxQ-ID. See Section 8.2.4.2 and Section 8.2.4.3.
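A non-normative Python sketch of the length rule above, splitting one MTP into Encapsulated MTPs of at most seven QWORDs each, follows; names are illustrative only.

MAX_PAYLOAD_QWORDS = 7  # maximum MPM-with-Data payload per message on sideband

def encapsulate(mtp_qwords: list):
    """Split one (unsegmented) MTP into Encapsulated MTPs of at most seven
    QWORDs each so higher-priority sideband packets are never blocked for
    long; smaller chunks are also permitted (e.g., per credit availability)."""
    for offset in range(0, len(mtp_qwords), MAX_PAYLOAD_QWORDS):
        yield mtp_qwords[offset:offset + MAX_PAYLOAD_QWORDS]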
See Section 4.8 for how the PHY arbitrates between MPMs and Link Management packets.
Figure 8-43. Example Illustration of a Large MTP Split into Multiple Smaller
Encapsulated-MTPs for Transport over Sideband, without Segmentation
After the management transport path Initialization Phase completes, receiver starts an 8-ms
‘Heartbeat’ timer that restarts whenever an MPM (i.e., opcode 10111b or 11000b) is received.
Implementations are permitted to implement this timer as a global timer across all RxQ-IDs or as a
timer per RxQ-ID. If the timer times out, the Management Port Gateway de-asserts the mp_mgmt_up
signal which in turn clears the SB_MGMT_UP flag in the PHY and de-asserts the
mp_mgmt_port_gateway_ready signal. After a Heartbeat timeout, the Management Port Gateway
functions similarly to what occurs during an ‘Init Done timeout’ (see Section 8.2.4.4 for details). The
Heartbeat timer stops after L1/L2 entry negotiation on the sideband path successfully completes, and
restarts when L1/L2 exit negotiation starts. See Section 8.2.5.1.4 for details of Management path PM
entry/exit flows.
After the ‘Init Done’ message has been transmitted on an RxQ-ID path during the initialization phase,
the Management Port Gateway (MPG) must guarantee an MPM transmission of no more than 4 ms
apart on the RxQ-ID path. If there are no scheduled messages to send on an RxQ-ID path, the MPG
must send a credit return message (unless there was a Heartbeat timeout on the receiver side as
stated in the previous paragraph) with VC value set to VC0, Resp value set to 0 and cr_ret value set
to 0h. Note that the latter applies even if the MPG takes longer than 8 ms to exit L1/L2 before the
MPG sends the associated PM exit message.
If a control parity error is detected on any received MPM, the Management Port Gateway invokes the
‘Heartbeat timeout’ flow.
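The timing relationship between the receiver's 8-ms Heartbeat timer and the transmitter's 4-ms keepalive obligation described above can be sketched non-normatively as follows.

HEARTBEAT_S = 0.008  # receiver timer, restarted on every received MPM
KEEPALIVE_S = 0.004  # transmitter must send an MPM at least this often

def keepalive_due(last_tx_time: float, now: float) -> bool:
    """If no MPM has been scheduled on an RxQ-ID path within 4 ms, the MPG
    sends a credit return message with VC=VC0, Resp=0, and cr_ret=0h so the
    remote Heartbeat timer (8 ms) never expires on an idle path."""
    return (now - last_tx_time) >= KEEPALIVE_S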
On the sideband interface, it is expected that there is higher-level firmware/software managing the
deeper power states of Management Port Gateways on both sides. The sleep and wake req/ack/nak
messages (see Figure 7-12) are provided to negotiate shutdown/wake of the management transport
path for deep power states in which the Management Port Gateway logic can be clock gated or
powered down (as coordinated by the higher-level firmware). It is especially useful for low-power
chiplet and/or SiP state flows to take advantage of these handshakes and coordinate entry into and
wake up of the Management Transport Path. These messages and negotiation must occur independently for
each RxQ-ID path, and each direction. While not in a PM state, the Management Port Gateway must
keep the mp_wake_req signal asserted and this informs the Physical Layer adapter to keep the logic
up and running.
• After the partner chiplet’s Management Port Gateway receives the “Wake Req” message, that
Management Port Gateway must respond with a “Wake ack” message when the MPG is ready to
receive credited packets into its Rx buffer. Moreover, the Management Port Gateway must initiate
its own “Wake req” message to the remote Link partner if the MPG has not already done so.
• After a “Wake ack” message is sent and received on all negotiated RxQ-IDs, the PM exit flow is
complete and regular packet transfer can begin as soon as the last “Wake ack” message is
transmitted.
There is no prescribed arbitration mechanism for the Management Port Gateway mux on the
Sideband. Additionally, the size of Management Port Gateway Flow control buffers over RDI (see
Section 8.2.5.1.1) is not specified for Management Port Gateway-initiated traffic. Implementations
should take care to ensure that the PHY arbitration rules specified in Section 4.8 are not violated.
8.2.5.2 Mainband
The Management Port Gateway inserts NOP messages whose format is shown in Figure 8-44, in all
QWORD locations in a Management flit in which there is no MPM to send. NOP messages can start
only at MPM boundaries within a flit.
Figure 8-44 layout (NOP message format on mainband): one QWORD consisting of two DWORDs of 0000_0000h.
Figure 8-45. Management Transport Credit Return DWORD (CRD) Format on Mainband
The CRD DWORD carries two independent credit-return field sets (cr_ret_vc_a, cr_ret_resp_a, cr_ret_a and cr_ret_vc_b, cr_ret_resp_b, cr_ret_b) along with the rxqid field and reserved bits.
See Section 3.3.3 and Section 3.3.4 for details on where this DWORD is sent in a Management Flit for
various Flit formats.
On the mainband, MPMs are supported only over Flit Format 3 through Format 6.
See Section 3.3.3 and Section 3.3.4 for a D2D view of the Management Protocol mapping over Flit
Format 3 through Format 6. If Flit Format 1 and Format 2 are negotiated, the Management Protocol
on that stack is disabled (if supported). Management flits have bits [7:6] of Byte 1 set to 10b. See
Section 8.2.2.2 for packet format of MPMs over the mainband. Mapping of these MPMs over Flit
Format 3 through Format 6 is as follows:
• MPM header and each QWORD of MPM payload (when applicable) can be placed only at specified
byte locations in the Management flit, and can start at the 1st byte in the Management flit in
which “all bits are populated by protocol layer” (see Figure 2-1 for reference), and at subsequent
8B increments within the flit. While incrementing, only bytes in which “all bits are populated by
the Protocol Layer” are considered, excluding CRD byte locations and bytes marked as rsvd for
Protocol Layer (e.g., Flit Format 3, Bytes 40 through 43). This is pictorially shown in Figure 8-46.
Figure 8-46. Valid MPM Header Start Locations for Various Flit Formats
In the figure, yellow cells indicate a valid Management Port Message (MPM) header or Payload QWORD start location; for the other colors, see Figure 2-1 for the color mapping. B = Byte, C = CRC, CRD = Credit Return DWORD, FH = Flit Header, Rsvd = Reserved.
Starting at a valid MPM header byte location (as discussed above), Byte 0 of the first DWORD of the
MPM header is sent at that byte, followed by Byte 1 of the first DWORD of the header at starting byte
location+1 until Byte 3 of the 2nd DWORD of the header. This is followed by Byte 0 of the 1st DWORD
of the MPM payload (if one exists), followed by Byte 1, Byte 2, Byte 3, etc., placed at incrementing
byte locations. Protocol Layer-driven bytes that are neither CRD bytes nor marked as reserved are
contiguously packed with MPM bytes after an MPM transmission starts and until
the transmission ends. If an MPM cannot be fully transmitted within a Management Flit, the MPM
continues in the subsequent Management Flit of the same stack. NOP message(s) (see
Section 8.2.5.2.1) can be inserted between MPMs within a Management Flit. It is also valid to send a
Management Flit with all NOP messages in the protocol layer-driven non-CRD bytes and non-reserved
byte locations. CRD bytes in a Management Flit always carry the credit return information per the
rules stated in Section 8.2.5.2.2.
Figure 8-47 and Figure 8-48 show example mappings of three MPMs inside Flits of Format 3 and
Format 5, respectively. The 1st MPM is an MPM with Data with a payload size of 15 QWORDs. The
2nd MPM is also an MPM with Data with a payload size of 6 QWORDs. The 3rd MPM is an MPM with a
payload size of 1 QWORD. NOPs are inserted after the end of the 3rd MPM until the end of the flit.
Figure 8-47. Example Mapping of MPMs and NOPs in Flit of Format 3
Figure 8-48. Example Mapping of MPMs and NOPs in Flit of Format 5
Figure 8-49 shows an example mapping of four MPMs inside a Format 3 flit. The 3rd MPM rolls over into the 2nd flit. The 1st MPM in this example is an MPM with Data type with a payload size of 15 QWORDs. The 2nd MPM is also an MPM with Data type, with a payload size of 6 QWORDs. The 3rd MPM is an MPM with a payload size of 6 QWORDs, where the 6th QWORD is sent in the 2nd Flit. The 4th MPM in this example is a 1-QWORD Vendor-defined Management Port Gateway message without data. The remainder of the 2nd flit is all NOPs.
[Figure 8-49. Example mapping of four MPMs across two consecutive Format 3 flits, Flit 0 and Flit 1]
For Management Transport on the mainband that has a Management Port Gateway mux, it should be
noted that if the associated protocol stack resets or disables the link, the Management Transport path
is also reset/disabled. If this is not desired, it is recommended that protocol stacks in such
configurations have a way to disable sending link reset and link disable requests on the FDI so that
the Management Transport path is not affected.
UDA is architected on top of the UCIe Manageability Infrastructure and uses the architectural elements of that infrastructure for Chiplet-level and SiP-level testing and debug (see Section 8.1 for details of the UCIe Manageability Architecture). UDA requires functional UCIe links (Sideband and/or Mainband) and a functional management network for test and debug purposes. Debug and bring-up of the UCIe links and elements that comprise the UDA (see Section 8.3.1.1 through Section 8.3.1.4) can be performed through any sideband interface of the chiplet vendor's choice (e.g., JTAG, GPIO, Sideband-only UCIe), and is beyond the scope of this specification.
Within each chiplet, UDA is architected in a Hub-Spoke model. In this model, DFx Management Hub
(DMH) is the Management Element that implements the Debug and Test Protocol(s). UDA allows for
SW/FW to discover debug capabilities present in a chiplet, and provides for global security control/
status for test/debug functionality present in the chiplet. Chiplet test/debug functionality is
implemented in DFx Management Spoke (DMS). Some examples of test/debug functionality are Scan
controller, Memory BIST, SoC fabric debug, Core debug, trace protocol engine, etc.
In Figure 8-50, there is one DMH with a Management Network ID of 1040h and 4 DMSs connected to
it with DMS-IDs (also referred to as Spoke-IDs) from 1 to 4. Management Network ID is used to route
DFx and other relevant manageability packets to DMH in the manageability fabric. See
Section 8.1.3.2 for how to interpret this ID. DMS-ID is used to route ID-routed Test and Debug
packets to the correct Spoke within DMH.
[Figure 8-50. Example Hub-Spoke topology within a chiplet: a DMH with Management Network ID = 1040h connected to four DMSs: Debug Protocol Engine (DMS-ID = 1), UCIe (DMS-ID = 2), MEMBIST (DMS-ID = 3), and Scan Chain (DMS-ID = 4)]
• Spokes can optionally support Vendor-defined Test and Debug messages, and these messages are
routed to the destination Spoke within a DMH using a DMS-ID.
• Valid DMS-IDs for Spokes are from 1 to 254. A value of 0h is assigned for DMH. A value of FFh is
reserved.
• DMS-ID is unique within a given DMH.
• DMH provides a pointer to the first DMS in a linked list of DMSs present within the DMH.
• Each Spoke identifies itself as one of these types (see Section 8.3.5.3.2.8 for more details):
— UCIe.Physical_Layer
— UCIe.Adapter
— UCIe.Adapter_Physical_Layer
— Vendor-defined
• Each Spoke implements a simple standard register set that helps to uniquely enumerate each
Spoke and to allow custom SW to be loaded to interact with the Spoke.
— All Spokes minimally support DWORD Register Rd/Wr accesses. Support for sizes beyond that is optional.
• Vendor-defined sections of the register space can be used for a vendor to implement any Spoke
functionality such as triggering BIST, reading internal debug registers, array dump, etc.
Used to access registers in DMH/DMS. See Section 8.1.4 for details of this protocol.
Used for test as discussed in Section 8.3.2 and Section 8.3.3 and for any other vendor-defined
functionality. The format of DWORDs 2 to M of Vendor-defined Test and Debug UCIe DFx Messages (or
Vendor-defined UDM, for short) is shown below. DWORDs 0 to 1 of these messages follow the
standard format of Management Transport packet described in Section 8.1.3, with the Management
Protocol field set to ‘Test and Debug Protocol’. Packet Integrity Protection DWORDs (that appear after
DWORD M) are as defined in Section 8.1.
• These messages are routed to the correct Spoke within a DMH, using the Destination DMS-ID field
in Byte 8 of the message.
• UCIe Vendor-ID field is the UCIe Consortium-assigned Vendor ID for the Spoke’s IP Vendor
In Figure 8-51, UCIe Vendor ID[15:8] is sent on Byte 0[7:0] of DWORD 3, and UCIe Vendor ID[7:0]
is sent on Byte 1[7:0] of DWORD 3.
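That byte mapping places the UCIe Vendor ID most-significant byte first within DWORD 3 of the message. A two-line sketch ('dw3' is simply a byte view of that DWORD):

#include <stdint.h>

/* Pack the UCIe Vendor ID into DWORD 3 per the mapping described above. */
static void pack_vendor_id(uint8_t dw3[4], uint16_t ucie_vendor_id)
{
    dw3[0] = (uint8_t)(ucie_vendor_id >> 8);    /* Vendor ID[15:8] -> Byte 0 */
    dw3[1] = (uint8_t)(ucie_vendor_id & 0xFF);  /* Vendor ID[7:0]  -> Byte 1 */
}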
[Figure 8-51. Vendor-defined Test and Debug UCIe DFx Message format: DWORDs 2 to M carry the Vendor-defined Payload (Protocol-specific)]
IMPLEMENTATION NOTE
A Spoke’s support for Vendor-defined UDM is negotiated/discovered using vendor-
defined mechanisms. The Spoke Vendor ID and Spoke Device ID can be used to
determine a specific Spoke implementation from a specific Vendor. Vendor-defined
registers in the Spoke can be used to negotiate/discover the Vendor-defined Payload
format of Vendor-defined UDM.
Testing with low-cost ATE typically requires cycle-accurate determinism. When using UCIe as a test
port, how the determinism is achieved end-to-end is implementation-specific and beyond the scope of
this specification.
[Figure 8-52. ATE connected to a chiplet over UCIe Sideband/Mainband: UDMs flow through the UCIe controller and Management Network to the DMH and its Spokes (e.g., a SoC MEMBIST DMS)]
• UCIe sideband and/or mainband can be used for this testing if they have a bump pitch of 100 um
to 130 um.
• For sending/receiving scan test patterns, Vendor-defined UDMs (see Figure 8-51) are used over
UCIe Sideband or mainband. These messages can target the appropriate Spoke (using the DMS-
ID field) in the design that implements the scan functionality.
• For general-purpose testing/debugging using register reads and writes, UCIe UMAP messages can be used (e.g., for triggering built-in self-test mechanisms in a chiplet, a UCIe register read/write mechanism can be used to trigger a test and then read the test results).
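As a sketch of that trigger-then-read flow (the register offsets, bit meanings, and UMAP access helpers here are all hypothetical, since Spoke test functionality is vendor-defined):

#include <stdint.h>

/* Hypothetical UMAP register access helpers. */
extern uint32_t umap_read32(uint64_t addr);
extern void     umap_write32(uint64_t addr, uint32_t val);

#define BIST_TRIGGER_OFF 0x80u  /* assumed vendor-defined trigger register */
#define BIST_STATUS_OFF  0x84u  /* assumed vendor-defined status register;
                                 * bit 0 assumed to mean "busy"            */

static uint32_t run_bist(uint64_t spoke_base)
{
    umap_write32(spoke_base + BIST_TRIGGER_OFF, 1);    /* start the test   */
    while (umap_read32(spoke_base + BIST_STATUS_OFF) & 1u)
        ;                                              /* poll until done  */
    return umap_read32(spoke_base + BIST_STATUS_OFF);  /* read the results */
}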
In Figure 8-52, a UCIe Management port embedded in the UCIe controller provides access from the
tester to the chiplet’s manageability/test/debug fabric. The access control mechanism for ATE to
acquire access to the UCIe Management network is implementation-specific.
While the ATE interfaces covered above are UCIe sideband and mainband, other interfaces such as
JTAG, GPIO, and PCIe are also possible. Vendors can implement a bridge from these interfaces, with
appropriate security control, to the UCIe Management network.
[Figure: ATE connected to the package through JTAG, GPIO, PCIe, or similar package ports]
• There is at least one test/debug port pinned out in the package for SiP-level testing/debugging.
— The port could be any of JTAG, GPIO, PCIe, USB, SMBus, and/or I2(3)C.
• More than one package port can be used for speeding up package-level test/debug.
• Vendors can implement bridges, with appropriate security control, from these interfaces to the
UCIe Management network.
— On the UCIe Management network, bridged packets follow the UCIe Management Transport
Packet format.
• Accesses from package ports are forwarded over UCIe sideband or mainband if they target other
chiplets. See Section 8.1.3.2 for details of how the target chiplet of a Manageability packet is
determined.
• See Section 8.2.1 for details of how UDMs are encapsulated on the UCIe sideband and mainband.
• Similar to sort testing,
— For sending/receiving scan test patterns, Vendor-defined UCIe DFx Messages (UDM) are used
over UCIe. These messages can target the appropriate Spoke in the design that implements
the scan control functionality.
— For general-purpose testing/debugging using register reads and writes, UMAP messages can
be used, as defined in Section 8.1.4.
[Figure: SiP-level debug setup with a Debugger, Logic Analyzer, and Remote Debug Console attached to the package over PCIe, USB, and I3C]
All spec-defined registers in DMH and DMS are accessed in DWORD size only.
The DMH base address in Figure 8-55 is from the Capability Directory of Management Element that
hosts a DMH (see Section 8.1.3.6.1 for details).
[Figure 8-55. DMH register map and DMS linked list: DBG_CAP, DBG_CTL, DBG_STS, DMH_Length_Low/High, DMH_Ext_Cap_Low/High, and DMS_Start_Low/High (through 40B), Reserved (through 128B), then Vendor-defined. DMS_Start points to the first DMS; each DMS points to the next via DMS_Next, and DMS_Next = 0h terminates the chain.]
Version
7:0 RO
Set to 00h.
Capability ID
13:0 RO
Set to 2h to indicate DMH.
Version
3:0 RO
Set to 0h.
DMS Accessed
0 RO At least one DMS was accessed from the management network since the last
Management Reset. This bit is cleared on each Management Reset.
DMH_Length_Low
31:12 RO
Lower 20 bits of the length of the DMH register space in multiples of 4K. Bits [11:0] in this register are reserved to ensure 4K multiples of length. A value of 1000h for {DMH_Length_High :: DMH_Length_Low} indicates a length of 4K; a value of 2000h indicates a length of 8K, etc.

DMH_Length_High
31:0 RO
Upper 32 bits of the length of the DMH register space in multiples of 4K. A value of 1000h for {DMH_Length_High :: DMH_Length_Low} indicates a length of 4K; a value of 2000h indicates a length of 8K, etc.

DMH_Ext_Cap_Low
31:2 RO
Lower 30 bits of the DWORD-aligned offset from the DMH starting address, where any extended capabilities start, when present in the DMH. Set to all 0s for this revision of the spec.

DMH_Ext_Cap_High
31:0 RO
Upper 32 bits of the DWORD-aligned offset from the DMH starting address, where any extended capabilities start, when present in the DMH. Set to all 0s for this revision of the spec.

DMS_Start_Low
31:12 RO
Lower 20 bits of the 4K-aligned starting address (in the UMAP address space of the management element that hosts the DMH) of the first DMS connected to the DMH.

DMS_Start_High
31:0 RO
Upper 32 bits of the 4K-aligned starting address (in the UMAP address space of the management element that hosts the DMH) of the first DMS connected to the DMH.
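A short sketch of how software might compose the 64-bit DMS_Start address from the register pair; the masking of the reserved low bits follows the field definitions above:

#include <stdint.h>

/* Compose the 64-bit, 4K-aligned address of the first DMS from the
 * {DMS_Start_High :: DMS_Start_Low} pair; bits [11:0] of the Low register
 * are reserved and masked off. The same pattern applies to DMS_Next and to
 * the {DMH_Length_High :: DMH_Length_Low} length pair. */
static uint64_t dms_start_addr(uint32_t start_high, uint32_t start_low)
{
    return ((uint64_t)start_high << 32) | (start_low & 0xFFFFF000u);
}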
Figure 8-58 shows registers that are common for all Spoke types. For security, DMS registers are
classified as follows. See Section 8.1.3.5.1 for the details of each class.
• Spoke STS register falls within the ‘Chiplet Status’ asset class.
• When applicable, UCIe Link Status in UCIe Link DVSEC, UCIe link-related status/log registers in
the Adapter_Physical_Layer register block (e.g., Correctable/Uncorrectable Error Status),
Compliance and Test-related status registers in the Compliance_Test register block (e.g., Physical
Layer Compliance 1 and 2 Status registers), fall within the ‘SiP Status’ asset class.
— All standard UCIe link registers other than the ones noted above fall within the ‘SiP
Configuration’ asset class.
• All other spec-defined registers in DMS fall within the ‘Chiplet Configuration’ asset class.
Figure 8-57 shows the register map for “empty” Spokes. Designs can use this Spoke register
structure to indicate that the Spoke does not have any Spoke functionality.
DMS_Next_Low 4B
DMS_Next_High 8B
DMS_Next_Low
31:12 RO
Lower 20 bits of the 4K-aligned starting address (in the UMAP address space of the management element that hosts the DMH) of the next DMS connected to the DMH. If this is the last Spoke in the Spoke chain, this field needs to be set to all 0s.

DMS_Next_High
31:0 RO
Upper 32 bits of the 4K-aligned starting address (in the UMAP address space of the management element that hosts the DMH) of the next DMS connected to the DMH. If this is the last Spoke in the Spoke chain, this field needs to be set to all 0s.
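A sketch of Spoke enumeration under these rules; the UMAP read helper and the exact DMS_Next offsets are assumptions for illustration (the register map places DMS_Next_Low and DMS_Next_High in the first two DWORDs of a Spoke):

#include <stdint.h>
#include <stdio.h>

extern uint32_t umap_read32(uint64_t addr);  /* hypothetical UMAP read helper */

#define DMS_NEXT_LOW_OFF  0x0u  /* assumed: first DWORD of the Spoke  */
#define DMS_NEXT_HIGH_OFF 0x4u  /* assumed: second DWORD of the Spoke */

/* Enumerate all Spokes by following DMS_Next from the first DMS (whose
 * address comes from the DMH's DMS_Start pair); DMS_Next = 0 ends the chain. */
static void enumerate_spokes(uint64_t first_dms)
{
    uint64_t dms = first_dms;
    while (dms != 0) {
        printf("DMS at %#llx\n", (unsigned long long)dms);
        dms = ((uint64_t)umap_read32(dms + DMS_NEXT_HIGH_OFF) << 32) |
              (umap_read32(dms + DMS_NEXT_LOW_OFF) & 0xFFFFF000u);
    }
}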
Figure 8-58 shows the registers that are present in all non-empty Spokes. Locations marked as
“Type-Specific” in Figure 8-58 carry registers that are specific to the ‘Spoke Type’ and are discussed in
Section 8.3.5.3.3 and Section 8.3.5.3.4.
[Figure 8-58. Common DMS Registers for All Non-empty Spokes Register Map: DMS_Next_Low, DMS_Next_High, Spoke CAP, Spoke CTL, Spoke STS, Spoke CTL, DMS_Length_Low, DMS_Length_High, DMS_Ext_Cap_Low, and DMS_Ext_Cap_High (through 48B), Type-Specific (through 80B), Reserved (through 128B), then Type-Specific]
Spoke Vendor ID
15:0 RO
Uniquely identifies a Spoke Vendor to Software. This ID is assigned by the UCIe Consortium.

Spoke Device ID
15:0 RO
Uniquely identifies a device from the Vendor identified by the Vendor ID. This ID is assigned by the vendor.

DMS_Next_Low
31:12 RO
Lower 20 bits of the 4K-aligned starting address (in the UMAP address space of the management element that hosts the DMH) of the next DMS connected to the DMH. If this is the last Spoke in the Spoke chain, this field needs to be set to all 0s.

DMS_Next_High
31:0 RO
Upper 32 bits of the 4K-aligned starting address (in the UMAP address space of the management element that hosts the DMH) of the next DMS connected to the DMH. If this is the last Spoke in the Spoke chain, this field needs to be set to all 0s.

Spoke Revision ID
7:0 RO
Identifies the revision of the Spoke. This ID is assigned by the Spoke vendor.

Associated DMS-ID
7:0 RO
Spoke-ID associated with other Spokes that constitute the same UCIe link. For example, if there are separate Spokes for some or all of the IPs that constitute a full UCIe stack (Adapter, Physical Layer, Protocol Stack0, Protocol Stack1), these registers within each Spoke provide the DMS-IDs of the related partner Spokes. If there are no related Spokes, this register reads as FFh. If there are multiple protocol stacks, the lower-value DMS-ID belongs to Stack 0 and the higher value belongs to Stack 1.
These registers are used by SW to identify all the Spokes that constitute a single UCIe link.

Version
3:0 RO
Set to 0h for this version of the capability.

Spoke Type
15:8 RO
0: UCIe.Adapter. Indicates a Spoke associated with the UCIe Adapter.
1: UCIe.Physical_Layer. Indicates a Spoke associated with the UCIe Physical Layer.
2: UCIe.Adapter_Physical_Layer. Indicates a common Spoke across both the UCIe Adapter and Physical Layer.
3 to 127: Reserved.
128 to 255: Vendor-defined.

Spoke Used
0 RO
Indicates that the Spoke has been accessed at least once since the last Management Reset. Access implies sending or receiving UMAP packets or UDMs.
Bit is cleared on the next Management Reset.

DMS_Length_Low
31:12 RO
Lower 20 bits of the length of the DMS register space from Offset 0h of the DMS, in multiples of 4K. A value of 1000h for {DMS_Length_High :: DMS_Length_Low} indicates a 4K length, 2000h indicates an 8K length, etc. Bits [11:0] in this register are reserved to ensure 4K multiples of length.
UCIe Spoke Types 0, 1, and 2 implemented to this revision of the spec must have a value in this register such that the DMS register space is not larger than 4 MB.

DMS_Length_High
31:0 RO
Upper 32 bits of the length of the DMS register space from Offset 0h of the DMS, in multiples of 4K. A value of 1000h for {DMS_Length_High :: DMS_Length_Low} indicates a 4K length, 2000h indicates an 8K length, etc.
UCIe Spoke Types 0, 1, and 2 implemented to this revision of the spec must set this value to all 0s.

DMS_Ext_Cap_Low
31:2 RO
Lower 30 bits of the DWORD-aligned offset from the DMS starting address, where any extended capabilities start, when present in the DMS. A value of all 0s indicates that there are no extended capabilities (default).

DMS_Ext_Cap_High
31:0 RO
Upper 32 bits of the DWORD-aligned offset from the DMS starting address, where any extended capabilities start, when present in the DMS. A value of all 0s indicates that there are no extended capabilities (default).
Figure 8-59 shows the DMS register map for the UCIe Spoke types. Figure 8-58 and Section 8.3.5.3.2
detail registers that are common to all Spoke types. This section details the remaining registers,
which are unique to the UCIe Spoke types.
[Figure 8-59. DMS Register Map for UCIe Spoke Types: DMS_Next_Low/High, Spoke CAP, Spoke CTL, Spoke STS, Spoke CTL, DMS_Length_Low/High, and DMS_Ext_Cap_Low/High, then Adapter_Physical_Layer_Ptr_Low/High (48B), Compliance_Test_Ptr_Low/High, Impl_Spec_Adapter_Ptr_Low/High, and Impl_Spec_Physical_Layer_Ptr_Low/High (through 80B), Reserved (through 128B), UCIe Link DVSEC, Vendor-defined, and the UCIe Link Register Blocks]
Port ID
15:0 RO
For Spoke Types 0, 1, and 2, this register indicates the Port ID of the UCIe link that is associated with the Spoke, if a Port ID exists for the link. A UCIe link has a Port ID assigned to it if the link is a Management Port. If the link does not have an assigned Port ID, this register reads as FFFFh.
Adapter_Physical_Layer_Ptr_Low
31:12 RO
Lower 20 bits of the 4K-aligned offset (from the starting address of the Spoke) of the UCIe Adapter/Physical Layer register block that is associated with the UCIe link.
Accesses to registers that are referenced by the Adapter_Physical_Layer_Ptr_Low/High pointers in a UCIe.Adapter Spoke are limited to the 4K block(s) that contain the Adapter registers and the register block header itself; the 4K block(s) that contain Physical Layer registers are treated as reserved.
Accesses to registers that are referenced by the Adapter_Physical_Layer_Ptr_Low/High pointers in a UCIe.Physical_Layer Spoke are limited to the 4K block(s) that contain the PHY registers and the register block header itself; Adapter registers are treated as reserved.

Adapter_Physical_Layer_Ptr_High
31:0 RO
Upper 32 bits of the 4K-aligned offset (from the starting address of the Spoke) of the UCIe Adapter/PHY register block that is associated with the UCIe link.
The same access restrictions as described for Adapter_Physical_Layer_Ptr_Low apply.

Compliance_Test_Ptr_Low
31:12 RO
Lower 20 bits of the 4K-aligned offset (from the starting address of the Spoke) of the UCIe Test/Compliance register block that is associated with the UCIe link.
Accesses to registers that are referenced by the Compliance_Test_Ptr_Low/High pointers in a UCIe.Adapter Spoke are limited to the 4K block(s) that contain the Adapter registers and the register block header itself; the 4K block(s) that contain PHY registers are treated as reserved.
Accesses to registers that are referenced by the Compliance_Test_Ptr_Low/High pointers in a UCIe.Physical_Layer Spoke are limited to the 4K block(s) that contain the PHY registers and the register block header itself; the Adapter registers are treated as reserved.
Accesses to registers that are referenced by the Compliance_Test_Ptr_Low/High pointers in a UCIe.Adapter_Physical_Layer Spoke have no access restrictions.
Set to all 0s if this register block is not implemented.

Compliance_Test_Ptr_High
31:0 RO
Upper 32 bits of the 4K-aligned offset (from the starting address of the Spoke) of the UCIe Test/Compliance register block that is associated with the UCIe link.
The same access restrictions as described for Compliance_Test_Ptr_Low apply.
Set to all 0s if this register block is not implemented.

Impl_Spec_Adapter_Ptr_Low
31:12 RO
Lower 20 bits of the 4K-aligned offset (from the starting address of the Spoke) of the Adapter Implementation-specific register block.
In a UCIe.Physical_Layer Spoke type, this pointer must be set to all 0s. Also set to all 0s if the register block is not implemented in the design.

Impl_Spec_Adapter_Ptr_High
31:0 RO
Upper 32 bits of the 4K-aligned offset (from the starting address of the Spoke) of the Adapter Implementation-specific register block.
In a UCIe.Physical_Layer Spoke type, this pointer must be set to all 0s. Also set to all 0s if the register block is not implemented in the design.

Impl_Spec_Physical_Layer_Ptr_Low
31:12 RO
Lower 20 bits of the 4K-aligned offset (from the starting address of the Spoke) of the Physical Layer Implementation-specific register block.
In a UCIe.Adapter Spoke type, this pointer must be set to all 0s. Also set to all 0s if the register block is not implemented in the design.

Impl_Spec_Physical_Layer_Ptr_High
31:0 RO
Upper 32 bits of the 4K-aligned offset (from the starting address of the Spoke) of the Physical Layer Implementation-specific register block.
In a UCIe.Adapter Spoke type, this pointer must be set to all 0s. Also set to all 0s if the register block is not implemented in the design.
UCIe Link DVSEC (see Section 9.5.1) is mirrored starting at this location. Accesses to the DVSEC by
the UCIe.Physical_Layer Spoke type are treated as reserved.
IMPLEMENTATION NOTE
Spokes can restrict access to UCIe link registers based on access control
considerations (see Section 8.1.3.5 for details).
Figure 8-60 shows the DMS register map for the Vendor-defined Spoke types. Figure 8-58 and
Section 8.3.5.3.2 detail registers that are common to all Spoke types. Section 8.3.5.3.3.1 details the
Port ID register. Vendor-defined Spokes do not have any additional architected registers.
[Figure 8-60. DMS Register Map for Vendor-defined Spoke Types: DMS_Next_Low/High, Spoke CAP, Spoke CTL, Spoke STS, Spoke CTL, DMS_Length_Low/High, and DMS_Ext_Cap_Low/High (through 48B), Reserved (through 128B), then Vendor-defined]
8.3.5.3.5 DMS Register Implementation in UCIe Adapter and in UCIe Physical Layer
IMPLEMENTATION NOTE
For Spoke Type 0, the DMS registers are implemented in the Adapter. For Spoke Type
1, the DMS registers are implemented in the Physical Layer. For Spoke Type 2, all but
the register blocks associated with the Physical Layer are implemented in the Adapter.
These registers are accessed over the FDI config bus (lp_cfg*/pl_cfg*) using
DMS Register read/write opcodes (see Table 7-1, “Opcode Encodings Mapped to
Packet Types”). SoC logic that interfaces with on-die management fabric (which is
implementation-specific) is required to perform the conversion from Management
Transport protocol UMAP packets to FDI config bus packets. The FDI config bus is
defined in Section 10.2.
§§
The UCIe specification allows a single UCIe Link to be shared by multiple protocol stacks. In this version of the spec, this sharing is limited to at most 2 protocol stacks. A shared Link is a new concept from a Software perspective and requires new discovery/control mechanisms. The mechanism by which UCIe-aware SW discovers the UCIe capability is described in the next section.
Table 9-1 shows the legal/illegal combinations of Upstream and Downstream devices/ports at a given
UCIe interface, from a SW viewpoint.
Table 9-1. Software view of Upstream and Downstream Device at UCIe interface
All the CXL/PCIe legacy/advanced capabilities/registers defined in the respective specifications apply
to UCIe host and devices as well. Some Link and PHY layer specific registers in PCIe Base
Specification do not apply in UCIe context and these are listed in the appendix. In addition, two new
DVSEC capabilities and four other MMIO mapped register blocks are defined to deal with UCIe-specific
Adapter and Physical Layer capabilities.
In Switch USP: Dev0/Fn0 of the USP carrying a UCIe Link DVSEC Capability.
• In multi-stack implementations, Dev0/Fn0 of the USP in only one of the stacks carries the UCIe Link DVSEC Capability.
In Switch DSP: Dev0/Fn0 of the Switch USP carrying one or more UiSRB DVSEC Capability.
• UCIe Links below the switch are described in the UiSRB whose base address is provided in the UiSRB DVSEC Capability.
• A UCIe Link DVSEC capability per downstream UCIe Link is present in the UiSRB.
• Association of a UCIe Link to 1 or more Switch DSPs is described as part of the UCIe Link DVSEC Capability, allowing UCIe-aware SW to understand the potentially shared nature of the UCIe interface.
Note: It is legal for a Switch USP to carry the UiSRB DVSEC capability but not a UCIe Link DVSEC Capability.
Config in Table 9-3). The Mailbox mechanism is available via RP/DSP UCIe Link DVSEC Capability
to access the UCIe Retimer registers on the Retimer closest to the host. For accessing UCIe
Retimer registers on the far end Retimer, the same Mailbox mechanism is also available in the
UCIe Link DVSEC capability of EP/USP. See Section 9.5.1.11 and Section 9.5.1.12 for details of
the Mailbox mechanism.
• For debug and runtime Link health monitoring reasons, host SW can also access the UCIe-related registers in any partner die on the sideband interface, using the same Mailbox mechanism. For brevity, that is not shown in Table 9-3. Note that register accesses over sideband are limited to only the UCIe-related Capability registers (the two DVSECs currently defined in the spec) and the four defined UCIe Register Blocks. Nothing else on the remote die is accessible via the sideband mechanism.
Table 9-3 summarizes the location of various register blocks in each native UCIe port/device.
Henceforth a “UCIe port/device/EP/Switch” is used to refer to a standard PCIe or CXL port/device/EP/
Switch with UCIe Link DVSEC Capability.
UCIe Test/Compliance Register Block: present in UiRB, UiSRB, Switch USP-BAR Region, EP-BAR Region, and SB-MMIO Space. Registers for Test/Compliance of the UCIe interface.
UCIe Implementation Specific Register Block: present in UiRB, UiSRB, Switch USP-BAR Region, EP-BAR Region, and SB-MMIO Space. Registers for vendor-specific implementation.
Figure 9-1. Software view Example with Root Ports and Endpoints
The example in Figure 9-2 has a Switch with 2 UCIe Links on its downstream side, and each UCIe Link carries traffic from 2 Switch DSPs.
[Figure 9-2. Switch with two downstream UCIe Links (UCIe0 and UCIe1), each shared by two Switch DSPs and terminating at UCIe EPs; per-link register blocks include the Comp/Test Header, the PHY/D2D Header, and the PHY and D2D Implementation Specific Headers]
The example in Figure 9-3 shows the UCIe registers in an implementation where two EPs share a common UCIe Link.
Attribute: Description
RO: Read Only
RW: Read-Write
RWO: Read-Write-One-To-Lock. Field becomes RO after writing 1 to it. Cleared by management reset.
RW1C: Read-Write-One-To-Clear
RW1CS: Read-Write-One-To-Clear-Sticky (a)
a. Definition of ‘sticky’ follows the underlying protocol definition if any of the Protocol stacks are PCIe or CXL. For
Streaming, the sticky registers are recommended to preserve their value even if the Link is down. In all
scenarios, Domain Reset must initialize these to their default values.
b. Typically, this register attribute is used for functionality/capability that can vary with package integration. For
example, a chiplet that is capable of 32 GT/s maximum speed might be routed to achieve a maximum speed
of 16 GT/s in a given package implementation. To account for such scenarios, the Max link speed field in the
UCIe Link Capability register has the HWInit attribute and its value could be configured by a package-level
strap or device/system firmware to reflect the maximum speed of that implementation.
All numeric values in various data structures, individual registers and register fields defined in this
chapter are always encoded in little endian format, unless stated otherwise.
Table 9-5. UCIe Link DVSEC - PCI Express Extended Capability Header
For this revision of UCIe, only values 0h, 1h, 2h and 7h are valid.
‘Port Number’ is bits 31:24 of the PCIe Link capabilities register of the
downstream port.
Raw Format
0 RO
If set, indicates the Link can support Raw Format.

3:1 HWInit
0h: x16
1h: x32
2h: x64
3h: x128
4h: x256
7h: x8
Others - Reserved
Retimer
8
RO (Retimer), RsvdP (others)
Set by a retimer to indicate it to SW.

Multi-protocol capable(a)
9
RsvdP (Retimer), RO (others)
0 - single stack capable
1 - multi-protocol capable
Only a maximum of 2 stacks is possible.

Advanced Packaging
10 RO
0 = Standard package mode for UCIe Link
1 = Advanced package mode for UCIe Link
a. This bit was named and referred to as “Multi-stack” in r1.1 and prior revisions of the spec.
Raw Format Enable
0
RW (RP/DSP), HWInit (Others)
If set, enables the Link to negotiate Raw Format during Link training. The default value of this bit is 0b for an RP, and firmware/SW sets this bit based on the system usage scenario. A Switch DSP can set the default via implementation-specific mechanisms such as straps/FW, etc., to account for the system usage scenario (like a UCIe retimer). This allows the DSP Link to train up without Software intervention and be compatible with a UCIe-unaware OS.

Start UCIe Link training
10
RW, with auto clear (RP/DSP), RsvdP (Others)
When set to 1, Link training starts with the Link Control bits programmed in this register and with the protocol layer capabilities. The bit is automatically cleared when Link training completes with either success or error. The status register captures the final status of the Link training. Note that if the Link is up when this bit is set to 1 from 0, the Link will go through full training through the Link Down state, thus resetting everything beneath the Link. If Link Status (in the UCIe Link Status register) is 0b and the link is already in training (i.e., the link training state machine is between the RESET and ACTIVE states), then when this bit transitions from 0 to 1, the link does not restart training and the bit's 0-to-1 transition is ignored.
The primary intended usage for this bit is initial Link training out of reset on the host side.
Note: For downstream ports of a switch with UCIe, local HW/FW has to autonomously initiate Link training after a conventional reset, without waiting for higher-level SW to start the training via this bit, to ensure backward compatibility.
Default is 0.
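A sketch of the intended host-side flow; the DVSEC-relative register offsets and the config access helpers are assumptions, while the Start bit (bit 10) auto-clear behavior and the Link Status bit (bit 15) follow the field definitions in this section:

#include <stdint.h>

extern uint32_t cfg_read32(uint64_t addr);           /* hypothetical helpers */
extern void     cfg_write32(uint64_t addr, uint32_t val);

#define LINK_CTL_OFF       0x0Cu        /* assumed UCIe Link Control offset */
#define LINK_STS_OFF       0x10u        /* assumed UCIe Link Status offset  */
#define START_TRAINING_BIT (1u << 10)   /* per the field definition above   */
#define LINK_STATUS_BIT    (1u << 15)   /* Link Status: 1 = Link is up      */

static int start_link_training(uint64_t dvsec_base)
{
    cfg_write32(dvsec_base + LINK_CTL_OFF,
                cfg_read32(dvsec_base + LINK_CTL_OFF) | START_TRAINING_BIT);
    /* The Start bit auto-clears when training completes (success or error). */
    while (cfg_read32(dvsec_base + LINK_CTL_OFF) & START_TRAINING_BIT)
        ;
    return (cfg_read32(dvsec_base + LINK_STS_OFF) & LINK_STATUS_BIT) != 0;
}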
Link Status
15 RO
0 - Link is down.
1 - Link is up.
This bit indicates the status of the mainband. Transitioning a Link from down to up requires a full Link training, which can be achieved using one of these methods:
• Start Link training via the bits in the UCIe Link Control register of the upstream device
• Using the protocol layer reset bit associated with the Link, like the SBR bit in the BCTL register of the RP P2P space
• Using the protocol layer Link Disable bit associated with the Link, like the Link Disable bit in the Link CTL register of the PCIe capability register in the RP P2P space, and then releasing the disable
Notes: If the Link is actively retraining, this bit reflects a value of 1. This bit is a consolidated status of the RDI and FDI (i.e., if both the RDI and FDI are up, then this bit is set to 1; otherwise, this bit is cleared to 0).
Link Training/Retraining
16 RO 1b - Currently Link is training or retraining
0b - Link is not training or retraining
HW autonomous BW changed
18
RW1C (RP/DSP), RsvdZ (Others)
UCIe autonomously changed the Link width or speed to correct Link reliability related issues.
9.5.1.7 UCIe Link DVSEC - Link Event Notification Control (Offset 18h)
Link event notification related controls are in this register.
15:11
RO (RP/DSP), RsvdP (Others)
This field indicates which MSI vector (for host UCIe Links), or MSI/MSI-X vector (for switch DSP UCIe Links), is used for the interrupt message generated in association with the events that are controlled via this register.
For MSI, the value in this field indicates the offset between the base Message Data and the interrupt message that is generated. Hardware is required to update this field so that it is correct if the number of MSI Messages assigned to the Function changes when software writes to the Multiple Message Enable field in the Message Control Register for MSI. For the first generation of UCIe, a maximum of 2 interrupt vectors can be requested for UCIe-related functionality, and the 'Link event' is one of them.
For MSI-X (applicable only for interrupts from Switch DSPs with UCIe Links), the value in this field indicates which MSI-X Table entry is used to generate the interrupt message. The entry must be one of the first 32 entries even if the Function implements more than 32 entries. For a given MSI-X implementation, the entry must remain constant.
Note: This register only controls the propagation of the error condition and it has no impact
on the setting of the appropriate status bits in the Link Status register, when the
relevant error happens.
0
RW (RP/DSP), RsvdP (Others)
Default is 0.
EP/USP:
0: Reporting of this error via sideband error message is not enabled
1: Reporting of this error via sideband error message is enabled
Default is 0.
RP/DSP:
0: Reporting of this error via UCIe Link Error interrupt is not enabled
1: Reporting of this error via UCIe Link Error interrupt is enabled
EP/USP:
0: Reporting of this error via sideband error message is not enabled
1: Reporting of this error via sideband error message is enabled

3 RW
Retimer connected to RP/DSP:
Default is 0.
EP/USP:
0: Reporting of this error via sideband error message is not enabled
1: Reporting of this error via sideband error message is enabled
Default is 0.
15:11 RW/RO
This field indicates which MSI vector (for host UCIe Links), or MSI/MSI-X vector (for switch DSP UCIe Links), is used for the interrupt message generated in association with the events that are controlled via this register.
For MSI, the value in this field indicates the offset between the base Message Data and the interrupt message that is generated. Hardware is required to update this field so that it is correct if the number of MSI Messages assigned to the Function changes when software writes to the Multiple Message Enable field in the Message Control Register for MSI. For the first generation of UCIe, a maximum of 2 interrupt vectors can be requested for UCIe-related functionality, and the 'Error' is one of them.
For MSI-X (applicable only for interrupts from Switch DSPs with UCIe Links), the value in this field indicates which MSI-X Table entry is used to generate the interrupt message. The entry must be one of the first 32 entries even if the Function implements more than 32 entries. For a given MSI-X implementation, the entry must remain constant.
Note: All register blocks start with a header section that indicates the size of the block in
multiples of 4 KB.
Register BIR
2:0 RO
For the UCIe DVSEC capability in a host UiRB, a Switch UiSRB, and in a UCIe Retimer, this field is reserved. For others, it is defined as follows:
Indicates which one of the Dev0/Fn0 Base Address Registers, located beginning at 10h in Configuration Space, or entry in the Enhanced Allocation capability with a matching BAR Equivalent Indicator (BEI), is used to map the UCIe Register blocks into Memory Space.
Defined encodings are:
• 0: Base Address Register 10h
• 1: Base Address Register 14h
• 2: Base Address Register 18h
• 3: Base Address Register 1Ch
• 4: Base Address Register 20h
• 5: Base Address Register 24h
All others Reserved.
The Register block must be wholly contained within the specified BAR. For a 64-bit Base Address Register, the Register BIR indicates the lower DWORD.
Note: All register blocks start with a header section that indicates the size of the block in
multiples of 4 KB.
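The encoding maps directly to a Configuration Space offset; a one-line helper, for illustration:

#include <stdint.h>

/* Map a Register BIR encoding (0..5) to the Configuration Space offset of
 * the corresponding Base Address Register, per the encoding list above. */
static int bir_to_bar_offset(uint8_t bir)
{
    return (bir <= 5) ? 0x10 + 4 * bir : -1;   /* -1: reserved encoding */
}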
Opcode
4:0 RW
00000b: 32b Memory Read
00001b: 32b Memory Write
00100b: 32b Configuration Read
00101b: 32b Configuration Write
01000b: 64b Memory Read
01001b: 64b Memory Write
01100b: 64b Configuration Read
01101b: 64b Configuration Write
Others: Reserved
Default is 00100b.

BE[7:0]
12:5 RW
Default is Fh.
31:0 RW
For sideband write opcodes, this carries the write data [31:0] to the destination.
For sideband read opcodes, this carries the data read from the destination when the Write/Read Trigger bit in the Mailbox Control register is cleared, after it was initially set. This field's value is undefined until the Write/Read Trigger bit is cleared on reads.

31:0 RW
For sideband write opcodes, this carries the write data [63:32] to the destination.
For sideband read opcodes, this carries the data read from the destination when the Write/Read Trigger bit in the Mailbox Control register is cleared, after it was initially set. This field's value is undefined until the Write/Read Trigger bit is cleared on reads.
For 32b Writes/Reads, this register does not carry valid data.
Write/Read status
1:0
RW1C (RP/DSP), RW1C (EP/USP, when implemented)
00b: CA received
01b: UR received
10b: Reserved
11b: Success
This field has a valid value only after the Write/Read Trigger bit is cleared from having previously been set to 1.
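A sketch of a sideband mailbox 32b Configuration Read using these fields; the register offsets and the position of the Write/Read Trigger bit are assumptions, while the Opcode (bits 4:0), BE (bits 12:5), and Status (bits 1:0) layouts come from the field definitions above:

#include <stdint.h>

extern uint32_t cfg_read32(uint64_t addr);           /* hypothetical helpers */
extern void     cfg_write32(uint64_t addr, uint32_t val);

#define SB_MBOX_CTL   0x30u               /* assumed Mailbox Control offset */
#define SB_MBOX_DATA  0x34u               /* assumed Mailbox Data offset    */
#define SB_MBOX_STS   0x3Cu               /* assumed Mailbox Status offset  */
#define TRIGGER_BIT   (1u << 31)          /* assumed trigger bit position   */
#define OPCODE_CFG_RD 0x04u               /* 00100b: 32b Configuration Read */

static int sb_mailbox_cfg_read32(uint64_t dvsec, uint32_t *out)
{
    uint32_t ctl = OPCODE_CFG_RD | (0x0Fu << 5);    /* BE[3:0] for 32b read */
    cfg_write32(dvsec + SB_MBOX_CTL, ctl | TRIGGER_BIT);
    while (cfg_read32(dvsec + SB_MBOX_CTL) & TRIGGER_BIT)
        ;                                           /* HW clears the trigger */
    if ((cfg_read32(dvsec + SB_MBOX_STS) & 0x3u) != 0x3u)
        return -1;                                  /* 11b = Success         */
    *out = cfg_read32(dvsec + SB_MBOX_DATA);        /* read data [31:0]      */
    return 0;
}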
Port Number 1
7:0 RO
'Port number' of the 1st switch DSP associated with this UCIe Link. This value is from the Link Capabilities register of that switch DSP.

Port Number 2
15:8 RO
'Port number' of the 2nd switch DSP associated with this UCIe Link, if any. If there is no 2nd switch DSP associated with this UCIe Link, this field is treated as reserved, should not be included as part of the "length" field of the 'Designated Vendor specific Header 1' register, and SW should not consider it part of the DVSEC capability.
Note: Only a maximum of two Port numbers can be associated with a UCIe Link in the current revision of the specification.
Example#2: Host UiRB supporting 3 register locators would set the length to indicate 84B.
Example#3: Switch UiSRB supporting 3 register locators and associated with just 1 DSP port to a
UCIe Link, would set the length to indicate 85B.
Next Capability Offset
31:20
Design Dependent
9.5.2.3 UCIe Switch Register Block (UiSRB) Base Address (Offset Ch)
All bits in this register are RO.
Register BIR
0 RO
Indicates which one of a Switch USP Function's Base Address Registers, located beginning at 10h in Configuration Space, or entry in the Enhanced Allocation capability with a matching BAR Equivalent Indicator (BEI), is used to locate the UCIe Switch Register Block.
Defined encodings are:
• 0: Base Address Register 10h
• 1: Base Address Register 14h
• All others Reserved.
The Register block must be wholly contained within the specified BAR. For a 64-bit Base Address Register, the Register BIR indicates the lower DWORD.
Vendor ID
15:0 RO
Default is set to Vendor ID assigned for UCIe Consortium - D2DEh
Default is 2000h.
Header Log 1
63:0 ROS
This logs the header for the sideband mailbox register access that received a completion with Completer Abort status or received a completion with Unsupported Request status. Note that register accesses that time out are not required to be logged at the requester.
If the Write/Read Status field in the 'Sideband Mailbox Status' register indicates 'Success', or the Write/Read trigger bit in the Sideband Mailbox Control register is set to 1, this field's value is undefined.
This register is rearmed for logging new errors every time the Write/Read Trigger bit in the Mailbox Control register sees a 0-to-1 transition.
Default value is 0.
Adapter LSM id
10 ROS 0b : Adapter LSM 0 timed out
1b : Adapter LSM 1 timed out
Flit Format
17:14 ROS
This field logs the negotiated Flit Format; it is the current snapshot of the format that the Adapter is reporting to the Protocol Layer. See Chapter 3.0 for the definitions of these formats. The encodings are:
0001b - Format 1
0010b - Format 2
0011b - Format 3
0100b - Format 4
0101b - Format 5
0110b - Format 6
Other encodings are Reserved.
First Fatal Error Indicator
22:18 ROS
5-bit encoding that indicates which bit of the Uncorrectable Error Status errors was logged first. The value of this field has no meaning if the corresponding status bit is cleared. The encoding of this field is as follows:
00h if the error corresponding to Uncorrectable Error Status register[0] is the first fatal error.
01h if the error corresponding to Uncorrectable Error Status register[1] is the first fatal error.
…
Because reserved bits may be repurposed in future versions of the specification, software might observe that this field points to a reserved bit (from its perspective) in the Uncorrectable Error Status register. This can happen when an older version of Software is run on newer hardware. Software must be aware that it still needs to clear the Status register bit if it desires to allow for continued error logging. How SW handles error status bits it does not understand is beyond the scope of the specification.
Once set, the value of this field does not change until SW clears the corresponding Uncorrectable Error Status register bit. When SW clears the corresponding status bit, HW is rearmed to capture subsequent first fatal errors.
Note that because of an inherent race condition between HW setting a new status bit and SW clearing an older status bit, SW must be aware that this field might not always indicate the first error amongst all the errors logged in the Uncorrectable Error Status register. For example, suppose Uncorrectable Error Status bit 0 was set first by HW, and in the time between SW reading the status and clearing it, bit 1 in the Status register was set. After SW clears bit 0, if the error corresponding to bit 0 recurs, it will be captured as the next first error even though the error corresponding to bit 1 occurred earlier. If multiple errors are encountered simultaneously, which error is logged as the First Fatal Error is implementation-dependent.
9.5.3.20 Advertised CXL Capability Log Register for Stack 1 (Offset 88h)
This register is reserved for designs that do not implement the Enhanced Multi-protocol capability.
9.5.3.21 Finalized CXL Capability Log Register for Stack 1 (Offset 90h)
This register is reserved for designs not implementing the Enhanced multi-protocol capability.
2:0 RO Reserved
TX Equalization support
4 RO 0: TXEQ not supported
1: TXEQ supported
10 RsvdP Reserved
Package type
15 RO 0b: Advanced Package
1b: Standard Package
Rx Termination Control
3 RW
0b: Rx Termination disabled
1b: Rx Termination enabled
Default is the same as the 'Terminated Link' bit in the PHY capability register.
Note that this bit is always cleared to 0 for Advanced Packages. This control is provided for debug purposes only.
Tx Eq Enable
4 RW
0b: Eq Disabled
1b: Eq Enabled
Default is 0.
2:0 RO Reserved
Rx Termination Status
3 RO
0: Rx Termination disabled
1: Rx Termination enabled
Default is the same as the 'Terminated Link' bit in the PHY capability register.
This is the current status of the local UCIe Module. Note that this is always 0 for Advanced Packages. For Standard packages, whether the Rx decides to terminate the Link could depend on several factors (including channel length in the Package, etc.), and that decision is implementation-specific. The Transmitter of a remote Link partner needs this information in order to know whether to Hi-Z the Data and Track Lanes during clock gating and when not performing Runtime Recalibration, respectively. It is expected that this information is known a priori at Package integration time, and the Transmitter is informed of this in an implementation-specific manner.
Tx Eq Status
4 RO
0: Eq Disabled
1: Eq Enabled
Default is 0.
Initialization control
2:0 RW
000b: Initialize to Active. This is the regular Link bring up.
001b: Initialize to MBINIT (Debug mode) (i.e., pause training after completing step-2 of MBINIT.PARAM).
010b: Initialize to MBTRAIN (Debug/compliance mode) (i.e., pause training after entering MBTRAIN after completing step-1 of MBTRAIN.VALVREF).
011b: Pause after completing step-1 of MBTRAIN.RXDESKEW, regardless of entering for initial bring up or from Retrain.
100b: Pause after completing step-1 of MBTRAIN.DATATRAINCENTER2, regardless of entering for initial bring up or from Retrain.
All other encodings are reserved.
Resume Training
5 RW
A 0b-to-1b transition on this bit triggers hardware to resume training, from the last link training state achieved via the 'Initialization Control' field in this register, until ACTIVE.
A device that does not support the UCIe Test and Compliance register block is permitted to hardwire this bit to 0b.
Default is 0b.
Training mode
10 RW
0b: Continuous mode
1b: Burst Mode
Default is 0.
Idle count
15:0 RW
Indicates the duration of low following the burst (UI count).
Default is 4h.

State (N-1)
23:16 ROS
Captures the state before State N was entered for the Link training state machine. State encodings are the same as the State N field.
Default is 0.

State (N-2)
31:24 ROS
Captures the state before State (N-1) was entered for the Link training state machine. State encodings are the same as the State N field.
Default is 0.

State (N-3)
7:0 ROS
Captures the state before State (N-2) was entered. State encodings are the same as the State N field.
Default is 0.
Start
6 RW
Software writes to this bit before setting the Link Retrain bit to inform hardware that the contents of this register are valid. HW clears this bit to 0 after the Busy bit in the Runtime Link Test Status register is set to 1.
Note that HW may also measure Eye Margins during HW-autonomous retraining and/or initial training
and if measured, is permitted to report it in the Eye Margin registers whenever the EMV bit is cleared.
For x32 Advanced Packaging implementations, EML* and EMR* registers for Lanes 63:32 are RsvdP.
Table 9-65. UHM DVSEC - Designated Vendor Specific Header 1, 2 (Offsets 04h and 08h)
Step Count
7:0 RO
Step count used in the reporting of margin information. A value of 0 indicates 256. For example, a value of 32 indicates that the UI is equally divided into 32 steps and the Eye Margin registers provide the left and right margins in multiples of UI/32.
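A small helper showing the conversion implied by this field (a sketch):

#include <stdint.h>

/* Convert a reported eye margin (in steps) to a fraction of one UI.
 * A Step Count of 0 encodes 256, per the field definition above. */
static double margin_in_ui(uint8_t margin_steps, uint8_t step_count)
{
    unsigned steps = step_count ? step_count : 256u;
    return (double)margin_steps / (double)steps;
}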
Vendor ID
15:0 RO
Default is set to Vendor ID assigned for UCIe Consortium - D2DEh.
Compliance Mode
1:0 RW
Any write to this register takes effect after the next entry of the RDI state status into Retrain.
• 00b = Normal mode of operation
• 01b = PHY only Link Training or Retraining
— Adapter performs the necessary RDI handshakes to bring RDI to Active but does not perform Parameter exchanges or Adapter vLSM handshakes, and keeps FDI in Reset to prevent mainband traffic.
— Adapter must still trigger RDI to Retrain if software programmed the Retrain bit in Link Control.
— Sideband Register Access requests and completions are operational in this mode.
• 10b = Adapter Compliance
— Adapter performs the necessary RDI handshakes to bring RDI to Active but does not perform Parameter exchanges or Adapter vLSM handshakes (unless triggered by software), and keeps FDI in Reset.
— Adapter only performs actions based on the triggers and setup according to the registers defined in Section 9.5.4.4.2 to Section 9.5.4.4.6.
— Adapter must still trigger RDI to Retrain if software programmed the Retrain bit in Link Control.
— Sideband Register Access requests and completions are operational in this mode.
• 11b = Reserved
Any RDI transition to LINKERROR when this field is either 01b or 10b does not reset any registers.
Default is 00b.
Table 9-73. Flit Tx Injection Control (Offset 28h from D2DOFF) (Sheet 1 of 2)
Flit Type
3:1 RW
Type of Flit injected.
• 000b = Adapter NOP Flits. These bypass the TX retry buffer.
• 001b = Test Flits.
• 010b = Alternate between NOP Flits and Test Flits.
• All other encodings are reserved.
Default is 000b.
Injection mode
5:4 RW
• 00b = Continuous injection of Flits as specified by the Flit Type field.
• 01b = Inject 'Flit Inject Number' of Flits contiguously without any intervening Protocol Flits.
• 10b = Inject 'Flit Inject Number' of Flits while interleaving with Protocol Flits. If Protocol Flits are available, alternate between Protocol Flits and Injected Flits. If no Protocol Flits are available, then inject consecutively.
• 11b = Reserved.
Default is 00b.
Payload Type
17:14 RW
This field determines the payload type used if Test Flits are injected. Payload includes all bits in the Flit with the exception of the Flit Header, CRC, and Reserved bits.
• 0h = Fixed 4B pattern picked up from the 'Payload Fixed Pattern' field of this register, inserted so as to cover all the Payload bytes (with the same pattern replicated in incrementing 4B chunks)
• 1h = Random 4B pattern picked up from a 32b LFSR (linear feedback shift register used for pseudo-random pattern generation), inserted so as to cover all the Payload bytes (with the same pattern replicated in incrementing 4B chunks)
• 2h = Fixed 4-byte pattern picked up from the 'Payload Fixed Pattern' field of this register, inserted once at the 'Flit Byte Offset' location within the Flit
• 3h = Random 4B pattern picked up from a 32b LFSR, inserted once at the 'Flit Byte Offset' location within the Flit; the rest of the payload is assigned 0b
• 4h = Same as 2h, except the 4B pattern is injected every 'Pattern Repetition' bytes starting with 'Flit Byte Offset'
• 5h = Same as 3h, except the 4B pattern is injected every 'Pattern Repetition' bytes starting with 'Flit Byte Offset'; the rest of the payload is assigned 0b
• All other encodings are reserved
Default is 0h.
The LFSR seed and primitive polynomial choice is implementation-specific.
Note: While in mission mode, because scrambling is always enabled, changing the Payload Type may have no benefit. This may, however, be useful during compliance testing with scrambling disabled.
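For illustration, a 32b Galois LFSR fill matching the behavior described for Payload Type 1h; the polynomial and seed here are arbitrary examples, since the spec leaves both implementation-specific:

#include <stdint.h>
#include <stddef.h>

/* Fill a payload buffer with random 4B patterns from a 32b Galois LFSR.
 * The polynomial (0xEDB88320 here) is only an example choice. */
static void fill_payload_lfsr(uint8_t *buf, size_t len, uint32_t seed)
{
    uint32_t lfsr = seed ? seed : 1u;      /* a zero seed would lock up */
    for (size_t i = 0; i < len; i += 4) {
        for (int b = 0; b < 32; b++)       /* advance 32 bits per 4B chunk */
            lfsr = (lfsr >> 1) ^ ((lfsr & 1u) ? 0xEDB88320u : 0u);
        for (size_t j = 0; j < 4 && i + j < len; j++)
            buf[i + j] = (uint8_t)(lfsr >> (8 * j));
    }
}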
Table 9-73. Flit Tx Injection Control (Offset 28h from D2DOFF) (Sheet 2 of 2)
Pattern Repetition
31:26 RW
See 'Payload Type'. A value of 00h or 01h must be interpreted as a single pattern occurrence.
Default is 00h.
Compliance Status
0 RO
If the Adapter is in 'PHY only Link Training or Retraining' or 'Adapter Compliance' mode, this bit is set to 1b; otherwise, it is 0b.

Flit Rx Status
4:3 RW1C
• 00b = No Test Flits received
• 01b = Received at least one Test Flit without CRC error
• All other encodings are reserved
Default is 00b.
9.5.4.4.4 Link State Injection Control Stack 0 (Offset 34h from D2DOFF)
As mentioned in Section 11.2, this register only takes effect when the Adapter is in Adapter
Compliance Mode.
Injection Type
1 RW
• 0b = Inject a request packet with the request matching the "Link Request" field
• 1b = Inject a response packet with the response matching the "Link Response" field when a request matching the "Link Request" field is received

Link Request
5:2 RW
The encodings match the State request encodings of FDI.

Link Response
9:6 RW
The encodings match the State response encodings of FDI.
9.5.4.4.5 Link State Injection Control Stack 1 (Offset 38h from D2DOFF)
As mentioned in Section 11.2, this register only takes effect when the Adapter is in Adapter
Compliance Mode.
Injection Type
1 RW
• 0b = Inject a request packet with the request matching the "Link Request" field
• 1b = Inject a response packet with the response matching the "Link Response" field when a request matching the "Link Request" field is received

Link Request
5:2 RW
The encodings match the State request encodings of FDI.

Link Response
9:6 RW
The encodings match the State response encodings of FDI.
Byte Offset
11:4 RW
See 'Error Injection Type on Transmitted Flits'. 00h means the error is injected on Byte 0, 01h means the error is injected in Byte 1, and so on.
Default is 00h.
31 RsvdP Reserved
SW is required to place the Adapter in one of the Compliance modes (defined in the Adapter
Compliance Control register) before enabling @PHY-Compliance.
All modules of a Link must be in @PHY-Compliance at the same time. The Link behavior is undefined
if a subset of modules of a Link are in @PHY-Compliance and others are not. All registers in this
section are replicated, one per module, as follows:
• Module 0 registers start at Offset 000h from PHYOFF
• Module 1 registers start at Offset 400h from PHYOFF
• Module 2 registers start at Offset 800h from PHYOFF
• Module 3 registers start at Offset C00h from PHYOFF
If certain modules are not implemented, those registers become reserved (as shown with gray boxes
in Figure 9-6).
9.5.4.5.1 Physical Layer Compliance Control 1 (Offsets 000h, 400h, 800h, and C00h
from PHYOFF)
Scrambling Disabled
1 RW
@PHY-Compliance, when set to 1b, the Physical Layer disables scrambling.
Default is 0b.
Rx Vref Offset
17:10 RW
@PHY-Compliance, when 'Rx Vref Offset Enable' is set to 01b or 10b, this is the value that needs to be added or subtracted as defined in 'Rx Vref Offset Enable'.
The Rx Vref value, after applying the Rx Vref offset, is expected to be monotonically increasing/decreasing with increasing/decreasing values of Rx Vref offset relative to the trained value, and must have sufficient range to cover the input eye mask range defined in Chapter 5.0.
The Rx Vref Offset will be applied during Tx or Rx Data to Point Training, and the Physical Layer must compare the per-Lane errors with 'Max Error Threshold in per-Lane comparison', and the aggregate Lane errors with 'Max Error Threshold in Aggregate Comparison', in the 'Training Setup 4' register. If the errors measured are greater than the corresponding threshold, then the device must set the Rx Vref offset status register to "failed".
Software must increase or decrease the Rx Vref Offset by one from the previous value.
Default is 00h.
9.5.4.5.2 Physical Layer Compliance Control 2 (Offsets 008h, 408h, 808h, and C08h
from PHYOFF)
Track Enable
2 RW
If @PHY-Compliance {
If this bit is set, Track Transmission is enabled during one of the operations set by 'PHY compliance operation type'. Track transmission complies with the descriptions in Section 5.5.1.
}
Else {
The appropriate sideband handshakes as described in Section 4.6 need to be followed irrespective of the value of this bit
}

Compare Setup
3 RW
• 0b = Aggregate comparison
• 1b = Per-Lane comparison
Default is 0b. See Section 4.4 for more details.
9.5.4.5.3 Physical Layer Compliance Status 1 (Offsets 010h, 410h, 810h, and C10h from
PHYOFF)
9.5.4.5.4 Physical Layer Compliance Status 2 (Offsets 018h, 418h, 818h, and C18h from
PHYOFF)
9.5.4.5.5 Physical Layer Compliance Status 3 (Offsets 020h, 420h, 820h, and C20h from
PHYOFF)
IMPLEMENTATION NOTE
While the SW view of Protocol Layer for streaming protocols is implementation-
specific, it is strongly recommended that UCIe link-related registers defined in this
chapter be implemented as-is for streaming mode solutions as well. If a streaming
mode solution chooses to support the industry-standard PCIe hierarchical tree model
for enumeration/control, it must be compliant with the enumeration model and
registers defined in this chapter. A UCIe port in such an implementation would expose
UCIe link registers consistent with the RP/DSP or EP/USP functionality it represents.
Field / Byte Offset / Length in Bytes / Description:
• Type / 00h / 1 / Signature for the UCIe Early Discovery Table (UEDT).
• UCIe Stack Size / 08h / 4 / 1h = One RP; 2h = Two RPs.
• Creator Revision / 20h / 4 / Revision of the utility that created this table.
§§
This chapter covers the details of interface operation and signal definitions for the Raw Die-to-Die Interface (RDI), as well as the Flit-Aware Die-to-Die Interface (FDI). Common rules across RDI and FDI are covered as a separate section. The convention used in this chapter is that "assertion" of a signal is a 0b-to-1b transition, and "de-assertion" of a signal is a 1b-to-0b transition. A "pulse" of "n" cycles for a signal is defined as an event where the signal transitions from 0b to 1b, stays 1b for "n" clock cycles, and subsequently returns to 0b. A receiver sampling this signal on the same clock as the transmitter will see it being asserted for "n" clock cycles. If a value of "n" is not specified, it is interpreted as a value of one. In the context of error signals defined as pulses, the receiving logic for error logging must treat the rising edge as a new event indication and not rely on the length of the pulse.
In this chapter, interface reset/domain reset also applies to all forms of Conventional Reset defined in
PCIe Base Specification, if the Protocol is PCIe or CXL. In the sections that follow, “UCIe Flit mode”
refers to scenarios in which the Link is not operating in Raw Format, and “UCIe Raw Format” or “Raw
Format” refers to scenarios in which the Link is operating in Raw Format.
Figure: RDI between the Die-to-Die Adapter and the Physical Layer: (a) Physical Layer with a single module, (b) Multi-module PHY Logic with two modules, and (c) Multi-module PHY Logic with four modules; each module (Module 0 through Module 3) contains its own AFE/PHY Logic.
Table 10-1 lists the RDI signals and their descriptions. All signals are synchronous with lclk.
In Table 10-1:
• pl_* indicates that the signal is driven away from the Physical Layer to the Die-to-Die Adapter.
• lp_* indicates that the signal is driven away from the Die-to-Die Adapter to the Physical Layer.
lp_irdy
Adapter to Physical Layer signal indication that the Adapter has data to send. This must be asserted if lp_valid is asserted and the Adapter wants the Physical Layer to sample the data.
lp_irdy must not be presented by the Adapter when pl_state_sts is Reset, except when the status transitions from LinkError to Reset. On a LinkError to Reset transition, it is permitted for lp_irdy to be asserted for a few clocks, but it must be de-asserted eventually. The Physical Layer must ignore lp_irdy when the status is Reset.
lp_valid
Adapter to Physical Layer indication that data is valid on the corresponding lp_data bytes.
lp_data[NBYTES-1:0][7:0]
Adapter to Physical Layer data, where NBYTES equals the number of bytes determined by the data width for the RDI instance.
lp_retimer_crd
When asserted at a rising clock edge, it indicates a single credit return from the Adapter to the Physical Layer for the Retimer Receiver buffers. Each credit corresponds to 256B of mainband data. This signal must NOT assert for dies that are not UCIe Retimers.
pl_trdy
The Physical Layer is ready to accept data. Data is accepted by the Physical Layer when pl_trdy, lp_valid, and lp_irdy are asserted at the rising edge of lclk. This signal must only be asserted if pl_state_sts is Active, or when performing the pl_stallreq/lp_stallack handshake when pl_state_sts is LinkError (see Section 10.3.3.7).
pl_data[NBYTES-1:0][7:0]
Physical Layer to Adapter data, where NBYTES equals the number of bytes determined by the data width for the RDI instance.
pl_retimer_crd
When asserted at a rising clock edge, it indicates a single credit return from the Retimer to the Adapter. Each credit corresponds to 256B of mainband data. This signal must NOT assert if the remote Link partner is not a Retimer.
lp_linkerror
Adapter to Physical Layer indication that an error has occurred which requires the Link to go down. The Physical Layer must move to the LinkError state and stay there as long as lp_linkerror = 1. The reason for having this indication decoupled from regular state transitions is to allow immediate action on the part of the Adapter and Physical Layer, in order to provide the quickest path for error containment when applicable (for example, a viral error escalation must map to the LinkError state).
The Adapter must OR internal error conditions with the lp_linkerror received from the Protocol Layer on FDI.
pl_inband_pres
Physical Layer to Adapter indication that the Die-to-Die Link has finished training and is ready for RDI transition to Active and Stage 3 of bring-up. Once it transitions to 1b, this must stay 1b until the Physical Layer determines the Link is down (i.e., the Link Training State Machine transitions to TrainError or Reset).
pl_error
Physical Layer to Adapter indication that it has detected a framing-related error which is recoverable through Link Retrain. An example is when the Physical Layer receives an invalid encoding on the Valid Lane. It is a pulse of one or more cycles that must occur only when RDI is in Active state. It is permitted to de-assert at the same clock edge where the state transitions away from Active state.
It is pipelined with the receive data path such that the error indication reaches the Adapter before or at the same time as the corrupted data. The Physical Layer is expected to go through the Retrain flow after this signal has been asserted, and it must not send valid data to the Adapter until the Link has retrained.
It is permitted for the Physical Layer to squash pl_valid internally for the corrupted data. Once pl_error is asserted, pl_valid should not be asserted (without pl_error assertion in the same cycle) until the state status has transitioned to Active after completing a successful Retrain entry and exit.
If pl_error = 1 and pl_valid = 1 in the same clock cycle, the Adapter must discard the corresponding Flit (even if it is only partially received when pl_error asserted). In UCIe Flit mode, when Retry is enabled, it is the responsibility of the Adapter to ensure data integrity for Flits forwarded to FDI, and that they are canceled following the rules of pl_flit_cancel if they are suspected of corruption (see Section 10.2). A couple of examples are given below:
• For the 68B Flit Format, the Adapter could discard partially received Flits; but in 256B Latency-Optimized modes, it could have processed one half correctly while the error happened on the other half, so it has to track that and process future Flits accordingly.
• If the Adapter is not doing store/forward and has received only 64B of a 128B half when pl_error arrives, it needs to send dummy data for the second 64B and assert pl_flit_cancel for that half of the Flit.
In UCIe Flit mode with Retry enabled in the Adapter, Retrain exit would naturally result in a Replay of any partially received Flits eventually (see Section 3.8). In UCIe Flit mode with Retry disabled, the Adapter must map pl_error assertion to an Uncorrectable Internal Error and escalate it accordingly.
If the Link is operating in Raw Format, the Adapter forwards pl_error to the Protocol Layer such that it is pipeline-matched to the data bus, and the Protocol Layer handles it in an implementation-specific manner.
pl_cerror
Physical Layer to Adapter indication that a correctable error was detected that does not affect the data path and will not cause Retrain on the Link. In UCIe Flit mode with Retry enabled, the Adapter must OR the pl_error and pl_cerror signals for Correctable Internal Error logging. In UCIe Flit mode with Retry disabled, or when the Link is operating in Raw Format, the Adapter must only use pl_cerror for Correctable Internal Error logging.
It is a pulse of one or more cycles which can occur in any RDI state. If it is a state in which clock gating is permitted, it is the responsibility of the Physical Layer to perform the clock gating exit handshake with the Adapter before asserting this signal. Clock gating can resume once pl_cerror de-asserts and all other conditions permitting clock gating are satisfied.
pl_nferror
Physical Layer to Adapter indication that a non-fatal error was detected. There is currently no architecturally defined Physical Layer error condition that asserts this signal; however, the signal is provided on the interface for any implementation-specific non-fatal errors. The Adapter treats this in the same manner as a Sideband Non-Fatal Error Message received from the remote Link partner.
It is a pulse of one or more cycles that can occur in any RDI state. If it is a state where clock gating is permitted, it is the responsibility of the Physical Layer to perform the clock gating exit handshake with the Adapter before asserting this signal. Clock gating can resume after pl_nferror is de-asserted and all other conditions permitting clock gating have been met.
pl_trainerror
Indicates a fatal error from the Physical Layer. The Physical Layer must transition pl_state_sts to LinkError if it is not already in LinkError state. This must be escalated to upper Protocol Layers based on the mask and severity programming of Uncorrectable Internal Error in the Adapter. Implementations are permitted to map to this signal any fatal error that requires upper-layer escalation (or interrupt generation), depending on system-level requirements.
It is a level signal that can assert in any RDI state but remains asserted until RDI exits the LinkError state to the Reset state.
pl_phyinrecenter
Physical Layer indication to the Adapter that the Physical Layer is training or retraining. If this is asserted during a state where clock gating is permitted, the pl_clk_req/lp_clk_ack handshake must be performed with the upper layer. The upper layers are permitted to use this to update the "Link Training/Retraining" bit in the UCIe Link Status register.
pl_stallreq
Physical Layer request to the Adapter to align the Transmitter at a Flit boundary and not send any new Flits, to prepare for a state transition. See Section 10.3.2.
lp_stallack
Adapter to Physical Layer indication that the Flits are aligned and stalled (if pl_stallreq was asserted). It is strongly recommended that this response logic be on a global free-running clock, so that the Adapter can respond to pl_stallreq with lp_stallack even if other significant portions of the Adapter are clock gated. See Section 10.3.2.
pl_clk_req
Request from the Physical Layer to remove clock gating from the internal logic of the Adapter. This is an asynchronous signal relative to lclk from the Adapter's perspective, since it is not tied to lclk being available in the Adapter. Together with lp_clk_ack, it forms a four-way handshake to enable dynamic clock gating in the Adapter.
When dynamic clock gating is supported, the Adapter must use this signal to exit clock gating before responding with lp_clk_ack. If dynamic clock gating is not supported, it is permitted for the Physical Layer to tie this signal to 1b.
lp_clk_ack
Response from the Adapter to the Physical Layer acknowledging that its clocks have been ungated in response to pl_clk_req. This signal is only asserted when pl_clk_req is asserted, and de-asserted after pl_clk_req has de-asserted.
When dynamic clock gating is not supported by the Adapter, it must stage pl_clk_req internally for one or more clock cycles and turn it around as lp_clk_ack. This way it still participates in the handshake even though it does not support dynamic clock gating.
lp_wake_req
Request from the Adapter to remove clock gating from the internal logic of the Physical Layer. This is an asynchronous signal from the Physical Layer's perspective, since it is not tied to lclk being available in the Physical Layer. Together with pl_wake_ack, it forms a four-way handshake to enable dynamic clock gating in the Physical Layer.
When dynamic clock gating is supported, the Physical Layer must use this signal to exit clock gating before responding with pl_wake_ack. If dynamic clock gating is not supported, it is permitted for the Adapter to tie this signal to 1b.
pl_wake_ack
Response from the Physical Layer to the Adapter acknowledging that its clocks have been ungated in response to lp_wake_req. This signal is only asserted after lp_wake_req has asserted, and is de-asserted after lp_wake_req has de-asserted.
When dynamic clock gating is not supported by the Physical Layer, it must stage lp_wake_req internally for one or more clock cycles and turn it around as pl_wake_ack. This way it still participates in the handshake even though it does not support dynamic clock gating.
pl_cfg[NC-1:0]
This is the sideband interface from the Physical Layer to the Adapter. See Chapter 7.0 for packet format details. NC is the width of the interface; supported values are 8, 16, and 32.
Register accesses must be implemented by hardware to be atomic regardless of the width of the interface (i.e., all 32 bits of a register must be updated in the same cycle for a 32-bit register write, and similarly all 64 bits of a register must be updated in the same cycle for a 64-bit register write).
pl_cfg_vld
When asserted, indicates that pl_cfg has valid information that should be consumed by the Adapter.
pl_cfg_crd
Credit return from the Physical Layer to the Adapter for sideband packets. Each credit corresponds to 64 bits of header and 64 bits of data. Even transactions that do not carry data, or carry only 32 bits of data, consume the same credit; the Physical Layer returns the credit once the corresponding transaction has been processed or deallocated from its internal buffers. See Section 7.1.3.1 for additional flow control rules. A value of 1 sampled at a rising clock edge indicates a single credit return.
Because the advertised credits are design parameters, the Adapter transmitter updates its credit counters with the initial credits on domain reset exit, and no initialization credits are returned over the interface.
Credit returns must follow the same clock gating exit handshake rules as the sideband packets, to ensure that no credit returns are dropped by the receiver of the credit returns.
lp_cfg[NC-1:0]
This is the sideband interface from the Adapter to the Physical Layer. See Chapter 7.0 for details. NC is the width of the interface; supported values are 8, 16, and 32.
Register accesses must be implemented by hardware to be atomic regardless of the width of the interface (i.e., all 32 bits of a register must be updated in the same cycle for a 32-bit register write, and similarly all 64 bits of a register must be updated in the same cycle for a 64-bit register write).
lp_cfg_vld
When asserted, indicates that lp_cfg has valid information that should be consumed by the Physical Layer.
lp_cfg_crd
Credit return from the Adapter to the Physical Layer for sideband packets. Each credit corresponds to 64 bits of header and 64 bits of data. Even transactions that do not carry data, or carry only 32 bits of data, consume the same credit; the Adapter returns the credit once the corresponding transaction has been processed or deallocated from its internal buffers. See Section 7.1.3.1 for additional flow control rules. A value of 1 sampled at a rising clock edge indicates a single credit return.
Because the advertised credits are design parameters, the Physical Layer transmitter updates its credit counters with the initial credits on domain reset exit, and no initialization credits are returned over the interface.
Credit returns must follow the same clock gating exit handshake rules as the sideband packets, to ensure that no credit returns are dropped by the receiver of the credit returns.
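The transmit-side bookkeeping implied by pl_cfg_crd/lp_cfg_crd above is small enough to model directly. The following sketch (Python; the class name and methods are illustrative, not part of this specification) preloads the design-time initial credits on domain reset exit, consumes one credit per sideband packet regardless of payload size, and increments on every rising clock edge where the credit-return wire samples 1.

# Illustrative transmit-side model of the RDI sideband credit scheme.
class SidebandCreditCounter:
    def __init__(self, initial_credits: int):
        # Advertised credits are design parameters: the transmitter
        # preloads them on domain reset exit; no initialization credits
        # are ever returned over the interface.
        self.initial_credits = initial_credits
        self.credits = initial_credits

    def domain_reset_exit(self) -> None:
        self.credits = self.initial_credits

    def send_packet(self) -> None:
        # One credit per packet (64 bits of header + 64 bits of data),
        # even for packets with no data or only 32 bits of data.
        if self.credits == 0:
            raise RuntimeError("sideband packet sent without a credit")
        self.credits -= 1

    def sample_credit_return(self, crd_wire: int) -> None:
        # A value of 1 sampled at a rising clock edge is one credit return.
        if crd_wire == 1:
            self.credits += 1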
Signals in Table 10-2 apply only when supporting MPM over sideband. Whether these signals run on lclk or Mgmt_clk is implementation-specific.
Table 10-2. RDI Config interface extensions for Management Transport
mp_mgmt_init_start
A two-clock trigger pulse from the Management Port Gateway to the PHY to start negotiation on the sideband links. The Management Port Gateway must ensure that the mp_mgmt_up signal is de-asserted when this signal is pulsed. This signal forces the link state machine to the RESET state (if it is not already there) and hence can bring the mainband link down from link-up state. The standard TRAINERROR flow applies here as well for transitioning the state machine to RESET if the state machine is not already in that state when this signal is pulsed.
pm_cfg_credit[N-1:0]
This is the credit return for the Flow Control buffers over RDI (see Section 8.2.5.1.1) used by the Management Port Gateway to transmit management packets to the remote Management Port Gateway.
Each credit corresponds to 64 bits of buffer space. The Physical Layer returns the credit once the corresponding transaction has been deallocated from its internal buffers. See Section 8.2.5.1.1 for additional flow control rules. Because the advertised credits are design parameters, the Management Port Gateway transmitter updates the credit counters with initial credits on Management reset exit or on 'Heartbeat timeout', and no initialization credits are returned over the interface for these conditions. Credit returns must follow the same clock gating exit handshake rules as the sideband packets to ensure that no credit returns are dropped by the receiver of the credit returns.
There is a signal per RxQ-ID in the design; hence, N can be 1, 2, 3, or 4.
mp_rxqid[N-1:0]
RxQ-ID associated with the message. Has meaning when the mp_mgmt_pkt signal is asserted on an RDI transfer. Used by the PHY to steer the packet to the correct SB link. On encapsulated MTPs and PM Req messages, this carries the far-end Rx queue's RxQ-ID. On Credit return, Init Done, and PM Ack messages, this carries the RxQ-ID of the local Rx queue associated with the message.
N is either 2 (for 4-module link scenarios) or 1 (1- or 2-module link scenarios). There is a fixed mapping in the PHY between this value and a physical SB link, and the mapping is determined after successful completion of management transport negotiation on the transmit side. The chosen SB link for a given RxQ-ID must be one of the SB links that successfully trained for management transport on the transmit side.
pm_rxqid[N-1:0]
RxQ-ID associated with the message. Has meaning when the pm_mgmt_pkt signal is asserted on an RDI transfer. Used by the Management Port Gateway to internally steer the packet to the correct RxQ.
N is either 2 (for 4-module/sideband-only link scenarios) or 1 (1- or 2-module/sideband-only link scenarios). Valid for all MPM config bus transmissions. The PHY uses the RxQ-ID from the first credit return message received from a given sideband link to drive these signals on the config interface. These signals are undefined for the SoC Capabilities message. The captured RxQ-ID value is reset only when the management path is reinitialized.
mp_wake_req
Request from the Management Port Gateway to remove clock gating from the internal logic of the Physical Layer that handles management transport traffic. This is an asynchronous signal from the Physical Layer's perspective since it is not tied to lclk being available in the Physical Layer. Together with pm_wake_ack, it forms a four-way handshake to enable dynamic clock gating in the Physical Layer logic that handles management transport traffic.
When dynamic clock gating is supported, the Physical Layer must use this signal to exit clock gating before responding with pm_wake_ack. If dynamic clock gating is not supported, the Management Port Gateway must tie this signal to 1.
pm_wake_ack
Response from the Physical Layer to the Management Port Gateway acknowledging that its clocks have been ungated in response to mp_wake_req. This signal is only asserted after mp_wake_req has asserted, and is de-asserted after mp_wake_req has de-asserted.
When dynamic clock gating is not supported by the Physical Layer, it must stage mp_wake_req internally for one or more clock cycles and turn it around as pm_wake_ack. This way it still participates in the handshake even though it does not support dynamic clock gating.
pm_clk_req
Request from the Physical Layer to remove clock gating from the internal logic of the Management Port Gateway. This is an asynchronous signal relative to lclk/Mgmt_clk from the Management Port Gateway's perspective because it is not tied to lclk/Mgmt_clk being available in the Management Port Gateway. Together with mp_clk_ack, it forms a four-way handshake to enable dynamic clock gating in the Management Port Gateway.
When dynamic clock gating is supported, the Management Port Gateway must use this signal to exit clock gating before responding with mp_clk_ack. If dynamic clock gating is not supported, the Physical Layer must tie this signal to 1.
mp_clk_ack
Response from the Management Port Gateway to the PHY acknowledging that its clocks have been ungated in response to pm_clk_req. This signal is asserted only when pm_clk_req is asserted, and de-asserted after pm_clk_req has de-asserted. When dynamic clock gating is not supported by the Management Port Gateway, it must stage pm_clk_req internally for one or more clock cycles and turn it around as mp_clk_ack. This way it still participates in the handshake even though it does not support dynamic clock gating.
When supporting dynamic clock gating of the Management Port Gateway, the PHY must ensure that pulsed signals (e.g., pm_param_done) are delivered only after mp_clk_ack is set, to ensure that the Management Port Gateway saw those pulses.
mp_mgmt_pkt
During a valid RDI data transfer to the PHY, this signal indicates whether the transfer is for an MPM.
0: Link management packet.
1: MPM. Used by the PHY to steer the packet to the correct RDI credit buffer (a short steering sketch follows this table).
pm_mgmt_pkt
During a valid RDI data transfer from the PHY, this signal indicates whether the transfer is for an MPM.
0: Link management packet.
1: MPM. Used by the Management Port Gateway to steer the packet to RxQ buffers or to the D2D Adapter.
Mgmt_clk
Optional clock used for the Configuration interface on the RDI for implementations in which the main RDI clock is not available for Management Transport path initialization.
pm_fatal_error
Set by any sideband link fatal error indication, such as a parity error on a sideband packet. Cleared by a Management Reset.
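As a rough illustration of the packet steering described in Table 10-2, the sketch below (Python; the queue objects and function name are assumptions for illustration, not defined interfaces) routes a transfer from the Management Port Gateway either to the PHY's link-management path or, for an MPM, to the RDI credit buffer of the sideband link that the fixed post-negotiation mapping assigns to its RxQ-ID.

# Illustrative steering of Management Port Gateway transfers in the PHY,
# keyed off mp_mgmt_pkt and mp_rxqid.
def steer_mp_transfer(data: bytes, mp_mgmt_pkt: int, mp_rxqid: int,
                      link_mgmt_path: list, rxqid_to_sb_link: dict) -> None:
    if mp_mgmt_pkt == 0:
        # Link management packet: consumed by the PHY itself.
        link_mgmt_path.append(data)
    else:
        # MPM: steer to the sideband link mapped to this RxQ-ID. The
        # mapping is fixed after management transport negotiation and
        # must point at a link that trained for management transport.
        rxqid_to_sb_link[mp_rxqid].append(data)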
Each side is permitted to internally instantiate clock-crossing FIFOs if needed, as long as it does not
violate the requirements at the interface itself.
It is important to note that back pressure is not possible from the Adapter to the Physical Layer on the
main data path. So any clock-crossing-related logic internal to the Adapter must take this into
consideration.
For example, for a 64-Lane module with a maximum speed of 16 GT/s, the RDI could be 64B wide
running at 2 GHz to be exactly bandwidth matched.
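The arithmetic behind that example is a simple per-direction bandwidth match, checked below (Python):

# 64 Lanes at 16 GT/s is 128 GB/s per direction; a 64B RDI at 2 GHz matches.
lanes, rate_gtps = 64, 16
link_gbytes_per_s = lanes * rate_gtps / 8         # 64 * 16 / 8 = 128 GB/s
rdi_width_bytes, rdi_clk_ghz = 64, 2
rdi_gbytes_per_s = rdi_width_bytes * rdi_clk_ghz  # 64 * 2 = 128 GB/s
assert link_gbytes_per_s == rdi_gbytes_per_s == 128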
Adapter can request removal of clock gating of the Physical Layer by asserting lp_wake_req
(asynchronous to lclk availability in the Physical Layer). All Physical Layer implementations must
respond with a pl_wake_ack (synchronous to lclk). The extent of internal clock ungating when
pl_wake_ack is asserted is implementation-specific, but lclk must be available by this time to
enable RDI signal transitions from the Adapters. The Wake Req/Ack is a full handshake and it must be
used for state transition requests (on lp_state_req or lp_linkerror) when moving away from a
state in which clock gating is permitted. It must also be used for sending packets on the sideband
interface.
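All the req/ack pairs on this interface (lp_wake_req/pl_wake_ack above, pl_clk_req/lp_clk_ack below) follow the same four-way full handshake: the ack may rise only while the req is high, and may fall only after the req has fallen. A minimal cycle-stepped sketch of that contract (Python; the class and its methods are illustrative only):

# Illustrative model of a four-way req/ack handshake on RDI or FDI.
class FourWayHandshake:
    def __init__(self):
        self.req = 0
        self.ack = 0

    def responder_step(self, clocks_ungated: bool) -> None:
        # One clock on the responder side: acknowledge only while req is
        # asserted and the clocks are confirmed ungated; de-assert ack
        # only after req has de-asserted.
        if self.req == 1 and clocks_ungated:
            self.ack = 1
        elif self.req == 0:
            self.ack = 0

    def requester_can_proceed(self) -> bool:
        # State transition requests or sideband packets may only follow
        # a completed handshake.
        return self.req == 1 and self.ack == 1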
Physical Layer is permitted to initiate pl_clk_req/lp_clk_ack handshake at any time and the
Adapter must respond.
3. pl_clk_req must de-assert before lp_clk_ack. It is the responsibility of the Physical Layer to
control the specific scenario of de-assertion, after the required actions for this handshake are
completed.
4. pl_clk_req should not be the only consideration for the Adapter to perform clock gating, it must
take into account pl_state_sts and other protocol-specific requirements before performing
trunk and/or local clock gating.
5. The Physical Layer must use this handshake to ensure transitions of pl_inband_pres have been
observed by the Adapter. Since pl_inband_pres is a level oriented signal (once asserted it
stays asserted during the lifetime of Link operation), the Physical Layer is permitted to let the
signal transition without waiting for lp_clk_ack. When this is done during initial Link bring up, it
is strongly recommended for the Physical Layer to keep pl_clk_req asserted until the state
status transitions away from Reset to a state where clock gating is not permitted.
Figure: Example timing of the pl_clk_req/lp_clk_ack handshake, showing pl_inband_pres asserting while pl_clk_req is held asserted.
6. The Physical Layer must also perform this handshake before transition to LinkError state from
Reset or PM state (when the LinkError transition occurs by the Physical Layer without being
directed by the Adapter). It is permitted to assert pl_clk_req before the state change, in which
case it must stay asserted until the state status transitions. It is also permitted to assert
pl_clk_req after the state status transition, but in this case Physical Layer must wait for
lp_clk_ack before performing another state transition.
7. The Physical Layer must also perform this handshake when the status is PM and remote Link
partner is requesting PM exit. For exit from Reset or PM states to a state that is not LinkError, it is
required to assert pl_clk_req before the status change, and in this case it must stay asserted until
the state status transitions away from Reset or PM.
8. When clock-gated in RESET states, Adapters that rely on dynamic clock gating to save power
must wait in clock gated state for pl_inband_pres=1. The Physical Layer will request clock
gating exit when it transitions pl_inband_pres, and the Adapter must wait for
pl_inband_pres assertion before requesting lp_state_req = ACTIVE. If pl_inband_pres
de-asserts while pl_state_sts = RESET, then the Adapter is permitted to return to clock-gated
state after moving lp_state_req to NOP.
9. Physical Layer must also perform this handshake for sideband traffic to Adapter. When performing
the handshake for pl_cfg transitions, Physical Layer must wait for lp_clk_ack before changing
pl_cfg or pl_cfg_vld. Because pl_cfg can have multiple transitions for a single packet
transfer, it is necessary to make sure that the Adapter clocks are up before transfer begins.
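Rule 9 amounts to gating every pl_cfg/pl_cfg_vld transition on a completed pl_clk_req/lp_clk_ack handshake. The sketch below reuses the FourWayHandshake model from the earlier example (illustrative only; the Adapter's response is folded into the same loop for brevity):

# Physical Layer side: no pl_cfg/pl_cfg_vld transition until lp_clk_ack.
def send_sideband_packet(hs: FourWayHandshake, transfers, drive_pl_cfg) -> None:
    hs.req = 1                                  # assert pl_clk_req
    while not hs.requester_can_proceed():
        hs.responder_step(clocks_ungated=True)  # Adapter ungates, acks
    # A packet can take multiple pl_cfg transfers, so the ack must be
    # seen before the first one.
    for chunk in transfers:
        drive_pl_cfg(chunk)
    hs.req = 0                                  # release after the packet
    hs.responder_step(clocks_ungated=True)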
As indicated in the signal list descriptions, when the Physical Layer is sending data to the Adapter,
there is no backpressure mechanism, and data is transferred whenever pl_valid is asserted. The
Physical Layer is permitted to insert bubbles in the middle of a Flit transfer and the Adapter must be
able to handle that.
IMPLEMENTATION NOTE
On the transmit side, for data sent over the UCIe Link, the Physical Layer must
ensure that if the Adapter has a continuous stream of packets to transmit
(lp_irdy and lp_valid do not de-assert), it does not insert bubbles in valid
frames on the Physical Link.
For the Runtime Link Testing feature with parity insertion, the Adapter as a receiver of
parity bytes is permitted to issue a {ParityFeature.Nak} if software sets up a number
of parity byte insertions (“Number of 64 Byte Inserts” field in the “Error and Link
Testing Control” register) that does not amount to 256B or a multiple of the RDI width
(to save the implementation cost of barrel shifting the parity bytes). For example, if
the RDI width is 64B then either 64B, 128B, or 256B of inserted parity bytes are okay,
but if the RDI width is 256B or larger, then it is better to always have 256B of inserted
parity bytes so that it matches the data transfer granularity of Flits.
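A receiver following that guidance might validate the programmed insertion count before accepting the feature, along the lines of this sketch (Python; the function name is illustrative):

def parity_insert_count_ok(num_64b_inserts: int, rdi_width_bytes: int) -> bool:
    # Accept if the inserted parity bytes amount to 256B or a multiple of
    # the RDI width; otherwise the receiver may issue {ParityFeature.Nak}.
    total_bytes = num_64b_inserts * 64
    return total_bytes == 256 or total_bytes % rdi_width_bytes == 0

# 64B RDI: 64B, 128B, or 256B of inserted parity bytes all line up.
assert all(parity_insert_count_ok(n, 64) for n in (1, 2, 4))
# 256B RDI: only 256B matches the Flit transfer granularity.
assert not parity_insert_count_ok(1, 256) and parity_insert_count_ok(4, 256)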
IMPLEMENTATION NOTE
It is permitted to use lp_irdy as an early indication that the valid data will be
resuming imminently, and the Physical Layer needs to ungate clocks and assert
pl_trdy when it is ready to receive data. A couple of examples are shown in
Figure 10-4 and Figure 10-5. Note that pl_trdy could have asserted as early as
Clock Cycle 1 in Figure 10-4.
Figure: RDI state machine. States: Reset, Active, Retrain, Active.PMNAK, L1, L2, LinkReset, Disabled, and LinkError. Disabled is reachable from any state except LinkError; LinkReset is reachable from any state except Disabled and LinkError.
Steps 3 to 5 are referred to as the “Active Entry handshake” and must be performed for every entry
to Active state. Active.PMNAK to Active transition is not considered here because Active.PMNAK is
only a sub-state of Active.
Figure: Link bring-up flow: after sideband initialization (Stage 1) and Stage 2 completion, the Adapter on each die exits clock gating and requests Active once it is ready to send and receive protocol parameter exchanges.
— If pl_state_sts = Active and lp_state_req = Active and it remains this way for 1 us after
receiving the "PM Request" sideband message, it must respond with the
{LinkMgmt.RDI.Rsp.PMNAK} sideband message.
• If a Physical Layer receives a “PM Response” sideband message in response to a “PM Request”
sideband message, it must transition pl_state_sts on its local RDI to PM (if it is currently in
Active state). If the current state is not Active, no action needs to be taken.
• If a Physical Layer receives a {LinkMgmt.RDI.Rsp.PMNAK} sideband message in response to a
“PM Request” sideband message, it must transition pl_state_sts on its local RDI to
Active.PMNAK state if it is currently in Active state. If it is not in Active state, no action needs to
be taken. The Physical Layer is permitted to retry PM entry handshake (if all conditions of PM
entry are satisfied) at least 2 us after receiving the {LinkMgmt.RDI.Rsp.PMNAK} sideband
message OR if it received a corresponding “PM Request” sideband message from the remote Link
partner.
• PM exit is initiated by the Adapter requesting Active on RDI. This triggers the Physical Layer to
initiate PM exit by sending a {LinkMgmt.RDI.Req.Active} sideband message. Physical Layer must
make sure it has finished any Link retraining steps before it responds with the
{LinkMgmt.RDI.Rsp.Active} sideband message. Figure 10-10 shows an example flow of PM exit
on RDI.
— PM exit handshake completion requires both Physical Layers to send as well as receive a
{LinkMgmt.RDI.Rsp.Active} sideband message. Once this has completed, the Physical Layer
is permitted to transition pl_state_sts to Active on RDI.
— If pl_state_sts = PM and a {LinkMgmt.RDI.Req.Active} sideband message is received, the
Physical Layer must initiate pl_clk_req handshake with the Adapter, and transition
pl_state_sts to Retrain. This must trigger the Adapter to request Active on
lp_state_req (if not already doing so), and this in turn triggers the Physical Layer to send
{LinkMgmt.RDI.Req.Active} sideband message to the remote Link partner. Figure 10-11
shows an example of the L1 exit flow on RDI and its interaction with the LTSM in the Physical
Layer. It is permitted for the LTSM to begin the Link PM exit and retraining flow when a
{LinkMgmt.RDI.Req.Active} sideband message is received or when the Adapter requests
Active on RDI. The timeout counters for the Active Request sideband message handshake
must begin only after LTSM is in the LINKINIT state. L2 exit follows a similar flow for cases in
which graceful exit is required without domain reset; however, the L2 exit is via Reset state
on RDI, and not Retrain. Exit conditions from Reset state apply for L2 exit (i.e., a NOP ->
Active transition is required on lp_state_req for the Physical Layer to exit Reset state on
RDI).
Note that the following figures are examples for L1, and do not show the lp_wake_req,
pl_clk_req handshakes. Implementations must follow the rules outlined for these handshakes in
previous sections.
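The PMNAK and retry timing rules above reduce to two checks against free-running timestamps, sketched below (Python; the function names and the caller-supplied microsecond timestamps are assumptions for illustration):

def should_send_pmnak(pl_state_sts: str, lp_state_req: str,
                      us_since_pm_request: float) -> bool:
    # Respond {LinkMgmt.RDI.Rsp.PMNAK} if, 1 us after receiving the
    # "PM Request" message, status and request are still both Active.
    return (pl_state_sts == "Active" and lp_state_req == "Active"
            and us_since_pm_request >= 1.0)

def may_retry_pm_entry(us_since_pmnak: float,
                       pm_request_from_remote: bool) -> bool:
    # After a PMNAK, retry PM entry (if all PM entry conditions hold) at
    # least 2 us later, or on a corresponding remote "PM Request".
    return us_since_pmnak >= 2.0 or pm_request_from_remote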
Figure: Example L1 exit flow on RDI; the timeout counters for the Active Request sideband message handshake start once the LTSM is in LINKINIT.
Figure: Example Adapter and stack configurations, including (c) two Protocol Stacks behind a Stack Mux and (d) two CXL stacks multiplexed inside the Adapter through an Arb/Mux; FDI sits above the Die-to-Die Adapter and RDI below it.
Table 10-3 lists the FDI signals and their descriptions. All signals are synchronous with lclk.
In Table 10-3:
• pl_* indicates that the signal is driven away from the Die-to-Die Adapter to the Protocol Layer.
• lp_* indicates that the signal is driven away from the Protocol Layer to the Die-to-Die Adapter.
Note: The same signal-naming convention as RDI is used to highlight that RDI signal list is a
proper subset of FDI signal list.
Signal encodings pertaining to ‘Management Transport protocol’ are applicable only when
Management Transport protocol was successfully negotiated on the mainband. Otherwise, those
encodings are reserved. Also, dm_* signals in Table 10-3 are applicable only when supporting
Management Transport path over the mainband (“dm” is an abbreviation for “d2d_adapter-to-
management_port_gateway”).
lp_irdy
Signal indicating that the Protocol Layer potentially has data to send. This must be asserted if lp_valid is asserted and the Protocol Layer wants the Adapter to sample the data.
lp_irdy must not be presented by the Protocol Layer when pl_state_sts is Reset, except when the status transitions from LinkError to Reset. On a LinkError to Reset transition, it is permitted for lp_irdy to be asserted for a few clocks, but it must be de-asserted eventually. The Adapter must ignore lp_irdy when the status is Reset.
lp_valid
Protocol Layer to Adapter indication that data is valid on the corresponding lp_data bytes.
lp_data[NBYTES-1:0][7:0]
Protocol Layer to Adapter data, where NBYTES equals the number of bytes determined by the data width for the FDI instance.
lp_retimer_crd
When asserted at a rising clock edge, it indicates a single credit return for the Retimer Receiver buffer. Each credit corresponds to 256B of mainband data (including Flit header and CRC, etc.). This signal must NOT assert if a Retimer is not present.
On FDI, this is an optional signal. It is permitted to have the Receiver buffers in the Protocol Layer for Raw Format only. If this is not exposed to the Protocol Layer, the Adapter must track credits at 256B granularity even for Raw Format and return credits to the Physical Layer on RDI.
When this is exposed on FDI, the Adapter must obtain the initial credits knowledge through other implementation-specific means in order to advertise it to the remote Link partner during parameter exchanges.
lp_corrupt_crc
This signal is only applicable for CXL.cachemem in UCIe Flit mode (i.e., the Adapter doing Retry) for CXL 256B Flit mode. It is meant as a latency optimization that enables detection and containment of viral or poison by using the Adapter to corrupt the CRC of the outgoing Flit. It is recommended to corrupt the CRC by performing a bitwise XOR of the computed CRC with the syndrome 138Eh (a short sketch of this operation follows Table 10-3). The syndrome was computed such that no 1-bit or 2-bit errors alias to this syndrome, and it has the least probability of aliasing with 3-bit errors.
For Standard 256B Flits, the Protocol Layer asserts this along with lp_valid for the last chunk of the Flit that needs containment. The Adapter corrupts the CRC for both of the 128B halves of the Flit which had this set. It also must make sure to overwrite this Flit (with the next Flit sent by the Protocol Layer) in the Tx Retry buffer.
For Latency-Optimized 256B Flits, the Protocol Layer asserts this along with lp_valid for the last chunk of the 128B Flit half that needs containment. If lp_corrupt_crc is asserted on the first 128B half of the Flit, the Protocol Layer must assert it on the second 128B half of the Flit as well. The very next Flit from the Protocol Layer after this signal has been asserted must carry the information relevant for viral, as defined in the CXL specification. If this was asserted on the second 128B half of the Flit only, it is the responsibility of the Protocol Layer to send the first 128B half exactly as before, and insert the viral information in the second half of the Flit. The Adapter corrupts the CRC for the 128B half of the Flit which had this set. It also must make sure to overwrite this Flit (with the next Flit sent by the Protocol Layer) in the Tx Retry buffer.
lp_dllp[NDLLP-1:0]
Protocol Layer to Adapter transfer of DLLP bytes. This is not used for 68B Flit mode, CXL.cachemem, or Streaming protocols. For a 64B data path on lp_data, it is recommended to assign NDLLP >= 8, so that 1 DLLP per Flit can be transferred from the Protocol Layer to the Adapter on average. The Adapter is responsible for inserting the DLLP into DLP bytes 2:5 if the Flit packing rules permit it. See Section 10.2.4.1 for additional rules.
lp_dllp_valid
Indicates a valid DLLP transfer on lp_dllp. DLLP transfers are not subject to backpressure by pl_trdy (the Adapter must have storage for the different types of DLLP, and this can be overwritten so that the latest DLLPs are sent to the remote Link partner). DLLP transfers are subject to backpressure by pl_stallreq: the Protocol Layer must stop DLLP transfers at a DLLP Flit-aligned boundary before giving lp_stallack or requesting PM.
lp_stream[7:0]
Protocol Layer to Adapter signal that indicates the stream ID to use with data. Each stream ID maps to a unique protocol and stack. It is relevant only when lp_valid is 1.
00h: Reserved
01h: Stack 0: PCIe
02h: Stack 0: CXL.io
03h: Stack 0: CXL.cachemem
04h: Stack 0: Streaming protocol
05h: Stack 0: Management Transport protocol
11h: Stack 1: PCIe
12h: Stack 1: CXL.io
13h: Stack 1: CXL.cachemem
14h: Stack 1: Streaming protocol
15h: Stack 1: Management Transport protocol
Other encodings are Reserved.
pl_trdy
The Adapter is ready to accept data. Data is accepted by the Adapter when pl_trdy, lp_valid, and lp_irdy are asserted at the rising edge of lclk. This signal must be asserted only if pl_state_sts is Active, or when performing the pl_stallreq/lp_stallack handshake when pl_state_sts is LinkError (see Section 10.3.3.7).
pl_data[NBYTES-1:0][7:0]
Adapter to Protocol Layer data, where NBYTES equals the number of bytes determined by the data width for the FDI instance.
pl_retimer_crd
When asserted at a rising clock edge, it indicates a single credit return from the Retimer. Each credit corresponds to 256B of mainband data (including Flit header and CRC, etc.). This signal must NOT assert if a Retimer is not present.
On FDI, this is an optional signal. It is permitted to expose these credits to the Protocol Layer for Raw Format only. If this is not exposed to the Protocol Layer, the Adapter must track credits at 256B granularity even for Raw Format and backpressure the Protocol Layer using pl_trdy.
When this is exposed on FDI, the Adapter converts the initial credits received from the Retimer over sideband into credit returns to the Protocol Layer on this bit after the Adapter LSM has moved to Active state.
pl_dllp[NDLLP-1:0]
Adapter to Protocol Layer transfer of DLLP bytes. This is not used for 68B Flit mode, CXL.cachemem, or Streaming protocols. For a 64B data path on pl_data, it is recommended to assign NDLLP >= 8, so that 1 DLLP per Flit can be transferred from the Adapter to the Protocol Layer, on average. The Adapter is responsible for extracting the DLLP from DLP bytes 2:5 if a Flit Marker is not present. The Adapter is also responsible for indicating the Optimized_Update_FC format by setting pl_dllp_ofc = 1 for the corresponding transfer on FDI.
pl_dllp_valid
Indicates a valid DLLP transfer on pl_dllp. DLLPs can be transferred to the Protocol Layer whenever valid Flits can be transferred on pl_data. There is no backpressure, and the Protocol Layer must always sink DLLPs.
pl_stream[7:0]
Adapter to Protocol Layer signal that indicates the stream ID to use with data. Each stream ID maps to a unique protocol. It is relevant only when pl_valid is 1.
00h: Reserved
01h: Stack 0: PCIe
02h: Stack 0: CXL.io
03h: Stack 0: CXL.cachemem
04h: Stack 0: Streaming protocol
05h: Stack 0: Management Transport protocol
11h: Stack 1: PCIe
12h: Stack 1: CXL.io
13h: Stack 1: CXL.cachemem
14h: Stack 1: Streaming protocol
15h: Stack 1: Management Transport protocol
Other encodings are Reserved.
pl_flit_cancel
Adapter to Protocol Layer indication to dump a Flit. This enables latency optimizations on the Receiver data path when CRC checking is enabled in the Adapter. It is not applicable for Raw Format or 68B Flit Format.
For Standard 256B Flits, a fixed number of clock cycles of delay is required between the last chunk of a Flit transfer and the assertion of pl_flit_cancel. This delay is fixed to be 1 cycle (i.e., the cycle after the last chunk transfer of a Flit). When this signal is asserted, the Protocol Layer must not consume the associated Flit.
For Latency-Optimized 256B Flits, a fixed number of clock cycles of delay is required between the last chunk of a 128B half-Flit transfer and the assertion of pl_flit_cancel. This delay is fixed to be 1 cycle (i.e., the cycle after the last transfer of the corresponding 128B chunk). When this signal is asserted, the Protocol Layer must not consume the associated Flit half.
When this mode is supported, the Protocol Layer must support it for all applicable Flit Formats associated with the corresponding protocol. The Adapter must guarantee this to be a single-cycle pulse when dumping a Flit or Flit half. It is the responsibility of the Adapter to ensure that canceled Flits or Flit halves are eventually replayed on the interface without cancellation, in the correct order, once they pass CRC after Retry, etc. See Section 10.2.5 for examples.
When operating in UCIe Flit mode, it is permitted to use this signal to also cancel valid NOP Flits, to prevent forwarding them to the Protocol Layer. However, for interoperability, if a Protocol Layer receives a NOP Flit without a corresponding pl_flit_cancel, it must discard that Flit.
lp_linkerror
Protocol Layer to Adapter indication that an error has occurred which requires the Link to go down. The Adapter must propagate this request to RDI, and move the Adapter LSMs (and CXL vLSMs, if applicable) to LinkError state once RDI is in LinkError state. It must stay there as long as lp_linkerror = 1. The reason for having this indication decoupled from regular state transitions is to allow immediate action on the part of the Protocol Layer and Adapter, in order to provide the quickest path for error containment when applicable (for example, a viral error escalation could map to the LinkError state).
pl_inband_pres
Adapter to Protocol Layer indication that the Die-to-Die Link has finished negotiation of parameters with the remote Link partner and is ready for transitioning the FDI Link State Machine (LSM) to Active.
Once it transitions to 1b, this must stay 1b until FDI moves to Active or LinkError. It stays asserted while FDI is in Retrain, Active, Active.PMNAK, L1, or L2. It must de-assert during LinkReset, Disabled, or LinkError states.
pl_error
Adapter to Protocol Layer indication that it has detected a framing-related error. It is pipeline-matched with the receive data path. It must also assert if pl_error was asserted on RDI by the Physical Layer for a Flit that the Adapter is forwarding to the Protocol Layer.
In UCIe Flit mode, it is permitted for the Protocol Layer to use the pl_error indication to log correctable errors when Retry is enabled in the Adapter. The Adapter must finish any partial Flits sent to the Protocol Layer and assert pl_flit_cancel in order to prevent consumption of that Flit by the Protocol Layer. The Adapter must initiate Link Retrain on RDI following this, if it was a framing error detected by the Adapter.
In UCIe Flit mode, if Retry is disabled, the Adapter is responsible for mapping internally detected framing errors, or a pl_error received from the Physical Layer, to an Uncorrectable Internal Error, and escalating it as pl_trainerror if the mask and severity registers permit the escalation.
If the Link is operating in Raw Format, the Adapter has no internal detection of framing errors; it just forwards any pl_error indication received from the Physical Layer on FDI such that it is pipeline-matched to the data path.
It is a pulse indication that can occur only when the FDI receiver is Active (i.e., pl_rx_active_req = lp_rx_active_sts = 1).
pl_cerror
Adapter to Protocol Layer indication that a correctable error was detected that does not affect the data path. The Protocol Layer must OR the pl_error and pl_cerror signals for Correctable Error logging. Errors logged in the Correctable Error Status register are mapped to this signal if the corresponding mask bit in the Correctable Error Mask register is cleared to 0.
It is a pulse of one or more cycles that can occur in any FDI state. If it is a state in which clock gating is permitted, it is the responsibility of the Adapter to perform the clock gating exit handshake with the Protocol Layer before asserting this signal. Clock gating can resume after pl_cerror is de-asserted and all other conditions permitting clock gating have been met.
pl_nferror
Adapter to Protocol Layer indication that a non-fatal error was detected. This is used by the Protocol Layer for error logging and corresponding escalation to software. The Adapter must OR any internally detected errors with the pl_nferror on RDI and forward the result on FDI. Errors logged in the Uncorrectable Error Status register are mapped to this signal if the corresponding Severity and Mask bits are cleared to 0.
It is a pulse of one or more cycles that can occur in any FDI state. If it is a state in which clock gating is permitted, it is the responsibility of the Adapter to perform the clock gating exit handshake with the Protocol Layer before asserting this signal. Clock gating can resume after pl_nferror is de-asserted and all other conditions permitting clock gating have been met.
pl_trainerror
Indicates a fatal error from the Adapter. The Adapter must transition pl_state_sts to LinkError if it is not already in LinkError state. (Note that the Adapter first takes RDI to LinkError, and that LinkError is eventually propagated to all the FDI states.) Implementations are permitted to map to this signal any fatal error that requires upper-layer escalation (or interrupt generation), depending on system-level requirements. Errors logged in the Uncorrectable Error Status register are mapped to this signal if the corresponding Severity bit is set to 1 and the corresponding Mask bit is cleared to 0.
It is a level signal that can assert in any FDI state but stays asserted until FDI exits the LinkError state to the Reset state.
pl_rx_active_req
The Adapter asserts this signal to request that the Protocol Layer open its Receiver's data path and get ready to receive protocol data or Flits. The rising edge of this signal must occur when pl_state_sts is Reset, Retrain, or Active. Together with lp_rx_active_sts, it forms a four-way handshake. See Section 10.2.7 for rules related to this handshake.
pl_protocol[3:0]
Adapter indication to the Protocol Layer of the protocol that was negotiated during training.
0000b: PCIe without Management Transport
0011b: CXL.1 [Single protocol, i.e., CXL.io] without Management Transport
0100b: CXL.2 [Multi-protocol, Type 1 device] without Management Transport
0101b: CXL.3 [Multi-protocol, Type 2 device] without Management Transport
0110b: CXL.4 [Multi-protocol, Type 3 device] without Management Transport
0111b: Streaming protocol without Management Transport
1000b: PCIe with Management Transport
1001b: Management Transport
1011b: CXL.1 [Single protocol, i.e., CXL.io] with Management Transport
1100b: CXL.2 [Multi-protocol, Type 1 device] with Management Transport
1101b: CXL.3 [Multi-protocol, Type 2 device] with Management Transport
1110b: CXL.4 [Multi-protocol, Type 3 device] with Management Transport
1111b: Streaming protocol with Management Transport
Other encodings are Reserved.
pl_protocol_flitfmt[3:0]
This indicates the negotiated Format. See Chapter 3.0 for the definitions of these formats.
0001b: Format 1: Raw Format
0010b: Format 2: 68B Flit Format
0011b: Format 3: Standard 256B End Header Flit Format
0100b: Format 4: Standard 256B Start Header Flit Format
0101b: Format 5: Latency-Optimized 256B without Optional Bytes Flit Format
0110b: Format 6: Latency-Optimized 256B with Optional Bytes Flit Format
Other encodings are Reserved.
pl_stallreq
Adapter request to the Protocol Layer to flush all Flits for a state transition and not prepare any new Flits. See Section 10.2.6 for details.
lp_stallack
Protocol Layer to Adapter indication that the Flits are aligned and stalled (if pl_stallreq was asserted). It is strongly recommended that this response logic be on a global free-running clock, so that the Protocol Layer can respond to pl_stallreq with lp_stallack even if other significant portions of the Protocol Layer are clock gated.
pl_phyinrecenter
Adapter indication to the Protocol Layer that the Link is training or retraining (i.e., RDI has pl_phyinrecenter asserted or the Adapter LSM has not moved to Active yet). If this is asserted during a state where clock gating is permitted, the pl_clk_req/lp_clk_ack handshake must be performed with the upper layer. The upper layers are permitted to use this to update the "Link Training/Retraining" bit in the UCIe Link Status register.
pl_phyinl1
Adapter indication to the Protocol Layer that the Physical Layer is in L1 power management state (i.e., RDI is in L1 state).
pl_phyinl2
Adapter indication to the Protocol Layer that the Physical Layer is in L2 power management state (i.e., RDI is in L2 state).
The Protocol Layer must only consider this signal to be relevant when the FDI state is Active or Retrain. This is the total width across all Active modules for the corresponding FDI instance.
pl_clk_req
Request from the Adapter to remove clock gating from the internal logic of the Protocol Layer. This is an asynchronous signal from the Protocol Layer's perspective, since it is not tied to lclk being available in the Protocol Layer. Together with lp_clk_ack, it forms a four-way handshake to enable dynamic clock gating in the Protocol Layer.
When dynamic clock gating is supported, the Protocol Layer must use this signal to exit clock gating before responding with lp_clk_ack. If dynamic clock gating is not supported, it is permitted for the Adapter to tie this signal to 1b.
lp_clk_ack
Response from the Protocol Layer to the Adapter acknowledging that its clocks have been ungated in response to pl_clk_req. This signal is only asserted when pl_clk_req is asserted, and de-asserted after pl_clk_req has de-asserted.
When dynamic clock gating is not supported by the Protocol Layer, it must stage pl_clk_req internally for one or more clock cycles and turn it around as lp_clk_ack. This way it still participates in the handshake even though it does not support dynamic clock gating.
lp_wake_req
Request from the Protocol Layer to remove clock gating from the internal logic of the Adapter. This is an asynchronous signal relative to lclk from the Adapter's perspective, since it is not tied to lclk being available in the Adapter. Together with pl_wake_ack, it forms a four-way handshake to enable dynamic clock gating in the Adapter.
When dynamic clock gating is supported, the Adapter must use this signal to exit clock gating before responding with pl_wake_ack. If dynamic clock gating is not supported, it is permitted for the Protocol Layer to tie this signal to 1b.
pl_wake_ack
Response from the Adapter to the Protocol Layer acknowledging that its clocks have been ungated in response to lp_wake_req. This signal is only asserted after lp_wake_req has asserted, and is de-asserted after lp_wake_req has de-asserted.
When dynamic clock gating is not supported by the Adapter, it must stage lp_wake_req internally for one or more clock cycles and turn it around as pl_wake_ack. This way it still participates in the handshake even though it does not support dynamic clock gating.
pl_cfg[NC-1:0]
This is the sideband interface from the Adapter to the Protocol Layer. See Chapter 7.0 for details. NC is the width of the interface; supported values are 8, 16, and 32.
Register accesses must be implemented by hardware to be atomic regardless of the width of the interface (i.e., all 32 bits of a register must be updated in the same cycle for a 32-bit register write, and similarly all 64 bits of a register must be updated in the same cycle for a 64-bit register write).
pl_cfg_vld
When asserted, indicates that pl_cfg has valid information that should be consumed by the Protocol Layer.
pl_cfg_crd
Credit return from the Adapter to the Protocol Layer for sideband packets. Each credit corresponds to 64 bits of header and 64 bits of data. Even transactions that do not carry data, or carry only 32 bits of data, consume the same credit; the Receiver returns the credit once the corresponding transaction has been processed or deallocated from its internal buffers. See Section 7.1.3.1 for additional flow control rules. A value of 1 sampled at a rising clock edge indicates a single credit return.
Because the advertised credits are design parameters, the Protocol Layer transmitter updates its credit counters with the initial credits on domain reset exit, and no initialization credits are returned over the interface.
Credit returns must follow the same clock gating exit handshake rules as the sideband packets, to ensure that no credit returns are dropped by the receiver of the credit returns.
lp_cfg[NC-1:0]
This is the sideband interface from the Protocol Layer to the Adapter. See Chapter 7.0 for details. NC is the width of the interface; supported values are 8, 16, and 32.
Register accesses must be implemented by hardware to be atomic regardless of the width of the interface (i.e., all 32 bits of a register must be updated in the same cycle for a 32-bit register write, and similarly all 64 bits of a register must be updated in the same cycle for a 64-bit register write).
lp_cfg_vld
When asserted, indicates that lp_cfg has valid information that should be consumed by the Adapter.
lp_cfg_crd
Credit return from the Protocol Layer to the Adapter for sideband packets. Each credit corresponds to 64 bits of header and 64 bits of data. Even transactions that do not carry data, or carry only 32 bits of data, consume the same credit; the Receiver returns the credit once the corresponding transaction has been processed or deallocated from its internal buffers. See Section 7.1.3.1 for additional flow control rules. A value of 1 sampled at a rising clock edge indicates a single credit return.
Because the advertised credits are design parameters, the Adapter transmitter updates its credit counters with the initial credits on domain reset exit, and no initialization credits are returned over the interface.
Credit returns must follow the same clock gating exit handshake rules as the sideband packets, to ensure that no credit returns are dropped by the receiver of the credit returns.
dm_param_stack_count[N-1:0]
Number of stacks that successfully negotiated Management Transport protocol. This field is sampled only when the dm_param_exchange_done signal is asserted. If the 68B Flit Format was finalized, this field must be cleared to 00b.
00b: 0 stacks
01b: 1 stack
10b: 2 stacks
Others: Reserved
N = 1 for a single stack and 2 for 2 stacks.
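The lp_corrupt_crc entry in Table 10-3 recommends corrupting the transmitted CRC by XORing the computed value with the syndrome 138Eh, chosen so that no 1-bit or 2-bit error aliases to it. A minimal sketch of that operation (Python; computed_crc16 stands in for the Adapter's actual CRC computation, which is defined elsewhere in this specification):

CORRUPT_SYNDROME = 0x138E  # no 1-bit or 2-bit error aliases to this value

def outgoing_crc(computed_crc16: int, corrupt: bool) -> int:
    # Return the CRC placed in the Flit: the computed value, or the
    # deliberately corrupted value when viral/poison containment is
    # signaled via lp_corrupt_crc.
    assert 0 <= computed_crc16 <= 0xFFFF
    return computed_crc16 ^ CORRUPT_SYNDROME if corrupt else computed_crc16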
Each side is permitted to instantiate clock crossing FIFOs internally if needed, as long as it does not
violate the requirements at the interface itself.
It is important to note that there is no back pressure possible from the Protocol Layer to the Adapter
on the main data path. So any clock crossing related logic internal to the Protocol Layer must take
this into consideration.
Supporting the clock gating exit handshakes even in the LinkError state will enable error handlers to make sure the Link is not stuck in a LinkError state, if the intent is to save power when a Link is in an error state.
Protocol Layer can request removal of clock gating of the Adapter by asserting lp_wake_req
(asynchronous to lclk availability in the Adapter). All Adapter implementations must respond with a
pl_wake_ack (synchronous to lclk). The extent of internal clock ungating when pl_wake_ack is
asserted is implementation-specific, but lclk must be available by this time to enable FDI transitions
from the Protocol Layers. The Wake Req/Ack is a full handshake and it must be used for state
transition requests (on lp_state_req or lp_linkerror) when moving away from a state in which
clock gating is permitted. It must also be used for sending packets on the sideband interface.
Adapter is allowed to initiate pl_clk_req/lp_clk_ack handshake at any time and the Protocol
Layer must respond.
4. pl_clk_req should not be the only consideration for the Protocol Layer to perform clock gating,
it must take into account pl_state_sts and other protocol-specific requirements before
performing trunk and/or local clock gating.
5. The Adapter must use this handshake to ensure transitions of pl_inband_pres, pl_phyinl1,
pl_phyinl2, pl_phyinrecenter, and pl_rx_active_req have been observed by the
Protocol Layer. Since these are level oriented signals, the Adapter is permitted to let the signal
transition without waiting for lp_clk_ack. When this is done during initial Link bring up, it is
strongly recommended for the Adapter to keep pl_clk_req asserted until the state status
transitions away from Reset to a state where clock gating is not permitted or until the state status
is Reset and pl_inband_pres de-asserts.
Figure: Example timing of the pl_clk_req/lp_clk_ack handshake on FDI around a pl_inband_pres transition.
6. The Adapter must also perform this handshake before transition to LinkError state from Reset,
LinkReset, Disabled or PM state (especially when the LinkError transition occurs by the Adapter
without being directed by the Protocol Layer). It is permitted to assert pl_clk_req before the
state change, in which case it must stay asserted until the state status transitions. It is also
permitted to assert pl_clk_req after the state status transition, but in this case Adapter must
wait for lp_clk_ack before performing another state transition.
7. The Adapter must also perform this handshake when the status is PM and remote Link partner is
requesting PM exit. For exit from Reset, LinkReset, Disabled or PM states to a state that is not
LinkError, it is required to assert pl_clk_req before the status change, and in this case it must
stay asserted until the state status transitions away from Reset or PM.
8. The Adapter must also perform this handshake for sideband transfers from the Adapter to the
Protocol Layer. When performing the handshake for pl_cfg transitions, Adapter must wait for
lp_clk_ack before changing pl_cfg or pl_cfg_vld. Because pl_cfg can have multiple
transitions for a single packet transfer, it is necessary to make sure that the Protocol Layer clocks
are up before transfer begins.
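A minimal sketch of the pl_clk_req/lp_clk_ack full handshake implied by the rules above (Python; names and structure are illustrative assumptions, not a definitive implementation):

    class ClockUngateHandshake:
        """Toy model of the pl_clk_req/lp_clk_ack full handshake."""

        def __init__(self):
            self.pl_clk_req = False
            self.lp_clk_ack = False

        # Adapter side
        def adapter_request(self):
            assert not self.lp_clk_ack      # req rises only while ack is low
            self.pl_clk_req = True

        def adapter_release(self):
            assert self.lp_clk_ack          # req falls only after ack has risen
            self.pl_clk_req = False

        # Protocol Layer side, evaluated once per lclk cycle
        def protocol_layer_cycle(self):
            if self.pl_clk_req and not self.lp_clk_ack:
                # implementation-specific trunk/local clock ungating goes here
                self.lp_clk_ack = True
            elif not self.pl_clk_req and self.lp_clk_ack:
                self.lp_clk_ack = False     # clock gating may resume afterwards

    hs = ClockUngateHandshake()
    hs.adapter_request();  hs.protocol_layer_cycle()   # ack rises
    hs.adapter_release();  hs.protocol_layer_cycle()   # ack falls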
When clock-gated in Reset states, Protocol Layers that rely on dynamic clock gating to save power
must wait in clock gated state for pl_inband_pres=1. The Adapter will request clock gating exit
when it transitions pl_inband_pres, and the Protocol Layer must wait for pl_inband_pres
assertion before requesting lp_state_req = ACTIVE. If pl_inband_pres de-asserts while
pl_state_sts = Reset, then the Protocol Layer is permitted to return to clock-gated state after
moving lp_state_req to NOP.
bubbles in the middle of a Flit transfer (i.e., lp_valid and lp_irdy must be asserted continuously
until the Flit transfer is complete. Of course, data transfer can stall because of pl_trdy de-
assertion).
[Timing diagram: Flit transfer over clock cycles 0 through 6; lp_irdy and lp_valid remain asserted while lp_data (Dat0, Dat1, Dat2) stalls on pl_trdy de-assertion]
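The transmit-side rule (no bubbles once a Flit transfer starts; stalls come only from pl_trdy) can be illustrated with a sketch (Python; the chunk list and the pl_trdy iterator are hypothetical stand-ins for per-cycle signal sampling):

    def transfer_flit(chunks, pl_trdy_stream):
        """Send one Flit: lp_valid/lp_irdy stay asserted until the last
        chunk is accepted; the only stall source is pl_trdy de-assertion."""
        sent, idx = [], 0
        for pl_trdy in pl_trdy_stream:
            lp_valid = lp_irdy = True       # held high: no bubbles mid-Flit
            if pl_trdy:                     # chunk accepted this cycle
                sent.append(chunks[idx])
                idx += 1
                if idx == len(chunks):
                    break                   # Flit complete; irdy may now drop
        return sent

    # A 3-chunk Flit stalled by pl_trdy on the second cycle:
    print(transfer_flit(["Dat0", "Dat1", "Dat2"],
                        iter([True, False, True, True])))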
As indicated in the signal list descriptions, when the Adapter is sending data to the Protocol Layer, there is
no back-pressure mechanism, and data is transferred whenever pl_valid is asserted. The Adapter
is permitted to insert bubbles in the middle of a Flit transfer, and the Protocol Layer must be able to
handle that.
• The Protocol Identifier corresponding to D2D Adapter in the Flit Header overlaps with the Flit
usage of NOP Flits defined in PCIe and CXL specifications. The Adapter must check for available
DLLPs in these Flits as well. All 0 bits in the DLLP byte positions indicate a NOP DLLP, and must
not be forwarded to the Protocol Layer.
Figure 10-15 shows an example timing relationship for pl_flit_cancel and pl_data for Latency-
Optimized Flits when the first Flit half fails the CRC check. Both Flit halves are canceled by the Adapter in
this example by asserting pl_flit_cancel one clock after the last chunk transfer of the
corresponding Flit half. It is permitted for the Adapter to de-assert pl_valid on clock cycles 5 and 6
instead of canceling that Flit half; however, this might have implications for meeting physical design
timing margins in the Adapter. The use of pl_flit_cancel allows the Adapter to perform the CRC
check on the side, without putting the CRC logic in the critical timing path of the data flow, thus
permitting higher-frequency operation. In the example shown, after the replay flow
the entire Flit is transferred to the Protocol Layer without canceling, as the CRC checks pass.
Figure 10-16 and Figure 10-17 show examples of two possible implementations of timing relationship
for pl_flit_cancel and pl_data for Latency-Optimized Flits when the second Flit half fails CRC
check. In both cases, the first half of the Flit is consumed by the Protocol Layer because it is not
canceled by the Adapter (the data transferred on clock cycles 3 and 4).
In the first case (shown in Figure 10-16), after the replay flow, CRC passes, and the Adapter ensures
that the Protocol Layer does not re-consume the first half again by asserting pl_flit_cancel for it.
In this case, pl_valid asserts for the entire Flit, but only the second half is consumed because the
first half was canceled on clock cycle (n+2).
In the second case (shown in Figure 10-17), after the replay flow, CRC passes, and the Adapter
ensures that the Protocol Layer does not re-consume the first half again by not asserting pl_valid
for it.
Figure 10-18 shows an example for a Standard 256B Flit. In this case, the CRC bytes are packed
toward the end of the Flit and thus a CRC error on either of the two halves cancels the entire Flit.
After replay flow, CRC passes, and the entire Flit is sent to the Protocol Layer without canceling it.
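One way to visualize the receive-side contract for pl_flit_cancel is the sketch below (Python; the one-entry staging buffer is an illustrative assumption about how a Protocol Layer could hold a Flit half until the cancel window passes):

    from collections import deque

    class FlitCancelReceiver:
        """Hold each received Flit half for one clock; drop it if
        pl_flit_cancel asserts in that window, else consume it."""

        def __init__(self):
            self.staged = None          # Flit half awaiting its cancel window
            self.consumed = deque()     # Flit halves handed to protocol logic

        def clock(self, pl_valid, flit_half, pl_flit_cancel):
            # Resolve the half staged on the previous clock first.
            if self.staged is not None:
                if not pl_flit_cancel:
                    self.consumed.append(self.staged)
                self.staged = None      # canceled halves are simply dropped
            if pl_valid:
                self.staged = flit_half

    # Second Flit half fails CRC (as in Figure 10-16/10-17):
    rx = FlitCancelReceiver()
    rx.clock(True,  "half0", False)     # half 0 staged
    rx.clock(True,  "half1", False)     # half 0 consumed; half 1 staged
    rx.clock(False, None,    True)      # half 1 canceled pending replay
    assert list(rx.consumed) == ["half0"]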
[State diagram: FDI/Adapter LSM states Reset, Active (with Active.PMNAK sub-state), Retrain, L1, L2, LinkReset, Disabled, and LinkError; Disabled is reachable from any state except LinkError, and LinkReset from any state except Disabled and LinkError]
When CXL is sent over UCIe, ARB/MUX functionality is performed by the Adapter and CXL vLSMs are
exposed on FDI. Although ALMPs are transmitted over mainband, the interface to the Protocol Layer is
FDI and it follows the rules of Rx_active_req/Sts Handshake as well.
Step 3 through Step 6 constitute the “Active Entry Handshake” on FDI and must be performed for
every entry to Active state. Active.PMNAK to Active transition is not considered here because
Active.PMNAK is only a sub-state of Active.
[Sequence diagram: Active entry flow across Protocol Layer, Adapter, and Physical Layer on Die 0 and Die 1, annotated "Stage 2 Complete"]
request could have been from either side depending on the configuration. Protocol Layer must
continue receiving protocol data or Flits while the status is Active or Active.PMNAK.
— DP Protocol Layer for PCIe or CXL is permitted to change request from PM to Active without
waiting for PM or Active.PMNAK (the DP FDI will never have pl_state_sts=Active.PMNAK
since it does not send “PM Request” sideband messages); however, it is still possible for the
Adapter to initiate a stallreq/ack and complete PM entry if it was in the process of committing
to PM entry when the Protocol Layer changed its request. In this scenario, the Protocol Layer
will see pl_state_sts transition to PM and it is permitted to continue requesting the new
state.
— If the resolution is LinkError, then the Link is down and it resets any outstanding PM
handshakes.
• The Adapter (UP port only for CXL or PCIe protocols) initiates a “PM request” sideband message once it
samples a PM request on lp_state_req, has completed the StallReq/Ack handshake with
the corresponding Protocol Layer, and its Retry buffer is empty of Flits from the Protocol Layer that
is requesting PM (all pending Acks have been received).
• If the Adapter LSM moves to Retrain while waiting for a “PM Response” sideband message, it must
wait for the response. Once the response is received, it must transition back to Active before
requesting a new PM entry. Note that the transition to Active requires Active Entry handshake
with the remote Link partner, and that will cause the remote partner to exit PM. If the Adapter
LSM receives a “PM Request” sideband message after it has transitioned to Retrain, it must
immediately respond with {LinkMgmt.Adapter0.Rsp.PMNAK}.
Note: The precise timing of the remote Link partner that is observing Link Retrain is
unknown; thus, the safer thing to do is to go to Active and redo the PM handshake
when necessary for this scenario. There is a small probability that there might be an
exit from PM and re-entry back in PM under certain scenarios.
• Once the Adapter receives a “PM request” sideband message, it must respond to it within 2 us
(the time is only counted while the Adapter LSM is in Active state); a condensed sketch of this
decision logic follows this list:
— If its local Protocol Layer is requesting PM, it must respond with the corresponding “PM
Response” sideband message after finishing the StallReq/Ack handshake with its Protocol
Layer and after its Retry buffer is empty. If the current status is not PM, it must transition
pl_state_sts to PM after responding to the sideband message.
— If the current pl_state_sts = PM, it must respond with a “PM Response” sideband message.
— If pl_state_sts = Active and lp_state_req = Active, and it remains this way for 1 us after
receiving the “PM Request” sideband message, it must respond with a
{LinkMgmt.Adapter0.Rsp.PMNAK} sideband message. The time is only counted while all the
relevant state machines are in Active state.
• If the Adapter receives a “PM Response” sideband message in response to a “PM Request”
sideband message, it must transition pl_state_sts on its local FDI to PM (if it is currently in
Active state).
• If the Adapter receives a {LinkMgmt.Adapter0.Rsp.PMNAK} sideband message in response to a
“PM Request” sideband message, it must transition pl_state_sts on its local FDI to
Active.PMNAK state if it is currently in Active state. If it is not in Active state, no action needs to
be taken. It is permitted to retry the PM entry handshake (if all conditions of PM entry are satisfied) at
least 2 us after receiving the {LinkMgmt.Adapter0.Rsp.PMNAK} sideband message OR if it
received a corresponding “PM Request” sideband message from the remote Link partner.
• PM exit is initiated by the Protocol Layer requesting Active on FDI. Once RDI is in Active, this triggers
the Adapter to initiate PM exit by performing the Active Entry handshakes on sideband.
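The condensed sketch referenced above, for the Adapter's response to a received “PM request” (Python; state values are illustrative strings, and the timing argument abstracts the 1 us/2 us counters):

    def respond_to_pm_request(local_lp_state_req, pl_state_sts, us_in_active):
        """Decide the response to a received "PM Request" sideband message.
        The full response is due within 2 us of Adapter LSM time in Active."""
        if local_lp_state_req == "PM":
            # After finishing StallReq/Ack and emptying the Retry buffer;
            # pl_state_sts then moves to PM if it is not already there.
            return "PM Response"
        if pl_state_sts == "PM":
            return "PM Response"
        if (pl_state_sts == "Active" and local_lp_state_req == "Active"
                and us_in_active >= 1.0):
            return "LinkMgmt.Adapter0.Rsp.PMNAK"
        return None     # keep waiting, within the 2 us response budget

    assert respond_to_pm_request("Active", "Active", 1.5) == "LinkMgmt.Adapter0.Rsp.PMNAK"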
Figure 10-24 shows an example flow of PM exit on FDI when Adapter LSM is exposed.
Note that the following figures are examples and do not show the lp_wake_req, pl_clk_req, and/
or pl_rx_active_req handshakes. Implementations must follow the rules outlined for these
handshakes in previous sections.
[Sequence diagrams: PM exit flows across Protocol Layer, Adapter, and Physical Layer on Die 0 and Die 1]
Because Active.PMNAK is a sub-state of Active, all rules that apply for Active are also applicable for
Active.PMNAK; however, the state status cannot move from Active.PMNAK directly to L1 or L2, due to
the rules requiring the Upper Layer to request a transition to Active before requesting PM again.
For every data transfer, the Least Significant Byte from the corresponding Flit Chunk is mapped to
Byte 0 on FDI (or RDI), the next Byte from the Flit is mapped to Byte 1 on FDI (or RDI), and so on.
Within each Byte, bit 0 of the Byte from the Flit maps to bit 0 of the corresponding Byte on FDI (or
RDI), and so on. The same mapping applies for both transmit and receive directions.
For example, in Transfer 0, Byte 0 of the Flit is mapped to Byte 0 of FDI (or RDI), Byte 1 of the Flit is
mapped to Byte 1, and so on. In transfer 1, Byte 64 of the Flit is mapped to Byte 0 of FDI (or RDI),
Byte 65 of the Flit is mapped to Byte 1 of FDI (or RDI) and so on. This example is illustrated in
Figure 10-26. Data transfers follow the rules outlined in Section 10.1.4 for RDI and Section 10.2.4 for
FDI and hence do not necessarily correspond to consecutive clock cycles.
Figure 10-25. CXL.io Standard 256B Start Header Flit Format Examplea
[Figure content: byte layout of the Flit showing Flit Header bytes (FH B0, FH B1), DLLP bytes (DLP B2 through DLP B5), CRC bytes (C0 B0/B1, C1 B0/B1), byte offsets up to +63, Byte 192 carrying 46B of Flit Chunk 3 from the Protocol Layer, and Reserved bytes]
Figure 10-26. FDI (or RDI) Byte Mapping for 64B Datapath to 256B Flits
Transfer 0: FDI (or RDI) Bytes 0 through 63 carry Flit Bytes 0 through 63.
Transfer 1: FDI (or RDI) Bytes 0 through 63 carry Flit Bytes 64 through 127.
Transfer 2: FDI (or RDI) Bytes 0 through 63 carry Flit Bytes 128 through 191.
Transfer 3: FDI (or RDI) Bytes 0 through 63 carry Flit Bytes 192 through 255.
If the FDI or RDI datapath width is increased (or decreased), the Byte mapping follows the same
convention of increasing order of Flit bytes mapped to increasing order of FDI (or RDI) bytes.
Figure 10-27 shows an illustration of a 128B data path.
Figure 10-27. FDI (or RDI) Byte Mapping for 128B Datapath to 256B Flits
Transfer 0: FDI (or RDI) Bytes 0 through 127 carry Flit Bytes 0 through 127.
Transfer 1: FDI (or RDI) Bytes 0 through 127 carry Flit Bytes 128 through 255.
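The mapping convention can be stated compactly in code. The following sketch (Python; illustrative, not normative) reproduces the Figure 10-26 and Figure 10-27 mappings:

    def map_flit_to_transfers(flit: bytes, nbytes: int):
        """Increasing Flit bytes map to increasing FDI (or RDI) bytes,
        nbytes per transfer: a 256B Flit is 4 transfers on a 64B datapath
        and 2 transfers on a 128B datapath."""
        return [flit[i:i + nbytes] for i in range(0, len(flit), nbytes)]

    flit = bytes(range(256))                    # Flit Byte k has value k
    transfers = map_flit_to_transfers(flit, 64)
    assert transfers[1][0] == 64                # Transfer 1 Byte 0 is Flit Byte 64
    assert transfers[3][63] == 255              # Transfer 3 Byte 63 is Flit Byte 255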
For 68B Flit Formats, the Protocol Layer transfers only 64B of payload information from the Flit over
FDI (the Flit Header and CRC are inserted by the Adapter). Thus, if the datapath is 128B wide, two
such transfers will happen at a given clock cycle, as shown in Figure 10-28. The numbering in the
figure still uses the Byte positions relative to the overall Flit; hence, FDI Byte 0 corresponds to Flit 0 Byte
2, etc. On the Transmit path, the Protocol Layer inserts empty slots (i.e., bytes with a value of 00h) to
populate the entire width of the bus if the interface width is greater than 64B and there is insufficient
payload information to transmit. The Adapter does the same on the Receive path.
Figure 10-28. FDI Byte Mapping for 128B Datapath for 68B Flit Format
Transfer 0: FDI Bytes 0 through 63 carry Flit 0 Bytes 2 through 65; FDI Bytes 64 through 127 carry Flit 1 Bytes 2 through 65.
For 68B Flit Formats, Adapter inserts the Flit Header and CRC bytes, and performs the necessary
shifting before transferring the bytes over RDI. Thus, if the data path is 128B wide, the byte mapping
will follow as shown in Figure 10-29. The remainder of Flit 1 continues on the next transfer, etc. Given
that the Adapter must insert PDS bytes before pausing the data stream, which makes the transfer a
multiple of 256B, the transfer naturally aligns when the width of RDI is 64B, 128B, or 256B on both
the Transmit and Receive directions. For wider than 256B interfaces, see the Implementation Note
below.
Figure 10-29. RDI Byte Mapping for 128B Datapath for 68B Flit Format
Transfer 0: RDI Bytes 0 through 67 carry Flit 0 Bytes 0 through 67; RDI Bytes 68 through 127 carry Flit 1 Bytes 0 through 59.
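A sketch of the RDI packing shown in Figure 10-29 (Python; each Flit already carries the Adapter-inserted Flit Header and CRC, so it is 68B here, and PDS insertion/padding of the final partial transfer is omitted for brevity):

    def pack_68b_flits_on_rdi(flits, nbytes=128):
        """Consecutive 68B Flits are shifted together so that the byte
        stream fills the RDI datapath width."""
        stream = b"".join(flits)
        return [stream[i:i + nbytes] for i in range(0, len(stream), nbytes)]

    flit0, flit1 = bytes([0xA0]) * 68, bytes([0xB1]) * 68
    t = pack_68b_flits_on_rdi([flit0, flit1])
    assert t[0][67] == 0xA0 and t[0][68] == 0xB1   # Flit 1 Byte 0 on RDI Byte 68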
The frequency of operation of the interfaces along with the data width determines the maximum
bandwidth that can be sustained across the FDI (or RDI) interface. For example, a 64B datapath at 2
GHz of clock frequency is required to sustain a 16 GT/s Link for an Advanced Package configuration
with a single module. Similarly, to scale to 32 GT/s of Link speed operation for Advanced Package
configuration with a single module, a 128B datapath running at 2 GHz would be required to support
the maximum Link bandwidth.
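The arithmetic behind these examples, as a sketch (Python; the 64-Lane figure assumes a single Advanced Package module provides a x64 mainband):

    def required_datapath_bytes(link_gts, lanes, clk_ghz):
        """Bytes per interface clock needed to sustain the Link:
        (GT/s * lanes / 8 bits-per-byte) / interface clock."""
        link_bytes_per_s = link_gts * 1e9 * lanes / 8
        return link_bytes_per_s / (clk_ghz * 1e9)

    print(required_datapath_bytes(16, 64, 2.0))    # -> 64.0  (64B datapath)
    print(required_datapath_bytes(32, 64, 2.0))    # -> 128.0 (128B datapath)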
The FDI (or RDI) byte mapping for the transmit or receive direction does not change for multi-module
configurations. The MMPL logic within the Physical Layer is responsible for ensuring that the bytes are
transmitted in the correct order to the correct module. Any byte swizzling or rearrangement to resolve
module naming conventions, etc., is thus the responsibility of the MMPL logic.
It is strongly recommended that when operating in UCIe Flit mode, NBYTES is chosen to be one of
64, 128, 256, or 512 and is selected to get the best KPI (e.g., latency, area, etc.) for the desired
bandwidth from the UCIe Link. If NBYTES is chosen to be larger than or equal to 512, it is strongly
recommended that it is a multiple of 256 and is only done for the case of a four module Advanced
Package Link designed for 16 GT/s or higher. Data transfer over the Link for all Flit formats defined
in UCIe Flit mode is at a granularity of 256B, so aligning to a multiple of that avoids unnecessary
shifting and corresponding tracking.
For situations in which the RDI or FDI data path is wider than 256B, the following considerations
apply for interoperability:
• On the Transmit side, it is required to send valid data corresponding to the full width of the
interface. For FDI, this would mean the Protocol Layer might need to pack a Protocol Flit with
empty slots. For RDI, this would mean the Adapter might need to insert NOP Flits (for 68B Flit
Format, PDS bytes are also included as valid data for this purpose).
• On the Receive side, for RDI:
It is possible that the Physical Layer has to wait to accumulate sufficient bytes before
transmitting over RDI. The Physical Layer must accumulate data in multiples of 256B and if the
accumulated data is less than the RDI width, it must wait for a sufficient gap in valid data
transfer on the Physical Link (at least 16 UI for differential clock and 32 UI for quadrature clock)
before transmitting this data on RDI. In this scenario, the accumulated data is sent on the lower
significant bytes of the RDI, and any remaining bytes on the interface are assigned to all 0s.
For 256B Flit Formats, a Flit whose Flit Header is 0000h and whose CRC is 0000h is silently discarded by
the Adapter. It is also not included for the purposes of Runtime Link Testing.
For 68B Flit Formats, the Adapter is expected to keep track of the PDS bytes (because these are
included in Runtime Link Testing). Any extra padding beyond that is silently discarded and not
included for the purposes of Runtime Link Testing.
• On the Receive side, for FDI (see the sketch after this list):
The Adapter must accumulate data in multiples of 256B before forwarding to the Protocol Layer.
If the accumulated data is less than the FDI width, it is sent on the lower significant bytes of
the FDI, and any remaining bytes on the interface are assigned to all 0s.
For 256B Flit Formats, a Flit Header of 0000h is a NOP for the Protocol Layer and is discarded.
For 68B Flit Formats, 00h bytes are IDLE symbols for PCIe/CXL.io or Empty slots for CXL.cachemem,
both of which get discarded by the Protocol Layer. For Streaming protocols that use 68B Flit
Formats, it is recommended to use the same approach.
• lp_corrupt_crc, pl_flit_cancel, and pl_error apply to all the Flits that are transferred
at the corresponding clock cycle. If applicable, it is recommended to set NDLLP to 32 for these
applications and limit the DLLP throughput to be 1 per clock cycle on FDI.
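The sketch referenced in the FDI bullet above (Python; illustrative of the accumulate-and-zero-fill rule for interfaces wider than 256B):

    def deliver_accumulated(data: bytes, iface_bytes: int) -> bytes:
        """Accumulated data (a multiple of 256B) is placed on the lower
        significant bytes of the interface; remaining bytes are all 0s."""
        assert len(data) % 256 == 0 and len(data) <= iface_bytes
        return data + bytes(iface_bytes - len(data))   # zero-fill upper bytes

    transfer = deliver_accumulated(bytes([0xFF]) * 256, 512)
    assert transfer[255] == 0xFF and transfer[256] == 0x00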
The Stallreq/Ack mechanism is mandatory for all FDI and RDI implementations. lp_stallack
assertion implies that the Upper Layer has stalled its pipeline at a Flit-aligned boundary.
The pl_stallreq/lp_stallack handshake is a four-phase sequence that follows the rules below:
1. The pl_stallreq and lp_stallack must be de-asserted before domain reset exit.
2. A rising edge on pl_stallreq must only occur when lp_stallack is de-asserted.
3. A falling edge on pl_stallreq must only occur when lp_stallack is asserted or when the
domain is in reset.
4. A rising edge on lp_stallack must only occur when pl_stallreq is asserted.
5. A falling edge on lp_stallack must only occur when pl_stallreq is de-asserted or when
domain is in reset.
6. When lp_stallack is asserted, lp_valid and lp_irdy must both be de-asserted.
7. While pl_stallreq is asserted, any data presented on the interface must be accepted by the
physical layer until the rising edge of lp_stallack. pl_trdy is not required to be asserted
consecutively.
8. The logic path between pl_stallreq and lp_stallack must contain at least one flip-flop to
prevent a combinatorial loop.
9. A complete stallreq/stallack handshake is defined as the completion of all four phases: Rising
edge on pl_stallreq, rising edge on lp_stallack, falling edge on pl_stallreq, falling
edge on lp_stallack.
10. It is strongly recommended that the Upper Layer provide lp_stallack on a global
free-running clock so that it can complete the handshake even if the rest of its logic is clock gated.
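A toy model that enforces the ordering rules above (Python; hypothetical names, with the reset-related clauses of rules 1, 3, and 5 simplified away):

    class StallHandshake:
        """Four-phase pl_stallreq/lp_stallack ordering checker."""

        def __init__(self):
            self.pl_stallreq = False    # rule 1: both low out of reset
            self.lp_stallack = False

        def raise_stallreq(self):
            assert not self.lp_stallack             # rule 2
            self.pl_stallreq = True

        def raise_stallack(self):
            assert self.pl_stallreq                 # rule 4
            self.lp_stallack = True                 # rule 6: valid/irdy now low

        def drop_stallreq(self):
            assert self.lp_stallack                 # rule 3
            self.pl_stallreq = False

        def drop_stallack(self):
            assert not self.pl_stallreq             # rule 5
            self.lp_stallack = False

    # Rule 9: a complete handshake is all four phases, in order.
    hs = StallHandshake()
    hs.raise_stallreq(); hs.raise_stallack()
    hs.drop_stallreq();  hs.drop_stallack()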
IMPLEMENTATION NOTE
In multiple places within this specification, state transitions are described as requiring
completion of the Stallreq/Ack handshake before the state transition. In the context of
state transitions, there are two acceptable ways to implement this from the lower
layer:
• One implementation from the lower layer would follow the sequence:
i. Assert pl_stallreq.
ii. After lp_stallack is asserted, perform the necessary actions for state
transition (including deassertion of pl_trdy).
iii. De-assert pl_stallreq. Once lp_stallack de-asserts, the state
transition is considered complete.
• The alternate implementation from the lower layer would follow the sequence:
i. Assert pl_stallreq.
ii. After lp_stallack is asserted, de-assert pl_trdy.
iii. De-assert pl_stallreq and perform the necessary actions for state
transition.
The requests are listed on the Row and the state status is listed in the Column.
Status (Column):            Reset  Active  L1   LinkReset  Retrain  Disable  L2   LinkError
Request (Row):
LinkError (sideband wire)   Yes    Yes     Yes  Yes        Yes      Yes      Yes  Yes
The pl_state_sts is not permitted to exit Reset state until requested by the upper layer. The exit
from Reset state is requested by the upper layer by changing the lp_state_req signal from NOP
encoding value to the permitted next state encoding value.
Section 10.3.3.8 describes the transition from Active or Active.PMNAK to LinkReset, Disable, or
LinkError states.
The Protocol Layer must not request Retrain on FDI unless UCIe is operating in UCIe Raw Format.
A Retrain transition on RDI must always be propagated to Adapter LSMs that are in Active. Retrain
transitions of the UCIe Link are not propagated to CXL vLSMs. Upon Retrain entry, the credit counter
for UCIe Retimer (if present) must be reset to the value advertised during initial Link bring up (the
value is given by the “Retimer_Credits” Parameter in the {AdvCap.Adapter} sideband message during
initial Link bring up). The Retimer must drain or dump any Flits in flight or its internal transport
buffers upon entry to Retrain. Additionally, the Retimer must trigger Retrain of the remote UCIe Link
(across the Off-Package Interconnect).
Entry into Retrain state resets power management state for the corresponding state machine, and
power management entry if required must be re-initiated after the interface enters Active state. If
there was an outstanding PM request that returns PM Response, the corresponding state machine
must perform Active Entry handshakes to bring that state machine back to Active.
Note: The requirement to wait for NOP->Active transition ensures that the Upper Layer has a
way to delay Active transition in case it is waiting for any relevant sideband handshakes
to complete (for example the Parity Feature handshake).
The Adapter triggers a LinkReset transition upon observing a LinkReset request from the Protocol Layer,
OR on receiving a sideband message requesting LinkReset entry from the remote Link partner, OR on an
implementation-specific internal condition (if applicable). Implementations must make best efforts to
gracefully drain the Retry buffers when transitioning to LinkReset; however, entry to LinkReset must
not time out waiting for the Retry buffer to drain. The Protocol Layer and Adapter must drain/flush
their pipelines and Retry buffer of the Flits for the corresponding Protocol Stack once the FDI state
machines have entered LinkReset.
If all the FDI state machines and Adapter LSMs are in LinkReset, the Adapter triggers RDI to enter
LinkReset as well.
Implementations must make best efforts to gracefully drain the Retry buffers when transitioning to
Disabled; however, entry to Disabled must not time out waiting for the Retry buffer to drain. The
Protocol Layer and Adapter must drain/flush their pipelines and Retry buffer of the Flits for the
corresponding Protocol Stack once the FDI state machines have entered Disabled.
If all the FDI state machines and Adapter LSMs are in Disabled, the Adapter triggers RDI to enter
Disabled as well.
The lower layer enters LinkError state when directed by the lp_linkerror signal or due to internal
LinkError conditions. For RDI, the entry is also triggered if the remote Link partner requested
LinkError entry through the relevant sideband message. It is not required to complete the stallreq/ack
handshake before entering this state. However, for implementations where LinkError state is not a
terminal state (terminal implies that the SoC needs to go through a reset flow after reaching LinkError
state), it is expected that software can retrain the Link after clearing error status registers, etc., and
the following rules should be followed:
• If the lower layer decides to perform a pl_stallreq/lp_stallack handshake, it must provide
pl_trdy to the upper layer to drain the packets. In cases where there is an uncorrectable
internal error in the lower layer, these packets could be dropped and not transmitted on the Link.
• The upper layer is required to internally clean up its data path once it has sampled LinkError on
pl_state_sts for at least one clock cycle, even if pl_trdy is not asserted.
The lower layer may enter LinkError state due to Internal LinkError requests such as when:
• Encountering uncorrectable errors due to hardware failure or directed by Upper Layer
• Remote Link partner requests entry into LinkError (RDI only)
• LinkError due to timeouts (to cover cases where the LinkError transition happened and
sideband was not functional).
From a state machine hierarchy perspective, it is required for Adapter LSM to move to LinkReset,
Disabled or LinkError before propagating this to CXL vLSMs. This ensures CXL rules are followed
where these states are “non-virtual” from the perspective of CXL vLSMs.
Adapter LSM can transition to LinkReset or Disabled without RDI transitioning to these states. In the
case of multi-protocol stacks over the same Physical Link/Adapter, each Protocol can independently
enter these states without affecting the other protocol stack on the RDI.
If all the Adapter LSMs have moved to a common state of LinkReset, Disabled, or LinkError, then RDI is
taken to the corresponding state. If, however, the Adapter LSMs are in different combinations of
LinkError, Disabled, or LinkReset states, the RDI is moved to the highest-priority state; the priority order
from highest to lowest is LinkError, Disabled, LinkReset (see the sketch below). For a
LinkError/LinkReset/Disabled transition on RDI, the Physical Layer must initiate the corresponding
sideband handshake to transition the remote Link partner to the required state. If no response is
received from the remote Link partner for this message after 8 ms, RDI transitions to LinkError.
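The priority-resolution sketch referenced above (Python):

    PRIORITY = ["LinkError", "Disabled", "LinkReset"]    # highest first

    def resolve_rdi_state(adapter_lsm_states):
        """If all Adapter LSMs are in some combination of these states,
        RDI moves to the highest-priority state present."""
        assert all(s in PRIORITY for s in adapter_lsm_states)
        for state in PRIORITY:
            if state in adapter_lsm_states:
                return state

    assert resolve_rdi_state(["LinkReset", "Disabled"]) == "Disabled"
    assert resolve_rdi_state(["LinkReset", "LinkError"]) == "LinkError"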
If RDI moves to a state that is of a higher priority order than the current Adapter LSM, it is required
for the Adapter to propagate that to the Adapter LSM using sideband handshakes to ensure the
transition with the remote Link partner.
After transition from LinkError/LinkReset/Disable to Reset on RDI, the Physical Layer must not begin
training unless the Physical Layer observes a NOP->Active transition on lp_state_req from the
Adapter or observes one of the Link Training triggers defined in Chapter 4.0. The Adapter should not
trigger NOP->Active unless it receives this transition from the Protocol Layer or has internally decided
to bring the Link Up. The Adapter must trigger this on RDI if the Protocol Layer has triggered this
even if pl_inband_pres = 0. Thus, if the Protocol Layer is waiting for software intervention and
wants to hold back the Link from training, it can delay the NOP->Active trigger on FDI. Upper Layers
are permitted to transition lp_state_req back to NOP after giving the NOP->Active trigger in order
to clock gate while waiting for pl_inband_pres to assert.
If RDI transitions to L2, the exit is through Reset, and complete Link Initialization and Training flow
will occur (including a fresh Parameter Exchange for the Adapter). After transition from L2 to Reset on
RDI, the LTSM will begin the Link PM exit and retraining flow when a {LinkMgmt.RDI.Req.Active}
sideband message is received or when the Adapter requests Active on RDI or it observes one of the
Link Training triggers defined in Chapter 4.0.
If the Adapter LSM transitions to L2, but RDI does not go to a Link down state (i.e. Reset, LinkReset,
Disabled, LinkError), then this is a “virtual” L2 state. The exit from L2 for the Adapter LSM in this case
will go through Reset for the Adapter LSM, but it does not result in a fresh Parameter Exchange for
the Adapter, and the protocol parameters and the Flit Formats remain the same as prior to L2 entry.
An example of this is if there are multiple stacks on the same Adapter, and only one of the FDIs
transitions to L2.
If the RDI state is already in a Link down state (i.e., Reset, LinkReset, Disabled, LinkError) and the
Link is not currently training (Adapter can infer this from pl_phyinrecenter), then there is no
need to notify the remote Link partner. Adapter or Physical Layer can complete the state transitions
locally for this case. If RDI is in Reset and the Link is training, it is recommended to wait for
training to complete before triggering a state transition with the remote Link partner to LinkReset or
Disabled.
The following is written for Disabled state, but applies to both Disabled and LinkReset states.
• For PCIe or CXL protocols, the Downstream Port initiates the transition to Disabled. Because the
Upstream Port goes through a Conventional Reset after transitioning to Disabled, the Upstream
Port waits for Downstream Port to re-initiate Link Training once the corresponding SoC reset
flow has finished.
• For Streaming protocols,
— The initiating Protocol Layer transitions lp_state_req to Disabled. If the necessary
conditions are met from the Adapter perspective (for example, attempting to drain the Retry
buffer etc.), it forwards the request using the corresponding sideband message to the
remote Link partner’s Adapter.
— On the remote Link partner, the Adapter transitions pl_state_sts to the requested state
once the necessary conditions are met from the Adapter perspective (for example,
attempting to drain the Retry buffer etc.). It also sends the corresponding sideband
message response.
If the Adapter needs to take the RDI to Disabled state, it is recommended to keep FDI
pl_state_sts in Disabled state until that flow has completed. Otherwise, if the exit
conditions for Disabled are met, it is permitted to transition to Reset state on FDI.
Following this, the Protocol Layer on the remote Link partner is in turn permitted to bring the
FDI state back to Disabled if required by the underlying protocol. The Adapter must not
trigger another sideband handshake for this scenario.
— The initiating Adapter transitions pl_state_sts to Disabled upon receiving the sideband
message response.
— The Protocol Layers on either side of the Link can initiate an exit flow by requesting Active
when pl_state_sts is Disabled, followed by a NOP->Active transition after the
pl_state_sts is Reset.
• For configurations in which the Adapter is servicing multiple Protocol Layers, the Disabled or
LinkReset handshakes are independent per Protocol Layer. In case the Adapter LSM has
transitioned to Reset from Disabled or LinkReset for a given Protocol Layer, the Adapter must
keep track of the most-recent previous state to determine the correct resolution for RDI state
request.
Figure 10-30 also shows the link reset flow for a PCIe/CXL.io protocol. If Management Transport
protocol is supported and negotiated on the same stack as PCIe/CXL.io protocol, the Management
Port Gateway must still follow the LinkReset flow and reset requirements that correspond to PCIe/
CXL.io.
[Figure 10-30 content: sequence diagram across DP Protocol Layer, DP Adapter, DP Physical Layer, channel, UP Physical Layer, UP Adapter, and UP Protocol Layer (Die 0 and Die 1)]
10.3.4.2 LinkError
Figure 10-31 shows an example of LinkError entry and exit when the Protocol Layer detected an
uncorrectable internal error.
[Figure 10-31 content: LinkError entry and exit sequence across the DP and UP Protocol Layers, Adapters, and Physical Layers (Die 0 and Die 1), followed by a sequence diagram in which the Adapter LSM is in L2]
§§
The goal of Compliance testing is to validate the mainband supported features of a Device Under Test
(DUT) against a known good reference UCIe implementation. Device support for Compliance Testing
is optional; however, a device that does not support the capabilities listed in this chapter may not be able
to participate in the Compliance program. The different layers of UCIe (Physical, Adapter, Protocol) are
checked independently with a suite of tests for compliance testing.
UCIe implementations that support compliance testing must implement the Compliance/Test Register
Block as outlined in Chapter 9.0 and adhere to the requirements outlined in this chapter.
The above components are integrated together in a test package (see Figure 11-1), which is then
used for running Compliance and Interoperability tests.
UCIe sideband plays a critical role for enabling compliance testing by allowing compliance software to
access registers from different UCIe components (e.g., Physical Layer, D2D Adapter, etc.) for setting
up tests as well as monitoring status. It is expected that UCIe sideband comes up without requiring
any FW initialization.
This specification defines the required hardware capabilities of the UCIe stack in the DUT. A separate
document will be published later to describe the following:
• Compliance test setup, including the channel model and package level details
• Test details
• Golden Die details including form factor and system-level behavior.
This chapter uses the terms ‘software’ and ‘compliance software’ interchangeably. Any use of the term
‘software’ in this chapter means compliance software that is either running on the Golden Die, or on
an external controller that is connected to the Golden Die via test/JTAG port.
Software, prior to testing compliance for any optional UCIe capability, must read the corresponding
Capability register (e.g., PHY Capability register described in Section 9.5.3.22) to ensure that the DUT
implements the capability.
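As an illustration, compliance software could gate each optional test as shown below (Python sketch; the register offset, accessor, and bit position are hypothetical placeholders for the software's own sideband access routines):

    PHY_CAPABILITY_OFFSET = 0x0     # placeholder; see Section 9.5.3.22

    def run_if_supported(read_reg, capability_bit, test):
        """Read the Capability register and run the test only if the DUT
        advertises the capability."""
        cap = read_reg(PHY_CAPABILITY_OFFSET)
        if cap & (1 << capability_bit):
            test()
        else:
            print("capability not implemented; skipping test")

    # Example with a stubbed register read (bit 3 set):
    run_if_supported(lambda off: 0b1000, 3, lambda: print("running test"))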
Figure 11-1. Examples of Standard and Advanced Package setups for DUT and Golden Die
Compliance Testing
For PCIe and CXL Protocol Layers, UCIe leverages the protocol compliance defined in those
specifications for the respective transaction layers. Implementations must follow the requirements
and capabilities outlined in PCIe Base Specification and CXL Specification, respectively.
For Streaming protocols, because Protocol Layer interoperability is specific to the protocol being
streamed, compliance testing of the Protocol Layer is beyond the scope of this specification.
The capabilities listed in this section must be supported by the Adapter in the DUT if the Adapter
supports any of the Flit Formats defined in Chapter 3.0. These capabilities are applicable to Adapters
of all UCIe device types (including Retimers). Each of the capabilities also has its respective
Control and Status registers, which are used to enable software to test various combinations of flows
and test criteria.
• Ability to Inject Test or NOP Flits: On the Transmitter, the injection behavior is defined by the Flit
Tx Injection Control register (see Table 9-73). For all injected Flits, CRC is computed, and if CRC
error injection is enabled, CRC errors are injected accordingly. The Adapter is allowed to be
set up to inject NOP Flits or Test Flits. NOP Flits follow the identical layout defined in
Chapter 3.0. Test Flits carry a special encoding of 01b in bits [7:6] of Byte 1 of the Flit Header
that is applicable for all Flit Formats that the Adapter supports (see the sketch after this list).
Unlike NOP Flits, Test Flits go through the Tx Retry buffer if Retry is enabled. One of the purposes
of defining the Test Flits is to test the Retry Flows independently, regardless of whether the
Protocol Layer is enabled. The Payload in these Flits carries specific patterns that are determined
by the fields in the Flit Tx Injection Control register. Software is permitted to enable Flit injection
in mission mode as well, interleaving with regular Protocol Flits using the appropriate
programming (see the register fields in Table 9-73). At the Receiver, these Flits are not forwarded
to the Protocol Layer. The Receiver cancels these using the pl_flit_cancel signal on FDI or
any other mechanism; however, CRC must be checked the same as with regular Flits, and any
errors must trigger the Retry Flows as applicable.
• Injection of Link State Request or Response sideband messages. This is controlled using the Link
State Injection registers defined in the Link State Injection Control Stack 0 and Link State
Injection Control Stack 1 registers (see Table 9-75 and Table 9-76, respectively). Single Protocol
stack implementations use the Stack 0 register. Software must place the Adapter in Compliance
mode (by writing 10b to the ‘Compliance Mode’ field in the Adapter Compliance Control register).
• Retry injection control as defined in the Retry Injection Control register (see Table 9-77).
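The Test Flit check referenced in the first bullet, as a sketch (Python; only the Byte 1 bits [7:6] encoding described above is modeled):

    def is_test_flit(flit_header: bytes) -> bool:
        """Test Flits carry 01b in bits [7:6] of Byte 1 of the Flit Header;
        the Receiver cancels them toward the Protocol Layer but still runs
        CRC checks and Retry on them."""
        return (flit_header[1] >> 6) & 0b11 == 0b01

    hdr = bytearray(2)
    hdr[1] |= 0b01 << 6             # set bits [7:6] of Byte 1 to 01b
    assert is_test_flit(bytes(hdr))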
The registers and associated functionality defined in Section 9.5.4 and the UHM DVSEC Capability
defined in Section 9.5.3.36 are used for Compliance testing. These registers provide the following
functionality:
• Timing margining
• Voltage margining, when supported
• BER measurement
• Lane-to-Lane skew for a given module at both the Receiver and Transmitter
• TX Equalization (EQ) as defined in Section 5.3.3
§§
Register Block: PCIe Capability

Register                           Bits        Comments
PCI Express Capabilities Register  8           Slot Implemented – set to 0; hence, follow the rules for implementing the other slot-related registers/bits at various locations in the PCIe capability register set.
Device Capabilities Register       8:6         N/A; can be set to any value.
Link Capabilities Register         14:12       N/A.
Link Capabilities Register         17:15       L1 Exit Latency: Devices/Ports must set these bits based on whether they are connected to a Retimer, and the Retimer-based exit latency might not be known at design time. To assist with this, these bits need to be made HWInit from a device/port perspective so that system FW can set them at boot time based on the specific Retimer-based latencies.
Link Control Register              7           HW ignores what is written here but follows any base spec rules for bit attributes.
Link Control Register              8           Set to RO 0.
Link Control Register              9           Set to RO 0.
Link Control Register              10, 11, 12  HW ignores what is written in these bits but follows any base spec rules for bit attributes.
Link Control Register              15          Hardwired to 0.
Link Control 2 Register            3:0         Target Link Speed: Writes to this register are ignored by UCIe hardware, but HW follows the base spec rules for bit attributes.
Link Control 2 Register            4           HW ignores what is written in this bit but follows any base spec rules for bit attributes.
Link Control 2 Register            15:6        N/A for UCIe. HW should follow base spec rules for register bit attributes.
§§
Implementations are permitted to design a superset stack to be interoperable with UCIe/AIB PHY.
This section details the UCIe interoperability criteria with AIB.
Always-high Valid is an optional feature that is only applicable to AIB interoperability applications. It
must be negotiated prior to mainband Link training through parameter exchange. Raw mode must be
used in such applications.
B.1.3 Sideband
AIB sideband is sent using UCIe mainband signals. UCIe sideband is not required in AIB
interoperability mode and is disabled (Transmitters are Hi-Z and Receivers are disabled).
UCIe Signal                                                    AIB Mapping
TXDATA[39:0]                                                   TX[39:0]
TXDATA[47:40]                                                  AIB Sideband Tx Asynchronous path
RXDATA[39:0]                                                   RX[39:0]
RXDATA[47:40]                                                  AIB Sideband Rx Asynchronous path
RXDATA[63:48]                                                  N/A
TXDATASB, RXDATASB, TXCKSB, RXCKSB, TXDATASBRD, RXDATASBRD     N/A; Disabled (Hi-Z)

UCIe Signal                                                    AIB Mapping
TXDATA[19:0]                                                   TX[19:0]
TXDATA[42:20]                                                  AIB Sideband Tx Asynchronous path
RXDATA[19:0]                                                   RX[19:0]
RXDATA[42:20]                                                  AIB Sideband Rx Asynchronous path
RXDATA[63:43]                                                  N/A
TXDATASB, RXDATASB, TXCKSB, RXCKSB, TXDATASBRD, RXDATASBRD     N/A; Disabled (Hi-Z)
B.2 Initialization
The AIB PHY logic block shown in Figure B-1 contains all the AIB Link logic and state machines. See the
AIB specification (Section 2 and Section 3) for the initialization flow.
§§