
Ethernet has been a foundational technology since the 1980s.  In the early days, workstations and PCs connected to on-site servers over a shared 10Mbps LAN using coax cables.  Since then, Ethernet has evolved to support twisted pair and fiber optic cabling, with rates increasing through 100Mbps and on to 100Gbps, leading up to the emerging 1.6Tbps Ethernet standard.

As Ethernet speeds have increased, its applications have diversified – from Voice over IP and video streaming to multiroom audio, industrial control networks, and even in-car networks.  This progression has driven the need to move beyond simple packet transfer to reliable packet transfer.  Defining quality of service is critical for data streams that are particularly sensitive to loss and delay.  This technical bulletin takes a dive into the need for 1.6T data transfers, the standardization efforts of IEEE's 802.3dj group, an overview of the components of a 1.6T Ethernet subsystem, and FEC considerations for the Ethernet controller required to handle all this data.

Why Do We Need Such High Speeds?

Ethernet has evolved along two primary dimensions:

  1. Enhanced performance to move and store massive amounts of data 
  2. Increased predictability and reliability, enabling even the most demanding of control systems

Figure 1: The gigantic world of Ethernet (Ethernet Alliance)

Today, the Internet is humming at an estimated aggregate bandwidth of 500 Tbps, which drives a staggering requirement for back-end intra-datacenter traffic.  However, while aggregate traffic within a datacenter is well into the Terabit-per-second scale, individual servers and individual data flows are not driving Terabit-per-second speeds yet.


The Dawn of 1.6T Ethernet: Early Adopters in Interprocessor Communication

One application, though, is already pushing toward these speeds – interprocessor communication.  A single device has a finite limit to its processing capacity.  Even with the latest processors or specialized machine learning accelerators, devices are still limited by the manufacturable chip size.  However, when chips are combined, the sky is the limit!  This finite limit to processing capacity underscores the need for a generation of Ethernet capable of Terabit speeds and minimal latency – marking the first application for 1.6T Ethernet.  This first generation of applications is expected to be followed by intra-datacenter switch-to-switch connections, enabling the pooling of high-performance processors and memory, boosting scalability and efficiency within cloud computing.

802.3dj: Setting the Stage for 1.6 Terabit Ethernet Standardization

It's essential that each node on the network adheres to the same rules, defined by a set of standards, to achieve effective communication.  The IEEE has been responsible for the Ethernet standards since the technology's inception.  The 802.3dj group is currently formulating the latest iteration of the Ethernet standard, outlining the physical layers and management parameters for operation at 200G, 400G, 800G and 1.6 Terabits per second.

The objective of the 802.3dj group is to support an Ethernet MAC data rate of 1.6 Tbps with:

  • A maximum bit error rate (BER) of 10^-13 at the MAC layer (see the quick calculation after this list)
  • Optional 16- and 8-lane Attachment Unit Interfaces (AUI) for both chip-to-module (C2M) and chip-to-chip (C2C) applications, using 112G and 224G SerDes
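
To put the 10^-13 objective in perspective, here is a quick back-of-the-envelope calculation (plain arithmetic, not a figure from the draft standard) showing how rarely an uncorrected bit error may appear at the full 1.6Tbps MAC rate:

    # Back-of-the-envelope check of the 802.3dj BER objective
    mac_rate_bps = 1.6e12   # 1.6 Tbps MAC data rate
    target_ber = 1e-13      # maximum BER at the MAC layer (post-FEC)

    errors_per_second = mac_rate_bps * target_ber   # 0.16 errored bits/s
    seconds_per_error = 1 / errors_per_second       # ~6.25 s between errors
    print(f"{errors_per_second:.2f} errored bits/s, one every {seconds_per_error:.2f} s")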

For the physical layer, the specification to support 1.6Tbps includes transmission:

  • Over 8 pairs of copper twinax cables in each direction with a reach of at least one meter
  • Over 8 pairs of fiber up to 500 meters
  • Over 8 pairs of fiber up to 2 km

The outlook is for the standard to be finalized by spring 2026.  However, the baseline feature set is anticipated to be complete in 2024.

Anatomy of a 1.6T Ethernet Subsystem

Let's dive into the components of a 1.6 Terabit Ethernet subsystem – specifically, the elements used to implement the Ethernet interface within the silicon of an ASIC or ASSP.

Figure 2: Diagram depicting the components of a 1.6T Ethernet Subsystem

Networked Applications Overview

At the top, we have the networked applications, which can reside either on client machines or on compute or file servers.  They are the source and destination of all Ethernet traffic.  A special kind of application is the Ethernet Bridge, or Layer 2 switch – this is neither the source nor the destination of Ethernet traffic, but rather an intermediate point along the way that forwards packets per the rules defined in IEEE 802.1D.

Queue Connections

Individual applications, or instances, connect to and from the Ethernet controller through one or more queues.  The queue typically buffers traffic to or from the application, balancing the network's performance with that of the client or server.  For maximum performance, the speed of the network should match the rate at which traffic is generated or consumed, minimizing the delay as packets are exchanged end to end between the applications.

Controller, PHY and Cabling

The Ethernet controller typically comprises a MAC and a PCS, although often this is simply referred to as the Ethernet MAC.  Below the PCS we have the Attachment Unit Interface (AUI) – some readers might remember the D-type connector on the back of a workstation into which an AUI cable was plugged.  That interface is still there in today's Ethernet; it's just faster.  Finally, further down the stack, we find the blocks responsible for controlling and managing the physical elements of the network, whether fiber optic, copper cabling or a backplane.

1.6T Ethernet Controller: Delving into MAC, PCS and Advanced FEC Mechanisms

As shown in Figure 3, below the application and queues lies the Media Access Controller (MAC).  The MAC manages Ethernet framing – looking after source and destination addresses, managing the length of the frame, adding padding if necessary (in the case of a very short payload), and adding/checking the Frame Check Sequence (FCS) to ensure the integrity of the frame.

Figure 3: MAC frame format and Length: An Octet Breakdown
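
As an illustration of the framing duties described above, the following Python sketch builds a minimal Ethernet frame: it pads a short payload up to the 46-byte minimum and appends a CRC-32 FCS.  The addresses and payload are hypothetical, and a real MAC does this in hardware; the sketch simply mirrors the steps.

    import struct
    import zlib

    def build_ethernet_frame(dst: bytes, src: bytes, ethertype: int, payload: bytes) -> bytes:
        """Minimal 802.3 framing sketch: pad short payloads and append the FCS."""
        # Pad so the frame (without FCS) reaches the 60-byte minimum
        if len(payload) < 46:
            payload = payload + bytes(46 - len(payload))
        header = dst + src + struct.pack("!H", ethertype)
        # The FCS is a CRC-32 over header + payload, sent least significant byte first
        fcs = struct.pack("<I", zlib.crc32(header + payload) & 0xFFFFFFFF)
        return header + payload + fcs

    frame = build_ethernet_frame(
        dst=bytes.fromhex("ffffffffffff"),  # hypothetical broadcast destination
        src=bytes.fromhex("020000000001"),  # hypothetical locally administered source
        ethertype=0x0800,                   # IPv4 payload
        payload=b"hello",
    )
    print(len(frame))  # 64 bytes: the minimum Ethernet frame size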

MAC variations can be categorized into two main types:

MAC in Network Interface Card (NIC)

One type of MAC is found on a NIC, which sits in a client, server or router.  These MACs handle the essential task of terminating the Ethernet layer, adding and stripping the Ethernet-specific overhead as the payload passes down and back up the stack.  An integral function is adding and checking the Frame Check Sequence (FCS) to safeguard data integrity: if any corruption is detected upon receipt, the frame is dropped.  Additionally, the MAC in a NIC checks the destination address of the frame, ensuring accurate delivery within the network.  The payload will most likely be an IP (Internet Protocol) packet.

NICs used to be implemented as a plug-in card, hence the name: Network Interface Card.  The card implemented the MAC, PCS and PHY, while queueing and any other intelligence was handled by the host processor.  Today, we see SmartNICs that can offload many networking functions, but they still maintain the same MAC layer.

Switching/Bridging MAC

On the other hand, switching or bridging MACs take a different approach.  Here, the entire Ethernet frame is passed between the MAC and the upper layer.  The MAC is responsible for adding and checking the FCS, and for statistics gathering for the likes of Remote Network Monitoring (RMON) support.  Conceptually, an Ethernet switch can be viewed as a dedicated application designed for this purpose.  Despite being predominantly hardware-implemented to guarantee wire-rate performance, each of its ports incorporates a dedicated MAC.  Although these ports may operate at different speeds, any rate adaptation is managed in the queues above the MAC layer.

Figure 4: Diagram showcasing the MAC, PCS and PMA connecting to the AUI

From Basic Encoding to RS-FEC

For lower Ethernet rates, the Physical Coding Sublayer (PCS) simply encodes the data stream to enable start-of-packet detection and to ensure a balanced signal, even during a long stream of zeros or ones.  However, as Ethernet speeds have increased, so has the complexity of the PCS.  Today, given the high-speed signals traversing each physical link, it is necessary to use Forward Error Correction (FEC) to overcome the inherent signal degradation encountered even over a very short link.

As with the PCS for other high-speed Ethernet variants, 1.6T Ethernet utilizes Reed-Solomon Forward Error Correction (RS-FEC).  This approach encodes 514 10-bit data symbols into a 544-symbol codeword, resulting in a 6% bandwidth overhead.  These FEC codewords are distributed across the AUI physical links so that no single physical link (there are 8 for 1.6T Ethernet) carries an entire codeword.  This method not only gives additional protection against error bursts, but also enables parallelization in the far-end decoder, thereby reducing latency.
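
These numbers are straightforward to verify.  The sketch below computes the RS(544,514) bandwidth overhead and illustrates the idea of spreading codeword symbols across the 8 physical links using a simple round-robin distribution (an illustration of the concept only; 802.3dj defines the exact symbol muxing):

    # RS(544,514) codeword arithmetic for 1.6T Ethernet
    data_symbols, coded_symbols, symbol_bits = 514, 544, 10

    overhead = coded_symbols / data_symbols - 1
    print(f"FEC bandwidth overhead: {overhead:.1%}")  # ~5.8%, commonly quoted as 6%

    # Spread the symbols of one codeword round-robin across the 8 physical links
    lanes = 8
    codeword = list(range(coded_symbols))
    per_lane = [codeword[i::lanes] for i in range(lanes)]
    print(len(per_lane[0]), "symbols of each codeword per lane")  # 68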

Figure 5: Diagram depicting components for Controller, PHY and Cables for a 1.6T Ethernet Subsystem

Achieving Optimal Bit Error Rates in 1.6T Ethernet

While the Ethernet PHY layer includes the PCS, it's common to associate the PCS with the MAC within the Ethernet controller.  The Physical Medium Attachment (PMA), featuring a gearbox and SerDes, brings the Ethernet signal onto the transmitted channels.  For 1.6T Ethernet, 8 channels run at 212Gbps, which includes the 6% FEC encoding expansion.  It's worth noting that the upper part of the PMA resides within the controller, which then hands the bit streams off to the AUI.  Each physical link of the PHY uses 4-level Pulse Amplitude Modulation (PAM-4).  This method encodes two data bits per transmission symbol, doubling the bandwidth compared to traditional Non-Return-to-Zero (NRZ) transmission.  The transmitter employs a digital-to-analog converter (DAC) to modulate the data, while the far-end receiver uses an analog-to-digital converter (ADC) together with DSP to extract the original signal.
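
To make the two-bits-per-symbol point concrete, here is a minimal sketch of a Gray-coded PAM-4 mapper.  Gray coding is the usual choice because a receiver decision error between adjacent levels then corrupts only one of the two bits; the exact level mapping is a PHY design detail, so treat this as illustrative:

    # Gray-coded PAM-4: two bits per transmitted symbol.
    # Adjacent amplitude levels differ in exactly one bit, limiting the damage
    # of a single-level slicer error at the receiver.
    GRAY_PAM4 = {(0, 0): -3, (0, 1): -1, (1, 1): +1, (1, 0): +3}

    def pam4_modulate(bits):
        """Map an even-length bit sequence onto PAM-4 amplitude levels."""
        assert len(bits) % 2 == 0
        return [GRAY_PAM4[(bits[i], bits[i + 1])] for i in range(0, len(bits), 2)]

    print(pam4_modulate([0, 0, 0, 1, 1, 1, 1, 0]))  # [-3, -1, 1, 3]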

The Ethernet PCS adds FEC to the data stream that is used end to end over an Ethernet link, often referred to as the "outer FEC" in longer-reach Ethernet links.  The IEEE is defining an additional level of error correction for the individual physical lines to enable longer-reach channels.  This additional error correction, likely a Hamming code, will be supported in an optical transceiver module where the error correction is required.  Figure 6 shows the additional overhead added when using a concatenated FEC for extended reach.

Let’s look at an example system in Figure 6, where the MAC and PCS have optical transmitters and receivers separated by a length of fiber:

Figure 6: Diagram showcasing a MAC and PCS optical TX/RX separated by a length of fiber

The link connecting the PCS to the optical module has a bit error rate of 10^-5, and additional errors are introduced on the optical link itself.  If we only implement a single RS-FEC end to end in this system, the resultant error rate would not meet the 10^-13 Ethernet requirement, and the link would be classed as unreliable.  An alternative would be to implement a separate RS-FEC on each hop, in which case the RS-FEC would be encoded and decoded three times: once at the transmit PCS, then in the optical module, and again on the far link from the optical module to the remote PCS.  This implementation would be very costly and would increase the end-to-end latency.

Integrating a concatenated Hamming code FEC on the optical link is an optimal solution that meets the Ethernet requirements and is also well suited to the random errors encountered on optical connections.  This inner FEC layer further expands the line rate from 212 Gbps to 226 Gbps per lane, so it is essential that the SerDes can support this line rate.
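
The rate expansion is simple to check.  Assuming a Hamming(128,120)-style inner code, whose 128/120 ratio matches the figures quoted above (the exact inner code is still being defined by 802.3dj):

    # Line-rate expansion from the concatenated (inner) FEC
    outer_lane_rate_gbps = 212      # per-lane rate after the RS(544,514) outer FEC
    inner_code_rate = 120 / 128     # assumed Hamming(128,120)-style inner code

    inner_lane_rate_gbps = outer_lane_rate_gbps / inner_code_rate
    print(f"{inner_lane_rate_gbps:.0f} Gbps per lane")  # ~226 Gbps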

From Send to Receive: Understanding the Latency Landscape in Ethernet Applications

Ethernet latency, simply put, is the delay between one application transmitting a message over Ethernet and another application receiving it.  Round-trip latency measures the time from a message being sent to a response being received.  Of course, this delay depends on the response time of the far-end application – when considering Ethernet latency this can be ignored, since it's external to Ethernet.  The components of Ethernet latency include the time in the transmit queue, message processing time, transmission duration, medium traversal time, message receipt time, end processing time, and the time in the receive queue.

Figure 7: Diagram depicting a full 1.6T Ethernet subsystem and the latency path

When focusing on minimizing latency in an Ethernet subsystem (specifically at the Ethernet interface level, not the overall network), it's crucial to consider the specific circumstances – for instance, when both the packet source and sink are running at a matched, high data rate.  Conversely, in a trunk connection such as those between switches, latency becomes less of a concern due to the more pronounced delays on slower client links.  Similarly, when dealing with longer distances, the inherent delay due to the distance will dominate.

Furthermore, it's worth noting that Time-Sensitive Networking (TSN) addresses deterministic latency.  In this context, an upper bound on latency is established for mission-critical applications, especially on lower-speed or shared-infrastructure networks.  Of course, this doesn't mean that we should overlook latency in other scenarios; minimizing latency remains a constant objective.  Firstly, the cumulative end-to-end latency increases with each successive hop.  Secondly, added latency often indicates additional circuitry or processing in the controller, which can lead to increased power consumption in the system.

Latency Insights: Dissecting the Ethernet Subsystem Layers

To begin, we'll set aside any queuing latency and assume there's a clear path from the application to the Ethernet controller without any contention for bandwidth.  Bandwidth differences will result in packet queuing delays, which should be avoided when latency is critical.  Ethernet frames are built or modified on the fly as packets pass through the transmit controller.  Notably, no significant storage is required through the line encoding and transmit FEC stages.

Transmit message processing latency hinges on the particular implementation, but can be minimized with good design practices.  The time taken to transmit a message comes down simply to the Ethernet rate and the frame size.  For 1.6T Ethernet, transmitting a minimum-sized packet takes 0.4ns – essentially, one Ethernet frame per tick of a 2.5 GHz clock.  The transmission time for a standard maximum-size Ethernet frame is 8ns, extending to 48ns for jumbo frames.

When considering the time to traverse the medium, optical fiber latency stands at roughly 5ns per meter, while copper cabling is slightly faster at 4ns per meter. Although the message receive time is the same as the transmit time, it’s usually ignored since both processes occur simultaneously.
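
These figures can be reproduced with simple arithmetic.  The sketch below assumes the 8-byte preamble and 12-byte inter-packet gap are counted in the per-frame time, which is how the 0.4ns minimum-frame figure above works out:

    # Serialization and propagation delay at 1.6 Tbps
    RATE_BPS = 1.6e12
    PREAMBLE, IPG = 8, 12  # bytes of preamble + inter-packet gap per frame

    def tx_time_ns(frame_bytes: int) -> float:
        return (frame_bytes + PREAMBLE + IPG) * 8 / RATE_BPS * 1e9

    print(f"min frame (64 B):     {tx_time_ns(64):.2f} ns")    # ~0.42 ns
    print(f"max frame (1518 B):   {tx_time_ns(1518):.2f} ns")  # ~7.7 ns
    print(f"jumbo frame (9600 B): {tx_time_ns(9600):.2f} ns")  # ~48 ns

    # Propagation delay dominates over distance
    print(f"100 m of fiber:  {100 * 5} ns")   # ~5 ns/m
    print(f"100 m of copper: {100 * 4} ns")   # ~4 ns/m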

Most of the Latency Occurs at the Receiving Controller

Even with the most optimized designs, latency due to the RS-FEC decoder is unavoidable.  To begin error correction, 4 codewords must be received and stored, which takes 12.8ns at 1.6Tbps.  Subsequent processes, such as executing the FEC algorithm, correcting errors (as necessary), buffering, and clock domain management, further contribute to the controller's receive latency.  While the FEC codeword storage time is a constant factor, the latency during message receipt is implementation specific, but can be optimized through good digital design practices.
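
The 12.8ns codeword storage figure can be reconstructed as follows, assuming the four codewords arrive across 8 lanes at the FEC-expanded rate of 212.5Gbps each (the exact accounting is implementation specific):

    # Time to accumulate 4 RS(544,514) codewords before decoding can begin
    codewords = 4
    codeword_bits = 544 * 10       # 5,440 bits per codeword
    line_rate_bps = 8 * 212.5e9    # 8 lanes at the FEC-expanded line rate

    storage_ns = codewords * codeword_bits / line_rate_bps * 1e9
    print(f"{storage_ns:.1f} ns")  # 12.8 ns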

In essence, there is an inherent, unavoidable latency due to the FEC mechanism and the physical distance or cable length.  Beyond these factors, good design practices play a pivotal role in minimizing latency due to the Ethernet controller.  Leveraging a complete, integrated solution that includes the MAC, PCS and PHY, along with an expert design team, paves the way for the most efficient, low-latency implementation.

Summary

Figure 8: First-pass silicon success for 草榴社区 224G Ethernet PHY IP in a 3nm process, showcasing highly linear PAM-4 eyes

1.6 Tbps Ethernet caters to the most bandwidth-intensive, latency-sensitive applications.  With the advent of 224G SerDes technology, together with MAC and PCS IP developments, complete, off-the-shelf solutions are available that align with the evolving 1.6T Ethernet standards.  Controller latency is critical in 1.6Tbps applications.  Beyond the inherent latency due to the protocol and error correction mechanism, the IP digital design must be meticulously engineered by expert design teams to prevent unnecessary latency being added to the datapath.

A silicon-proven solution requires an optimized architecture and precise digital design, emphasizing power efficiency and a reduced silicon footprint to make 1.6T data rates a reality.  草榴社区 silicon-proven 224G Ethernet PHY IP has set the stage to make the 1.6T MAC/PCS possible.  Using leading-edge design, analysis, simulation, and measurement techniques, 草榴社区 continues to deliver exceptional signal integrity and jitter performance, with a complete Ethernet solution including MAC+PCS+PHY.
