Accelerating 5G Baseband With Adaptive SoCs | Avnet Silica


Accelerating 5G Baseband With Adaptive SoCs


5G New Radio (NR) network specifications demand new architectures for radio and access networks. While the 5G NR architecture introduces new spectrum and massive MIMO (mMIMO) antennas, the access network architecture must also evolve to deliver the services defined by 5G: enhanced Mobile Broadband (eMBB), Ultra-Reliable Low-Latency Communications (URLLC) and massive Machine-Type Communications (mMTC). Implementing these services requires network slicing at different levels of network aggregation. Because software-only solutions cannot meet the latency and throughput demands, there is an immense need for acceleration, which is ideally handled by programmable hardware. In this article we discuss the first level of 5G access network aggregation, accelerated using adaptive radio-frequency (RF) SoCs.

To handle these new requirements, 3GPP has defined several split architectures between 5G radio units (RUs) and 5G base stations, and the chosen split largely determines the gNodeB architecture. The higher-layer splits partition functionality between the centralized unit (CU) and the distributed unit (DU), while the lower-layer split partitions functionality between the RU and the DU. The lower-layer (RU-DU) split is the more critical and timing-sensitive of the two, and it is not rigidly standardized.

Multiple options for lower layer split - block diagram
Figure 1: There are multiple options for lower layer split

While Split 8 is common in traditional 4G LTE networks, Split 7.2 is more widely adopted for 5G. There are multiple variants of the option 7.2 split, so it is also referred to as option 7-2x: the split point can move left or right in the figure above depending on the deployment scenario. Because the split options are flexible and the DU-RU interface is not rigidly defined in terms of protocols, bandwidth, latency and timing, programmable processors are commonly desired to implement the interface and functionality at both the RU and the DU.

Commercially available network interface cards (NICs) can be used to terminate the fronthaul at the DUs of 5G base stations. However, ASIC-based cards can only process L2-L3 traffic and depend on software for O-RAN processing, and timing synchronization functionality is not available in most general-purpose NICs. Because DUs need strict timing synchronization with radio units and neighboring base stations, they must support master, slave and boundary clock modes of operation from a central GPS clock source. Another important timing function is clock holdover: circuitry implemented on the base station hardware that maintains clock synchronization if the reference clock is lost.
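Holdover behavior can be sketched in a few lines: while the reference is present, the clock servo tracks its frequency correction; when the reference disappears, the last good correction is retained so the local clock drifts only slowly. This is an illustrative model with invented names, not a real servo implementation.

```python
# Illustrative clock holdover model (names invented for illustration):
# while locked, the servo records the measured frequency correction;
# on loss of reference, the last correction keeps being applied so the
# local oscillator drifts only slowly until the reference returns.
class HoldoverClock:
    def __init__(self):
        self.freq_correction_ppb = 0.0  # parts-per-billion correction
        self.locked = False

    def on_reference_update(self, measured_correction_ppb: float):
        """Called while the GPS/PTP reference is healthy."""
        self.freq_correction_ppb = measured_correction_ppb
        self.locked = True

    def on_reference_lost(self):
        """Enter holdover: stop tracking but retain the last correction."""
        self.locked = False

    def current_correction(self) -> float:
        return self.freq_correction_ppb
```

In real base station hardware the achievable holdover time is set by the stability of the local oscillator (for example an OCXO), not by the software logic.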

Once the radio IQ data from the RU is available, it must be classified as user-plane, control-plane, management-plane or synchronization-plane data in both the uplink and downlink directions. The throughput of synchronization- and management-plane protocol messages is significantly lower than that of U-plane and C-plane messages, so synchronization and management messages are usually handled in software, by applications running in user space.

The 3GPP option 7-2 split also defines a clear division between high-PHY and low-PHY functionality: low-PHY functions such as precoding, FFT/IFFT and resource element (RE) mapping/de-mapping are implemented either on the remote radio unit (RRU) or in a fronthaul gateway node between the RU and DU. The high-PHY functions, which mainly include encoding/decoding, scrambling and modulation/demodulation, are performed in the DU.

5G option 7-2 split implementation with Xilinx adaptive RFSoC - block diagram
Figure 2: 5G option 7-2 split implementation with Xilinx adaptive RFSoC

The high-PHY functions in gNodeB (DU) can be performed either completely in software or in a combination of software and programmable hardware. The division between software and hardware of high-PHY functions depends on many factors such as:

  • Performance balance: the software partition should not limit the performance of the hardware, and vice versa.
  • Latency considerations: Since 5G specifications put a strong latency requirement on different classes of services, the division should not affect the latency in a negative way.
  • Compatibility with industry standard software APIs: Some of the high-PHY functions have a standard definition of user space APIs so any hardware implementation should maintain the compatibility with standard APIs for seamless transition.

The above criteria outline the functionality needed from programmable hardware-based accelerators from companies like Xilinx. The ideal accelerator architecture may implement the complete 5G high-PHY in hardware, enabling the highest performance and lowest latency while scaling across multiple mMIMO-based RRU configurations. With 5G and O-RAN standards and functionality still evolving, Xilinx has started with O-RAN processing and lookaside channel encoding/decoding on programmable accelerator cards. Channel coding is one of the high-PHY functions best suited to programmable hardware because of its compute-intensive nature. It can also be combined with hybrid automatic repeat request (HARQ) functionality to improve performance and reduce latency.
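Chase combining, the simplest HARQ scheme, can be illustrated with a short sketch: soft log-likelihood ratios (LLRs) from retransmissions of the same codeword are summed before decoding, which raises the effective SNR. The hard-decision step below is a stand-in for a real LDPC decoder, not an implementation of one.

```python
# Illustrative HARQ chase combining: LLRs from retransmissions of the
# same codeword are accumulated in the HARQ buffer before decoding.
def chase_combine(llr_buffer, new_llrs):
    """Accumulate the LLRs of a retransmission into the HARQ buffer."""
    if llr_buffer is None:
        return list(new_llrs)
    return [a + b for a, b in zip(llr_buffer, new_llrs)]

def hard_decide(llrs):
    """Stand-in for the decoder: positive LLR -> bit 0, negative -> bit 1."""
    return [0 if llr >= 0 else 1 for llr in llrs]

# Two noisy receptions of the same 4-bit codeword [0, 1, 0, 1];
# the second and third bits of the first reception are unreliable:
rx1 = [+0.9, -0.2, +0.1, -1.1]
rx2 = [+0.7, -0.8, +0.6, -0.4]
buf = chase_combine(None, rx1)
buf = chase_combine(buf, rx2)
print(hard_decide(buf))  # -> [0, 1, 0, 1]
```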

One approach to accelerating 5G L1 high-PHY functionality is based on the adaptable and programmable Xilinx T Series Telco Accelerator Cards. These cards feature an adaptive RFSoC with hardened soft-decision forward error correction (SD-FEC) blocks, and implement the HARQ function with on-board DRAM for higher, scalable performance. In the next post, we'll dive into some specifics of the Telco Accelerator Cards, while also touching on what's next for 5G baseband acceleration.

 

Implementation of the fronthaul and L1 Hi-PHY for 5G base stations

The 5G distributed unit (DU) can be implemented to process fronthaul data with O-RAN processing and a partial offload of high-PHY processing, which includes the LDPC encoder, the LDPC decoder and the wrapper functions for the encoder and decoder logic.

Fronthaul processing: The example architecture below assumes two network interfaces connected to 5G radio units (RUs), as shown in figure 3. The 5G DU must support the full network capacity of data transfer between the 5G RUs and the base station. The network interface blocks include Ethernet MACs connected to industry-standard optical modules to transmit and receive Enhanced Common Public Radio Interface (eCPRI), Radio-over-Ethernet (RoE) or Time-Sensitive Networking (TSN) Ethernet data from the 5G RUs. The host interface is usually PCIe, with a high-speed data transfer mechanism using direct memory access (DMA).

Fronthaul processing can be divided into the following major sub-blocks, each described in turn below.

Fronthaul processing on 5G base station node - block diagram
Figure 3: Fronthaul processing on 5G base station node

1. Precision Time Protocol (PTP) Functionality: This synchronizes the local clock (acting as the slave node clock) with the system grandmaster clock, using traffic time-stamping with sub-nanosecond granularity. The DU receives 1588v2 PTP packets as part of the traffic and identifies them as synchronization-plane packets, then sends them to the S-plane application running on the x86 host after replacing the time-stamp field with one generated from the reference clock. Other functions of this block may include processing delay requests, updating the master clock's time-of-day value from software, and generating a 1PPS (pulse per second) signal in master mode.
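The offset and delay computation that PTP performs from the four Sync/Delay_Req timestamps can be sketched as follows. This is the standard IEEE 1588 arithmetic, which assumes a symmetric network path:

```python
def ptp_offset_and_delay(t1, t2, t3, t4):
    """
    IEEE 1588 two-step exchange:
      t1: master Sync transmit time      t2: slave Sync receive time
      t3: slave Delay_Req transmit time  t4: master Delay_Req receive time
    Assuming a symmetric path, returns (slave clock offset, mean path delay).
    """
    offset = ((t2 - t1) - (t4 - t3)) / 2
    delay = ((t2 - t1) + (t4 - t3)) / 2
    return offset, delay

# Slave clock runs 50 ns ahead of the master; one-way path delay is 200 ns:
offset, delay = ptp_offset_and_delay(t1=1000, t2=1250, t3=1300, t4=1450)
print(offset, delay)  # -> 50.0 200.0
```

In the DU this arithmetic runs on hardware-captured timestamps; doing the capture in the MAC rather than in software is what enables the sub-nanosecond granularity mentioned above.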

2. Traffic Classifier/Aggregator: This block routes the control, user, synchronization and management (C, U, S and M-plane) messages. The traffic classifier can implement traffic rules used to drop or process the incoming fronthaul traffic from the network ports. It can receive eCPRI packets (C and U plane) and Ethernet packets (S and M plane) in both the uplink and downlink directions.

For uplink processing, eCPRI packets are identified by the eCPRI message type field in the packet header. This includes checking the source MAC address, destination MAC address and Virtual Local Area Network (VLAN) ID against the configured rules and dropping the packet if the rules do not match. For S and M-plane Ethernet packets in the uplink direction, a simple arbiter can schedule and transmit them to the host interface queues.

For downlink, the block configures the priority of the different eCPRI messages based on the message type field in the eCPRI header. It can also add a VLAN tag based on the C and U-plane configuration; the priority field in the VLAN tag can be used to assign priority to C/U-plane messages. S and M-plane traffic can likewise be VLAN-tagged and prioritized. The block can also implement a priority scheduler that sends packets to one of the connected fronthaul ports based on the assigned priority.
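The plane classification described above can be sketched in software. This is an illustrative model, not RTL: only the EtherType and the eCPRI message type are inspected, and VLAN tags and MAC-address rules are ignored (the eCPRI EtherType is 0xAEFE; PTP over Ethernet uses 0x88F7):

```python
ETHERTYPE_ECPRI = 0xAEFE  # eCPRI over Ethernet
ETHERTYPE_PTP = 0x88F7    # IEEE 1588 PTP over Ethernet

def classify_plane(frame: bytes) -> str:
    """Classify an untagged Ethernet frame into C/U/S/M plane (sketch)."""
    ethertype = int.from_bytes(frame[12:14], "big")
    if ethertype == ETHERTYPE_PTP:
        return "S-plane"  # synchronization traffic
    if ethertype == ETHERTYPE_ECPRI:
        # Second byte of the eCPRI common header is the message type:
        # type 0 = IQ data (U-plane), type 2 = real-time control (C-plane).
        msg_type = frame[15]
        return "U-plane" if msg_type == 0 else "C-plane"
    return "M-plane"  # everything else, e.g. management over IP

# A minimal eCPRI IQ-data frame: 12 zero MAC bytes, EtherType, common header.
u_frame = bytes(12) + ETHERTYPE_ECPRI.to_bytes(2, "big") + bytes([0x10, 0x00, 0x00, 0x04])
print(classify_plane(u_frame))  # -> U-plane
```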

3. eCPRI Framer and De-Framer: The eCPRI framer/de-framer is responsible for eCPRI protocol processing of uplink and downlink C/U-plane messages, with separate uplink and downlink data paths. Since eCPRI processing has to support multiple antenna-carrier (AxC) configurations in a base station, the adaptability of this block allows it to scale up and down with the deployment scenario. The packet format for eCPRI-over-Ethernet messages is shown in figure 4. A zero-padding field is added so that short messages reach the 64-byte minimum Ethernet frame size.

eCPRI over Ethernet message in an Ethernet packet - block diagram
Figure 4: eCPRI over Ethernet message in an Ethernet packet

The eCPRI framer processes uplink and downlink C-plane messages and downlink U-plane messages, since the C-plane messages for downlink are also generated at the 5G DU. Multiple streams/layers of eCPRI messages can share a single eCPRI framer data path through a hierarchical scheduler and multiplexing scheme. The framer generates the fields of the eCPRI messages and adds padding to create eCPRI-over-Ethernet packets for transmission over the fronthaul interface.
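A minimal framer sketch follows, assuming the 4-byte eCPRI common header (version/flags, message type, 16-bit payload size) and no VLAN tag. The 4-byte FCS is left to the Ethernet MAC, so the frame here is padded to 60 bytes to reach the 64-byte minimum on the wire:

```python
ETHERTYPE_ECPRI = 0xAEFE
MIN_FRAME_NO_FCS = 60  # 60 bytes + 4-byte FCS from the MAC = 64-byte minimum

def ecpri_frame(dst: bytes, src: bytes, msg_type: int, payload: bytes) -> bytes:
    """Build an untagged eCPRI-over-Ethernet frame (illustrative sketch)."""
    # Common header: 0x10 = protocol revision 1, concatenation bit clear
    # (assumed layout), then message type and 16-bit payload size.
    header = bytes([0x10, msg_type]) + len(payload).to_bytes(2, "big")
    frame = dst + src + ETHERTYPE_ECPRI.to_bytes(2, "big") + header + payload
    if len(frame) < MIN_FRAME_NO_FCS:
        frame += bytes(MIN_FRAME_NO_FCS - len(frame))  # zero padding, short messages
    return frame

# Short IQ-data message (type 0): padding brings the frame up to 60 bytes.
frame = ecpri_frame(bytes(6), bytes(6), msg_type=0, payload=b"\x01\x02")
print(len(frame))  # -> 60
```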

eCPRI de-framer block has the following functionality:

  • Processing and removal of Ethernet header
  • Parsing and removal of eCPRI header
  • Stream identification and sequence-number checking based on eCPRI header fields
  • Removal of zero padding in eCPRI data (for short messages)
  • Checking of length and other protocol errors
  • Statistics for each eCPRI stream
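The de-framer steps above can be sketched as follows, with the same assumed field offsets as an untagged eCPRI-over-Ethernet frame and error checking limited to the EtherType and the declared payload length:

```python
def ecpri_deframe(frame: bytes):
    """Strip Ethernet and eCPRI headers from an untagged frame (sketch)."""
    # Processing and removal of the Ethernet header.
    ethertype = int.from_bytes(frame[12:14], "big")
    if ethertype != 0xAEFE:
        raise ValueError("not an eCPRI frame")
    # Parsing and removal of the eCPRI common header.
    msg_type = frame[15]
    length = int.from_bytes(frame[16:18], "big")
    # Slicing to the declared length drops the trailing zero padding.
    payload = frame[18:18 + length]
    # Length / protocol error checking.
    if len(payload) != length:
        raise ValueError("truncated eCPRI payload")
    return msg_type, payload

# Round trip on a hand-built short frame: 3-byte payload, 39 bytes of padding.
frame = bytes(12) + b"\xae\xfe" + bytes([0x10, 0x00]) + (3).to_bytes(2, "big") + b"abc" + bytes(39)
print(ecpri_deframe(frame))  # -> (0, b'abc')
```

Per-stream statistics and sequence-number tracking, also listed above, would sit on top of this parse step.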

4. O-RAN Processor: The O-RAN block works in conjunction with the eCPRI block and usually interfaces with the host to provide the following functions:

  • Receive the uplink U-plane messages from the eCPRI de-framer, extract the IQ data and deliver it to the host
  • Extract packing information from C-plane messages and apply it to the corresponding uplink U-plane messages
  • Delay management and forwarding of C-plane messages to the eCPRI block
  • Frame the U-plane IQ data from the host into O-RAN messages and deliver them to the eCPRI framer

The O-RAN module interfaces are shown in figure 5.

O-RAN block interfaces for uplink and downlink data - block diagram
Figure 5: O-RAN block interfaces for uplink and downlink data

Both the O-RAN uplink and downlink modules are designed to interface with four independent AxC interfaces. In the uplink direction, the O-RAN block classifies U-plane messages as Physical Random Access Channel (PRACH) or Physical Uplink Shared Channel (PUSCH) based on a parameter in the O-RAN header. These messages are then de-framed to extract the corresponding IQ samples (the data format used for radio signals). In the downlink block, the C-plane messages are parsed to extract the information needed for U-plane framing.
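The PRACH/PUSCH classification can be sketched using the filterIndex field of the O-RAN application header (first octet: dataDirection 1 bit, payloadVersion 3 bits, filterIndex 4 bits). Treating filterIndex 0 as the standard uplink channel and any non-zero index as a PRACH filter is an assumption made here for illustration:

```python
def classify_uplane(oran_header_byte0: int) -> str:
    """Classify an uplink U-plane message from the first O-RAN header octet.

    Assumed bit layout: dataDirection (1 bit) | payloadVersion (3 bits)
    | filterIndex (4 bits). filterIndex 0 = standard channel (PUSCH),
    non-zero = a PRACH filter -- an illustrative convention.
    """
    filter_index = oran_header_byte0 & 0x0F
    return "PUSCH" if filter_index == 0 else "PRACH"

print(classify_uplane(0x00))  # -> PUSCH
print(classify_uplane(0x01))  # -> PRACH
```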

5. IQ Data Host Interface: The host interface block sends and receives IQ data samples to and from the CPU, handling the delay management for U-plane and C-plane messages. External memory can be used to buffer IQ samples and ensure lossless packet transmission to the fronthaul interface. The host interface blocks read the data stored in memory in step with timing ticks generated on the adaptive SoC, ensuring slot synchronization between the SoC and the host CPU.
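The slot-synchronized release described here can be modeled as a small buffer keyed by slot number: IQ data from the host is queued ahead of time and released toward the fronthaul only when the SoC's tick for that slot fires. Names and granularity are assumptions for illustration.

```python
from collections import defaultdict

class SlotBuffer:
    """Illustrative slot-synchronized IQ buffer (invented names)."""

    def __init__(self):
        self._pending = defaultdict(list)  # slot number -> queued IQ blocks

    def enqueue(self, slot: int, iq_samples: list):
        """Host DMA delivers downlink IQ for a future slot."""
        self._pending[slot].append(iq_samples)

    def on_slot_tick(self, slot: int) -> list:
        """Timing tick for `slot`: release everything queued for it.

        Returns an empty list if the host was late, modeling an underrun.
        """
        return self._pending.pop(slot, [])
```

A real implementation would place `_pending` in external DRAM and raise an underrun indication instead of silently returning an empty list.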

As described above, fronthaul processing and L1 high-PHY acceleration need to be adaptable to handle the various massive MIMO (mMIMO) antenna configurations for fronthaul connectivity and throughput. The data path should provide a line-rate interface with eCPRI and O-RAN processing while meeting the latency and synchronization requirements of the 5G specifications.

Xilinx has implemented the fronthaul reference design on its T1 Telco Accelerator Card to handle a total throughput of 50Gbps, which approximately equals eight layers of 4T4R 100MHz in an active-standby configuration. The card uses adaptable MPSoC and RFSoC devices to keep the functionality flexible. In most DU implementations the x86 software implements the complete wireless L1 stack; using the O-RAN processor on the adaptable devices can provide significant throughput and latency advantages.

