草榴社区

Maximizing Mobile Performance with LPDDR4 SoC RAM

VIP Expert

Oct 02, 2017 / 5 min read

The mobile industry is growing rapidly, driven by a never-ending hunger for data and bandwidth. In a very short span of time we have moved from dial pads to touch screens, from black-and-white displays to QHD and 4K displays with millions of colors, and from kilobytes to gigabytes of memory. The biggest challenge is increasing bandwidth without compromising performance or adding significantly to power consumption. The answer to this challenge is the LPDDR (Mobile DDR) standard released by JEDEC. There have been several revisions of this standard, the latest being LPDDR4. LPDDR4 supports a data rate of up to 4266 Mbps per pin, roughly double that of LPDDR3. It also significantly reduces power consumption compared to LPDDR3. For further insights on LPDDR4 and its predecessors, please refer to our previous blog “LPDDR4: What Makes it Faster and Reduces Power Consumption.”

In this blog, we will discuss the features that make LPDDR4 efficient in terms of power consumption, bandwidth utilization, data integrity and performance.

LPDDR4 features overview

DBI (Data Bus Inversion)

LPDDR4 introduces a new I/O signaling scheme known as low voltage swing terminated logic (LVSTL). LVSTL uses significantly lower voltage levels than previous versions of LPDDR. Another advantage of this signaling scheme is that it consumes no termination power while a low level (0) is being driven through the I/O drivers, so less power is consumed when the data stream contains more zeros. The DBI feature was introduced so that the data stream never carries more ones than zeros. DBI works at byte-level granularity: whenever a byte contains more than four bits set to 1, the driver inverts the whole byte and asserts the corresponding data mask inversion (DMI) bit to notify the receiver that the byte has been inverted.

LPDDR4 data bus driver diagram
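The inversion rule described above can be sketched in a few lines of Python. This is only an illustrative model, not a reference implementation: the function names `dbi_encode` and `dbi_decode` are hypothetical, and the DMI bit is modeled as a plain integer returned alongside the byte.

```python
from typing import Tuple


def dbi_encode(byte: int) -> Tuple[int, int]:
    """Return (byte_on_bus, dmi_bit) for one data byte.

    Hypothetical helper: inverts the byte when it has more than
    four bits set to 1, as described in the text.
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte must be in the range 0..255")
    ones = bin(byte).count("1")
    if ones > 4:
        # Invert all eight bits and flag the inversion on DMI.
        return (~byte) & 0xFF, 1
    return byte, 0


def dbi_decode(bus_byte: int, dmi: int) -> int:
    """Receiver side: undo the inversion when DMI is set."""
    return (~bus_byte) & 0xFF if dmi else bus_byte


# 0xFE has seven ones, so it is sent inverted as 0x01 with DMI = 1.
encoded, dmi = dbi_encode(0xFE)
restored = dbi_decode(encoded, dmi)
```

With this rule, no byte on the bus ever carries more than four ones, which is what keeps the termination power down.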

FSP (Frequency Set Point)

LPDDR4 added two physical sets of registers, FSP0 and FSP1, so that the DRAM can switch between two operating frequencies without retraining. Each set stores the operating parameters the DRAM needs at one frequency; one set is in effect while the other is held as a shadow copy. The DRAM is trained at both frequencies, and the resulting parameters are stored in the register sets during command bus training. Switching from FSP0 to FSP1, or vice versa, can then be completed rapidly simply by writing to a mode register.

LPDDR4 FSP switching diagram
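As a rough mental model of the two register sets, consider the following sketch. Everything in it is an assumption made for illustration: the class `FspModel`, its method names, and the `clock_mhz` values are hypothetical and do not correspond to actual mode register fields.

```python
class FspModel:
    """Toy model of two frequency set points (FSP0 and FSP1)."""

    def __init__(self):
        self.sets = [{}, {}]    # index 0 -> FSP0, index 1 -> FSP1
        self.write_set = 0      # set that parameter writes go to
        self.operating_set = 0  # set the DRAM currently runs from

    @staticmethod
    def _check(index):
        if index not in (0, 1):
            raise ValueError("set point index must be 0 or 1")

    def select_write_set(self, index):
        """Choose which set receives subsequent parameter writes."""
        self._check(index)
        self.write_set = index

    def write_param(self, name, value):
        """Store a trained parameter in the currently selected write set."""
        self.sets[self.write_set][name] = value

    def switch_operating_set(self, index):
        """Model the single register write that changes the active set."""
        self._check(index)
        self.operating_set = index

    def active_params(self):
        """Return a copy of the parameters currently in effect."""
        return dict(self.sets[self.operating_set])


dram = FspModel()
dram.write_param("clock_mhz", 800)    # illustrative low-speed value in FSP0
dram.select_write_set(1)
dram.write_param("clock_mhz", 2133)   # illustrative high-speed value in FSP1
before = dram.active_params()         # still the FSP0 parameters
dram.switch_operating_set(1)
after = dram.active_params()          # now the FSP1 parameters
```

The point of the sketch is that filling the shadow set does not disturb the parameters in effect, and the switch itself is one cheap operation because both sets were populated ahead of time.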

TRR (Target Row Refresh)

Increasing memory density within the same chip size leads to smaller DRAM cells. A smaller cell stores less charge than a larger one, which reduces the noise margin and makes the system more prone to data errors. Densely packed cells are also less immune to crosstalk, which can likewise result in data errors.

Before any data operation can be performed on a row, the row must be activated. Here, “activate” means raising the cells of that row to a higher voltage level while the other rows of the bank remain at a lower level. When a row is activated repeatedly in rapid succession, its voltage level toggles just as often, and because the cells are so close together this accelerates the discharge of the cells in adjacent rows. DRAM cells store data as charge on capacitors, and that charge leaks away over time, so each cell must be refreshed within the refresh period to retain its contents. With an accelerated discharge rate, a capacitor in an adjacent row may lose its charge before the next refresh cycle arrives, and the data is lost.

To address this, LPDDR4 introduced the Target Row Refresh (TRR) mechanism. TRR limits the number of activates to a single row within a refresh period to the maximum activate count (MAC). Whenever the activate count of a row (the target row) reaches the MAC, the adjacent rows (the victim rows) are refreshed by the TRR procedure to avoid data loss.

LPDDR4 DRAM cell array structure
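The counting behavior described above can be illustrated with a small bookkeeping model. This is a simplified sketch under stated assumptions: the class `TrrTracker` is hypothetical, only the two immediately neighboring rows are treated as victims, and the MAC value used in the example is deliberately tiny so the effect is visible.

```python
from collections import Counter


class TrrTracker:
    """Toy per-bank activate counter (hypothetical, for illustration)."""

    def __init__(self, mac, num_rows):
        self.mac = mac            # maximum activate count per refresh period
        self.num_rows = num_rows  # rows in the modeled bank
        self.counts = Counter()   # activates seen per row in this period

    def activate(self, row):
        """Record one activate; return the victim rows to refresh, if any."""
        if not 0 <= row < self.num_rows:
            raise ValueError("row out of range")
        self.counts[row] += 1
        if self.counts[row] < self.mac:
            return []
        # The target row reached the MAC: refresh its physical neighbors
        # and restart counting for that row.
        self.counts[row] = 0
        return [r for r in (row - 1, row + 1) if 0 <= r < self.num_rows]

    def refresh_period_elapsed(self):
        """A full refresh period restores every row, so counts reset."""
        self.counts.clear()


bank = TrrTracker(mac=3, num_rows=8)  # mac=3 is an illustrative value only
results = [bank.activate(4) for _ in range(3)]
```

In this run the first two activates of row 4 return no victims, and the third returns rows 3 and 5 for refresh.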

I/O Signal Training

LPDDR4 provides several training procedures to align or readjust the delays on the I/O signals with respect to CLK or other signals. The standard LPDDR4 physical interface comprises the CLK, CS, CA, DQ and DQS signals, all of which need proper alignment for successful data transfers. Because CA is sampled on CLK, there must be a proper phase relationship between CA and CLK. Similarly, DQ is sampled on DQS, so those two signals must also maintain a phase relationship. LPDDR4 defines the following training mechanisms to maintain these relationships:

  • Command Bus Training (CBT): This is used to align the CS and CA signals with respect to the CLK signal. At power-up, the receivers are configured for low-speed operation, so they must be trained before operating at high frequencies. The CBT procedure readjusts the timing margins for the higher clock frequency. Entry to and exit from CBT mode are controlled by the mode register write command. In CBT mode, the DRAM switches to the FSP_OP settings on which it needs to be trained. The DRAM samples the CA bus on the CS signal and feeds the sampled values back to the controller, which uses them to adjust the timing of the CS and CA signals.
  • Write Leveling: This is used to adjust the delay of the DQS input signals with respect to the CLK signal. Entry to and exit from write leveling mode are controlled by the mode register write command. The controller drives the DQS signal, and the DRAM samples the CLK signal at the DQS edge. The DRAM then reports the captured CLK level to the controller on DQ. This feedback shows whether DQS is leading or lagging CLK, so that the controller can readjust the delay accordingly.
  • Write Training (DQS-DQ Training): This is used to align the delay of the DQ input signals with respect to the DQS input signal. To perform write training, the controller issues the MPC WR_DQ_FIFO command, which writes user-defined data to the DRAM. The controller then issues the MPC RD_DQ_FIFO command to read the data back from the same location and compares the written and read data to readjust the delay on the DQ lines.
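To make the feedback loop concrete, here is a minimal simulation of write leveling only. It is a sketch under several assumptions that are not part of the text above: CLK is modeled as an ideal square wave that is high during the first half of each period, the controller sweeps the DQS delay in fixed steps, and it stops at the first 0-to-1 transition in the feedback. The function names `sample_clk` and `write_level` are hypothetical.

```python
def sample_clk(dqs_edge_ps, period_ps):
    """DRAM-side model: CLK level captured at the DQS rising edge.

    Assumes an ideal clock that is high for the first half of each period.
    """
    phase = dqs_edge_ps % period_ps
    return 1 if 2 * phase < period_ps else 0


def write_level(initial_skew_ps, period_ps, step_ps):
    """Controller-side model: sweep the DQS delay until the feedback
    changes from 0 to 1, meaning the DQS edge has just reached the
    rising edge of CLK. Returns the delay that was added, in ps.
    """
    previous = sample_clk(initial_skew_ps, period_ps)
    delay = 0
    while delay < period_ps:
        delay += step_ps
        current = sample_clk(initial_skew_ps + delay, period_ps)
        if previous == 0 and current == 1:
            return delay
        previous = current
    raise RuntimeError("no 0-to-1 transition found within one clock period")


# DQS initially arrives 300 ps after the CLK rising edge, 1000 ps period.
added_delay = write_level(initial_skew_ps=300, period_ps=1000, step_ps=10)
```

Here the sweep adds 700 ps, which places the DQS edge on the next CLK rising edge. The residual misalignment is bounded by the step size, which is why a finer step gives a tighter alignment at the cost of a longer sweep.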

Together, these features make LPDDR4 a complete package and an ideal choice of RAM for any mobile SoC. They must all be addressed in the verification plan of any LPDDR4-based SoC design. 草榴社区 provides a complete verification solution for LPDDR4 with run-time selection of JEDEC and vendor parts, a set of built-in protocol, timing and data integrity checks, configurable timing parameters, built-in functional coverage and a verification plan, and backdoor access to memory.
