LTE Downlink Shared Channel Processing with GPU Acceleration

This example shows how you can use GPUs to accelerate bit error rate simulations. It uses a portion of the transport channel processing for the Downlink Shared Channel (eNodeB to UE) of the Long Term Evolution (LTE) specification developed by the Third Generation Partnership Project (3GPP) [1]. For more information, see the Downlink Transport Channel (DL-SCH) Processing example.

You must have a Parallel Computing Toolbox™ license to run the GPU portion of this example.

Notice: Supply of this software does not convey a license nor imply any right to use any Turbo codes patents owned by France Telecom, Telediffusion de France and/or Groupe des Ecoles des Telecommunications except in connection with use of the software for the purposes of design, simulation and analysis. Code generated from Turbo codes technology in this software is not intended and/or suitable for implementation or incorporation in any commercial products.

Please contact France Telecom for information about Turbo Codes Licensing program at the following address: France Telecom R&D - PIV/TurboCodes 38-40, rue du General Leclerc 92794 Issy-les-Moulineaux Cedex 9, France.

Launch the GUI


Overview of the Simulation

This simulation models part of the Downlink Shared Channel of the LTE system. The LTE specification segments transport blocks that exceed 6144 bits into smaller subblocks. Each subblock has its own CRC attached and is turbo encoded separately; a further CRC covers the transport block as a whole. Because of this scheme, the subblocks can be decoded independently and, if desired, in parallel. This is where the CPU and GPU simulations differ: the CPU simulation turbo decodes the subblocks one after another, while the GPU simulation turbo decodes all subblocks within a transport block in parallel. Both simulations use early termination on a per-subblock basis, checking the subblock CRC after each turbo decoding iteration and stopping as soon as it passes.
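The control flow described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the code used by the example: the function names are hypothetical, zlib.crc32 stands in for the LTE CRC, the segmentation ignores the block-size table and filler bits defined in [1], and the "decoder" is a toy that repairs one known bit flip per iteration so that the CRC-based early termination is visible.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

MAX_SUBBLOCK_BITS = 6144   # size limit quoted in the text
CRC_BITS = 32              # assumption: zlib.crc32 stands in for the LTE CRC


def crc_of(bits):
    """CRC of a list of 0/1 values, returned as a list of CRC_BITS bits."""
    value = zlib.crc32(bytes(bits))
    return [(value >> i) & 1 for i in range(CRC_BITS)]


def segment(bits):
    """Split bits into subblocks of at most MAX_SUBBLOCK_BITS, CRC included."""
    payload = MAX_SUBBLOCK_BITS - CRC_BITS
    chunks = [bits[i:i + payload] for i in range(0, len(bits), payload)]
    return [chunk + crc_of(chunk) for chunk in chunks]


def crc_passes(block):
    """True if the trailing CRC matches the payload that precedes it."""
    return crc_of(block[:-CRC_BITS]) == block[-CRC_BITS:]


def toy_decode(job, max_iterations=6):
    """Toy iterative decoder: each iteration repairs one corrupted position.

    Returns (block, iterations_used). A real turbo decoder would refine
    soft decisions instead; only the early-termination check is realistic.
    """
    received, corrupted_positions = job
    block = list(received)
    pending = list(corrupted_positions)
    for iteration in range(1, max_iterations + 1):
        if pending:
            block[pending.pop()] ^= 1
        if crc_passes(block):          # early termination on a passing CRC
            return block, iteration
    return block, max_iterations


def decode_sequential(jobs):
    """CPU-style schedule: one subblock after another."""
    return [toy_decode(job) for job in jobs]


def decode_parallel(jobs):
    """GPU-style schedule: all subblocks of a transport block at once."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(toy_decode, jobs))


def corrupt(block, positions):
    """Flip the given bit positions; return a job for toy_decode."""
    damaged = list(block)
    for p in positions:
        damaged[p] ^= 1
    return damaged, positions
```

Because every subblock carries its own CRC, each call to toy_decode needs nothing from the other subblocks, which is why the sequential and parallel schedules return the same result. The thread pool only illustrates that independence; it is not meant to reproduce GPU performance.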

Running the Simulation

In the Mode button group, select the CPU option for a CPU-only simulation. Click Start Simulation to begin creating the bit error rate curve. You can run the simulation to completion or stop it early by clicking the Stop button. To run the GPU version, select the GPU option in the Mode button group and click Start Simulation. The simulation then computes the bit error rate using a GPU-based turbo decoder. Runtime statistics are printed below the bit error rate plot.

Error Rate Performance

You can plot the bit error rate curve for either version of the code. The Number of Errors field determines the number of errors required to plot a single point. Enter the desired number of errors and then click Start Simulation. Both CPU and GPU achieve exactly the same bit error rate in the simulation. There is no accuracy penalty for GPU acceleration.
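The role of the Number of Errors setting can be illustrated with a small Python sketch that keeps simulating until a target number of bit errors has been counted, then reports one point of the curve. This is an assumption-laden stand-in, not the example's code: it uses uncoded BPSK over an AWGN channel instead of the turbo-coded LTE chain, and the function name ber_point and its parameters are hypothetical.

```python
import math
import random


def ber_point(ebno_db, target_errors, max_bits=10_000_000, seed=0):
    """Estimate one bit error rate point for uncoded BPSK over AWGN.

    Simulation stops once target_errors bit errors have been counted
    (or max_bits bits have been sent, as a safety limit).
    Returns (bit_error_rate, errors, bits_sent).
    """
    rng = random.Random(seed)
    ebno = 10 ** (ebno_db / 10.0)
    sigma = math.sqrt(1.0 / (2.0 * ebno))   # noise std dev for unit-energy bits
    errors = 0
    bits_sent = 0
    while errors < target_errors and bits_sent < max_bits:
        bit = rng.getrandbits(1)
        symbol = 1.0 - 2.0 * bit             # map 0 -> +1, 1 -> -1
        received = symbol + rng.gauss(0.0, sigma)
        decided = 1 if received < 0 else 0   # hard decision
        if decided != bit:
            errors += 1
        bits_sent += 1
    return errors / bits_sent, errors, bits_sent


if __name__ == "__main__":
    ber, errors, bits_sent = ber_point(ebno_db=2.0, target_errors=50)
    print(f"BER {ber:.4f} from {errors} errors in {bits_sent} bits")
```

A larger error target gives a statistically more reliable point at the cost of a longer run, which is the trade-off the Number of Errors field exposes.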

Load Data

If you do not have a GPU available, you can still see the benefits of GPU acceleration. Click the GPU button in the Mode button group, and then click Load Data. A previously computed bit error rate curve and its associated statistics are loaded into the GUI. Similarly, you can load a previously computed bit error rate curve and statistics for a CPU simulation by clicking the CPU button and then Load Data. The saved CPU simulation was run on an Intel Xeon X5650 at 2.67 GHz with 6 cores. The saved GPU simulation was run on an NVIDIA K20c with 13 multiprocessors.


Once both the CPU and GPU simulations have run to completion (or have been loaded from the saved data), an overall speedup is computed and displayed in the runtime pane. With the saved data for both CPU and GPU, the GPU simulation runs 9 times faster than the CPU simulation.

Code Differences

To see the changes in the original CPU source code necessary for the GPU implementation, click View Code Differences. This launches the comparison tool to view the changes necessary for GPU acceleration.

Analysis of the Results

This example shows how a GPU can be used to accelerate a simulation for part of the LTE system. With the saved data, the GPU simulation runs about 9 times faster than the CPU simulation, and the code changes required are minor. This performance improvement comes at no cost in accuracy: the bit error rates computed by the CPU and the GPU are identical.


The CPU and GPU versions of the LTE system simulation are:

Selected Bibliography

  1. 3GPP Technical Specification Group Radio Access Network, "Evolved Universal Terrestrial Radio Access (E-UTRA); Multiplexing and channel coding (Release 10)", 3GPP TS 36.212 v10.0.0 (2010-12).