LiteDRAM Memory Controller
- LiteX Repo, BoxLambda fork, boxlambda branch: https://github.com/epsilon537/litex.
- LiteX Submodule in the BoxLambda Directory Tree: boxlambda/sub/litex/.
- LiteDRAM Component in the BoxLambda Directory Tree: boxlambda/gw/components/litedram
SDRAM access is pretty complicated. Memory access requests get queued in the memory controller, scheduled, and turned into a sequence of commands whose execution time depends on which memory locations were accessed recently.
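To make the history dependence concrete, here is a toy model of a single DRAM bank with one open row. It is only a sketch: the cycle counts are made-up placeholders (not datasheet values), and a real controller such as LiteDRAM also deals with multiple banks, refresh, and request reordering.

```python
# Toy model of one DRAM bank with a single open row (row buffer).
# The cycle counts are illustrative placeholders, not MT41K128M16 datasheet values.
T_CAS = 5   # column access on an already-open row
T_RCD = 5   # activate a row before it can be accessed
T_RP = 5    # precharge (close) the currently open row


def access_latencies(rows):
    """Return the latency in cycles of each access, given the row each access targets."""
    open_row = None
    latencies = []
    for row in rows:
        if row == open_row:
            latency = T_CAS                  # row hit
        elif open_row is None:
            latency = T_RCD + T_CAS          # bank idle: activate, then access
        else:
            latency = T_RP + T_RCD + T_CAS   # row conflict: precharge, activate, access
        open_row = row
        latencies.append(latency)
    return latencies


print(access_latencies([3, 3, 7, 7, 3]))  # [10, 5, 15, 5, 15]
```

The same kind of access costs 5, 10, or 15 cycles in this model, depending purely on what came before it.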
There exists a class of memory controllers, called Static Memory Controllers, that absorb these complexities and by design create a fixed schedule for a fixed use case, resulting in very predictable behavior. Static Memory Controllers are far off the beaten path, however. Dynamic Memory Controllers are more common. Dynamic Memory Controllers can handle a variety of use cases with good performance on average. Unfortunately, they sacrifice predictability to achieve this flexibility.
I decided to use the LiteDRAM memory controller: https://github.com/enjoy-digital/litedram
The LiteDRAM memory controller falls squarely into the Dynamic Memory Controller class. How do we fit this into a platform that requires deterministic behavior? I think the best approach is to use a DMA engine to transfer data between SDRAM and on-chip memory. Fixed memory access latency to on-chip memory (from any bus master that requires it) can be guaranteed by using Dual Port Memory and a crossbar.
Why choose LiteDRAM over Xilinx MIG?
- LiteDRAM is open-source, scoring good karma points. All the benefits of open-source apply: Full access to all code, access to the maintainers, many eyeballs, the option to make changes as you please, submit bug fixes, etc.
- The LiteDRAM simulation model, the entire test SoC, in fact, runs nicely in Verilator. That's a must-have for me.
- The LiteDRAM core, configured for BoxLambda, is almost 50% smaller than the equivalent MIG core.
Generating a LiteDRAM core
LiteDRAM is a highly configurable core. For an overview of the core's features, please take a look at the LiteDRAM repository's README file:
https://github.com/enjoy-digital/litedram/blob/master/README.md
You specify the configuration details in a .yml file. A Python script parses that .yml file and generates the core's Verilog as well as a CSR register access layer for software.
Details are a bit sparse, but luckily example configurations are provided:
https://github.com/enjoy-digital/litedram/tree/master/examples
Starting from the arty.yml example, I created the following LiteDRAM configuration file for BoxLambda:
{
# General ------------------------------------------------------------------
"speedgrade": -1, # FPGA speedgrade
"cpu": "None", # CPU type (ex vexriscv, serv, None) - We only want to generate the LiteDRAM memory controller, no CPU.
"memtype": "DDR3", # DRAM type
"uart": "rs232", # Type of UART interface (rs232, fifo) - not relevant in this configuration.
# PHY ----------------------------------------------------------------------
"cmd_latency": 0, # Command additional latency
"sdram_module": "MT41K128M16", # SDRAM modules of the board or SO-DIMM
"sdram_module_nb": 2, # Number of byte groups
"sdram_rank_nb": 1, # Number of ranks
"sdram_phy": "A7DDRPHY", # Type of FPGA PHY
# Electrical ---------------------------------------------------------------
"rtt_nom": "60ohm", # Nominal termination
"rtt_wr": "60ohm", # Write termination
"ron": "34ohm", # Output driver impedance
# Frequency ----------------------------------------------------------------
# The generated LiteDRAM module contains clock generation primitives, for its own purposes, but also for the rest
# of the system. The system clock is output by the LiteDRAM module and is supposed to be used as the main input clock
# for the rest of the system. I set the system clock to 50MHz because I couldn't get timing closure at 100MHz.
"input_clk_freq": 100e6, # Input clock frequency
"sys_clk_freq": 50e6, # System clock frequency (DDR_clk = 4 x sys_clk)
"iodelay_clk_freq": 200e6, # IODELAYs reference clock frequency
# Core ---------------------------------------------------------------------
"cmd_buffer_depth": 16, # Depth of the command buffer
# User Port ---------------------------------------------------------------
# Note that this is a _classic_ Wishbone port, while BoxLambda uses a _pipelined_ Wishbone bus.
# A pipelined-to-classic wishbone adapter is needed to interface correctly to the bus.
# At some point it would be nice to have an actual pipelined wishbone frontend, with actual pipelining capability.
"user_ports": {
"wishbone_0" : {
"type": "wishbone",
"data_width": 32, #Set data width to 32. If not specified, it defaults to 128 bits.
"block_until_ready": True,
}
},
}
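As a sanity check on the Frequency and User Port settings, the following sketch derives a few numbers from the configuration values. It assumes the DDR_clk = 4 x sys_clk relation stated in the config comment and one byte lane (8 bits) per byte group. The results are theoretical ceilings, not measured throughput.

```python
# Values copied from the configuration above.
sys_clk_freq = 50e6      # "sys_clk_freq"
sdram_module_nb = 2      # "sdram_module_nb": number of byte groups
user_port_width = 32     # "data_width" of wishbone_0

# Assumption: DDR_clk = 4 x sys_clk, as stated in the config comment.
ddr_clk_freq = 4 * sys_clk_freq
# DDR transfers data on both clock edges.
transfers_per_sec = 2 * ddr_clk_freq
# Assumption: each byte group is 8 data bits wide.
dram_bus_bits = 8 * sdram_module_nb

dram_peak_bytes_per_sec = transfers_per_sec * dram_bus_bits / 8
# Upper bound for the user port: one 32-bit word per system clock cycle.
# A classic Wishbone transfer takes more than one cycle, so real throughput is lower.
user_port_peak_bytes_per_sec = sys_clk_freq * user_port_width / 8

print(f"DDR clock:           {ddr_clk_freq / 1e6:.0f} MHz")
print(f"DRAM bus width:      {dram_bus_bits} bits")
print(f"DRAM peak bandwidth: {dram_peak_bytes_per_sec / 1e6:.0f} MB/s")
print(f"User port ceiling:   {user_port_peak_bytes_per_sec / 1e6:.0f} MB/s")
```

Under these assumptions the 32-bit user port at 50MHz, not the DRAM interface, is the tighter bound.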
Some points about the above:
- I left the PHY, Electrical, and Core sections exactly as they are in the Arty example.
- In the General section, I set cpu to None. BoxLambda already has a CPU. We don't need LiteX to generate one.
- In the Frequency section, I set sys_clk_freq to 50MHz. The generated core will also provide a double-rate 100MHz clock next to this 50MHz clock.
- In the User Ports section, I specified one 32-bit Wishbone port.
I generate two LiteDRAM core variants from this configuration:
- For simulation:
litedram_gen artya7dram.yml --sim --gateware-dir sim/rtl --software-dir sim/sw --name litedram
- For FPGA:
litedram_gen artya7dram.yml --gateware-dir arty/rtl --software-dir arty/sw --name litedram
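If you would rather script the two invocations than type them, a small helper along these lines could work. It is a sketch: litedram_gen_cmd is a hypothetical helper name, and it only uses the flags shown in the commands above. The subprocess call is left commented out because it requires litedram_gen to be installed and on the PATH.

```python
import shlex


def litedram_gen_cmd(config, out_dir, sim=False, name="litedram"):
    """Build the litedram_gen argument list for one core variant (hypothetical helper)."""
    cmd = ["litedram_gen", config]
    if sim:
        cmd.append("--sim")
    cmd += ["--gateware-dir", f"{out_dir}/rtl",
            "--software-dir", f"{out_dir}/sw",
            "--name", name]
    return cmd


for out_dir, sim in (("sim", True), ("arty", False)):
    cmd = litedram_gen_cmd("artya7dram.yml", out_dir, sim=sim)
    print(shlex.join(cmd))
    # To actually generate the core:
    # import subprocess; subprocess.run(cmd, check=True)
```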
The generated core has the following interface:
module litedram (
`ifndef SYNTHESIS
input wire sim_trace, /*Simulation only.*/
`endif
`ifdef SYNTHESIS
output wire pll_locked, /*FPGA only...*/
input wire rst,
input wire clk,
output wire [13:0] ddram_a,
output wire [2:0] ddram_ba,
output wire ddram_cas_n,
output wire ddram_cke,
output wire ddram_clk_n,
output wire ddram_clk_p,
output wire ddram_cs_n,
output wire [1:0] ddram_dm,
inout wire [15:0] ddram_dq,
inout wire [1:0] ddram_dqs_n,
inout wire [1:0] ddram_dqs_p,
output wire ddram_odt,
output wire ddram_ras_n,
output wire ddram_reset_n,
output wire ddram_we_n,
`endif
output wire init_done, /*FPGA/Simulation common ports...*/
output wire init_error,
output wire user_clk,
output wire user_clkx2,
output wire user_port_wishbone_0_ack,
input wire [25:0] user_port_wishbone_0_adr,
input wire user_port_wishbone_0_cyc,
output wire [31:0] user_port_wishbone_0_dat_r,
input wire [31:0] user_port_wishbone_0_dat_w,
output wire user_port_wishbone_0_err,
input wire [3:0] user_port_wishbone_0_sel,
input wire user_port_wishbone_0_stb,
input wire user_port_wishbone_0_we,
output wire user_rst,
output wire wb_ctrl_ack,
input wire [29:0] wb_ctrl_adr,
input wire [1:0] wb_ctrl_bte,
input wire [2:0] wb_ctrl_cti,
input wire wb_ctrl_cyc,
output wire [31:0] wb_ctrl_dat_r,
input wire [31:0] wb_ctrl_dat_w,
output wire wb_ctrl_err,
input wire [3:0] wb_ctrl_sel,
input wire wb_ctrl_stb,
input wire wb_ctrl_we
);
Some points worth noting about this interface:
- A Wishbone control port is generated along with the user port. LiteDRAM CSR register access is done through the control port.
- Both Wishbone ports are classic Wishbone ports, not pipelined. There is no stall signal.
- The Wishbone port addresses are word addresses, not byte addresses.
- The LiteDRAM module takes an external input clock (clk) and generates both a 50MHz system clock (user_clk) and a 100MHz double-rate system clock (user_clkx2). The LiteDRAM module contains a PLL clock primitive.
- The double-rate system clock is a modification for BoxLambda. The vanilla LiteDRAM/LiteX code base only generates one user_clk.
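The word-addressing point is easy to trip over, so here is a sketch of the byte-address to word-address mapping for the 32-bit user port. The helper name to_wishbone is hypothetical. The sel encoding assumes the usual Wishbone convention that sel bit i covers data bits [8i+7:8i], combined with a little-endian byte order; both are assumptions about the surrounding system, not something the generated core dictates.

```python
ADR_BITS = 26        # width of user_port_wishbone_0_adr
BYTES_PER_WORD = 4   # 32-bit data port


def to_wishbone(byte_addr, size=4):
    """Map a byte address and access size (1, 2 or 4 bytes) to (adr, sel)."""
    if size not in (1, 2, 4) or byte_addr % size:
        raise ValueError("unsupported size or misaligned access")
    adr = byte_addr // BYTES_PER_WORD       # word address driven onto the port
    if adr >= 1 << ADR_BITS:
        raise ValueError("address outside the port's range")
    lane = byte_addr % BYTES_PER_WORD       # first byte lane touched (little-endian assumption)
    sel = ((1 << size) - 1) << lane         # one sel bit per byte lane
    return adr, sel


print([hex(v) for v in to_wishbone(0x1004)])      # ['0x401', '0xf']
print([hex(v) for v in to_wishbone(0x1006, 2)])   # ['0x401', '0xc']
print((1 << ADR_BITS) * BYTES_PER_WORD // 2**20, "MiB addressable")  # 256 MiB addressable
```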
Litedram_wrapper
I created a litedram_wrapper.sv module around litedram.v.
This wrapper contains the Pipelined-to-Classic Wishbone adaptation. The adapter logic comes straight out of the Wishbone B4 spec, section 5.2, "Pipelined master connected to standard slave". The stall signal is used to avoid pipelining:
/*Straight out of the Wishbone B4 spec. This is how you interface a classic slave to a pipelined master.
*The stall signal ensures that the STB signal remains asserted until an ACK is received from the slave.*/
assign user_port_wishbone_p_0_stall = !user_port_wishbone_p_0_cyc ? 1'b0 : !user_port_wishbone_c_0_ack;
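To see what that one-liner does over time, here is a small cycle-by-cycle Python model of it. This is a behavioral sketch, not the wrapper's RTL: it assumes a single outstanding transfer, a master that drops STB once its request is accepted (stall low), and a slave that acks a fixed number of cycles after the request appears.

```python
def stall(cyc, ack):
    """Python mirror of: stall = !cyc ? 1'b0 : !ack"""
    return False if not cyc else not ack


def run_transfer(ack_delay):
    """Simulate one transfer against a classic slave that acks after ack_delay cycles.

    Returns a per-cycle trace of (stb, stall, ack).
    """
    trace = []
    cyc = stb = True
    cycle = 0
    while cyc:
        ack = cycle == ack_delay
        s = stall(cyc, ack)
        trace.append((stb, s, ack))
        if stb and not s:
            stb = False   # request accepted: the master may drop STB
        if ack:
            cyc = False   # transfer complete
        cycle += 1
    return trace


for stb, s, ack in run_transfer(ack_delay=2):
    print(f"stb={int(stb)} stall={int(s)} ack={int(ack)}")
```

In this model STB stays high for every cycle in which stall is high, and stall only drops in the cycle the ACK arrives, so the pipelined master never gets a second request in flight.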
LiteDRAM Clock Frequency
See Clocks and Reset