Reference Guide

Direct from Development
Server and Infrastructure Engineering
Copyright © 2020 Dell Inc. or its subsidiaries. All Rights Reserved.
Dell, EMC and other trademarks are trademarks of Dell Inc. or its subsidiaries.
NUMA Configuration Settings on 2nd Generation AMD EPYC
Introduction
AMD EPYC is a Multi-Chip Module processor. With the 2nd generation AMD EPYC 7002 series (Rome), the silicon package was simplified. The package is divided into 4 quadrants, with up to 2 Core Complex Dies (CCDs) per quadrant. Each CCD consists of two Core Complexes (CCXs), and each CCX has 4 cores that share an L3 cache. All CCDs communicate through a single central I/O die (IOD).
There are 8 memory controllers per socket supporting 8 memory channels of DDR4 running at 3200 MT/s, with up to 2 DIMMs per channel. See Figure 1 below:
Figure 1 - Illustration of the Rome core and memory architecture
With this architecture, all cores on a single CCD are closest to 2
memory channels. The rest of the memory channels are across the
IO die, at differing distances from these cores. Memory interleaving
allows a CPU to efficiently spread memory accesses across
multiple DIMMs. This allows more memory accesses to execute
without waiting for one to complete, maximizing performance.
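The round-robin spreading of consecutive cache lines across channels can be sketched as follows. This is a deliberately simplified illustration, not AMD's actual physical address map; the function name, cache-line granularity, and round-robin policy are assumptions for the sketch.

```python
# Hypothetical sketch of channel interleaving: consecutive 64-byte
# cache lines are spread round-robin across the interleaved channels.
# This is an illustration only, not the real Rome address decoding.

CACHE_LINE = 64  # bytes per cache line

def channel_for_address(addr: int, num_channels: int) -> int:
    """Return the memory channel serving a physical address when
    cache lines are interleaved round-robin across num_channels."""
    return (addr // CACHE_LINE) % num_channels

# Eight consecutive cache lines land on eight different channels,
# so a streaming access pattern keeps every channel busy in parallel.
lines = [channel_for_address(a, 8) for a in range(0, 8 * CACHE_LINE, CACHE_LINE)]
print(lines)  # -> [0, 1, 2, 3, 4, 5, 6, 7]
```

Because a sequential sweep touches all channels in turn, no single channel becomes the bottleneck for streaming workloads.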
NUMA and NPS
Rome processors achieve memory interleaving by grouping cores and memory into Non-Uniform Memory Access (NUMA) domains, configured through the Nodes Per Socket (NPS) BIOS setting. The below NPS options can be used for different workload types:
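The effect of NPS on core-to-node partitioning can be sketched as below. This assumes a 64-core Rome socket with cores numbered quadrant by quadrant; the function name and the contiguous core numbering are assumptions for illustration, and real core enumeration depends on the BIOS and OS.

```python
# Hypothetical sketch: NPS divides one socket into 1, 2, or 4 NUMA
# nodes, each owning an equal share of the cores (and memory channels).
# Assumes cores 0..63 are numbered contiguously per quadrant.

def numa_node_for_core(core: int, nps: int, cores_per_socket: int = 64) -> int:
    """Map a core index to its NUMA node under NPS1, NPS2, or NPS4."""
    assert nps in (1, 2, 4), "Rome supports NPS1, NPS2, and NPS4"
    cores_per_node = cores_per_socket // nps
    return core // cores_per_node

print(numa_node_for_core(0, 4))   # -> 0 (first quadrant)
print(numa_node_for_core(63, 4))  # -> 3 (last quadrant)
print(numa_node_for_core(63, 1))  # -> 0 (NPS1: whole socket is one node)
```

Higher NPS values give each node fewer, closer memory channels (lower local latency), while NPS1 interleaves across the whole socket for uniform bandwidth.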
Tech Note by Mohan Rokkam
Summary
In multi-chip processors like the AMD EPYC series, differing distances between a CPU core and the memory can cause Non-Uniform Memory Access (NUMA) issues. AMD offers a variety of settings to help limit the impact of NUMA. One of the key options is called Nodes per Socket (NPS). This paper talks about some of the recommended NPS settings for different workloads.
