BladeSymphony® 1000 Architecture White Paper
Table of Contents

Chapter 1: Introduction
Chapter 2: System Architecture Overview
Chapter 3: Intel Itanium Server Blade
Chapter 4: Intel Xeon Server Blade
Chapter 5: I/O Sub System
Chapter 6: Reliability and Serviceability Features
Chapter 7: Chassis, Power, and Cooling
Chapter 8: Management Software
Chapter 9: Virtage
Chapter 10: Summary
Chapter 1 Introduction

Executive Summary

Blade servers pack more compute power into a smaller space than traditional rack-mounted servers. This capability makes them an attractive alternative for consolidating servers, balancing or optimizing data center workloads, or simply running a wide range of applications at the edge or the Web tier.
BladeSymphony 1000 (Figure 1) is the first blade system designed specifically for enterprise-class, mission-critical workloads. It is a 10 rack unit (RU) system that combines Hitachi's Virtage embedded virtualization technology; a choice of dual-socket, multi-core Intel Xeon and/or dual-core Intel Itanium Server Blades (running Windows or Linux); centralized management capabilities; high-performance I/O; and sophisticated reliability, availability, and serviceability (RAS) features.

Figure 1. BladeSymphony 1000.
• Reliability — Reliability is increased through redundant, hot-swappable components.
Chapter 2 System Architecture Overview BladeSymphony 1000 features a very modular design to maximize flexibility and reliability. System elements are redundant and hot-swappable so the system can be easily expanded without downtime or unnecessary disruption to service levels.
Figure 3. Front and rear views of the chassis: server blades (configurable up to 8-way SMP), Switch & Management Modules, and the backplane; I/O modules (16 slots total, 8 modules maximum); HDD modules holding up to 3 HDDs each (a 6-HDD module occupies the space of two modules); and an optional embedded Fibre Channel switch module.
Chapter 3 Intel Itanium Server Blade

The BladeSymphony 1000 can support up to eight blades for a total of up to 16 Itanium CPU sockets, or 32 cores, running Microsoft Windows or Linux. Up to four Intel Itanium Server Blades can be connected via the high-speed backplane to form a high-performance SMP server of up to 16 cores. Each Intel Itanium Server Blade, illustrated in Figure 4, includes 16 DDR2 main memory slots.
Table 1: Intel Itanium Server Blade features

Item | Specifications
Memory capacity | Max. 64 GB per server blade (if 4 GB DIMMs are used)
Memory type | DDR2 240-pin registered DIMM, 1-rank or 2-rank; frequency: DDR2-400 (3-3-3); capacity: 512 MB, 1 GB, 2 GB, 4 GB (DDR2-533); configuration: 4-bit x 18 devices or 36 devices
Memory availability | Advanced ECC, online spare memory, and scrubbing supported
Node link for SMP | Three interconnect ports
PCI Express | x4 links, 2 ports
Gigabit Ethernet | GbE (SerDes, 1.25 Gb/sec.)
Table 2: Main components of the Intel Itanium Server Blade

Component | Manufacturer | Quantity | Description
Bridge | Intel | 1 | PCIe to PCI-X bridge
South Bridge | Intel | 1 | South bridge; connects legacy devices
Super I/O | SMSC | 1 | Super I/O chip; contains the COM port and other legacy devices
FW ROM | ATMEL/STMicro | 8 MB | Flash ROM storing the images of system firmware; also used as NVRAM under the control of the system firmware
Gigabit Ethernet | Intel | 1 | Gigabit Ethernet interface controller; two ports, SerDes connection; Wake on LAN
The Intel Itanium processor is optimized for dual-processor platforms and clusters and includes the following features:

• Wide, parallel hardware based on the Itanium architecture for high performance
  – Integrated on-die cache of up to 24 MB, with cache hints for the L1, L2, and L3 caches to reduce memory latency
  – 128 general and 128 floating-point registers supporting register rotation
  – Register stack engine for effective management of processor resources
  – Support for predication and speculation
• Extensive RAS features
Enhanced Machine Check Architecture provides extensive error detection and address/data path correction capabilities, as well as system-wide ECC protection. It detects bit-level errors and manages data corruption, thereby providing better reliability and uptime.

Intel Virtualization Technology (Intel VT)

The Dual-Core Intel Itanium processor includes hardware-assisted virtualization support that helps increase virtualization efficiency and broaden operating system compatibility.
Table 3: Bus throughput from the Hitachi Node Controller

Bus | Throughput
Connection between nodes | 400 MHz FSB: 4.8 GB/sec.; 667 MHz FSB: 5.3 GB/sec.

Baseboard Management Controller

The Baseboard Management Controller (BMC) is the main controller for the Intelligent Platform Management Interface (IPMI), a common interface to hardware and firmware used to monitor system health and manage the system. The BMC manages the interface between system management software and the hardware in the server blade.
• ECC — The ECC can correct an error in four consecutive bits within any four-DIMM set (i.e., a fault in one DRAM device). This function is equivalent to the technology generally referred to as Chipkill and allows the contents of memory to be reconstructed even if one chip fails completely. The concept is similar to the way RAID protects content on disk drives, as the sketch below illustrates.
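The RAID analogy can be made concrete with a toy model. The following Python sketch is purely illustrative: real Chipkill-class ECC uses symbol-based codes rather than simple parity, and none of these names come from Hitachi's implementation. It stripes a cache line across DRAM "devices" with an XOR parity device and rebuilds the contents of a completely failed device from the survivors.

```python
# Toy illustration of the RAID-like idea behind Chipkill-style ECC:
# data is striped across DRAM devices plus a parity device, so the
# contents of any single failed device can be reconstructed by XOR.
from functools import reduce

def make_parity(devices: list[bytes]) -> bytes:
    """Compute a parity 'device' as the XOR of all data devices."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*devices))

def reconstruct(devices: list[bytes | None], parity: bytes) -> list[bytes]:
    """Rebuild the single failed device (marked None) from the survivors."""
    failed = devices.index(None)
    survivors = [d for d in devices if d is not None] + [parity]
    rebuilt = bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*survivors))
    return devices[:failed] + [rebuilt] + devices[failed + 1:]

# Four "DRAM devices", each holding 4 bytes of a cache line.
chips = [b"\x11\x22\x33\x44", b"\xaa\xbb\xcc\xdd",
         b"\x01\x02\x03\x04", b"\xf0\xf1\xf2\xf3"]
parity = make_parity(chips)

chips[2] = None                       # one DRAM device fails completely
restored = reconstruct(chips, parity)
assert restored[2] == b"\x01\x02\x03\x04"
```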
Figure 6. Intel Itanium Server Blade interconnect block diagram: processor bus at 6.4 GB/s (400 MHz FSB) or 10.6 GB/s (667 MHz FSB); node bandwidth of 4.8 GB/s or 5.3 GB/s; DDR2 memory bus at 4.8 GB/s or 5.3 GB/s; Hitachi Node Controllers (NDC) with L3 cache copy tags connected point-to-point in a low-latency CC-NUMA configuration; memory controllers (MC) driving DDR2 memory; and a PCI bridge providing PCI Express (4-lane) links and 2 GB/s x3 PCI buses to the PCI slots.
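As a consistency check, the figure's processor-bus numbers match the Itanium front-side bus's 128-bit (16-byte) data path, assuming one 16-byte transfer per cycle at the quoted transfer rate (the link widths behind the 4.8/5.3 GB/s node and memory figures are not given in the paper):

$$
400\ \mathrm{MT/s} \times 16\ \mathrm{B} = 6.4\ \mathrm{GB/s},
\qquad
667\ \mathrm{MT/s} \times 16\ \mathrm{B} \approx 10.6\ \mathrm{GB/s}
$$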
SMP Configuration Options

BladeSymphony 1000 supports two-socket (four-core) Intel Itanium Server Blades that can be scaled into up to two 16-core SMP servers in a single chassis, eight four-core servers, or a mixture of SMP and single-module systems, reducing footprint and power consumption while increasing utilization and flexibility. SMP provides higher performance for applications that can utilize large memory and multiple processors, such as large databases or visualization applications.
• Full interleave mode (or SMP mode) — Intended for use with an OS that has no or inadequate support for the NUMA architecture. In full interleave mode, main memory is interleaved across CPU modules in fixed-size units. Because memory accesses do not concentrate on one CPU module, memory bus bottlenecks are less likely and latency is averaged across CPUs.
• Non-interleave mode — This mode keeps the ratio of local memory constant, so memory stays local to each CPU module and a NUMA-aware OS can place data close to the processors that use it. The address-mapping sketch below contrasts the two modes.
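To make the difference concrete, here is a minimal address-mapping sketch in Python. The granularity, module count, and names are illustrative assumptions, not Hitachi's actual mapping: full interleave rotates consecutive cache lines across modules, while non-interleave gives each module a contiguous region.

```python
# Illustrative only: where a physical address "lives" under each mode.
CACHE_LINE = 128           # assumed interleave granularity, in bytes
NUM_MODULES = 4
MEM_PER_MODULE = 16 << 30  # example: 16 GB of memory per CPU module

def home_module_interleaved(addr: int) -> int:
    """Full interleave: consecutive lines rotate across CPU modules."""
    return (addr // CACHE_LINE) % NUM_MODULES

def home_module_local(addr: int) -> int:
    """Non-interleave: each module owns a contiguous address range."""
    return addr // MEM_PER_MODULE

for addr in range(0, 5 * CACHE_LINE, CACHE_LINE):
    print(addr, home_module_interleaved(addr), home_module_local(addr))
```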
L3 Cache Copy Tag

The data residing in caches and main memory across Intel Itanium Server Blades is kept coherent using a snooping cache coherency protocol. When one of the Intel Itanium processors needs to access memory, the requested address is broadcast by the Hitachi Node Controller. The other Node Controllers that are part of that SMP partition listen for (snoop) those broadcasts.
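In general, a copy tag makes this snooping cheap: because each Node Controller keeps a duplicate of its local processors' L3 cache tags, it can answer most broadcasts itself and only involve the CPUs on a possible hit. The sketch below is a minimal Python illustration of that filtering idea; the class and method names are assumptions, not Hitachi's design.

```python
# Minimal sketch of snoop filtering with a copy tag (illustrative only).
class NodeController:
    def __init__(self, name: str):
        self.name = name
        self.copy_tag: set[int] = set()   # line addresses cached locally

    def fill(self, line_addr: int) -> None:
        """Record that a local processor has cached this line."""
        self.copy_tag.add(line_addr)

    def snoop(self, line_addr: int) -> bool:
        """Forward the snoop to local CPUs only on a possible hit."""
        hit = line_addr in self.copy_tag
        if hit:
            print(f"{self.name}: forwarding snoop for line {line_addr:#x}")
        return hit

nodes = [NodeController(f"NDC{i}") for i in range(4)]
nodes[1].fill(0x80040)

# NDC0 broadcasts a memory request; only NDC1 needs to disturb its CPUs.
responses = [n.snoop(0x80040) for n in nodes[1:]]
```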
Figure: Node links for SMP connections. Within the chassis, CPU modules (CPU slots #0–#7) are joined through the backplane via their node controllers, while PCI Express x4 links run from each blade through bridge chips to PCI-X slots #0–#15 in I/O Modules #0 and #1; redundant SVP modules, each with a GbE switch, provide the management path.
Chapter 4 Intel Xeon Server Blade

The eight-slot BladeSymphony 1000 can accommodate a total of up to eight Dual-Socket, Dual-Core or Quad-Core Intel Xeon Server Blades for up to 64 cores per system. Each Intel Xeon Server Blade supports up to four PCI slots, and provides the option of adding Fibre Channel or SCSI storage. Two on-board gigabit Ethernet ports are also provided, along with IP KVM for remote access, virtual media support, and front-side VGA and USB ports for direct access to the server blade.
Table 4: Intel Xeon Server Blade components

Item | Specifications
Processor | Quad-Core or Dual-Core Intel Xeon
Memory capacity | Maximum 32 GB
Memory slots | 8
Internal HDD | Up to four 2.5-inch drives
• Intel VT FlexMigration — Intel hardware-assisted virtualization support for live virtual machine migration across processor generations, enabling failover, load balancing, disaster recovery, and real-time server maintenance.
Advanced ECC

Conventional ECC corrects 1-bit errors and detects 2-bit errors. Advanced ECC, also known as Chipkill, corrects errors of up to four or eight bits when they are confined to a single DRAM device on a x4 or x8 DIMM, respectively. As a result, the system can continue operating normally even if one DRAM device fails, as illustrated in Figure 14.

Figure 14. Advanced ECC operation.
Table 5: Online spare memory supported configurations

Configuration | Bank 1 | Bank 2 | Bank 3 | Bank 4 | Banks 5–8
Configuration 7 | 2 GB | 2 GB | 2 GB | 2 GB | None
Configuration 8 | 1 GB | 1 GB | 1 GB | 1 GB | None
Configuration 9 | 512 MB | 512 MB | 512 MB | 512 MB | None
Configuration 10 | 256 MB | 256 MB | 256 MB | 256 MB | None

For example, in Configuration 1 the shaded Bank 4 is a spare bank.
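The table shows which banks can be populated; the switchover behavior itself is not spelled out here, but online spare memory generally works as in the following sketch (Python, with invented names and threshold): once correctable errors on an active bank cross a threshold, its contents are copied to the spare bank, which then takes its place.

```python
# Hedged sketch of a generic online-spare policy (not Hitachi firmware).
ERROR_THRESHOLD = 10  # invented threshold for illustration

class MemoryBanks:
    def __init__(self, active: list[str], spare: str):
        self.active = active
        self.spare = spare
        self.errors: dict[str, int] = {}

    def correctable_error(self, bank: str) -> None:
        """Count a correctable error; switch to the spare at the threshold."""
        self.errors[bank] = self.errors.get(bank, 0) + 1
        if self.errors[bank] >= ERROR_THRESHOLD and self.spare:
            print(f"copying {bank} -> {self.spare}; retiring {bank}")
            self.active[self.active.index(bank)] = self.spare
            self.spare = None          # the spare is consumed

banks = MemoryBanks(active=["BANK1", "BANK2", "BANK3"], spare="BANK4")
for _ in range(ERROR_THRESHOLD):
    banks.correctable_error("BANK2")   # triggers switchover to BANK4
```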
If an uncorrectable error occurs in a DIMM in the primary, the mirror is used for both writing and reading data. If an uncorrectable error occurs in a DIMM in the mirror, the primary is used for both writing and reading data. In either case the error is logged as a correctable error. Only if neither the primary nor the mirror can supply the data is the error logged as uncorrectable.

On-Module Storage

Intel Xeon Server Blades support up to four internal 2.5-inch drives.
Chapter 5 I/O Sub System

I/O Modules

Hitachi engineers go to great lengths to design systems that provide high I/O throughput. BladeSymphony 1000 PCI I/O Modules deliver up to 160 Gb/sec. of throughput across a total of up to 16 PCI slots (8 slots per I/O module). I/O modules accommodate industry-standard PCIe or PCI-X cards, supporting current and future technologies as well as helping to preserve investments in existing PCI cards.
Table 6 provides information on the connector types for PCI-X I/O Modules.

Table 6: PCI-X I/O Module connector types

Name | Protocol | Frequency | Bus Width | Remarks
PCI-X slots #0 to #7 | PCI-X 133 | 133 MHz | 64-bit | PCI hot plug
SCSI connectors #0, #1 | Ultra320 | 160 MHz | 16-bit LVD | Each I/O module has two SCSI connector ports

PCIe I/O Module

To provide more flexibility and to support newer PCI cards, a PCIe I/O module is available.
Figure 18. Outside view of the Embedded Fibre Channel Switch Module

The Fibre Channel switch within the module provides 14 ports compatible with the 4 Gb/sec. Fibre Channel standard. Eight ports connect internally to the FC-HBAs of up to eight FC-HBA + Gigabit Ethernet Combo Cards, and the remaining six are external ports used to connect to external storage. Figure 19 depicts the back view of the module with a detailed view of the Fibre Channel switch, and Figure 20 shows the module's block diagram.
Figure 20. Embedded Fibre Channel Switch Module block diagram: 48 V input feeding 12 V, 5 V (standby), and 3.3 V rails; the Fibre Channel switch with SFP ports (FC up to 4.25 Gb/s); PCIe x4 links to each of the eight server blades; a PCI-X 64-bit bus connecting an Intel 82546 dual-port GbE controller and an Intel 41210 bridge; flash ROM, UART, LEDs, and five RJ45 LAN ports. Up to 8 modules are mountable.
Figure: Backplane (type D) correspondence between slots and server blades: slots #0–#3 map to server blades #0–#3, slots #4–#7 to server blades #4–#7, and slots #8–#11 again to server blades #0–#3.
Table 7: Embedded Fibre Channel Switch Module components

Function | Details
Fabric delay time | Less than 2 microseconds (no contention, cut-through routing)
Maximum frame size | 2112-byte payload
Service class | Class 2, class 3, class F (frames between two switches)
Data traffic type | Unicast, multicast, broadcast
Media type | SFP (Small Form-Factor Pluggable)
Fabric services | SNS (Simple Name Server), RSCN (Registered State Change Notification), Alias Server (multicast), Brocade Advanced Zoning, ISL Trunking
The Hitachi FC Controller FC-HBA supports the functions in Table 8.

Table 8: Hitachi FC Controller FC-HBA functions

Function | Details
Number of ports | 1
PCI hot plug | Supported
Port speed | 1/2/4 Gb/sec.
Supported standards | FC-PH rev. 4.3, FC-AL rev. 5
Embedded Gigabit Ethernet Switch The Embedded Gigabit Ethernet Switch is contained in the Switch & Management Module and is a managed, standards-based Layer 2 switch that provides gigabit networking through cableless LAN connections. The switch provides 12 (single) or 24 (redundant) gigabit Ethernet ports for connecting BladeSymphony 1000 Server Blades to other networked resources within the corporate networking structure.
Table 9: Embedded Gigabit Ethernet Switch features

Item | Description
Ports | Backplane side: 1 Gb/sec. x 8; external: 10BASE-T / 100BASE-TX / 1000BASE-T (auto-negotiation); automatic MAC address learning (16,384 entries)
Switch | Layer 2 switch
Bridge function | Spanning Tree Protocol (IEEE 802.1D compliant)
Network functions | Link aggregation (IEEE 802.3ad); trunking (up to 4 ports, 24 groups); jumbo frames (packet size: 9216 bytes)
VLAN | Port VLAN; tag VLAN (IEEE 802.1Q)
A SCSI or RAID PCI card must be installed in a PCI slot of an I/O module to act as the controller for the storage module. As shown in Figure 25, the PCI card is paired with a storage module by connecting a SCSI cable from the PCI card to the SCSI connector on the same I/O module; from there, board wiring through the backplane carries the connection to server slots #4 to #7, where the storage module is installed.
Chapter 7 Chassis, Power, and Cooling

The BladeSymphony 1000 chassis houses all of the modules previously discussed, as well as a passive backplane, Power Supply Modules, Cooling Fan Modules, and the Switch & Management Modules. The chassis and backplane provide a number of redundancy features, including a one-to-one relationship between server blades and I/O modules as well as duplicate paths to I/O and switches.
Module | Type A | Type B
Switch & Management Module | 1 standard, 2 maximum | 1 standard, 2 maximum
I/O Module (PCI-X) | 2 maximum; 2 slots maximum per server blade, 16 slots maximum per chassis | 2 maximum; 4 slots maximum per server blade, 16 slots maximum per chassis
I/O Module (PCIe) | N/A | 2 maximum; 2 slots maximum per server blade, 16 slots maximum per chassis
I/O Module (Fibre Channel Switch) | N/A | 2 maximum
Power Module | 4 maximum (N+1 redundant configuration) | 4 maximum (N+1 redundant configuration)
Cooling Fan Module | 4 maximum | 4 maximum
If the system is started without power redundancy, it still boots after issuing a warning by illuminating the Warning LED. Hot swapping is not possible in the absence of redundant power.

Redundant Cooling Fan Modules

The Cooling Fan Modules cool the system with variable-speed fans and are installed redundantly, as illustrated in Figure 27. The fans cool the system by pulling air from the front of the chassis to the back.
Chapter 6 Reliability and Serviceability Features

Reliability, availability, and serviceability are key requirements for platforms running business-critical application services. In today's globally competitive environment, where users access applications round-the-clock, downtime is unacceptable and can result in lost customers, revenue, and reputation. The BladeSymphony 1000 is designed with a number of features intended to increase the uptime of the system.
Serviceability Features

Switch & Management Module

The Switch & Management Module is designed to control the system unit and monitor the environment. Figure 28 shows the block diagram of the module. This module and other system components are connected through I2C or other buses.
Table 12: Switch & Management Module components

Component | Manufacturer | Quantity | Description
Microprocessor | Hitachi | 1 |
RTC | Epson | 1 |
FPGA | Xilinx | 1 |
SDRAM | — | 128 MB | ECC protected
Flash ROM | — | 16 MB | Stores the OS image
NV SRAM | — | 1 MB | Battery-backed SRAM
• Panel control
• Log information management within BladeSymphony 1000 (RC logs, SEL, SVP logs, etc.)
• SVP hot-standby configuration control
• Server Conductor (server management software) interaction, including a function for emulating the PCI-version SVP function
• HA monitor (cluster software) interaction
The Baseboard Management Controller provides the following functions (a usage sketch follows the list):

• Initial diagnosis — initial diagnosis and setup of the BMC and its peripheral hardware
• Power control — controlling power-up and shutdown for modules
• Reset control — controlling hard reset and dump reset
• Failure handling — handling fatal machine check (MCK) occurrences
• Log management — management of RC logs, detailed logs, and the SEL
• Environmental monitoring — monitoring the temperature and voltage inside a module
• Panel output — log output through a virtual panel
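Because the BMC implements standard IPMI, the functions above can be exercised with a generic IPMI client. The paper does not name a tool; the following is a hedged illustration using the widely available ipmitool over its LAN interface, with a hypothetical BMC address and credentials.

```python
# Illustrative only: driving a standard IPMI BMC with ipmitool.
import subprocess

BMC = ["ipmitool", "-I", "lanplus", "-H", "10.0.0.42",  # hypothetical BMC address
       "-U", "admin", "-P", "password"]                  # hypothetical credentials

def ipmi(*args: str) -> str:
    """Run one ipmitool command against the blade's BMC and return its output."""
    return subprocess.run(BMC + list(args), capture_output=True,
                          text=True, check=True).stdout

print(ipmi("chassis", "power", "status"))  # power control / status
print(ipmi("sel", "list"))                 # read the System Event Log (SEL)
print(ipmi("sensor", "list"))              # temperature and voltage monitoring
```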
SVP Console

This function is shared between the Intel Itanium and Intel Xeon Server Blades. The SVP console runs under the SVP and provides a user interface for system management.
Chapter 8 Management Software

BladeSymphony 1000 delivers an exceptional range of choices and enterprise-class versatility with multi-OS support and comprehensive management software options.

Operating System Support

With support for Microsoft Windows and Red Hat Enterprise Linux, BladeSymphony 1000 gives companies the option of running two of the most popular operating systems at the same time, in the same chassis, for multiple applications.
BladeSymphony Management Suite provides centralized system management and control of all server, network, and storage resources, including the ability to set up and configure servers, monitor server resources, integrate with enterprise management software (SNMP), phone home, and manage server assets.

Deployment Manager

Deployment Manager allows the mass deployment of system images for fast, effective server deployment.
– Setting a failover schedule for cluster groups, based on specific dates or at specified times on a weekly schedule. The user can achieve more detailed cluster management by combining this feature with a power control schedule.
– Using alerts to predict future server shutdown and implementing automatic failover in the event of specific alerts
• Power Scheduling — Power control schedules can be set to turn the power on or off on specific dates or at specified times on a weekly schedule (a minimal sketch of such a schedule appears below).
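As a minimal illustration of the weekly power-scheduling idea (not Server Conductor's actual interface; the schedule format and names are invented), the sketch below matches schedule entries against the current weekday and time:

```python
# Illustrative weekly power schedule: each entry names a weekday, a time,
# and an action; a periodic loop would apply whichever entries match now.
from datetime import datetime

# Hypothetical schedule: power off Friday night, power on Monday morning.
SCHEDULE = [
    ("Friday", "22:00", "power_off"),
    ("Monday", "06:00", "power_on"),
]

def due_actions(now: datetime) -> list[str]:
    """Return the actions whose weekday and HH:MM match the given moment."""
    stamp = (now.strftime("%A"), now.strftime("%H:%M"))
    return [action for day, hhmm, action in SCHEDULE if (day, hhmm) == stamp]

for action in due_actions(datetime.now()):
    print(f"issuing {action} to managed server blades")
```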
Chapter 9 Virtage

Virtage is a key technical differentiator for BladeSymphony 1000. It brings mainframe-class virtualization to blade computing. Leveraging Hitachi's decades of development work on mainframe virtualization technology, Virtage delivers high-performance, extremely reliable, and transparent virtualization for Dual-Core Intel Itanium and Quad-Core Intel Xeon processor-based server blades.
The host intervention code is also tuned for the latest Itanium hardware features, minimizing the performance impact on guests. Virtage offers two modes in which processor resources can be distributed among the different logical partitions: dedicated mode and shared mode, as illustrated in Figure 31.

Figure 31. Dedicated mode (CPUs, memory, NICs, and PCI devices owned exclusively by each partition) versus shared mode (partitions share the underlying CPUs and devices).
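A rough way to picture the two modes (illustrative Python; the service-ratio numbers and function names are invented, and Virtage's actual scheduler is not described here): dedicated mode hands whole CPUs to a partition, while shared mode divides a CPU pool's time according to configured ratios.

```python
# Illustrative only: the two resource-distribution policies in miniature.
def dedicated_allocation(cpus: list[int], partitions: dict[str, int]):
    """Assign each logical partition its own exclusive CPUs."""
    result, free = {}, list(cpus)
    for name, count in partitions.items():
        result[name], free = free[:count], free[count:]
    return result

def shared_allocation(total_cpus: int, service_ratio: dict[str, int]):
    """Divide a CPU pool's time among partitions by service ratio."""
    whole = sum(service_ratio.values())
    return {name: total_cpus * share / whole
            for name, share in service_ratio.items()}

print(dedicated_allocation([0, 1, 2, 3], {"LPAR1": 2, "LPAR2": 2}))
print(shared_allocation(4, {"LPAR1": 50, "LPAR2": 30, "LPAR3": 20}))
```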
Fibre Channel Virtualization

Hitachi also offers Fibre Channel I/O virtualization for Virtage. This allows multiple logical partitions to access a storage device through a single Fibre Channel card, reducing the number of physical connections between server and storage and increasing the utilization of the storage connections. This capability is exclusive to the 4 Gb/sec. Hitachi FC card.
Chapter 10 Summary

In the past, inadequate scalability, compromises in I/O and other capabilities, excessive heat generation, and increased complexity in blade environments caused many data center managers to shy away from using blade servers for enterprise applications. BladeSymphony 1000 overcomes these issues, providing a blade solution that delivers server consolidation, centralized administration, reduced cabling, and simplified configuration.
HITACHI AMERICA, LTD. SERVER SYSTEMS GROUP 2000 Sierra Point Parkway Brisbane, CA 94005-1836 ph. 1.866.HITACHI email: ServerSales@hal.hitachi.com web: www.BladeSymphony.com ©2008 Hitachi America, Ltd. All rights reserved. Descriptions and specifications contained in this document are subject to change without notice and may differ from country to country. Intel, Itanium, and Xeon are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.