## Architectures for HPC Workloads – Trends in Microprocessor Development

Leif Nordlund, HPC Business Development EMEA, August 2010







## x86 Purchase Criteria Changed Over Time







## The road to HPC/Exascale using volume products






## **Current Microprocessors / Volume products**

## x86 CPUs and Graphics GPUs for PCs (and Servers)



### Multi-core x86 Processors

- Performance/Watt using up to 12 cores
- x86 instruction set (+extensions)
- Enhanced Efficiency and instructions

#### **GPUs for Graphics**

- 3D Accelerators For Visualization
- See More and Do More with Your Data



### **GPU Computing (GPGPU)**

- GPU Optimized For Computation
- Massive Data-parallel Processing
- High FLOP Performance Per Watt





AMD FireStream<sup>™</sup> 9350

- Maximum compute density GPGPU (4 GPUs in 1U)
- Breakthrough performance in a single-slot solution
- 2.0+ TFLOPS single-precision floating point
- 400 GFLOPS double-precision floating point
- 2 GB GDDR5 memory
- Single slot with passive heat sink
- <= 150 W peak TDP (single 6-pin AUX connector)
- Delivery in Q3 2010
- **MSRP \$799**
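The quoted FireStream 9350 figures can be turned into rough efficiency ratios. The following is a minimal sketch that uses only the peak numbers stated on the slide; the variable and function names are illustrative, and the results are upper bounds on paper rather than measured application performance.

```python
# Peak figures quoted on the slide for the AMD FireStream 9350.
SP_GFLOPS = 2000.0   # "2.0+ TFLOPS" single precision, taken at its 2.0 floor
DP_GFLOPS = 400.0    # double precision
TDP_WATTS = 150.0    # "<= 150W peak TDP", taken at its upper bound
MSRP_USD = 799.0
GPUS_PER_1U = 4      # "4GPU in 1U"


def per_watt(gflops, watts):
    """Peak GFLOPS per watt of board TDP (a paper figure, not a measurement)."""
    return gflops / watts


sp_per_watt = per_watt(SP_GFLOPS, TDP_WATTS)
dp_per_watt = per_watt(DP_GFLOPS, TDP_WATTS)
usd_per_sp_gflop = MSRP_USD / SP_GFLOPS
sp_tflops_per_1u = GPUS_PER_1U * SP_GFLOPS / 1000.0
dp_tflops_per_1u = GPUS_PER_1U * DP_GFLOPS / 1000.0

print(f"Single precision: {sp_per_watt:.1f} GFLOPS/W")
print(f"Double precision: {dp_per_watt:.2f} GFLOPS/W")
print(f"Price: ${usd_per_sp_gflop:.2f} per peak SP GFLOPS")
print(f"Density: {sp_tflops_per_1u:.1f} TFLOPS SP, {dp_tflops_per_1u:.1f} TFLOPS DP per 1U")
```

Because the TDP is taken at its stated maximum and the single-precision rate at its stated minimum, the per-watt ratios here are conservative readings of the slide's own numbers.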



# Enhancing the x86 core/watt and Instruction set

#### The AMD "Bulldozer" core has shared and dedicated components

The shared components:

- Help reduce power consumption
- Help reduce die space

The dedicated components:

- Help increase performance and scalability

Bulldozer dynamically switches between shared and dedicated components to maximize performance per watt

One of the building blocks of the next generation of FUSION processor designs








## **Designed for Scalability and Performance**

#### "Bulldozer" module

Two cores in a single unit that runs two simultaneous threads; the module is the building block of a "Bulldozer" die

#### Parallel Threads

The ability to execute two threads on two discrete, unshared cores without compromising performance or adding bottlenecks



#### Flex FP

A flexible floating-point unit that can be dedicated to one core or shared between the two cores on a per-cycle basis

#### Dedicated Scheduler

Independent integer schedulers and an FP scheduler help improve scalability through efficient execution




## **ISA For First-generation Bulldozer**

**Specification**: http://support.amd.com/us/Processor\_TechDocs/43479.pdf

More background: http://forums.amd.com/devblog/ ("Striking A Balance")

#### Achieves parity with Intel in instruction set extensions

- SSSE3
- SSE4.1
- SSE4.2
- AES (acceleration for AES encryption standard)
- PCLMULQDQ (carryless multiply -- crypto, CRC acceleration)
- AVX ("Advanced Vector Extensions") with 256-bit SIMD vector registers
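Software normally discovers these extensions at run time through the CPUID instruction. The sketch below decodes the ECX register returned by CPUID leaf 1, where each of the six listed features has a flag bit. It is a pure decoder over an integer value: obtaining the register contents is platform-specific and not shown, and `sample_ecx` is a hypothetical value built for illustration. Note that a set AVX bit alone is not sufficient to use AVX; the operating system must also enable the wider register state.

```python
# Bit positions of the listed features in ECX after CPUID with EAX=1.
LEAF1_ECX_BITS = {
    "PCLMULQDQ": 1,
    "SSSE3": 9,
    "SSE4.1": 19,
    "SSE4.2": 20,
    "AES": 25,
    "AVX": 28,
}


def decode_leaf1_ecx(ecx):
    """Map each feature name to True if its flag bit is set in ecx."""
    return {name: bool((ecx >> bit) & 1) for name, bit in LEAF1_ECX_BITS.items()}


# Hypothetical register value with all six listed feature bits set.
sample_ecx = sum(1 << bit for bit in LEAF1_ECX_BITS.values())
flags = decode_leaf1_ecx(sample_ecx)

for name, present in flags.items():
    print(f"{name:10s} {'yes' if present else 'no'}")
```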

#### **Plus the following AMD extensions**

- XOP (eXtended OPerations): multimedia and vectorization extensions beyond AVX
- FMA4: four-operand fused multiply-add
- LWP (Lightweight Profiling): dynamic performance optimization support
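A fused multiply-add computes `a * b + c` with a single rounding at the end, instead of rounding the product and then the sum. In the four-operand form the result goes to a separate destination, so none of the three inputs is overwritten. The sketch below only emulates the numeric behavior in software using exact rational arithmetic; it does not issue FMA4 instructions, the function names are illustrative, and it assumes finite inputs.

```python
from fractions import Fraction


def fused_multiply_add(a, b, c):
    """Compute a*b + c exactly, then round once to the nearest double."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))


def separate_multiply_add(a, b, c):
    """Ordinary floating point: the product is rounded, then the sum is rounded."""
    return a * b + c


# Inputs chosen so that the exact product 1 - 2**-54 is not representable
# as a double and rounds to 1.0 when the multiply is done on its own.
a = 1.0 + 2.0**-27
b = 1.0 - 2.0**-27
c = -1.0

print("separate:", separate_multiply_add(a, b, c))
print("fused:   ", fused_multiply_add(a, b, c))
```

The separate version loses the low-order term entirely, while the fused version keeps it, which is why fused operations matter for dot products and polynomial evaluation in HPC codes.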






## AMD "APU" FUSION Integrated Chip – coming 2011

Combination of CPU and programmable GPU architectures for high-performance heterogeneous compute capability

- High-speed bus architecture
- Shared, low-latency memory model
- Single-die design





## **The Road to Server Fusion**

AMD's multi-year enablement strategy brings GPGPU together with AMD Fusion architecture to help accelerate technical computing

#### **Integration Steps:**

1. Application-level tools enable GPGPU computing
2. Drivers deliver operating system support
3. PCI integration in CPU delivers a direct communication link
4. Unified communications; DMA memory and cache access with accelerators
5. Integration of CPU, GPU and communications with shared memory model





### Heterogeneous Computing Industry Standards Drive Adoption



Hardware abstraction



OpenCL and the OpenCL logo are trademarks of Apple Inc. used by permission by Khronos.

## **ONLY AMD!**





[Figure: OpenCL, GPU, AMD Fusion, and AMD Stream Technology logos]

#### **Disclaimer and Attribution**

#### DISCLAIMER

The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions and typographical errors.

The information contained herein is subject to change and may be rendered inaccurate for many reasons, including but not limited to product and roadmap changes, component and motherboard version changes, new model and/or product releases, product differences between differing manufacturers, software changes, BIOS flashes, firmware upgrades, or the like. AMD assumes no obligation to update or otherwise correct or revise this information. However, AMD reserves the right to revise this information and to make changes from time to time to the content hereof without obligation of AMD to notify any person of such revisions or changes.

AMD MAKES NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE CONTENTS HEREOF AND ASSUMES NO RESPONSIBILITY FOR ANY INACCURACIES, ERRORS OR OMISSIONS THAT MAY APPEAR IN THIS INFORMATION.

AMD SPECIFICALLY DISCLAIMS ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE. IN NO EVENT WILL AMD BE LIABLE TO ANY PERSON FOR ANY DIRECT, INDIRECT, SPECIAL OR OTHER CONSEQUENTIAL DAMAGES ARISING FROM THE USE OF ANY INFORMATION CONTAINED HEREIN, EVEN IF AMD IS EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

#### ATTRIBUTION

© 2008 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo, ATI, the ATI logo, Catalyst, CrossFireX, and Radeon and combinations thereof are trademarks of Advanced Micro Devices, Inc. Other names are for informational purposes only and may be trademarks of their respective owners.

#### CAUTIONARY STATEMENT

This release contains forward-looking statements concerning revenue and other expectations which are made pursuant to the safe harbor provisions of the Private Securities Litigation Reform Act of 1995. Forward-looking statements are commonly identified by words such as "would," "may," "expects," "believes," "plans," "intends," "projects," and other terms with similar meaning. Investors are cautioned that the forward-looking statements in this release are based on current beliefs, assumptions and expectations, speak only as of the date of this release and involve risks and uncertainties that could cause actual results to differ materially from current expectations.



