Girish Mahajan (Editor)

Steamroller (microarchitecture)

Produced: beginning of 2014
Min. feature size: 28 nm SHP
Common manufacturer(s): AMD
Instruction set: AMD64 (x86-64)
Socket(s): Socket FM2+, Socket FP3 (µBGA)
Predecessor: Piledriver - Family 15h (2nd-gen)

AMD Steamroller Family 15h is a microarchitecture developed by AMD for AMD APUs. It succeeded Piledriver at the beginning of 2014 as the third-generation Bulldozer-based microarchitecture. Like their predecessors, Steamroller APUs use two-core modules, while aiming for greater levels of parallelism.

Microarchitecture

Steamroller retains the two-core modules found in the Bulldozer and Piledriver designs, an approach called clustered multi-thread (CMT), in which one module is equivalent to a dual-core processor. The focus of Steamroller is greater parallelism. Improvements center on independent instruction decoders for each core within a module, 25% greater maximum-width dispatch per thread, better instruction schedulers, an improved perceptron branch predictor, larger and smarter caches, up to 30% fewer instruction cache misses, a branch misprediction rate reduced by 20%, a dynamically resizable L2 cache, a micro-operations queue, more internal register resources and an improved memory controller. The addition of new CPU instructions is also cited as an improvement over the Piledriver cores, with HEVC given as an example, although HEVC is a video coding standard rather than an instruction set extension.

AMD estimated that these improvements would increase instructions per cycle (IPC) by up to 30% compared to the first-generation Bulldozer core, while maintaining Piledriver's high clock rates at lower power consumption. The final result was a 9% single-threaded IPC improvement and an 18% multi-threaded IPC improvement over Piledriver.
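To put the IPC figures in context, the following is a minimal sketch that assumes the common first-order model in which throughput is IPC multiplied by clock frequency. The function names are hypothetical and the model ignores memory and power limits; only the 9% and 18% gains come from the text.

```python
# First-order model (an assumption, not AMD's methodology):
#   throughput = IPC * clock frequency

SINGLE_THREAD_IPC_GAIN = 0.09  # Steamroller vs. Piledriver, from the text
MULTI_THREAD_IPC_GAIN = 0.18   # Steamroller vs. Piledriver, from the text


def relative_throughput(ipc_gain: float, clock_ratio: float = 1.0) -> float:
    """Throughput of the newer core relative to the older one.

    ipc_gain is fractional (0.09 means +9%).
    clock_ratio is new_clock / old_clock (1.0 means equal clocks).
    """
    return (1.0 + ipc_gain) * clock_ratio


def break_even_clock_ratio(ipc_gain: float) -> float:
    """Clock ratio at which the newer core only matches the older one."""
    return 1.0 / (1.0 + ipc_gain)


for label, gain in (("single-threaded", SINGLE_THREAD_IPC_GAIN),
                    ("multi-threaded", MULTI_THREAD_IPC_GAIN)):
    print(f"{label}: {relative_throughput(gain):.2f}x at equal clocks, "
          f"break-even at {break_even_clock_ratio(gain):.3f}x the old clock")
```

Under this model, a 9% IPC gain is cancelled out if the newer part runs at about 92% of the older part's clock, which is why the claim about maintaining clock rates matters alongside the IPC numbers.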

Steamroller, the microarchitecture for CPUs, and Graphics Core Next, the microarchitecture for GPUs, are paired in the APU lines to support features specified in the Heterogeneous System Architecture.

History

In 2011, AMD announced a third-generation Bulldozer-based line of processors for 2013, with Next Generation Bulldozer as the working title, using the 28 nm manufacturing process.

On 21 September 2011, leaked AMD slides indicated that this third generation of Bulldozer core was codenamed Steamroller.

In January 2014, the first Kaveri APUs became available.

From May 2015 to March 2016, new APUs were launched as Kaveri refresh (codenamed Godavari).

APU lines

  1. Kaveri A-series APU
     • Desktop budget and mainstream markets (FM2+): The Trinity / Richland APU line was replaced in January 2014 by the Kaveri APU line, the third generation of the A10, A8, A6 and A4 series for the desktop market. The top-of-the-line model in 2014 was the quad-core A10-7850K APU, with a 3.7 GHz core frequency and 4 MB of L2 cache, incorporating a 720 MHz GPU with 512 stream processors and over 856 GFLOPS of total processing power. In 2015 and 2016, new models with two to four enhanced Steamroller B cores were released as Kaveri refresh / Godavari. The A10-7890K, the new top-of-the-line model, features an increased core frequency of 4.1 GHz and an 866 MHz GPU.
     • Two or four CPU cores based on the Steamroller microarchitecture
     • Socket FM2+ only (Socket FM2 is not supported); support for PCIe 3.0
     • DDR3 dual-channel (2x64-bit) memory controller
     • AMD Heterogeneous System Architecture (HSA) 2.0
     • SIP blocks: Unified Video Decoder, Video Coding Engine, TrueAudio
     • Three to eight Compute Units (CUs) based on the revised GCN 2nd gen microarchitecture; 1 Compute Unit (CU) consists of 64 Unified Shader Processors : 4 Texture Mapping Units (TMUs) : 1 Render Output Unit (ROP)
     • AMD Eyefinity with up to 4 monitors, 4K Ultra HD support, DisplayPort 1.2 support
     • Select models support AMD Hybrid Graphics when paired with a Radeon R7 240 or R7 250 discrete graphics card.
     • Integrated custom ARM Cortex-A5 co-processor with TrustZone Security Extensions
  2. Berlin APU - canceled
     • Announced by AMD in 2013, the Berlin APU was targeted at the enterprise and server markets, featuring four Steamroller cores, up to 512 stream processors and support for ECC memory.
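The Kaveri figures listed above (64 shader processors per Compute Unit, a 720 MHz GPU, four cores at 3.7 GHz) can be related to the quoted total processing power with some simple arithmetic. The sketch below is illustrative only: it assumes each shader processor retires one fused multiply-add per cycle, counted as 2 FLOPs, and that each CPU core contributes 8 single-precision FLOPs per cycle. Neither per-cycle figure is stated in the text, and the function names are hypothetical.

```python
SHADERS_PER_CU = 64  # from the text: 1 CU = 64 unified shader processors

# Assumptions (not stated in the text):
GPU_FLOPS_PER_SHADER_PER_CYCLE = 2  # one fused multiply-add counted as 2 FLOPs
CPU_FLOPS_PER_CORE_PER_CYCLE = 8    # single-precision peak per core


def shader_count(compute_units: int) -> int:
    """Total unified shader processors for a given number of CUs."""
    return compute_units * SHADERS_PER_CU


def gpu_peak_gflops(compute_units: int, clock_mhz: float) -> float:
    """Theoretical single-precision GPU peak in GFLOPS."""
    flops_per_cycle = shader_count(compute_units) * GPU_FLOPS_PER_SHADER_PER_CYCLE
    return flops_per_cycle * clock_mhz / 1000.0


def cpu_peak_gflops(cores: int, clock_ghz: float) -> float:
    """Theoretical single-precision CPU peak in GFLOPS under the assumption above."""
    return cores * clock_ghz * CPU_FLOPS_PER_CORE_PER_CYCLE


# A10-7850K as described in the text: 512 stream processors (8 CUs) at 720 MHz,
# four CPU cores at 3.7 GHz.
gpu = gpu_peak_gflops(compute_units=8, clock_mhz=720)
cpu = cpu_peak_gflops(cores=4, clock_ghz=3.7)
print(f"shaders: {shader_count(8)}")
print(f"GPU: {gpu:.2f} GFLOPS, CPU: {cpu:.2f} GFLOPS, total: {gpu + cpu:.2f} GFLOPS")
```

Under these assumptions the total comes to roughly 855.7 GFLOPS, close to the 856 GFLOPS figure quoted for the A10-7850K. The same `shader_count` helper gives 192 shader processors for the smallest three-CU configuration.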

FX lines (discontinued)

In November 2013, AMD confirmed that it would not update the FX series in 2014: the existing Socket AM3+ version would not be refreshed, and no Steamroller version with a new socket would be released.

Server lines (canceled)

AMD's server roadmaps for 2014 showed:

  • Berlin APU - quad-core x86 Steamroller architecture (as described above) for 1 Processor (1P) compute and media clusters
  • Berlin CPU - quad-core x86 Steamroller architecture for 1P web and enterprise services clusters
  • Seattle CPU - 4/8 core AArch64 Cortex-A57 architecture (Opteron A1100) for 1P web and enterprise services clusters
  • Warsaw CPU - up to 16 core x86 Piledriver (2nd gen Bulldozer) architecture (Opteron 6338P and 6370P) for 2P/4P servers

However, plans for Steamroller Opteron products were canceled, likely due to the poor energy efficiency achieved in this generation of the Bulldozer architecture. Energy efficiency was greatly increased in the following generation (Excavator), which exceeded Jaguar in performance per watt and approximately doubled performance per watt over Steamroller (for example, 20.74 pt/W vs 10.85 pt/W when comparing similar mobile APUs using rough arbitrary metrics).
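The "approximately doubled" comparison can be checked directly from the two performance-per-watt figures quoted for Excavator and Steamroller. This is only a sanity check on the quoted numbers; the variable and function names are hypothetical, and the underlying metric is the rough, arbitrary one mentioned in the text.

```python
EXCAVATOR_PT_PER_WATT = 20.74    # from the text
STEAMROLLER_PT_PER_WATT = 10.85  # from the text


def efficiency_ratio(new_pt_per_watt: float, old_pt_per_watt: float) -> float:
    """How many times more work per watt the newer part delivers."""
    return new_pt_per_watt / old_pt_per_watt


ratio = efficiency_ratio(EXCAVATOR_PT_PER_WATT, STEAMROLLER_PT_PER_WATT)
print(f"Excavator delivers about {ratio:.2f}x the performance per watt of Steamroller")
```

The ratio comes out at about 1.91, consistent with describing the gain as an approximate doubling.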

    References

    Steamroller (microarchitecture) Wikipedia