Efficient Operating System Scheduling
for Performance-Asymmetric Multi-Core Architectures
Tong Li, Dan Baumberger, David A. Koufaty, and Scott Hahn
Systems Technology Lab
Intel Corporation
{tong.n.li,dan.baumberger,david.a.koufaty,scott.hahn}@intel.com
ABSTRACT
Recent research advocates asymmetric multi-core architectures, where
cores in the same processor can have different performance. These
architectures support single-threaded performance and multithreaded
throughput at lower costs (e.g., die size and power). However, they
also pose unique challenges to operating systems, which traditionally assume homogeneous hardware. This paper presents AMPS,
an operating system scheduler that efficiently supports both SMP- and NUMA-style performance-asymmetric architectures. AMPS
contains three components: asymmetry-aware load balancing, faster-core-first scheduling, and NUMA-aware migration. We have implemented AMPS in Linux kernel 2.6.16 and used CPU clock modulation to emulate performance asymmetry on an SMP and a NUMA
system. For various workloads, we show that AMPS achieves a
median speedup of 1.16 with a maximum of 1.44 over stock Linux
on the SMP, and a median of 1.07 with a maximum of 2.61 on the
NUMA system. Our results also show that AMPS improves fairness and repeatability of application performance measurements.
1. INTRODUCTION
Multi-core architectures are becoming mainstream in both server
and desktop processors. Over the next decade, we expect to see
processors with tens and even hundreds of cores on a chip [6]. To
efficiently utilize chip real-estate, recent research [2, 3, 11, 13, 14,
15] advocates performance-asymmetric (or heterogeneous) architectures, where a processor contains multiple cores with the same
instruction set but different performance characteristics (e.g., clock
speed, issue width, in-order vs. out-of-order). These architectures
provide cost-effective platforms for both throughput-oriented applications and...