12 May, 2010 § Leave a comment
This article was written by Brendan Grebur and Jared Wein.
Adapting to the next step in processor technology requires a reevaluation of current computational techniques. Many-core architecture presents a fundamentally different perspective on both software and hardware systems. With no clearly effective model to follow, we are left to our own devices for advancement. A dwarf’s approach provides guidance, but does not guarantee an effective realization of the underlying issues. However, it provides a starting point to explore the interactions of data at extreme levels of parallelism. Of course, as with any method, pitfalls may arise.
This paper is a critique of the research presented by [Asanovic 06]. We present the motivation for their work, a comparison between their proposal and present-day solutions, as well as a look at advantages and disadvantages of their proposal.
The past 16 years, from 1986 to 2002, have showed tremendous growth in processor performance. Around 2002, the industry hit a “brick wall” when it ran in to limits in memory, power, and instruction level parallelism [Hennessy and Paterson 07]. While performance gains slowed, performance requirements continued to grow at a higher pace than before.
As an example, Intel has been forecasting an “Era of Tera”. Intel claims that the future holds a time where there is demand for teraflops of computing power, terabits per second of communication bandwidth, and terabytes of data storage [Dubey 05]. To prepare for this era, Intel has claimed a “sea change” in computing and put their resources on multi- and many-core processor architectures [Fried 04].
Previously these multi- and many-core computer architectures have not been successful stays in the mainstream [Asanovic 06]. Some reason that one of the causes for poor adoption is the increased complexity when working with parallel computing. [Asanovic 06] developed thirteen independent paradigms that represent the active areas of parallel computing. After compiling their list, they compared theirs with Intel’s list and noticed heavy overlaps between the two. This reassured [Asanovic 06] that they were on the right path.
Advantages of the Dwarfs
The first advantage to the dwarfs that have been proposed is that there exists only a relatively small number of dwarfs to be concerned with. Psychology research has shown that humans can cope with understanding about seven, plus or minus two, items very easily [Miller 56]. The number of dwarfs, at 13, is not too far off from this desirable number, and allows the test space to be conceivably small. Had the number of dwarfs proposed been in the hundreds, it may become too costly to determine if a system can handle all the dwarfs.
Another advantage focuses on benchmarking software. Some of the programs for SPEC benchmarks work on computational problems that are no longer current research areas and have less of an application when determining the system performance. The dwarf approach has a goal to focus on active areas within computer science, and to adapt the focus areas over time to stay relevant.
The years since assembly programming have shown that higher level languages could be created and provide software developers with increased productivity while maintaining software quality. The goal of the parallel programming dwarfs is to extend these common problems to programming frameworks that can help software developers program at an even higher level to maintain efficiency and quality. These dwarfs provide themselves in a way that they can be looked at in the same regard as object-oriented design patterns. One such study was able to use the dwarfs to easily find places within existing sequentially-programmed applications where refactoring could make use of parallel programming solutions [Pankratius 07]. The researchers in this study were able to achieve performance improvements of up to 8x using autotuning and parallel programming patterns described by the dwarfs.
Disadvantages of the Dwarfs
There are a number of questions that haven’t been answered by the Berkeley researchers. We present a couple of the issues below that we are either unsure of or believe are disadvantages of the dwarf paradigm.
First, it is not well-explained how the combination of dwarfs is supported on a given system. When a system claims to support two dwarfs independently, there does not appear to be a deterministic answer as to if the system will support the combination of the dwarfs. Further, the interaction between these dwarfs does not appear to be well documented.
Next, [Asanovic 06] states the power of autotuning when working with sequential programming, but admits that there have not been successful implementations of an autotuner for parallel programs. The problem space for parallel programs is much larger than that of sequential programs, and various factors will have to be elided to make development of an autotuner reasonable. Further, it is unclear if the parameters used to perform the autotuning will provide the optional parameters for use when running the actual software with vastly different datasets than that of the autotuned dataset.
Last, the area of parallel programs presents a virtual graveyard of companies and ideas that have failed to reach the market successfully. The multi- and many-core architectures have failed at producing the same programmer productivity and quality that the uniprocessor delivered. Only time will tell how the field of software development reacts and if it can adopt the shift in computing paradigms.
Computing performance improvements have recently hit a “brick wall” and there has been a computing shift pushed by major chip makers towards multicore systems. There are many complexities with multi- and many-core systems that can lead to unappreciated performance gains. Some of the problems may reside in the complexity of implementing common parallel computing patterns.
The researchers in the Par Lab at University of California at Berkeley have been attacking the multi- and many-core problems since 2006. From their research, they have created a list of 13 dwarfs that represent active areas in parallel computing. These dwarfs provide common patterns that can be reproduced and used for benchmarking and the creation of programming frameworks that offer an abstraction of complex problems.
Asanovic et al. The Landscape of Parallel Computing Research: A View from Berkeley. Dec 2006.
Fried, I. For Intel, the future has two cores. ZDNet. 2004. http://news.zdnet.com/2100-9584_22-138241.html
Dubey, P. Recognition, Mining and Synthesis Moves Computers to the Era of Tera. Technology@Intel Magazine. Feb 2005.
Hennessy, J. and Patterson, D. Computer Architecture: A Quantitative Approach, 4th edition, Morgan Kauffman, San Francisco, 2007.
Miller, G. A. “The magical number seven, plus or minus two: Some limits on our capacity for processing information”. Psychological Review. Vol 63, Iss 2, pp 81–97. 1956.
Pankratius et al. Software Engineering for Multicore Systems – An Experience Report. Institute for Program Structures and Data Organization. Dec 2007.