Title: | Classifying and alleviating the communication overheads in matrix computations on large-scale NUMA multiprocessors |
Authors: | Wang, YM; Wang, HH; Chang, RC; Department of Computer Science |
Issue Date: | 1-Dec-1998 |
Abstract: | Large-scale, shared-memory multiprocessors have non-uniform memory access (NUMA) costs. High communication cost dominates the execution time of matrix computations. Memory contention and remote memory access are the two major communication overheads on large-scale NUMA multiprocessors. However, previous experiments and discussions focus on either reducing the number of remote memory accesses or alleviating memory contention overhead, but not both. In this paper, we propose a simple but effective processor allocation policy, called rectangular processor allocation, to alleviate both overheads at the same time. The policy divides the matrix elements into a certain number of rectangular blocks and assigns each processor to compute the results of one rectangular block. This approach can eliminate many unnecessary accesses to the memory modules. After running many matrix computations under a realistic memory system simulator, we confirmed that at least one-fourth of the communication overhead may be reduced. Therefore, we conclude that the rectangular processor allocation policy performs better than other popular policies, and that combining the rectangular processor allocation policy with the software interleaving data allocation policy is a better choice for alleviating communication overhead. (C) 1998 Elsevier Science Inc. All rights reserved. |
URI: | http://dx.doi.org/10.1016/S0164-1212(98)10040-7 http://hdl.handle.net/11536/147835 |
ISSN: | 0164-1212 |
DOI: | 10.1016/S0164-1212(98)10040-7 |
Journal: | JOURNAL OF SYSTEMS AND SOFTWARE |
Volume: | 44 |
Start page: | 17 |
End page: | 29 |
Appears in Collections: | Articles |
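
The abstract describes dividing the matrix into rectangular blocks and assigning one block to each processor. The following is a minimal sketch of that idea, not the paper's implementation: the abstract does not give the exact partitioning rule, so the even grid split, the function names `rectangular_blocks` and `operand_elements_touched`, and the use of the product C = A x B as the workload are all assumptions made for illustration. It counts how many distinct operand elements one processor must read, as a rough proxy for memory traffic.

```python
def rectangular_blocks(n_rows, n_cols, p_rows, p_cols):
    """Split an n_rows x n_cols result matrix into a p_rows x p_cols grid.

    Returns a dict mapping a (hypothetical) processor id to a half-open
    block (row_start, row_end, col_start, col_end).
    """
    blocks = {}
    for pr in range(p_rows):
        r0 = pr * n_rows // p_rows
        r1 = (pr + 1) * n_rows // p_rows
        for pc in range(p_cols):
            c0 = pc * n_cols // p_cols
            c1 = (pc + 1) * n_cols // p_cols
            blocks[pr * p_cols + pc] = (r0, r1, c0, c1)
    return blocks


def operand_elements_touched(block, inner_dim):
    """Distinct elements of A and B needed to compute one block of C = A x B.

    A block of C with h rows and w columns needs h full rows of A and
    w full columns of B, each of length inner_dim.
    """
    r0, r1, c0, c1 = block
    return (r1 - r0) * inner_dim + inner_dim * (c1 - c0)


if __name__ == "__main__":
    n = 16           # assumed matrix order
    # 16 processors arranged as 16 x 1 (one row of C each) versus 4 x 4.
    row_wise = rectangular_blocks(n, n, 16, 1)
    rectangular = rectangular_blocks(n, n, 4, 4)
    print("row-wise, per processor:   ", operand_elements_touched(row_wise[0], n))
    print("rectangular, per processor:", operand_elements_touched(rectangular[0], n))
```

Under these assumptions, a processor that owns one full row of C reads 272 operand elements, while a processor that owns a 4 x 4 block reads 128. This only illustrates why block shape affects the number of memory accesses; it does not model memory contention or the simulator used in the paper.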