An Efficient Run-Time Parallelizing Method for Multiprocessor Systems

Title:	在多处理机系统上的执行时期平行化方法 (An Efficient Run-Time Parallelizing Method for Multiprocessor Systems)
Authors:	谢明辉 (Hsieh, Ming-Huei); 曾宪雄 (Shian-Shyong Tseng); Institute of Computer Science and Engineering
Keywords:	run-time; wavefront; parallelism
Issue Date:	1995
Abstract:	Loops are the most common source of parallelism exploited by parallelizing compilers. To parallelize a sequential loop, a parallelizing compiler computes a parallel schedule of its iterations from static data dependence analysis at compile time. Some loops, however, contain parallelism that cannot be detected this way. In sparse matrix computations, for example, array subscripts often involve indirection arrays or functions and thus defy static analysis, so the compiler conservatively executes the loop iterations sequentially and sacrifices the potential parallelism. Motivated by these concerns, this thesis proposes a two-phase run-time parallelization technique, based on the inspector-executor scheme, for extracting the parallelism available in such loops. The inspector determines the wavefronts, that is, the sets of loop iterations that can execute in parallel, by building a DEF-USE table. The inspector itself is fully parallel and needs no synchronization, which reduces the overhead of determining the wavefronts. The improved executor runs the iterations of each wavefront concurrently and uses an auto-adapted function to choose a suitable number of threads instead of a conventional fixed thread count. Experimental results show that the new parallel inspector algorithm handles complex data dependence patterns that previous approaches could not, and that it significantly reduces its own running time. In addition, the new executor strategy improves the efficiency of run-time parallelization and increases the utilization of the multiprocessor system.
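The inspector-executor idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's algorithm: the inspector here is sequential (the thesis describes a fully parallel one), the loop shape `a[write_idx[i]] = f(a[read_idx[i]])` is an assumed example, and `build_wavefronts`, `run_wavefronts`, and `pick_threads` are hypothetical names. In particular, `pick_threads` is only a stand-in for the auto-adapted function, whose actual definition the record does not give.

```python
from concurrent.futures import ThreadPoolExecutor


def build_wavefronts(write_idx, read_idx):
    """Inspector: group iterations of a[write_idx[i]] = f(a[read_idx[i]])."""
    last_write = {}  # element -> wavefront level of its latest write (DEF)
    last_read = {}   # element -> highest wavefront level that read it (USE)
    wavefronts = []
    for i, (w, r) in enumerate(zip(write_idx, read_idx)):
        level = max(
            last_write.get(r, -1) + 1,  # flow dependence (read after write)
            last_write.get(w, -1) + 1,  # output dependence (write after write)
            last_read.get(w, -1) + 1,   # anti dependence (write after read)
        )
        last_read[r] = max(last_read.get(r, -1), level)
        last_write[w] = level
        while len(wavefronts) <= level:
            wavefronts.append([])
        wavefronts[level].append(i)
    return wavefronts


def pick_threads(wavefront_size, max_threads=4):
    """Assumed heuristic: never use more threads than there are iterations."""
    return max(1, min(wavefront_size, max_threads))


def run_wavefronts(a, write_idx, read_idx, wavefronts, f):
    """Executor: run wavefronts in order, iterations inside one concurrently."""
    def body(i):
        a[write_idx[i]] = f(a[read_idx[i]])

    for wavefront in wavefronts:
        workers = pick_threads(len(wavefront))
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # Leaving the with-block waits for all iterations, so it acts
            # as the barrier between consecutive wavefronts.
            list(pool.map(body, wavefront))
    return a
```

For `write_idx = [0, 1, 2, 1, 3]` and `read_idx = [4, 0, 0, 2, 1]`, this inspector yields the wavefronts `[[0], [1, 2], [3], [4]]`: iterations 1 and 2 both read the element written by iteration 0 and can run together, while iterations 3 and 4 must wait for earlier writes.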
URI:	http://140.113.39.130/cdrfb3/record/nctu/#NT840394006 http://hdl.handle.net/11536/60445
Appears in Collections:	Thesis

APA	Hsieh, M.-H., & Tseng, S.-S. (1995). 在多处理机系统上的执行时期平行化方法 [An efficient run-time parallelizing method for multiprocessor systems]. http://hdl.handle.net/11536/60445
Bibtex	@misc{Hsieh1995, title={在多处理机系统上的执行时期平行化方法 (An Efficient Run-Time Parallelizing Method for Multiprocessor Systems)}, author={Hsieh, Ming-Huei and Tseng, Shian-Shyong}, year={1995}, note={Thesis}, url={https://ir.lib.nycu.edu.tw/handle/11536/60445?locale=zh_CN}, }