Hybrid OpenMP/AVX acceleration of a Split HLL Finite Volume Method for the Shallow Water and Euler Equations

doi:10.1016/j.compfluid.2014.11.011

Full metadata record

DC Field	Value	Language
dc.contributor.author	Liu, Ji-Yueh	en_US
dc.contributor.author	Smith, Matthew R.	en_US
dc.contributor.author	Kuo, Fang-An	en_US
dc.contributor.author	Wu, Jong-Shin	en_US
dc.date.accessioned	2015-07-21T08:29:28Z	-
dc.date.available	2015-07-21T08:29:28Z	-
dc.date.issued	2015-03-30	en_US
dc.identifier.issn	0045-7930	en_US
dc.identifier.uri	http://dx.doi.org/10.1016/j.compfluid.2014.11.011	en_US
dc.identifier.uri	http://hdl.handle.net/11536/124494	-
dc.description.abstract	Presented is the application of the Split Harten, Lax and van Leer (SHLL) technique to parallel computation using a hybrid OpenMP/AVX parallelization paradigm for the Shallow Water Equations and Euler Equations. The key behind the ease of parallelization of the SHLL method for both governing equations is the mathematical/vector splitting in each coordinate direction - this splitting results in a high degree of locality, producing a scheme which is embarrassingly parallel and well suited for the vectorization capacities offered by vector-computing architectures. Here we demonstrate this capacity using the SIMD capabilities of modern CPUs, namely the Advanced Vector eXtensions (AVX). The main feature of AVX is the capacity to perform SIMD operations on 8 floating point variables in parallel - an increase from the 4 floating point variables possible with the previous Streaming SIMD Extensions (SSE). Furthermore, since modern CPUs employ a large number of cores, we further extend the performance by using AVX on each available CPU core through shared memory (OpenMP) parallelization. We present a direction-split higher order extension to the SHLL method and apply it to AVX through the use of intrinsic functions in the flux computation and state computation modules. High performance is obtained by ensuring that all flux computations are performed using only AVX intrinsic functions - no computations are performed in serial. Through this approach, a single workstation with 2x Xeon CPUs (16 physical cores) allows a performance increase of over 117 times that of a single core alone in the flux evaluation kernel. (C) 2014 Elsevier Ltd. All rights reserved.	en_US
dc.language.iso	en_US	en_US
dc.subject	Parallel computing	en_US
dc.subject	AVX	en_US
dc.subject	Advanced Vector eXtensions	en_US
dc.subject	OpenMP	en_US
dc.subject	Vectorization	en_US
dc.subject	SHLL	en_US
dc.subject	HLL	en_US
dc.subject	Integral balance	en_US
dc.subject	Finite Volume Method	en_US
dc.subject	Shallow Water Equations	en_US
dc.subject	Euler Equations	en_US
dc.title	Hybrid OpenMP/AVX acceleration of a Split HLL Finite Volume Method for the Shallow Water and Euler Equations	en_US
dc.type	Article	en_US
dc.identifier.doi	10.1016/j.compfluid.2014.11.011	en_US
dc.identifier.journal	COMPUTERS & FLUIDS	en_US
dc.citation.volume	110	en_US
dc.citation.spage	181	en_US
dc.citation.epage	188	en_US
dc.contributor.department	機械工程學系	zh_TW
dc.contributor.department	Department of Mechanical Engineering	en_US
dc.identifier.wosnumber	WOS:000350535100019	en_US
dc.citation.woscount	0	en_US
Appears in Collections:	Journal Articles
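
The locality described in the abstract can be illustrated with a minimal sketch for the 1D Shallow Water Equations. It is not the authors' implementation. It assumes the standard HLL flux rewritten as a forward part F+ that depends only on the left cell and a backward part F- that depends only on the right cell, with wave speeds estimated from each cell's own state as u + c and u - c, clamped at zero. The function names, the structure-of-arrays layout, and the caller-supplied celerity array `c` (sqrt(g*h) for shallow water) are all hypothetical. The paper uses hand-written AVX intrinsics; here an OpenMP `parallel for simd` directive stands in for them, leaving vectorization to the compiler.

```c
/* Gravitational acceleration (assumed value, SI units). */
static const double GRAV = 9.81;

/*
 * Split fluxes for n cells of the 1D shallow water equations,
 * U = (h, hu), F = (hu, hu*u + 0.5*g*h^2).
 * Each iteration reads only cell i, so iterations are independent.
 * Assumes h[i] > 0 and c[i] > 0 (c is the local wave celerity,
 * supplied by the caller).
 */
void shll_split_fluxes(long n, const double *h, const double *hu,
                       const double *c,
                       double *fp_h, double *fp_hu,
                       double *fm_h, double *fm_hu)
{
#ifdef _OPENMP
#pragma omp parallel for simd
#endif
    for (long i = 0; i < n; ++i) {
        const double u = hu[i] / h[i];
        /* Local wave speed estimates, clamped so that sl <= 0 <= sr. */
        const double sr = (u + c[i] > 0.0) ? u + c[i] : 0.0;
        const double sl = (u - c[i] < 0.0) ? u - c[i] : 0.0;
        const double inv = 1.0 / (sr - sl);

        /* Physical flux of this cell. */
        const double f_h = hu[i];
        const double f_hu = hu[i] * u + 0.5 * GRAV * h[i] * h[i];

        /* Forward part: (sr*F - sl*sr*U) / (sr - sl). */
        fp_h[i] = (sr * f_h - sl * sr * h[i]) * inv;
        fp_hu[i] = (sr * f_hu - sl * sr * hu[i]) * inv;
        /* Backward part: (-sl*F + sl*sr*U) / (sr - sl). */
        fm_h[i] = (-sl * f_h + sl * sr * h[i]) * inv;
        fm_hu[i] = (-sl * f_hu + sl * sr * hu[i]) * inv;
    }
}

/*
 * Flux through interface i (between cells i and i+1) is the forward
 * part of the left cell plus the backward part of the right cell.
 * Writes n - 1 interface values.
 */
void shll_interface_fluxes(long n,
                           const double *fp_h, const double *fp_hu,
                           const double *fm_h, const double *fm_hu,
                           double *flux_h, double *flux_hu)
{
#ifdef _OPENMP
#pragma omp parallel for simd
#endif
    for (long i = 0; i < n - 1; ++i) {
        flux_h[i] = fp_h[i] + fm_h[i + 1];
        flux_hu[i] = fp_hu[i] + fm_hu[i + 1];
    }
}
```

In this sketch the forward and backward parts of any single cell sum to its physical flux, so a uniform state reproduces F(U) at every interface, and a cell with u > c sends its whole flux forward. Because neither loop carries a dependence between iterations, the same loop body could be mapped onto 8-wide AVX registers in single precision, which is the property the abstract attributes to the directional splitting.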