Title: Asynchronous Array Multiplier with Leading-Zero-Bits Detection
Authors: Chen Chin-Yung (陳進勇)
Chen Chang-Jiu (陳昌居)
College of Computer Science, Degree Program of Computer Science
Keywords: asynchronous array multiplier; asynchronous multiplier with accelerator; Leading-Zero-Bit detector; Zero-Adder; Big2 asynchronous multiplier; signed asynchronous multiplier
Issue Date: 2003
Abstract: Multiplication is one of the most important operations in many applications, such as microprocessors, digital signal processing (DSP), and the discrete cosine transform (DCT). The multiplier usually lies on the critical delay path, so it often determines the performance of the whole chip. Many multiplier designs have been proposed for synchronous systems, but only a few architectures have been proposed for asynchronous systems. Asynchronous circuits offer several advantages, including lower power consumption, reduced noise, average-case performance, and adaptability to process and environmental variations. For these reasons, we propose new asynchronous multiplier architectures that reuse existing synchronous architectures while keeping the advantages of asynchronous circuits. To find the effective lengths of the multiplier and multiplicand, we first gathered statistics on multiply operations from the SPEC95 benchmark programs. For a 32-bit multiplier, the average effective lengths of the multiplier and multiplicand are only 8.4 and 5.6 bits, respectively. Based on this result, we propose a new parallel asynchronous multiplier with an accelerator that comprises a Leading-Zero-Bit detector and a Zero-Adder. The Leading-Zero-Bit detector examines every bit of the multiplier and outputs flags to the Zero-Adder. The Zero-Adder consists of a multiplexer and an adder; when its flag is zero, meaning the multiplier is zero, it outputs zero directly. Applying this architecture to an array multiplier shortens the average computation time of the asynchronous multiplier. Experimental results show that the new design reduces average computation time by 42% for unsigned multipliers compared with other parallel structures. For signed multiplication, we also propose the Big2 multiplier with accelerator, which takes the original Left-to-Right architecture, moves the partial product of the most significant bit to the end of the summation, and starts the summation from the partial product of the second most significant bit. Experimental results show that this design reduces average computation time by 36% for signed multipliers compared with other parallel structures.
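The interaction between the Leading-Zero-Bit detector and the Zero-Adder can be illustrated with a small behavioural model. This is only a sketch under stated assumptions, not the thesis's circuit: the function names (`leading_zero_flags`, `zero_adder`, `multiply`) are hypothetical, the left-to-right shift-and-add accumulation is a modelling choice, and the count of exercised adder rows is used as a rough stand-in for asynchronous completion time.

```python
def leading_zero_flags(multiplier, width=32):
    """Return MSB-first flags: 0 for each leading zero bit, 1 from the first 1 bit on."""
    flags = []
    seen_one = 0
    for i in reversed(range(width)):
        seen_one |= (multiplier >> i) & 1
        flags.append(seen_one)
    return flags


def zero_adder(acc, partial, flag):
    """Model of a mux plus an adder (assumed behaviour).

    flag == 0: the mux selects constant zero and the adder is not exercised.
    flag == 1: the adder computes acc + partial.
    Returns (result, adder_used).
    """
    if flag == 0:
        return 0, False
    return acc + partial, True


def multiply(multiplicand, multiplier, width=32):
    """Unsigned left-to-right multiply; returns (product, adder rows exercised)."""
    acc = 0
    adds = 0
    flags = leading_zero_flags(multiplier, width)
    for flag, i in zip(flags, reversed(range(width))):
        bit = (multiplier >> i) & 1
        partial = multiplicand if bit else 0
        # Shift the running sum left by one, then add this row's partial product.
        acc, used = zero_adder(acc << 1, partial, flag)
        adds += used
    return acc, adds


if __name__ == "__main__":
    print(multiply(13, 11))  # (143, 4)
```

In this model the number of exercised rows equals the effective length of the multiplier, so a multiplier with an effective length near the reported 8.4-bit average would exercise roughly a quarter of the 32 rows.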
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT008967577
http://hdl.handle.net/11536/80125
Appears in Collections: Thesis