Design and Implementation of a High-Speed Data Processor for Next-Generation Sequencing

Full metadata record

DC Field	Value	Language
dc.contributor.author	吳易忠	zh_TW
dc.contributor.author	楊家驤	zh_TW
dc.contributor.author	Wu, Yi-Chung	en_US
dc.contributor.author	Yang, Chia-Hsiang	en_US
dc.date.accessioned	2018-01-24T07:41:09Z	-
dc.date.available	2018-01-24T07:41:09Z	-
dc.date.issued	2016	en_US
dc.identifier.uri	http://etd.lib.nctu.edu.tw/cdrfb3/record/nctu/#GT070250228	en_US
dc.identifier.uri	http://hdl.handle.net/11536/141584	-
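dc.description.abstract	In recent years, genetic research has become increasingly important across the biological sciences, and the ability to quickly locate an arbitrary DNA fragment is of great benefit to both biological and medical research. DNA sequencing is the technique of determining the exact positions of the nitrogenous bases (the four purines and pyrimidines) within DNA. The human genome contains roughly three billion nitrogenous bases, so exhaustively searching such a large sequence for a particular fragment is very time consuming. Extensive research has therefore driven today's high-throughput sequencing technology (also known as next-generation sequencing, NGS) toward maturity. In general, to find a particular fragment in such a large volume of data, the data must first be sorted and the suffixes of all fragments recorded, followed by further search and analysis. The excessively long sorting time is the main challenge and bottleneck of the whole NGS analysis. This thesis proposes a high-efficiency hardware design method for NGS analysis, in which the BWT and FM-indexing algorithms are used to realize the fragment search function. Hardware complexity is substantially reduced through bucket sort, grouping, and hardware fast-sorting circuits. To achieve the best balance between hardware cost and execution time, many key parameters must be analyzed in detail and with rigor. The NGS data processor was taped out in a TSMC 40 nm process; it achieves a 147x speedup over the software algorithm and a more than 12x efficiency improvement over a high-performance GPU.	zh_TW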
dc.description.abstract	There is a strong scientific and medical need for a mechanism to search for arbitrary sequences in DNA. DNA sequencing is the process of determining the precise order of nucleotides (i.e., adenine, guanine, cytosine, and thymine) within DNA. Human DNA contains about three billion nucleotides, so searching the entire genome for a specific sequence is very time consuming. This has driven the development of high-throughput sequencing, also known as next-generation sequencing (NGS). Generally, to search for an arbitrary sequence among a large volume of DNA nucleotides, all sequences should be pre-sorted and the resulting suffix arrays recorded. The high computational complexity of this pre-processing stage still poses design challenges. This work proposes a design methodology with efficient hardware mapping for NGS suffix array sorting and sequence searching. The Burrows-Wheeler Transform (BWT) with the Ferragina-Manzini (FM) index is used to improve storage efficiency. Hardware complexity is significantly reduced through distributed sorting, grouping, and fast serial sorting circuits. Key design parameters are analyzed to balance hardware cost and execution time for optimal performance. Designed in 40 nm CMOS technology, the proposed NGS sorting processor achieves a 147x speedup over the software algorithm and a 12x speedup over a high-end GPU implementation.	en_US
dc.language.iso	en_US	en_US
dc.subject	Next-generation sequencing	zh_TW
dc.subject	Quicksort algorithm	zh_TW
dc.subject	String search	zh_TW
dc.subject	Gene sequencing	zh_TW
dc.subject	Bucket sort	zh_TW
dc.subject	Next-generation sequencing	en_US
dc.subject	fast serial sorting algorithm	en_US
dc.subject	string matching	en_US
dc.subject	DNA sequencing	en_US
dc.subject	Bucket sorting	en_US
dc.title	應用於次世代定序之高速資料處理器設計與實現	zh_TW
dc.title	Design and Implementation of a High-Speed Data Processor for Next-Generation Sequencing	en_US
dc.type	Thesis	en_US
dc.contributor.department	Department of Electronics Engineering and Institute of Electronics	zh_TW
Appears in Collections:	Thesis
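
The abstract describes sorting suffixes in a pre-processing stage and then searching with the BWT and FM index. As a rough software illustration of that general flow only, the sketch below builds a suffix array with a naive sort, derives the BWT from it, and runs FM-index backward search. It is not the thesis's hardware architecture (no bucket sort, grouping, or serial sorting circuits), and the function names, the `$` sentinel, and the sample string are all assumptions made for this example.

```python
def suffix_array(text):
    # Naive suffix sort: start positions ordered by the suffix they begin.
    # Assumption: fine for a toy string, far too slow for a real genome.
    return sorted(range(len(text)), key=lambda i: text[i:])


def bwt_from_sa(text, sa):
    # BWT character for each sorted suffix is the character preceding it
    # (text[-1] wraps around to the sentinel for the suffix at position 0).
    return "".join(text[i - 1] for i in sa)


def build_fm_index(bwt):
    alphabet = sorted(set(bwt))
    # C[c]: number of characters in the text strictly smaller than c.
    C, total = {}, 0
    for c in alphabet:
        C[c] = total
        total += bwt.count(c)
    # occ[c][i]: number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (1 if ch == c else 0)
    return C, occ


def find(pattern, sa, C, occ):
    # Backward search: narrow the suffix-array interval [lo, hi)
    # one pattern character at a time, from last to first.
    lo, hi = 0, len(sa)
    for ch in reversed(pattern):
        if ch not in C:
            return []
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])


# Hypothetical toy reference; "$" is a sentinel smaller than A, C, G, T.
text = "ACGTACGA$"
sa = suffix_array(text)
bwt = bwt_from_sa(text, sa)
C, occ = build_fm_index(bwt)
print(find("ACG", sa, C, occ))
```

In this sketch the suffix sort dominates the cost, which matches the abstract's point that the sorting stage is the bottleneck; each search afterwards takes a number of steps proportional to the pattern length.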