利用文件影像分析及分類壓縮技術建構數位圖書館電子書之研究

Full metadata record

DC Field	Value	Language
dc.contributor.author	陳世豪	en_US
dc.contributor.author	Shih-hao Chen	en_US
dc.contributor.author	蔡文祥	en_US
dc.contributor.author	Wen-Hsiang Tsai	en_US
dc.date.accessioned	2014-12-12T02:22:58Z	-
dc.date.available	2014-12-12T02:22:58Z	-
dc.date.issued	1999	en_US
dc.identifier.uri	http://140.113.39.130/cdrfb3/record/nctu/#NT880394040	en_US
dc.identifier.uri	http://hdl.handle.net/11536/65536	-
dc.description.abstract	本論文發展了一套離線式書籍內容自動數位化及展示系統。首先我們利用掃描器的自動饋紙機將多頁的書籍內容掃描到電腦中。接著採用由下往上的方式作彩色文件分析，將書頁影像切割並區分成文字區塊及圖形區塊。為了節省書籍內容儲存的空間，我們提出了一個分類壓縮的方法來達到這個目的。在此方法中，我們首先基於圖形區塊的特性，使用決策樹來作影像內容的分類，並且使用全彩階層式矩量保持原理來作圖形減色。在影像內容分類後，我們提出了一個分類壓縮的方法，根據不同影像區塊的特性採用適當的壓縮演算法來壓縮，並再度利用減色的技術來消除彩色圖片經過印刷、掃描後的失真，並且保留最重要的少數顏色，以達到較好的壓縮效果。另外，根據書籍中不同書頁間所具有共同部份的特性，我們也提出一偵測重覆區塊的方法，用來找出不同書頁間的相同內容，進一步提高整體的壓縮率。最後，為了讓使用者可以清楚地閱讀電子書的內容，我們對書頁內容作了一些美化的動作，並提供了一個操作方便的使用者界面，讓使用者可以輕鬆地閱讀電子書。藉由實驗的結果，可以證明我們提出的系統與方法是可行的。	zh_TW
dc.description.abstract	In this study, an offline system for automatic book content digitization and display is developed. First, we use a scanner's automatic document feeder (ADF) to scan multiple book pages into a computer. Then, we segment the page images and classify the resulting regions into text blocks and picture blocks using a bottom-up segmentation approach. To reduce the storage space required for book contents, we employ a compression-by-classification approach. In this approach, we first propose an image content classification method that uses a decision tree to classify picture blocks into various types based on their properties, together with a full-color hierarchical moment-preserving color reduction method. After the page contents are classified, we propose a content-based compression scheme that compresses each image block with a compression algorithm suited to its image attributes. A color reduction algorithm is applied again to eliminate distortion caused by printing and scanning and to preserve the most important colors in the image blocks, which yields better compression. In addition, we propose a repetitive-pattern recognition approach that detects parts shared by different page images, further improving the overall compression ratio. Finally, we enhance the page contents and provide a user-friendly interface for displaying and reading book contents. Experimental results show the feasibility and practicability of the proposed approaches.	en_US
dc.language.iso	en_US	en_US
dc.subject	文件影像分析	zh_TW
dc.subject	分類壓縮	zh_TW
dc.subject	數位圖書館	zh_TW
dc.subject	電子書	zh_TW
dc.subject	Document Image Analysis	en_US
dc.subject	Compression-By-Classification	en_US
dc.subject	Digital Libraries	en_US
dc.subject	Electronic Book	en_US
dc.title	利用文件影像分析及分類壓縮技術建構數位圖書館電子書之研究	zh_TW
dc.title	Book Content Digitization and Display for Digital Libraries by Document Image Analysis and Compression-By-Classification Techniques	en_US
dc.type	Thesis	en_US
dc.contributor.department	資訊科學與工程研究所	zh_TW
Appears in Collections:	Thesis