Full metadata record
DC Field | Value | Language
dc.contributor.author | 楊博宇 | en_US
dc.contributor.author | Po-Yu Yang | en_US
dc.contributor.author | 吳毅成 | en_US
dc.contributor.author | I-Chen Wu | en_US
dc.date.accessioned | 2014-12-12T03:01:09Z | -
dc.date.available | 2014-12-12T03:01:09Z | -
dc.date.issued | 2004 | en_US
dc.identifier.uri | http://140.113.39.130/cdrfb3/record/nctu/#GT008967587 | en_US
dc.identifier.uri | http://hdl.handle.net/11536/80158 | -
dc.description.abstract | Product information on the web is rich and varied, but finding the information one actually needs is not easy. The usual approach is to visit each relevant website and collect the data, which is time-consuming and inconvenient. A more convenient approach is to store the data in a database and then use a query interface to find what is wanted. However, the results are often a pile of similar products, and a person still has to judge which of them are the same product. This thesis therefore aims to compare product data and identify identical products automatically. Product names and serial numbers are key criteria for deciding whether two products are the same, and comparing the names and serial numbers of two products resembles comparing two strings. Drawing on the concept of the Longest Common Subsequence (LCS), we propose the Longest and Most Common Segments (LMCS) algorithm to compute a score for every pair of products; the higher the score between two products, the more similar they are. We then adjust the weights used in the LMCS computation and apply a matching strategy to find the most similar product. After tuning, recall, precision, and similarity all reach 85% or higher. | zh_TW
dc.description.abstract | Product information on the web is rich and varied, which makes it difficult to find the information we really need. The usual approach is to visit every relevant website and collect product information, which is time-consuming and inconvenient. A more convenient approach is to store the product information in a database and then use a query interface to find the wanted information. The results, however, usually contain many similar products, and someone still has to judge which of them are the same product. The goal of this thesis is therefore to identify identical products automatically through product name matching. Product names and serial numbers are important criteria for judging whether products are the same, and comparing the names and serial numbers of products is like sequence comparison. We propose the longest and most common segments (LMCS) algorithm, which is based on the longest common subsequence (LCS). LMCS is used to compute a matching score for all products; a higher LMCS score indicates a higher degree of similarity. We adjust the weights used to calculate LMCS and apply a matching strategy to find the most similar products. After adjustment, recall, precision, and similarity can all exceed 85%. | en_US
dc.language.iso | zh_TW | en_US
dc.subject | 最長共同子序列 (longest common subsequence) | zh_TW
dc.subject | 序列比對 (sequence comparison) | zh_TW
dc.subject | 字串比對 (string matching) | zh_TW
dc.subject | 網頁萃取 (web page extraction) | zh_TW
dc.subject | 比對 (matching) | zh_TW
dc.subject | LCS | en_US
dc.subject | Sequence Comparison | en_US
dc.subject | String Matching | en_US
dc.subject | Data Extraction | en_US
dc.subject | Matching | en_US
dc.title | 產品比對的研究 | zh_TW
dc.title | The Study of Product Matching | en_US
dc.type | Thesis | en_US
dc.contributor.department | 資訊學院資訊學程 (College of Computer Science, Computer Science Program) | zh_TW
Appears in Collections: Thesis


Files in This Item:

  1. 758701.pdf
  2. 758702.pdf

If a file is a zip archive, download and extract it, then open index.html in the extracted folder with a browser to view the full text.
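
The abstract in this record describes LMCS as building on the longest common subsequence (LCS) of two product names or serial numbers. The record does not define LMCS itself, so the following is only a minimal sketch of the classic LCS length computed by dynamic programming. The function names `lcs_length` and `lcs_similarity`, and the normalization by the longer string, are assumptions made for illustration and are not the thesis's scoring or weighting.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (classic DP)."""
    # prev[j] holds the LCS length of the processed prefix of a and b[:j].
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Matching characters extend the LCS of both shorter prefixes.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise keep the better result from dropping one character.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_similarity(a: str, b: str) -> float:
    """Hypothetical score in [0, 1]: LCS length over the longer string's length."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / max(len(a), len(b))


if __name__ == "__main__":
    # Two spellings of the same made-up product name differ by one character.
    print(lcs_similarity("EOS 350D", "EOS-350D"))
```

A plain LCS score like this treats scattered single-character matches the same as long contiguous runs. The abstract's emphasis on "longest and most common segments" suggests that LMCS scores contiguous pieces differently, but the details are only in the full text.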