Title: | Uniqueness: skews bit occurrence frequencies in randomly generated fingerprint libraries |
Authors: | Chen, Nelson G. (Institute of Molecular Medicine and Bioengineering) |
Keywords: | Algorithm; Molecular modeling; Fingerprints; Simulation |
Issue Date: | Aug-2016 |
Abstract: | Requiring that a randomly generated chemical fingerprint library contain only unique fingerprints, so that no two fingerprints are identical, causes a systematic skew in bit occurrence frequencies, that is, the proportion of fingerprints in which a specified bit is set. The observed frequencies (O) at which each bit is set within the resulting libraries differ systematically from the frequencies at which bits are set at fingerprint generation (E). Observed frequencies skew toward 0.5, and the effect becomes more pronounced as library size approaches the size of the compound space, which is the total number of unique possible fingerprints given the number of bit positions each fingerprint contains. The effect is quantified for varying library sizes, expressed as a fraction of the overall compound space, and for changes in the specified frequency E. The cause and implications of this systematic skew are subsequently discussed. When generating random libraries of chemical fingerprints, the imposition of a uniqueness requirement should either be avoided or taken into account. |
URI: | http://dx.doi.org/10.1007/s11030-016-9674-y http://hdl.handle.net/11536/133873 |
ISSN: | 1381-1991 |
DOI: | 10.1007/s11030-016-9674-y |
Journal: | MOLECULAR DIVERSITY |
Volume: | 20 |
Issue: | 3 |
Start page: | 741 |
End page: | 745 |
Appears in Collections: | Journal Articles |
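
The skew described in the abstract can be reproduced with a small simulation. The following is a minimal sketch, not the author's code: it assumes each bit is set independently with probability E and that uniqueness is enforced by discarding duplicates and redrawing (rejection sampling). The function names `generate_library` and `observed_frequency`, and the parameter values, are hypothetical choices made for illustration.

```python
import random


def generate_library(n_bits, size, e, unique, rng):
    """Draw `size` fingerprints of `n_bits` bits, each bit set with probability e.

    With unique=True, duplicates are rejected and redrawn (an assumed scheme,
    not necessarily the one used in the paper).
    """
    if unique and size > 2 ** n_bits:
        raise ValueError("library size exceeds the compound space 2**n_bits")
    library, seen = [], set()
    while len(library) < size:
        fp = tuple(rng.random() < e for _ in range(n_bits))
        if unique:
            if fp in seen:
                continue  # duplicate: discard and redraw
            seen.add(fp)
        library.append(fp)
    return library


def observed_frequency(library):
    """Fraction of set bits over all positions and fingerprints (O)."""
    n_bits = len(library[0])
    return sum(sum(fp) for fp in library) / (len(library) * n_bits)


rng = random.Random(0)
n_bits, e = 8, 0.3
space = 2 ** n_bits  # compound space: 256 possible fingerprints
for fraction in (0.1, 0.5, 1.0):
    size = int(fraction * space)
    lib = generate_library(n_bits, size, e, unique=True, rng=rng)
    print(f"fraction={fraction:.1f}  E={e}  O={observed_frequency(lib):.3f}")
```

When the library size equals the compound space, the library necessarily contains every possible fingerprint exactly once, so each bit is set in exactly half of them and O is 0.5 whatever the value of E. Smaller fractions should give values of O between E and 0.5, subject to sampling noise.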