Analysis of the Fluorescence Wavelengths of Organic Molecules Using Machine Learning Methods
Date
2018
Abstract
Because technology is advancing rapidly, the requirements that applications place on fluorescent materials are becoming increasingly stringent. We therefore analyze the fluorescence wavelengths of organic molecules, with the aim of finding better organic fluorescent molecules.
Organic fluorescent materials have a wide range of applications. Beyond their use in everyday consumer products (such as fluorescent textiles, fluorescent inks, and fluorescent plastic products), organic fluorescent dyes are used very widely in fluorescence assays, biological probes, and labeling.
We therefore collected a large number of organic molecules for analysis. Features describing the structural and electronic properties of each molecule were related to its fluorescence emission wavelength using machine learning algorithms. The goal is to identify the key factors that determine the wavelength, and so give more precise direction to the selection and design of fluorescent molecular materials.
This thesis applies machine learning methods, which are still under active development, to the selection of fluorescent molecules. We used molecular structure files and wavelength data from the Reaxys chemical database; these two sources of information together make a machine learning approach possible.
First, the molecular structure files (file type: .smile) were processed with the PaDEL descriptor calculation software to compute a large number of descriptors, covering both electronic structure and molecular structure. From this large descriptor set, a random forest algorithm was used to pick out the descriptors most strongly associated with the wavelength data, and ten were selected. These high-importance descriptors were then fitted against wavelength by support vector machine regression to build a regression model. The model was used to make predictions, and the predicted wavelengths were compared linearly against the original Reaxys wavelengths used for training in order to assess the model's accuracy.
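The workflow described in the abstract (descriptor table, random forest ranking, ten selected descriptors, support vector regression, linear comparison of predicted and original wavelengths) can be sketched roughly as follows. This is a minimal illustration, not the thesis's code: it assumes scikit-learn and NumPy, replaces the PaDEL descriptors and Reaxys wavelengths with synthetic data, and uses hyperparameters and function names (`select_top_descriptors`, `fit_svr`, `linear_comparison`) chosen here purely for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR


def select_top_descriptors(X, y, k=10, seed=0):
    """Rank descriptor columns by random forest importance; return the top k indices."""
    forest = RandomForestRegressor(n_estimators=200, random_state=seed)
    forest.fit(X, y)
    order = np.argsort(forest.feature_importances_)[::-1]  # most important first
    return order[:k]


def fit_svr(X, y):
    """Fit an RBF support vector regressor on standardized descriptors.

    C and epsilon are illustrative guesses, not values from the thesis.
    """
    model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=100.0, epsilon=1.0))
    model.fit(X, y)
    return model


def linear_comparison(y_true, y_pred):
    """Least-squares line of predicted vs. original values, plus squared correlation."""
    slope, intercept = np.polyfit(y_true, y_pred, 1)
    r = np.corrcoef(y_true, y_pred)[0, 1]
    return slope, intercept, r ** 2


# Synthetic stand-in data (assumption): 300 "molecules" with 50 "descriptors".
# Only columns 3 and 17 actually influence the made-up wavelength (in nm).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 50))
y = 500.0 + 40.0 * X[:, 3] - 25.0 * X[:, 17] + rng.normal(scale=2.0, size=300)

top10 = select_top_descriptors(X, y, k=10)
model = fit_svr(X[:, top10], y)
predicted = model.predict(X[:, top10])
slope, intercept, r2 = linear_comparison(y, predicted)

print("selected descriptor columns:", sorted(top10.tolist()))
print(f"predicted vs. original: slope={slope:.3f}, intercept={intercept:.1f}, R^2={r2:.3f}")
```

As in the abstract, the comparison here is made against the same data used for training, so it measures how well the model fits rather than how well it generalizes; holding out a test set would be one way to check the latter.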
Keywords
machine learning, fluorescence, chemical materials, wavelength