Estimation Model of Polyphenols Content in Yellow Tea Based on Spectral-Spatial Features
YANG Bao-hua1, GAO Yuan1, WANG Meng-xuan1, QI Lin1, NING Jing-ming2
1. School of Information and Computer, Anhui Agricultural University, Hefei 230036, China
2. State Key Laboratory of Tea Plant Biology and Utilization, Anhui Agricultural University, Hefei 230036, China
Abstract: Tea polyphenols (TP) are among the important constituents of yellow tea and have health-promoting and medicinal effects. Accurate estimation of tea polyphenol content is therefore of great significance for tea quality identification and quantitative analysis. Previous studies have applied electronic-nose, electronic-tongue, hyperspectral, and near-infrared techniques to estimate tea polyphenols and have achieved good results. However, because these approaches lack spatial features, they struggle to meet the accuracy requirements for a comprehensive judgment of the internal and external quality of tea. With the development of hyperspectral imaging systems (HSI), tea texture estimation based on the gray-level co-occurrence matrix (GLCM) has made progress, but obstacles remain in practical application. On the one hand, at low resolution the texture features of the image show no significant differences, and too few features cannot fully describe the image, which lowers model accuracy. On the other hand, at high resolution the larger number of features makes the model more complicated. Therefore, while retaining the original information of the hyperspectral image, it is necessary to explore its latent features further, especially texture details. Consequently, a method combining spectral and spatial features is proposed to improve the accuracy of tea polyphenol estimation. First, wavelet coefficients are computed by applying the continuous wavelet transform to the spectra obtained from the hyperspectral image. Second, wavelet-coefficient features are selected from these coefficients, namely 959 and 1 561 nm at the 4th scale; 1 321, 1 520 and 1 540 nm at the 5th scale; and 1 202 and 1 228 nm at the 6th scale.
Furthermore, two characteristic wavelengths, 1 102 and 1 309 nm, are selected based on the sum of the energies of the wavelet coefficients. Then, gray-level co-occurrence matrix texture and wavelet texture are extracted from the hyperspectral images at these characteristic wavelengths. Finally, the wavelet-coefficient features, co-occurrence matrix texture, wavelet texture, and their combinations are used to construct estimation models for the polyphenol content of yellow tea. Regression methods based on the different feature sets, including partial least squares regression (PLSR), support vector regression (SVR), and random forest (RF), were compared and validated on five types of yellow tea. The experimental results show that the SVR model based on the fusion of wavelet-coefficient features, co-occurrence matrix texture, and wavelet texture achieves the best performance, with R2 of 0.933 0 for the calibration set and 0.823 8 for the validation set. Therefore, the proposed model can effectively improve the prediction accuracy of tea polyphenol content and also provides a technical basis for predicting other components of tea.
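The GLCM texture step can be sketched as below. This is a simplified illustration under stated assumptions: the grey-level count (8), the single pixel offset (1, 0), the reduced feature set (contrast, energy, homogeneity rather than the full Haralick set), and the two toy images are all choices made for the example, not the paper's settings, and the downstream PLSR/SVR/RF modelling is not reproduced here.

```python
import numpy as np

def glcm_features(img, levels=8, dx=1, dy=0):
    """Gray-level co-occurrence matrix and a few Haralick-style features.

    `img` (values in [0, 1)) is quantised to `levels` grey levels; the GLCM
    counts pixel pairs separated by the offset (dy, dx), then is symmetrised
    and normalised into a joint probability table.
    """
    q = np.floor(img * levels).clip(0, levels - 1).astype(int)
    glcm = np.zeros((levels, levels))
    h, w = q.shape
    for y in range(h - dy):
        for x in range(w - dx):
            glcm[q[y, x], q[y + dy, x + dx]] += 1
    glcm += glcm.T                 # symmetrise (count both pair directions)
    p = glcm / glcm.sum()          # normalise to probabilities
    i, j = np.indices(p.shape)
    return {
        "contrast": np.sum(p * (i - j) ** 2),
        "energy": np.sum(p**2),
        "homogeneity": np.sum(p / (1 + np.abs(i - j))),
    }

rng = np.random.default_rng(0)
smooth = np.tile(np.linspace(0, 0.999, 32), (32, 1))   # smooth gradient image
noisy = rng.random((32, 32))                           # random-noise image
f_smooth = glcm_features(smooth)
f_noisy = glcm_features(noisy)
print(f_smooth["contrast"], f_noisy["contrast"])
```

On the smooth gradient, co-occurring grey levels are nearly identical, so contrast stays low and homogeneity high; the noise image shows the opposite pattern, which is the property that makes such features useful descriptors of leaf surface texture.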
杨宝华,高 远,王梦玄,齐 麟,宁井铭. 基于光谱-空间特征的黄茶多酚含量估算模型[J]. 光谱学与光谱分析, 2021, 41(03): 936-942.
YANG Bao-hua, GAO Yuan, WANG Meng-xuan, QI Lin, NING Jing-ming. Estimation Model of Polyphenols Content in Yellow Tea Based on Spectral-Spatial Features. SPECTROSCOPY AND SPECTRAL ANALYSIS, 2021, 41(03): 936-942.