Improving the accuracy of green bean palm civet coffee purity classification using wrapper feature selection
DOI: https://doi.org/10.25186/.v20i.2277
Abstract
Palm civet coffee, a highly prized specialty coffee, is a frequent target of counterfeiting because of its limited production. Reliable detection methods are lacking, which motivates the development of non-destructive sensing techniques. This study investigates the use of machine vision and feature selection to classify the purity of green bean palm civet coffee. A set of 101 image features (11 color and 90 textural) was extracted from coffee bean images. A wrapper-based feature selection approach was used to identify the most informative features, combining three classifiers, K-Nearest Neighbors (KNN), Random Forest (RF), and Support Vector Machine (SVM), with four optimization algorithms: Bat Algorithm, Cuckoo Search, Genetic Algorithm, and Grey Wolf Optimizer. The highest accuracy (0.981) was achieved by a Random Forest classifier with 500 trees paired with the Grey Wolf Optimizer, using a subset of five features: Blue_Mean, Hue_Entropy, Gray_Inverse, S_HSL_Correlation, and Green_Cluster. These findings suggest that machine vision combined with feature selection is a promising basis for a robust, non-destructive method for detecting palm civet coffee counterfeiting.
Key words: Classifier; feature selection; learning algorithm; machine vision; palm civet coffee purity.
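The wrapper approach described in the abstract can be illustrated with a minimal sketch: a search algorithm proposes binary feature masks, and each mask is scored by the accuracy of a classifier trained on only the selected features. The example below is not the authors' implementation. It assumes synthetic two-class data with 8 hypothetical features (two of them informative), a small pure-Python k-NN classifier, a basic genetic algorithm, a single held-out split, and an arbitrary size penalty of 0.01; all names and parameter values are illustrative.

```python
import random

N_FEATURES = 8          # hypothetical; the study extracted 101 image features
INFORMATIVE = (0, 1)    # assumption: only these synthetic features carry signal

def make_data(n, rng):
    """Synthetic two-class data: informative features shift with the label."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(3.0 * label if j in INFORMATIVE else 0.0, 1.0)
               for j in range(N_FEATURES)]
        X.append(row)
        y.append(label)
    return X, y

def knn_accuracy(mask, train, test, k=3):
    """Held-out accuracy of k-NN using only the features where mask is 1."""
    idx = [j for j, bit in enumerate(mask) if bit]
    if not idx:
        return 0.0  # an empty feature subset cannot classify anything
    (X_tr, y_tr), (X_te, y_te) = train, test
    correct = 0
    for x, label in zip(X_te, y_te):
        nearest = sorted(
            (sum((x[j] - t[j]) ** 2 for j in idx), t_label)
            for t, t_label in zip(X_tr, y_tr)
        )[:k]
        votes = sum(t_label for _, t_label in nearest)
        pred = 1 if votes * 2 > k else 0  # majority vote over binary labels
        correct += pred == label
    return correct / len(y_te)

rng = random.Random(42)
train, test = make_data(60, rng), make_data(30, rng)
_cache = {}

def fitness(mask):
    """Wrapper objective: accuracy minus a small penalty on subset size."""
    key = tuple(mask)
    if key not in _cache:
        size_penalty = 0.01 * sum(key) / N_FEATURES
        _cache[key] = knn_accuracy(key, train, test) - size_penalty
    return _cache[key]

def ga_select(pop_size=12, generations=10, p_mut=0.1):
    """Basic genetic algorithm over binary feature masks."""
    full = tuple([1] * N_FEATURES)  # seed with the all-features baseline
    pop = [full] + [tuple(rng.randint(0, 1) for _ in range(N_FEATURES))
                    for _ in range(pop_size - 1)]
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]
        children = [ranked[0]]  # elitism: the best mask always survives
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, N_FEATURES)  # one-point crossover
            child = a[:cut] + b[cut:]
            children.append(
                tuple(1 - g if rng.random() < p_mut else g for g in child)
            )
        pop = children
    return max(pop, key=fitness)

best_mask = ga_select()
print("selected feature indices:", [j for j, b in enumerate(best_mask) if b])
print("fitness:", round(fitness(best_mask), 3))
```

Because the initial population includes the all-features mask and the best individual is carried forward each generation, the returned subset never scores below the full feature set on this objective. Swapping in a different classifier or optimizer only changes `knn_accuracy` or `ga_select`; the mask-scoring loop stays the same.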
Copyright (c) 2025 Coffee Science - ISSN 1984-3909
The copyright of the articles published in this journal belongs to the authors, with the journal holding the first publication rights. As the articles are published in this journal under open access, they may be freely used, with proper attribution, for educational and non-commercial purposes.