AUTOMATED EDIBILITY CLASSIFICATION OF MUSHROOMS USING MORPHOLOGY-BASED RANDOM FOREST ALGORITHM

Authors

  • Muhammad Reza Pahlevi, Universitas Amikom Purwokerto
  • Imam Tahyudin, Universitas Amikom Purwokerto
  • Ades Tikaningsih, Universitas Amikom Purwokerto

DOI:

https://doi.org/10.30656/jsii.v12i2.10868

Abstract

Mushrooms exhibit high morphological diversity; however, some species are poisonous and harmful if consumed. The physical similarities between edible and poisonous mushrooms often complicate manual identification. This research aims to develop an automated mushroom classification system based on the Random Forest algorithm to distinguish between edible and poisonous mushrooms. It uses all 21 attributes (18 categorical, 3 numerical) from a comprehensive Kaggle dataset comprising 61,069 entries. The research methodology follows a modified CRISP-DM workflow, from data collection through to implementation. Preprocessing included handling duplicate records and imputing missing values (using the median and mode). Class labels were then transformed (edible=0, poisonous=1), and One-Hot Encoding was applied to the categorical features to give them a numerical representation. Numerical features such as cap-diameter and stem-height were normalized using Standard Scaling to balance their contributions. The data were then split 80:20 for training and testing. The Random Forest model was configured with n_estimators=200, max_depth=15, and class_weight="balanced" for efficiency and robustness against class imbalance. On the test set, the model achieved an overall accuracy of 99.34%, with precision, recall, and F1-score values of 0.99 for both classes. Feature importance analysis identified stem-width, stem-height, and cap-diameter as the most influential attributes. The learning curve indicated model stability without significant overfitting. Predictions on new mushroom samples were consistent with these results, making the model suitable as an automated decision support system for detecting poisonous mushrooms.
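
The preprocessing and modeling steps described in the abstract can be sketched roughly as follows, assuming scikit-learn and pandas. This is an illustration, not the authors' code. The toy data, the two categorical columns, the label codes "e" and "p", the random seeds, and the stratified split are all assumptions made for the example. Only the hyperparameters, the 80:20 split, the label mapping, and the choice of imputation, encoding, and scaling methods come from the abstract. The sketch fits the imputers and the scaler on the training portion only, which may differ from the order used in the study.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

NUMERIC = ["cap-diameter", "stem-height", "stem-width"]
CATEGORICAL = ["cap-shape", "gill-color"]  # illustrative subset, not the full attribute list


def make_toy_data(n=400, seed=0):
    """Synthetic stand-in for the Kaggle data; the values have no biological meaning."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "cap-diameter": rng.normal(7.0, 3.0, n),
        "stem-height": rng.normal(6.0, 2.0, n),
        "stem-width": rng.normal(12.0, 5.0, n),
        "cap-shape": rng.choice(["x", "f", "b"], n),
        "gill-color": rng.choice(["w", "n", "y"], n),
    })
    # Arbitrary rule so the toy labels are learnable ("e" = edible, "p" = poisonous).
    df["class"] = np.where(df["stem-width"] > 12.0, "p", "e")
    # Blank out a few cells so the imputers have something to fill.
    df.loc[df.index[::25], "stem-height"] = np.nan
    df.loc[df.index[::30], "gill-color"] = np.nan
    return df


df = make_toy_data().drop_duplicates()      # remove duplicate records
y = df["class"].map({"e": 0, "p": 1})       # edible=0, poisonous=1
X = df[NUMERIC + CATEGORICAL]

# 80:20 split; stratification is an assumption, the abstract does not mention it.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

preprocess = ColumnTransformer([
    # Numerical features: median imputation, then standard scaling.
    ("num", Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ]), NUMERIC),
    # Categorical features: mode imputation, then one-hot encoding.
    ("cat", Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ]), CATEGORICAL),
])

model = Pipeline([
    ("prep", preprocess),
    ("rf", RandomForestClassifier(
        n_estimators=200, max_depth=15, class_weight="balanced", random_state=42
    )),
])
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)

# The numeric columns come first in the transformed matrix, so the first
# len(NUMERIC) importances belong to them in the same order.
importances = model.named_steps["rf"].feature_importances_
numeric_importance = dict(zip(NUMERIC, importances[:len(NUMERIC)]))

print(f"test accuracy on toy data: {accuracy:.3f}")
print(numeric_importance)
```

The accuracy printed here reflects only the synthetic rule used to label the toy data and says nothing about the 99.34% reported for the real dataset. Wrapping the steps in a single pipeline keeps the test split out of the fitted imputation and scaling statistics.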

Keywords: Mushroom, Classification, Random Forest, Morphology, Machine Learning

Published

2025-09-07

Section

Articles