Implementation of the MTCNN Architecture on Different Pixel-Dimension Classes and Multi-Face Plotting on Detection Results

Merry Anggraeni; Hillman Akhyar Damanik

Published: Jun 30, 2025

DOI: https://doi.org/10.56873/jpkm.v9i1.5843

Keywords:

Convolutional Neural Networks, face detection, MTCNN, deep learning, face crop

Merry Anggraeni

Universitas Budi Luhur

Hillman Akhyar Damanik

Universitas Budi Luhur

https://orcid.org/0000-0001-7551-2172

Abstract

Face detection is a computer vision task that locates human faces in an image, typically as the first step toward identifying or verifying a person from a photo. Face detection and alignment in unconstrained environments are challenging because of variation in pose, illumination, and occlusion. The human face is difficult to model because many variables can change, such as facial expression, orientation, lighting conditions, and partial occlusion by sunglasses, scarves, masks, and other objects. Recent studies have shown that deep learning approaches can achieve impressive performance on these two tasks. This paper performs face detection on images containing multiple faces and maps each detected face to its own crop (face crop) for use by the various systems that depend on face detection, using the Multi-Task Cascaded Convolutional Neural Network (MTCNN) approach. The study implements the MTCNN architecture with TensorFlow and OpenCV and has two intended benefits. First, it is expected to provide a pre-trained model that performs well and to strengthen the evidence from previous studies of this model. Second, the model can be used as an input stage for other systems. The input is a photo containing one or more faces, supplied at various pixel dimensions to represent different resolutions. The output is either the coordinates of each detected face location or landmarks of key facial points, such as the eyes, the nose, and the mouth. Averaged over the pixel dimensions in the dataset, the results show an accuracy of 93%, a precision of 95%, a recall of 96%, an F1-score of 95%, and an ROC-AUC of 90.89%.
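
The face-crop step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes detections arrive as dictionaries with a `box` entry of `[x, y, width, height]` and a `confidence` score, which mirrors the output of the open-source `mtcnn` Python package; the abstract does not state which MTCNN implementation was used. The function name `clamp_boxes`, the `min_confidence` threshold of 0.9, and the sample detections are all hypothetical.

```python
def clamp_boxes(detections, image_width, image_height, min_confidence=0.9):
    """Turn raw detections into crop rectangles (x1, y1, x2, y2).

    Each detection is assumed to be a dict with a "box" of
    [x, y, width, height] and a "confidence" score. Boxes can extend
    past the image border, so each one is clamped to the image area.
    """
    rectangles = []
    for detection in detections:
        # Skip low-confidence detections (the threshold is an assumption).
        if detection.get("confidence", 1.0) < min_confidence:
            continue
        x, y, w, h = detection["box"]
        x1, y1 = max(0, x), max(0, y)
        x2, y2 = min(image_width, x + w), min(image_height, y + h)
        # Keep only rectangles that still have a positive area.
        if x2 > x1 and y2 > y1:
            rectangles.append((x1, y1, x2, y2))
    return rectangles


# Hypothetical detections for a 160x120 image containing several faces.
sample_detections = [
    {"box": [10, 20, 40, 50], "confidence": 0.99},
    {"box": [-5, 90, 30, 40], "confidence": 0.95},   # partly outside the image
    {"box": [100, 10, 20, 20], "confidence": 0.40},  # below the threshold
]

crop_rectangles = clamp_boxes(sample_detections, image_width=160, image_height=120)
print(crop_rectangles)
```

With an image loaded as a NumPy array (for example through OpenCV), each rectangle would give one face crop via `image[y1:y2, x1:x2]`, so every detected face can be saved or passed to a downstream system separately.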

Issue

Vol. 10 No. 1 (2025): Jurnal Pekommas Vol.10 (1) June 2025

Section

Informatics

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
