Implementation of MTCNN Architecture on Different Pixel Dimension Classes and Plotting Multi-Face on Detection Results

Merry Anggraeni
Hillman Akhyar Damanik

Abstract

Face detection is a computer vision task that locates human faces in an image and is the first step toward identifying or verifying a person from a photo of their face. Face detection and alignment in unconstrained environments are challenging because of variations in pose, illumination, and occlusion. The human face is difficult to model because many variables can change, such as facial expression, orientation, lighting conditions, and partial occlusion by sunglasses, scarves, masks, and other objects. Recent studies have shown that deep learning approaches can achieve impressive performance on these two tasks. In this paper, the Multi-Task Cascaded Convolutional Neural Network (MTCNN) approach is used to detect multiple faces in an image and to map each detection to an individual face crop, for use by other systems that depend on face detection. This study aims to implement the MTCNN architecture using TensorFlow and OpenCV, with two main benefits. First, it is expected to provide a pre-trained model that performs optimally and to strengthen the evidence from previous studies that have examined this model. Second, the model's detections can be used as input for other systems. The input variable is a photo containing one or more faces to be processed. The photos have various pixel dimensions to represent different resolutions. The output variable is either the coordinates of each detected face location or the landmarks of key facial points, such as the positions of the eyes, the nose, and the corners of the mouth. Averaged over the pixel dimension classes in the dataset, the results show an accuracy of 93%, a precision of 95%, a recall of 96%, an F1-score of 95%, and an ROC-AUC of 90.89%.
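The mapping from detections to individual face crops described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each detection is a dictionary with a `box` entry in `[x, y, width, height]` form plus `confidence` and `keypoints` entries, which is the layout used by some open-source MTCNN wrappers. The function name `crop_faces`, the nested-list image, and the sample values are hypothetical and serve only to keep the example self-contained.

```python
from typing import Dict, List, Sequence

# A stand-in for an image array: a list of rows, each row a list of pixel values.
Image = List[List[int]]


def crop_faces(image: Image, detections: Sequence[Dict]) -> List[Image]:
    """Return one cropped sub-image per detection (hypothetical helper).

    Assumes each detection has a "box" entry of the form [x, y, width, height].
    """
    height = len(image)
    width = len(image[0]) if height else 0
    crops = []
    for det in detections:
        x, y, w, h = det["box"]
        # Clamp the box to the image bounds, since a detector may return
        # coordinates that fall slightly outside the image.
        x0, y0 = max(0, x), max(0, y)
        x1, y1 = min(width, x + w), min(height, y + h)
        if x1 <= x0 or y1 <= y0:
            continue  # the box lies entirely outside the image
        crops.append([row[x0:x1] for row in image[y0:y1]])
    return crops


# Toy 4x6 "image" whose pixel value encodes its position (row * 10 + column).
image = [[r * 10 + c for c in range(6)] for r in range(4)]

# Two made-up detections; the second box extends past the image borders.
detections = [
    {"box": [1, 1, 2, 2], "confidence": 0.99,
     "keypoints": {"left_eye": (1, 1), "right_eye": (2, 1)}},
    {"box": [-1, 2, 3, 5], "confidence": 0.95, "keypoints": {}},
]

faces = crop_faces(image, detections)
for index, face in enumerate(faces):
    print(f"face {index}: {len(face)} rows x {len(face[0])} columns")
```

With a real image array, the same clamping logic would apply and the list slicing would typically be replaced by array slicing. Each crop can then be saved or passed to a downstream system independently of the others.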

Section
Informatics
