
Stacking ensemble learning based material removal rate prediction model for CMP process of semiconductor wafer

Published online by Cambridge University Press: 15 November 2024

Zhilong Song, Wenhong Zhao, Xiao Zhang, Mingfeng Ke, Wei Fang and Binghai Lyu*

Affiliations (all authors):
College of Mechanical Engineering, Zhejiang University of Technology, Hangzhou, China
Ultra-Precision Machining Center, Key Laboratory of Special Purpose Equipment and Advanced Processing Technology, Ministry of Education and Zhejiang Province, Zhejiang University of Technology, Hangzhou, China

* Corresponding author: Lyu Binghai; Email: [email protected]

Abstract

The material removal rate (MRR) is a key indicator in the chemical mechanical polishing (CMP) of semiconductor wafers. The mainstream approach, determining the MRR through offline measurement, is time-consuming and does not capture process variability accurately. This study proposes an efficient MRR prediction model based on stacking ensemble learning that integrates base models with disparate architectures. First, the process signals collected during wafer polishing, as provided in the PHM2016 dataset, were analyzed and preprocessed to extract statistical and neighbor-domain features. Next, Pearson correlation coefficient analysis (PCCA) and principal component analysis (PCA) were used to fuse the extracted features. Finally, a random forest (RF), a light gradient boosting machine (LightGBM), and a backpropagation neural network (BPNN), each with hyperparameters tuned by the Bayesian optimization algorithm, were combined into a stacking ensemble MRR prediction model. On the PHM2016 benchmark test set, the model achieved a mean square error (MSE) of 7.72 and a coefficient of determination (R²) of 95.82%. These results indicate that a stacking ensemble built from base models of disparate architectures has considerable potential for real-time MRR prediction in the CMP of semiconductor wafers.
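
The stacking idea summarized above can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: a least-squares line and a k-nearest-neighbour average stand in for the RF, LightGBM and BPNN base models, the data are synthetic rather than taken from PHM2016, no feature fusion or Bayesian hyperparameter tuning is performed, and all function names are hypothetical. The sketch shows only the core mechanism: base models produce out-of-fold predictions, a meta-learner is fitted on those predictions, and the result is scored with MSE and R².

```python
import math
import random


def mse(y_true, y_pred):
    """Mean square error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(xs, ys):
    """Stand-in base model 1: least-squares line; returns a predict function."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)


def fit_knn(xs, ys, k=3):
    """Stand-in base model 2: k-nearest-neighbour average."""
    pairs = list(zip(xs, ys))

    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)

    return predict


BASE_FITTERS = [fit_line, fit_knn]


def out_of_fold(xs, ys, n_folds=5):
    """Predict every training point with base models that never saw it."""
    oof = [[0.0] * len(BASE_FITTERS) for _ in xs]
    for fold in range(n_folds):
        held = [i for i in range(len(xs)) if i % n_folds == fold]
        kept = [i for i in range(len(xs)) if i % n_folds != fold]
        for j, fitter in enumerate(BASE_FITTERS):
            model = fitter([xs[i] for i in kept], [ys[i] for i in kept])
            for i in held:
                oof[i][j] = model(xs[i])
    return oof


def fit_meta(oof, ys):
    """Meta-learner: least-squares weights (no intercept) for two base models."""
    a11 = sum(p[0] * p[0] for p in oof)
    a12 = sum(p[0] * p[1] for p in oof)
    a22 = sum(p[1] * p[1] for p in oof)
    b1 = sum(p[0] * y for p, y in zip(oof, ys))
    b2 = sum(p[1] * y for p, y in zip(oof, ys))
    det = a11 * a22 - a12 * a12  # 2x2 normal equations, solved by Cramer's rule
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)


# Synthetic one-feature regression problem (illustrative only, not PHM2016).
rng = random.Random(0)
xs = [rng.uniform(0.0, 10.0) for _ in range(120)]
ys = [50.0 + 3.0 * x + 5.0 * math.sin(x) + rng.gauss(0.0, 1.0) for x in xs]
train_x, train_y = xs[:90], ys[:90]
test_x, test_y = xs[90:], ys[90:]

oof = out_of_fold(train_x, train_y)          # level-0 predictions
weights = fit_meta(oof, train_y)             # level-1 (meta) model
models = [fitter(train_x, train_y) for fitter in BASE_FITTERS]


def stacked_predict(x):
    """Combine the refitted base models with the learned meta weights."""
    return sum(w * m(x) for w, m in zip(weights, models))


test_pred = [stacked_predict(x) for x in test_x]
print("meta weights:", weights)
print("test MSE:", mse(test_y, test_pred), "test R2:", r2(test_y, test_pred))
```

Fitting the meta-learner on out-of-fold predictions, rather than on predictions for data the base models were trained on, keeps it from simply rewarding whichever base model overfits the most.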

Type: Research Article
Copyright: © The Author(s), 2024. Published by Cambridge University Press

