Research on intelligent classification of coal mine safety hazards based on the CoalBERT model
-
Graphical Abstract
-
Abstract
With the rapid development of information technology, the coal mining industry has widely adopted information platforms to optimize operations and improve production safety. These platforms have helped coal mining enterprises accumulate and manage large amounts of data, but the semantic complexity and domain specialization of coal mine safety hazard texts make this data difficult to use effectively. To address this, 17 first-level and 109 second-level hazard categories are defined on the basis of the 2022 version of the Coal Mine Safety Regulations and used as the label system for coal mine safety hazard data. A systematic coal mine safety hazard classification method is then constructed, in which a CoalBERT pre-trained language model classifies hazard texts into this two-level category system, with the BERT model serving as the baseline for comparative analysis. By introducing domain term masking language modeling (DP-MLM) and sentence order prediction (SOP) tasks, CoalBERT addresses two major limitations of general-purpose models in the coal mine safety domain: insufficient semantic understanding of professional terms such as “anchor support” and “gas extraction”, and limited modeling of the logical coherence of hazard description texts. The model is trained in the PyTorch framework with a specified learning rate and number of iterations, and is optimized using stochastic gradient descent. The results indicate that the CoalBERT model performs well in coal mine safety hazard classification. In the first-level category classification experiment, the CoalBERT model outperforms the BERT model in accuracy, recall, and F1 score, with improvements of 0.34%, 0.21%, and 0.27%, respectively. In the second-level category classification experiment, the F1 score of the CoalBERT model increases by 3%-5% on average, and the highest classification performance reaches 97.75%.
CoalBERT's advantage is most pronounced in categories such as “mine construction”, “rockburst prevention and control”, and “hidden danger investigation”. These results suggest that the coal mine safety hazard classification algorithm based on the CoalBERT pre-trained language model can serve as an important auxiliary tool for coal mine safety management, supporting improvements in safety management and the prevention of accidents.
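
To illustrate the two pre-training tasks named above, the following is a minimal sketch of how training examples for domain term masking and sentence order prediction might be built. It is not the authors' implementation: the function names `mask_domain_terms` and `make_sop_pair`, the `[MASK]` placeholder, the masking probability, and the English example tokens are all assumptions made for illustration, and only domain-term spans are masked here for simplicity.

```python
import random

MASK = "[MASK]"  # assumed placeholder token, as in BERT-style vocabularies


def mask_domain_terms(tokens, domain_terms, mask_prob=0.15, rng=None):
    """Mask whole domain-term spans instead of isolated tokens.

    tokens:       list of str, the tokenized hazard description
    domain_terms: iterable of token tuples, e.g. ("anchor", "support")
    Returns (masked_tokens, labels); labels hold the original token at
    masked positions and None elsewhere.
    """
    rng = rng or random.Random()
    masked = list(tokens)
    labels = [None] * len(tokens)
    i = 0
    while i < len(tokens):
        # Find the longest domain term that starts at position i.
        match_len = 0
        for term in domain_terms:
            n = len(term)
            if n > match_len and tuple(tokens[i:i + n]) == tuple(term):
                match_len = n
        # Mask the entire term span with probability mask_prob.
        if match_len and rng.random() < mask_prob:
            for j in range(i, i + match_len):
                labels[j] = tokens[j]
                masked[j] = MASK
        # Skip past the term (masked or not), or advance one token.
        i += match_len if match_len else 1
    return masked, labels


def make_sop_pair(segment_a, segment_b, rng=None):
    """Build one sentence order prediction example.

    Returns (first, second, label): label 1 means the two consecutive
    segments are in their original order, 0 means they were swapped.
    """
    rng = rng or random.Random()
    if rng.random() < 0.5:
        return segment_a, segment_b, 1
    return segment_b, segment_a, 0


if __name__ == "__main__":
    terms = [("anchor", "support"), ("gas", "extraction")]
    sentence = ["the", "anchor", "support", "is", "loose"]
    print(mask_domain_terms(sentence, terms, mask_prob=1.0))
    print(make_sop_pair("Roof bolts are loose.", "Repair is required.",
                        random.Random(0)))
```

Masking a term as a whole span forces the model to predict the full professional term from context rather than completing it from its own fragments, while the swapped pairs in the second task give the model a signal about the logical order of statements in a hazard description.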
-
-