Application of artificial intelligence large models across the whole helium industry chain

    Abstract: Helium is a non-renewable strategic resource that supports high-tech fields such as aerospace, medical imaging, semiconductor manufacturing, and quantum computing. Within China’s industry chain it faces a dual constraint of resource scarcity and limited information access, which has become a key bottleneck for independent innovation in related sectors. At the resource level, China’s dependence on imported helium has long been high, making helium a typical “chokehold” resource; moreover, the helium content of the natural gas fields that serve as the main source is generally below 0.1%, far under the 0.3% threshold for economic extraction, so helium extraction is poorly economical and cannot readily support large-scale domestic production. At the information level, more than 90% of the core literature on the helium industry chain is published in English, so domestic researchers face high literature acquisition costs, inconsistent translation of technical terms, and inefficient knowledge extraction, all of which significantly hinder independent innovation. To address these problems, this study builds on the open-source ChatGLM series of large language models (LLMs) from Zhipu AI and develops what the authors describe as the first domain-specific large model and intelligent knowledge graph system in China for the entire helium industry chain. The study constructs a dedicated helium dataset covering 1990 to 2024 that integrates six categories of global data sources (journal papers, patents, industry reports, technical standards, monographs, and news reports) and exceeds 1.2 million document segments, providing high-quality data for model training and knowledge graph construction.
    Using entity recognition and relation extraction, the system converts unstructured knowledge into “entity-relation-entity” triples and stores them in a Neo4j database, forming a complete and dynamic knowledge graph of the whole helium industry chain. In addition, a Retrieval-Augmented Generation (RAG) architecture combined with expert annotation provides fact-first knowledge services, addressing at its root the “hallucination” problem common in domain-specific large models. On a professional test set of 500 questions, the system reaches a knowledge question-answering accuracy of 86.4%, an entity recognition F1 score of 89.7%, and a relation extraction accuracy of 84.3%. All of these metrics improve significantly on general-purpose large models, and the average response time of only 1.2 s meets the needs of real-time research. Practical use indicates that the system efficiently supports research tasks along the whole chain, including helium resource assessment, extraction process optimization, recycling technology pathway design, and industry policy analysis. The hallucination-resistant, domain-specific knowledge service system built in this study provides key technical support for overcoming technological bottlenecks and strengthening independent innovation in China’s helium industry chain. It also offers a reusable technical paradigm for intelligent research and development in other strategic resource fields, and is of strategic significance for ensuring that China’s whole helium industry chain is independently controllable and develops with high quality.
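The triple storage and fact-first retrieval described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Triple`, `TripleStore`, and `build_prompt` names are hypothetical, an in-memory dictionary stands in for the Neo4j database, and substring matching stands in for the entity recognition model. The two example triples restate facts from the abstract itself.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Optional


class Triple(NamedTuple):
    """One "entity-relation-entity" fact (hypothetical schema)."""
    head: str
    relation: str
    tail: str


class TripleStore:
    """In-memory stand-in for a graph database such as Neo4j (assumption)."""

    def __init__(self) -> None:
        self._by_entity: Dict[str, List[Triple]] = defaultdict(list)

    def add(self, triple: Triple) -> None:
        # Index the fact under both endpoints so either entity retrieves it.
        head, tail = triple.head.lower(), triple.tail.lower()
        self._by_entity[head].append(triple)
        if tail != head:
            self._by_entity[tail].append(triple)

    def facts_about(self, question: str) -> List[Triple]:
        # Naive entity linking: an entity matches if its name occurs in the question.
        text = question.lower()
        found: List[Triple] = []
        for entity, triples in self._by_entity.items():
            if entity in text:
                for triple in triples:
                    if triple not in found:  # skip facts already collected
                        found.append(triple)
        return found


def build_prompt(question: str, facts: List[Triple]) -> Optional[str]:
    """Return a prompt grounded in retrieved facts, or None if there are none."""
    if not facts:
        # Fact-first behaviour: decline instead of letting the model guess.
        return None
    fact_lines = "\n".join(f"- {t.head} {t.relation} {t.tail}" for t in facts)
    return f"Answer using only these facts:\n{fact_lines}\nQuestion: {question}"


# Illustrative triples taken from statements in the abstract.
store = TripleStore()
store.add(Triple("helium", "used_in", "medical imaging"))
store.add(Triple("helium", "extracted_from", "natural gas"))

question = "Where is helium used?"
print(build_prompt(question, store.facts_about(question)))
```

Returning `None` when no supporting triple is found is one simple way to express the fact-first idea: the language model is only called when retrieved knowledge is available to constrain its answer.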

       
