Published: 2026

Can Large Language Models Automate the HAZOP Process without Human Intervention?

CATEGORIES

RISK-BASED PROCESS SAFETY ELEMENTS

Research Summary

This study tests whether four multimodal LLMs (GPT-4o, GPT-4o-mini, LLAMA 3.2, and Gemini 2.0) can autonomously perform a complete HAZOP study from a P&ID using a standardized prompt. A total of 120 HAZOP worksheets were generated and benchmarked against expert-prepared references. Although all models achieved high textual similarity (F1 > 86%), the proportion of semantically valid accident scenarios was low (19-37%), and the safeguards were heavily biased toward procedural measures. The authors conclude that LLMs are useful as assistive tools but cannot replace expert-led HAZOP, and that advanced prompt engineering and domain-specific fine-tuning are needed. The study is highly relevant to process safety management (PSM) because it provides empirical evidence on the capabilities and limitations of prompt-based LLM automation for Process Hazard Analysis.

AUTHORS

J. Lee, S. Park, S. Oh, and B. Ma

CITATIONS

J. Lee, S. Park, S. Oh, and B. Ma, "Can large language models automate the HAZOP process without human intervention?," Safety Sci., vol. 187, art. 106820, 2025, doi: 10.1016/j.ssci.2025.106820.