Life-Cycle Maintenance Optimization of Bridge Networks Against Seismic Risks Through Hierarchical Deep Reinforcement Learning
Engineering networks are designed to satisfy multiple performance requirements over their life-cycle. However, they are subject to different degradation mechanisms triggered by stressors and hazards. In response, sequential data-collection and intervention strategies need to be planned efficiently to monitor and maintain structural health within levels that minimize costs and risks. This defines a computationally hard optimization problem, the complexity of which is further magnified by the inherent noise in inspection outcomes, the large number of components, and the immense resulting state and action spaces. Recent research has indicated that Deep Reinforcement Learning (DRL), formulated on partially observable Markov decision processes (POMDPs), has significant potential for global optimization of inspection and maintenance at scale. Specifically, decentralized multi-agent actor-critic DRL-POMDP formulations have been shown to tackle structural integrity management problems, allowing system-level decisions to scale linearly with the number of components. However, for environments with very large numbers of components, coordinating multiple agents can become cumbersome, which raises the need to exploit known structural dependencies. To address these challenges, we propose a novel framework for structural integrity management of large-scale structural networks, using continuous-action hierarchical resource allocation DRL architectures. The global policy is decomposed into subpolicies, which refer to simpler decision subproblems defined at different hierarchy levels of the system (e.g., network, link, bridge, bridge parts). The new framework is tested in a structural integrity management application of a bridge network subjected to time-dependent oxidative corrosion deterioration and earthquake hazards.
The learned policies are shown to mitigate the continuous deterioration, keeping the probability of failure due to seismic shocks within predetermined thresholds, while minimizing the life-cycle costs. Comparisons against traditional baselines and DRL counterparts verify the efficiency of the algorithmic framework.
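To make the idea of hierarchical resource allocation concrete, the following is a minimal illustrative sketch in Python, not the architecture used in this work. It assumes a hypothetical tree (network, links, bridges) in which each node carries a `score`; in a DRL setting such scores would be continuous outputs of learned subpolicies, but here they are fixed placeholder numbers. At each hierarchy level, a softmax turns the children's scores into budget shares, so a single budget is split top-down into per-bridge allocations.

```python
import math


def softmax(scores):
    """Convert raw scores into positive shares that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def allocate(node, budget, out=None):
    """Recursively split `budget` down the tree.

    Returns a dict mapping each leaf name to its share of the budget.
    The 'score' fields are hypothetical stand-ins for subpolicy outputs.
    """
    if out is None:
        out = {}
    children = node.get("children")
    if not children:  # leaf node: receives whatever budget reached it
        out[node["name"]] = budget
        return out
    shares = softmax([child["score"] for child in children])
    for child, share in zip(children, shares):
        allocate(child, budget * share, out)
    return out


# Hypothetical two-link network; names and scores are illustrative only.
network = {
    "name": "network",
    "children": [
        {"name": "link_A", "score": 0.0, "children": [
            {"name": "bridge_A1", "score": 1.0},
            {"name": "bridge_A2", "score": 1.0},
        ]},
        {"name": "link_B", "score": 0.0, "children": [
            {"name": "bridge_B1", "score": 0.0},
        ]},
    ],
}

allocation = allocate(network, 100.0)
print(allocation)
```

Because every level only decides how to split what it receives, each subproblem stays small regardless of the total number of components, which is the intuition behind decomposing the global policy by hierarchy level.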