Causal Temporal Reasoning for Markov Decision Processes

Milad Kazemi; Jessica Lally; Nicola Paoletti

doi:10.1017/cbp.2025.2

Accepted manuscript

Published online by Cambridge University Press: 10 February 2025

Milad Kazemi: Affiliation:
Department of Informatics, King’s College London, London, United Kingdom
Jessica Lally: Affiliation:
Department of Informatics, King’s College London, London, United Kingdom
Nicola Paoletti: Affiliation:
Department of Informatics, King’s College London, London, United Kingdom

Abstract

We present PCFTL (Probabilistic CounterFactual Temporal Logic), a new probabilistic temporal logic for the verification of Markov Decision Processes (MDPs). PCFTL introduces operators for causal inference, allowing us to express interventional and counterfactual queries. Given a path formula φ, an interventional property is concerned with the satisfaction probability of φ if we apply a particular change I to the MDP (e.g., switching to a different policy). A counterfactual formula allows us to compute, given an observed MDP path τ, what the outcome of φ would have been had we applied I in the past and under the same random factors that led to observing τ. Our approach represents a departure from existing probabilistic temporal logics that do not support such counterfactual reasoning. From a syntactic viewpoint, we introduce a counterfactual operator that subsumes both interventional and counterfactual probabilities as well as the traditional probabilistic operator. This makes our logic strictly more expressive than PCTL⋆. The semantics of PCFTL rely on a structural causal model translation of the MDP, which provides a representation amenable to counterfactual inference. We evaluate PCFTL in the context of safe reinforcement learning using a benchmark of grid-world models.
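
The difference between the interventional and counterfactual queries described above can be illustrated with a minimal sketch. It is not the authors' implementation: the three-state MDP, the policies, the safety property and all function names are hypothetical, and the structural causal model uses a simple inverse-CDF mechanism with uniform noise, which is an assumption chosen for brevity (the abstract does not say which encoding PCFTL uses). The interventional estimate draws fresh noise under the new policy, while the counterfactual estimate first samples noise from its posterior given the observed path τ and then replays that noise under the new policy.

```python
import random

# Hypothetical 3-state MDP: P[state][action] is a distribution over next states.
# State 2 is an absorbing "unsafe" state.
P = {
    0: {"a": [0.1, 0.6, 0.3], "b": [0.5, 0.4, 0.1]},
    1: {"a": [0.0, 0.5, 0.5], "b": [0.2, 0.7, 0.1]},
    2: {"a": [0.0, 0.0, 1.0], "b": [0.0, 0.0, 1.0]},
}

def step(state, action, u):
    """Structural equation: next state is a deterministic function of
    (state, action, noise u), here via the inverse CDF (an assumption)."""
    cum = 0.0
    for nxt, p in enumerate(P[state][action]):
        cum += p
        if u < cum:
            return nxt
    return len(P[state][action]) - 1  # guard against floating-point rounding

def posterior_noise(state, action, nxt, rng):
    """Sample u given an observed transition: under the inverse-CDF mechanism
    the posterior is uniform on the interval that maps to `nxt`."""
    probs = P[state][action]
    lo = sum(probs[:nxt])
    return lo + rng.random() * probs[nxt]

def rollout(policy, s0, noises):
    """Replay a fixed noise sequence under a policy, returning the path."""
    path = [s0]
    for u in noises:
        path.append(step(path[-1], policy(path[-1]), u))
    return path

def interventional_prob(policy, s0, horizon, phi, n, rng):
    """Monte Carlo estimate of P(phi) under the intervened policy, fresh noise."""
    hits = 0
    for _ in range(n):
        noises = [rng.random() for _ in range(horizon)]
        hits += phi(rollout(policy, s0, noises))
    return hits / n

def counterfactual_prob(obs_path, obs_policy, new_policy, phi, n, rng):
    """Monte Carlo estimate of P(phi) had new_policy been applied, under
    noise consistent with the observed path."""
    hits = 0
    for _ in range(n):
        noises = [posterior_noise(s, obs_policy(s), s2, rng)
                  for s, s2 in zip(obs_path, obs_path[1:])]
        hits += phi(rollout(new_policy, obs_path[0], noises))
    return hits / n

# Hypothetical policies and path property (never visit the unsafe state).
def always_a(state):
    return "a"

def always_b(state):
    return "b"

def safe(path):
    return 2 not in path

if __name__ == "__main__":
    rng = random.Random(0)
    observed = [0, 1, 1, 0]  # a path assumed to be observed under always_b
    print("interventional:", interventional_prob(always_a, 0, 3, safe, 10000, rng))
    print("counterfactual:", counterfactual_prob(observed, always_b, always_a, safe, 10000, rng))
```

One sanity check on this sketch: if the counterfactual policy equals the observed policy, replaying the posterior noise reproduces the observed path, so the counterfactual probability of φ is either 0 or 1 depending on whether τ itself satisfies φ. Note that the inverse-CDF mechanism depends on the ordering of states, so other structural causal models of the same MDP can give different counterfactual answers.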

Keywords

causality; temporal logic; verification

Type: Results
Information: Research Directions: Cyber-Physical Systems, Accepted manuscript, pp. 1-23

DOI: https://doi.org/10.1017/cbp.2025.2
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives licence (http://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is unaltered and is properly cited. The written permission of Cambridge University Press must be obtained for commercial re-use or in order to create a derivative work.
