Machine learning has exhibited substantial success in the field of natural language processing (NLP). For example, large language models have empirically proven capable of producing text of high complexity and cohesion. However, at the same time, they are prone to inaccuracies and hallucinations. As these systems are increasingly integrated into real-world applications, ensuring their safety and reliability becomes a primary concern. There are safety-critical contexts where such models must be robust to variability or attack and give guarantees over their output. Computer vision pioneered the use of formal verification of neural networks for such scenarios and developed common verification standards and pipelines, leveraging precise formal reasoning about geometric properties of data manifolds. In contrast, NLP verification methods have only recently appeared in the literature. While presenting sophisticated algorithms in their own right, these papers have not yet crystallised into a common methodology. They are often light on the pragmatic issues of NLP verification, and the area remains fragmented. In this paper, we attempt to distil and evaluate general components of an NLP verification pipeline that emerges from the progress in the field to date. Our contributions are twofold. First, we propose a general methodology to analyse the effect of the embedding gap – the discrepancy between the verification of geometric subspaces and the semantic meaning of the sentences those subspaces are supposed to represent. We propose a number of practical NLP methods that can help to quantify the effects of the embedding gap. Second, we give a general method for training and verification of neural networks that leverages a more precise geometric estimation of semantic similarity of sentences in the embedding space and helps to overcome the effects of the embedding gap in practice.
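To make the embedding-gap analysis concrete, here is a minimal Python sketch of one way such a gap could be probed; it is our illustration rather than the paper's pipeline, and the sentence-transformers model name, the example sentences and the radius eps are all hypothetical choices.

```python
# A minimal sketch (not the paper's pipeline) of probing the embedding gap:
# embed sentence perturbations and check whether semantically close variants
# also fall inside the geometric ball a verifier would certify around the
# original embedding. Model name, sentences and eps are illustrative.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

original = "The patient should take two tablets daily."
perturbations = [
    "The patient should take 2 tablets every day.",   # paraphrase
    "The patient should take two tablets hourly.",    # meaning change
]

emb_orig = model.encode(original, normalize_embeddings=True)
emb_pert = model.encode(perturbations, normalize_embeddings=True)

eps = 0.3  # hypothetical radius of a verified ball around emb_orig
for sent, emb in zip(perturbations, emb_pert):
    l2_dist = np.linalg.norm(emb - emb_orig)   # geometric proximity
    cos_sim = float(np.dot(emb, emb_orig))     # semantic proximity proxy
    print(f"{sent!r}: L2={l2_dist:.3f}, cos={cos_sim:.3f}, inside ball={l2_dist <= eps}")

# A large embedding gap shows up as paraphrases falling outside the ball
# (missed coverage) or meaning-changing edits falling inside it (unsound coverage).
```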
We present PCFTL (Probabilistic CounterFactual Temporal Logic), a new probabilistic temporal logic for the verification of Markov Decision Processes (MDP). PCFTL introduces operators for causal inference, allowing us to express interventional and counterfactual queries. Given a path formula φ, an interventional property is concerned with the satisfaction probability of φ if we apply a particular change I to the MDP (e.g., switching to a different policy); a counterfactual formula allows us to compute, given an observed MDP path τ, what the outcome of φ would have been had we applied I in the past and under the same random factors that led to observing τ. Our approach represents a departure from existing probabilistic temporal logics that do not support such counterfactual reasoning. From a syntactic viewpoint, we introduce a counterfactual operator that subsumes both interventional and counterfactual probabilities as well as the traditional probabilistic operator. This makes our logic strictly more expressive than PCTL⋆. The semantics of PCFTL rely on a structural causal model translation of the MDP, which provides a representation amenable to counterfactual inference. We evaluate PCFTL in the context of safe reinforcement learning using a benchmark of grid-world models.
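For readers unfamiliar with causal queries, the following sketch in standard causal-inference notation (not the paper's concrete PCFTL syntax) distinguishes the three kinds of probabilities mentioned above, for a path formula φ, a change I and an observed path τ.

```latex
% Standard causal-inference notation, used here only as an illustration of
% the three query types; it is not PCFTL's own syntax.
\begin{align*}
  &\Pr(\varphi)
    && \text{observational: the usual PCTL}^{\star}\text{-style probability,}\\
  &\Pr\bigl(\varphi \mid \mathrm{do}(I)\bigr)
    && \text{interventional: probability of } \varphi \text{ after applying the change } I,\\
  &\Pr\bigl(\varphi_{I} \mid \tau\bigr)
    && \text{counterfactual: } \varphi \text{ under } I\text{, holding fixed the random factors behind } \tau.
\end{align*}
```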
The ‘linguistic turn’ in twentieth-century philosophy is reflected through Neurath’s writings of his British period. He responded to serious criticism that Bertrand Russell made in his book An Inquiry into Meaning and Truth, developing the physicalism of the Vienna Circle into a cautious approach to ‘terminology’. Neurath revealed details of his index verborum prohibitorum, a list of ‘dangerous’ words to be avoided due to their misleading and metaphysical connotations. However, Neurath was resistant to the formalist tendencies evident in the work of Vienna Circle associates, in particular Carnap’s development of semantics. Their disagreement on the matter is examined through their prolific correspondence of the 1940s. While Neurath is often portrayed as losing this battle, we discuss how his own approach to the philosophy of language (including his ‘terminology’ project) prefigured the later development of ‘ordinary language philosophy’ to a certain extent.
We present a practical verification method for safety analysis of the autonomous driving system (ADS). The main idea is to build a surrogate model that quantitatively depicts the behavior of an ADS in the specified traffic scenario. The safety properties proved in the resulting surrogate model apply to the original ADS with a probabilistic guarantee. Given the complexity of a traffic scenario in autonomous driving, our approach further partitions the parameter space of a traffic scenario for the ADS into safe sub-spaces with varying levels of guarantees and unsafe sub-spaces with confirmed counter-examples. Innovatively, the partitioning is based on a branching algorithm that features explainable AI methods. We demonstrate the utility of the proposed approach by evaluating safety properties on the state-of-the-art ADS Interfuser, with a variety of simulated traffic scenarios, and we show that our approach and existing ADS testing work complement each other. We certify five safe scenarios from the verification results and uncover three subtle behavioral discrepancies in Interfuser that can hardly be detected by safety testing approaches.
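As a rough illustration of surrogate-based partitioning (not the paper's algorithm, which uses explainable-AI-guided branching on a real ADS), the following Python sketch fits a cheap surrogate of a safety metric over a two-dimensional scenario space and labels parameter boxes as safe, unsafe, or in need of further branching; the simulator, metric and thresholds are hypothetical.

```python
# A toy sketch of surrogate-based scenario partitioning. The "simulator",
# the safety metric and all thresholds are hypothetical stand-ins.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def simulate_min_distance(speed, gap):
    # stand-in for an expensive ADS simulation: minimum distance to the lead vehicle
    return gap - 0.08 * speed**1.5 + np.random.normal(0, 0.1)

rng = np.random.default_rng(0)
X = rng.uniform([0, 5], [30, 60], size=(500, 2))            # (ego speed, initial gap)
y = np.array([simulate_min_distance(s, g) for s, g in X])
surrogate = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

def label_box(lo, hi, threshold=2.0, n=256):
    # sample the box, query the surrogate, and re-simulate the worst candidates
    pts = rng.uniform(lo, hi, size=(n, 2))
    preds = surrogate.predict(pts)
    if preds.min() > threshold:
        return "safe (up to surrogate error)"
    if any(simulate_min_distance(s, g) <= threshold for s, g in pts[preds.argsort()[:5]]):
        return "unsafe (counterexample found)"
    return "undecided: branch further"

print(label_box(np.array([0, 40]), np.array([10, 60])))     # slow ego, large gap
print(label_box(np.array([25, 5]), np.array([30, 15])))     # fast ego, small gap
```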
Climate finance remains a relatively small part of the global finance market but is becoming increasingly prominent; it is expected that all global finance will take on climate characteristics, causing climate standards and verification to become common. This chapter explores the current climate finance markets and the standards and other market infrastructure that have developed.
Delegated computation is a two-party task where there is a large asymmetry between the two parties: on the one hand, Alice would like to execute a quantum computation, but she does not have a powerful enough quantum computer to execute it. On the other hand, Bob has a quantum computer, but he is not trusted by Alice. Can Alice make sure that Bob executes her computation correctly for her? In this chapter we present three very different approaches to this problem. Each of the approaches is based on a different model for quantum computation, and the chapter also serves as an introduction to these models.
In previous work, summarized in this paper, we proposed an operation of parallel composition for rewriting-logic theories, allowing compositional specification of systems and reusability of components. The present paper focuses on compositional verification. We show how the assume/guarantee technique can be transposed to our setting by giving appropriate definitions of satisfaction based on transition structures and path semantics. We also show that simulation and equational abstraction can be done componentwise. Appropriate concepts of fairness and deadlock for our composition operation are discussed, as they affect satisfaction of temporal formulas. We maintain, in parallel, a distributed and a global view of composed systems. We show that these views are equivalent and interchangeable, which may aid intuition and also has practical uses: for example, it allows global-style verification of a modularly specified system. Under consideration in Theory and Practice of Logic Programming (TPLP).
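For reference, the classical assume/guarantee proof rule reads as follows in its standard transition-system form; the paper transposes this idea to rewriting-logic theories, so the notation below is illustrative rather than the paper's own.

```latex
% The classical assume/guarantee rule (standard transition-system form).
% \langle A \rangle M \langle G \rangle reads "whenever its environment
% satisfies the assumption A, component M satisfies the guarantee G".
\[
  \frac{\langle A \rangle\, M_1\, \langle G \rangle
        \qquad
        \langle \mathit{true} \rangle\, M_2\, \langle A \rangle}
       {\langle \mathit{true} \rangle\, M_1 \parallel M_2\, \langle G \rangle}
\]
```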
Here is a brief introduction to Ayer's radical criticism of religious belief. According to Ayer, a sentence like ‘God exists’ doesn't assert something false; rather, it fails to assert anything at all.
Verification and falsification are standard techniques for the evaluation of truth claims. Both are problematic because they rest on shifting understandings of these concepts and their operationalization. Science as a practice offers an alternative and more sophisticated approach to assessment.
From the earliest days of the nuclear age, the issue of verification has plagued efforts to restrain the development, testing, and deployment of nuclear weapons and to ensure their destruction. It continues to do so. Especially sensitive are on-site inspections, but they have proved their worth in disarmament treaties since the 1980s and the last years of the Cold War. This chapter looks at verification thematically, by reference to testing, non-proliferation, disarmament, and deployment of nuclear weapons.
Agent-based social simulations have historically been evaluated using two criteria: verification and validation. This article questions the adequacy of this dual evaluation scheme. It claims that the scheme does not conform to everyday practices of evaluation, and has, over time, fostered a theory-practice gap in the assessment of social simulations. This gap originates because the dual evaluation scheme, inherited from computer science and software engineering, on one hand, overemphasizes the technical and formal aspects of the implementation process and, on the other hand, misrepresents the connection between the conceptual and the computational model. The mismatch between evaluation theory and practice, it is suggested, might be overcome if practitioners of agent-based social simulation adopt a single criterion evaluation scheme in which: i) the technical/formal issues of the implementation process are tackled as a matter of debugging or instrument calibration, and ii) the epistemological issues surrounding the connection between conceptual and computational models are addressed as a matter of validation.
The success of any arms control treaty generally depends on its ability to achieve its primary objectives and intended outcomes. At the heart of measuring such success are effective compliance criteria and verification mechanisms. This includes the ability to apply metrics to assess tangible outcomes and measurable outputs and benchmarks of achievement, including on-site visits. In relation to nuclear issues, this also means verification of both the non-diversion of nuclear material from declared peaceful activities (i.e., correctness of conduct) and the absence of undeclared or clandestine nuclear activities in a particular state (i.e., completeness in following treaty terms).
Nothing about developing and implementing a treaty on the prohibition of nuclear weapons is easy. While supporters of the TPNW undoubtedly claim a victory in its coming into being, its opponents note its shortcomings, warning of adverse and dire consequences. The degree to which such concerns will materialize remains to be seen. What is certain, however, is that the adoption of the TPNW has marked the beginning of a new schism in the international community. The word schism is appropriate in this context, loosely defined as “a split or division between strongly opposed sections or parties, caused by differences in opinion or belief.”
The Treaty on the Prohibition of Nuclear Weapons 2017 marks an important development in nuclear arms control law, diplomacy and relations between states. Adopted by the UN General Assembly on July 7, 2017, it was supported by 122 nations, representing a potential disruptor to the nuclear status quo. It is the first treaty to ban nuclear weapons outright, taking a clear humanitarian approach to disarmament. Despite its success in coming to fruition, however, it is not celebrated by all nations. The permanent members of the UN Security Council neither participated in its negotiations, nor adopted the final text. No state with nuclear weapons endorses the Treaty, and indeed the nuclear-armed states openly oppose its very existence.
Concolic testing is a popular software verification technique based on a combination of concrete and symbolic execution. Its main focus is finding bugs and generating test cases with the aim of maximizing code coverage. A previous approach to concolic testing in logic programming was not sound because it only dealt with positive constraints (by means of substitutions) but could not represent negative constraints. In this paper, we present a novel framework for concolic testing of CLP programs that generalizes the previous technique. In the CLP setting, one can represent both positive and negative constraints in a natural way, thus giving rise to a sound and (potentially) more efficient technique. Defining verification and testing techniques for CLP programs is increasingly relevant since this framework is becoming popular as an intermediate representation to analyze programs written in other programming paradigms.
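The core concolic loop, and the role of negative constraints in it, can be illustrated with a small Python sketch using the z3 solver; this is a generic illustration of concolic testing, not the paper's CLP framework, and the program under test is hypothetical.

```python
# A minimal concolic-testing sketch: run on a concrete input, collect the
# symbolic path constraint, negate it, and ask the solver for an input that
# drives the other branch. Negated constraints such as Not(x > 10) are
# exactly the "negative constraints" a substitution-only approach cannot
# represent.
from z3 import Int, Solver, Not, sat

def program(x):
    # hypothetical program under test: two branches we would like to cover
    if x > 10:
        return "big"
    return "small"

x = Int("x")
concrete = 0                                  # initial concrete input
path_constraint = (x > 10) if concrete > 10 else Not(x > 10)
print(program(concrete), "covered with x =", concrete)

s = Solver()
s.add(Not(path_constraint))                   # flip the branch condition
if s.check() == sat:
    new_input = s.model()[x].as_long()
    print(program(new_input), "covered with x =", new_input)
```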
Chapter 3 explores the promises and contradictions inherent in the information drawn from these local and global knowledge networks. There were tensions that were never quite resolved between the production of locally relevant knowledge that rejected theoretical approaches and a global intellectual movement that praised universal knowledge. The Economic Society responded to this by carefully negotiating the sources of knowledge which it received from its networks, especially on the topics of natural history and medical botany, and building up its own epistemologies and definitions of practical Enlightenment that made the local applicability of any information the ultimate test of its value. Frameworks of knowledge with universal aspirations, such as Linnaean taxonomy, were not welcome when local descriptions would be more translatable within Central America. I argue that these stubbornly local conceptualisations of knowledge became problematic when a comparison with other places was required, for instance in the context of attempting to export plants from Guatemala to other places, and in debating the merits of plantain trees with scholars in other parts of the empire.
This paper clarifies, revises, and extends the account of the transmission of truthmakers by core proofs that was set out in chap. 9 of Tennant (2017). Brauer provided two kinds of example making clear the need for this. Unlike Brouwer’s counterexamples to excluded middle, the examples of Brauer that we are dealing with here establish the need for appeals to excluded middle when applying, to the problem of truthmaker-transmission, the already classical metalinguistic theory of model-relative evaluations.
Arms control and disarmament are among the unfulfilled promises of the UN Charter. Alongside the establishment of an International Peace Force, a Standing Committee on Disarmament should oversee a binding and staged process of universal disarmament, leaving only those arms needed for ensuring internal security. This would require global monitoring and verification, an avoidance of destabilizing forces and building trust among countries. There are both positive recent developments in arms control, and a regression towards arms build-ups eroding the accomplishment of past disarmament proposals. The prevention and abolition of war should be a central focus of renewed global governance, required by fundamental changes in the nature of armed conflict, threats from new technologies and the involvement of new actors beyond states. There are also new capacities for collective good. Proposals for modern comprehensive disarmament must go beyond the simple destruction of weapons to include the adaptation and reconversion of all the economic resources, infrastructure and human resources presently devoted to military forces and the arms industry. Many obstacles are acknowledged and will have to be overcome, but eliminating the anachronism of war will free enormous resources for other, more constructive uses.
This study examined the activation of first language (L1) translations in second language (L2) word recognition in a lexical decision task. Test materials included English words that differed in the frequency of their Chinese translations or in their surface lexical frequency while other lexical properties were controlled. Chinese speakers of English as a second language at different proficiency levels and native speakers of English were tested. Native speakers produced a reliable lexical frequency effect but no translation frequency effect. English as a second language speakers of lower English proficiency showed both a translation frequency effect and a lexical frequency effect, but those of higher English proficiency showed a lexical frequency effect only. The results are discussed in terms of a verification model of L2 word recognition. According to the model, L2 word recognition entails a checking procedure in which activated L2 words are checked against their L1 translations. The two frequency effects are seen to have two different loci: the lexical frequency effect is associated with the initial activation of L2 lemmas, and the translation frequency effect arises in the verification process. Existing evidence for verification in L2 word recognition and new issues this model raises are discussed.
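As a purely illustrative toy (not the authors' model, and with hypothetical constants), the two loci of the frequency effects could be sketched as follows in Python.

```python
# Toy illustration of the two loci: lexical frequency speeds the initial
# activation of the L2 word, while translation frequency speeds the
# verification check against the L1 translation, which higher-proficiency
# readers are assumed to skip. All constants are hypothetical.
def recognition_time(l2_frequency, l1_translation_frequency, proficiency):
    activation_time = 600 - 40 * l2_frequency              # lexical frequency effect
    if proficiency == "low":                                # verify against L1 translation
        verification_time = 250 - 30 * l1_translation_frequency
    else:                                                   # verification skipped
        verification_time = 0
    return activation_time + verification_time

# A low-proficiency reader shows both effects; a high-proficiency reader only the lexical one.
print(recognition_time(l2_frequency=5, l1_translation_frequency=2, proficiency="low"))
print(recognition_time(l2_frequency=5, l1_translation_frequency=2, proficiency="high"))
```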