Machine Learning with Legal Texts

Kevin D. Ashley

doi:10.1017/9781316761380.008

INTRODUCTION

In the examples of ML so far, a program has learned from data about judges, trends, or cases as in the Supreme Court Database, but not from the texts of cases or other legal documents. This chapter introduces applying ML algorithms to corpora of legal texts, discusses how ML models implicitly represent users’ hypotheses about relevance, illustrates how ML can improve full-text legal information retrieval, and explains its role in conceptual information retrieval and in cognitive computing. The chapter also distinguishes between supervised and unsupervised ML from text and discusses techniques for automating learning of structure and semantics from legal documents.

Along the way, the chapter answers the following questions: How can ML be applied to textual data? What is the difference between supervised and unsupervised ML from texts? What is predictive coding? How well does predictive coding work? What is “information extraction” from text? How are texts represented for purposes of applying ML? What is a “support vector machine (SVM)” and why use one with textual data?

APPLYING MACHINE LEARNING TO TEXTUAL DATA

ML algorithms identify patterns in data, summarize those patterns in a model, and use the model to make predictions by identifying the same patterns in new data (see Kohavi and Provost, 1998).

A model is a structure that summarizes the patterns in data in some statistical or logical form in which it can be applied to new data (see Kohavi and Provost, 1998). This book has already introduced some examples of ML models, such as the decision tree for bail decisions in Figure 4.2 or the random forests of decision trees referred to in Section 4.4.

The models capture the strength of the association in the patterns between observed features and an outcome feature. In the bail example, the decision on bail is the outcome feature, and the observed features included whether the offense involved drugs or the offender had a prior record. In the Supreme Court example, the Court's decision to affirm or not is the outcome feature, and the observed features included a justice's gender or the appointing president's party. In each case, the model captures the strength of the association between observed and outcome features statistically, logically, or in some combination of the two.
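The idea of a model that summarizes associations between observed features and an outcome feature can be illustrated with a minimal sketch. The code below is not the decision tree of Figure 4.2 nor any method described in this book; it is a simple frequency-based stand-in, assuming a handful of hypothetical bail records invented purely for illustration. The names `RECORDS`, `fit`, and `predict` are likewise assumptions.

```python
from collections import Counter, defaultdict

# Hypothetical toy records, invented for illustration only.
# "drugs" and "prior_record" are observed features;
# "bail_granted" is the outcome feature.
RECORDS = [
    {"drugs": True,  "prior_record": True,  "bail_granted": False},
    {"drugs": True,  "prior_record": False, "bail_granted": False},
    {"drugs": False, "prior_record": True,  "bail_granted": False},
    {"drugs": False, "prior_record": False, "bail_granted": True},
    {"drugs": False, "prior_record": False, "bail_granted": True},
    {"drugs": True,  "prior_record": False, "bail_granted": True},
]


def fit(records, outcome):
    """Summarize the data as a model.

    The model maps each (feature, value) pair to the fraction of
    records with that pair in which the outcome was True. That
    fraction is a crude statistical measure of association strength.
    """
    counts = defaultdict(Counter)
    for record in records:
        for feature, value in record.items():
            if feature != outcome:
                counts[(feature, value)][record[outcome]] += 1
    return {
        key: c[True] / (c[True] + c[False])
        for key, c in counts.items()
    }


def predict(model, case):
    """Apply the model to a new case.

    Averages the outcome rates of the case's known (feature, value)
    pairs and predicts True when the average is at least 0.5.
    Returns (prediction, score).
    """
    rates = [model[(f, v)] for f, v in case.items() if (f, v) in model]
    if not rates:
        raise ValueError("case shares no feature values with the training data")
    score = sum(rates) / len(rates)
    return score >= 0.5, score


MODEL = fit(RECORDS, "bail_granted")

if __name__ == "__main__":
    for key, rate in sorted(MODEL.items()):
        print(key, round(rate, 2))
    print(predict(MODEL, {"drugs": False, "prior_record": False}))
    print(predict(MODEL, {"drugs": True, "prior_record": True}))
```

On this toy data, a prior record is associated with bail being granted in none of the records, so a new case with a prior record and a drug offense receives a low score and a negative prediction. A real model, such as a decision tree or random forest, would learn how to combine features rather than simply averaging their rates.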
