Book contents
- Legal Informatics
- Copyright page
- Contents
- Figures
- Tables
- Contributors
- Part I Introduction to Legal Informatics
- Part II Legal Informatics
- A. Information Representation, Preprocessing, and Document Assembly
- 2.1 Representation of Legal Information
- 2.2 Information Intermediation
- 2.3 Preprocessing Data
- 2.4 XML in Law
- 2.5 Document Automation
- B. Artificial Intelligence, Machine Learning, Natural Language Processing, and Blockchain
- C. Process Improvement, Gamification, and Design Thinking
- D. Evaluation
- Part III Use Cases in Legal Informatics
- Part IV Legal Informatics in the Industrial Context
2.3 - Preprocessing Data
from A - Information Representation, Preprocessing, and Document Assembly
Published online by Cambridge University Press: 04 February 2021
Summary
Every once in a while, a ready-to-use data set falls down the chimney like a diamond in a gift box, perfectly suited to the problem at hand. Unfortunately, what we usually have is a pile of coal and a rusty shovel. This is because most data resides in unstructured and disparate information systems or data sources. In order to apply most informatics methods, including a markup system like XML, we must first retrieve and then preprocess data from these sources to produce a structured, linked data set. These phases, sometimes colloquially referred to as data scraping, cleaning, wrangling, or “munging,” are arguably more important, and typically more time-consuming, than many other tasks in legal informatics.
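As a rough illustration of the cleaning step described above, the following Python sketch turns a few messy text lines into structured records. The sample strings, the field layout (parties, date, docket number), and the helper names `clean` and `parse` are all hypothetical and chosen only for this example; real legal sources will need their own patterns and far more error handling.

```python
import re
import unicodedata
from datetime import datetime

# Hypothetical raw lines, as they might arrive from a scraped source.
RAW_RECORDS = [
    "  Smith v. Jones ,  12 Mar 2019 ;docket: 18-1234 ",
    "DOE V. ROE, 3 Jan 2020; Docket: 19-0042",
    "page 3 of 17",  # noise that should be rejected, not guessed at
]

# Assumed layout for this example: <parties>, <day Mon year>; docket: <number>
PATTERN = re.compile(
    r"^(?P<parties>.+?)\s*,\s*"
    r"(?P<date>\d{1,2} \w{3} \d{4})\s*;\s*"
    r"docket:\s*(?P<docket>[\d-]+)$",
    re.IGNORECASE,
)


def clean(text):
    """Normalize Unicode forms and collapse runs of whitespace."""
    text = unicodedata.normalize("NFKC", text)
    return re.sub(r"\s+", " ", text).strip()


def parse(raw):
    """Return a structured record, or None if the line does not fit the layout."""
    match = PATTERN.match(clean(raw))
    if match is None:
        return None
    # Month abbreviations are parsed with the default (English) locale.
    date = datetime.strptime(match["date"], "%d %b %Y").date()
    return {
        "parties": match["parties"],
        "date": date.isoformat(),
        "docket": match["docket"],
    }


records = [parse(line) for line in RAW_RECORDS]
structured = [r for r in records if r is not None]
rejected = [line for line, r in zip(RAW_RECORDS, records) if r is None]
```

Keeping the rejected lines in a separate list, rather than silently dropping them, makes it easier to see how much of the source the pattern actually covers.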
- Type: Chapter
- Information: Legal Informatics, pp. 55-60. Publisher: Cambridge University Press. Print publication year: 2021