Plea Bargaining and Prohibition in the Federal Courts, 1908–1934

Published online by Cambridge University Press:  01 July 2024

Abstract

This article documents and explains the emergence of implicit plea bargaining in the federal district courts during the Progressive and Prohibition periods. Three competing explanations for plea bargaining are tested statistically—the caseload, the substantive justice, and the evidentiary quality arguments. All three receive qualified support. The historical operation of each of these causal paths, however, was shaped by the preoccupation of more elite federal judges with their own professional self-image in the face of Prohibition. Implicit plea bargaining in the federal courts emerged reflexively as an unintended consequence of the failed Progressive assault on the “corrupt” explicit plea bargaining practices of lower state and county courts.

Type
Part III: New Theory for Longitudinal Trial Court Research
Copyright
Copyright © 1990 The Law and Society Association.

Footnotes

I appreciate the detailed comments of Albert Alschuler, Chris Achen, Gerald Rosenberg, and especially Frank Munger on an earlier draft. I also was fortunate to have Chris Ansell as a research assistant.

References

1 The sign of this predicted relationship depends on the form of plea bargaining in place, however (see below, in sec. VII).

2 These revisions, providing guidelines for explicit plea bargaining, were provoked by the Supreme Court's “legalization” of plea bargaining in 1970. Before this time, appellate courts had held plea bargaining to be illegitimate (Alschuler, 1979), even though it was practiced widely.

3 “The situation was summarized by an Assistant United States Attorney in Chicago who said, ‘There may be one or two judges in the Northern District of Illinois who make sentence promises in advance of trial, but there may also be one or two judges who take bribes. Neither activity is really considered a suitable part of the judicial process’ ” (Alschuler, 1976: 1078).

4 Alschuler (1976: 1078–79) reports:

Although federal prosecutors were usually willing to discuss plea agreements with defense attorneys, they commonly made available only insubstantial concessions. In some federal courts, pretrial bargaining focused on the prosecutors' sentence recommendations; but most federal prosecutors did not make sentence recommendations, and others made recommendations that were not subject to negotiation. ... In most offense areas, moreover, the Federal Criminal Code was not well adapted to the patterns of charge reduction that characterized bargaining in many state courts.

5 Before 1934 the annual reports were compilations of self-reports by the various district attorneys. After 1934, statistics were compiled centrally. While this change in data collection procedure probably improved the accuracy of national statistics, it had the unfortunate consequence for my purposes of decreasing the detail of district-by-district breakdowns.

Another unfortunate shift in these reports' accounting, at the district level of aggregation, should be noted. Before 1922, offenses were classified by legal charge; after 1922, charges were aggregated into administrative categories. In particular, Volstead Act liquor cases were included in “Public Health and Safety” along with internal revenue liquor cases, narcotics cases, white slavery (i.e., prostitution) cases, peonage cases, and a few others. However, since Volstead Act cases consistently comprise about 90 percent of this category at the national level, in district-specific time-series analyses below, I will treat “Public Health and Safety” as equivalent to liquor cases. This accounting complication has been circumvented for national data.

6 Actually, these national data exclude the District of Columbia district court because for this court reporting was not consistent over time. In particular, the number of both guilty pleas and trials was reported to be zero for 1910–15, even though convictions annually were reported in the thousands. One result of including data from the District of Columbia is the misleading plot presented in the American Law Institute's own study (p. 58), wherein a very sharp increase in the guilty plea rate is shown for 1916.

7 Guilty plea percentages are measured here and throughout not as a percentage of convictions but rather as the probability, on average, that defendants choose guilty pleas. That is,

(# guilty pleas) / (# guilty pleas + # trials).

Dismissal percentages include both nolle prosequi dismissals by prosecutors and “quashed, dismissed, demurrer, etc.” by judges:

(# nol pros + # quashed) / (# cases terminated).

Of the two modes of dismissal, nolle prosequi dismissals by prosecutors are by far the predominant type in this data set (70–90 percent).
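The two rates defined in this note can be sketched in a few lines of Python. The function names and the counts are hypothetical and serve only to make the denominators explicit; they are not drawn from the annual reports.

```python
def guilty_plea_rate(guilty_pleas: int, trials: int) -> float:
    """Guilty pleas as a share of defendants who either pled guilty or went to trial."""
    disposed = guilty_pleas + trials
    if disposed == 0:
        # The rate is undefined when neither pleas nor trials are reported,
        # as with the District of Columbia figures for 1910-15 (note 6).
        raise ValueError("no guilty pleas or trials reported")
    return guilty_pleas / disposed


def dismissal_rate(nol_pros: int, quashed: int, cases_terminated: int) -> float:
    """Prosecutorial plus judicial dismissals as a share of all terminated cases."""
    if cases_terminated <= 0:
        raise ValueError("cases_terminated must be positive")
    return (nol_pros + quashed) / cases_terminated


# Hypothetical counts, for illustration only.
print(guilty_plea_rate(guilty_pleas=800, trials=200))                   # 0.8
print(dismissal_rate(nol_pros=150, quashed=50, cases_terminated=1000))  # 0.2
```

Because the guilty plea rate uses pleas plus trials as its base, it is unaffected by how many cases were dismissed before a plea or trial decision was reached.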

8 In the period under study here, criminal cases predominated in the federal courts, due to Prohibition. Sample distributions of the overall commenced-case mix at the beginning and end of the study period are as follows:

| Type of case | 1908 | 1932 | 1934 |
| --- | --- | --- | --- |
| Criminal cases to which U.S. was a party | 13,345 | 92,174 | 34,152 |
| Civil cases to which U.S. was a party | 3,202 | 34,189 | 9,487 |
| Civil cases to which U.S. was not a party | 11,703 | 26,326 | 26,472 |

Moreover, as is apparent, many of the “civil cases to which U.S. was a party” were in fact Prohibition related. The American Law Institute (1934) study labels these “quasi-criminal” civil cases.

9 The ALI study (1934) reports the distribution of trial lengths, in days, from which an average was calculated (using midpoints of intervals). I consulted the Federal Reporter (for January of each year) in order to code the names and number of full-time district judges on duty in all eighty-two district courts over the entire 1908–32 period. From these names, k was calculated as the average number of the thirteen courts' judges on duty for the three years of the ALI study. For purposes of the trial capacity calculation, raw caseload n was taken to be the sum of criminal and civil cases commenced, averaged over the same three years. To translate trial length into appropriate annual units, an estimate of the number of annual judge-days available for trial duty is required. I chose 200 days (equals 50 weeks × 4 days per week), under the assumption that at least one day per week must be available for other court business. Given all this, trial capacity was calculated as:

10 A simple regression through the data in Fig. 3 generates a significant relationship (R² = .668; t = 4.48). However, this relationship is destroyed by the Connecticut outlier (R² = .018; t = 0.45).
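As a rough consistency check on the two sets of figures, the identity t² = R²(n − 2)/(1 − R²), which holds for ordinary least squares with a single predictor and an intercept, can be inverted to recover the implied number of observations. The sketch below assumes the reported statistics come from such a fit; the function name is hypothetical.

```python
def implied_observations(r_squared: float, t_stat: float) -> float:
    """Sample size implied by R^2 and the slope t-statistic in a one-predictor OLS fit.

    Inverts the identity t^2 = R^2 * (n - 2) / (1 - R^2) to solve for n.
    """
    if not 0 < r_squared < 1:
        raise ValueError("r_squared must lie strictly between 0 and 1")
    return t_stat ** 2 * (1 - r_squared) / r_squared + 2


print(round(implied_observations(0.668, 4.48)))  # 12
print(round(implied_observations(0.018, 0.45)))  # 13
```

The rounded values appear consistent with the thirteen ALI districts being fitted once without and once with Connecticut.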

11 Misdemeanor crimes are an exception to this statement (Feeley, 1979). When prison is not an issue, transaction or “process” costs that otherwise are relatively trivial can loom large. Felony liquor crimes involving mere possession, however, might count in this category, since (depending on the district) prison in these cases was often only a remote possibility.

12 That is,

Expected Prison Sentence =

The American Law Institute study (1934) reports both fines and imprisonments as sentences, but to avoid scaling difficulties only imprisonment sentences were used here.

13 That is,

Note that when the sentence discount equals 1, there is no difference between trial and guilty plea sentences.

14 The sentence discount schedule for judicial plea bargaining is similar to that for sentence recommendation plea bargaining. I suppress it here only to improve visual clarity.

15 In particular, E(Acquittal | Trial) = 1/(1 + v).

16 Northern District of Illinois (Chicago) had a sentence discount for liquor cases that was so extreme that I deleted it from all statistical analyses. It had a calculated value of 2.57, which means that guilty pleas were punished two and a half times more severely than convictions at trial! I strongly suspect the reliability of this particular datum.
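Read together with note 13, the Chicago figure implies that the discount is a ratio of the guilty plea sentence to the trial sentence, so that values below 1 favor guilty pleas and values above 1 penalize them. A minimal sketch under that assumption follows; the function name and the sentence lengths are hypothetical.

```python
def sentence_discount(plea_sentence: float, trial_sentence: float) -> float:
    """Ratio of the guilty plea sentence to the trial sentence (assumed definition)."""
    if trial_sentence <= 0:
        raise ValueError("trial_sentence must be positive")
    return plea_sentence / trial_sentence


# Hypothetical sentence lengths in months, for illustration only.
print(sentence_discount(6.0, 12.0))       # 0.5  (plea sentence is half the trial sentence)
print(sentence_discount(12.0, 12.0))      # 1.0  (no difference between plea and trial)
print(sentence_discount(18.0, 12.0) > 1)  # True (plea punished more severely than trial)
```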

17 A potential complication is whether these sentence discounts are an artifact of charge composition effects. The ALI study did not disaggregate sentencing by charge. There are four reasons why the discounts measured here are not artifacts:

  1. As long as district discounting percentages are constant across charges, then variation across districts in charge composition or in sentencing severity is completely irrelevant to the estimation of discounts. For the example of two charges, where d is the discount, p is the percentage of cases in the first charge class, and m₁, m₂ are mean sentences per charge:

     While perfect constancy is an idealization, modest deviation affects this conclusion only slightly.

  2. Case docket studies of federal courts' sentencing in the 1960s and early 1970s have demonstrated strong effects of plea on sentence, even after controlling for many variables not available to me—charge, type of trial, legal representation, defendant's prior record, age, and race (Tiffany et al., 1975; Cook, 1973). While not from my period, these studies do cover a period of known implicit plea bargaining.

  3. Surveys of judges in the late 1930s (U.S. Department of Justice, Office of the Attorney General, 1939) and in the 1950s (Yale Law Journal, 1956) revealed that federal judges admitted and defended their differential sentencing of defendants by plea, largely on the grounds (a) that guilty pleas are evidence of remorse and/or (b) that the state is saved time and cost.

  4. The only charge disaggregation of the ALI data we have is for liquor cases in Connecticut (Wickersham Commission, 1931a: 24). The number of trials is far too small to be definitive, but these data do not show that liquor trials involved more severe charges than guilty pleas.
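The first reason above can be illustrated with a short derivation. This is a sketch under added assumptions, not the author's own displayed expression: m_1 and m_2 are taken to be mean trial sentences for the two charges, the discount d is taken to be the ratio of guilty plea sentence to trial sentence, and the charge share p is assumed to be the same among guilty pleas and trials within a district.

```latex
\begin{align*}
\bar{s}_{\text{trial}} &= p\,m_1 + (1-p)\,m_2 \\
\bar{s}_{\text{plea}}  &= p\,d\,m_1 + (1-p)\,d\,m_2 = d\,\bar{s}_{\text{trial}} \\
\frac{\bar{s}_{\text{plea}}}{\bar{s}_{\text{trial}}} &= d
\end{align*}
```

Under these assumptions the aggregate ratio recovers d whatever the values of p, m_1, and m_2, which is why cross-district differences in charge mix or sentencing severity would not distort the estimated discount.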

18 Indicators of professionalism and politicization are based on data about judges' career histories. For all district judges listed in the Federal Reporter from 1908 to 1932, I consulted the two legally oriented Who's Whos published in the era: Who's Who in Jurisprudence, 1925 and Who's Who in Law, 1937. Between them I located information on 207 of the 302 judges.

This was done at the suggestion of Albert Alschuler. Chris Ansell provided able research assistance in this task.

19 The Carnegie Foundation's operational definition of elite law schools was “Group 1. Full-Time Schools requiring, after the High School, a Minimum of More than Five Academic Years” (Reed, 1928: 169). These Group 1 law schools, rank-ordered by the number of judges in my data set attending them, were as follows: Harvard (16), Michigan (16), Columbia (10), Yale (4), Pennsylvania (4), Wisconsin (2), Chicago (1), Northwestern (1), California at Berkeley (1), Cornell (1), Stanford (0), Pittsburgh (0), William and Mary (0), and Western Reserve (0).

20 The following scale, applied to each judge and then averaged over judge-years by district or by nation, was used here: received LL.B. or J.D. from elite law school (1.00); attended elite law school but did not receive degree (0.75); received LL.B. or J.D. from non-elite law school (0.50); attended non-elite law school but did not receive degree (0.25); did not attend law school (0.00).
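The scale and its averaging over judge-years can be sketched as follows. The category labels, the function name, and the example district are hypothetical; only the five score values come from the scale above.

```python
# Score values from the scale in this note; the dictionary keys are hypothetical labels.
EDUCATION_SCALE = {
    "elite_degree": 1.00,        # LL.B. or J.D. from an elite law school
    "elite_attended": 0.75,      # attended an elite law school, no degree
    "non_elite_degree": 0.50,    # LL.B. or J.D. from a non-elite law school
    "non_elite_attended": 0.25,  # attended a non-elite law school, no degree
    "no_law_school": 0.00,       # did not attend law school
}


def mean_education_score(judge_years: list[str]) -> float:
    """Average the scale over judge-years, one category label per judge per year served."""
    if not judge_years:
        raise ValueError("at least one judge-year is required")
    return sum(EDUCATION_SCALE[label] for label in judge_years) / len(judge_years)


# Hypothetical district: one judge with an elite degree serving three years,
# one judge without law school serving one year.
district = ["elite_degree"] * 3 + ["no_law_school"]
print(mean_education_score(district))  # 0.75
```

Averaging over judge-years rather than over judges weights each judge by length of service, so a long-serving judge moves the district score more than a brief appointee.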

21 Background in state parties, however, was not recorded as reliably in Who's Who as was background in legislatures. (I infer this from listings for judges in both sources.) Both measures are conservative in the sense that I did not include prosecutorial background in either of them. I had no way of ascertaining when prosecutor positions were highly politicized and when not.

22 In particular, I coded the votes of my twelve state delegations in the House of Representatives on (a) the December 17, 1917, vote to send the Eighteenth Amendment to the states for ratification, which passed 282 dry to 128 wet, and (b) the March 14, 1932, preelection vote to extract a nullification bill from committee, which failed 227 dry to 187 wet. For my final measure, I averaged these two votes, taken at the beginning and end of Prohibition.

I chose congressional votes over public referenda as an indicator in order to hold the question constant. I chose congressional votes over state legislature ratification votes because only finally successful votes were available (U.S. Department of the Treasury, Bureau of Industrial Alcohol, 1924–33). This source provides only a truncated sample of states, with votes that span numerous years. Scientific public opinion polls do not exist for this period.

23 Technically, of course, these are ecological regressions, and hence this leap to individual inference may be challenged. However, in this case the number of judges per court is often small: for the cross-sectional period of fiscal years 1928–30, four of the thirteen courts had only one judge, and three of the thirteen had only two. The maximum number, for the Southern District of New York, was six. The ALI study did not break down sentencing by judge.

24 Proxy variables are not intended as unbiased estimators of the unobserved variable of interest; rather they are observed variables expected to co-vary strongly with the unobserved variable. According to Figs. 6a and 6b, observed trial acquittal rates could also serve for this purpose, but the number of observed trials in some districts becomes too small once we disaggregate to the district level.

25 One accounting problem was that the Bureau reported Volstead Act liquor arrests, whereas at the district level of aggregation after 1922, the Justice Department reported “Public Health and Safety” cases commenced. This category can be disaggregated with national data only. Another, more intractable accounting problem was that, until the Bureau of Prohibition was transferred from Treasury to the Justice Department in fiscal year 1931, it reported arrests in terms of individuals arrested, rather than in terms of cases in which arrests were made.

Perhaps the most bedeviling barrier to cross-sectional comparisons, however, was alluded to in the Kansas vignette. For the most part federal arrests were prosecuted in federal courts, but sometimes federal arrests were prosecuted in state courts under local prohibition statutes, and sometimes state arrests were handed over to federal authorities. I had no way of controlling for these jurisdictional crossovers in order to estimate true commissioner dismissal rates. These problems with federal court statistics were well known at the time (Wickersham Commission, 1931c).

26 The greatest potential problem with the national data was temporal shifts in jurisdictional crossovers. Federal officials at the time were constantly trying to get recalcitrant states to absorb more of the Prohibition burden.

A systematic national trend by state authorities into or out of federal courts would probably be reflected in cooperation patterns at the enforcement level as well. The latter statistics were reported in comparable units by the Bureau of Prohibition from 1925 to 1930 (U.S. Department of the Treasury, Bureau of Prohibition, 1924–33):

27 Once we conclude that police work improved, however, it is not at all obvious about whom we are talking. As was described by numerous contemporary authors (e.g., Millspaugh, 1937), federal police work was organizationally very complex because there were so many special-purpose investigatory agencies. The Bureau of Prohibition took care of Volstead Act cases, and the Bureau of Narcotics (later folded into the Prohibition Bureau) took care of its namesake. The Alcohol Tax Unit (the “revenooers”) of the Internal Revenue Service took care of traditional liquor cases. Postal inspectors dealt with violations involving the mails. The Bureau of Customs and the Coast Guard cooperated on smuggling cases. The Labor Department's Immigration and Naturalization Service took care of the smuggling of people. The Secret Service division of the Treasury occupied itself primarily with counterfeiting (rather than the protection of presidents). And the Departments of Interior and Agriculture had their own investigatory units in order to cope with land fraud cases and with food & drug and meat inspection cases, respectively. Even the FBI, currently our only general-purpose federal police force, was at this time only just beginning. It occupied itself (to its virtual disgrace) with prostitution, selective service, and espionage cases. No central compilation of statistics exists about the arresting behavior of all these various police.

This special-purpose profusion of federal enforcement agencies during this period, however, can be used to our advantage in sorting through police effects. For when we disaggregate national court data by legal offense, we are disaggregating by police agency as well. I have plotted national time-series data on guilty plea, trial acquittal, and dismissal rates (available on request), disaggregated by legal offense cum police agency. Once again, I used dismissal and acquittal rates as proxy variables for average strength of case.

The time-series plots reveal that, while the Immigration Service and the Interior Department showed dramatic improvement over time, the more typical pattern was slow incremental improvement. The Customs Bureau, the Post Office, the ICC, the FBI (after 1920), the IRS, and the Bureau of Prohibition all showed gradual, steady declines in acquittal rates, dismissal rates, or usually both. The IRS was a somewhat special case, because superimposed onto its long-term evidentiary improvement was a dramatic short-term reaction to Prohibition—massive dismissal of older liquor (and related) case backlog, and a fairly rapid modification of its guilty plea rates to equal that of the newer Bureau of Prohibition. Furthermore, while I do not present the details here, bivariate negative relationships between dismissal and guilty plea rates were statistically significant (at .05) for Prohibition, other liquor, customs, post office, banking, interstate commerce, land, immigration, and white slavery cases. In other words, consequential improvement in police work was not concentrated in a few agencies. It was virtually across the board.

28 “[Through] acceptance of a plea of a lesser degree than that for which the defendant was indicted, those deserving of extreme punishment are permitted to escape with a suspended sentence or with punishment all too inadequate for the crime committed. We deplore the tendency of some district attorneys, following the course of least resistance, thus to relax the rigid enforcement of our penal statutes.” (People v. Gowasky, 219 N.Y. A.D. 19 (1926)).

29 Established by Congress in 1922 at the initiative of Chief Justice Taft, the Judicial Conference was an administrative board composed of the Chief Justice from the Supreme Court and of the senior judges from the nine circuit courts. Its centralizing purpose was to provide a clearinghouse for legislative proposals to Congress, and to oversee the administration of the federal courts.

30 The 1934 ALI study relied upon so heavily in this article is typical. The research was too good not to note in passing the existence of sentence discounts, but no critical or causal conclusions were drawn from this fact. As is also apparent in Attorney General Cummings's remarks, cited at the outset, implicit plea bargaining was accepted as normatively legitimate.

31 This comparative statement is based both on this study and on the derivations in Padgett (1985).