Policing by data.

Are there more traffic stops in Hartford or Philadelphia?

This question seems easy to answer. We can determine the number of stops in each city between April 1, 2014 and September 29, 2016.

There were a lot more stops in Philadelphia (678,445) than in Hartford (9630)! In fact, there were

678445/9630 ~ 70

times as many traffic stops in Philadelphia as in Hartford. Does this mean people in Philadelphia are worse drivers or commit more crime? Not necessarily: there are many more people in Philadelphia, so it makes sense that there are more traffic stops. Using data from the U.S. Census Bureau, we see that Philadelphia had approximately 1,555,000 people in 2015 and Hartford had approximately 125,000. Therefore, there were

1555000/125000 ~ 12.4

times as many people in Philadelphia as in Hartford.

Well, that's interesting. Philadelphia had about 12 times the population of Hartford, but about 70 times the number of traffic stops. These data show us that there were many more traffic stops in Philadelphia than in Hartford in this time frame, even when we control for population. Unfortunately, the data cannot tell us why this is the case. Further research, with the help of experts in policing, history, crime, and a slew of other subjects, may help us.
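
The comparison above can be checked with a few lines of Python. This is a minimal sketch that uses only the stop counts and approximate 2015 populations quoted in the text; the variable names and the per-1,000-residents scaling are illustrative choices, not part of the original analysis.

```python
# Stop counts (April 1, 2014 to September 29, 2016) and approximate 2015
# populations, as quoted in the text.
stops = {"Philadelphia": 678_445, "Hartford": 9_630}
population = {"Philadelphia": 1_555_000, "Hartford": 125_000}

stop_ratio = stops["Philadelphia"] / stops["Hartford"]
population_ratio = population["Philadelphia"] / population["Hartford"]

# Stops per 1,000 residents over the whole period (illustrative scaling).
stops_per_1000 = {city: 1000 * stops[city] / population[city] for city in stops}

print(f"stop ratio: {stop_ratio:.1f}")
print(f"population ratio: {population_ratio:.1f}")
for city, rate in stops_per_1000.items():
    print(f"{city}: {rate:.0f} stops per 1,000 residents")
```

Dividing the stop ratio by the population ratio suggests that Philadelphia had between five and six times as many stops per resident as Hartford over this period.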

Are Black drivers stopped more often than drivers of other races?

Now let's look at the racial makeup of the stopped drivers in each city. The data here isn't perfect; there are only six possible options for race: asian/pacific islander, black, hispanic, white, other, and unknown. Many, many people do not fit neatly into one of these categories, but we do what we can with the data we have.

We see that 3589/9630 ~ 37.3% and 435548/678445 ~ 64.2%

of the drivers stopped were Black in Hartford and Philadelphia, respectively. We now compare these percentages to 2015 population data from the U.S. Census Bureau. For Hartford, 37.3% of the drivers stopped were Black and 38% of the population was Black. These numbers seem to indicate that Black drivers were not stopped more frequently than drivers of other races. In Philadelphia, however, 64.2% of the drivers stopped were Black while only 42.4% of the population was Black. It seems as though Black drivers were disproportionately stopped in Philadelphia during this time.
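
One way to put the two cities on the same scale is to divide the Black share of stops by the Black share of the population. The sketch below does this with the figures quoted above; the helper name `representation_ratio` is hypothetical, and the ratio itself is an illustrative summary rather than a measure used in the original analysis.

```python
# Counts of stopped drivers and 2015 Census population shares quoted in the text.
black_stops = {"Hartford": 3_589, "Philadelphia": 435_548}
total_stops = {"Hartford": 9_630, "Philadelphia": 678_445}
black_population_share = {"Hartford": 0.38, "Philadelphia": 0.424}


def representation_ratio(city):
    """Black share of stops divided by Black share of the population.

    A value near 1 means stops roughly track the population share;
    a value well above 1 means Black drivers are overrepresented among stops.
    """
    stop_share = black_stops[city] / total_stops[city]
    return stop_share / black_population_share[city]


for city in total_stops:
    stop_share = black_stops[city] / total_stops[city]
    print(f"{city}: {stop_share:.1%} of stops, ratio {representation_ratio(city):.2f}")
```

With these numbers the ratio is just under 1 for Hartford and about 1.5 for Philadelphia, which matches the reading given above.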

Are stopped drivers searched more often if they are Black?

Once a driver is stopped, officers may or may not search them or their vehicle.

In Hartford, we already saw that there were 3589 stops of Black drivers. In our new table, we see that 925 of these drivers were searched in some way. The remaining 2664 were not searched. For white drivers, 835 were searched out of a total of 3486 who were stopped. Therefore,

925/3589 ~ 25.8%

of stopped Black drivers were searched while

835/3486 ~ 24.0%

of stopped white drivers were searched. Stopped Black drivers seem to be searched at a slightly higher rate than stopped white drivers. But perhaps the Black drivers have contraband at a slightly higher rate as well?

Notice that contraband was found for 8 of the 925 searched or frisked Black drivers, or 8/925 ~ 0.9%, and for 10 of the 835 searched or frisked white drivers, or 10/835 ~ 1.2%. We see that the proportion of white drivers who were searched or frisked and who possessed contraband is slightly higher than that for Black drivers. It is therefore possible that officers are searching Black drivers on less evidence than they are white drivers.
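
The two rates discussed here, the share of stopped drivers who were searched and the share of searched drivers found with contraband, can be computed side by side. This is a minimal sketch using the Hartford counts from the text; the names `search_rate` and `hit_rate` are hypothetical labels chosen for this example.

```python
# Hartford counts quoted in the text.
hartford = {
    "Black": {"stopped": 3_589, "searched": 925, "contraband": 8},
    "white": {"stopped": 3_486, "searched": 835, "contraband": 10},
}


def search_rate(group):
    """Fraction of stopped drivers in the group who were searched or frisked."""
    counts = hartford[group]
    return counts["searched"] / counts["stopped"]


def hit_rate(group):
    """Fraction of searched or frisked drivers in the group found with contraband."""
    counts = hartford[group]
    return counts["contraband"] / counts["searched"]


for group in hartford:
    print(f"{group}: search rate {search_rate(group):.1%}, hit rate {hit_rate(group):.1%}")
```

A higher search rate paired with a lower hit rate for one group is the pattern that the text reads as a possible sign of searches made on less evidence.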

In summary, in Hartford between April 2014 and September 2016, it seems as though Black drivers were not stopped disproportionately. Stopped Black drivers were searched or frisked at a slightly higher rate than stopped white drivers, and contraband was found on a slightly higher percentage of searched or frisked white drivers than searched or frisked Black drivers. While we cannot make any definitive conclusions based on our analysis, it seems as though there was a small anti-Black bias in vehicular stops and searches, but there did not seem to be large-scale, widespread racial disparities.

Applying Bayes' Theorem to policing: practice problem

Bayes' theorem also gives us a tidy way to analyze probabilities in police interactions. Consider the following scenario: the residents of Fourtown have recently been complaining of age discrimination. They would like to calculate the probability of a motorist being searched, given that the motorist is over 65.

We know that of the 100 police stops in the last year, 50 ended in searches. Of those 100 stops, 80 were over the age of 65. Thirty of those motorists over 65 years of age were searched. What is the probability of a Fourtown motorist being searched, given that they are over the age of 65?

We see that P(searched|65+) = 0.375. Make a tree diagram, and see if you can get this same result. From the story, what is P(searched|65-)? What did the residents mean by "age discrimination"? And what is P(65+|searched)?
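
As a check on the 0.375 figure only, the sketch below computes P(searched|65+) two ways from the Fourtown counts: through Bayes' theorem and by direct counting. The variable names are illustrative, and the remaining questions are left for the tree diagram.

```python
import math

# Fourtown counts from the scenario.
total_stops = 100
searched = 50
over_65 = 80
over_65_and_searched = 30

p_searched = searched / total_stops                          # P(searched)
p_over_65 = over_65 / total_stops                            # P(65+)
p_over_65_given_searched = over_65_and_searched / searched   # P(65+|searched)

# Bayes' theorem: P(searched|65+) = P(searched) P(65+|searched) / P(65+)
via_bayes = p_searched * p_over_65_given_searched / p_over_65

# Direct counting: searched motorists over 65 divided by all motorists over 65.
via_counts = over_65_and_searched / over_65

print(f"Bayes: {via_bayes:.3f}, direct count: {via_counts:.3f}")
print("agree:", math.isclose(via_bayes, via_counts))
```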

Real Problem (for PSE)

How much more likely are Black motorists to get searched after they are stopped compared to white motorists? We will need two different calculations: 1) Probability of being searched, given that the stopped motorist is Black and 2) Probability of being searched given that the motorist is white. Let's start with calculation 1) and use Bayes' theorem.

P(searched|motoristBlack) = P(searched)P(motoristBlack|searched) / P(motoristBlack)

We are going to use data from Nashville, TN because it has the most columns, or types of data that were collected.

Number of motorists stopped: 3092312
Number searched: 127705
Black and searched: 67985
Black motorists stopped: 1165871
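
Calculation 1 can be sketched in Python by plugging the Nashville counts into Bayes' theorem. The variable names are illustrative, and every probability here is measured among stopped motorists only, which is an assumption built into how the counts were collected.

```python
import math

# Nashville counts quoted above.
stopped = 3_092_312
searched = 127_705
black_and_searched = 67_985
black_stopped = 1_165_871

p_searched = searched / stopped                          # P(searched)
p_black_given_searched = black_and_searched / searched   # P(motoristBlack|searched)
p_black = black_stopped / stopped                        # P(motoristBlack), among stopped motorists

# Bayes' theorem: P(searched|motoristBlack)
p_searched_given_black = p_searched * p_black_given_searched / p_black

print(f"P(searched) = {p_searched:.4f}")
print(f"P(searched|motoristBlack) = {p_searched_given_black:.4f}")

# Sanity check: the same value follows from direct counting.
assert math.isclose(p_searched_given_black, black_and_searched / black_stopped)
```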

This will give us a basis for comparison. Now find:
P(searched|motoristWhite) = P(searched)P(motoristWhite|searched) / P(motoristWhite)

You already know P(searched).

White and searched: 47826
White motorists stopped: 1670873

Now, answer this question: How much more likely are Black motorists to be searched compared to white motorists?

Follow-up (for PSE)

But this severely underestimates the problem of racial profiling. How? Nashville is 63.49% white and 27.58% Black.

P(blackMotorist) = P(blackMotorist|pulledOver)P(pulledOver) + P(blackMotorist|notPulledOver)P(notPulledOver)

Note that the probability of getting pulled over and not getting pulled over might be difficult to estimate, and might require more data. The probability of being a Black motorist should be roughly the same as the Black percentage of the population, although you could probably problematize that assumption if car ownership rates are not equal amongst different groups. The probability of being a Black motorist given being pulled over can be calculated from existing data. This will give you pretty much everything else you need to solve for the rest of the missing pieces, and ultimately make a better calculation of P(searched|motoristBlack). This is the power of Bayesian inference and probability theory: you can continually interrogate your assumptions and add more nuance to your estimates.
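
To show how the missing pieces fit together, the sketch below rearranges the law of total probability to solve for P(blackMotorist|notPulledOver). The value used for P(pulledOver) is a purely hypothetical placeholder, since the text does not supply one, and the helper name `black_share_not_pulled_over` is an assumption made for this example.

```python
import math


def black_share_not_pulled_over(p_black, p_black_given_pulled, p_pulled):
    """Solve the law of total probability for P(blackMotorist|notPulledOver)."""
    if not 0 <= p_pulled < 1:
        raise ValueError("p_pulled must be in [0, 1)")
    return (p_black - p_black_given_pulled * p_pulled) / (1 - p_pulled)


p_black = 0.2758                              # Black share of Nashville's population (from the text)
p_black_given_pulled = 1_165_871 / 3_092_312  # from the Nashville stop counts
p_pulled = 0.10                               # HYPOTHETICAL placeholder, not a measured value

p_black_given_not_pulled = black_share_not_pulled_over(
    p_black, p_black_given_pulled, p_pulled
)

# Bayes' theorem: P(pulledOver|blackMotorist)
p_pulled_given_black = p_black_given_pulled * p_pulled / p_black

print(f"P(blackMotorist|notPulledOver) = {p_black_given_not_pulled:.4f}")
print(f"P(pulledOver|blackMotorist) = {p_pulled_given_black:.4f}")
```

Whatever value is plugged in for P(pulledOver), the ratio P(pulledOver|blackMotorist) / P(pulledOver) equals P(blackMotorist|pulledOver) / P(blackMotorist), so that ratio (about 1.37 with these figures) does not depend on the placeholder.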

If you had access to all the data that you wanted, what other information would you want in order to make a more informed analysis? This is, curiously, a short essay question. Maybe a few sentences.