The BPD Field Interrogation and Observation dataset was collected by the Boston Police Department and can be found on Analyze Boston. The data set covers field interrogation stop contacts made during 2025 only, across all of Boston and its many neighborhoods.
This data set was created as a way for the police to be transparent with the public, who are then able to act on it. It also serves as a way to hold the police accountable by making this information available to its intended audience: the citizens of Boston and anyone else who would like to view the data.
At face value, it answers the question of which areas are experiencing the most stops, or which populations are facing higher field interrogation rates than others.
However, there are also many flaws in this data set, especially in what it may appear to show the general audience who will encounter it. For example, the way it organizes each area is not consistent. Rather than including a separate category to specify the neighborhood, it relies on the ‘city’ field, which is inconsistent and names multiple neighborhoods within the same zip code. In addition, it does not provide the ratio of officers to the population of each area. Being able to see which areas may or may not be overpoliced is important, as that information would help explain the higher numbers of police stops in certain neighborhoods.
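One rough way to work around the inconsistent ‘city’ field is sketched below. It is only an illustration: the field names `city` and `zip` and the sample rows are assumptions rather than the actual schema or records of the data set, and assigning each zip code its most common label is a crude simplification, since a zip code can genuinely span more than one neighborhood.

```python
from collections import Counter, defaultdict

# Hypothetical sample rows; the field names "city" and "zip" are assumptions.
rows = [
    {"city": "Dorchester", "zip": "02124"},
    {"city": "DORCHESTER ", "zip": "02124"},
    {"city": "Boston", "zip": "02124"},
    {"city": "Mattapan", "zip": "02126"},
]

def clean(label):
    """Collapse extra whitespace and standardize capitalization."""
    return " ".join(label.split()).title()

def dominant_label_by_zip(rows):
    """Map each zip code to the cleaned label that appears most often for it."""
    labels = defaultdict(Counter)
    for row in rows:
        labels[row["zip"]][clean(row["city"])] += 1
    return {z: counts.most_common(1)[0][0] for z, counts in labels.items()}

zip_to_neighborhood = dominant_label_by_zip(rows)

# Relabel every stop with the dominant label for its zip code.
normalized = [zip_to_neighborhood[row["zip"]] for row in rows]
print(normalized)
```

A published lookup table of neighborhood boundaries would be a better basis than a majority vote, if one is available.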
Overall, there are far more stops in neighborhoods such as Dorchester, Roxbury, and Mattapan than in the others. The data set, while describing the specifics of each stop, leaves out information on the demographics of the person who is stopped, which is crucial.
Even so, the locations where stops are concentrated show that the majority of stops are in Boston’s majority Black and Brown neighborhoods, with the largest share in Dorchester, where over 1200 stops were made. Predominantly white neighborhoods, by contrast, fall below 75.
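Per-neighborhood totals like these can be produced with a simple tally. The sketch below assumes each stop is a record with a `city` field (an assumed name) and uses a few made-up rows purely to show the mechanics; it does not reproduce the real counts.

```python
from collections import Counter

def stops_per_neighborhood(rows, field="city"):
    """Count how many stop records fall under each neighborhood label."""
    return Counter(row[field] for row in rows)

# Made-up rows for illustration only, not real records from the data set.
sample = [
    {"city": "Dorchester"},
    {"city": "Dorchester"},
    {"city": "Roxbury"},
]

counts = stops_per_neighborhood(sample)

# List neighborhoods from most stops to fewest.
for name, n in counts.most_common():
    print(name, n)
```

Raw counts like these do not account for population or for how many officers are assigned to each area, so they show where stops happen, not why.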
While this data documents everything about the stop itself, including the officer and even details about the approach, it is missing important demographic information. In addition, it does not include the outcomes of the stops: whether they were justified or whether they led to an arrest. This data set is meant to provide transparency, but the only way to judge how accurately police are doing their jobs is to see what share of their stops were justified or ended in a rightful arrest. This data set lacks that part completely.
Because the data set comes from the officers themselves, there is really no way to measure from it how accurate everything is, and no way to know what information could have gone undocumented. This creates a number of barriers for the communities that are affected, because of what this data set, at face value, may say to the general audience about which neighborhoods a person may consider to be safe based on the number of stops.
While this can factor into what the general audience would consider to be a ‘safe’ neighborhood, the fact that the data set is missing so much information and is messily organized means it can be very misleading. Leaving out information such as the amount of policing in each area and the result of these approaches, along with the unclear organization, can lead to misinformed conclusions.
If I were to continue this data set, I would ensure that there was a clearer distinction between neighborhoods, and I would include information about the result of each stop, such as whether or not it ended in arrest. That would allow the public to see the ratio of stops that led anywhere to stops where officers may have been improperly profiling individuals. Demographic information about each person stopped would also be needed in order to determine whether certain demographics are stopped more than others. Interestingly enough, the data set does include information about the specific officers, so looking for patterns for each officer, and how they correlate with specific neighborhoods and demographics, could also tell an interesting narrative.
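If an outcome field were added, the ratio described above could be computed along the following lines. This is a sketch under stated assumptions: the current data set has no such field, so `neighborhood` and `arrest` are hypothetical names, and the sample rows are invented only to show the arithmetic.

```python
from collections import defaultdict

def arrest_rate_by_neighborhood(rows):
    """Return the share of stops in each neighborhood that ended in arrest."""
    totals = defaultdict(int)
    arrests = defaultdict(int)
    for row in rows:
        totals[row["neighborhood"]] += 1
        if row["arrest"]:
            arrests[row["neighborhood"]] += 1
    return {name: arrests[name] / totals[name] for name in totals}

# Invented rows; "neighborhood" and "arrest" are hypothetical fields
# that the current data set does not provide.
sample = [
    {"neighborhood": "A", "arrest": True},
    {"neighborhood": "A", "arrest": False},
    {"neighborhood": "A", "arrest": False},
    {"neighborhood": "A", "arrest": False},
    {"neighborhood": "B", "arrest": True},
    {"neighborhood": "B", "arrest": True},
]

rates = arrest_rate_by_neighborhood(sample)
for name, rate in sorted(rates.items()):
    print(f"{name}: {rate:.0%} of stops ended in arrest")
```

A low rate in a heavily stopped neighborhood would not prove improper profiling on its own, but it would point to where closer scrutiny is needed.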