Why a campaign for improving the quality and use of policing data?
The contributions that operational policing data has made to the field of criminology, and to the development of innovative policing research specifically, are innumerable. Police data has been used to evaluate the effectiveness of hot spot policing, problem-oriented policing strategies, foot patrols and focused deterrence strategies, as well as to identify spatial dimensions of criminal offending, the size and scope of criminal networks, and characteristics of traffic collisions and fatalities. Access to quality data from police agencies can increase the overall volume of empirical research on policing and community safety issues, providing police leaders with an improved evidence base upon which to draw when making important operational and policy decisions.
Unfortunately, my own experiences in research and program evaluation with a number of police services across Canada have demonstrated that there can be significant problems in how police data is collected, verified, used and shared. For example, an attempt to analyze Canadian missing persons data in the national registry was confounded by the fact that there is no reporting requirement for solved cases. In another study, a plan to use police-generated statistics on ‘time spent’ on a policing activity had to be abandoned when it became clear there were no standards in place for how ‘time spent’ was defined, rendering the statistics produced meaningless. Crime analysts from services across Canada have shared similar stories involving coding errors, numbers inflated by the misidentification of activities, and flaws in the data verification process that allow errors to pass through the system.
Although reliability issues and other problems with policing data have not generated significant attention, they have not entirely escaped notice. Consider a few examples from the research literature. A study on traffic accidents by Taylor and Malik relied on police accident data; in attempting to explore the relationship between highway geometry, vehicle characteristics and accidents, these scholars observed that over 25% of the VIN codes found in police data did not decode. A recent study examined US Bureau of Justice Statistics on police use of force, which are drawn from police data sources. This analysis revealed that “the BJS data suffer from serious measurement flaws, do not provide a valid and reliable basis for comparative statistical reporting and research purposes” (Hickman and Poore 2016). In a study using UK police-recorded data on domestic violence calls, the author concluded that the problems encountered with this data included quality issues linked to a lack of “consistency and completeness” in how data was entered (Brimicombe 2016). Similar problems have been observed in the Canadian context by other researchers. A study of officer coding of offenses for the Canadian Uniform Crime Reporting (UCR) system – the basis of the Canadian criminal justice system’s annual statistics – found that officers in several Canadian jurisdictions viewed standardized UCR coding as “a somewhat subjective process” (McCormick, Haarhoff, Cohen, Plecas and Burk 2012). The researchers also observed that coding errors were not uncommon and identified several factors behind inaccurate coding, including a belief among officers that any errors would be caught by quality control checks. They further found that acceptance of inaccurate work was due in part to “the attitude of general duty and supervisors towards the importance of file scoring. At both the general duty and supervisor levels, accuracy of file scoring was not reported to be a priority”.
Armed with knowledge of the problems associated with a lack of best practices in data collection, retention, sharing and use, the Canadian Society of Evidence Based Policing launched the Good Data Initiative (GDI) with two overarching goals:
1. to generate empirical knowledge of current police data practices and research into best practices; and
2. to promote quality data collection, verification, analysis and sharing among police agencies.
One important consideration is: what does ‘good’ or ‘quality’ look like in this context? Mitar (2004) suggests that data management within the policing domain should be able to serve both operational and management needs, including crime analytics and auditing purposes. Practitioners of evidence-based policing, myself included, would add that we should also consider the importance of reliable policing data for research, particularly evaluation and/or intervention-based research intended to test the effectiveness and efficiency of policing activities. Thinking more broadly, we might then borrow from Condelli et al. (2002) and consider the following dimensions of ‘good data’:
1. Relevance – that data collected is relevant to its assigned uses;
2. Accuracy – that we can be confident that analyses produce reliable results because the data employed accurately captures a phenomenon and/or is coded/captured without bias or error;
3. Accessibility – that data can be easily obtained and used;
4. Interpretability – that there is sufficient information available about how the data was collected in order to properly interpret and use it;
5. Coherence – that there are standardized systems in place for classification and/or coding that allows data to be interpreted accurately over time.
And, given the importance of developing evidence-based approaches – that is, approaches that are tested and re-tested using comparable data – we might also add a sixth dimension:
6. Reproducibility – that data can be reproduced using the same or similar sources, which should, in theory, allow researchers to reach similar findings when the data are analyzed using the same or comparable techniques.
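To make the dimensions above concrete, here is a minimal, hypothetical sketch of how some of them might translate into automated quality checks on police records. The field names (`incident_code`, `time_spent_min`) and the code list are illustrative assumptions, not a real agency schema; a production audit would draw its valid codes and plausible ranges from the agency's own data dictionary.

```python
# Illustrative quality audit over a batch of (assumed) incident records.
VALID_INCIDENT_CODES = {"1110", "1430", "2120"}  # assumed UCR-style codes

def audit_records(records):
    """Count records failing each quality dimension."""
    issues = {"completeness": 0, "coherence": 0, "accuracy": 0}
    for rec in records:
        # Completeness/interpretability: expected fields must be present.
        if not all(k in rec for k in ("incident_code", "time_spent_min")):
            issues["completeness"] += 1
            continue
        # Coherence: codes must come from the standardized classification.
        if rec["incident_code"] not in VALID_INCIDENT_CODES:
            issues["coherence"] += 1
        # Accuracy: values must fall in a plausible range (e.g. 'time
        # spent' cannot be negative or exceed 24 hours).
        if not (0 <= rec["time_spent_min"] <= 24 * 60):
            issues["accuracy"] += 1
    return issues

sample = [
    {"incident_code": "1110", "time_spent_min": 45},  # clean
    {"incident_code": "9999", "time_spent_min": 30},  # unknown code
    {"incident_code": "1430"},                        # missing field
    {"incident_code": "2120", "time_spent_min": -5},  # implausible value
]
print(audit_records(sample))  # {'completeness': 1, 'coherence': 1, 'accuracy': 1}
```

Even a simple pass like this makes data problems visible before analysis begins, rather than after results have been produced from flawed inputs.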
In short, through the GDI, we want to encourage police agencies to produce and share ‘good data’ by embracing the dimensions listed above.