This article was previously published by Mosaic Institute and is reprinted with permission.
Much of the work we do in relation to social justice and dismantling prejudice involves doing research and collecting data on the state of social issues and those who are most affected by them. Through the use of surveys, interviews, focus groups, and other quantitative and qualitative methodologies, we are able to develop an extensive cache of data that can be used to draw conclusions and make recommendations for policy development. At face value, this data is an incredibly valuable source of information that can make social justice work much easier. But while collecting data is an important part of this type of work, there are also drawbacks that must be considered.
The life events and experiences that make up an individual’s identity are remarkably nuanced, making it difficult to summarize them within a data set. As a result, the findings that we, as researchers, develop can sometimes be skewed or inaccurate. This can have harmful impacts on our perceptions of different communities and on the development of policies over time.
In what ways does poor data impact communities?
There are a variety of ways that poor-quality data can negatively impact communities. A major example of this is how groups are represented, or not represented, in research.
For example, many studies either do not distinguish between different Indigenous communities or categorize them in different ways. This can be significant if research participants feel that they have to misrepresent themselves in order to be involved in the project. Choosing a category that they do not really identify with means their experiences may be attributed to other individuals and groups or that certain elements of their life course may not be factored into the conclusions made by researchers.
Alternatively, if the participant chooses not to identify themselves with an improper category, their perspective may not be represented at all. In this situation, the participant is forced to choose between two ways of representing themselves, neither of which is accurate. The impact this has on the representativity of data can be compounded if different studies categorize individuals in different ways, meaning the data sets are talking about entirely different groups of people. This makes it difficult to triangulate findings or compile multiple data sets in order to draw conclusions.
Research and the data sets that researchers develop can also be harmful in how they represent communities when they do not distinguish between different groups at all and instead lump the experiences of many different individuals together.
Going back to the example discussed previously, there have been many cases where data categorizes an individual as Indigenous but does not allow them to specify which group they identify with. In this case, the unique experiences of different communities, tribes, and locations are not recognized. This issue is also related to the importance of conducting research through an intersectional lens, something that is difficult to do when looking at numbers, survey data, or a spreadsheet, which is often the form that the data we work with comes in. The life experience of an individual is nuanced and complex in a way that cannot easily be generalized into data. For example, racialized women often face barriers due to their identity as racialized people as well as their identity as women, but this is difficult to depict within a data set.
Solving the problem
The best way to solve the problem of non-representative or skewed data is to develop better data when conducting research. This could come in the form of expanding identifying categories within surveys, obtaining feedback from community groups and leaders about whether or not they feel accurately represented within the research, and allowing for open-ended responses so participants can express their opinions and describe experiences that researchers may not have been aware of.
Researchers should also keep an open mind about the representativity of their research and be willing to adapt surveys, studies, and other projects if problems arise. Building better research and data sets is an important part of improving the quality of data overall, but it can be difficult. This is especially true when much of the work we do in the social justice space involves using data that has already been collected in order to draw conclusions and make policy recommendations. We are often limited in our capacity to conduct our own research in terms of both time and resources, and the urgency of the work we do means that using secondary data is often the best option. So we should also ask ourselves: how can we address the limitations of the data we are using while we work?
Acknowledging the limitations of the data we’re using is an integral part of improving the quality of the conclusions we can draw about communities and the recommendations we are able to make regarding policy and program development. In fact, concluding that the data is inherently flawed and non-representative of community populations can be a core finding of the research we do. By doing this, we can begin to advocate for different types of knowledge, such as lived experience narratives, in order to develop policies, programs, and supports that will have the most meaningful impacts on communities.
Reference materials and further reading:
- Bowleg, L. (2020). We’re Not All in This Together: On COVID-19, Intersectionality, and Structural Inequality. American Journal of Public Health, 110(7), 917. Retrieved from https://ajph.aphapublications.org/doi/pdf/10.2105/AJPH.2020.305766
- Datta, G., Siddiqi, A., & Lofters, A. (2021). Transforming race-based health research in Canada. Canadian Medical Association Journal, 193(3), 99-100.
- Fremantle, E., Zurynski, Y. A., Mahajan, D., D’Antoine, H., & Elliott, E. J. (2008). Indigenous child health: urgent need for improved data to underpin better health outcomes. Medical Journal of Australia, 188(10), 588-591.