Understanding your research goals
In computational text analysis, text is your data, and the text corpus is your dataset. The most important part of any data analysis is knowing the data you are working with: the context in which it was collected, its strengths, its limitations, why it has value and how you relate to it.
Maintain a healthy amount of skepticism as you move through the iterative process of crafting corpora and applying different text analysis methods. The analysis will always be influenced by which texts you choose to include (or exclude) and by the perspectives that you and your team bring to the research process.
As you begin to build your corpus, consider:
- What is the main goal of your research? What texts do you anticipate needing for this project?
- What kinds of patterns are you interested in exploring, and why?
- Whose perspectives are incorporated into this text corpus? What historical and social contexts informed the creation of the texts? How might this impact the analysis?
- Positionality: How do you (or the research team) relate to the concepts reflected in these texts?
- What assumptions do you have about the texts and the computational methodologies you’d like to use for analysis?
Further reading
- “A Dataset is a Worldview,” by Hannah Davis (2020): This brief article outlines helpful ideas for critically framing data as inherently subjective.
- “Text Analysis: A Walking Tour of What People are Using in Digital Humanities Right Now,” by Miriam Posner: Posner explores commonly used text analysis methodologies and cautions that the nuanced relationships between words may not always be apparent with distant reading approaches.
The Digital Humanities Coursebook by Johanna Drucker
ISBN: 9781003106531
Publication Date: 2021-03-24
Chapter 7: “Data Mining and Analysis” provides an overview of key concepts and histories in computational text analysis methodologies. It introduces critical approaches to thinking about social implications of data. It also features several exercises for text analysts of all skill levels.
The Digital Black Atlantic by Roopika Risam (Editor); Kelly Baker Josephs (Editor)
ISBN: 9781452965307
Publication Date: 2021-03-16
Chapter 7: “Text Analysis for Thought in the Black Atlantic” assesses the limitations of text analysis methodologies as well as assumptions in understanding the meaning of words, centering perspectives from digital African diaspora studies.