I have previously tested a few text-mining tools, one of which is the Google Books Ngram Viewer. This tool searches the text of books digitized by Google and tracks how frequently given terms appear over time (currently up to 2008). My research centers on the trauma of nurses during the First World War, so I searched three key terms from my research: “trauma”, “nurse”, and “World War I”. As seen below in the screenshot of the graph from the search, a number of things can be analyzed, such as the frequency of each term in a given year and where the terms overlap or fail to overlap. For example, it is interesting to see how “World War I” peaks following the Second World War in the late 1940s. What can be inferred from that peak? Did books published at the time seek to make connections between World War I and World War II? Another observation: “World War I” and “trauma” overlap in the years prior to the Second World War, specifically around 1937. Why is that? Does it relate to books published by people of the First World War period who lived through the trauma? Could it have to do with pensions? Or does it relate to research being done on trauma from the First World War?
This tool is a great starting point, but it is important to realize that it is just that: a starting point. The search I did is very bare-bones, and such a simple search has a number of issues. First, the publication dates searched stop at 2008. Second, all three key search terms have other forms. “World War I” is also “First World War,” “Great War,” and “WWI,” at the very least. Similarly, “trauma” is not limited to psychological trauma, and “psychological trauma” itself has multiple forms, such as “shell shock,” a key term associated with World War I. This points to another limitation: a single search term cannot show how the terminology itself changed over time. Of course, more than these three key terms can be searched, but that leaves it up to the user to determine which terms are most relevant to the search.
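One way to work around the problem of multiple forms is to group the variants of a concept and add their yearly frequencies together. The following is a minimal Python sketch of that idea, not a feature of the Ngram Viewer itself. It assumes the per-year frequencies have already been collected into a dictionary; the `VARIANTS` grouping, the `combine_variants` function, and the toy numbers are all hypothetical and only illustrate the shape of the calculation.

```python
from collections import defaultdict

# Hypothetical grouping: each concept maps to the forms it can take in print.
VARIANTS = {
    "World War I": ["World War I", "First World War", "Great War", "WWI"],
    "psychological trauma": ["psychological trauma", "shell shock"],
}


def combine_variants(yearly_freq, variants):
    """Sum the per-year frequencies of every variant of each concept.

    yearly_freq maps term -> {year: frequency}.
    Variants that are missing from yearly_freq are simply skipped.
    """
    combined = {}
    for concept, terms in variants.items():
        totals = defaultdict(float)
        for term in terms:
            for year, freq in yearly_freq.get(term, {}).items():
                totals[year] += freq
        combined[concept] = dict(sorted(totals.items()))
    return combined


# Placeholder numbers for illustration only; these are NOT real Ngram values.
toy = {
    "Great War": {1920: 3.0, 1950: 1.0},
    "World War I": {1950: 2.0},
    "shell shock": {1920: 0.5},
}

result = combine_variants(toy, VARIANTS)
print(result)
```

A combined line like this hides which variant was dominant in which year, so it is probably best read alongside the individual curves rather than instead of them. The grouping also remains a judgment call: “Great War,” for instance, was used for other conflicts before 1914.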
Despite the limitations of this very broad and general search, it is, again, a great place to start and a way for individuals to see initial patterns that can lead to further research and investigation. I believe that by working with tools such as the Google Books Ngram Viewer, I could use text-mining to benefit my own research. For example, looking at the different (and similar) terminology used to describe psychological trauma, it would be interesting to see how frequently each term was used in different periods, and which other key terms are most often associated with terms centering on psychological trauma. With the wide range of tools available now, and those that will become available over time, it is very possible to do this.
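Both questions can be sketched in a few lines of Python, assuming a corpus of dated texts is available as plain strings. Everything named here is hypothetical: the `TRAUMA_TERMS` list, the two functions, and the two-sentence toy corpus are placeholders, and "associated with" is simplified to "appears in the same sentence as."

```python
import re
from collections import Counter

# Hypothetical list of terms for psychological trauma.
TRAUMA_TERMS = ["shell shock", "war neurosis", "trauma"]


def term_counts_by_period(docs, terms, period=10):
    """Count each term per period (a decade by default).

    docs is a list of (year, text) pairs.
    Returns {period_start_year: Counter(term -> count)}.
    """
    out = {}
    for year, text in docs:
        start = year - year % period
        counter = out.setdefault(start, Counter())
        lowered = text.lower()
        for term in terms:
            pattern = r"\b" + re.escape(term) + r"\b"
            counter[term] += len(re.findall(pattern, lowered))
    return out


def cooccurring_words(docs, term, stopwords=frozenset()):
    """Count words that share a sentence with `term` (a lowercase phrase)."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b")
    skip = set(term.split()) | set(stopwords)
    counts = Counter()
    for _, text in docs:
        # Naive sentence split on ., ! and ?; good enough for a sketch.
        for sentence in re.split(r"[.!?]+", text.lower()):
            if pattern.search(sentence):
                words = re.findall(r"[a-z']+", sentence)
                counts.update(w for w in words if w not in skip)
    return counts


# Invented sentences for illustration only; not quotations from any source.
docs = [
    (1917, "The nurse treated shell shock. Shell shock was common."),
    (1938, "Trauma among nurses was studied."),
]

by_period = term_counts_by_period(docs, TRAUMA_TERMS)
near = cooccurring_words(docs, "shell shock", stopwords={"the", "was"})
print(by_period)
print(near.most_common(3))
```

Exact phrase matching like this misses plurals and spelling variants, so a real analysis would likely need stemming or a larger hand-built term list.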