That’s a primary source? Yes, of course!

The best way to represent primary sources is to create a tidy dataset from the document(s). Organizing the raw data this way makes it easier for software to read and convert, and standardizing how each variable is recorded makes analysis more practical. When working with handwritten, highly subjective primary sources, building a tidy dataset from the measurable information in the source helps filter out some, but not all, of the bias in the document.

Treating primary sources as data has several advantages. It lets the audience develop a deeper and more personal understanding of the event in question. It also allows for a less biased analysis: instead of layers of bias accumulated across multiple eras of interpretation, there is only one, that of the period the data comes from. Transcribing primary sources into a dataset also makes them accessible to a wider audience. Once an old handwritten document has been typed, readers who have trouble reading cursive can understand what the original source says. While primary sources may not paint a complete picture of their subject, they offer an inside look at the unfiltered reactions and opinions of the author. Referencing primary sources may also uncover quantifiable raw data that was filtered out by successive later analyses. So although primary sources do not always capture an entire event, they are valuable references that cover specific aspects of it thoroughly.

Wickham’s principles of tidy data are that each row holds only one observation, each column holds only one variable, and each cell holds only one value. This structure makes analysis both efficient and effective. In the tidy dataset I recently created, it was difficult to record the data in a way that met all of these requirements. While building it, I had to constantly reevaluate which pieces of information were variables and which were observations so that the result would fit the standard tidy format. However, once the dataset was finished, I could locate a specific value much faster than I could in the original document. This made analysis much quicker, because I could focus on comparing similar values across observations rather than spending time locating each individual piece of data.
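
As a rough illustration of these principles, here is a minimal Python sketch of the kind of reshaping involved. The ledger entries, the column names, and the `tidy` helper are all hypothetical; they are made up for this example and are not taken from the dataset I describe above.

```python
# Hypothetical transcription of a handwritten ledger: one row per person,
# with a separate column for each year (a "wide", untidy layout).
wide_rows = [
    {"name": "A. Smith", "1850": "12", "1851": "15"},
    {"name": "B. Jones", "1850": "7", "1851": ""},
]


def tidy(rows, id_column="name", value_name="amount"):
    """Reshape wide rows so that each output row is a single observation."""
    tidy_rows = []
    for row in rows:
        for column, cell in row.items():
            if column == id_column:
                continue
            tidy_rows.append({
                id_column: row[id_column],
                # The year was hidden in a column header; make it a variable.
                "year": int(column),
                # A blank cell in the transcription becomes None, not a guess.
                value_name: int(cell) if cell.strip() else None,
            })
    return tidy_rows


tidy_rows = tidy(wide_rows)

# Locating a specific value is now a simple filter on the variables.
matches = [r for r in tidy_rows if r["name"] == "A. Smith" and r["year"] == 1851]
print(matches)
```

In the tidy version, every row is one person in one year, every column is one variable (name, year, amount), and every cell holds one value, which is why looking up a single entry takes one short filter instead of scanning the whole page.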