Project Management R+ Text Analysis

Data Management


As a digital humanist and, before that, a bit of a web nerd, I had file naming and data management conventions drilled into my head. At least, that is what I thought. As part of our incubator, we had the opportunity to meet with Jennifer Thoegersen, Assistant Professor and Data Curation Librarian.


Here are just a few of the things that I learned:

  • Keep a local copy and a backup. The rule of thumb is three copies, on two different types of media, with one copy stored remotely.
  • When dealing with human subjects, it’s important to think about private information, especially if the subjects are still alive.
  • For my project, which uses R, it is important that the names of the R output files match the names of the text input files and that the file names contain no spaces or special characters (for example, airp730.pdf, airp730.txt, airp730.R).
  • Always keep a README file. It helps others who might not be familiar with my project to understand my choices. For example, I removed non-essential pages from the PDFs before I used OCR software.
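
The file naming rule above is easy to check automatically. Here is a minimal sketch, written in Python purely for illustration (the same check could be done in R). The function name check_names, the pattern for "safe" characters, and the expected set of extensions (.pdf, .txt, .R) are all assumptions based on the airp730 example, not a fixed standard.

```python
import re
from collections import defaultdict
from pathlib import PurePath

# Assumed convention: letters, digits, underscores, and hyphens, then one extension.
SAFE_NAME = re.compile(r"^[A-Za-z0-9_-]+\.[A-Za-z0-9]+$")

# Assumed set of files every text should have, based on the airp730 example.
EXPECTED_EXTENSIONS = {".pdf", ".txt", ".R"}


def check_names(filenames):
    """Return (names with spaces or special characters, {base name: missing extensions})."""
    unsafe = [name for name in filenames if not SAFE_NAME.match(name)]

    # Group the extensions seen for each base name, e.g. "airp730" -> {".pdf", ".txt", ".R"}.
    by_stem = defaultdict(set)
    for name in filenames:
        path = PurePath(name)
        by_stem[path.stem].add(path.suffix)

    # Report base names that lack one of the expected matching files.
    missing = {
        stem: sorted(EXPECTED_EXTENSIONS - found)
        for stem, found in by_stem.items()
        if EXPECTED_EXTENSIONS - found
    }
    return unsafe, missing


files = ["airp730.pdf", "airp730.txt", "airp730.R", "air p731.pdf"]
unsafe, missing = check_names(files)
print("Rename these:", unsafe)
print("Missing matching files:", missing)
```

In this made-up file list, "air p731.pdf" is flagged twice: once for the space in its name, and once because it has no matching .txt or .R file. Recording the chosen convention in the README keeps it visible to anyone who picks up the project later.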


