February 15th: Good Data Examples
Post authored by Lora Leligdon
Day three of Love Your Data week brings us to some examples of good data! What are good data?
Good data are FAIR – Findable, Accessible, Interoperable, Re-usable
Things to consider:
What makes data good?
- Data have to be readable and well documented enough for others (and a future you) to understand.
- Data have to be findable to keep them from being lost. Information scientists have started to call such data FAIR: Findable, Accessible, Interoperable, Re-usable. One of the most important things you can do to keep your data FAIR is to deposit them in a trusted digital repository. Do not use your personal website as your data archive.
- Tidy data are good data. Messy data are hard to work with.
- Data quality is a process that runs from planning through to curation of the data for deposit.
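To make the tidy data point concrete, here is a minimal Python sketch that uses only the standard language features. It assumes the common reading of "tidy": each variable gets its own column and each observation gets its own row. The site names, years, counts, and the `to_tidy` helper are all invented for illustration and do not come from any real dataset.

```python
# Hypothetical "messy" data: one row per site, with one column per year.
# The year is a variable, but here it is hidden in the column names.
wide_rows = [
    {"site": "A", "2015": 12, "2016": 15},
    {"site": "B", "2015": 7, "2016": 9},
]


def to_tidy(rows, id_key, var_name, value_name):
    """Reshape wide rows so that each output row holds one observation."""
    tidy = []
    for row in rows:
        for column, value in row.items():
            if column == id_key:
                continue  # the identifier is repeated on every output row
            tidy.append({id_key: row[id_key], var_name: column, value_name: value})
    return tidy


# Each (site, year) pair becomes its own row with a single count.
tidy_rows = to_tidy(wide_rows, id_key="site", var_name="year", value_name="count")
for tidy_row in tidy_rows:
    print(tidy_row)
```

In the tidy version, filtering by year or adding a new year does not require changing the column layout, which is one reason this shape tends to be easier to work with.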
Example: This dataset is still around and usable more than 50 years after the data were collected and more than 40 years after it was last used in a publication.
Counterexample: This article (http://www.sciencedirect.com/science/article/pii/S1751157709000881) promises:
“Statistical scripts and the raw dataset are included as supplemental data and are also available at http://www.researchremix.org.”
- Hadley Wickham tells you how to tidy your data: http://vita.had.co.nz/papers/tidy-data.pdf
- Project TIER teaches undergraduate students how to structure data for reproducible research: http://www.projecttier.org/tier-protocol/specifications/
- The UK Data Archive has great instructions for how to document your data: http://www.data-archive.ac.uk/create-manage/document
- If you want to go all in, look at the instructions for documenting data in ICPSR’s Guide to Social Science Data Preparation and Archiving: http://www.icpsr.umich.edu/files/deposit/dataprep.pdf
Example: Data can take many forms. This compilation of “Morale and Intelligence Reports” collected by the UK Government during and after the war is a great example of qualitative historical data: https://discover.ukdataservice.ac.uk/catalogue/?sn=7465
- Want to learn more? Register for and attend a Dartmouth research data management workshop on planning, cleaning, visualizing, storing, sharing, and preserving your data.
- What is your favorite data set? How/why is it good for your project? Try out the FAIR Principles to describe and share examples of good data for your discipline. Tell us on Twitter or Facebook (#LYD 2017 #loveyourdata)
Our daily blog posts are courtesy of the 2017 LYD Week Planning Committee. Learn more at https://loveyourdata.wordpress.com/lydw-2017/
We’re getting close to the end of our quality data posts! But stay tuned – tomorrow we will be discussing how to find the right data for your project.