
PHYS 105: Foundations of Physics

What is data?

  • Any information collected, recorded, or generated in the course of your research!

What is basic data literacy?

  • Understanding the key terms, concepts, and language used to talk about data.

What is data management?

  • The care and maintenance of data produced during the course of a research cycle. Managing research data means intentionally taking actions to make sure your data is secure, sustainable, findable, understandable, and reusable.

Why is Data Management Important?

Good data management helps you keep your data organized, making it easier to find, understand, and use. This is crucial for maintaining data quality and integrity, especially in collaborative projects.

Key Practices for Data Management

File Naming: Use clear, consistent names for your files. Avoid spaces and special characters. Include dates and version numbers to track changes.

Folder Structure: Organize your files logically, by type (e.g., text, images), time (e.g., year, month), or activity (e.g., interviews, experiments).

Data Ecosystem: Your data ecosystem includes servers, cloud storage, code packages, and software. Proper management ensures data is secure, sustainable, findable, understandable, and reusable.

Manage and Clean Data: Intentionally care for and maintain your data. This involves planning from the beginning of a project, maintaining quality throughout, and ensuring data can be shared and reused effectively.

FAIR Data Principles: The "FAIR Guiding Principles for scientific data management and stewardship," published in 2016, describe how to make data Findable, Accessible, Interoperable, and Reusable. These principles help make your research data more visible and usable, enabling integration with other data and reuse in new research.

CARE Data Principles: The CARE Principles for Indigenous Data Governance focus on how to handle data with respect and responsibility. CARE stands for Collective benefit, Authority to control, Responsibility, and Ethics. These principles guide researchers to consider not just data sharing, but also the rights and cultural contexts of the communities the data represents. This approach ensures that data use respects the values and interests of Indigenous and diverse cultures.

File Naming & Structure

Why is file naming important?

Think of a file name as a unique identifier for each of your files. A naming convention simplifies the organization of your files, helps you locate them with ease, and makes it easier for others to understand and reuse your data. This is particularly important when you are working on a collaborative project.

How should you name your file?

Here are some recommended best practices for naming your files:

  • Use names that are brief but descriptive
  • Avoid spaces and special characters (such as *, #, and %)
  • Agree on a naming convention that everyone using the files follows
  • Identify versions of files using dates and version numbers in the file name
  • Use three-letter file extensions to ensure backwards compatibility (e.g., .doc, .tif, .txt)
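The practices above can be sketched as a small helper. This is only an illustration, not a required convention: the function name `make_file_name` and the pattern `YYYYMMDD_description_vNN.ext` are assumptions chosen for the example, and your project may agree on a different pattern.

```python
import re
from datetime import date


def make_file_name(description: str, when: date, version: int, extension: str = "txt") -> str:
    """Build a file name such as 20120615_blood_ID0234_v02.txt.

    The pattern (date, description, version) is a hypothetical convention.
    """
    # Replace every run of spaces or special characters with one underscore,
    # keeping only letters, digits, hyphens, and underscores.
    safe = re.sub(r"[^A-Za-z0-9_-]+", "_", description.strip()).strip("_")
    if not safe:
        raise ValueError("description must contain at least one letter or digit")
    # Zero-padded date and version numbers sort correctly in a file browser.
    return f"{when:%Y%m%d}_{safe}_v{version:02d}.{extension.lstrip('.')}"


print(make_file_name("blood ID0234", date(2012, 6, 15), 2))
```

Putting the date first in year-month-day order means that files sorted by name are also sorted by time.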

How should files be structured?

A well-planned folder structure helps uniquely identify the files it contains. Plan the structure of the folders that will hold your data files before you begin to collect your data. Ideas for how to organize your folders include:

  • Data type (text, images, models, etc.)
  • Time (year, month, session, etc.)
  • Subject characteristic (species, age grouping, etc.)
  • Research activity (interview, survey, experiment, etc.)
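As a rough sketch of how such a structure could be set up before data collection starts, the helper below creates nested folders by research activity, data type, and time. The function name `data_folder` and the ordering of the levels are assumptions made for illustration; choose the levels that suit your project.

```python
from pathlib import Path


def data_folder(root: Path, activity: str, data_type: str, year: int, month: int) -> Path:
    """Return (and create if needed) root/activity/data_type/YYYY/MM.

    The level order is a hypothetical choice, not a standard.
    """
    folder = root / activity / data_type / f"{year:04d}" / f"{month:02d}"
    # parents=True creates intermediate folders; exist_ok=True makes reruns safe.
    folder.mkdir(parents=True, exist_ok=True)
    return folder
```

For example, `data_folder(Path("PHYS105_project"), "experiment", "images", 2024, 3)` would create `PHYS105_project/experiment/images/2024/03`, where `PHYS105_project` is a made-up root folder name.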

Consider these examples of file naming and folder structure:

File001.txt vs. 201206blood_ID0234.txt

MyDocuments\Research\Sample12.jpg vs. C:\NEHGrant01234\WWI\Images\London_001.jpg

Data Organization

Why should you organize your data?

The organizational structure of your data can help secondary users of your data find, identify, select, and obtain the data they require.

How do you organize your data?

For best results, data structure should be fully modeled top-to-bottom/beginning-to-end in the planning phase of a project.
You'll want to devise ways to express the following:

  • the context of data collection: project history, aim, objectives, and hypothesis
  • data collection methods: sampling, data collection process, instruments used, hardware and software used, scale and resolution, temporal and geographic coverage, and secondary data sources used
  • dataset structure: data files, study cases, and relationships between files
  • data validation, checking, proofing, cleaning, and quality assurance procedures carried out
  • changes made to data over time since their original creation and identification of different versions of data files
  • information on access and use conditions or data confidentiality

(adapted from UKDA)

What is Metadata?

Metadata is data that describes your data. Metadata is used to structure actual data sets (like the column headings of simple tabular data) as well as to describe features of data sets. Some examples of metadata include information that answers the questions of when, who, what, why, and how:

  • date the data was created
  • creators of the data
  • the source of the data
  • purpose for which the data was collected
  • structure of data files
  • changes made between different versions of data
  • codes used for variables and missing values
  • data collection methods and instruments used
  • steps taken to de-identify the data

Sometimes metadata is contained in the data files produced by the software used to collect or analyze the data; other times it is included in a codebook or lab notebook. Make every effort to keep this information with the data set it describes.
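One possible way to keep metadata with its data set is a small "sidecar" file saved next to each data file. The sketch below is an assumption-laden illustration, not a metadata standard: the function name `write_metadata`, the `.metadata.json` suffix, and the field names in the example are all hypothetical.

```python
import json
from pathlib import Path


def write_metadata(data_file: Path, metadata: dict) -> Path:
    """Save metadata beside the data file as <file name>.metadata.json."""
    sidecar = data_file.with_name(data_file.name + ".metadata.json")
    # Sorted keys and indentation keep the file readable and easy to compare.
    sidecar.write_text(json.dumps(metadata, indent=2, sort_keys=True), encoding="utf-8")
    return sidecar


# Hypothetical fields answering when, who, what, why, and how.
example_metadata = {
    "date_created": "2012-06-15",
    "creators": ["A. Student"],
    "purpose": "class lab exercise",
    "missing_value_code": -999,
    "collection_method": "manual measurement",
}
```

Calling `write_metadata(Path("20120615_blood_ID0234_v02.txt"), example_metadata)` would create `20120615_blood_ID0234_v02.txt.metadata.json` in the same folder. If your repository or research domain requires a specific metadata standard, use its field names instead.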

Why should you care about metadata?

It provides the means for organizing and describing your data. Metadata facilitates data collection, processing, archiving, discovery, re-use and analysis.

What are metadata standards?

Metadata standards not only facilitate use of your data in its native environment, but maximize its usability in other environments. For example, standardized metadata will allow you to more easily move your data from one data repository to another. Check into whether there are standards commonly employed by your department or your organization. Perhaps your research domain commonly employs a metadata standard. It may be that the repository into which you will be depositing your data has metadata requirements. You will have to do a little research.

How it Ties to Data Literacy

Data Integrity: Proper data management ensures your data is reliable and accurate.

Collaboration: Well-organized data makes it easier to share and collaborate with others.

Privacy: Keeping track of data access and confidentiality helps protect privacy, both yours and that of anyone the data describes.

Example: Imagine you’re working on a project about social media trends. Properly naming and organizing your files means you can quickly find and share specific data sets with your team, ensuring everyone is on the same page.

For more detailed guidelines on data management, check out the Data Management LibGuide.