Scholarly Communication Guide: Data Collection

This guide describes the support services the library offers for every stage of your research, from planning your project to measuring its impact.

What is data collection?

Data collection is the methodical process of gathering information on a particular topic. It is critical to make sure your data is collected legally, ethically, and completely. Otherwise, your analysis may be inaccurate, and the consequences can be serious.

Before collecting data, there are several factors you need to define:

  • The question you aim to answer
  • The data subject(s) you need to collect data from
  • The collection timeframe
  • The data collection method(s) best suited to your needs

The data collection method you select should be based on the question you want to answer, the type of data you need, your timeframe, and your budget.

For more on data collection, refer to the Sage Project Planner.

Different types of data

Primary data is information gathered directly from the source for a particular research project. It is original data that has not been examined or published before. Researchers obtain primary data directly from sources using a variety of techniques, which helps ensure that the information is relevant to their specific investigation.

Techniques for Gathering Primary Data:

  • Questionnaires and surveys: Designed to collect data from a particular population.
  • Interviews: Gathering in-depth information through individual or group interviews.
  • Experiments: Running controlled tests to observe outcomes under different conditions.
  • Observations: Gathering information by watching people in their natural settings.
  • Focus groups: Holding discussions with a small group of people to get their opinions on a certain subject.

Characteristics of Primary Data:

  • Original: The data is being collected for the first time.

Secondary data is existing data that has been collected, analyzed, and published by someone other than the researcher. It is typically used to support or enhance research without the need to gather new data. Researchers often use secondary data to gain insights, validate findings, or provide context for their studies.

Sources of Secondary Data

  1. Books and Journals: Academic publications that provide existing research findings.
  2. Government Reports: Statistics and studies published by government agencies.
  3. Research Articles: Previous studies that have analyzed specific topics.
  4. Databases: Compiled datasets from surveys or research, such as census data.
  5. Media Reports: News articles, magazines, and other media sources that cover relevant topics.

Characteristics of Secondary Data:

  • Previously Collected: Data that has been gathered by others.
  • Cost-Effective: Often less expensive and time-consuming to obtain than primary data.
  • Broader Context: Can provide a wider perspective on a topic through various sources.

Limitations:

  • Relevance: The data may not be perfectly suited to the research question.
  • Accuracy: The quality and reliability of secondary data can vary, depending on the source.
  • Timeliness: The data may be outdated or not reflective of current conditions.

Secondary data is valuable for researchers looking to support their work with existing evidence or to understand trends and patterns over time.

Sampling and representation

What is sampling?

Sampling is the process of selecting units (e.g., people, organizations) from a population of interest so that by studying the sample we may fairly generalize our results back to the population from which they were chosen. 

Why is it crucial to consider sampling?

It is crucial to match your sample as closely as possible to the larger population you want to generalize to. Your sample and sampling strategy affect the generalizability of your findings, that is, how well they apply to situations or individuals you have not studied. For instance, findings from interviews with homeless people in hospital who have mental health issues may not extend to all Australians who have mental health issues, are homeless, or are hospitalized. Your sampling strategy will depend on your research, but you may use either a probability (randomised) or a non-probability approach.
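
To make the difference between the two approaches concrete, here is a minimal Python sketch. It is an illustration only, not a recommended procedure: the population of 1,000 participant IDs, the sample size of 50, and the function names are all hypothetical. Simple random sampling stands in for probability approaches, and convenience sampling (taking whoever is easiest to reach) stands in for non-probability approaches.

```python
import random


def simple_random_sample(population, k, seed=None):
    """Probability sampling: every unit has an equal chance of selection."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(population, k)  # draws k units without replacement


def convenience_sample(population, k):
    """Non-probability sampling: take the first k units that are easiest to reach."""
    return list(population[:k])


# Hypothetical population: 1,000 participant IDs (assumed for illustration)
population = list(range(1, 1001))

random_sample = simple_random_sample(population, 50, seed=42)
convenient = convenience_sample(population, 50)

print("Random sample spans IDs", min(random_sample), "to", max(random_sample))
print("Convenience sample spans IDs", min(convenient), "to", max(convenient))
```

The convenience sample only ever contains the first 50 IDs, so any pattern tied to position in the list (for example, sign-up order) would skew the results. The random sample gives every ID the same chance of being chosen, which is what supports generalizing back to the population.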

For more on sampling, please visit the Sage Research Methods site.