
Theme 8 - Data Collection

1. Data Collection

1.1. Main concept

  • credibility: the quality of being trusted and believed in (the research is not made up, and its methods and results can be trusted)
  • reliability: the extent to which findings are consistent; results are obtainable again under similar conditions (the same data would have been collected each time)
  • validity: the extent to which a test measures or predicts what it is supposed to (the appropriateness/meaningfulness/usefulness of findings)
  • generalisability: whether findings are applicable to other research settings or organisations
  • trustworthiness: if research has credibility, transferability, dependability and confirmability
  • authenticity: if research represents different viewpoints, helps improve the social setting and the understanding of others, and if it empowers members to engage in action
  • objectivity: personal neutrality in conducting research
  • rigour: activity of being careful and paying great attention to detail
  • measurement: the process of observing and recording the observations that are collected as part of a research effort
  • consistency: the quality of always being the same, doing things in the same way, having the same standards etc
  • yield: to produce a result, answer, or piece of information
  • research integrity: conducting research in a way which allows others to have trust and confidence in the methods used and the findings that result from this
  • maturation: when subjects change during the course of the experiment or between measurements
  • bias: a strong feeling in favour of or against one group of people, or one side in an argument, often not based on fair judgement
  • measurement validity (construct validity): the degree to which a test measures what it claims, or purports, to be measuring
  • internal validity: whether or not there is really a causal relationship between two variables
  • external validity: the extent to which the results of a study can be generalized to other situations and to other people
  • ecological validity: the extent to which a study is realistic or representative of real life

1.2. Research quality

Research quality concerns the credibility of research.

  • How do we judge the results of the research?
  • How can we trust research findings?
  • How do we know the research has not been made up?
  • How do we know the research is capturing something about the phenomenon in question?

1.3. Research credibility

Research credibility is about making the research process transparent: explaining and justifying what you did and why, including your research choices, decisions, procedures and processes. This ensures that you produce high-quality research even if it reveals limited results. Traditional criteria derive from quantitative perspectives.

2. Quantitative Research

2.1. Traditional criteria

These traditional measures of quality are primarily concerned with objectivity, rigour and measurement:

  • Reliability
  • Validity
  • Generalisability

2.1.1. Reliability

Reliability concerns the extent to which data collection and analysis techniques produce consistent findings. It implies that the same data would have been collected each time over repeated tests / observations.

  1. How to assess the reliability of your research
  • Measures of reliability
    • Stability: Is the measure stable over time?
    • Internal reliability: Are the indicators (kinds of measure) consistent? Indicators must relate to the same thing, be coherent.
    • Inter-observer consistency: Will similar observations be reached by other observers?
  2. Threats to reliability
  • Participant error: any factor which influences the way in which a participant performs (e.g. emotional, physical)
  • Participant bias: any factor that causes a false response (participants say what they think the interviewer wants to hear)
  • Observer error: any factor that causes the researcher to misinterpret what is observed (e.g. a misunderstanding)
  • Observer bias: any factor that causes a false interpretation by the researcher (their own subjective view)

2.1.2. Validity

  • Validity is concerned with the integrity of the conclusions that are generated from a piece of research.
  • Validity refers to obtaining results that accurately reflect the concept being measured and it implies reliability (consistency)
  • A valid measure is one which is measuring what it is supposed to measure.
  1. Four types of validity
  • Measurement validity e.g. Does an IQ test really measure variations in intelligence?
  • Internal validity e.g. if we suggest that x causes y, can we be sure that it is x that is responsible for the variation in y and not something else? (the principle of cause and effect)
  • External validity e.g. can the results of a study be generalised beyond the specific research context? (the importance of choosing participants)
  • Ecological validity e.g. are social scientific findings applicable in people’s everyday, natural social settings?
  2. Threats to (internal) validity
  • Past or recent events: The influence of specific experiences
  • Testing: The impact of testing on participants’ views or actions (having been exposed to the test before). For example, informing participants about a research project may later affect their work behaviour or responses during the research if they believe it might lead to future consequences for them
  • Maturation: The impact of change in participants outside of the influence of the study that affects their attitudes and behaviour. For example, management training may make participants revise their responses during a subsequent research stage
  • Ambiguity about causal direction: Lack of clarity about cause and effect
  • Instrumentation: The impact of a change in a research instrument between different stages of the research, affecting the comparability of results
  • Mortality: The impact of participants withdrawing from the study.
  3. Example of Validity

The Hawthorne effect: the tendency of people to change their behaviour, typically performing better, when they know they are being observed. It threatens validity because the change is caused by being observed rather than by the variable under study.

2.1.3. Generalisability

Generalisability concerns the extent to which research results are applicable to other research settings and/or organisations

  • Can the findings be generalised to other contexts?
  • Can the findings be generalised from sample to population?
  1. Credibility in quantitative research
  • Establishing reliability and validity of measures, and generalisability of findings is important in terms of assessing the credibility of quantitative research.
  • Quantitative research should be capable of replication. Another researcher, in another place and at a different time, could repeat the methods and procedures used in order to verify the original findings in the new context.

3. Qualitative Research

3.1. Characteristics of qualitative research

  • Seeing through the eyes of those studied: how patterns of events unfold over time; social worlds characterised by change and flux
  • Description and emphasis on context: detailed account of the social setting; thick description of what is going on
  • Emphasis on social process
  • Flexibility and limited structure: no prior contamination by rigid schedules; sensitising concepts
  • Concepts and theories grounded in data
  • The development of concepts
  • Relationship between researcher and respondents

3.2. Credibility in qualitative research

Traditional criteria derive from quantitative perspectives and some qualitative researchers also apply these to their studies. Other qualitative researchers view traditional criteria as problematic.

  • Problems with reliability
    • Qualitative research views reality as context bound: External reliability concerns the degree to which a study can be replicated. In qualitative research, social reality cannot be replicated, because reality is understood to be multiple, subjective and constructed. Qualitative research has an unstructured nature with no standard procedures. Qualitative researchers do not believe there are single accounts of particular aspects of the social world that can be replicated. They argue that developing multiple perspectives enhances our understanding of the social world by building a richer, more complex picture.
    • Researcher is the main instrument: The purpose of internal reliability is to ensure uniformity and objectivity across researchers so that bias does not influence the findings. In qualitative research, however, the researcher is the main instrument of data collection and analysis. What the researcher observes, hears and decides to concentrate on is a product of her/his perspective, so qualitative researchers recognise that findings are influenced by their own subjectivity and experience, and that objectivity is not achievable. Researchers choose what areas to focus on; participant responses are likely to be affected by the characteristics of the researcher (e.g. age, gender, ethnicity, class, personality); and the researcher’s interpretation will be affected by their subjective learning.
  • Problems with generalisability: Generalisability is not appropriate because the research is concerned with context, detail and depth. Qualitative research tends to use case studies and small samples. This criterion is not appropriate as the focus is on small numbers of individuals, organisations and settings. It can be difficult to generalise from these small-scale studies. Generalisation is not an intention of qualitative research. The strength of qualitative research is the detailed, complex and rich picture it can build about a particular phenomenon.
  • Validity: validity tends to be a strength in qualitative research because there is a good match between researcher observations and the theoretical ideas developed. This is particularly the case in ethnographic research because the researcher participates for a prolonged period in the social life of a group to ensure high levels of congruence between concepts and observations.

3.3. Attempts to address the problems

  • Tensions between traditional criteria and qualitative preoccupations have resulted in ongoing debate.
  • Attention to developing appropriate criteria for measuring quality in qualitative research is limited.
  • Qualitative researchers have two choices:
    • Reformulate traditional criteria
    • Apply alternative criteria

3.3.1. Reformulating criteria for qualitative research

  • Reliability: the researcher should document each stage of the research process (this provides transparency).
  • Validity: the researcher should concentrate on listening rather than talking, produce accurate notes and seek feedback from respondents.

3.3.2. Alternative criteria for qualitative research

  1. Trustworthiness (replaces reliability and generalisability)
  • Credibility: carrying out research according to principles of good practice.
  • Dependability: completing records of all phases of the research process, e.g. documenting the problem formulation, selection of research participants, fieldwork notes, interview transcripts, data analysis decisions and so on.
  • Transferability: contextual richness and in-depth descriptions enable researchers to apply the study in other settings.
  • Confirmability: Not allowing personal values to influence the findings.
  2. Authenticity (replaces validity)

    Validity grounded in the researcher’s practice is required. This is achieved through authenticity. Authenticity criteria concern the wider political impact of the research. Authenticity criteria have not been influential, and their emphasis on the wider impact of research is controversial.

  • Does research represent different viewpoints in the social setting?
  • Does the research help improve understanding of the social setting?
  • Does the research help improve understanding of others in the social setting?
  • Does the research empower members to engage in action?

4. Secondary Research

4.1. The definition

  • Secondary research involves data that has been collected by somebody else previously.
  • It involves re-analysing, interpreting, or reviewing past data which is usually accessible via past researchers, government records and various online and offline resources.
  • The role of the researcher is always to specify how this past data informs his or her current research.

4.2. Advantages

  • Inexpensive: Conducting secondary research is much cheaper than doing primary research
  • Saves time: Secondary research takes much less time than primary research
  • Accessibility: Secondary data is usually easily accessible from online sources
  • Large scope of data: You can rely on immensely large data sets that somebody else has collected
  • Professionally collected data: Secondary data has been collected by researchers with years of experience

4.3. Disadvantages

  • Inappropriateness: Secondary data may not be fully appropriate for your research purposes
  • Wrong format: Secondary data may have a different format than you require
  • May not answer your research question: Secondary data was collected with a different research question in mind
  • Lack of control over the quality of data: Secondary data may lack reliability and validity, which is beyond your control
  • Lack of sufficient information: Original authors may not have provided sufficient information on various research aspects

4.4. Methods and purposes of secondary research

Method | Purpose
Using a secondary data set in isolation | Re-assessing a data set with a different research question in mind
Combining two secondary data sets | Investigating the relationship between variables in two data sets or comparing findings from two past studies
Combining secondary and primary data sets | Obtaining existing information that informs your primary research
4.5. Types of secondary data

  • The two most common types of secondary research are, as with all types of data, quantitative and qualitative.
  • Secondary research can, therefore, be conducted by using either quantitative or qualitative data sets.
  • Both can be used when you want to:
    • inform your current research with past data
    • re-assess a past data set

4.6. Sources of secondary data

The two most common types of secondary data sources are labelled as internal and external.

4.6.1. Internal sources of data 

Internal sources of data are those that are internal to the organisation in question. For instance, if you are doing a research project for an organisation (or research institution) where you are an intern, and you want to reuse some of their past data, you would be using internal data sources.

  • Examples include: Sales data, financial data, transport data, marketing data, customer data, safety data.
  • The benefit of using these sources is that they are easily accessible and there is no associated financial cost of obtaining them.

4.6.2. External sources of data

External sources of data are those that are external to an organisation or a research institution where you conduct your research. This type of data has been collected by “somebody else”.

  • Examples include: Government sources; National and international institutions; Trade, business, and professional associations; Scientific journals; Commercial research organisations.
  • The benefit of external sources of data is that they provide comprehensive data. However, you may sometimes need more effort (or money) to obtain it.

4.7. Steps for doing secondary research

4.7.1. Step 1 - Develop your research question

  • The first step here is to specify the general research area in which your research will fall.
  • Once you have identified your general topic, your next step consists of reading through existing papers to see whether there is a gap in the literature that your research can fill.
  • Having found your topic of interest and identified a gap in the literature, you need to specify your research question.

4.7.2. Step 2 - Identify a secondary data set

  • After reviewing the literature and specifying your research question, you may decide to rely on secondary data.
  • You will do this if you discover that there is past data that would be perfectly reusable in your own research, therefore helping you to answer your research question more thoroughly (and easily).
  • You discover whether such data exists by reviewing the literature on your topic of interest. During this process, you will identify other researchers, organisations, agencies, or research centres that have explored your research topic.
  • You need to ensure that a secondary data set is a good fit for your own research question. Once you have established that it is, you need to specify the reasons why you have decided to rely on secondary data.

4.7.3. Step 3 - Evaluate a secondary data set

Because of the disadvantages of secondary data, it is crucial to evaluate a secondary data set.

  1. What was the aim of the original study?
  • When evaluating secondary data, you first need to identify the aim of the original study. This is important because the original authors’ goals will have impacted several important aspects of their research, including their population of choice, sample, employed measurement tools, and the overall context of the research.
  • During this step, you also need to pay close attention to any differences in research purposes and research questions between the original study and your own investigation. As we have discussed previously, you will often discover that the original study had a different research question in mind, and it is important for you to specify this difference.
  2. Who has collected the data?
    A further step in evaluating a secondary data set is to ask yourself who has collected the data. To what institution were the authors affiliated? Were the original authors sufficiently professional for their research to be trusted? Usually, you will be able to obtain this information through quick online searches.
  3. Which measures were employed?
  • If the study on which you are basing your research was conducted in a professional manner, you can expect to have access to all the essential information regarding this research.
  • Original authors should have documented all their sample characteristics, measures, procedures, and protocols. This information can be obtained either in their final research report or through contacting the authors directly.
  4. When was the data collected?
    When evaluating secondary data, you should also note when the data was collected. The reason for this is simple: if the data was collected a long time ago, you may conclude that it is outdated. And if the data is outdated, then what’s the point of reusing it?
  5. What methodology was used to collect the data?
  • When evaluating the quality of a secondary data set, the evaluation of the employed methodology may be the most crucial step.
  • We have already noted that you need to evaluate the reliability and validity of employed measures. In addition to this, you need to evaluate how the sample was obtained, whether the sample was large enough, if the sample was representative of the population, if there were any missing responses on employed measures, whether confounders were controlled for, and whether the employed statistical analyses were appropriate. Any drawbacks in the original methodology may limit your own research as well.
  6. Making a final evaluation
    The final question to ask is: “what can be done if our evaluation reveals the lack of appropriateness of secondary data?”. The answer, unfortunately, is “nothing”. In this instance, you can only note the drawbacks of the original data set, present its limitations, and conclude that your own research may not be sufficiently well grounded.

4.7.4. Step 4 - Prepare and analyse secondary data

  • Outline all variables of interest
  • Transfer data to a new file
  • Address missing data
  • Recode variables
  • Calculate final scores
  • Analyse the data
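
The six preparation steps above can be walked through on a toy example. This is a sketch under stated assumptions, not a prescribed procedure: the data set, the variable names (`q1` to `q3`, `region`, `notes`), the 1-5 rating scale, the choice to drop incomplete rows and the assumption that `q3` is negatively worded are all hypothetical.

```python
from statistics import mean

# Hypothetical secondary data set; None marks a missing answer.
raw_rows = [
    {"id": 1, "region": "north", "q1": 4, "q2": 5, "q3": 2, "notes": "a"},
    {"id": 2, "region": "south", "q1": 2, "q2": None, "q3": 4, "notes": "b"},
    {"id": 3, "region": "north", "q1": 5, "q2": 4, "q3": 1, "notes": "c"},
]

# Step 1: outline all variables of interest
VARIABLES = ["id", "q1", "q2", "q3"]
ITEMS = ["q1", "q2", "q3"]

# Step 2: transfer the data to a new structure, leaving the original untouched
working = [{name: row[name] for name in VARIABLES} for row in raw_rows]

# Step 3: address missing data (here: drop any respondent with a missing item;
# other strategies exist and may suit your data better)
complete = [row for row in working if all(row[i] is not None for i in ITEMS)]

# Step 4: recode variables (q3 is assumed to be negatively worded,
# so its 1-5 rating is reversed)
for row in complete:
    row["q3"] = 6 - row["q3"]

# Step 5: calculate final scores (mean of the three items per respondent)
for row in complete:
    row["score"] = sum(row[i] for i in ITEMS) / len(ITEMS)

# Step 6: analyse the data (a simple descriptive statistic)
overall_mean = mean(row["score"] for row in complete)
print(len(complete), "complete cases; mean score:", overall_mean)
```

Working on a copy (step 2) means every later decision, such as how missing answers were handled, can be checked against the untouched original.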

5. Sampling

5.1. Definition of key words

  • Population: the universe of units from which the sample is to be selected.
  • Sample: the segment of population that is selected for investigation.
  • Sampling frame: a list of all units within the population that can be sampled.
  • Representative sample: a sample that reflects the population accurately.
  • Sample bias: distortion in the representativeness of the sample.
  • Probability sample: a sample selected using random selection
  • Non-probability sample: a sample selected without using a random selection method
  • Sampling error: the difference between sample and population that arises because only a limited sample, rather than the whole population, was studied
  • Non-sampling error: a difference between sample and population that does not come from sampling itself (e.g. an error that arises during data collection)
  • Non-response: when members of sample are unable or refuse to take part
  • Census: data collected from entire population

5.2. Samples in research projects

A sample is a limited number of objects, phenomena, or people which are studied in order to make observations about the group as a whole.

5.3. Sample size and representativeness

  • The size of a sample, or the number of measurements taken, can also have an impact on the validity of results.
  • A sample should be large enough that the researcher can generalise the results by claiming that they give an insight into the general population or phenomena being studied.
  • If you are conducting a survey (e.g. a questionnaire or online survey) for your project, it is recommended that you have a sample size of 30.
  • If you are conducting interviews, it is recommended that you interview 5 people with each interview lasting approximately 10 minutes.

5.4. Sampling error

Difference between sample and population

  • A biased sample does not represent the population
    • some groups are over-represented
    • others are under-represented
  • Sources of bias
    • non-probability sampling
    • inadequate sampling frame
    • non-response
  • Probability sampling reduces sampling error and allows for inferential statistics.
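
The difference between a sample and its population can be made concrete with a small simulation. The sketch below is illustrative only: the population is randomly generated, the mean is assumed as the statistic of interest, and the "biased" sample is an exaggerated case in which one group (the highest values) is heavily over-represented.

```python
import random
from statistics import mean


def sampling_error(sample, population):
    """Difference between the sample mean and the population mean."""
    return mean(sample) - mean(population)


rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: 1,000 values of some measured variable
population = [rng.gauss(50, 10) for _ in range(1000)]

# Probability sample: every unit has an equal chance of selection
random_sample = rng.sample(population, 30)

# Biased sample: only the 30 highest values are selected
biased_sample = sorted(population)[-30:]

print("random sample error:", sampling_error(random_sample, population))
print("biased sample error:", sampling_error(biased_sample, population))
print("census error:", sampling_error(population, population))
```

A census has no sampling error by definition, while a random sample still differs from the population by chance; the biased sample differs systematically, which is the problem that probability sampling is meant to avoid.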

5.5. Probability Sampling

  • Simple random sample
    • Each unit has an equal probability of selection
    • Sampling fraction: n/N, where n = sample size and N = population size
    • List all units and number them consecutively
    • Use a random numbers table to select units
  • Systematic sample
  • Stratified random sample
  • Multi-stage cluster sample
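
A minimal sketch of the first three probability sampling methods, assuming a sampling frame of 100 consecutively numbered units. Python's `random` module stands in for the random numbers table, and the function names and the two strata are hypothetical. Multi-stage cluster sampling is not shown.

```python
import random


def sampling_fraction(n, N):
    """n/N, where n = sample size and N = population size."""
    return n / N


def simple_random_sample(units, n, rng):
    """Each unit has an equal probability of selection."""
    return rng.sample(units, n)


def systematic_sample(units, n, rng):
    """Random start, then every k-th unit from the list, where k = N // n."""
    k = len(units) // n
    start = rng.randrange(k)
    return [units[start + i * k] for i in range(n)]


def stratified_sample(units, stratum_of, fraction, rng):
    """Apply the same sampling fraction at random within each stratum."""
    strata = {}
    for unit in units:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, round(len(members) * fraction)))
    return sample


rng = random.Random(7)  # fixed seed so the run is repeatable
frame = list(range(1, 101))  # units numbered consecutively, 1 to 100
fraction = sampling_fraction(10, len(frame))

print(simple_random_sample(frame, 10, rng))
print(systematic_sample(frame, 10, rng))
# Hypothetical strata: units 1-60 belong to group A, units 61-100 to group B
print(stratified_sample(frame, lambda u: "A" if u <= 60 else "B", fraction, rng))
```

The stratified version guarantees that each group appears in the sample in proportion to its size in the frame (here 6 from A and 4 from B), which a simple random sample only achieves on average.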

5.6. Non-Probability Sampling

  1. Convenience sampling
    • the most easily accessible individuals
    • useful when piloting a research instrument
    • may be a chance to collect data that is too good to miss
  2. Snowball sampling
    • researcher makes initial contact with a small group
    • these respondents introduce others in their network
  3. Quota sampling
    • interviewers select people to fit their quota for each category
    • The sample may be biased towards those who appear friendly and accessible (e.g. in the street)
    • can lead to under-representation of less accessible groups
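
Quota sampling can be sketched as filling a fixed number of places per category from whoever is encountered first. The function, the names and the quotas below are hypothetical; the point of the sketch is that selection depends on the order in which people are met, not on chance.

```python
def quota_sample(encountered, category_of, quotas):
    """Accept people in the order they are encountered until the quota
    for every category is full."""
    remaining = dict(quotas)
    sample = []
    for person in encountered:
        category = category_of(person)
        if remaining.get(category, 0) > 0:
            sample.append(person)
            remaining[category] -= 1
        if not any(remaining.values()):
            break  # all quotas filled
    return sample


# Hypothetical passers-by as (name, category), in the order they were approached
passers_by = [
    ("Ann", "F"), ("Bob", "M"), ("Cat", "F"),
    ("Dee", "F"), ("Ed", "M"), ("Flo", "F"),
]
print(quota_sample(passers_by, lambda p: p[1], {"F": 2, "M": 2}))
```

Because whoever is met first fills the quota, people who are harder to reach never enter the sample, which is the under-representation noted above.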

5.7. Limits to generalisation

  • Findings can only be generalised to the population from which the sample was selected
    • be wary of over-generalising in terms of locality
  • Time, historical events and cohort effects
    • results may no longer be relevant and so require updating (replication)

5.8. Error in survey research

  • Sampling error: unavoidable difference between sample and population
  • Sampling-related error: inadequate sampling frame; non-response
  • Data collection error: implementation of research instruments, e.g. poor question wording in surveys
  • Data processing error: faulty management of data, e.g. coding errors

6. Surveys

6.1. Different types

  • Structured e.g. face-to-face questionnaire, online survey
  • Semi-structured e.g. focus group, interview
  • Unstructured e.g. interview

6.2. Planning

  • Population
    • Who forms the population you plan to survey?
  • Sample
    • Sample size = approximately 30 for questionnaires and 5 for interviews
  • Anonymity
    • What (if any) identifying information is required?
  • Delivery
    • Face-to-face / online survey?
  • Incentives/Rewards
    • No incentives or rewards should be offered

6.3. Defining your objectives

  • Establish a clear hypothesis
  • Establish the exact variables you wish to gather data about and how they can be assessed
  • Think about the answers you want before you write the questions
  • Think about how you are going to analyse and evaluate your data

6.4. The hypothesis

  • When you hypothesise, you put forward a theory which the evidence will then either support or not support.
  • To test the hypothesis, a survey is conducted.
  • The results are analysed.
  • A report on the results is written, stating whether the hypothesis is supported or not supported.

6.5. Question types

  1. Closed and fixed choice questions. e.g. multiple choice, rating
  2. Open questions
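
Closed questions are straightforward to analyse because every answer falls into a predefined option. A minimal sketch, assuming one rating question on a 1-5 scale and made-up responses in which `None` marks a skipped answer:

```python
from collections import Counter

# Hypothetical answers to one rating question
# (1 = strongly disagree, 5 = strongly agree; None = no answer given)
responses = [4, 5, 3, 4, 2, 5, 4, None, 3, 4]

answered = [r for r in responses if r is not None]
counts = Counter(answered)

# Percentage of answered responses for each option, including unused ones
percentages = {option: 100 * counts[option] / len(answered) for option in range(1, 6)}

for option in range(1, 6):
    print(option, counts[option], f"{percentages[option]:.1f}%")
```

Open questions cannot be tallied this way until the free-text answers have been read and grouped into categories.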

6.6. Challenges

  • Clarity: Avoid vague questions that could be misunderstood by participants
  • Embarrassing questions: Are they necessary? e.g. How much do you weigh / earn?
  • Leading questions: e.g. “What do you think about our wonderful college?”
  • Prestige bias: e.g. “Research has shown that people who exercise every day live longer. How often do you exercise?”

6.7. Interview topics

Interviews are designed to capture qualitative data about participants.

  • Behaviours
  • Knowledge
  • Opinions / values
  • Feelings

6.8. Interview question types

  • Description: Tell me about… / What got you interested in…?
  • Detail: Can you tell me more about…? What might happen if…? Can you start with…?
  • Reflection: You mentioned… How do you feel about that? What’s it like when…?
  • Probing: When did that happen? Can you remember what you did in that situation? How did you feel at that moment?

6.9. Finally

  • Provide your participant with an information sheet and gain informed consent before starting your research
  • Include a welcome e.g. introduce yourself and the purpose of the research
  • State how many questions there are and how long you anticipate it will take to complete the questionnaire / survey / interview
  • Collect necessary demographic data e.g. age range / nationality / gender
  • Thank your participant for their time