Chapter 16 Answers to self-check questions

Data analysis in cytopathology

16.1 Name each of the following types of data as nominal, ordinal or continuous: (a) the temperature of a water bath, (b) the volume of a fixative solution, (c) the proportion of eosinophils in a pleural fluid, (d) the gender of a group of research participants, (e) the depth of invasion of a tumour, (f) the grade of a tumour.

  1. Continuous
  2. Continuous
  3. Continuous
  4. Nominal
  5. Continuous
  6. Nominal or ordinal


16.2 State the most appropriate graphical method of visualizing data in each of the following scenarios: (a) a comparison of the viral load measurements, which do not have a normal distribution, in normal cells, low-grade and high-grade cervical dyskaryosis, (b) a comparison of the expression of two antigens, which are measured on a continuous scale, in a population of neoplastic cells, (c) a comparison of immunocytochemical reaction scores for three fixation protocols, (d) the decay of staining intensity over time, (e) the proportion of different tumour types in a population of individuals with thyroid neoplasms, (f) the sensitivity of cervical cytology by age group.

  1. Box and whisker chart
  2. Scatterplot
  3. Box and whisker chart
  4. Line graph
  5. Pie chart
  6. Bar chart


16.3 Calculate the following: (a) sensitivity when there are 576 true positives and 7 false negatives, (b) sensitivity when there are 1057 cases of disease and 1008 true positives, (c) specificity when there are 56 false positives and 93 true negatives, (d) specificity when the false positive rate is 37%, (e) the number of false negatives when there are 79 true negatives and the negative predictive value is 93%, (f) the positive predictive value when there are 3783 positive test results, of which 3501 are true positives.

  1. sensitivity = true positives/(true positives + false negatives) = 576/(576+7) = 0.99
  2. sensitivity = true positives/number of cases of disease = 1008/1057 = 0.95
  3. specificity = true negatives/(true negatives + false positives) = 93/(93+56) = 0.62
  4. specificity = 1-false positive rate = 1-0.37 = 0.63
  5. negative predictive value = true negatives/(true negatives + false negatives); rearranging this formula gives false negatives = (true negatives/negative predictive value) - true negatives = (79/0.93) - 79 = 5.95, which rounds to 6
  6. positive predictive value = true positives/total positives = 3501/3783 = 0.93
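
The six calculations above can be checked with a short script. This is a minimal sketch in Python; the function names and the `answers` dictionary are illustrative assumptions and do not come from the chapter. Only the formulas and the figures are taken from the answers above.

```python
def sensitivity(true_pos, false_neg):
    """Sensitivity = TP / (TP + FN)."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Specificity = TN / (TN + FP)."""
    return true_neg / (true_neg + false_pos)


def false_negatives_from_npv(true_neg, npv):
    """Rearranged from NPV = TN / (TN + FN): FN = TN / NPV - TN."""
    return true_neg / npv - true_neg


def positive_predictive_value(true_pos, total_pos):
    """PPV = TP / all positive test results."""
    return true_pos / total_pos


answers = {
    "a": round(sensitivity(576, 7), 2),
    # 1057 cases of disease with 1008 true positives leaves 49 false negatives
    "b": round(sensitivity(1008, 1057 - 1008), 2),
    "c": round(specificity(93, 56), 2),
    # specificity = 1 - false positive rate
    "d": round(1 - 0.37, 2),
    # a count of cases, so round to the nearest whole number
    "e": round(false_negatives_from_npv(79, 0.93)),
    "f": round(positive_predictive_value(3783 - 282, 3783), 2),
}

for part, value in answers.items():
    print(part, value)
```

In part (f) the true positive count is written as 3783 - 282 only to show that the 282 remaining positive results are false positives; it equals the 3501 given in the question.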


16.4 Give the name of the appropriate statistical test in each of the following scenarios: (a) comparison of the immunocytochemical H-scores for an antibody in adenocarcinoma versus squamous cell carcinoma, (b) the relationship between the H-scores for two antibodies in a representative group of adenocarcinoma samples, (c) determining the influence of the type of slide (nominal variable) and extraction methodology (nominal variable) on the yield of DNA (continuous variable) from cytological samples for genetic analysis, (d) the association between p16 gene methylation status and progression of cervical intraepithelial neoplasia, (e) comparison of the nucleocytoplasmic ratio (a continuous variable with a normal distribution) of mesothelial cells from paired samples that have been fixed using two different preservatives, (f) determining whether there are significant differences in cell yield (a continuous variable with a normal distribution) from urine specimens in the age groups 30–39 years, 40–49 years, and 50–59 years.

  1. comparison of two independent groups of ordinal data requires a Mann-Whitney U test
  2. correlation between two ordinal variables requires calculation of Spearman’s correlation coefficient
  3. prediction of a continuous dependent variable from one or more independent variables requires linear regression analysis
  4. determining the association between two nominal variables requires a Chi-squared test
  5. comparison of paired groups of continuous data which have a normal distribution requires a paired t-test
  6. comparison of three or more groups of continuous data which have a normal distribution requires analysis of variance (ANOVA)
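
As an illustration of the paired comparison in scenario (e), the paired t statistic can be computed from the within-pair differences. This is a minimal sketch using only the Python standard library; the function name and the nucleocytoplasmic ratio values are hypothetical and were invented for illustration, not taken from the chapter or from any study.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(first, second):
    """Return (t, degrees of freedom) for paired samples.

    t = mean(d) / (sd(d) / sqrt(n)), where d holds the within-pair differences.
    """
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)  # stdev is the sample SD (n - 1)
    return mean(diffs) / standard_error, n - 1


# Hypothetical nucleocytoplasmic ratios for the same five samples,
# each split and fixed with two different preservatives.
preservative_a = [0.30, 0.35, 0.28, 0.40, 0.33]
preservative_b = [0.28, 0.31, 0.27, 0.36, 0.30]

t_value, dof = paired_t_statistic(preservative_a, preservative_b)
print(f"t = {t_value:.2f} with {dof} degrees of freedom")
```

The t value would then be compared against the t distribution with n - 1 degrees of freedom to obtain a p value. In practice a statistics package would normally perform both steps.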