Biostatistics: Past Exam Questions
1. Say you want to know how many books an average Varna university student reads per month. Say
you have taken a random sample of 144 Varna university students and that the mean number of
books for the ones in your sample is 10 and the standard deviation is 3. What is your best estimate
for the standard error of the mean (SEM)?
A. 0,2
B. 0,25
C. 0,33
D. 0,1
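A minimal sketch of the calculation behind question 1, assuming the usual formula SEM = s / sqrt(n), where s is the sample standard deviation and n the sample size. The function name is illustrative only.

```python
import math

def standard_error_of_mean(sd: float, n: int) -> float:
    """Estimate the standard error of the mean from a sample SD and sample size."""
    return sd / math.sqrt(n)

# Values taken from the question: SD = 3, n = 144
sem = standard_error_of_mean(sd=3, n=144)
print(sem)  # 0.25
```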
2. In a frequency distribution of 130 elements, the mean value is 28,5 and the median is 58. What is the
type of distribution?
A. Positive (right) skewed
B. Negative (left) skewed
C. Normal distribution
D. Bimodal
3. A study rejects the null hypothesis that the mean duration of viral shedding in primary HSV-1
infections is 7 days, t= 2935, df=12, p<0,01. This means that:
A. 1% of the cases did have a mean duration of viral shedding of 7 days
B. There is a 5% chance that the null hypothesis is being rejected incorrectly
C. There is a 1% chance that the null hypothesis is being rejected incorrectly
D. There is a 1% chance that a case would have less than 7 days of viral shedding
4. Nonparametric tests:
A. Are very difficult to calculate
B. Are “distribution free”
C. Are used to estimate at least one population parameter from our sample statistics
D. Are fit for small samples and interval/ratio-scaled data
5. Say you want to know how many books the average Varna university student reads per month. Say
you have taken a random sample of 144 Varna university students and that the mean number of books
for the ones in your sample is 10 and the standard deviation is 3. If you want to be 99% certain about
your answer, what is the range of error you must be willing to accept?
A. ±2,58*SEM
B. ±1,96*SEM
C. ±1,65*SEM
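The multipliers in question 5 are two-sided critical values of the standard normal distribution. The sketch below, assuming a large-sample normal approximation, recovers them with the standard library; `two_sided_z` is a hypothetical helper name.

```python
import math
from statistics import NormalDist

def two_sided_z(confidence: float) -> float:
    """Critical z value leaving (1 - confidence) / 2 in each tail."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)

sem = 3 / math.sqrt(144)  # SD and n from the question

for level in (0.90, 0.95, 0.99):
    z = two_sided_z(level)
    # Margin of error = critical value times the standard error
    print(f"{level:.0%}: z = {z:.3f}, margin = {z * sem:.3f}")
```

The 90% value comes out near 1.645, which textbooks round to either 1.64 or 1.65.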
6. A histogram (mark the choice that is not correct)
A. Could be used to show the distribution of lengths of hospital stay in patients admitted with
acute asthma
B. Could be used to show a distribution of birth weights in babies
C. Is the best way to show the relationship between weight and blood sugar level in a sample of
diabetic patients
D. Conveys information about the spread of a distribution
7. A researcher is interested in obesity in the population. For this purpose he has to weigh a sample of
30 individuals. To halve the width of the confidence interval, the researcher would have to:
A. Weigh approximately 60 people instead of 30
B. Weigh approximately 120 people instead of 30
C. Weigh approximately 180 people instead of 30
D. Weigh approximately 270 people instead of 30
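As a rough illustration of question 7: if the width of a confidence interval for a mean is taken to be proportional to 1 / sqrt(n), with the standard deviation and confidence level held fixed, the required sample size can be sketched as below. The function name is hypothetical.

```python
def required_sample_size(n_current: float, width_factor: float) -> float:
    """Sample size needed to scale the CI width by width_factor.

    Assumes width is proportional to 1 / sqrt(n), so n scales with 1 / width_factor**2.
    """
    return n_current / width_factor ** 2

# Halving the width (factor 0.5) starting from n = 30
print(required_sample_size(30, 0.5))  # 120.0
```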
8. Which of the following statements about confidence intervals is incorrect?
A. If we keep the sample size fixed, the confidence interval gets wider as we increase the
confidence coefficient.
B. A confidence interval for a mean always contains the sample mean.
C. If we keep the confidence coefficient fixed, the confidence interval gets narrower as we
increase the sample size.
D. If the population standard deviation increases, the confidence interval decreases in width.
E. If the confidence intervals for two means do not overlap very much there is evidence that the
two population means are different.
9. The p-value:
A. Is the probability that the null hypothesis is false.
B. Is large for small studies.
C. Is the probability of the observed result, or one more extreme, if the null hypothesis were
true.
D. Is one minus the type I error.
E. Can only take a limited number of values such as 0.1, 0.05, 0.01 etc.
10. Non-parametric tests:
A. Are “distribution free”
B. Are very difficult to calculate
C. Are fit for large samples and serious distortion of the data
D. Are used to estimate at least one population parameter from our sample statistics
11. A correlation coefficient:
A. Always lies in the range 0-1
B. Could be used to summarize the relationship between hemoglobin concentration and blood
group in a sample of hospital patients
C. Is a measure of the extent to which two variables are linearly related
D. Can be used to predict one variable from another
12. Two prediction equations have been developed. In case (A), R=0,60 and in case (B), R=0,75. In
which case can we make a more accurate prediction?
A. In case (A)
B. In case (B)
C. In neither of the cases
13. A linear regression equation:
A. Is not affected by a change of scale
B. Minimizes the sum of differences between the observed values and the predicted ones
C. Can be used for prediction
D. Requires that the dependent variable is normally distributed
14. The coefficient of determination r^2:
A. Expresses the proportion of variance in the dependent variable explained by the
independent variable
B. Always lies in the range -1 to +1
C. Can be used to predict one variable from another
D. Is a useful way to describe the accuracy of a study
15. You want to develop a technique for predicting who is likely to develop postoperative
complications. What statistical technique should you use?
A. Chi square test
B. Multiple regression analysis
C. Correlation
D. Independent sample t test
16. Nationally, about 27% of households have a computer with a connection to the Internet. You think
this number is not valid for a description of your community, so you do a survey with the goal being to
collect data that will allow you to determine whether your suspicion is accurate. What is your null
hypothesis? What is the alternate hypothesis?
A. H0: µ=27; Halt: µ>27
B. H0: µ=27; Halt: µ<27
C. H0: µ=27; Halt: µ=27
D. H0: µ=27; Halt: µ=27
17. What is Type I error?
A. You commit a type I error if you reject a false null hypothesis
B. You commit a type I error if you reject a true null hypothesis
C. You commit a type I error if you accept a true null hypothesis
D. You commit a type I error if you accept a false null hypothesis
18. What is Type II error?
A. You commit a type II error if you reject a false null hypothesis
B. You commit a type II error if you reject a true null hypothesis
C. You commit a type II error if you accept a true null hypothesis
D. You commit a type II error if you accept a false null hypothesis
19. When should you do a t-test instead of a z-test?
A. When you have two means and the sample sizes are greater than or equal to 30
B. When you have two means and the sample sizes are smaller than 30
C. When you have more than two means
D. When you have two means and they are equal
20. Samples of students enrolled in two courses covering the same material received the following
grades on their term project:
Course A 75,6 55,9 63,5 66,8 87,6 93,0
68,4 79,9 83,7 35,8 45,5 72,4
Course B 55,4 69,5 72,7 56,8 46,4 52,7
62,8 66,4 60,1 29,7 37,8 77,2
A. Suppose you want to conclude from these data that the mean grade given in course A was
different from the mean grade given in course B. Which test is appropriate to do this, what are your null
and alternate hypotheses, and is this a 1-tailed or a 2-tailed test?
A. z-test, H0: µa=µb; Halt µa>µb, 1-tailed
B. t-test, H0: µa=µb; Halt µa>µb, 1-tailed
C. t-test, H0: µa=µb; Halt µa≠µb, 2-tailed
D. z-test, H0: µa=µb; Halt µa≠µb, 2-tailed
B. Suppose, you want to conclude from this data that the mean grade given in course A was higher
than the mean grade given in course B. Which test is appropriate to do this, what are your null and
alternate hypothesis, is this a one-tailed or two-tailed test?
A. z-test, H0: µa=µb; Halt µa>µb, 1-tailed
B. t-test, H0: µa=µb; Halt µa>µb, 1-tailed
C. t-test, H0: µa=µb; Halt µa≠µb, 2-tailed
D. z-test, H0: µa=µb; Halt µa≠µb, 2-tailed
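For readers who want to see the arithmetic behind question 20, here is a hedged sketch of a two-sample t statistic computed from the grades listed above. It assumes the pooled (equal-variance) form of the test, which the question does not specify, and writes the grades with decimal points. Whether the test is 1-tailed or 2-tailed changes only the critical value the statistic is compared against, not the statistic itself. `pooled_t_statistic` is an illustrative name.

```python
from math import sqrt
from statistics import mean, variance

# Grades from the question, decimal commas written as decimal points
course_a = [75.6, 55.9, 63.5, 66.8, 87.6, 93.0, 68.4, 79.9, 83.7, 35.8, 45.5, 72.4]
course_b = [55.4, 69.5, 72.7, 56.8, 46.4, 52.7, 62.8, 66.4, 60.1, 29.7, 37.8, 77.2]

def pooled_t_statistic(x, y):
    """Return (t, degrees of freedom) for a two-sample t-test with pooled variance."""
    n_x, n_y = len(x), len(y)
    df = n_x + n_y - 2
    # Pooled variance: weighted average of the two sample variances
    pooled_var = ((n_x - 1) * variance(x) + (n_y - 1) * variance(y)) / df
    t = (mean(x) - mean(y)) / sqrt(pooled_var * (1 / n_x + 1 / n_y))
    return t, df

t_stat, df = pooled_t_statistic(course_a, course_b)
print(f"mean A = {mean(course_a):.2f}, mean B = {mean(course_b):.2f}")
print(f"t = {t_stat:.3f}, df = {df}")
```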
21. What is the aim of standardization of the intensive indicators?
A. To remove the differences in the structure of the populations to be compared
B. To test the null hypothesis
C. To define the optimal number of cases for the investigation
D. To define the proportions of the different age groups
22. The paired t-test is:
A. Used for independent samples
B. Suitable for very small samples
C. Requires the assumption that differences between paired observations follow a normal
distribution
23. Fill in the missing words in the quote: “Statistical methods may be described as methods for
drawing conclusions about …… based on …… computed from the ……”
A. Statistics, samples, populations
B. Populations, parameters, samples
C. Statistics, parameters, samples
D. Parameters, statistics, populations
E. Populations, statistics, samples
24. Consider the following statement concerned with the collection of data, and determine the best
selection of terms to complete the statement: “The entire group of objects or people about which
information is wanted is called the ……. Individual members are called ……. The …… is the part that is
actually examined in order to gather information.”
A. Population, explanatory variables, subgroup
B. Whole, items of interest, stratum
C. Responsible, respondents, no response group
D. Sample, units, target population
E. Population, units, sample
25. The p-value:
A. Is the probability that the null hypothesis is false
B. Is the probability of the observed results, or one more extreme, if the null hypothesis were true
26. Nonparametric tests:
A. Are very difficult to calculate
B. Are “distribution free”
C. Are used to estimate at least one population parameter from our sample statistics
D. Are fit for small samples and interval/ratio-scaled data
27. Select the best statement concerning a scatterplot:
A. It provides a visual description of the distribution of potential sample means drawn from a
given population
B. It plots the distribution of potential confidence intervals in a data set
C. It plots the values of two continuous variables in a data set
D. It plots the standard errors of randomly selected data from a given population
E. It’s a method to convert skewed data to normal distributed data in a data set
28. A study finds that there is correlation of +0.7 between self-reported work satisfaction and life
expectancy in a random sample of 9200 teachers (p<0,001). This means that:
A. Work satisfaction is one factor involved in increasing one’s life expectancy
B. To live longer one should try to enjoy one’s work
C. There is a strong statistically positive association between work satisfaction and life
expectancy
29. Advantages of random sampling for studying the general human population include:
A. It can be applied to any population
B. Likely errors can be estimated
C. The sample can be referred to a known population
30. All of the following statements about the normal distribution are true except:
A. This distribution is represented by a symmetric bell shaped curve
B. A normally distributed random variable is also continuous
C. This distribution is a theoretical frequency distribution
D. This distribution may be skewed
E. This distribution may be used to calculate the normal range of clinical values for a diagnostic
test
31. A person’s highest educational level is which type of variable?
A. Nominal
B. Discrete numeric
C. Ordinal
D. Continuous
32. To analyze the relationship between blood pressure and height in a sample of students, we could
use:
A. Paired t-test
B. Chi-squared test
C. Correlation coefficient
D. Two sample t-test
E. Regression
33. Attempts to predict future health events by systematically comparing groups of human subjects
differentially exposed to agents of interest is called:
A. Clinical experience
B. Clinical guidelines
C. Scientific model
D. Outcomes research
34. For a t-test for two independent samples to be valid:
A. The numbers of observations must be approximately the same in the two groups
B. The means must be approximately the same in the two groups
C. Standard deviation of observations must be approximately the same in the two groups
D. The sizes of samples must be small
35. Which of the following is not a measure of spread of a distribution?
A. Range
B. Standard deviation
C. Interquartile range
D. Median
36. In a case-control study, 101 stroke patients were compared with 137 healthy controls. Among the
results were: cigarette smoking by stroke patients and healthy controls (ever smoked?): cases: 71 yes,
30 never, 101 total; controls: 36 yes, 101 never, 137 total; chi-squared = 45.5, 1 degree of freedom,
p < 0.0001, odds ratio = 6.6 (95% confidence interval = 3.8 to 11.8). Please choose the wrong
answer:
A. We can conclude that smoking causes stroke
B. It is estimated that the relative risk of stroke for ever-smokers compared to never-smokers is
6.6
C. There is good evidence that smokers have an increased risk of stroke
D. In the population from which cases and controls come, the odds ratio is estimated to lie
between 3.8 and 11.8
E. These data would be unlikely if smoking and stroke were unrelated
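The odds ratio and chi-squared value quoted in question 36 can be reproduced from the 2x2 table. The sketch below assumes the uncorrected Pearson chi-squared statistic (no continuity correction). The cell labels a, b, c, d are conventions introduced here, not taken from the question.

```python
def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio for a 2x2 table: (a / b) / (c / d)."""
    return (a * d) / (b * c)

def chi_squared_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-squared for a 2x2 table, without continuity correction."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# a = cases who ever smoked, b = cases who never smoked,
# c = controls who ever smoked, d = controls who never smoked
a, b, c, d = 71, 30, 36, 101

print(round(odds_ratio(a, b, c, d), 1))       # 6.6
print(round(chi_squared_2x2(a, b, c, d), 1))  # 45.5
```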
37. Researchers conducted a study on the weight difference of 95 subjects at baseline and after being
on a new diet for two months. These subjects were randomly selected from a nutrition class for
overweight patients. They report that mean weight loss was 5 pounds with a 95% confidence limit from
0.5 pounds to 9.5 pounds. Alpha was set at 0.05. Select the correct answer
A. The researchers should reject the null hypothesis
B. 95% of the subjects had a weight loss between 0.5 pounds and 9.5 pounds
C. The internal validity is limited given the study only lasted two months
D. A type II error could have occurred
38. The product-moment correlation coefficient between two variables, r:
A. Is expected to be zero when there is no relationship between the variables
B. Depends on which of the two variables is chosen to be the dependent variable
C. Measures the magnitude of the change in one variable associated with a change in the other
D. Can only have a valid significance test carried out when one variable follows a normal
distribution
39. When comparing the means of two samples using a t-test:
A. The sample sizes must be equal
B. The null hypothesis is that the means are not equal
C. The data have to be normally distributed
40. A linear regression equation:
A. Requires that the dependent variable is normally distributed
B. Is not affected by a change of scale
C. Can be used for prediction
D. Minimizes the sum of differences between the observed values and the predicted ones
41. The coefficient of determination r^2:
A. Always lies in the range -1 to +1
B. Can be used to predict one variable from another
C. Expresses the proportion of variance in the dependent variable explained by the independent
variable
D. Is a useful way to describe the accuracy of a study
42. In a case-control study, patients with lung cancer had a highly significantly lower cholesterol level
than did controls. This provides strong evidence that:
A. There is evidence for a relationship between low cholesterol and lung cancer in the sampled
population
B. Low cholesterol and lung cancer always go together
C. Low cholesterol is not related to lung cancer
D. Low cholesterol is a risk factor for the development of lung cancer
43. In measuring the center of data from a skewed distribution, the median would be preferred over
the mean for most purposes:
A. The mean measures the spread in the data
B. The median is the most frequent number whereas the mean is most likely
C. The median is less than the mean and smaller numbers are always appropriate for the center
D. The median measures the arithmetic average of the data excluding outliers
44. A surrogate end point is:
A. A measurement to extrapolate the results from clinical trials into clinical practice
B. A laboratory measurement or a physical sign used as a substitute for a clinically meaningful
end point
45. Select the correct statement concerning relative risk and odds ratio:
A. A relative risk of 20 has the same strength of association as a relative risk of 0,05
B. As a general statement, there is less confounding when calculating relative risk vs. an odds
ratio
C. Coefficients from logistic regression analysis yield relative risks
D. One should not calculate a relative risk when the data are from a retrospective cohort study
E. Underlying data should be normally distributed to calculate relative risk
46. The value “187 cm” can be presented on:
A. Ordinal scale
B. Interval scale
C. Ratio scale
D. Nominal scale
47. The following are not nominal variables:
A. Presence or absence of blood in sputum
B. International classification of disease code
C. Size of a tumor in centimeters
D. Racial group
48. To analyze the relationship between blood pressure and height in a sample of students we could
use:
A. Paired t-test
B. Regression
C. Correlation coefficient
D. Chi-square test
E. Two sample t-test
49. The paired t-test:
A. Requires the assumption that differences between paired observations follow a normal
distribution
B. Is used for independent samples
C. Suitable for very small samples
50. What type of data is formed by the figures a physician has generated regarding the number of
cigarettes her/his patients smoke?
A. Continuous
B. Ratio
C. Nominal
D. Ordinal
E. Interval
51. A researcher wishes to compare the effects of three different antiretroviral drug combinations on
the survival time of two groups of patients with AIDS; one group are IV drug abusers, the others are
infants infected in utero. Each of these groups is divided into four subgroups; each subgroup is given a
different drug combination. Which statistical technique would be most appropriate for analyzing the
results of this study?
A. Independent sample t-test
B. Paired sample t-test
C. Regression
D. Analysis of variance (ANOVA)
E. Chi-square
52. A researcher wants to find out if there is a difference between cholesterol level of male and female
patients. He measures the cholesterol level of 25 male and 25 female patients. What will be the most
appropriate test for the data analysis?
A. Two samples t-test
B. Paired sample t-test
C. ANOVA
D. Chi-square test
53. An investigator finds that 66 of 250 pregnant women at high risk of hypertension who take 100 mg of
aspirin daily are less likely to develop hypertension compared to 88 women that do not take aspirin.
What statistical technique should be used to test the null hypothesis that there is no difference
between these proportions?
A. t-test
B. Correlation with associated t-test
C. Chi-square test
D. Analysis of variance (ANOVA)
54. A researcher wants to find out whether there is a difference between the behavior status of 50 patients after
the application of a new therapy. The behavior status was measured before and after the therapy.
What will be the most appropriate test for the data analysis?
A. Two sample t-test
B. Paired sample t-test
C. ANOVA
D. Chi-square test
55. You want to develop a technique for predicting who is likely to develop postoperative
complications. What statistical technique should you use?
A. Chi-squared test
B. Independent sample t-test
C. Paired sample t-test
D. Regression analysis
E. Correlation
56. A researcher wants to assess the efficiency of a new drug treatment. He studies five groups of
individuals: medication-one, medication-two, medication-three, conventional treatment and placebo.
What will be the most appropriate test for the data analysis?
A. Independent sample t-test
B. Paired sample t-test
C. Analysis of variance (ANOVA)
D. Chi-square
E. Regression analysis
57. The 99% CI in a study is 13 to 19. This means that:
A. The probability that the population mean will fall outside the interval 13 to 19 is 0.99
B. The probability that the interval 13 to 19 will include the population mean is 0.05
C. This is not a valid CI
D. The probability that the population mean will fall within the interval 13 to 19 is 0.95
E. The probability that the interval 13 to 19 will include the population mean is 0.99
58. A researcher is interested in obesity in the population. For this purpose he has to weigh a sample
of 30 individuals. To halve the width of the confidence interval, the researcher would have to:
A. Weigh approximately 60 people instead of 30
B. Weigh approximately 120 people instead of 30
C. Weigh approximately 180 people instead of 30
D. Weigh approximately 270 people instead of 30
59. A random sample of 471 nurses in Varna was selected and several variables are recorded for each
of them. Which of the following is not correct?
A. Nurse’s total income is a ratio-scaled variable
B. Educational level was coded as 1=professional, 2=bachelor, 3=master, 4=doctoral, and it is a
ratio-scaled variable
C. The gender of the patients in their clinic is a nominal scaled variable
D. The number of visits in the clinic is a discrete variable
60. Proportions are used to present the structure of the phenomena and rates are used for
investigation of the frequency of the occurrence of given event or phenomena.
A. True
B. False
61. What is the relation between sample size and standard error?
A. Bigger samples produce bigger standard errors
B. Smaller samples produce smaller standard errors
C. Increasing the sample size by a factor of C increases the standard error by a factor of one over
the square root of C
D. Increasing the sample size by a factor of C decreases the standard error by a factor of one
over the square root of C
62. In general, which of the following statements is not correct?
A. If a distribution is normal, then the mean, the median and the mode are equal
B. The sample mean is more sensitive to extreme values than the mode
C. Sample standard deviation is a measure of central tendency around the mean
D. The sample range is more sensitive to extreme values than the standard deviation
E. The sample standard deviation is a measure of spread around the sample mean
63. The power of the test is (mark the answer that is not correct):
A. The probability of rejecting the null hypothesis when it is false
B. One minus type II error
C. One minus type I error
D. Probability of not committing a type II error
64. Standardized indicators are conditional indicators
A. True
B. False
65. What type of data is formed by the figures the physician has generated regarding the number of
visits her patients have had during the last six months?
A. Nominal
B. Ordinal
C. Interval
D. Ratio
66. What is the level of significance at which the null hypothesis is accepted with 99% confidence?
A. p<0,05
B. p>0,05
C. p>0,01
D. p<0,01
67. Which one of the following measurements is not a quantitative scale?
A. Height in centimeters
B. Stage of cancer disease
C. Number of people exposed to passive smoking
D. IQ score
68. Type I error is also called the level of significance, alpha.
A. True
B. False
69. The Pearson product moment correlation coefficient (r):
A. Always lies in the range from 0 to 1
B. Could be used to summarize the relationship between hemoglobin concentration and blood
group in a sample of hospital patients
C. Is a measure of the extent to which two interval/ratio variables are linearly related
D. Can be used to predict one variable from another
70. Which one of the following statements is false?
A. Pie charts are better than bar graphs for comparing relative sizes
B. Data that are nominal scale are presented using frequency tables
C. Means and standard deviation of nominal data are meaningless
D. The scatter plot is the basic graphic tool for investigating relationships between two interval or
ratio scaled variables
E. Box plots are a good choice for comparing the distribution of values among groups
71. A medical school professor states that the correlation coefficient between students' final
examination grades and the number of times they attended seminars is 1.12. This means that:
A. A student will improve his or her grade by attending seminars regularly
B. The lecturer has reported the correlation coefficient incorrectly
C. 65% of the variation in final grades is accounted for by seminar attendance
D. The correlation is too low to be of significance
E. 80% of the variation in final grade is accounted for by seminar attendance
72. Say you want to know how many books the average Varna university student reads per month. Say
you have taken a random sample of 144 Varna university students, and that the mean number of books
for the ones in your sample is 10 and the standard deviation is 3. What is your best estimate for the
mean number of books for all students?
A. 3
B. 10
C. 12
D. 30
73. In a sample of 150 patients with depression who are currently taking a specific antidepressant
medication, it is found that depression and anti-depressant drug dosage correlate r=-0.7, p<0.05. It is
correct to conclude all of the following except:
A. The relationship between drug dosage and depression is unlikely to be due to chance
B. The relationship between drug dosage and depression is a moderate negative one
C. Although other factors are clearly involved, drug dosage is one factor causing these
patients' depression to be reduced
D. Drug dosage accounts for 49% of the variation in depression
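Several questions above rely on two properties of the Pearson correlation coefficient: r must lie between -1 and +1, and r squared gives the proportion of variance explained. A small sketch of both checks follows; `coefficient_of_determination` is a hypothetical helper name.

```python
def coefficient_of_determination(r: float) -> float:
    """Return r squared, rejecting values of r outside the valid range [-1, 1]."""
    if not -1.0 <= r <= 1.0:
        raise ValueError(f"r = {r} is not a valid correlation coefficient")
    return r ** 2

# r = -0.7 from question 73
print(round(coefficient_of_determination(-0.7), 2))  # 0.49

# r = 1.12 from question 71 cannot be a correlation coefficient
try:
    coefficient_of_determination(1.12)
except ValueError as err:
    print(err)
```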