Genome-wide association studies (GWAS) analyse a genome-wide set of genetic variants in different individuals to see whether any are associated with a trait or disease. Such studies are growing ever larger and, in some cases, involve millions of participants. This means that researchers can detect smaller and smaller effects, increasing the number of genes they can link to a disease or trait.
«It is good for us, because it allows us to understand much more about how genetics influences our make-up, behaviour, and disease status,» says Dr Andrea Ganna, from the Institute for Molecular Medicine Finland, Helsinki, Finland, who will present his team's research to the 53rd annual conference of the European Society of Human Genetics, being held entirely on-line due to the Covid-19 pandemic, today [Monday]. «But this good news comes with a downside. These large numbers mean that biases can creep in and affect our results. The most difficult of these to control is participation bias: when the people who participate in a study are not a random sample but have something in common that is linked to their participation.
«To give an extreme example, if we were to use the participants in a professional basketball team to understand how tall or fit people are, the results would not be at all representative of the general population. But even low-level participation bias can skew results,» says Dr Ganna.
Recent studies of people who have participated more than once in a genetic study have shown, for example, that repeat participation correlates with level of educational attainment. The researchers set out to characterise better the consequences of this type of bias. To do so, they needed a trait that they were certain was not genetically determined by the non-sex chromosomes (the autosomes), so that they could be sure in advance that no association with those genes existed.
«The only area where we felt certain that genetics outside the sex chromosomes was not involved was the genetic differences between males and females,» commented co-author Dr Nicola Pirastu, from the University of Edinburgh, Edinburgh, United Kingdom. «Therefore our analyses should have come out completely negative.»
The team carried out an association study of data from over three million individuals¹, looking at which genetic variants differed in frequency between the males and females who took part. «To our surprise, we found over 150 loci with such differences. For example, we saw more body mass index-raising alleles² among men than women, suggesting that genetically higher-weight women were less likely to participate in population studies than men. This can only have been related to differences in the characteristics that drive men and women to participate. And we saw a similar effect in different cohorts, which confirmed our hypothesis,» said Dr Pirastu.
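The reasoning behind this result can be illustrated with a toy simulation. The sketch below is not the researchers' analysis. It assumes a single hypothetical autosomal variant with the same allele frequency in both sexes, a hypothetical trait that the variant raises, and a made-up rule under which only women's participation depends on that trait. The function name `simulate` and the parameters `freq`, `beta` and `gamma` are all illustrative choices.

```python
import math
import random


def simulate(n=100_000, freq=0.3, beta=0.5, gamma=1.0, seed=1):
    """Return the male-minus-female allele frequency difference,
    first in the whole simulated population, then among participants only."""
    rng = random.Random(seed)
    # For each sex: [number of trait-raising alleles, number of people]
    population = {"M": [0, 0], "F": [0, 0]}
    participants = {"M": [0, 0], "F": [0, 0]}

    for _ in range(n):
        sex = "M" if rng.random() < 0.5 else "F"
        # Autosomal genotype (0, 1 or 2 copies), drawn identically for both sexes
        genotype = (rng.random() < freq) + (rng.random() < freq)
        # Hypothetical trait: each allele copy raises it by beta, plus noise
        trait = beta * genotype + rng.gauss(0.0, 1.0)

        # Assumed participation rule: men join with probability 0.5 regardless
        # of the trait; women with a higher trait value are less likely to join
        if sex == "F":
            p_join = 1.0 / (1.0 + math.exp(gamma * trait))
        else:
            p_join = 0.5

        population[sex][0] += genotype
        population[sex][1] += 1
        if rng.random() < p_join:
            participants[sex][0] += genotype
            participants[sex][1] += 1

    def freq_difference(counts):
        male = counts["M"][0] / (2 * counts["M"][1])
        female = counts["F"][0] / (2 * counts["F"][1])
        return male - female

    return freq_difference(population), freq_difference(participants)


if __name__ == "__main__":
    pop_diff, part_diff = simulate()
    print(f"Whole population, male minus female frequency: {pop_diff:+.4f}")
    print(f"Participants only, male minus female frequency: {part_diff:+.4f}")
```

Under these assumptions the allele is equally common in men and women in the whole population, yet it appears more common in men among participants, purely because of who chose to take part. An association study of sex run on the participants alone would therefore flag the variant even though it has no effect on sex.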
These findings emphasise how important it is for scientists to design studies carefully and to choose cases and controls meticulously when conducting genetic studies. In order to draw useful conclusions, the risk of participation bias should be minimised. If these kinds of bias exist in a study involving men and women, it will be far more difficult to distinguish between true results and those arising from biases when looking at disease. «For example, in the recent pandemic we know that those people who have been tested for Covid-19 were not chosen at random and share common characteristics, so making the right choice of controls to be used to understand if there are any genetic determinants involved is very important. I think our study shows what the risks are if this is not done,» says Dr Ganna.
At the moment, all the evidence indicates that participation biases are mild enough not to be a major problem. But it is important to take them into account when planning the collection of data from large cohorts and when data from participants are collected at multiple time points. Genotyping a random set of the population, for example from the blood spots collected at birth, would be a good way of further verifying whether these biases exist. «We have shown that it would allow us to correct the statistical analyses. In addition, it would cost very little in comparison with what it has cost to date to create these studies. We really need to do this if we are to be able to draw the right conclusions from our analyses,» Dr Ganna will conclude.
Chair of the ESHG conference, Professor Joris Veltman, Dean of the Biosciences Institute at Newcastle University, Newcastle upon Tyne, UK, said: "This fascinating study shows us how important it is to be aware of unexpected biases in participation in genetic association studies as well as other large scale 'population' studies, as this can significantly impact results if not properly corrected for."
1. Data from the UK Biobank, 23andMe, iPSYCH, FinnGen, and BioBank Japan.
2. An allele is a variant form of a gene.
Abstract no: C21.1. A genome-wide association study of sex at birth in 3 million individuals reveals widespread sex-differential participation bias with potential implications for GWAS interpretation