Software

Software for analyzing biobank data

We have developed an R package, qgg, for large-scale quantitative genetic analyses of complex traits. It is publicly available on CRAN and GitHub, with online tutorials and a scientific publication in Bioinformatics.

qgg provides an infrastructure for efficient processing of large-scale genotype and phenotype data, including core functions for:

  • fitting linear mixed models
  • estimating genetic parameters (heritability and correlation)
  • genomic prediction using Bayesian linear regression methods
  • single marker association analysis
  • gene set enrichment analysis
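
To make the single marker association item concrete, here is a minimal sketch of the idea in Python. It is not qgg's code or API: it assumes a plain least-squares regression of the phenotype on one marker, with no covariates and no correction for relatedness, and the function name `single_marker_test` and the toy data are hypothetical.

```python
import math


def single_marker_test(genotypes, phenotypes):
    """Regress a phenotype on one marker by ordinary least squares.

    Returns (slope, standard_error, t_statistic) for the marker effect.
    """
    n = len(genotypes)
    if n != len(phenotypes) or n < 3:
        raise ValueError("need matching vectors with at least 3 observations")

    mean_x = sum(genotypes) / n
    mean_y = sum(phenotypes) / n

    # Centered sums of squares and cross-products
    sxx = sum((x - mean_x) ** 2 for x in genotypes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(genotypes, phenotypes))
    if sxx == 0:
        raise ValueError("marker is monomorphic; effect is not estimable")

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual variance with n - 2 degrees of freedom (intercept + slope)
    rss = sum((y - intercept - slope * x) ** 2
              for x, y in zip(genotypes, phenotypes))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    t_stat = slope / std_err if std_err > 0 else math.inf
    return slope, std_err, t_stat


# Toy data: allele counts (0, 1, 2) and a quantitative phenotype
toy_genotypes = [0, 1, 2, 0, 1, 2]
toy_phenotypes = [1.0, 2.1, 2.9, 1.2, 1.9, 3.1]
slope, std_err, t_stat = single_marker_test(toy_genotypes, toy_phenotypes)
print(f"effect={slope:.3f} se={std_err:.3f} t={t_stat:.2f}")
```

In a genome-wide scan the same test would be repeated for every marker, which is why batch access to genotype data matters at biobank scale.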

qgg handles large-scale data using efficient algorithms and by taking advantage of:

  • multi-core processing using OpenMP
  • multithreaded matrix operations implemented in BLAS libraries (e.g. OpenBLAS, ATLAS or MKL)
  • fast and memory-efficient batch processing of genotype data stored in binary files (e.g. PLINK bedfiles)
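
The batch-processing idea in the last item can be sketched as follows. This Python example is an illustration of the general technique, not qgg's implementation: it assumes the standard SNP-major PLINK 1 .bed layout (a 3-byte header followed by 2 bits per genotype, four samples per byte), and the names `iter_bed_batches` and `CODE_TO_DOSAGE`, as well as the choice to report the count of the first allele, are hypothetical.

```python
import io

# PLINK 1 .bed header: two magic bytes plus 0x01 for SNP-major order
BED_MAGIC = bytes([0x6C, 0x1B, 0x01])

# 2-bit genotype code -> copies of the first allele in the .bim file
# (assumed convention for this sketch); code 0b01 means missing
CODE_TO_DOSAGE = {0b00: 2, 0b01: None, 0b10: 1, 0b11: 0}


def iter_bed_batches(stream, n_samples, n_variants, batch_size):
    """Yield batches of variants, each variant a list of per-sample dosages.

    Only batch_size variants are held in memory at any one time.
    """
    if stream.read(3) != BED_MAGIC:
        raise ValueError("not a SNP-major PLINK 1 .bed stream")

    # Each variant is padded to a whole number of bytes
    bytes_per_variant = (n_samples + 3) // 4

    for start in range(0, n_variants, batch_size):
        n_in_batch = min(batch_size, n_variants - start)
        raw = stream.read(n_in_batch * bytes_per_variant)
        if len(raw) != n_in_batch * bytes_per_variant:
            raise ValueError("stream ended before all variants were read")

        batch = []
        for v in range(n_in_batch):
            row = raw[v * bytes_per_variant:(v + 1) * bytes_per_variant]
            # Sample s sits in byte s // 4, starting from the lowest two bits
            dosages = [
                CODE_TO_DOSAGE[(row[s // 4] >> (2 * (s % 4))) & 0b11]
                for s in range(n_samples)
            ]
            batch.append(dosages)
        yield batch


# Tiny in-memory example: 6 samples, 3 variants, read 2 variants at a time
demo = io.BytesIO(BED_MAGIC + bytes([0xDC, 0x0F, 0xE7, 0x0F, 0x6B, 0x01]))
batches = list(iter_bed_batches(demo, n_samples=6, n_variants=3, batch_size=2))
for batch in batches:
    print(batch)
```

With a real file, the stream would come from `open(path, "rb")`, and the sample and variant counts from the accompanying .fam and .bim files. Reading a fixed number of variants per step keeps memory use bounded regardless of how many markers the file holds.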