Bioinformatics

For more details about the Bioinformatics theme within IHIW16, click on each of the following subthemes to learn about the project leaders, project description, milestones, data required, and more.

Data Standards Hackathon for Next Generation Sequencing (DaSH for NGS)

Project Leaders:

Steven Mack
Martin Maiers
Kazutoyo Osoegawa

Detailed Project Description: Development of methods, standards, software tools, and online services to foster the standardized analysis, collection, exchange and storage of highly polymorphic immune-related genetic data for the purposes of basic and clinical research, the advancement of medical therapies, and understanding the genomics of the vertebrate immune system. A specific focus of this project involves development of pangenomic graphs for the HLA, KIR and LILR genes.

A key goal of this project is the development of an IHIWS database (dbIHIW) structure and system that fosters community data access and database continuity across subsequent IHIWS iterations.

Milestones in Years:

2023: Community outreach, enrolment and new project formulation, continuation of existing projects, dbIHIW design.

2024: Construction of datasets, satellite DaSH meetings, publication of papers and standards, continued enrolment, continued dbIHIW design and implementation.

2025: Identification of sub-projects, satellite DaSH meetings, publication of papers and standards, continued enrolment, dbIHIW data migration and population.

2026: Presentation and dbIHIW launch

Data Required (number, type of data, inclusion/exclusion criteria): TBD, but will include existing data from prior DaSH efforts along with data from prior IHIW efforts, going back to at least the 13th IHIW (e.g., dbMHC), and including data from the 17th and 18th IHIW efforts.

Samples required (if applicable, number, type of samples, inclusion/exclusion criteria): No biological samples will be *required* as part of this project, but data describing and generated from current and prior IHIW efforts will be requested.

Reagents/Additional Assays Required: NONE

Data Infrastructure Required: We look forward to discussing this with the 19th IHIW organizers and data team. Certainly AWS/Cloud resources will be used for development.

SNP-HLA Reference Consortium (SHLARC)

Project Name: SNP-HLA Reference Consortium (SHLARC)

Project Leaders: Nicolas Vince, Pierre-Antoine Gourraud

Detailed Project Description:

Over the past 15 years, genome-wide association studies (GWAS) have identified more than 10,000 associations. In particular, the HLA genomic region stands out as the most highly associated locus in GWAS, predominantly in immune-related diseases. SNPs are the hallmark of GWAS; however, the information carried by this type of genetic marker is very limited, especially in the HLA region, where linkage disequilibrium (LD; the non-random association of alleles at different loci) is strong and extends over several megabases. To advance our understanding of functional mechanisms and potentially identify therapeutic targets, we must move beyond these simple associations, especially when dealing with HLA alleles. However, HLA typing techniques are expensive, require specialized laboratory infrastructure, and are in constant evolution.

Recent developments in statistical inference now enable us to impute HLA alleles from genotyped GWAS SNPs. Successful implementation of this technique relies on the availability of adequate reference panels for imputation. The objective of this project is to create diverse reference panels that enhance HLA imputation accuracy from GWAS datasets. To achieve this goal, we still need to:

1- Collect additional HLA and SNP data from numerous sources.

2- Improve our understanding of how diverse haplotypes and populations influence HLA imputation accuracy.

3- Maintain a digital platform (SHLARC, the SNP-HLA Reference Consortium: https://hla.univ-nantes.fr) accessible to scientists for their own data imputation needs.

In practice, we have gathered more than 10,000 samples from several sources, including public data (the 1000 Genomes Project), semi-public data (via access to the dbGaP and EGA data repositories), and direct collaborations. These latter datasets come from diverse ancestry backgrounds, such as Brazil (European + African + Native American), Benin (African), and various European-ancestry cohorts (Western Europe, USA, Finland). We remain open to expanding the diversity of our data sources.

We have also developed a freely accessible online platform to perform HLA imputation using the datasets mentioned above: https://hla.univ-nantes.fr.

Milestones in years:

2023: Launch of the SHLARC website.
2024: Joint SHLARC/SIP workshop in Nantes, France (around September).
2026: Final Report on database diversity, HLA imputation performance, and applicability for research projects.

Data Required (number, type of data, inclusion/exclusion criteria):

Several types of data are suitable, but all must contain both SNP genotypes and molecular HLA typing, at second-field resolution or higher, for all HLA genes.

SNP genotypes: all types of GWAS chip data, sequencing data covering 500 kb around HLA genes, or whole-genome sequencing (WGS) data.

Minimal HLA typing resolution: second-field. HLA can also be called from WGS data.
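As an illustration of the minimal-resolution criterion above, the following is a minimal sketch of how a contributor might pre-screen allele names before submission. It is not part of the SHLARC platform: the function names `field_count` and `meets_second_field` and the regular expression are assumptions made for this example. It relies only on the standard HLA nomenclature convention that fields are separated by colons (e.g. A*02:01 has two fields). Ambiguity codes such as G or P groups are deliberately not handled and are reported as unparseable.

```python
import re

# Assumed pattern for an allele name such as "HLA-A*02:01" or "DRB1*15:01:01:01":
# optional "HLA-" prefix, gene name, "*", colon-separated numeric fields,
# and an optional expression suffix letter (N, L, S, C, A or Q).
_ALLELE_RE = re.compile(r"^(?:HLA-)?([A-Z0-9]+)\*(\d+(?::\d+)*)([NLSCAQ])?$")


def field_count(allele: str) -> int:
    """Return the number of colon-separated fields, or 0 if the name does not parse."""
    match = _ALLELE_RE.match(allele.strip())
    if match is None:
        return 0
    return len(match.group(2).split(":"))


def meets_second_field(allele: str) -> bool:
    """True if the allele is typed at second-field resolution or higher."""
    return field_count(allele) >= 2


if __name__ == "__main__":
    # Hypothetical typing results, used only to demonstrate the check.
    for name in ["HLA-A*02:01", "B*07", "DRB1*15:01:01:01"]:
        print(name, meets_second_field(name))
```

A first-field call such as B*07 fails the check, while any call with two or more fields passes.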

Data Infrastructure Required:

Data infrastructure will be hosted in the Nantes Université data center. Additionally, we will use our local high-throughput computing center (Glicid, Nantes Université) to build reference panels on high-performance GPUs (NVIDIA A100).

Pharmacogenetics