Pangenome: Full set of genomic elements in a given species or clade

The pan-genome represents the entire set of genes within a species. It consists of a core genome, containing sequences shared between all individuals of the species, and a dispensable genome, containing sequences found in only some individuals.

The concept of a pan-genome was first described by Tettelin et al. in 2005, in the context of bacteria.

A pangenome has been defined as a “core genome containing genes present in all strains and a dispensable genome composed of genes absent from one or more strains and genes that are unique to each strain”.

A pan-genome contains two types of genes: core genes, which are shared by all individuals, and distributed genes, which are shared by some but not all individuals.
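
The core/distributed split can be illustrated with a small sketch. It assumes genes have already been grouped into families and that each strain is described by the set of families it carries; the strain and gene names below are hypothetical toy data, and the function name `partition_pangenome` is illustrative only.

```python
from collections import Counter


def partition_pangenome(strain_genes):
    """Split gene families into pan, core, distributed, and strain-unique sets.

    strain_genes maps a strain name to the set of gene families it carries.
    """
    n_strains = len(strain_genes)
    # Count in how many strains each gene family occurs.
    counts = Counter(g for genes in strain_genes.values() for g in genes)
    pan = set(counts)
    core = {g for g, c in counts.items() if c == n_strains}
    distributed = pan - core  # present in some but not all strains
    unique = {g for g, c in counts.items() if c == 1}  # found in one strain only
    return {"pan": pan, "core": core, "distributed": distributed, "unique": unique}


# Hypothetical toy data: three strains and four gene families.
strains = {
    "strain_A": {"geneA", "geneB", "geneC"},
    "strain_B": {"geneA", "geneB", "geneD"},
    "strain_C": {"geneA", "geneC"},
}
result = partition_pangenome(strains)
```

On this toy input, `geneA` is the only core gene, while `geneB`, `geneC`, and `geneD` are distributed, with `geneD` unique to one strain. Real analyses typically relax the "all individuals" threshold to tolerate assembly and annotation gaps, which this sketch does not attempt.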

The first pan-genomes were developed for small, easy-to-sequence bacteria, but, even in that context, pan-genomes provided novel scientific insights.

Restricting the pan-genome to gene content makes less sense in eukaryotes, particularly those with large genomes (>500 Mb), where more than 50% of the genome may be intergenic.

Pan-genome studies that consider protein-coding genes can use protein sequence conservation in addition to DNA sequence to determine whether genes are homologous.
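
A toy comparison shows why protein-level conservation can be informative. The sketch below assumes two already-aligned, equal-length coding sequences (hypothetical examples) and uses the standard genetic code; real pipelines would rely on dedicated alignment and homology-search tools rather than this per-position identity.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding DNA sequence whose length is a multiple of three."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Hypothetical sequences that differ only at synonymous (third-codon) sites.
seq1 = "ATGGCTAAA"
seq2 = "ATGGCCAAG"

dna_identity = identity(seq1, seq2)
protein_identity = identity(translate(seq1), translate(seq2))
```

Here the two DNA sequences differ at two of nine positions, yet both translate to the same peptide, so the protein comparison reports them as identical.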

Pangenomes are frequently formulated as sequence graphs, which are mathematical graphs that represent the homology relationships between multiple sequences.
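
As a minimal sketch of the idea, a sequence graph can be modeled as labeled nodes (sequence segments) plus one walk per genome. The node IDs, segment sequences, and haplotype names below are hypothetical, and strand orientation, which real graph formats track, is ignored for simplicity.

```python
# Each node holds a sequence segment; homologous sequence shared between
# genomes is stored once and reused by every walk that passes through it.
nodes = {1: "ACGT", 2: "A", 3: "G", 4: "TTCA"}

# Each genome is a walk through the graph (hypothetical haplotypes).
paths = {
    "hap1": [1, 2, 4],
    "hap2": [1, 3, 4],
}


def edges_from_paths(paths):
    """Collect the directed edges implied by consecutive nodes in each walk."""
    edges = set()
    for walk in paths.values():
        edges.update(zip(walk, walk[1:]))
    return edges


def spell(nodes, walk):
    """Reconstruct a genome's sequence by concatenating its node labels."""
    return "".join(nodes[n] for n in walk)


def shared_nodes(paths):
    """Nodes visited by every walk, i.e. sequence common to all genomes."""
    return set.intersection(*(set(walk) for walk in paths.values()))


edges = edges_from_paths(paths)
hap1_seq = spell(nodes, paths["hap1"])
hap2_seq = spell(nodes, paths["hap2"])
common = shared_nodes(paths)
```

The two walks diverge at nodes 2 and 3, forming a "bubble" that encodes a single-base difference, while nodes 1 and 4 are traversed by both genomes.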

Pangenomic studies might improve the completeness of the human reference genome and promote precision medicine.