Research | Paneec

Only a few steps left before a pangenome (© IRD – Tiphaine Chevallier)

Cultivated and wild plants face increasingly rapid and far-reaching upheavals (climatic, anthropogenic, sociological, ecological, etc.), which makes their resilience and their capacity to evolve and adapt a major concern. To anticipate how cultivated plants will evolve, we first need to understand how evolutionary forces (drift, selection, mutation, etc.) shape genetic diversity and structure. Technological advances are changing our understanding of diversity and its evolution. Early diversity studies were limited by the number and characteristics of the markers used. In the era of high-throughput sequencing, the limit is no longer the number of markers available but the reference used to analyze them. The standard practice is to use a single genome as a reference and compare its sequence with that of other individuals, so any diversity absent from the reference individual is ignored. A growing number of studies show that this ignored diversity can be substantial (e.g. 1.2 Gb across 251 rice individuals, for a reference genome of ~400 Mb; Shang et al., 2022). Accessing this diversity could prove decisive for our understanding of the mechanisms of plant evolution.

Rice diversity – From Orjuela et al. (2014)

The pangenome concept emerged from the limitations of using a single reference. A pangenome is a collection of genomes from different individuals, which can be partitioned into a core genome and a dispensable genome. The core genome brings together the sequences present in all individuals, while the dispensable genome contains those present in only some individuals. The pangenome is the combination of the two and thus represents the total genetic diversity. Moving from a genome to a pangenome avoids neglecting the diversity absent from a single reference. The challenge now lies in describing the pangenome and using it to address questions of plant adaptation to global changes.
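The core/dispensable definitions above can be illustrated with plain set operations. The following is a minimal sketch, assuming each individual is summarized as a set of sequence identifiers; the function name `partition_pangenome`, the individual names and the sequence labels are hypothetical and only serve the illustration.

```python
from typing import Dict, Set, Tuple


def partition_pangenome(
    presence: Dict[str, Set[str]],
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Split sequences into (core, dispensable, pangenome) sets.

    `presence` maps an individual to the set of sequences it carries.
    """
    if not presence:
        return set(), set(), set()
    genomes = list(presence.values())
    pangenome = set().union(*genomes)   # sequences seen in at least one individual
    core = set.intersection(*genomes)   # sequences shared by every individual
    dispensable = pangenome - core      # sequences missing from at least one individual
    return core, dispensable, pangenome


# Hypothetical toy data: three individuals, five sequences.
presence = {
    "ind_A": {"s1", "s2", "s3"},
    "ind_B": {"s1", "s2", "s4"},
    "ind_C": {"s1", "s3", "s5"},
}

core, dispensable, pangenome = partition_pangenome(presence)
print(sorted(core))         # ['s1']
print(sorted(dispensable))  # ['s2', 's3', 's4', 's5']
print(len(pangenome))       # 5
```

With a single reference such as `ind_A`, the sequences `s4` and `s5` would never be observed; they only appear once the union over individuals is considered.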

The concept of pangenome – From Durant (2022)

The PANEEC team (PANgenome, Evolution, ECosystem) aims to (i) develop methodological tools to better characterize pangenomes and (ii) study the dynamics of pangenomes in relation to evolutionary questions such as the domestication of cultivated plants or their adaptation and vulnerability to global changes.

(i) Methodology

While several tools have been developed to build pangenomes, very few can work with them efficiently, particularly with pangenome graphs. Within the PANEEC team, we focus on the topology of these graphs, especially their structure and the metadata they contain, to extract biological and evolutionary insights. To achieve this, we contribute to the development of dedicated tools:

Work in progress

  • Graph construction and manipulation tool (GraTools)
  • Graph visualization tools (Panache, Savanache)
  • Annotation transfer tool for graphs (GrAnnoT)
  • Graph enrichment tool (FrangiPANe)
  • Genotyping tool based on graphs
  • GWAS on graphs tool
  • Graph statistics computation tool (in collaboration with Egglib)
  • Recombination detection tool for graphs
  • Graph storage and query tool
  • Graph quality estimation tool

Work planned

  • Graph comparison tool
  • Accounting for ploidy in graph construction
  • Graph construction from pool-seq data

These tools aim to characterize the diversity of structural variations and core/dispensable diversity. This diversity will then be used to address evolutionary questions.
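As an illustration of how core/dispensable diversity can be read from a graph, here is a minimal sketch. It assumes a simplified representation in which each individual is a path (a list of node identifiers) and each node has a sequence length in bp. The function names and toy data are hypothetical and do not correspond to the API of any of the tools listed above.

```python
from collections import Counter
from typing import Dict, List


def node_path_counts(paths: Dict[str, List[str]]) -> Counter:
    """Count, for each node, how many paths (individuals) traverse it."""
    counts: Counter = Counter()
    for nodes in paths.values():
        # set(): a node visited several times by one path is counted once
        counts.update(set(nodes))
    return counts


def core_fraction(paths: Dict[str, List[str]], node_length: Dict[str, int]) -> float:
    """Fraction of the graph sequence (in bp) carried by every path."""
    counts = node_path_counts(paths)
    n_paths = len(paths)
    total_bp = sum(node_length[node] for node in counts)
    core_bp = sum(node_length[node] for node, c in counts.items() if c == n_paths)
    return core_bp / total_bp if total_bp else 0.0


# Hypothetical toy graph: four nodes, three individuals.
node_length = {"n1": 100, "n2": 20, "n3": 50, "n4": 30}
paths = {
    "ind_A": ["n1", "n2", "n4"],
    "ind_B": ["n1", "n3", "n4"],
    "ind_C": ["n1", "n4"],
}

print(node_path_counts(paths))            # n1 and n4 are traversed by all three paths
print(core_fraction(paths, node_length))  # 0.65
```

In this toy graph, nodes `n2` and `n3` form a bubble between `n1` and `n4`, which is the kind of structure that a structural variation produces; `ind_C` skips the bubble entirely.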

A pangenome graph – From Yang et al. (2023)

(ii) Dynamics

Our goal is to study how ecosystem changes (geographical, climatic, anthropogenic, etc.) and evolutionary forces influence and modify the structure of (pan)genomes. To achieve this, we leverage the diversity characterized with the tools developed in axis (i), as well as evolutionary models that we are developing (e.g., the evolution of transposable elements and structural variations).

We are currently focusing on two main models:

  • Yam in West Africa
    By analyzing the impact of domestication on the diversity of cultivated yams and the effect of gene flow on the diversity of related wild species, we aim to study yam adaptation to cultivated environments and climate change.
  • Rice in Africa and Asia
    We are working simultaneously on wild rice and on the domestication and diversification of cultivated rice. On the one hand, we seek to understand what defines wild rice and how it is adapted to various ecosystems. On the other hand, we study the impact of domestication and selection on the structure of (pan)genomes by comparing the domestications of Indica, Japonica, and Aus rice in Asia, and the domestications of rice in Asia and Africa. We are also investigating the evolution of the core-to-pangenome ratio during domestication.

Additionally, we occasionally work on other models (algae, Drosophila, etc.) to address specific questions based on emerging collaborations.

Wild and cultivated yam tubers – From Scarcelli et al. (2019)