The Versailles Arabidopsis Stock Center (VASC): original genetic resources exploiting both induced and natural diversity to investigate gene functions and analyze the impact of variation on plant biology
Abstract
Arabidopsis thaliana is a powerful plant model for functional biology, genetics and, more recently, population genomics. The Versailles Arabidopsis Stock Center collects, produces, preserves, characterizes and distributes various Arabidopsis biological resources. Besides large collections of mutants, including homozygous mutant lines, this stock centre offers numerous natural genotypes collected worldwide, as well as resources resulting from crosses between these variants. Most of the resources are unique and can be useful to a wide range of users, ensuring cumulative characterization of the same material over time. They are accompanied by molecular characterization, genotyping or sequencing data, enabling the analysis of diversity’s impact, particularly on complex plant traits. The collections are made easily and reliably available through an information system comprising a database and a web portal for description and distribution (https://publiclines.versailles.inrae.fr/). Several thousand seed lots are provided each year to the international scientific community.
Keywords
Plant genetic resources, Arabidopsis thaliana, natural variation, mutants
Introduction
Arabidopsis thaliana (L.) Heynh. is a small wild plant belonging to the Brassicaceae family, like rapeseed, cabbages, radish or mustard. It is easy to grow, has a short life cycle in greenhouse conditions, is mainly self-pollinating but can be crossed, and produces many seeds. Thanks to these biological characteristics, it became a plant model species in the 1980s (Meinke, Cherry, Dean, Rounsley, & Koornneef, 1998). In 2000, it was the first plant whose genome was completely sequenced (Arabidopsis Genome Initiative, 2000). As the international scientific community working on Arabidopsis has grown, numerous shared molecular tools, data and genetic resources have emerged and developed, making Arabidopsis the model system of choice in plant functional biology. Beyond answering many biological questions in this species, findings and biotechnological methods developed in Arabidopsis have been transferred to crops and other organisms, and to more applied scientific fields such as plant breeding or even medicine (Yaschenko, Alonso, & Stepanova, 2024). Community-driven databases and stock centres have been created and have played a major role in the advancement of many research programmes. The Arabidopsis Information Resource (Reiser et al., 2024) maintains an extensive database, with links to other Arabidopsis resources. Alongside the historical stock centres (the Arabidopsis Biological Resource Center, ABRC, Ohio, USA, and the Nottingham Arabidopsis Stock Centre, NASC, UK), the Versailles Arabidopsis Stock Center (VASC) was developed in the early 1990s in Versailles (France), at the Institute Jean-Pierre Bourgin for Plant Sciences (IJPB) of the National Research Institute for Agriculture, Food and Environment (INRAE), with a first collection of T-DNA insertion mutants to explore gene function. Since then, VASC has produced many specific resources exploiting both induced and natural diversity.
Except for natural genotypes collected worldwide, these resources are unique and not distributed elsewhere, so VASC complements the other existing Arabidopsis stock centers. In addition to T-DNA insertion mutant lines and homozygous EMS mutant lines, the collections include worldwide natural genotypes and segregating populations or cytolines derived from crosses between these genotypes, to analyze the impact of natural diversity, particularly on complex plant traits such as growth, development, reproduction or stress tolerance. The resources are molecularly characterized and provided to the Arabidopsis community all over the world.
Mutant collections
T-DNA insertion lines
The earliest collection was a set of 55,000 T-DNA insertion mutant lines generated in the Ws (Wassilewskija) background (Bechtold, Ellis, & Pelletier, 1993), in which the T-DNA was inserted randomly in the genome. This collection has been used extensively in forward genetics studies, based on screening the mutated lines for diverse phenotypes and subsequently cloning the tagged genes. Genomic sequences flanking the T-DNA insertions (Flanking Sequence Tags, FSTs) were then systematically sequenced, yielding a total of 46,236 FSTs (Balzergue et al., 2001). They are available in the databases SIGnAL (http://signal.salk.edu/cgi-bin/tdnaexpress) and TAIR (https://www.arabidopsis.org/), allowing reverse genetics approaches, which consist of looking for a line with an insertion in a candidate gene and then analyzing the mutant phenotype.
Genetic screens have played a major role in deciphering the genetic basis of many biological processes. Both forward and reverse genetics have been used, for example, to gain insight into plant meiosis (Mercier, Grelon, Vezon, Horlow, & Pelletier, 2001). Many genes involved in meiosis were identified in A. thaliana by screening for reduced fertility in the greenhouse and, in parallel, by searching for mutants in homologs of genes that play a role in meiosis in non-plant organisms, for example Saccharomyces cerevisiae (Couteau et al., 1999; Gallego et al., 2001).
Today, these T-DNA insertion mutants are still used to validate candidate genes involved in numerous biological processes.
Homozygous EMS mutant lines (HEMs)
Over time, forward genetic screens identified most of the meiotic genes which, when mutated, cause a dramatic reduction in fertility in A. thaliana. However, reverse genetics was identifying an increasing number of genes that play a role in meiosis without causing marked phenotypes when mutated, suggesting that many genes with a meiotic function remained to be discovered. To find them, VASC, together with the IJPB team working on meiosis, produced about 900 lines randomly mutagenized by EMS (ethyl methanesulfonate), which were then made homozygous or nearly homozygous through either haplodiploidization or four generations of selfing by single seed descent (Capilla-Perez et al., 2018). In both cases, each line is composed of identical or nearly identical plants. In addition to mutations in promoters and untranslated regions (UTRs) that can impact gene expression, each line contains between 100 and 500 homozygous mutations that affect the sequence of protein-coding genes (e.g. amino-acid change, stop codon, loss of splicing sites). These resources can be used for forward genetic screening, examining either a single plant per line or several plants to observe a more quantitative phenotype, and they enable subtle and repeated phenotyping.
In the HEM collection, 43 lines with meiotic defects were phenotypically identified, of which 21 lines had a mutation in a gene whose role in meiosis had already been demonstrated in another organism. For six of these genes, this was the first time they were identified in a direct screen in Arabidopsis (Capilla-Perez et al., 2018). These results show the value of the HEM population and illustrate its potential to screen for any qualitative or quantitative phenotype.
In addition, the whole-genome sequences of all the HEM lines were recently made available (Carrère et al., 2024), enabling reverse genetics approaches. On average, three mutations affecting protein sequences are found per gene in the collection. The ATHEM web interface (https://lipm-browsers.toulouse.inra.fr/pub/ATHEM/) provides the community with the raw sequences, SNP calling results, and an interface to search for SNPs in given HEM lines or genes. Reverse genetic screens for various functions show the power of this resource to obtain different types of mutant alleles (Carrère et al., 2024). In addition, the knowledge of mutations greatly accelerates the search for causal genes in forward genetic screens.
Since 2020, this resource has been the most widely distributed by VASC.
Collections exploiting natural diversity
Natural variants (accessions)
Arabidopsis grows naturally throughout the northern hemisphere, in a wide variety of ecological conditions. This makes it an excellent model for studying natural diversity and adaptation, either directly in association studies using natural genotypes, or through segregating populations (Bazakos, Hanemian, Trontin, Jiménez-Gómez, & Loudet, 2017). At present, over 600 natural accessions (individuals collected worldwide in diverse environments) are available at VASC. Most of these genotypes exist in other stock centres or laboratories under the same name, but correspond to different batches of seeds. Because of possible mislabeling, or sequence divergence accumulating over time between lineages, these seed batches can be genetically different and should not be mixed, so as not to compromise genetic analyses. Each seed batch of the VASC accessions was identified by genotyping with a set of 384 SNP markers (Simon et al., 2012), and the genotyping data are available on the dedicated web interface ANATOOL (https://www.versailles.inra.fr/ijpb/crb/anatool/index.html). This interface also provides tools that offer a simple and efficient means to verify or determine the identity of the accessions in any laboratory, without the need for any specific or expensive technology.
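The principle of checking a seed batch against reference SNP profiles can be illustrated with a minimal sketch. It does not reproduce ANATOOL's actual method or data format: the accession names, the eight-marker profiles and the use of "N" for a missing call are all assumptions made for illustration (the real marker set has 384 SNPs).

```python
from typing import Dict, Optional, Tuple


def mismatch_rate(sample: str, reference: str) -> Optional[float]:
    """Fraction of differing calls among markers called in both profiles.

    Profiles are strings with one character per SNP marker; "N" marks a
    missing call (an assumed convention, not ANATOOL's format).
    """
    compared = [(s, r) for s, r in zip(sample, reference) if s != "N" and r != "N"]
    if not compared:
        return None  # no marker can be compared
    return sum(s != r for s, r in compared) / len(compared)


def closest_accession(
    sample: str, references: Dict[str, str]
) -> Optional[Tuple[str, float]]:
    """Return the (name, mismatch rate) of the best-matching reference profile."""
    best: Optional[Tuple[str, float]] = None
    for name, profile in references.items():
        rate = mismatch_rate(sample, profile)
        if rate is not None and (best is None or rate < best[1]):
            best = (name, rate)
    return best


# Toy reference panel with hypothetical accession names and 8 markers.
references = {
    "Acc-A": "AACCGGTT",
    "Acc-B": "AACCGGAA",
    "Acc-C": "TTCCGGTT",
}
sample = "AACCNGTT"  # one missing call at the fifth marker
best_match = closest_accession(sample, references)
print(best_match)
```

A mismatch rate near zero supports the expected identity, whereas a clearly non-zero rate against the expected accession would suggest mislabeling or divergence between seed batches.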
Recently, chromosome-level genome assemblies were generated from long-read de novo sequencing for 69 natural accessions, using the DNA of plants issued from the VASC seed batches (Lian et al., 2024; Simon et al., 2022). These data provide insight into the overall genetic variation of the species and add value to our collection of natural accessions. All the parental lines of the VASC recombinant inbred lines (RILs), heterogeneous inbred families (HIFs), and cytolines (see below) are part of these 69 sequenced accessions.
Mapping populations: F2s, RILs and HIFs
Most important plant traits are quantitative, controlled by several genes at different loci and by their interactions. To characterize the genetic architecture and identify the molecular basis of such traits, segregating populations dedicated to quantitative trait loci (QTL) analyses have been developed. VASC generated 262 F2 families and 16 RIL populations from crosses between natural accessions. RILs are particularly interesting because they are nearly homozygous and can be propagated as genetically identical individuals, so that many traits can be phenotyped on the same material, which needs to be genotyped only once. The VASC RIL populations (Simon et al., 2008) were generated from genetically and phenotypically distant accessions, covering a wide range of diversity (Mckhann et al., 2004). They comprise large numbers of individuals (343 on average per population) to enhance the statistical power of QTL detection. In addition, an optimal subset of 164 lines (core population) was determined for each RIL population, allowing users to phenotype fewer lines with limited loss of QTL detection power. The genetic maps rely on common markers, so that the positions of QTLs mapped in different RIL populations can be compared. Many studies have used this resource to decipher the genetic basis of various traits (for example Brachi et al. (2010); Brock, Rubin, Dellapenna, and Weinig (2020); Gravot et al. (2011); Hanemian et al. (2020); Poque et al. (2015); Shahzad et al. (2016); Wuest and Niklaus (2018)).
HIFs are nearly isogenic lines used as a complement to the RIL populations to confirm QTLs (Loudet, Gaudon, Trubuil, & Daniel-Vedele, 2005). They were selected in the progeny of RILs that show a single residual heterozygous region. Three complete HIF populations covering the whole genome are currently available.
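The selection criterion described above (a RIL with a single residual heterozygous region) can be sketched as follows. This is only an illustration under stated assumptions: the genotype encoding ("A" and "B" for the two parental alleles, "H" for heterozygous, one character per marker in map order), the line names and the toy data are hypothetical and do not describe the pipeline actually used at VASC.

```python
from itertools import groupby
from typing import Dict, List, Tuple

Genotype = Dict[str, str]  # chromosome name -> marker calls in map order


def heterozygous_regions(genotype: Genotype) -> List[Tuple[str, int, int]]:
    """List contiguous runs of "H" calls as (chromosome, first index, last index)."""
    regions = []
    for chrom, calls in genotype.items():
        pos = 0
        for call, run in groupby(calls):
            length = len(list(run))
            if call == "H":
                regions.append((chrom, pos, pos + length - 1))
            pos += length
    return regions


def is_hif_candidate(genotype: Genotype) -> bool:
    """A RIL qualifies if exactly one heterozygous region remains genome-wide."""
    return len(heterozygous_regions(genotype)) == 1


# Toy data: three hypothetical RILs genotyped on two chromosomes.
rils = {
    "RIL001": {"chr1": "AAAHHBB", "chr2": "BBBBB"},  # one heterozygous region
    "RIL002": {"chr1": "AAAAAAA", "chr2": "BBBBB"},  # fully homozygous
    "RIL003": {"chr1": "AHHAAHB", "chr2": "BBBBB"},  # two heterozygous regions
}
candidates = [name for name, g in rils.items() if is_hif_candidate(g)]
print(candidates)
```

The progeny of a retained RIL segregates only at the residual heterozygous region, which is what makes the resulting lines nearly isogenic and suitable for confirming a QTL located in that region.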
Cytolines
Because the functioning of organelles (mitochondria and plastids) involves the interaction of proteins encoded by the nuclear and cytoplasmic genomes, these genomes are coadapted at the species level. To assess the impact of cytoplasmic variation and nucleo-cytoplasmic interactions on plant phenotypes, we created a unique series of 56 cytolines, whose cytoplasmic and nuclear genomes come from two different natural accessions (Roux et al., 2016). The cytolines were generated from reciprocal crosses between eight natural accessions representative of the species diversity, followed by recurrent backcrossing with the nuclear genome donor. Cytonuclear interactions were shown to affect several phenotypic traits, 1) indicating that cytoplasmic and nuclear genomes can interact to shape integrative traits that contribute to adaptation, and 2) highlighting a possible role for these interactions in the evolutionary dynamics of the species (Roux et al., 2016).
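The figure of 56 cytolines follows directly from the crossing design: with eight accessions, each can donate its cytoplasm to the nuclear genome of each of the seven others, giving 8 × 7 = 56 combinations. A short sketch makes the count explicit; the accession names are placeholders, not the actual parents.

```python
from itertools import permutations

# Placeholder names for the eight parental accessions (hypothetical labels).
accessions = [f"Acc{i}" for i in range(1, 9)]

# Each cytoline pairs a cytoplasm donor with a different nuclear donor.
# Ordered pairs are needed because the crosses are reciprocal:
# (Acc1 cytoplasm, Acc2 nucleus) differs from (Acc2 cytoplasm, Acc1 nucleus).
cytolines = [(cyto, nuc) for cyto, nuc in permutations(accessions, 2)]

n_cytolines = len(cytolines)
print(n_cytolines)
```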
Epigenetic recombinant inbred lines
In addition to genetic variation, epigenetic variation can affect plant phenotype. Epigenetic modifications, such as DNA methylation, do not alter the DNA sequence but can be transmitted from one generation to the next. DNA methylation is a source of heritable phenotypic variation notably because it can affect gene expression. A set of 500 epigenetic recombinant inbred lines (epiRILs) was generated to study the impact of DNA methylation on phenotypic variation (Johannes et al., 2009). These epiRILs are derived from two closely related parents that have few DNA sequence differences but contrasting DNA methylation profiles. One parent is the accession Col-0, and the other is a homozygous mutant in Col-0 for the DDM1 gene, involved in the maintenance of DNA methylation (Vongs, Kakutani, Martienssen, & Richards, 1993). These epiRILs enable the analysis of epigenetic variation and the mapping of epigenetic QTL associating epialleles with phenotypic traits (Petitpas et al., 2024; Zhang et al., 2021).
Management
Staff and partnership
VASC is run by two permanent INRAE staff members, a scientific manager (Research Engineer) and an operating manager (Technician), for its scientific and technical activities. Since 2022, these two members have been supported by two additional staff, each for 20% of their time, in charge of the quality and certification procedures. Governance includes a steering committee comprising these four persons plus the head of the IJPB, a user committee comprising the steering committee plus IJPB researchers and an external scientist, and a scientific advisory board made up of the user committee plus a foreign scientist.
Despite its limited staff, VASC continues to develop and characterize new genetic resources, most recently the HEM collection, through projects carried out in partnership with other research teams, at IJPB or more widely. VASC is always open to developing new collaborations. We can provide our expertise in producing resources dedicated to specific approaches that can subsequently be useful to a wide audience. We can also maintain, host and distribute resources collected or generated by other laboratories. To this end, VASC benefits from IJPB's infrastructures (large-scale greenhouses, growth chambers, seed conservatories) and a skilled workforce for plant growing. Within the IJPB Plant Observatory, VASC also interacts closely with the Phenoscope high-throughput phenotyping platform (Tisné et al., 2013). A large proportion of our genetic resources (accessions, RILs, HIFs, cytolines) have been phenotyped using this tool under homogeneous and highly controlled conditions (for example, Marchadier et al. (2019)).
VASC is part of the National Research Infrastructure of Agronomic Biological Resource Centers RARe (Agronomic Resources for Research). This enables us to share experiences with other resource centres, particularly in terms of management, regulation and quality.
Information system, distribution and funding model
VASC has established its own information system comprising a database and a web portal for data and distribution (https://publiclines.versailles.inrae.fr/). The online catalogue presents all resources and their descriptions. Collections are systematically characterized and molecular data are made easily available to the scientific community via downloadable files or hypertext links. Seeds can be ordered directly from the catalogue pages of the website. The price of orders is calculated automatically, and seeds are paid for online at the time of ordering. An invoice is issued and sent automatically to the client. An e-mail is automatically sent when the seeds are shipped, on average within four working days. The website enables the VASC staff to track all orders and clients.
Over the past five years, more than 5,000 seed samples have been distributed annually on average. More than 200 customers from 26 countries have placed orders, around one-third of them in France and two-thirds abroad. The most represented foreign countries were Germany, the USA, the Netherlands, Belgium, the United Kingdom, Italy, Switzerland and China. The most recent resources tend to be the most widely distributed: the EMS collection, our latest addition, has been the most widely distributed since 2020. This motivates us to acquire new resources.
Seed sales represent a total income of about €20,000 per year. VASC is part of IJPB and has no funding of its own: VASC revenues are pooled at the institute level and operating costs are covered by IJPB funds.
Quality
Seed stocks are multiplied according to defined protocols designed especially to avoid seed contamination. Seeds are kept in a seed conservatory under controlled conditions, at low temperature (4°C) and 12% relative humidity. Security duplicates are maintained at -20°C to ensure preservation of the resources in the long term. Germination rates are regularly evaluated on samples from the different collections, testing 100 seeds per sample. To regenerate seed stocks, propagation is carried out by self-fertilization in insect-proof greenhouses. An identification number is assigned to each seed batch and is associated with a barcode that enables computerized tracking from sowing to harvesting and distribution. These procedures guarantee traceability and reliability during the production and distribution of the resources. Building on these quality standards, VASC obtained the IBiSA 1 label in 2023. It has also implemented a Quality Management System based on the ISO 9001:2015 standard, and achieved certification in 2024. Our efforts in the production, conservation and characterization of resources, as well as in the establishment of an efficient information and distribution system, have already earned us worldwide recognition for the interest and quality of our collections, our prompt distribution and the support we provide to our customers.
Past and present research projects
The resources produced have always been exploited by the VASC team in research projects. This enables us to anticipate the needs in terms of genetic resources, to obtain funding and gain recognition. Our research focuses on genomes, both their expression and their evolution, particularly from the point of view of genomic conflicts that can lead to the establishment of reproductive barriers.
A transcriptome study of two RIL populations has revealed, in each population, several thousand expression QTLs (eQTLs; Cubillos et al., 2012), providing a basis for identifying the gene networks involved in different pathways (Xue et al., 2024).
We have observed genetic incompatibilities in the progenies of certain crosses, where particular combinations of alleles at different loci lead to lethality (e.g. at the embryonic stage) or to total or partial sterility. We have found in our RIL populations several different pairs of loci that lead to this type of situation, and we have identified the partner genes and elucidated the mechanisms involved, some of which are epigenetic in origin (Agorio et al., 2017; Bikard et al., 2009; Durand, Bouché, Strand, Loudet, & Camilleri, 2012; Jiao et al., 2021). These phenomena can explain the lethality observed in hybridizations between varieties or species, which can have major implications for plant breeding and introgression programmes. Studying the reproductive barriers they create can also help us understand the mechanisms that lead to the formation of new species, an overarching goal in biology.
We uncovered a cryptic cytoplasmic male sterility (CMS) in A. thaliana. CMS, which is a source of reproductive polymorphism in angiosperms and of major relevance in hybrid breeding, is genetically determined by both mitochondrial and nuclear factors. A new mitochondrial gene causing sterility (Gobron et al., 2013) as well as a nuclear gene restorer of fertility (Durand et al., 2021) were identified, and the process of pollen abortion in this CMS system was characterized (Dehaene et al., 2024). This CMS participates in the hybrid sterility phenotypes observed in some crosses, together with segregation distorter loci responsible for pollen lethality (Simon et al., 2016). We characterized one of these pollen killers, identifying three genes involved in its functioning and exploring the high locus diversity at the species level (Ricou et al., 2025; Simon et al., 2022). We found both sensitive and killer plants coexisting in local French populations, which constitutes an invaluable resource for studying pollen killer evolution in the wild. Indeed, understanding how gamete killers appear and propagate in populations remains a major issue in evolutionary biology, and Arabidopsis proved to be a powerful model for investigating evolutionary dynamics at complementary geographical scales.
Conclusion and perspectives
We are determined to continue our commitment to providing high-quality genetic resources, guaranteeing their long-term conservation, and generating knowledge on these resources to increase their value for research.
Our recent results (Ricou et al., 2025) underline that Arabidopsis, originally mainly a functional biology model, is also a valuable model for conducting studies in population biology, thanks to tens or hundreds of genotypes collected in many local populations (Brachi et al., 2013; Frachon et al., 2017). In this framework, our upcoming resources will consist of 458 whole-genome sequenced accessions collected from 168 natural sites located in the southwest of France and characterized for a unique set of ecological factors, including climate, edaphic properties, bacterial communities (soil, root and leaf), plant communities and human activities including urbanization (Bartoli et al., 2018; Frachon, Mayjonade, Bartoli, Hautekèete, & Roux, 2019; Roux, Frachon, & Bartoli, 2023). Both the whole-genome sequences and the deep ecological characterization of their native habitats add considerable value to these resources.
We hope the Arabidopsis community will keep using the VASC resources. Citing this article when publishing results obtained with these resources will enable us to list the studies based on our collections, attest to their usefulness, and therefore help ensure the continuity of VASC funding.
Acknowledgements
VASC benefits from the support of IJPB's Plant Observatory platform PO-Plants. It receives financial contributions from the INRAE Biology and Plant Breeding department. IJPB benefits from the support of Saclay Plant Sciences-SPS (ANR-17-EUR-0007).
Author contributions
CC wrote the manuscript, and AR, CG, CH and OL reviewed it.
Conflict of interest statement
The authors have declared that no conflicts of interest exist.