5S ribosomal RNAs (rRNAs) are essential components of ribosomes, the molecular machines that translate genetic information into proteins. Because of this essential role and the need to produce them in large quantities, 5S rRNA genes are highly repeated in the genome. Owing to their repetitive nature and their localization in the pericentromeric regions of the genome, 5S rRNA genes are only partly assembled in the reference genome, which has made detailed analyses of their organization, dynamics and epigenetic regulation difficult. Nevertheless, 5S rRNA genes show small variations in sequence and not all copies are transcribed, raising the question of how expression of these quasi-identical sequences is regulated.
Using next-generation sequencing of DNA and 5S ribosomal RNA, we show here that specific DNA signatures distinguish the 5S rRNA gene copies of the three major 5S rDNA loci. Using these sequence signatures, we have built bioinformatics pipelines and developed locus-specific probes for DNA fluorescence in situ hybridization. With these tools, we revealed sequence polymorphisms between the different 5S rRNA gene copies and between loci, as well as differential enrichment in epigenetic marks linked to differential gene expression. We further show substantial variation in copy number and position of 5S rRNA genes between ecotypes of Arabidopsis thaliana, the latter influencing genome organization within the nucleus. Finally, our results indicate considerable plasticity among ecotypes in 5S rRNA gene expression, both between 5S rDNA loci and within a particular locus.