Linkage disequilibrium and population structure in Fragaria chiloensis revealed by SSR markers transferred from commercial strawberry

The Chilean strawberry [Fragaria chiloensis (L.) Mill.] is the maternal progenitor of the commercial strawberry (Fragaria × ananassa Duch.), which is characterized by fruits with high organoleptic quality and is well-suited to areas where drought and salinity represent a constraint on crop growth and productivity. We examined the patterns of linkage disequilibrium, genetic diversity and population structure among 54 accessions of F. chiloensis to understand the genetic basis of this species. We used a core microsatellite marker set (n = 95) from a consensus linkage map of strawberry. A transferability rate of 82.1% (78/95) was found, and 38 markers were selected for this study. The SSR primers produced a total of 259 alleles, which varied between 112 and 342 bp. Lower genetic diversity at the species level (HE = 0.17, Shannon’s index = 0.28) was found compared to previous studies of this species. No climatic region pattern for SSR diversity was observed. Structure analysis suggests that the accessions are grouped into three significantly differentiated clusters. Pairwise estimates of φST indicated a low degree of differentiation between the three genetic groups (φST = 0.023 to 0.06). These groups are in concordance with potential glacial refugia in the region, with many accessions being an admixture of them.


Introduction
Fragaria chiloensis is a native Chilean polyploid species (2n = 8X = 56) of the Rosaceae family. This family includes many economically important edible fruit crops such as apples, peaches, and cherries (Folta & Davis, 2006). F. chiloensis (also known as the beach, Chilean or coastal strawberry) is best known as the maternal parent of F. × ananassa, the commercial strawberry (Hancock, Lavín, & Retamales, 1999). This species has a wide geographical distribution across the Chilean territory, from 34°55'S to 47°33'S and from sea level to 1850 m above sea level (Lavín, Del Pozo, & Maureira, 2000). Although F. chiloensis is cultivated only on a small scale and with poor agronomic management, in recent years it has gained attention due to the interesting organoleptic properties of its fruits, and it has become a model for fruit ripening studies (Cherian, Figueroa, & Nair, 2014). Moreover, the plant is resistant to many pathogens and has adapted to different abiotic stresses, including drought and soil salinity (Retamales, Caligari, Carrasco, & Saud, 2005). Thus, F. chiloensis has become a valuable source for the genetic improvement of the commercial strawberry, in particular for breeding programs with a limited genetic base (Stegmeir, Finn, Warner, & Hancock, 2010; Carrasco et al., 2007). Recently, the characterization of a F. chiloensis germplasm collection has confirmed the potential for selective breeding of this Fragaria species (Mora, Concha, & Figueroa, 2016).
Previous genetic studies of F. chiloensis have shown that it has high genetic diversity (Becerra, Paredes, & Lavín 2005; Carrasco et al., 2007), as determined by analysis of amplified fragment length polymorphism (AFLP) and inter simple sequence repeat (ISSR) markers. We propose that the findings of these studies should be confirmed, or at least validated, using genome-wide co-dominant markers such as simple sequence repeats (SSR), as has been done in other species such as olive (Belaj et al., 2003), maize (García et al., 2004) and Brassica napus (Li et al., 2011). In fact, SSR markers have been predominantly used in the Fragaria genus for genetic diversity studies, genetic mapping and identification of quantitative trait loci (QTL) (Sargent et al., 2012; Yoon et al., 2012; Zorrilla-Fontanesi et al., 2011). One reason that SSR markers are currently used in many species of the Fragaria genus, such as F. moschata, F. × ananassa and F. virginiana (Gil-Ariza et al., 2006), is that they represent a valuable option for the genetic evaluation of polyploid species even if they are analyzed as dominant markers. In fact, according to Pffeifer, Roschanski, Pannell, Korkbecka, and Schnittler (2011), SSR markers are more precise and reliable than other dominant markers such as AFLP and ISSR.
SSR markers can be found in both coding and noncoding regions of DNA, are widely used in genetic studies due to their abundance and co-dominance, and are easy to read and interpret because they are present in just one locus per microsatellite (Rentaría-Alcántara, 2007). Cross-species amplification of SSR loci is considered a cost-effective method that facilitates genetic studies in species where sequence information is not available, such as F. chiloensis. A cross-species SSR transferability rate between 70% and 100% has previously been reported in the Fragaria genus (Folta & Davis, 2006; Dirlewanger, Denoyes-Rothan, Yamamoto, & Chagné, 2009). The aim of this study was to examine the genetic diversity, linkage disequilibrium, and population structure in natural populations of F. chiloensis in order to improve the knowledge about the genetic basis of this species.

Plant material and DNA extraction
For genetic and population analysis, 54 accessions were studied, covering most of the natural geographic range of this species in Chile: 51 accessions previously reported by Mora et al. (2016) plus 3 others recently collected (Table 1). Samples were grouped as putative populations of 6 to 13 individuals per population according to their climatic distributions as reported by Lavín et al. (2000), which were noted as MTM (Marine and Temperate Mediterranean), FCM (Fresh and Cold Marine), HPM (Humid Patagonian Marine), CM (Cold Mediterranean) and PAT (Polar Alpine Tundra) (Table 1). DNA extraction from fresh leaves was performed using the DNeasy plant mini kit (Qiagen, USA) following the manufacturer's instructions with slight modifications. We used 60 mg of fresh weight per sample instead of 100 mg. To remove phenolics from the samples, 4% (w/v) polyvinylpyrrolidone (Bioworld, USA) was added to AP1 buffer (Qiagen, USA). To obtain a higher yield of DNA, only one elution was performed with nanopure water at 70°C instead of two elutions at room temperature (15-25°C) as recommended by the manufacturer.

Cross-species amplification of SSR markers
A total of ninety-five SSR markers distributed across the whole genome (i.e., 8 sets of chromosomes) of F. × ananassa were screened for PCR amplification feasibility in F. chiloensis. The SSR primer pairs were selected from the first high-density SSR-based linkage map of F. × ananassa (Sargent et al., 2012). Particular interest was given to markers associated with QTL in F. × ananassa following the work of Zorrilla-Fontanesi et al. (2011). Cross-species amplification of SSR markers was carried out using ten F. chiloensis accessions selected from a germplasm collection (Mora et al., 2016) according to their geographic distance (Table 1). The PCR amplifications were carried out in a total volume of 0.015 cm³, containing 20 ng of DNA template, 1X reaction buffer (5X MangoTaq Colored Reaction Buffer, Bioline, USA), 0.2 μM primers, 200 μM dNTPs, 2 mM MgCl₂ (Bioline, USA), and 0.5 U MangoTaq DNA polymerase (Bioline, USA). A T100™ thermal cycler (Bio-Rad, USA) was used with the following amplification program: initial denaturation at 94°C for 3 min, 30 cycles of denaturation at 94°C for 30 s, annealing at 60°C for 30 s, and extension at 72°C for 45 s, and a final extension at 72°C for 10 min. SSR markers were considered successfully transferred if they had clear and intense amplification in at least 6 out of 10 samples, and in a 100 bp range of the size described for the marker, following the parameters described by Kuleung, Baezinger, and Dweikat (2004).
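The transfer criterion described above can be illustrated with a minimal Python sketch. The function names and the example fragment sizes are hypothetical, and the "100 bp range" is interpreted here as within ±100 bp of the expected size, which is an assumption rather than something the text states explicitly.

```python
def is_transferred(observed_sizes, expected_size, min_samples=6, size_window=100):
    """Return True if a marker meets the transfer criterion.

    observed_sizes: one entry per test accession; None means no clear
    amplification, otherwise the fragment size in bp.
    size_window: ASSUMPTION, read as +/- 100 bp around the expected size.
    """
    hits = sum(
        1
        for size in observed_sizes
        if size is not None and abs(size - expected_size) <= size_window
    )
    return hits >= min_samples


def transferability_rate(n_transferred, n_screened):
    """Percentage of screened markers that were successfully transferred."""
    return 100.0 * n_transferred / n_screened


# Hypothetical marker scored in ten accessions (expected size 200 bp):
# seven accessions amplify within the window, three fail.
example = [198, 204, None, 210, 196, None, 202, 200, None, 206]
print(is_transferred(example, expected_size=200))

# Rate reported in this study: 78 of 95 markers.
print(round(transferability_rate(78, 95), 1))
```

With the counts given in the abstract (78 of 95 markers), the second call reproduces the reported rate of 82.1%.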

Molecular data
Thirty-nine SSR markers were selected among the successfully transferred markers distributed across the F. × ananassa linkage map constructed by Sargent et al. (2012). These SSR markers were amplified using the Multiplex-Ready PCR protocol (Hayden, Nguyen, Waterman, & Chalmers, 2008). The reactions were carried out in a total volume of 0.01 cm³, containing 20 ng of DNA template, 1X reaction buffer (10X PCR Buffer, Invitrogen, USA), 25 nM tag primers with either 6-FAM, VIC or PET fluorophores (Invitrogen, USA), 20 to 40 nM locus-specific primers, 8 μg BSA (bovine serum albumin), 200 μM dNTPs, 1.5 mM MgCl₂ (Invitrogen, USA), and 0.5 U Platinum Taq DNA polymerase (Invitrogen). A T100™ thermal cycler (Bio-Rad, USA) was used with the following amplification program: initial denaturation at 94°C for 2 min, 20 cycles of denaturation at 92°C for 30 s, annealing at 63°C for 90 s, and extension at 72°C for 60 s, then 40 cycles of denaturation at 92°C for 15 s, annealing at 54°C for 30 s and extension at 72°C for 60 s, and a final extension at 72°C for 10 min. Samples were evaluated through capillary electrophoresis in an ABI3130 Genetic Analyzer (Applied Biosystems, USA) with an injection time of 10 s at 15,000 V, and run at 13,000 V in a 36 cm capillary. The resulting data were analyzed using GeneMapper® (v 4.0, Applied Biosystems, USA), and the detected alleles of each marker were converted to a binary matrix.
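The conversion of allele calls to a binary matrix can be sketched as follows. This is only an illustration of the general idea (one column per marker and allele size, scored 1 if present and 0 if absent); the marker names, accession names and input layout are hypothetical, and the sketch does not distinguish missing data from absent bands.

```python
def alleles_to_binary(calls):
    """Convert allele calls to a presence/absence matrix.

    calls: dict mapping accession -> dict mapping marker -> list of
    allele sizes (bp) detected for that accession (hypothetical layout).
    Returns (columns, matrix): columns is a sorted list of
    (marker, allele_size) pairs; matrix maps accession -> list of 0/1.
    """
    # Every distinct (marker, allele size) pair becomes one column.
    columns = sorted(
        {
            (marker, size)
            for per_marker in calls.values()
            for marker, sizes in per_marker.items()
            for size in sizes
        }
    )
    matrix = {}
    for accession, per_marker in calls.items():
        row = []
        for marker, size in columns:
            present = size in per_marker.get(marker, [])
            row.append(1 if present else 0)
        matrix[accession] = row
    return columns, matrix


# Hypothetical calls for two accessions and two markers.
calls = {
    "acc1": {"SSR_A": [112, 120], "SSR_B": [200]},
    "acc2": {"SSR_A": [120], "SSR_B": []},
}
columns, matrix = alleles_to_binary(calls)
print(columns)
print(matrix)
```

Scoring each allele as a separate presence/absence column is what allows SSR data from a polyploid to be analyzed in the same way as dominant markers.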

Genetic diversity and linkage disequilibrium
The GenAlEx 6.5 software package (Peakall & Smouse, 2006; Peakall & Smouse, 2012) was used to calculate total number of alleles (NA), effective alleles (NE), Shannon's index (S), percentage of polymorphic loci (P) and expected heterozygosity (HE). Linkage disequilibrium (LD) was estimated using TASSEL 5 (Bradbury et al., 2007) by means of pairwise linkage tests with 100,000 steps in the Markov chain and 1,000 steps of dememorization. Pairwise LD was estimated for the entire set of F. chiloensis samples as well as for each genetic group derived from structure analysis. The program LOSITAN (Antao, Lopes, Lopes, Beja-Pereira, & Luikart, 2008) was used to discriminate the SSR loci that did not fulfill the criteria of neutrality (outliers) following the work of Excoffier, Hofer, and Foll (2009).
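For readers unfamiliar with these statistics, the following Python sketch shows textbook forms of the per-band diversity indices and of the pairwise r² between two binary loci. It is not the implementation used by GenAlEx or TASSEL: here the band frequency is used directly as the allele frequency p, which is a simplifying assumption (software for dominant data may instead estimate allele frequencies under Hardy-Weinberg equilibrium), and the function names are hypothetical.

```python
import math


def band_diversity(column):
    """Diversity indices for one binary band scored across accessions.

    ASSUMPTION: the band frequency is taken directly as p, with q = 1 - p.
    Returns (effective alleles, Shannon's index, expected heterozygosity).
    """
    p = sum(column) / len(column)
    q = 1.0 - p
    n_e = 1.0 / (p * p + q * q)  # effective number of alleles
    shannon = -sum(x * math.log(x) for x in (p, q) if x > 0)
    h_e = 2.0 * p * q  # equals 1 - (p^2 + q^2)
    return n_e, shannon, h_e


def r_squared(col_a, col_b):
    """Squared correlation between two binary loci: D^2 / (pA qA pB qB)."""
    n = len(col_a)
    p_a = sum(col_a) / n
    p_b = sum(col_b) / n
    p_ab = sum(a * b for a, b in zip(col_a, col_b)) / n
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    # r^2 is undefined when either locus is monomorphic.
    return d * d / denom if denom > 0 else float("nan")


# Hypothetical bands scored in four accessions.
print(band_diversity([1, 1, 0, 0]))
print(r_squared([1, 1, 0, 0], [1, 1, 0, 0]))  # identical patterns
print(r_squared([1, 1, 0, 0], [1, 0, 1, 0]))  # independent patterns
```

Species-level values such as those reported in this study are averages of the per-band values across all bands.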

Population structure
Population structure was determined and admixed individuals were identified using the program STRUCTURE 2.3.4 (Pritchard, Stephens, & Donnelly, 2000). Using an admixture model with independent allele frequencies, we allowed the number of genetic groups (K) to vary between 1 and 10, as previous studies suggested that a larger number of groups would not be viable (Carrasco et al., 2007). Each possible K was evaluated in 3 independent repetitions, with a burn-in of 100,000 samples followed by 1,000,000 Markov chain Monte Carlo (Gibbs sampling) iterations. The optimal K value was calculated using the method described by Evanno, Regnaut, and Goudet (2005) as implemented in STRUCTURE HARVESTER (Earl & vonHoldt, 2012). An analysis of molecular variance (AMOVA, p < 0.05; Excoffier, Smouse, & Quattro, 1992) for the optimal K clusters and for the putative populations was performed using GENALEX 6.5 (Peakall & Smouse, 2006). Pairwise estimates of the correlation of alleles within subpopulations (φST) were calculated for all genetic groups determined by structure analysis using the AMOVA approach in GENALEX 6.5 (Peakall & Smouse, 2006). φST is a measure analogous to the better known FST derived from AMOVA (Meirmans & Hedrick, 2011).
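The ΔK statistic used to choose the optimal K can be sketched as below. This is one common formulation, in which the absolute second-order difference of the mean ln P(D) is divided by the standard deviation of ln P(D) across replicate runs at that K; it is given as an illustration and is not claimed to match STRUCTURE HARVESTER line for line. The log-likelihood values are invented for the example.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Compute Delta K for each K that has both neighbours K-1 and K+1.

    lnp_by_k: dict mapping K -> list of ln P(D) values from replicate runs.
    Delta K = |mean L(K+1) - 2 mean L(K) + mean L(K-1)| / sd(L(K)).
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 in means and k + 1 in means:
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            sd = stdev(lnp_by_k[k])
            result[k] = second_diff / sd if sd > 0 else float("inf")
    return result


# Invented ln P(D) values for 3 replicate runs at K = 1..5.
lnp = {
    1: [-1000, -1002, -998],
    2: [-900, -905, -895],
    3: [-700, -702, -698],
    4: [-690, -700, -680],
    5: [-685, -705, -665],
}
dk = delta_k(lnp)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```

Note that ΔK cannot be evaluated at the smallest or largest K tested, because it requires both neighbouring values.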

Results and discussion
The SSR primers yielded a total of 259 alleles whose size varied between 112 and 342 bp. The population's genetic diversity parameters were estimated at P = 80%, HE = 0.17, and S = 0.28, as shown in Table 2. These values are typical of plants with mixed breeding systems assessed with dominant markers (Nybom, 2004), and the comparison applies to our data because the SSR markers were evaluated as binary data. Compared to previous studies (Becerra et al., 2005; Carrasco et al., 2007), we found lower heterozygosity, which may be caused by differences in the samples used or in the markers chosen for these studies.
Significant pairwise LD (p < 0.1) was observed in 3,752 pairs of loci. The r² values ranged from 0.001 to 0.652. The threshold beyond which the LD is probably caused by real physical linkage was estimated to be r² = 0.24. The intersection of the fitted curve of r² values with this threshold was considered to be an estimate of the range of LD (Figure 1; horizontal dotted line). Sixteen pairs of loci (0.4%) exceeded the threshold. The fitted regression intersected the threshold at approximately 4 cM. Reports of LD range in the Rosaceae family are scarce (Khan & Korban, 2012); however, our result is slightly lower than the 6 cM reported for wild Chinese landraces of peach (Cao et al., 2012). Meanwhile, improved peach cultivars had an LD range of 13.3-15.2 cM (Aranza, Abbassi, Howad, & Arus, 2010).

For the population structure analysis, F. chiloensis accessions were first putatively grouped by climate region (Table 1). In this scenario, the genetic diversity parameters ranged between P = 54% and 76%, HE = 0.15 and 0.16, and S = 0.24 and 0.27. The Bayesian clustering analysis consistently grouped the accessions into three genetic groups, which was confirmed by the ad hoc quantity (ΔK) having its highest value (ΔK = 277.78) for K = 3. Thirty-nine samples shared > 70% membership with one of the genetic groups (Figure 2), whereas 15 samples were admixed forms with varying levels of membership shared among the groups. Genetic group 1 consisted of 11 accessions, genetic group 2 of 11 accessions, and genetic group 3 of 17 accessions, with the remaining accessions being admixed. These groups had genetic diversity parameters similar to those of the putative populations by climate region (Table 1). An analysis among botanical forms was not performed, as only six accessions belonged to F. chiloensis ssp. chiloensis f. chiloensis.

According to the AMOVA, 96% of the genetic variation was present within the genetic groups, while 4% of the variation was present among the groups (Table 3). The average φST over all loci was 0.043, and pairwise estimates of φST indicated a low degree of differentiation between the three genetic groups, with values ranging from 0.023 to 0.06. In comparison, Carrasco et al. (2007) reported φST = 0.03 among F. chiloensis individuals grouped by latitude, which is remarkably similar to our results. It is worth noting that the genetic groups described by STRUCTURE analysis strongly resemble the distribution of potential glacial refugia in Chile (Figure 3), and samples that are not present in these refugia coincide with colonization routes described for other plant and vertebrate species (Sérsic et al., 2011). In particular, samples located on Chiloé Island and its surroundings are separated from the rest in the structure analysis and should be the focus of future in-depth studies.
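The 70% membership rule used in the results to separate assigned accessions from admixed ones can be written as a short Python sketch. The function name and the example membership coefficients are hypothetical; only the threshold (> 70%) and the group counts in the final check come from the text.

```python
from collections import Counter


def assign_group(q_row, threshold=0.70):
    """Assign an accession to a genetic group from its membership coefficients.

    q_row: membership coefficients for one accession, one value per cluster.
    Returns the 1-based index of the cluster whose membership exceeds the
    threshold, or None if the accession is treated as admixed.
    """
    best = max(range(len(q_row)), key=lambda i: q_row[i])
    return best + 1 if q_row[best] > threshold else None


# Hypothetical membership coefficients for four accessions (K = 3).
q_matrix = [
    [0.85, 0.10, 0.05],  # assigned to group 1
    [0.05, 0.20, 0.75],  # assigned to group 3
    [0.40, 0.35, 0.25],  # admixed
    [0.10, 0.88, 0.02],  # assigned to group 2
]
assignments = [assign_group(row) for row in q_matrix]
print(Counter(assignments))

# Consistency check on the counts reported in the results.
assert 11 + 11 + 17 + 15 == 54
```

The threshold is a convention rather than a property of the clustering itself, so the number of accessions classed as admixed changes if a different cut-off is chosen.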

Conclusion
In this study, we analyzed the patterns of linkage disequilibrium, genetic diversity and population structure present in natural populations of the Chilean strawberry. The genetic diversity showed lower values than in previous studies, with low genetic differentiation among genetic groups. These results validate previous studies of the population structure of the species carried out with different types of genetic markers, as expected. The genetic groups proposed through the structure analysis resemble historical ecological processes of other species that share the same habitat. Our results also provide preliminary insight into the degree of linkage disequilibrium among loci, which is moderate among inbred species.

Figure 1. Linkage disequilibrium (r²) plot of the whole genome in 54 accessions of Fragaria chiloensis. The horizontal dotted lines indicate the 95th percentile of the distribution of unlinked r², which gives the critical value of r².

Figure 2. Bar plot showing the probability of membership for 54 accessions (ordered from north to south) assessed using SSR markers. Each accession is represented by a vertical column. Black, light gray and dark gray bars correspond to genetic groups 1, 2, and 3, respectively.

Figure 3. Map with the locations of the 54 accessions of Fragaria chiloensis used in this study. Each accession is represented by a circle showing the probability of membership in each cluster determined by STRUCTURE analysis. Black, light gray and dark gray correspond to genetic group 1, genetic group 2 and genetic group 3, respectively.

Table 1. Fifty-four Fragaria chiloensis accessions used in this study. Putative populations were established as MTM (Marine and Temperate Mediterranean), FCM (Fresh and Cold Marine), HPM (Humid Patagonian Marine), CM (Cold Mediterranean) or PAT (Polar Alpine Tundra) according to the climate regions described by Lavín et al. (2000).
1 Climate regions are described as in the Lavín et al. (2000) map of Fragaria chiloensis natural distribution in Chile.

Table 2. Statistics for the genetic variation of 54 Fragaria chiloensis accessions using 259 bands produced by 38 simple sequence repeat markers: (A) accessions grouped by climate region and (B) accessions grouped by the genetic groups suggested by STRUCTURE software, which share >70% membership with any of the clusters. The abbreviations MTM, FCM, HPM, CM and PAT represent Marine and Temperate Mediterranean, Fresh and Cold Marine, Humid Patagonian Marine, Cold Mediterranean, and Polar Alpine Tundra, respectively, according to the climate regions described by Lavín et al. (2000).

Table 3. Analysis of molecular variance of 203 alleles from 38 simple sequence repeat markers in Fragaria chiloensis accessions: (A) 54 accessions of F. chiloensis grouped by climate region and (B) 54 accessions grouped according to the genetic groups suggested by structure analysis.