INTRODUCTION
Rice is perennial and widely distributed in the tropical monsoon regions of South and Southeast Asia, where rice production has increased rapidly in recent years (Mutert & Fairhurst, 2000). This increase has occurred mostly in irrigated areas, and progress has been much slower where the irrigated area is small. New cultivars that are better adapted to environmental constraints, higher yielding, and more disease resistant are therefore needed for the further development of rice cultivation, and the assessment of genetic variability can offer new insights into evolutionary history and genetic relationships (Ge et al., 1999). To date, most studies have concentrated on assessing genetic variation and the domestication process of cultivated rice, but determining the population structure and genetic diversity of naturally domesticated rice may be the first step toward achieving these goals.
Microsatellites, also called simple sequence repeats (SSRs), are tandem repeats of short nucleotide motifs that usually exhibit high polymorphism and species-specific characteristics in repeat number (Yu et al., 2009). In addition to their relatively low cost, high genetic resolution, and simple technical execution, microsatellites have other advantages over alternative markers. For example, they are codominant, abundant, dispersed throughout the genome, and highly polymorphic in plant species (McCouch et al., 1997; Gwag et al., 2010; Cho et al., 2011). Recently, SSRs have become a popular marker choice for population, genetic diversity, and evolutionary studies in many plants. They have also been used to study cultivated rice varieties, including quantitative trait locus analysis (Xiao et al., 1996; Ishii et al., 2001; Moe et al., 2011), variety resource conservation, and cultivar identification, and to study wild relatives of rice (Olufowote et al., 1997; Garland et al., 1999; Zhou et al., 2003).
Our previous research described the genetic diversity and population structure of rice collected from East Asia (Zhao et al., 2009) and America (Lu et al., 2009), but little is known about collections from South and Southeast Asia. Rice is a model organism with a fully sequenced genome. The aim of the present research was to explore the genetic diversity and population structure of 85 rice accessions from 10 countries in these regions using 29 SSR markers.
MATERIALS AND METHODS
Plant Materials
The plant materials used in this study consisted of 85 accessions acquired from the National Agrobiodiversity Center of the Rural Development Administration (RDA) in the Republic of Korea. The accessions were mainly collected from the following South and Southeast Asian areas: Bangladesh (4), Bhutan (2), India (17), Laos (2), Sri Lanka (5), Myanmar (2), Pakistan (5), the Philippines (28), Thailand (13), and Vietnam (7).
DNA Extraction and SSR Genotyping
Total DNA was extracted from fresh, young leaves of the 85 accessions using a Qiagen DNA Extraction Kit (Qiagen, Seoul, Republic of Korea). The relative purity and concentration of the extracted DNA were estimated with a Nanodrop ND-1000 (Nanodrop Technologies, Inc., Wilmington, DE, USA), and the final concentration of each DNA sample was adjusted to 20 ng/μL.
Markers were chosen according to their location on the rice genetic map and their suitability for high-throughput genotyping. In all, 29 microsatellite markers distributed on 12 chromosomes, including 26 rice microsatellite markers obtained from GRAMENE and three microsatellite markers related to starch synthesis (Bao et al., 2002), were used to analyze population structure. PCR product size was measured following the M13-tail PCR method of Schuelke et al. (2000). Amplification reactions were carried out in a total volume of 20 μL, which contained 200 ng of template DNA (about 10 μL of DNA sample), 1× PCR buffer, 0.2 mM of each dNTP, 1 U Taq DNA polymerase, 8 pmol each of the reverse primer and the fluorescently labeled M13(-21) primer, and 2 pmol of the forward primer with the M13(-21) tail at its 5′ end. PCR amplification consisted of an initial denaturation at 94 °C for 3 min; 30 cycles of 94 °C for 30 s, 55 °C for 45 s, and 72 °C for 1 min; 10 cycles of 94 °C for 30 s, 53 °C for 45 s, and 72 °C for 1 min; and a final extension at 72 °C for 10 min. SSR alleles were resolved on an ABI Prism 3100 DNA sequencer (Applied Biosystems, Foster City, CA, USA) using GeneScan software (version 3.7; Applied Biosystems) and sized precisely against the 6-carboxy-X-rhodamine (ROX) molecular size standard using Genotyper software (version 3.7; Applied Biosystems).
Data Analysis
Basic diversity statistics for each SSR marker were calculated using PowerMarker, version 3.0 (Liu & Muse, 2005), including the total number of alleles (NA), polymorphism information content (PIC), allele frequency, genetic distance, and genetic diversity (GD). Expected heterozygosity (HE) was calculated using the genetic analysis package POPGENE, version 1.31 (Yeh et al., 2000). An unweighted pair group method with arithmetic mean (UPGMA) tree based on shared allele frequencies was constructed in MEGA 4.0 (Tamura et al., 2007) from data exported from PowerMarker.
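As an illustration of the per-marker statistics named above, the following minimal Python sketch computes gene diversity and PIC from a list of allele frequencies at one locus. It assumes the standard textbook definitions (gene diversity as one minus the sum of squared allele frequencies, without a small-sample correction, and PIC in the form of Botstein et al.); it is not the PowerMarker implementation, and the function names and example frequencies are hypothetical.

```python
from itertools import combinations


def gene_diversity(freqs):
    """Gene diversity at one locus: 1 - sum(p_i^2), no sample-size correction."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """Polymorphism information content at one locus.

    PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2
    """
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


if __name__ == "__main__":
    # Hypothetical loci: both have five alleles, but very different evenness.
    even = [0.2, 0.2, 0.2, 0.2, 0.2]
    skewed = [0.86, 0.05, 0.04, 0.03, 0.02]
    for name, freqs in (("even", even), ("skewed", skewed)):
        print(name, round(gene_diversity(freqs), 3), round(pic(freqs), 3))
```

Because both statistics depend on how evenly the alleles are distributed, two loci with the same number of alleles can yield very different values.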
The analysis of population structure and the identification of ancestral or hybrid forms within the accessions were performed using the model-based software Structure 2.2 (Pritchard et al., 2000; Falush et al., 2003). In this model, several populations (K) were assumed to be present, each of which was characterized by a set of allele frequencies at each locus. All loci were assumed to be independent, and each of the K populations was assumed to follow Hardy–Weinberg equilibrium (HWE). Individuals were also allowed to be products of admixture between two or more of the populations. We routinely employed three iterations for estimation after a burn-in period of 100,000. An individual with more than 70% of its estimated genome fraction derived from one group was assigned to that group.
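The assignment rule described above can be sketched in a few lines of Python. This is only an illustration of the thresholding step applied to membership coefficients; it does not reproduce Structure itself. The function name, the input format (one list of membership fractions per accession), and the example values are assumptions, and the threshold is left as a parameter.

```python
def assign_group(membership, threshold=0.70):
    """Return the 1-based index of the group whose membership fraction
    exceeds `threshold`, or None if the accession is treated as admixed.

    `membership` is a list of fractions over the K groups (summing to ~1).
    """
    best = max(range(len(membership)), key=lambda k: membership[k])
    if membership[best] > threshold:
        return best + 1
    return None


if __name__ == "__main__":
    # Hypothetical membership coefficients for two accessions with K = 3.
    print(assign_group([0.92, 0.05, 0.03]))  # assigned to group 1
    print(assign_group([0.48, 0.40, 0.12]))  # no group exceeds the threshold
```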
RESULTS AND DISCUSSION
SSR Polymorphism across All Accessions
All 29 SSR markers were polymorphic, producing a total of 342 alleles among the 85 accessions (Table 1). The number of alleles per locus varied widely, ranging from 2 to 28, with an average of 11.8. RM206 and RM6144 produced the highest and lowest numbers of alleles, respectively. PIC ranged from 0.11 (RM6144) to 0.93 (RM206), with an average of 0.71, revealing an excess of heterozygous individuals at 29 SSR markers and an excess of homozygous individuals at two markers (RM6144 and RM12676). HE values varied from 0.12 (RM6144) to 0.93 (RM206), with an average of 0.73 (Table 2). The number of alleles at a marker was not closely related to its HE and PIC values. Some SSR markers that produced the same number of alleles differed greatly in HE and PIC (e.g., RM048 and SSS each produced five alleles, but the PIC value was 0.25 for RM048 and 0.69 for SSS). The allele frequency data showed that rare alleles (frequency < 0.05) comprised 65.5% of all detected alleles, whereas common alleles (0.05 < frequency < 0.5) and abundant alleles (frequency > 0.5) comprised 33.6% and 0.9%, respectively (Table 3). These results indicate that the high proportion of rare alleles contributed greatly to the genetic diversity of the collection (Roussel et al., 2004; Yifru et al., 2006). Hence, identifying rare alleles is important for maximizing the genetic variation held in gene bank collections and for using it in breeding programs.
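The rare, common, and abundant classes used above can be tallied with a short Python sketch. The cut-offs (0.05 and 0.5) come from the text; how to treat a frequency exactly equal to a cut-off is not stated, so this sketch assumes such values count as common. The function names and the example frequencies are hypothetical.

```python
from collections import Counter


def allele_class(freq):
    """Classify one allele by its frequency in the collection."""
    if freq < 0.05:
        return "rare"
    if freq > 0.5:
        return "abundant"
    return "common"  # assumption: boundary values 0.05 and 0.5 fall here


def class_shares(freqs):
    """Percentage of alleles falling in each frequency class."""
    counts = Counter(allele_class(f) for f in freqs)
    total = len(freqs)
    return {name: 100.0 * counts.get(name, 0) / total
            for name in ("rare", "common", "abundant")}


if __name__ == "__main__":
    # Hypothetical frequencies for the four alleles of a single marker.
    print(class_shares([0.01, 0.02, 0.20, 0.77]))
```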
Genetic Diversity and Phylogenetic Relationships
Shared allele frequencies were used to calculate the genetic distance between all pairwise combinations of the 85 accessions, and a UPGMA tree was constructed in MEGA 4.0 (Tamura et al., 2007) from data exported from PowerMarker. The resulting dendrogram revealed a complex accession distribution pattern.
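To make the pairwise distance step concrete, the sketch below computes a shared-allele distance between two accessions in Python. It assumes one common formulation (one minus the mean proportion of alleles shared per locus) and diploid genotypes stored as tuples of fragment sizes; the exact formula used by PowerMarker is not restated in the text, and the function name and genotypes here are hypothetical.

```python
from collections import Counter


def shared_allele_distance(geno_a, geno_b):
    """1 - mean proportion of shared alleles across loci.

    geno_a, geno_b: lists of per-locus allele tuples, e.g. [(120, 120), (98, 102)].
    Loci with missing data (empty tuple in either accession) are skipped.
    """
    shared_sum = 0.0
    n_loci = 0
    for alleles_a, alleles_b in zip(geno_a, geno_b):
        if not alleles_a or not alleles_b:
            continue  # missing genotype at this locus
        # Multiset intersection counts each shared allele copy once.
        shared = sum((Counter(alleles_a) & Counter(alleles_b)).values())
        shared_sum += shared / max(len(alleles_a), len(alleles_b))
        n_loci += 1
    if n_loci == 0:
        raise ValueError("no loci with data in both accessions")
    return 1.0 - shared_sum / n_loci


if __name__ == "__main__":
    # Hypothetical genotypes at two SSR loci (fragment sizes in bp).
    acc1 = [(120, 120), (98, 102)]
    acc2 = [(120, 120), (98, 104)]
    print(shared_allele_distance(acc1, acc2))
```

A matrix of such pairwise distances is the kind of input a UPGMA clustering step would take.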
As shown in Fig. 1, most accessions were clearly classified into three groups (SI, SII, and SIII). SI consisted of 25 accessions, of which 21 were from Southeast Asia and 4 from South Asia. SII included 21 accessions, all from South Asia and predominantly from India (11). SIII consisted of 39 accessions originating from 8 countries, predominantly Southeast Asian countries (31 accessions), with the remainder from South Asia. The dendrogram also contained five admixed accessions.
The genetic distance matrix of the rice populations from the 10 countries was also used to construct a UPGMA tree (Fig. 2). The similarity coefficients ranged from 0.227 to 1, with an average of 0.725. The rice populations clustered into two branches (BI and BII). BI consisted of four populations from Sri Lanka, India, Bangladesh, and Pakistan, all South Asian countries. BII contained five populations from Myanmar, Bhutan, Vietnam, the Philippines, and Thailand, all Southeast Asian countries; only Laos fell outside this group. The rice populations had an average of 3.4 alleles per locus, varying from 1.3 in Myanmar to 7.0 in the Philippines, and the average major allele frequency per locus was 0.58, ranging from 0.44 in the Philippines to 0.81 in Myanmar (Table 4). These results suggest that rice breeders could select polymorphic germplasm from these populations for use in breeding programs.
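The two per-population summaries reported here (mean number of alleles per locus and mean major allele frequency) can be sketched in Python as follows. The input layout, in which each locus is a pooled list of allele calls from one country's accessions, and the function names are assumptions made for illustration; the example calls are invented and do not come from Table 4.

```python
from collections import Counter


def locus_summary(allele_calls):
    """Return (number of distinct alleles, major allele frequency) for one locus."""
    counts = Counter(allele_calls)
    total = sum(counts.values())
    return len(counts), max(counts.values()) / total


def population_summary(loci):
    """Average the per-locus summaries over all loci that have data."""
    per_locus = [locus_summary(calls) for calls in loci if calls]
    mean_alleles = sum(n for n, _ in per_locus) / len(per_locus)
    mean_major = sum(f for _, f in per_locus) / len(per_locus)
    return mean_alleles, mean_major


if __name__ == "__main__":
    # Hypothetical pooled allele calls (fragment sizes) at two loci.
    loci = [[120, 120, 120, 124], [98, 98, 102, 104]]
    print(population_summary(loci))
```

A population with few accessions tends to show fewer alleles per locus and a higher major allele frequency, which is worth keeping in mind when comparing countries with very different sample sizes.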
Population Structure
Effective conservation and management strategies for rice accessions require a basic understanding of their population structure. The model-based clustering method for inferring population structure (Pritchard et al., 2000) was applied to all 85 accessions using the 29 SSR markers. The estimated likelihood values for a given K were consistent across three independent runs, but the distribution of LnP(D) did not show a clear mode for the true K, a behavior that is expected when factors such as inbreeding and departures from HWE are present (Falush et al., 2003). These factors can lead to an overestimation of K, the number of populations. Thus, another ad hoc quantity (ΔK) was used to overcome the difficulty of interpreting the real K value (Evanno et al., 2005). ΔK shows a clear peak at the true value of K. Based on this, we selected K = 3, and all rice accessions studied fell into three genetic groups (Fig. 3). The relatively small value of the alpha parameter (α = 0.028) indicated that most of the accessions originated from one primary ancestor, with a few admixed individuals (Ostrowski et al., 2006).
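The ΔK statistic can be illustrated with a short Python sketch. It assumes the commonly used form in which the absolute second-order difference of the mean LnP(D) across runs is divided by the standard deviation of LnP(D) at that K; this is a simplified reading of Evanno et al. (2005), not the authors' script. The function name and the LnP(D) values are invented for illustration only.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Compute delta K for every K that has both neighbours K-1 and K+1.

    lnp_by_k: dict mapping K -> list of LnP(D) values from replicate runs.
    Returns a dict mapping K -> delta K. A K whose replicate runs have zero
    standard deviation is skipped to avoid division by zero.
    """
    result = {}
    for k in sorted(lnp_by_k):
        if (k - 1) not in lnp_by_k or (k + 1) not in lnp_by_k:
            continue
        sd = stdev(lnp_by_k[k])
        if sd == 0:
            continue
        second_diff = abs(mean(lnp_by_k[k + 1])
                          - 2 * mean(lnp_by_k[k])
                          + mean(lnp_by_k[k - 1]))
        result[k] = second_diff / sd
    return result


if __name__ == "__main__":
    # Invented LnP(D) values for three replicate runs at K = 1..5.
    runs = {
        1: [-5000, -5002, -4998],
        2: [-4500, -4503, -4497],
        3: [-4100, -4101, -4099],
        4: [-4080, -4090, -4070],
        5: [-4070, -4085, -4055],
    }
    dk = delta_k(runs)
    best_k = max(dk, key=dk.get)
    print(dk, best_k)
```

ΔK is undefined at the smallest and largest K tested, so the range of K explored should extend at least one step beyond any K of interest.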
The model-based structure analysis revealed the presence of three populations in the selected core set. When clustering based on genetic distance was compared with the model-based structure analysis, similar groupings of accessions were found (Fig. 4). Population 1 consisted of 19 accessions, mainly from Southeast Asia and predominantly from the Philippines (6) and Vietnam (3). Population 2 comprised 24 accessions collected from 9 countries in both South and Southeast Asia. Population 3 consisted of 37 accessions, mainly from the Philippines (17). Admixture and introgression events, and new gene combinations between domestic cultivars and their wild or weedy relatives, have been important for the evolution of domesticated plant species (Jarvis et al., 1999). Of the 85 rice accessions, 80 (94.1%) shared at least 75% membership with one of the three populations and were classified as members of that population, whereas five accessions (5.9%) were categorized as admixed forms with varying levels of membership shared among the three genetic groups (Table 5). Some of the rice accessions collected from the same geographic location did not cluster together. This result suggests that rice germplasm has been distributed freely across geographical barriers, which has contributed to the diversity observed. These results provide background information for rice crop improvement programs.
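SUMMARY
This study analyzed the genetic diversity and population structure of rice genetic resources collected from 10 Asian countries using 29 SSR markers, with the aim of providing basic information for efficient resource conservation and for improving the use of genetic resources in crop breeding.
A total of 85 rice accessions were collected, and the 29 SSR markers amplified a total of 342 alleles. The number of alleles ranged from 2 to 28, with an average of 11.8 alleles per marker. Genetic diversity and PIC values ranged from 0.12 to 0.93 and from 0.11 to 0.93, with averages of 0.73 and 0.71, respectively.
Analysis of relationships based on the genetic distances among the rice resources of each country separated them into two groups. Group BI included Sri Lanka, India, Bangladesh, and Pakistan, which belong to South Asia, and group BII included the Southeast Asian countries Myanmar, Bhutan, Vietnam, the Philippines, and Thailand, with the exception of Laos.
By country of collection, the average number of alleles per marker was lowest in Myanmar (1.28) and highest in the Philippines (7.03). The PIC value was likewise highest in the Philippines (0.64) and lowest in Myanmar (0.15).
Analysis of population structure using genetic distance and Structure ver. 2.2 showed that, at a probability of 75%, 80 of the 85 accessions (94.1%) could be divided into three subpopulations, and five accessions (5.9%) showed a genetically admixed form.
The subpopulations did not correspond to the countries of collection, and most accessions were distributed irregularly among the subpopulations.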