ISSN : 1225-8504(Print)
ISSN : 2287-8165(Online)
Journal of the Korean Society of International Agriculture Vol.24 No.4 pp.451-462

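Comparison of Diversity According to Combinations of Clustering Methods, Methods for Determining the Number of Core Accessions, and Sampling Methods in Establishing a Mungbean Core Collection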

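Jae-Gyun Gwag, Se-Jong Oh, Ho-Cheol Ko, Kyung-Ho Ma, Gyu-Taek Cho, Gi-An Lee, Sok-Young Lee, Yong-Jin Park*†
National Agrobiodiversity Center, National Academy of Agricultural Science, Rural Development Administration; *Department of Plant Resources, Kongju National University
1. Using 16 morphological traits of 705 mungbean germplasm accessions introduced from 26 countries, a total of eight core collections (SUNPR, SUNPPr, UNPR, UNPPr, SUNLR, SUNLPr, UNLR, UNLPr) were constructed by combining two clustering methods, two methods for determining the number of core accessions, and two sampling methods. When compared with the entire collection, none of the eight core collections differed from it in variance or mean.
2. Nei's diversity index was higher in all eight core collections than in the entire collection. UNLPr was the highest, and UNPR showed a relatively low index compared with the other core collections.
3. A chi-square test was performed to assess the homogeneity of accession distribution between the entire collection and the core collections. The 12 quantitative traits and one qualitative trait (seed coat color) were homogeneously distributed in all core collections, whereas the three qualitative traits (hypocotyl color, luster on seed surface, growth habit) were homogeneously distributed in some core collections but not in others. Among the eight core collections, UNPPr showed the highest homogeneity. The mean difference percentage (MD%) and coincidence rate (CR%) were at significant levels in all eight core collections.
4. All eight core collections showed no difference from the entire collection in means, showed high diversity and homogeneous distribution, and had MD% and CR% at significant levels, and were therefore interpreted as representing the diversity of the entire collection well.
5. Diversity did not differ significantly with core collection size (10% and 15%). Considering means, homogeneous distribution, coincidence rate, and lodging tolerance, UNPPr was judged to be the best of the eight core collections.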

Diversity Capturing Comparison by Combinations among Clustering Methods, Allocation Strategies, and Sampling Strategies in Establishment Core-Sets of Mungbean (Vigna radiata) Germplasm Preserved in NAC Genebank

Yong-Jin Park*†, Jae-Gyun Gwag, Se-Jong Oh, Ho-Cheol Ko, Kyung-Ho Ma, Gyu-Taek Cho, Gi-An Lee, Sok-Young Lee
*Department of Plant Resources, College of Industrial Sciences, Kongju National University
National Agrobiodiversity Center, National Academy of Agricultural Science, RDA
Received Aug. 7, 2012 / Revised Nov. 7, 2012 / Accepted Dec. 7, 2012

Abstract

A core collection is a subset chosen to represent the diversity of a collection with a minimum of redundancy, and is established to improve the conservation and use of genetic resources. In this study, eight core subsets were established by combining two clustering methods (SUN, UN), two allocation strategies (P, L), and two sampling strategies (R, Pr) using morphological traits, and were compared with the entire collection by variance, means, Nei's diversity index, goodness of fit, the mean difference percentage (MD%), the variance difference percentage (VD%), the coincidence rate (CR%), and the variable rate (VR%) of traits. The variances of the entire collection and the eight core subsets were homogeneous for all traits by Levene's test at both the 10% and 15% sample sizes, and the means of the entire collection and the eight core subsets did not differ significantly by the Newman-Keuls test for any trait at either sample size. The average Nei's diversity index of the eight core subsets was higher than that of the entire collection. UPGMA NTSYS Logarithmic Pragmatic (UNLPr) showed the highest average Nei's diversity index, and the Logarithmic (L) allocation strategy showed a higher average diversity index than the Proportional (P) strategy at both sample sizes. A chi-square test of frequency distributions between the entire collection and the eight core subsets showed homogeneous distributions for 13 of the 16 traits. The proportional allocation strategy showed relatively more homogeneous distributions than the logarithmic allocation strategy for the three qualitative traits: hypocotyl color (HC), luster on seed surface (LS), and growth habit (GH). All the core subsets had significant values of MD% and CR%. Although VD% decreased and CR% increased as the sample size increased from 10% to 15%, there was no significant difference between the 10% and 15% sample sizes. Considering all parameters such as means, homogeneous distribution, VD%, CR%, average lodging tolerance, and bruchid resistance, UNPPr was the best among the eight core subsets. The present results suggest that, regardless of clustering method, if group size can be adjusted according to group diversity and utility, a better core subset can be established, so that the original purposes of effective utilization and enhancement of genetic diversity are better served.

Genebanks around the world hold collections of the genetic resources of crop plants for long-term conservation and for ease of access by plant breeders, researchers, and other users (van Hintum et al., 2000). However, as the size of collections increases, the cost of their conservation and evaluation rises. Frankel and Brown (1984) proposed the idea of core collections, in which a limited set of accessions, with a minimum of repetitiveness, is chosen to represent the maximum genetic diversity of the entire germplasm resource. Since the concept of the core collection was proposed, studies on developing core collections have increased and have addressed different aspects such as grouping method, sampling strategy, and core size (Erskine and Muehlbauer, 1991; Perry et al., 1991; Zeuli and Qualset, 1993; van Hintum, 1994, 1995; Basigalup et al., 1995; Diwan et al., 1995; Marita et al., 2000; Wang et al., 2006; Jansen et al., 2007; Gangopadhyay et al., 2010).

A core subset derived from an existing collection comprises a chosen set of accessions that represent a wide cross-section of the genetic spectrum available in a given crop species, with the least amount of duplication. Such a core collection helps genebank curators do a better job of assembly, management, and use of collections, particularly of indigenous genetic resources (Brown, 1995). The core subset can be used economically for genomics research, and the latter in turn can result in better strategies for developing, validating, and revising a core collection (Jackson et al., 1999). Core collections have been constituted in at least 20 plant species (reviewed by Cui et al., 2003), including grain legumes such as common bean (Tohme et al., 1995), chickpea (Hannon et al., 1994; Upadhyaya and Ortiz, 2001), groundnut (Holbrook et al., 1993; Upadhyaya et al., 2003), mungbean (Bisht et al., 1998a), and pea (Wojciech et al., 2000).

Core collections can be established based on phenotypes (Huamán et al., 1999; Upadhyaya et al., 2001), pedigrees (Martynov et al., 2003), geographic origins (Tai and Miller, 2001), isozymes (Chandra et al., 2002), DNA markers (Grenier et al., 2000; McKhann et al., 2004), and combinations of phenotypes and DNA markers (Wang et al., 2006; Moe et al., 2012).

Sampling strategy is important in establishing a good core collection. Clustering with stratification is more effective than clustering without it (Brown, 1989). In other words, collections ought to be divided into distinct genetic groups before clustering (van Hintum et al., 1995). Various clustering techniques, viz. single linkage clustering, complete linkage clustering, the unweighted pair group method using arithmetic averages (UPGMA), and Ward's minimum variance clustering method, have been used to classify accessions into discrete groups (Romesburg, 1984).

A further question for sampling strategy is how to determine the number of accessions allocated to each group. At least six strategies have been proposed (Brown, 1989; Yonezawa et al., 1995; Chandra et al., 2002), including constant (strategy C), proportional (strategy P), logarithmic (strategy L), random (strategy R), square root (strategy S), and genetic diversity-dependent (strategy G). Brown (1989) considered strategy L superior to C and P; Yonezawa et al. (1995) concluded that if the genetic structure of the germplasm is well understood, strategy G should be used, and otherwise P is superior to the R, C, and L strategies. Li et al. (2000) pointed out that the optimal sampling strategy depends on both the germplasm structure and the groups defined before sampling.

Franco et al. (2005) compared four allocation methods, namely the D method (the mean of the Gower distance between accessions within the cluster), the L method (the logarithm of the cluster size), the NY method (the product of the cluster size and the mean Gower distance), and the LD method (the product of the logarithm of the cluster size and the mean Gower distance), with the aim of determining which one forms core subsets that best retain the diversity contained in the original collection. Newer approaches such as the M (maximization) strategy or nested selection methods (Schoen and Brown, 1993; Bataillon et al., 1996; Marita et al., 2000) have been used to select specific combinations of accessions that include complete coverage and retention. The objective of this study was to find an optimal clustering method, sampling strategy, and combination of the two for forming a representative core subset of the mungbean germplasm preserved at NAC, based on morphological traits.

MATERIALS AND METHODS

Germplasm and morphological traits characterization

A total of 705 mungbean accessions collected from 26 countries were obtained from the National Agrobiodiversity Center (NAC) of Korea (Table 1). Each accession was grown in the field and characterized for 16 morphological traits: hypocotyl color (HC; 1. green, 2. greenish purple, 3. purple), seed coat color (SC; 1. yellow, 2. greenish yellow, 3. light green, 4. dark green, 5. brown, 6. mottled), luster on seed surface [LS; 1. absent (dull), 2. present (shiny)], growth habit (GH; 1. erect, 2. semi-erect, 3. spreading), days to flowering (DF; days to 50% flowering), days to maturing (DM; days to 50% maturing), pod length (PL; cm, average of 20 pods), pod width (PW; cm, average of 20 pods), seed length (SL; mm, average of 20 seeds), seed width (SDW; mm, average of 20 seeds), plant height (PH; cm, average of 10 plants), number of first branches (BN; number of pod-bearing branches originating from the leaf axils on the main stem, average of 10 plants), number of pods per plant (PNP; average of 10 plants), number of seeds per pod (SNP; average number of seeds from 20 pods), 100-seed weight (SW; g), and yield per plant (YI; g). The characterization data were subjected to the unweighted pair group method using arithmetic averages (UPGMA) in the NTSYS program to construct clusters, and were used to calculate the variance, the means, the mean difference percentage (MD%), the variance difference percentage (VD%), the coincidence rate (CR%), and the variable rate (VR%) of the entire collection and the core subsets. Quantitative trait data were graded into classes, like qualitative trait data, by Sturges' rule in order to calculate Nei's diversity index and the goodness of fit of traits for the entire collection and the core subsets.
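The grading and diversity steps can be illustrated with a minimal Python sketch. It assumes Sturges' rule in its usual form (number of classes = ceil(log2 n) + 1) and assumes the diversity index takes the form H = 1 - sum(p_i^2) over class frequencies, which the text does not spell out. The function names and equal-width binning are hypothetical choices for illustration, not the authors' procedure.

```python
import math
from collections import Counter


def sturges_classes(n):
    """Number of classes suggested by Sturges' rule for n observations."""
    return math.ceil(math.log2(n)) + 1


def grade(values, k):
    """Assign each quantitative value to one of k equal-width classes (1..k)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    if width == 0:
        return [1] * len(values)
    # The maximum value falls on the upper edge, so clamp it into class k.
    return [min(int((v - lo) / width) + 1, k) for v in values]


def nei_diversity(grades):
    """Diversity index H = 1 - sum(p_i^2) over the class frequencies p_i."""
    n = len(grades)
    return 1.0 - sum((count / n) ** 2 for count in Counter(grades).values())
```

Under these assumptions, a trait whose accessions are spread evenly over many classes scores close to 1, while a trait dominated by a single class scores close to 0.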

Table 1. Countries of origin of the 705 mungbean accessions used in this study and number of accessions by growth habit.

Construction of core subsets

To compare their effectiveness, eight core subsets (SUNPR, SUNLR, SUNPPr, SUNLPr, UNPR, UNLR, UNPPr, UNLPr) were established (Table 2) by combining two clustering methods [stratification by origin/growth habit followed by UPGMA NTSYS clustering (SUN), and UPGMA NTSYS clustering alone (UN)], two allocation strategies [Proportional (P), Logarithmic (L)], and two sampling strategies [Random (R), Pragmatic (Pr)] using morphological traits.

Table 2. Clustering methods and allocation/sampling strategies to construct core subsets.

Core subsets were constructed at two sample sizes, 10% (71 accessions) and 15% (106 accessions) of the entire collection (705 accessions). The unweighted pair group method using arithmetic averages (UPGMA), based on a similarity matrix, was used to construct clusters with the SAHN-clustering and TREE programs of NTSYS-pc V.2.1 (Rohlf, 1992).
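As an illustration of the clustering step, the following minimal Python sketch performs UPGMA (average-linkage) agglomeration on graded trait profiles, using one minus the simple matching coefficient as the distance. It is not the NTSYS-pc implementation; the function names, the stopping rule (a fixed number of clusters), and the tie-breaking order are assumptions made for the example.

```python
def simple_matching_distance(a, b):
    """1 - (share of traits on which two graded profiles agree)."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 1.0 - matches / len(a)


def upgma(profiles, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    The distance between two clusters is the unweighted mean of all
    pairwise distances between their members (average linkage).
    Returns a list of clusters, each a sorted list of profile indices.
    """
    clusters = [[i] for i in range(len(profiles))]

    def average_distance(c1, c2):
        total = sum(
            simple_matching_distance(profiles[i], profiles[j])
            for i in c1
            for j in c2
        )
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        best = None  # (distance, index_a, index_b)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_distance(clusters[a], clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return [sorted(c) for c in clusters]
```

This brute-force version recomputes all pairwise averages at every merge, which is adequate for a toy example but slow for hundreds of accessions; a dedicated clustering package would normally be used instead.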

METHODS OF CORE SUBSET CONSTRUCTION BY COMBINATION AMONG CLUSTERING METHODS AND SAMPLING STRATEGIES

Stratification by origin/growth habit, UPGMA NTSYS clustering and proportional sampling strategy (SUNPR)

First, the 705 accessions were stratified by country of origin and by the three growth habits, which produced 57 groups. Core subset entries were allocated to each cluster (group) in proportion to the number of accessions in that cluster. Of the 57 groups, those with three or more accessions were subjected to NTSYS-pc V.2.1 to build a dendrogram from the 16 morphological traits. Core entries were chosen as follows: if a group contained only one accession, it was sampled directly into the core subset; if one accession was to be selected from two or more accessions in a group, it was selected at random.
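The stratification and random selection described above can be sketched in Python as follows. The record layout (`id`, `origin`, `growth_habit`) and the function names are hypothetical, and the sketch omits the within-group UPGMA clustering step; it only shows grouping by origin and growth habit and the rule that a group with no more accessions than its quota is taken whole.

```python
import random
from collections import defaultdict


def stratify(accessions):
    """Group accession ids by (country of origin, growth habit)."""
    groups = defaultdict(list)
    for acc in accessions:
        groups[(acc["origin"], acc["growth_habit"])].append(acc["id"])
    return dict(groups)


def pick_random(group_ids, n, rng):
    """Take the whole group if it has at most n members, else sample n at random."""
    if len(group_ids) <= n:
        return list(group_ids)
    return rng.sample(group_ids, n)


# Hypothetical toy records, not data from the study.
accessions = [
    {"id": "A1", "origin": "Korea", "growth_habit": 1},
    {"id": "A2", "origin": "Korea", "growth_habit": 1},
    {"id": "A3", "origin": "Korea", "growth_habit": 3},
    {"id": "A4", "origin": "India", "growth_habit": 1},
]
groups = stratify(accessions)
rng = random.Random(0)  # fixed seed so the draw is repeatable
core = [acc for ids in groups.values() for acc in pick_random(ids, 1, rng)]
```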

Stratification by origin/growth habit, UPGMA NTSYS clustering and logarithmic sampling strategy (SUNLR)

Stratification, clustering, and core entry selection were the same as for SUNPR. The number of core entries was allocated to each cluster in proportion to the logarithm of the number of accessions in the cluster.
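The difference between proportional and logarithmic allocation can be illustrated with a short Python sketch. It assumes the quota of each cluster is the core size multiplied by the cluster's share of the total size (P) or of the total log size (L), rounded to the nearest whole number, with a floor of one entry per cluster. The floor and the function names are assumptions for illustration; because of rounding and the floor, the quotas need not sum exactly to the target core size.

```python
import math


def allocate_proportional(cluster_sizes, core_size):
    """Quota per cluster proportional to the cluster size, at least one each."""
    total = sum(cluster_sizes)
    return [max(1, round(core_size * size / total)) for size in cluster_sizes]


def allocate_logarithmic(cluster_sizes, core_size):
    """Quota per cluster proportional to log(cluster size), at least one each."""
    logs = [math.log(size) for size in cluster_sizes]
    total = sum(logs)
    if total == 0:  # every cluster is a singleton
        return [1] * len(cluster_sizes)
    return [max(1, round(core_size * value / total)) for value in logs]


# Hypothetical cluster sizes, not taken from Table 3.
sizes = [100, 10, 1]
proportional = allocate_proportional(sizes, 11)
logarithmic = allocate_logarithmic(sizes, 11)
```

In this toy case the proportional quotas are 10, 1, and 1, while the logarithmic quotas are 7, 4, and 1: the logarithmic rule shifts entries from the largest cluster toward the smaller one.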

Stratification by origin/growth habit, NTSys clustering and pragmatic proportional sampling strategy (SUNPPr)

Stratification, clustering, and allocation of the number of core entries were the same as for SUNPR. For selection of core entries, however, accessions with a high reputation, high expected availability, or maximum or minimum trait values were considered instead of random selection.
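One part of the pragmatic selection, preferring accessions that hold the maximum or minimum value of a trait, can be sketched in Python as below. This is only an illustrative assumption about how such a preference might be applied within a cluster; reputation and expected availability are curator judgments that the sketch does not model, and the function name and data layout are hypothetical.

```python
def pick_pragmatic(cluster, n):
    """Pick up to n accession ids, taking trait extremes first.

    cluster maps an accession id to its list of trait values. For each
    trait in turn, the accessions holding the minimum and maximum value
    are taken; any remaining quota is filled in listing order.
    """
    chosen = []
    n_traits = len(next(iter(cluster.values())))
    for t in range(n_traits):
        lowest = min(cluster, key=lambda acc: cluster[acc][t])
        highest = max(cluster, key=lambda acc: cluster[acc][t])
        for acc in (lowest, highest):
            if acc not in chosen and len(chosen) < n:
                chosen.append(acc)
    for acc in cluster:
        if len(chosen) >= n:
            break
        if acc not in chosen:
            chosen.append(acc)
    return chosen


# Hypothetical cluster with two traits per accession.
cluster = {
    "A": [10.0, 5.0],
    "B": [20.0, 3.0],
    "C": [15.0, 9.0],
    "D": [12.0, 4.0],
}
```

Selecting extremes first keeps the range of each trait in the core subset close to the range in the cluster, which is what the coincidence rate measures.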

Stratification by origin/growth habit, NTSys clustering and pragmatic logarithmic sampling strategy (SUNLPr)

Stratification, clustering, and the number of core subset entries were the same as for SUNLR, but the core entry selection method was the same as for SUNPPr.

UPGMA NTSYS clustering and proportional sampling strategy (UNPR)

Data for the 16 traits of the 705 accessions were subjected directly to the NTSYS-pc V.2.1 program to construct clusters. After cluster construction, core entries were allocated to each cluster in proportion to its number of accessions in the entire collection. The core entry selection method was the same as for SUNPR.

UPGMA NTSys clustering and logarithmic sampling strategy (UNLR)

Clustering and core entry selection were the same as for UNPR. After cluster construction, core subset entries were allocated to each cluster in proportion to the logarithm of its number of accessions in the entire collection.

NTSys clustering and pragmatic proportional sampling strategy (UNPPr)

The clustering method and number of core subset entries were the same as for UNPR, and the core entry selection method was the same as for SUNPPr.

NTSys clustering and pragmatic logarithmic sampling strategy (UNLPr)

The clustering method and number of core subset entries were the same as for UNLR, and the core entry selection method was the same as for SUNPPr.

EVALUATION OF THE CORE SUBSETS

The homogeneity of variances between the entire collection and the core subsets was compared for all the morphological traits by Levene's (1960) test. The means of the entire collection and the core subsets were compared using the Newman–Keuls procedure (Newman, 1939; Keuls, 1952). The homogeneity of distribution for each of the agronomic traits was analyzed by the chi-square test. The diversity index of Nei (1972) was estimated and used as a measure of phenotypic diversity in the entire collection and the core subsets for each trait. The percentage of traits differing significantly between the core subsets and the entire collection was calculated as the mean difference percentage (MD%) for means and the variance difference percentage (VD%) for variances. The coincidence rate, $CR\% = \frac{1}{m}\sum_{i=1}^{m}\frac{R_{c,i}}{R_{e,i}} \times 100$, and the variable rate, $VR\% = \frac{1}{m}\sum_{i=1}^{m}\frac{CV_{c,i}}{CV_{e,i}} \times 100$, were calculated to evaluate the properties of the core collection relative to the entire collection, where $R_c$ = range of the core collection, $R_e$ = range of the initial collection, $CV_c$ = coefficient of variation of the core collection, $CV_e$ = coefficient of variation of the entire collection, and $m$ = number of traits (Hu et al., 2000). Cluster analysis of the morphological data was performed with NTSYS-pc, version 2.1 (Rohlf, 1992), based on the simple matching coefficient and the unweighted pair-group method using arithmetic averages (UPGMA). Levene's test, the Newman–Keuls test, Nei's diversity index, the chi-square test, MD%, VD%, and VR% were calculated using excel program V (SAS Institute, 1989).
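The coincidence rate and variable rate can be illustrated with a minimal Python sketch. It assumes both are per-trait ratios (core over entire collection) averaged over the m traits and expressed as percentages, following the symbol definitions in the text, and it assumes the coefficient of variation uses the sample standard deviation. The function names and the toy data are hypothetical.

```python
import statistics


def coincidence_rate(entire, core):
    """CR%: mean over traits of (range in core / range in entire) x 100.

    entire and core are lists with one list of values per trait.
    """
    ratios = [
        (max(c) - min(c)) / (max(e) - min(e))
        for e, c in zip(entire, core)
    ]
    return 100.0 * sum(ratios) / len(ratios)


def variable_rate(entire, core):
    """VR%: mean over traits of (CV in core / CV in entire) x 100."""

    def cv(values):
        # Sample standard deviation divided by the mean (an assumption).
        return statistics.stdev(values) / statistics.mean(values)

    ratios = [cv(c) / cv(e) for e, c in zip(entire, core)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical values for two traits, not data from the study.
entire = [[1, 2, 3, 4, 5], [10, 20, 30, 40, 50]]
core = [[1, 3, 5], [20, 30, 40]]
cr = coincidence_rate(entire, core)
vr = variable_rate(entire, core)
```

In this toy case the first trait keeps its full range in the core and the second keeps half of it, so the coincidence rate is 75%.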

RESULTS

Allocation of core subsets entries

The allocation of core subset entries according to the combinations of clustering methods, allocation strategies, and sampling strategies is shown in Table 3.

Table 3. Core entry number allocation for each cluster according to allocation strategies in UPGMA NTSys clustering method (UN) and growth habit.

The number of core entries allocated to each growth habit and cluster does not accord exactly with the ratio of accessions in the entire collection, because the number of entries in each growth habit and cluster was rounded to the nearest whole number, many clusters contain only one or a few accessions in the entire collection, and we tried to select at least one accession from every group and cluster.

Evaluation of the core subsets

The variances of the entire collection and the eight core subsets were homogeneous for all traits by Levene's test at both the 10% and 15% sample sizes (Table 4). The means of the entire collection and the eight core subsets did not differ significantly by the Newman-Keuls test for any trait at either sample size (Table 5).

Table 4. Variance comparison between entire and 8 core subsets.

Table 5. Means comparison between entire and 8 core subsets.

A comparison of Nei's diversity index in the entire collection and the core subsets for the 16 qualitative and quantitative traits used in core subset formation is given in Table 6. All of the core subsets showed a higher average Nei's diversity index than the entire collection at both the 10% and 15% sample sizes. Among the eight core subsets, UNLPr showed the highest Nei's diversity index and UNPR the lowest at both sample sizes. The L allocation strategy (SUNLR, SUNLPr, UNLR, UNLPr) showed a higher average Nei's diversity index than the P strategy (SUNPR, SUNPPr, UNPR, UNPPr) at both sample sizes.

Table 6. Nei’s diversity index for 16 morphological traits in the entire collection and 8 core subsets.

The chi-square test of frequency distributions indicated homogeneity of distribution for all traits used for core formation except three qualitative traits, namely hypocotyl color, luster on seed surface, and growth habit. Among the eight core subsets, UNPPr showed no significant difference (at the 0.05 level) for any trait except luster on seed surface at the 10% sample size (Table 7) and hypocotyl color at the 15% sample size. The proportional allocation strategy (SUNPR, SUNPPr, UNPR, UNPPr) showed relatively more homogeneous distributions than the logarithmic allocation strategy (SUNLR, SUNLPr, UNLR, UNLPr) for the three qualitative traits (HC, LS, GH).
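A chi-square comparison of this kind can be sketched in Python as follows. The sketch assumes a goodness-of-fit form in which the expected class counts of a core subset are the class proportions of the entire collection scaled to the core size, with degrees of freedom equal to the number of classes minus one; the source does not state the exact form used. The function names and counts are hypothetical, and the critical values are the standard upper 5% points of the chi-square distribution.

```python
# Upper 5% critical values of the chi-square distribution by degrees of freedom.
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815}


def chi_square_fit(entire_counts, core_counts):
    """Return (statistic, degrees of freedom) for core versus entire class counts.

    Every class is assumed to be non-empty in the entire collection.
    """
    n_entire = sum(entire_counts)
    n_core = sum(core_counts)
    statistic = 0.0
    for entire, observed in zip(entire_counts, core_counts):
        expected = n_core * entire / n_entire
        statistic += (observed - expected) ** 2 / expected
    return statistic, len(entire_counts) - 1


def is_homogeneous(statistic, df):
    """True when the statistic is below the 5% critical value."""
    return statistic < CRITICAL_05[df]


# Hypothetical two-class trait (e.g. dull versus shiny seed surface).
entire_counts = [500, 205]
similar_core = [50, 21]
skewed_core = [30, 41]
stat_similar, df = chi_square_fit(entire_counts, similar_core)
stat_skewed, _ = chi_square_fit(entire_counts, skewed_core)
```

With only two or three classes, shifting a handful of accessions between classes changes the statistic sharply, which is one way a trait with few grades can fail the test in a small core subset.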

Table 7. Chi square test for 16 traits in the entire collection and 8 core subsets.

The eight core subsets were compared with the entire collection for VD%, MD%, CR%, and VR%. Seven core subsets showed VD% values above 20%, the exceptions being UNPR at both sample sizes and SUNPR at the 15% sample size. All the core subsets had significant values of MD%. All the core subsets also had significant values of CR%; in particular, SUNPPr and UNPPr at the 15% sample size showed a CR% of 100% (Table 8). This indicates that each of the eight core subsets was representative of the entire collection. Although VD% decreased and CR% increased as the sample size increased from 10% to 15%, there was no significant difference between the 10% and 15% sample sizes.

Table 8. Percentage of the trait differences between the entire collection and the 8 core subsets.

DISCUSSIONS

To form core subsets that represent the entire collection well in terms of maintaining genetic diversity, eight core subsets comprising 71 and 106 accessions (10% and 15% of the entire collection, respectively) were constructed by combining two clustering methods (SUN, UN), two allocation strategies (P, L), and two sampling strategies (R, Pr) using 16 morphological traits. Verification and evaluation are important for a core collection, because they determine how much of the full range of variation has been captured in the core collection (Galwey, 1995). The mean, genetic diversity, distribution homogeneity, percentage of significant differences (VD%, MD%), the coincidence rate (CR%) for range, and the variable rate (VR%) for the coefficient of variation were compared between the core subsets and the entire collection.

The means of the entire collection and the eight core subsets, at both the 10% and 15% sample sizes, did not differ significantly for any trait (Table 5). This indicates that the core subsets and the entire collection do not differ in their means. Although there was no significant difference in means between the entire collection and the core subsets UNPPr and UNLPr, it is noteworthy that these two core subsets had more pods per plant and a higher yield per plant than the entire collection (data not shown). Two quantitative traits that were not used to form the core subsets, namely lodging tolerance and bruchid resistance, were compared between the entire collection and the core subsets. Two core subsets, SUNPPr and UNPPr, were more tolerant to lodging than the entire collection at both the 10% and 15% sample sizes (Fig. 1).

Fig. 1. Comparison of average lodging tolerance index between entire collection and eight core subsets [grade; 1(no lodging), 3, 5, 7, 9(severe lodging)].

All the core subsets except UNPR were more resistant to bruchid (Callosobruchus spp.) than the entire collection at both the 10% and 15% sample sizes (Fig. 2). Among the eight core subsets, SUNLPr and UNLPr were more resistant to bruchid than the others. This indicates that the pragmatic sampling strategy can select more useful accessions for core subsets. A similar result was reported by Huamán et al. (1999), who argued that an excess of accessions less susceptible to pests in the core subset may be regarded as a bonus for breeders.

Fig. 2. Comparison of average bruchid resistance index between entire collection and eight core subsets [grade; 0(no damage), 1, 3, 5, 7(severe damage)].

The average Nei's diversity index of the eight core subsets was higher than that of the entire collection. Core subsets usually have higher diversity than the entire collection because redundant accessions are excluded from each cluster when the subset is formed. Among the eight core subsets, UNLPr showed the highest average Nei's diversity index, and the L allocation strategy (SUNLR, SUNLPr, UNLR, UNLPr) showed a higher average diversity index than the P strategy (SUNPR, SUNPPr, UNPR, UNPPr) at both the 10% and 15% sample sizes. The L strategy selects relatively more accessions from small clusters with greater diversity, which may lead to the higher diversity index of the L strategy.

A comparison of frequency distributions between the entire collection and the eight core subsets using the chi-square test showed homogeneous distributions for 13 of the 16 traits in all eight core subsets. Three qualitative traits, namely hypocotyl color, luster on seed surface, and growth habit, showed homogeneous or non-homogeneous distributions depending on sample size and core subset. In particular, luster on seed surface showed a non-homogeneous distribution in all core subsets at both the 10% and 15% sample sizes, except UNPPr at the 15% sample size (Table 7). The three traits whose homogeneity varied with sample size and core subset have few grades: luster on seed surface has only two (absent and present), and hypocotyl color and growth habit have three each. This small number of grades may lead to non-homogeneous distributions between the entire collection and the core subsets. The proportional allocation strategy (SUNPR, SUNPPr, UNPR, UNPPr) showed relatively more homogeneous distributions than the logarithmic allocation strategy (SUNLR, SUNLPr, UNLR, UNLPr) for the three qualitative traits (HC, LS, GH), and among the eight core subsets UNPPr showed the distribution most homogeneous with the entire collection at both the 10% and 15% sample sizes. The logarithmic strategy allocates relatively more entries to small clusters than to large clusters, which may lead to non-homogeneous distributions between the entire collection and the core subsets; the proportional strategy, on the contrary, allocates entries to core subsets in a ratio similar to that of the entire collection (Table 3).

A core collection is considered representative of the entire collection under the following conditions: (1) no more than 20% of the traits have different means (significant at α = 0.05) between the core set and the initial collection; and (2) the CR% retained by the core collection is no less than 80% (Hu et al., 2000). In this study, most of the core collections had VD% values above 20%, except UNPR at both the 10% and 15% sample sizes and SUNPR at the 15% sample size. Three core subsets (SUNPR, UNPR, and UNLR), which showed relatively low values of VD%, retained the genetic diversity structure of the entire collection well, and the other six core subsets retained high genetic diversity. All of the core subsets had significant values of MD%, CR%, and VR%, especially SUNPPr and UNPPr at the 15% sample size. This suggests that each of the eight core subsets is representative of the entire collection. Hu et al. (2000) established a core collection with a coincidence rate of 100% by a preferred sampling strategy, which can select accessions with both the maximum and minimum trait values of the initial collection.

An effective representation of the genetic diversity of the whole collection in a core depends on first separating the accessions into meaningful groups. A number of different approaches have been tested, either singly or in combination (Knüpffer and van Hintum, 1995; van Hintum, 1995; Hu et al., 2000; Zhang et al., 2000; Amadou et al., 2001; Li et al., 2005; Gangopadhyay et al., 2010). The optimum choice depends not only on the information available but also on the way in which genetic diversity varies within crop gene pools and the collections being used (van Hintum et al., 2000). In this study the SUN clustering method showed higher average values of VD%, CR%, and VR% than the UN method, but the difference was not significant.

An allocation method provides criteria for determining the number of accessions to be selected from each cluster. Many strategies have been proposed (Brown, 1989; Yonezawa et al., 1995; Chandra et al., 2002; Franco et al., 2005). Li et al. (2000) pointed out that the optimal sampling strategy depends on both the germplasm structure and the groups defined before sampling. However, there is so far no agreed way of deciding the number of accessions selected from each group. Sampling in proportion to the number of accessions in a group does not consider the genetic relationships among the groups. The method of constructing a core collection by stepwise clustering in the present study does not need a threshold value (cut-off point) to be determined, nor the group number and group size to be considered (Hu et al., 2000). Franco et al. (2005) considered the D method superior to the L, NY, and LD methods.

The reason for sampling accessions when forming core subsets is to identify a strategy that will structure a sample that recovers most of the diversity contained in the original collection, while maximizing the variance and the distances between accessions in the sample (Franco et al., 2005). Stratified sampling by clustering within groups is considered a better choice than non-stratified sampling, but there is no consensus strategy for choosing accessions within groups. Many attempts have been made to determine optimal sampling strategies for core collections (Hu et al., 2000; Bisht et al., 1998b; Jansen and van Hintum, 2007; Li and Nelson, 2002). In our study, the random (R) strategy showed lower values of VD%, CR%, and VR% than the pragmatic (Pr) strategy. Random sampling can represent the genetic diversity structure of the initial genetic resources, with small VD% and VR%, because accessions are randomly sampled from each of the subgroups at the lowest level of sorting. Random sampling can be used if a core collection is to maintain the genetic diversity pattern of the initial collection (Hu et al., 2000). The Pr strategy showed higher values of VD%, CR%, and VR% than the R strategy, and also had higher average lodging tolerance and higher average bruchid resistance (Fig. 1, 2). The Pr strategy is therefore considered to represent the entire collection well in terms of maintaining genetic diversity and valuable traits.

A core collection with a larger VD% and VR% is considered to provide a good representation of the genetic diversity of the initial collection. The preferred sampling strategy can select accessions with both maximum and minimum trait values while still retaining the genetic variation structure of the initial collection, so it can be used for developing a core collection that retains accessions with special or valuable characteristics in the initial germplasm collection (Hu et al., 2000).

Most core collections described so far are an order of magnitude smaller than the collection from which they came. The optimum sample fraction depends largely upon the degree of genetic redundancy among accessions, the resources available for maintenance of core entries, and the frequency of regeneration of the entries (Yonezawa et al., 1995). Charmet and Balfourier (1995) and Bisht et al. (1998b) analyzed size and grouping strategies and found that sizes of 5-10% were optimal, capturing 75-90% of the diversity. In contrast, Noirot et al. (1996) suggested that higher percentages (20-30%) are needed, particularly where the objective is to capture the genetic diversity of quantitatively inherited characters. In this study, although average VD% decreased and CR% increased as the sample size increased from 10% to 15%, there was no significant difference between the sample sizes for any parameter in the eight core subsets (Table 8). Considering all parameters such as means, homogeneous distribution, VD%, CR%, average lodging tolerance, and bruchid resistance, UNPPr was the best among the eight core subsets.

Various data have been used to analyze genetic diversity in crops, including morphological, agronomic, and ecogeographical traits as well as molecular and biochemical markers. Each of these criteria has its advantages and disadvantages for measuring genetic diversity. The representativeness of core collections developed by different sampling strategies and clustering methods is quite distinct. Because different clustering methods and sampling strategies can affect the properties of core collections, they should be combined to obtain a better representation in the core collection (Hu et al., 2000).

The present results suggest that, regardless of clustering method, if group size can be adjusted according to group diversity and utility, a better core subset can be established, so that the original purposes of conserving genetic resources, namely safe conservation and enhancement of genetic diversity, are better served.

ACKNOWLEDGMENTS

This study was carried out with the support of the “Research Program for Agricultural Science & Technology Development (Project No. PJ00750602)”, National Academy of Agricultural Science, Rural Development Administration, Republic of Korea.

Reference

1.Amadou HI, Bebeli PJ and Kaltsikes PJ. 2001. Genetic diver-sity in Bambara groundnut (Vigna subterranean L.) germplasm revealed by RAPD markers. Genome 44: 995-999.
2.Basigalup DH, Barnes DK and Stucker RE. 1995. Development of a core collection for perennial Medicago plant introductions. Crop Science 35: 1163-1168.
3.Bataillon TM, David JL and Schoen DJ. 1996. Neutral genetic markers and conservation genetics: simulated germplasm col-lection. Genetics 144: 409-417.
4.Bisht IS, Mahajan RK and Patel DP. 1998. The use of character-isation data to establish the Indian mungbean core collection and assessment of genetic diversity. Genetic Resources and Crop Evolution 45: 127-133.
5.Bisht IS, Mahajan RK, Lokknathan TR and Agrawal RC. 1998. Diversity in Indian sesame collection and stratification of germplasm accessions in different diversity groups. Genetic Resources and Crop Evolution 45: 325-335.
6.Brown AHD. 1989. Core collections: a practical approach to genetic resources management. Genome 31: 818-824.
7.Brown AHD. 1995. The core collection at the crossroads. In: Hodgkin, T.; Brown, A.H.D.; Van Hintum, Th.J.L. and Morales, E.A.V. (ed.) Core Collections for Today and Tomorrow, International Plant Genetic Resources Institute (IPGRI), Rome, Italy, Wiley-Sayce Publication, pp. 3-19.
8.Chandra S, Huaman Z, Hari Krishna S and Ortiz R. 2002. Optimal sampling strategy and core collection size of Andean tetraploid potato based on isozyme data-simulation study. Theoretical and Applied Genetics 104: 1325-1334.
9.Charmet G and Balfourier F. 1995. The use of geostatistics for sampling a core collection of perennial ryegrass populations. Genetic Resources and Crop Evolution 42: 303-309.
10.Cui YH, Qiu LJ, Chang RZ and Lv WH. 2003. Advances in the core collection of plant germplasm resources. Journal of Plant Genetic Resources. 4: 279-284 (in Chinese with an English abstract).
11.Diwan N, Mcintosh MS and Bauchan GR. 1995. Methods of developing a core collection of annual Medicago species. Theoretical and Applied Genetics 90: 755-761.
12.Erskine W and Muehlbauer FJ. 1991. Allozyme and morphological variability, outcrossing rate and core collection formation in lentil germplasm. Theoretical and Applied Genetics 83: 119-125.
13.Franco J, Crossa J, Taba S and Shands H. 2005. A Sampling Strategy for Conserving Genetic Diversity when Forming Core Subsets. Crop Science 45: 1035-1044.
14.Frankel OH and Brown AHD. 1984. Current plant genetic resources – a critical appraisal. In: Genetics: new frontiers, Oxford and IBH Publishing, New Delhi, India, pp. 1-11.
15.Galwey NW. 1995. Verifying and validating the representativeness of a core collection. Core collection of plant genetic resources, John Wiley & Sons, West Sussex, England.
16.Gangopadhyay KK, Mahajan RK, Kumar G, Yadav SK, Meena BL, Pandey C, Bisht IS, Mishra SK, Sivaraj N, Gambhir R, Sharma SK and Dhillon BS. 2010. Development of a core set in brinjal (Solanum melongena L.). Crop Science 50: 755-762.
17.Grenier C, Deu M, Kresovich S, Bramel PC and Hamon P. 2000. Assessment of genetic diversity in three subsets constituted from the ICRISAT sorghum collection using random vs nonrandom sampling procedures. B. Using molecular markers. Theoretical and Applied Genetics 101: 197-202.
18.Hannon RM, Kaizer WJ and Muehlbauer FJ. 1994. Development and utilization of the USDA chickpea germplasm core collection. In: Agronomy Abstracts, ASA, Madison, WI, USA, p. 217.
19.Holbrook CC, Anderson WF and Pittman RN. 1993. Selection of core collection from the U.S. germplasm collection of peanut. Crop Science 33: 859-861.
20.Hu J, Zhu J and Xu HM. 2000. Methods of constructing core collections by stepwise clustering with three sampling strategies based on the genotypic values of crops. Theoretical and Applied Genetics 101: 264-268.
21.Huamán Z, Aguilar C and Ortiz R. 1999. Selecting a Peruvian sweetpotato core collection on the basis of morphological, eco-geographical, and disease and pest reaction data. Theoretical and Applied Genetics 98: 840-844.
22.Jackson MT, Pham JL, Newbury HJ, Ford-Lloyd BV and Virk PS. 1999. A core collection for rice - needs, opportunities, and constraints. In: Johnson, R.C.; Hodgkin, T. (ed.) Core Collections for Today and Tomorrow, International Plant Genetic Resources Institute, Rome, Italy, pp. 18-27.
23.Jansen J and Van Hintum TH. 2007. Genetic distance sampling: a novel sampling method for obtaining core collections using genetic distances with an application to cultivated lettuce. Theoretical and Applied Genetics 114: 421-428.
24.Keuls M. 1952. The use of the 'Studentized range' in connection with an analysis of variance. Euphytica 1: 112-122.
25.Knüpffer H and Van Hintum TJL. 1995. The Barley Core Collection - an international effort. In: Hodgkin T, Brown AHD, Van Hintum TJL and Morales EAV (ed.) Core Collections of Plant Genetic Resources. John Wiley and Sons, UK, pp. 171-178.
26.Levene H. 1960. Robust tests for equality of variances. In: Olkin I. (ed.) Contributions to Probability and Statistics: Essays in Honour of Harold Hotelling, Stanford University Press, Stanford, USA, pp. 278-292.
27.Li LH, Qiu LJ, Chang RZ and He XL. 2005. Differentiation and Genetic Diversity of SSR Molecular Markers for Huanghuai and Southern Summer Sowing Soybean in China. Acta Agronomica Sinica 31: 777-783 (in Chinese with an English abstract).
28.Li Z and Nelson RL. 2002. RAPD marker diversity among soybean and wild soybean accessions from four Chinese provinces. Crop Science 42: 1737-1744.
29.Li ZC, Zhang HL, Zeng YW, Yang ZY, Shen SQ, Sun CQ and Wang XK. 2000. Study on sampling schemes of core collection of local varieties of rice in Yunnan China. Scientia Agricultura Sinica 33: 1-7 (in Chinese with an English abstract).
30.Marita JM, Rodriguez JM and Nienhuis J. 2000. Development of an algorithm identifying maximally diverse core collections. Genetic Resources and Crop Evolution 47: 515-526.
31.Martynov SP, Dobrotvorskaia TV, Dotlacil L, Stehno Z, Faberova I and Bares I. 2003. Genealogical approach to the formation of the winter wheat core collection. Russian Journal of Genetics 39: 917-923.
32.Mckhann H, Camilleri C, Berard A, Bataillon T, David JL, Reboud X, Corre VL, Caloustian C, Gut IG and Brunel D. 2004. Nested core collections maximizing genetic diversity in Arabidopsis thaliana. The Plant Journal 38: 193-202.
33.Moe KT, Gwag JG and Park YJ. 2012. Efficiency of PowerCore in core set development using amplified fragment length polymorphic markers in mungbean. Plant Breeding 131: 110-117.
34.Nei M. 1972. Genetic distance between populations. The American Naturalist 106: 283-292.
35.Newman D. 1939. The distribution of range in samples from a normal population expressed in terms of an independent estimate of standard deviation. Biometrika 31: 20-30.
36.Noirot M, Hamon S and Anthony F. 1996. The principal component scoring: a new method of constituting a core collection using quantitative data. Genetic Resources and Crop Evolution 43: 1-6.
37.Perry MC, Mcintosh MS and Stoner AK. 1991. Geographical patterns of variation in the USDA soybean germplasm collection. II. Allozyme frequencies. Crop Science 31: 1356-1360.
38.Rohlf FJ. 1992. NTSYS-pc numerical taxonomy and multivariate analysis system. New York, State University of New York.
39.Romesburg HC. 1984. Cluster analysis for researchers. Lifetime Learning Publications, Belmont, California.
40.SAS Institute. 1989. SAS/STAT User's Guide, Version 6, 4th edn. SAS Institute, Inc., Cary, NC, USA.
41.Schoen DJ and Brown AHD. 1993. Conservation of allelic richness in wild crop relatives is aided by assessment of genetic markers. Proceedings of the National Academy of Sciences, USA 90: 10623-10627.
42.Tai PYP and Miller JD. 2001. A core collection for Saccharum spontaneum L from the world collection of sugarcane. Crop Science 41: 879-885.
43.Tohme J, Jones P, Beebe S and Iwanaga M. 1995. The combined use of agroecological and characterization data to establish the CIAT Phaseolus vulgaris core collection. In: Hodgkin, T.; Brown, A.H.D.; Van Hintum, Th.J.L. and Morales, E.A.V. (ed.) Core Collections of Plant Genetic Resources, International Plant Genetic Resources Institute (IPGRI), John Wiley & Sons, New York, USA, pp. 95-108.
44.Upadhyaya HD and Ortiz R. 2001. A mini core subset for capturing diversity and promoting utilization of chickpea genetic resources in crop improvement. Theoretical and Applied Genetics 102: 1292-1298.
45.Upadhyaya HD, Bramel PJ and Singh S. 2001. Development of a chickpea core subset using geographic distribution and quantitative traits. Crop Science 41: 206-210.
46.Upadhyaya HD, Ortiz R, Bramel PJ and Singh S. 2003. Development of a groundnut core collection using taxonomical, geographical, and morphological descriptors. Genetic Resources and Crop Evolution 50: 139-148.
47.Van Hintum TJL. 1994. Comparison of marker system and construction of a core collection in a pedigree of European spring barley. Theoretical and Applied Genetics 89: 991-997.
48.Van Hintum TJL. 1995. Hierarchical approaches to the analysis of genetic diversity in crop plants. In: Hodgkin, T.; Brown, A.H.D.; Van Hintum, Th.J.L. and Morales, E.A.V. (ed.) Core Collections of Plant Genetic Resources. John Wiley and Sons, Chichester, UK, pp. 23-34.
49.Van Hintum TJL, Brown AHD, Spillane C and Hodgkin T. 2000. Core collections of plant genetic resources. IPGRI Technical Bulletin 3: 1-51.
50.Wang L, Guan Y, Guan R, Li Y, Ma Y, Dong Z, Liu X, Zhang H, Zhang Z, Liu R, Chang H, Xu L, Li F, Lin W, Luan Z, Yan XN, Zhu YL, Cui Y, Piao R, Liu Y, Chen P and Qiu L. 2006. Establishment of Chinese soybean (Glycine max) core collections with agronomic traits and SSR markers. Euphytica 151: 215-223.
51.Wojciech K, Bogdan W, Somsak A and Pawe K. 2000. An analysis of isozymic loci polymorphism in the core collection of the Polish Pisum genebank. Genetic Resources and Crop Evolution 47: 583-590.
52.Yonezawa K, Nomura T and Morishima H. 1995. Sampling strategies for use in stratified germplasm collections. In: Hodgkin, T.; Brown, A.H.D.; van Hintum, Th.J.L. and Morales, E.A.V. (ed.) Core Collections of Plant Genetic Resources. John Wiley and Sons, UK, pp. 35-54.
53.Zeuli PLS and Qualset CO. 1993. Evaluation of five strategies for obtaining a core subset from a large genetic resource collection of durum wheat. Theoretical and Applied Genetics 87: 295-304.
54.Zhang X, Zhao Y, Cheng Y, Feng X, Guo Q, Zhou M and Hodgkin T. 2000. Establishment of sesame germplasm core collection in China. Genetic Resources and Crop Evolution 47: 273-279.