Assessment of genetic diversity and differentiation of two major camel ecotypes ( Camelus dromedarius ) in Sudan using microsatellite markers

Although Sudan has the second largest camel population in Africa, it has not yet been genetically differentiated. The present study was undertaken to evaluate, for the first time, the genetic diversity and relationship of two major camel ecotypes representing the eastern (Butana) and western (Darfur) regions of Sudan using 12 microsatellite markers. A total of 107 samples of study ecotypes were investigated displaying high mean values of genetic diversity (mean number of alleles: 11.5± 1.45; polymorphism information content: 0.67± 0.04; observed heterozygosity: 0.69± 0.05; expected heterozygosity: 0.72± 0.04). The global inbreeding coefficient (FIT = 0.041± 0.03, P > 0.05) was attributed to substantial and non-significant withinpopulation inbreeding (FIS = 0.034± 0.03) and scarce but highly significant differentiation between ecotypes (FST = 0.008± 0.00; P < 0.0001). Multivariate analysis indicated a historical intermixing between different genealogical lineages making up the current admixed gene pool of the geographically divergent ecotypes. Consistent with this, STRUCTURE cluster analysis showed these ecotypes to be one mosaic admixed population. The results showed abundant genetic diversity within Sudanese dromedaries. Our study indicates that the two Sudanese camel ecotypes (Butana and Darfur) appear as an admixture of two geographical branches and do not support the contemporary division of Sudanese dromedaries into their respective socio-ethno-geography.


Introduction
Sudan ranks first among Arabian countries and second in Africa with respect to camel population, having more than four million (FAOSTAT, 2009: http://faostat3.fao.org). Camels in Sudan are of single-humped type, or dromedary (Camelus dromedarius). They are mainly owned by the nomadic tribes and migratory pastoralists. Therefore, camel production in Sudan is classified principally into nomadic and sedentary systems (Eisa and Mustafa, 2011). Camel classification in Sudan is based on morphological characteristics, ethnic pastoral communities and geographical distribution. Eisa and Mustafa (2011) stated that the packing and riding of camels in Sudan is spread in a belt configuration. This belt is characterized by erratic rainfall and contains two main regions: the eastern regions, including the Butana plains and the Red Sea hills, and the western regions in Darfur and Kordofan. Accordingly, the Sudanese camel ecotypes are classified socio-ethno-geographically into Butana, Lahaween, Shukriah and Rashaid in the eastern regions of Sudan, and into Darfur and Kababish in the western regions.
Sudanese camels have not been subjected to selective pressures. However, they have not been adequately differentiated based on genetic analyses. So far, there have only been few studies describing the biochemical variation in indigenous Sudanese camel ecotypes using milk proteins (Shuiep et al., 2013), and there has only been one comparative genetic study using six microsatellite markers between South African and Sudanese camels (Nolte et al., 2005). Additionally, a study by Ishag et al. (2010) reported single nucleotide polymorphisms in growth hormone genes among six Sudanese camel breeds.
Microsatellite markers are numerous, polymorphic and randomly distributed in the genome, as well as codominantly inherited and selectively neutral (Cheng et al., 1995). These markers have been used to determine genetic variations within and between camel populations with only a regional scope. Nolte et al. (2005) found high genetic diversity within, and a very low differentiation among, 16 South African dromedary populations across 12 microsatellites. Mburu et al. (2003) identified two separate genetic entities present in Kenyan dromedaries using 14 microsatellites. Vijh et al. (2007) indicated, based on 23 microsatellite markers, that there were two distinct genetic clusters in the Indian dromedaries. Ould  investigated eight microsatellites, reporting high genetic variability within three Tunisian dromedaries. Karima et al. (2011), utilizing three microsatellites, detected low genetic differentiation between five Egyptian dromedaries. Xiaohong et al. (2012) analysed 18 microsatellites, concluding that a genetic structuring among the 10 Bactrian Chinese and Mongolian camel populations was in agreement with the geographic distribution and natural geographic barriers among populations. Mahmoud et al. (2013), assaying 15 microsatellite markers, deduced a low genetic variance among four Saudi Arabian dromedaries.
The main objective of this study was to characterize the genetic structure (diversity and relationships) of the two major camel ecotypes (Butana and Darfur) in Sudan using 12 microsatellite markers. A further objective was to support or reject the current classification of Sudanese dromedaries, which is based on socio-geographical and tribal considerations.

Camel ecotypes and sampling genomic materials
A total of 107 unrelated Sudanese dromedary animals (Butana, n = 49; Darfur, n = 58) were sampled. The EDTA blood samples were collected from three tribes in each ecoregion representing a corresponding ecotype with a range of n = 15-19 for each in Butana and n = 18-20 for each in Darfur. The sampled tribes were distant from each other in Butana by nearly 80-130 km and in Darfur by nearly 250-450 km (Fig. 1). According to our questionnaires conducted with camel owners, the mean tribe size was 62 for migratory and 118 for sedentary animals, with male and female percentages being 25 and 75 %, respectively.
Genomic DNA was extracted following the standard protocol employing proteinase K digestion and phenol chloroform extraction as described by Sambrook et al. (1989).

Microsatellite loci genotyping
A total of 14 microsatellite markers spread across the camelide genome were used for genotyping, of which 7 are recommended for dromedaries by the ISAG/FAO working group (Hoffmann et al., 2004) as illustrated in Table 1. DNA was amplified in four multiplex reactions using labelled forward primers with 700 and 800 infrared fluorescent dyes (IRDs) (Eurofins-Operon, MWG, Germany). Primers of CMS121 and CVRL01 were labelled with 700 IRDs in one multiplex, and those of CVRL04 and CMS58 with 800 IRDs. Another multiplex included CMS50 and CVRL07 with 700 IRDs, as well as CVRL05 with 800 IRDs. Another of them contained LCA63 with 700 IRDs and VOLP67 and LCA90 with 800 IRDs. Lastly, 700 IRD-labelled primers of YWLL09 and VOLP10 and 800 IRD-labelled primers of YWLL08 and LCA18 were a part of one of the four multiplexes.
Multiplex PCR was performed in a total reaction volume of 12 µL for all selected loci in 96 plate tubes containing 50-100 ng of genomic DNA. PCR reactions were run in a Bio-Rad C1000 thermal cycler (Bio-Rad Laboratories Inc., Hercules, CA, USA) as follows: initial denaturation at 94 • C for 4 min, 35 cycles of denaturation at 94 • C for 45 s, primer annealing at 52-56 • C for 45 s, and extension at 72 • C for 2 min and final extension for 10 min at 72 • C. The resulting DNA fragments were denatured on 6 % polyacrylamide gel and separated on an automatic LI-COR sequencer (LI-COR Inc., Lincoln, NE, USA). Fragment size was estimated by plotting produced bands versus the 50-350 bp standard (LI-COR) because of unavailability of reference samples.

Statistical analyses
The observed (N A ) and expected (effective: N e ) number of alleles per locus across ecotypes and across markers as well as the pattern of private alleles within each ecotype were determined using GENAlEX 6.5 in Excel (Microsoft, Redmond, WA, USA) (Peakall and Smouse, 2006). Observed (direct count) heterozygosity (H o ), unbiased expected heterozygosity (H e ) (Saitou and Nei, 1987), and polymorphic information content (PIC) across ecotypes and across markers were estimated using POPGENE software version 1.31 (http://www.ualberta.ca/~fyeh/). Analysis of Wright's fixation indices (F IS , F IT and F ST ) were measured according to Weir and Cockerham (1984).
Analysis of molecular variance (AMOVA) was conducted with GENAlEX 6.5. The study ecotypes were clustered using multivariate analysis carried out through principal component analysis (PCA) implemented in PCAGEN1.2.1 (http: //www.soft82.com/download/windows/pcagen/). This non- parametric method identifies the primary axes of variation in data to project the two studied ecotypes onto these axes in a graphically appealing and intuitive manner. Therefore, it can uncover their underlying genealogical histories and demographic processes (McVean, 2009). STRUCTURE software (Pritchard et al., 2000) was used to study the ecotype structure and stratifications using genotype data. An admixture model was applied with correlated allele frequencies with 2 ≤ K ≤ 3. There were 20 runs for each K value used. The number of iterations in each run was 10 000 in Burn-in, followed by 50 000 iterations of Markov chain Monte Carlo length (MCMC). Pair-wise comparisons of the 20 solutions of each K value were run along with 100 permutations using CLUMPP software (Jakobsson and Rosenberg, 2007). The best solution obtained the highest similarity index (H ). Finally, the clustering pattern was graphically displayed using DISTRUCT software (Rosenberg, 2004).

Genetic variation among marker loci and within ecotypes
LCA18 and VOLP67 marker loci failed to be amplified on the previously mentioned multiplex PCR conditions; therefore they were excluded in the present study. The other 12 microsatellites were amplified successfully and used in further analyses. The parameters and indices of genetic diversity among the 12 marker loci and within ecotypes are shown in Tables 2 and 3. A total of 120 alleles were identified in the two camel ecotypes, with the mean number of observed alleles (MNA) being 11.5 ± 1.  Total mean ± SE 120 11.5 ± 1.45 0.68 ± 0.11 0.69 ± 0.03 0.72 ± 0.02 0.034 ± 0.03 + 0.008 ± 0.00 * * * 0.041 ± 0.03 + * * * dHWE: number of populations deviating from Hardy-Weinberg equilibrium (HWE); SE: standard error; + not significant at P > 0.05; significant deviation of the marker from HWE at * P < 0.05, * * P < 0.01 and * * * P < 0.001.  To describe the level of heterogeneity within and between ecotypes, F -statistics values were determined and summarized in Tables 2 and 3. A non-significant global inbreeding coefficient (F IT = 0.041 ± 0.03, P > 0.05) was attributed to high and non-significant within-population inbreeding (F IS = 0.034 ± 0.03) and low but highly significant differentiation between ecotypes (F ST = 0.008 ± 0.00, P < 0.001).
All the analysed markers showed low F ST estimates varying from 0.000 (CVRL04) to 0.021 (CVRL07). CVRL07 displayed the highest heterozygote deficiency and significant deviation from Hardy-Weinberg equilibrium (HWE) across ecotypes (F IS : 0.272). The highest heterozygote excess and significant deviation from HWE across ecotypes was de-tected in LCA90 (F IS : −0.137). The genetic equilibrium and absence of inbreeding influence across ecotypes was noticed in YWLL08 (F IS : 0.000). Within ecotypes, Butana exhibited a high and significant inbreeding coefficient (F IS : 0.07, P < 0.05), while Darfur showed a non-significant and low inbreeding coefficient (F IS : 0.02, P > 0.05). Both ecotypes deviated significantly from genetic equilibrium (P < 0.001).

Genetic differentiation between ecotypes and cluster analysis
AMOVA revealed that 1 % of total genetic variance resulted from genetic differentiation between two geographically distant ecotypes. The other 99 % was due to the withinpopulation components of the genetic variance. The pairwise fixation index (F ST ) value provided by AMOVA between studied ecotypes through 99 permutations differed significantly from zero at P < 0.05 (Table 4). The cluster pattern produced by the multivariate analysis using PCA is illustrated in Fig. 2, in which Butana and Darfur were projected along a line onto the parallel axes of the same synthetic space referring to genetic admixture between two ecotypes. The STRUCTURE cluster pattern is depicted  in Fig. 3. The two geographically divergent Sudanese camel ecotypes were clustered as one mosaic highly admixed population with no clear separation from K = 2 to K = 3. The best solution was achieved at K = 2, yielding the highest H value of 98.1 %.

Discussion
The main objective of this study was to determine, using microsatellite markers, the genetic diversity and the relationship of two major camel ecotypes inhabiting the eastern (Butana) and western (Darfur) regions of Sudan. Butana and Darfur are considered the major entities of camel populations in Sudan (Eisa and Mustafa, 2011). All 12 analysed loci displayed a large range of polymorphisms, and so their use should reduce the danger of overestimating genetic variability (Wimmers et al., 2000). They produced low F ST values, which in turn are useful for cluster analysis and capable of indicating genetic variation in terms of observed alleles (Rosenberg et al., 2001).
The MNA for loci per ecotype was 8.58 ± 0.91 and the corresponding mean number of effective alleles (N e ) was 4.15 ± 0.16. These results indicate equal distributions of frequent alleles of highly informative loci within ecotypes as well as enough sampling process for current genetic diversity assessment. The present MNA is greater than that found across 15 microsatellites within four Saudi Arabian camel populations (Mahmoud et al., 2012(Mahmoud et al., , 2013. This could reflect the absence of selective pressures applied to Sudanese dromedary ecotypes. In addition, Tunisian (Ould  and Egyptian dromedary populations (Karima et al., 2011) had a lower MNA than the current one due to investigation of fewer loci in their studies. The Kenyan and South African dromedaries have not been subjected to intensive selection (Mburu et al., 2003;Nolte et al., 2005); therefore this may be a possible reason for having a comparable MNA to ours. Analyses of 23 loci within 4 Indian dromedaries (Vijh et al., 2007) and 18 loci within 10 Chinese and Mongolian Bactrians (Xiaohong et al., 2012) yielded MNA values that were reasonably higher than ours.
Polymorphic information content and H e estimates within current ecotypes were higher compared with Kenyan and United Arab Emirates (Mburu et al., 2003), Indian (Vijh et al., 2007), Tunisian (Ould Ahmed et al., 2010), Chinese, Mongolian (Xiaohong et al., 2012) and Saudi Arabian camels (Mahmoud et al., 2012(Mahmoud et al., , 2013. Nonetheless, study ecotypes showed lower H e values than Egyptian camel breeds (Karima et al., 2011). The observed high genetic variation within studied ecotypes inferred from the high values of MNA, PIC and H e may reflect their large population sizes and wide genetic base attributed to various ancestry lineages.
The observed non-significant inbreeding coefficient (F IS = 0.034 ± 0.03) across individuals may indicate large size of Sudanese camel populations. However, this low heterozygote deficiency may be due to the presence of null alleles (Nei, 1987). It also could be owing to association of some alleles to functional genetic segments due to unsystematic selection applied to each ecotype according to tribal traditions. Within ecotype, there were higher and more significant heterozygote deficits in Butana than in Darfur. This is suggested due to the lower effective population size of Butana compared to Darfur. This suggestion is inferred from nomadism being dominant in Butana, which is characterized by a lower male-to-female ratio than in Darfur (Eisa and Mustafa, 2011). Moreover, this reflects Butana being sub-  jected to substructure or bottleneck because of its different soil composition, temperature ranges, and amount and duration of rainfall from Darfur (Drosa et al., 2011). Both ecotypes displayed significant deviation from HWE (P < 0.001), indicating gene flow because the migratory system is the dominant production system in Sudan (Eisa and Mustafa, 2011).
A lack of genetic variation in studied ecotypes was detected by AMOVA (F ST = 0.01). This result could indicate uncontrolled gene flow, introgression and crossing experienced between ecotypes, and is supported by cluster analyses. Nevertheless, this paucity of genetic differentiation was highly significant (P < 0.001), reflecting unsystematic individual selection according to the purpose of rearing for each ecotype. Moreover, this highly significant ecotype differentiation was consistent with finding private alleles in each ecotype.
Concordantly, a close phylogenetic relationship was observed between Darfur (west) and Butana (east), as depicted by the multivariate analysis using PCA (Fig. 2), in which Butana and Darfur constituted two parallel axes of the same synthetic space. This depiction could indicate a historical intermixing to a great extent between different genealogical lineages making up the current admixed gene pool of the geographically divergent ecotypes. In agreement with this, McVean (2009) interpreted PCA between two populations to be admixed when the populations project along a line. The later speculation is supported by the highly significant partitioning detected between ecotypes, the presence of a high number of private alleles in each, and the abundance of within-ecotype genetic diversity.
In agreement with this, the STRUCTURE algorithm clustered geographically distant ecotypes as one mosaic admixed population, supporting the history of occurrence of substantial gene flows and hybridization in between. Consistent with this, Nolte et al. (2005) and Mahmoud et al. (2013) detected very low genetic distinction among South African and Saudi Arabian dromedaries, respectively. However, Xiaohong et al. (2012) found a plausible genetic substructure among Bactrian Chinese and Mongolian camel populations according to natural geographic barriers.
In conclusion, Sudanese Butana and Darfur camel ecotypes exhibited abundant genetic diversity and almost no genetic substructure. Genetic differentiation is not in agreement with socio-ethno-geographical classification of Sudanese dromedaries. Hence, Butana and Darfur camel ecotypes appear as an admixture of two geographical branches (east and west).