A novel 17 bp indel in the SMAD3 gene alters transcription level, contributing to phenotypic traits in Chinese cattle

SMAD3, the messenger of the transforming growth factor beta (TGF-β) signaling pathway, plays essential roles in myogenesis and osteogenesis and may be involved in the regulation of body weight. In this study, a 17 bp indel (NC_007308: g.101893_101909insGAGGATGAGTGCTCCAG) in intron 3 of the SMAD3 gene was detected in four Chinese cattle breeds (Qinchuan, Jiaxian, Nanyang and Caoyuan) by DNA pool sequencing, and its effects on gene expression and growth traits were analyzed in Qinchuan and Caoyuan cattle. The results showed that the indel locus was significantly associated with SMAD3 transcriptional levels, with II genotypes showing a higher value than DD genotypes in Qinchuan (QC) cattle muscle tissue (P < 0.05). In addition, the locus was significantly associated with chest girth, chest width, rump length, hucklebone width and body weight in 2-year-old QC cattle (P < 0.05) and with body weight (12 months), body height (18 months) and chest girth (18 months) in Caoyuan cattle (P < 0.05). To the best of our knowledge, this is the first evidence of an association between the SMAD3 indel and cattle phenotype; it may contribute to understanding the function of the indel, which could be a promising marker for beef cattle breeding.


Introduction
TGF-β (transforming growth factor beta) regulates bone formation as well as bone resorption through its multiple effects on osteoblast and osteoclast migration, proliferation, differentiation and viability (Janssens et al., 2005; Fox and Lovibond, 2005). In this pathway, the canonical intracellular effector SMAD3 is phosphorylated upon recruitment to the activated TβRI-TβRII complex (Shi and Massague, 2003; Rahimi and Leof, 2007) and transmits the signal to the nucleus to control the expression of target genes, including RUNX2 (Alliston et al., 2001). Yang et al. (2001) reported that the TGF-β-SMAD3 signal repressed chondrocyte hypertrophic differentiation and was required for maintaining articular cartilage. Additionally, SMAD3 promoted alkaline phosphatase activity and mineralization of mouse osteoblasts, suggesting that SMAD3 was involved in the transcriptional mechanism causing bone formation (Sowa et al., 2002). Consistent with this, mice with targeted deletion of SMAD3 were osteopenic compared with wild-type littermates, which was attributed to a lower rate of bone formation (Borton et al., 2001). Recently, Liying et al. (2013) reported that an SMAD3 gene polymorphism was significantly related to osteoarthritis in a northeast Chinese population.
SMAD3 can also severely inhibit muscle development. It is known that SMAD3 inhibits myogenesis by repressing the activity of the MyoD family through interfering with the assembly of the bHLH transcription factor on E-box sequences (Liu et al., 2001). Also, TGF-β-activated SMAD3 represses MEF2-dependent transcription and inhibits terminal differentiation of myoblasts (Liu et al., 2004). Moreover, myostatin induces skeletal muscle wasting through SMAD3-mediated expression of functional genes (Lokireddy et al., 2011). Furthermore, SMAD3 defects result in impaired muscle regeneration, suggesting that SMAD3 plays an indispensable role in postnatal myogenesis (Ge et al., 2011).
Given that SMAD3 plays an important role in the regulation of bone formation and myogenesis, we hypothesized that variations in the SMAD3 gene may be relevant to cattle growth and influence phenotypic traits. The objective of this study was to explore genetic variations in the SMAD3 gene and evaluate their association with gene expression and growth traits in Chinese cattle.

Animals and phenotypic traits
All experimental procedures were approved by the Review Committee for the Use of Animal Subjects of Northwest A&F University, and animal experimentation, including fetal and adult sample collection, was in agreement with the ethical commission. A total of 844 blood samples were collected from four Chinese cattle breeds: Qinchuan (QC) cattle (n = 519) in Shaanxi Province, Jiaxian (JX) cattle (n = 120) in Henan Province, Nanyang (NY) cattle (n = 65) in Henan Province and Chinese Caoyuan (CY) cattle (n = 140) in Jilin Province. The animals of each breed were cows that were reared at the same farm and were in good physical condition. Calves were weaned on average at 6 months of age and raised from weaning to slaughter on a diet of corn and corn silage. Moreover, the animals of each breed were unrelated for at least three generations, with the aim of having diverse lineages within each breed. Genomic DNA was extracted from blood samples of all 844 animals using the phenol-chloroform method (Sambrook and Russell, 2001), diluted to a standard concentration (50 ng µL⁻¹) and stored at −80 °C for polymorphism detection. The DNA quality and purity (A260/A280 ratio) of each sample were assessed using the NanoDrop™ 1000 Spectrometer (Thermo Scientific, Waltham, MA, USA).
In the association analysis, growth traits of 2-year-old QC cattle were collected, including body height, height at hip cross, body length, chest girth, chest width, chest depth, rump length, hucklebone width, hip width, and body weight. The growth traits of CY cattle at three growth periods (6, 12 and 18 months) were also obtained, including body height, body length, body weight, chest girth and hucklebone width. All body measurements were collected following the procedures of Gilbert et al. (1993).

Primer design, amplification and genotyping
A total of eight primer pairs (Table S1 in the Supplement) were designed for the coding region of the bovine SMAD3 gene based on the reference sequence (GenBank accession number NC_007308) and synthesized by Genscript Company (Nanjing, China) for variation detection. DNA pools were constructed with 50 individual genomic DNA samples randomly chosen from each cattle breed and used as a polymerase chain reaction (PCR) template (Xue et al., 2013). PCR was performed in a 25 µL reaction volume containing 50 ng µL⁻¹ genomic DNA, 10 µM of each primer, 1× buffer (including 1.5 mM MgCl2), 200 µM dNTPs and 0.6 U of Taq DNA polymerase (MBI, Vilnius, Lithuania). Direct sequencing of PCR products was carried out on an ABI PRISM 3730XL DNA sequencer (Genscript, Nanjing, China). Indel variations were identified by aligning the sequenced reads against the reference sequence using the BioXM 2.6 software. After that, PCR-based amplification fragment length polymorphism (P-AFLP) was employed to detect indel polymorphisms, and a genotyping primer pair (forward: 5′-TTAAACCTGGTCTTGAGGGA-3′; reverse: 5′-AGGAAGGTGGTTCATTACATG-3′) was designed with Primer Premier 5 software. A 10 µL PCR reaction was performed on a thermal cycler system, consisting of 50 ng µL⁻¹ genomic DNA, 5 µM of each primer, 1× buffer (including 0.75 mM MgCl2), 100 mM dNTPs (dATP, dTTP, dCTP and dGTP), and 0.75 units of Taq DNA polymerase (MBI, Vilnius, Lithuania). The thermal cycling program was 5 min at 95 °C for pre-denaturation, then 36 cycles of 94 °C for 30 s, annealing at 55.9 °C for 30 s and extension at 72 °C for 30 s, followed by a final extension at 72 °C for 10 min. Finally, 7 µL of PCR products was directly assayed by electrophoresis on 3 % agarose gels stained with ethidium bromide.

Tissue collection, RNA isolation and cDNA synthesis
Skeletal muscles were dissected from 25 QC fetal individuals (fetus: 90 days) and 22 QC adult cattle (about 24 months old) at the Kingbull Company slaughterhouse for genomic DNA and RNA isolation. All animals were in a healthy condition. After collection, tissue samples were immediately frozen in liquid nitrogen and stored at −80 °C. Total RNA was isolated using the TRIzol reagent and treated with RNase-free DNase (TaKaRa), as per the manufacturer's protocol (TaKaRa, Otsu, Shiga, Japan). RNA quality and purity were estimated by 1 % agarose gel electrophoresis and spectrophotometry measurement, respectively. Reverse transcription was done using a PrimeScript® RT Reagent Kit with gDNA Eraser (Perfect Real Time) (Clontech, TaKaRa) in a 20 µL volume containing 2 µg RNA. The reaction was carried out in a PCR System Thermal Cycler Dice (TaKaRa, Dalian, China) at 42 °C for 2 min, followed by 37 °C for 15 min and 85 °C for 5 min, and cDNA samples were stored at −20 °C.

qPCR
SMAD3 gene expression levels in skeletal muscle tissue of QC fetal and adult cattle were evaluated by SYBR Green quantitative PCR using a standard curve method, and the data were normalized against the internal reference gene (the bovine glyceraldehyde-3-phosphate dehydrogenase gene, GAPDH) as described by Shi et al. (2016). qPCR was performed on a Bio-Rad CFX 96 Real-Time PCR Detection System in a 12 µL reaction volume containing 6 µL SYBR Premix Ex Taq™ II (Takara, Dalian, China), and the qPCR thermal profile was 95 °C for 30 s followed by 39 cycles of 95 °C for 5 s, 60 °C for 30 s, and 65 °C for 5 s. The primers used in qPCR are given in Table 1.
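As an illustration of the standard curve method named above, the following minimal Python sketch converts Ct values to relative quantities and normalizes the target gene to the reference gene. It is not the authors' analysis script: the function names, the curve parameters (slope and intercept) and the Ct values in the usage note are all hypothetical, and it assumes each gene has its own fitted curve of the form Ct = slope × log10(quantity) + intercept.

```python
def quantity_from_ct(ct, slope, intercept):
    """Invert a standard curve Ct = slope * log10(quantity) + intercept."""
    return 10 ** ((ct - intercept) / slope)


def amplification_efficiency(slope):
    """PCR efficiency implied by the curve slope (1.0 means 100 %)."""
    return 10 ** (-1.0 / slope) - 1.0


def normalized_expression(ct_target, ct_ref, curve_target, curve_ref):
    """Target quantity divided by reference quantity.

    curve_target and curve_ref are (slope, intercept) tuples; the values
    passed in are assumptions for illustration, not data from the study.
    """
    q_target = quantity_from_ct(ct_target, *curve_target)
    q_ref = quantity_from_ct(ct_ref, *curve_ref)
    return q_target / q_ref
```

For example, with a hypothetical curve `(-3.32, 35.0)` for both genes, `normalized_expression(25.0, 20.0, (-3.32, 35.0), (-3.32, 35.0))` gives the SMAD3 quantity relative to GAPDH; a slope near −3.32 corresponds to an efficiency close to 100 %.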

Prediction in silico of transcription factor binding sites in the indel locus
The in silico prediction of transcription factor binding sites (TFBSs) in the variation sequence was performed with the online Genomatix MatInspector software (http://www.genomatix.de/) (Cartharius et al., 2005). This program uses a large database of weight matrices, based on known in vivo binding sites, to predict TFBSs in DNA sequences.
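The general idea of scoring a sequence against a weight matrix can be sketched as follows. This is only a generic log-odds illustration and does not reproduce MatInspector's scoring scheme or any of its matrices: the 4 bp count matrix, the pseudocount and the uniform background are assumptions made up for the example, and only the 17 bp insertion sequence is taken from the text.

```python
import math

# Hypothetical position frequency matrix for a 4 bp motif
# (illustrative counts only, NOT a MatInspector matrix).
PFM = {
    "A": [1, 0, 0, 1],
    "C": [0, 0, 9, 1],
    "G": [1, 10, 0, 0],
    "T": [8, 0, 1, 8],
}


def to_log_odds(pfm, pseudocount=0.5, background=0.25):
    """Convert counts to a log2-odds weight matrix, one dict per position."""
    length = len(pfm["A"])
    pwm = []
    for i in range(length):
        total = sum(pfm[base][i] + pseudocount for base in "ACGT")
        pwm.append({
            base: math.log2((pfm[base][i] + pseudocount) / total / background)
            for base in "ACGT"
        })
    return pwm


def scan(sequence, pwm):
    """Return (0-based position, score) for every window of the sequence."""
    width = len(pwm)
    return [
        (i, sum(pwm[j][sequence[i + j]] for j in range(width)))
        for i in range(len(sequence) - width + 1)
    ]


INSERTION = "GAGGATGAGTGCTCCAG"  # 17 bp insertion reported in the text
pwm = to_log_odds(PFM)
hits = scan(INSERTION, pwm)
best = max(hits, key=lambda hit: hit[1])  # highest-scoring window
```

Real predictions additionally depend on matrix-specific score thresholds, so a high score in a sketch like this only flags a candidate site.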

Data statistics
Genotypic and allelic frequencies of the indel locus were calculated. Hardy-Weinberg equilibrium (HWE) was evaluated with a χ² test, and the observed and effective numbers of alleles were determined, using POPGENE software, version 3.2 (Yeh et al., 1999). Heterozygosity (He), homozygosity (Ho), effective allele numbers (Ne) and polymorphism information content (PIC) were computed following Nei's methods (Nei, 1973), also with the POPGENE software.
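For a biallelic locus such as this indel, the parameters listed above reduce to a few closed-form expressions. The sketch below is a minimal illustration under that assumption and is not a substitute for POPGENE: the function name and the genotype counts in the usage note are hypothetical, and the χ² statistic is the usual goodness-of-fit test with one degree of freedom.

```python
CHI2_CRITICAL_DF1 = 3.841  # chi-square critical value, 1 df, alpha = 0.05


def indel_locus_stats(n_ii, n_id, n_dd):
    """Population parameters for a biallelic indel from genotype counts."""
    n = n_ii + n_id + n_dd
    p = (2 * n_ii + n_id) / (2 * n)  # frequency of the I allele
    q = 1.0 - p                      # frequency of the D allele

    # Genotype counts expected under Hardy-Weinberg equilibrium
    observed = {"II": n_ii, "ID": n_id, "DD": n_dd}
    expected = {"II": p * p * n, "ID": 2 * p * q * n, "DD": q * q * n}
    chi2 = sum(
        (observed[g] - expected[g]) ** 2 / expected[g]
        for g in expected
        if expected[g] > 0
    )

    ho = p * p + q * q                     # homozygosity
    he = 1.0 - ho                          # heterozygosity
    ne = 1.0 / ho                          # effective allele number
    pic = 1.0 - ho - 2 * p * p * q * q     # polymorphism information content
    return {
        "freq_I": p, "freq_D": q, "chi2": chi2,
        "in_hwe": chi2 < CHI2_CRITICAL_DF1,
        "Ho": ho, "He": he, "Ne": ne, "PIC": pic,
    }
```

Calling `indel_locus_stats(25, 50, 25)` on made-up counts, for instance, describes a locus with equal allele frequencies that fits HWE exactly.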
The associations of the SMAD3 17 bp indel with gene expression and growth traits were evaluated using the general linear model (GLM) procedure of the SPSS software (Version 18.0). Genotype and birth season (four seasons in a year) were considered as fixed effects, and age was fitted as a random effect. Because the animals within each breed were only distantly related, animal and sire were not included as random effects. The adjusted linear model was $Y_{ijk} = \mu + G_i + A_j + e_{ijk}$, where $Y_{ijk}$ is the growth trait measured in the $ijk$th animal, $\mu$ is the overall mean, $G_i$ is the fixed effect of the $i$th genotype of the indel (II, ID or DD), $A_j$ is the effect due to the $j$th age and $e_{ijk}$ is the random residual error. The birth season effect (spring versus fall) was not included in the adjusted model, as the preliminary statistical analysis indicated that it did not have a significant influence on the variability of the traits in the study subjects (Zhang et al., 2014).

Indel variation identifications
In this study, a 17 bp indel (NC_007308: g.101893_101909insGAGGATGAGTGCTCCAG) in intron 3 of the SMAD3 gene was identified by DNA pool sequencing and alignment with the GenBank reference sequence (NC_007308; transcription start site of SMAD3 13717417; transcription end 13839788; initiation of translation 13717719) (Fig. 1). P-AFLP was carried out to genotype the indel locus, where the homozygous insertion type (II) consisted of a 266 bp fragment, the homozygous deletion type (DD) consisted of a 249 bp fragment and the heterozygote type (ID) showed the 266 bp and 249 bp homoduplexes together with a heteroduplex fragment (Nagamine et al., 1989) (Fig. 2).
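The band pattern described above maps onto genotypes in a simple way, which can be sketched as follows. The 249 bp and 266 bp fragment sizes and the insertion sequence come from the text; the function name, the 5 bp sizing tolerance and the example band sizes are assumptions for illustration, and any additional band (such as the heteroduplex in ID animals) is simply ignored.

```python
INSERTION = "GAGGATGAGTGCTCCAG"                   # 17 bp insertion from the text
DD_AMPLICON_BP = 249                              # deletion-allele amplicon
II_AMPLICON_BP = DD_AMPLICON_BP + len(INSERTION)  # insertion-allele amplicon


def call_genotype(band_sizes_bp, tolerance_bp=5):
    """Call II, ID or DD from estimated band sizes on the gel.

    tolerance_bp is a hypothetical sizing tolerance; it must stay below
    half the 17 bp difference so the two alleles cannot be confused.
    Bands matching neither allele (e.g. a heteroduplex) are ignored.
    """
    has_i = any(abs(b - II_AMPLICON_BP) <= tolerance_bp for b in band_sizes_bp)
    has_d = any(abs(b - DD_AMPLICON_BP) <= tolerance_bp for b in band_sizes_bp)
    if has_i and has_d:
        return "ID"
    if has_i:
        return "II"
    if has_d:
        return "DD"
    return None  # no interpretable band
```

A lane with bands estimated at 266, 249 and some larger apparent size would be called ID, whereas a single band near 249 bp would be called DD.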

Analysis of genetic parameters of the indel locus in four cattle breeds
Genotypic and allelic frequencies are shown in Table 2. The results indicated that the frequency of the DD genotype was high in QC and CY cattle but low in JX and NY cattle. Specifically, the II genotype had a frequency lower than 0.05 in CY cattle and was therefore excluded from the association analysis in this breed. Similarly, the D allele was predominant in the QC and CY breeds but had a low frequency in the JX and NY breeds. The χ² test indicated that none of the tested breeds were in accordance with the Hardy-Weinberg equilibrium (Table 2), suggesting that these cattle populations were not in dynamic equilibrium during migration, artificial selection, and genetic drift (Huang et al., 2013). The PIC value showed that CY cattle possessed low genetic diversity (0 < PIC < 0.25), whereas the other three cattle breeds had medium genetic diversity (0.25 < PIC < 0.5) at the indel locus. These differences in genetic parameters among the four cattle breeds may be caused by their diverse genetic backgrounds. In fact, as described by Lehnert et al. (2007), QC, NY and JX cattle are important representative native beef breeds reared in the central region of China (Shaanxi and Henan provinces), whereas CY cattle, crossbred from British shorthorn (male) and Mongolia cattle (female), are not classified as a native breed of China and are distributed in the northeast of China (Jilin Province).

Gene expression
Sequence variations detected in a gene promoter or regulatory region could influence transcription efficiency and thus alter genes' specific biological functions as well as the phenotype (Greenwood and Kelsoe, 2003; Braglia et al., 2014).
In our study, the associations of different genotypes at the indel locus with the SMAD3 expression level were evaluated in QC skeletal muscle. The results showed that the 17 bp indel was significantly associated with SMAD3 gene expression in fetal skeletal muscle (P < 0.05), where II individuals had a significantly higher mRNA expression level compared with ID and DD genotypes (Fig. 3). No significant difference was observed in adult skeletal muscle, but it was noted that II showed a higher relative mRNA expression level than the DD genotype (data not shown).

Prediction of transcription factor binding on the indel sequence
Gene expression is mostly regulated by the binding of transcription factors (TFs) to short DNA motifs within noncoding proximal regions (Arnone and Davidson, 1997). In particular, sequence variations within promoters or introns could change the binding sites of transcription factors and alter transcriptional efficiency. It is known that the IGF2 intron3-G3072A transition blocks a binding site for a repressor (muscle growth regulator, MGR) and leads to the upregulation of IGF2 expression in skeletal muscle, resulting in higher muscle content and reduced fat in pigs (Van Laere et al., 2003; Butter et al., 2010). In this study, bioinformatics analysis showed that five potential transcription factors, i.e., MEL1 (MDS1/EVI1-like gene 1), the homeodomain factor Nkx-2.5/Csx, a glucocorticoid receptor (IR1), PAX6 and the zinc finger protein 263 (ZNF263), could bind to the 17 bp insertion sequence (Fig. 4). MEL1 was identified as interacting with SKI and contributed to inhibiting TGF-β signaling by enhancing the SMAD3-SKI complex interaction on the promoter of TGF-β target genes (Takahata et al., 2009). It was shown that Nkx2.5 and GATA factors could cross-regulate each other's expression during cardiogenesis (Lien et al., 1999; Davis et al., 2000). The glucocorticoid receptor (GR1) belongs to the nuclear receptor (NR) superfamily. In the nucleus, GR1 can bind to glucocorticoid (GC) response elements (GREs) and regulate transcription of target genes (Surjit et al., 2011). PAX6, a member of the paired domain family of transcription factors, was originally identified in a homology screen for paired-box-containing genes (Walther and Gruss, 1991). ZNF263 binding sites were located within the transcribed region of target genes, and ZNF263 had both positive and negative effects on its target genes (Frietze et al., 2010). However, there was no direct evidence of their effects on SMAD3 transcription. Moreover, considering that these transcription factors can regulate target genes in positive or negative ways, we hypothesized that the differential expression among the three genotypes may be caused by compensatory regulatory effects of these potential transcription factors.

Association analysis of the indel locus with growth traits in QC and CY cattle
Association analysis showed that the indel locus was significantly associated with growth traits, including chest girth, chest width, rump length, hucklebone width and body weight, in 2-year-old QC cattle (P < 0.05) (Table 3). Individuals with the DD genotype had a better phenotype than those with the ID and II genotypes. Similarly, DD genotypes in CY cattle had greater body weight at 12 months and greater body height and chest girth at 18 months (P < 0.05) (Table 4). Interestingly, our results showed that the DD genotype had a relatively low expression level but performed better with regard to body traits, suggesting that a low expression level of the SMAD3 gene could relieve its inhibition of myogenesis and contribute to a superior phenotype. However, further research and validation are needed to reveal the molecular mechanisms by which SMAD3 indel polymorphisms regulate the gene expression and cattle phenotype observed in this study.

Conclusions
In summary, the 17 bp indel polymorphism identified in the present study could affect potential transcription factor binding and thus lead to a change in the SMAD3 mRNA expression level in skeletal muscle. Moreover, the indel locus was related to growth traits in Chinese cattle. These data provide a molecular basis for improving the economic traits of cattle using the 17 bp indel as an underlying molecular marker in beef cattle breeding.

Figure 1. Sequencing maps for the 17 bp indel in the SMAD3 gene. Panel (a): homozygotic insertion type (II); the sequence with the black border is the 17 bp insertion. Panel (b): homozygotic deletion type (DD). Panel (c): heterozygote type (ID). Note: the location of the indel locus relates to GenBank No. NC_007308.5.

Figure 2. The agarose gel electrophoresis patterns of the 17 bp indel within the SMAD3 gene. PCR products showed three genotypes at this locus, where the insertion type (II genotype) consisted of 266 bp, the deletion type (DD) consisted of 249 bp, and the heterozygote type (ID) showed 266, 249 and 17 bp and a heteroduplex fragment, as detected by 3 % agarose gel electrophoresis.

Figure 3. Comparison of SMAD3 gene expression levels among different genotypes of the 17 bp indel in QC fetal cattle. The expression level was calculated as a 2^(−ΔCt) value and normalized to the GAPDH gene. Error bars represent standard deviations of three different biological replicates; "a" and "b" denote values that differ significantly at P < 0.05.

Table 2. Genetic diversity of the 17 bp indel locus within the SMAD3 gene in the four cattle breeds. Numbers in brackets indicate the number of individuals.
Note: a and b denote values that differ significantly at P < 0.05.

Table 4. Association of the 17 bp indel within the SMAD3 gene with growth traits of CY cattle.
Genotype   BH-6            CG-6            BW-12              HW-12          BH-18             CG-18
DD         … ± 0.67        158.95 ± 1.04   248.38 ± 3.72 A    13.64 ± 0.34   125.82 ± 0.79 a   159.05 ± 0.95 a
ID         119.17 ± 2.32   158.33 ± 4.30   225.00 ± 20.12 B   14.00 ± 1.24   124.50 ± 3.84 b   154.00 ± 4.85
Note: BH-6 and CG-6 denote body height and chest girth of cattle aged 6 months. BW-12 and HW-12 denote body weight and hucklebone width of cattle aged 12 months. BH-18 and CG-18 denote body height and chest girth of cattle aged 18 months. a and b denote values that differ significantly at P < 0.05. A and B denote values that differ significantly at P < 0.01.