Using Bioinformatics and NCBI Tools for Sequence and Structure Analysis of the Transcription Factor 7 Like 2 Gene (TCF7L2) in Iraqi Type II Diabetes Mellitus Patients

*Corresponding Author: Batool K. Queen batoolkareem97@gmail.com

Abstract
This study aimed to analyze the sequence and structure of the Transcription Factor 7 Like 2 gene (TCF7L2) in Iraqi patients with Type II Diabetes Mellitus (T2DM) and to compare it, using BLAST, with the standard sequence from the National Center for Biotechnology Information (NCBI). Ten blood samples were collected from Iraqi T2DM patients (aged 17-65 years) at the Al-Mustansiriya University National Diabetes Centre in Baghdad Province, Iraq. DNA was extracted from the patients' whole blood using the Quick-DNA™ Blood MiniPrep kit and sent to Macrogen Corporation in Korea for automated DNA sequencing. Sequence analysis of the T2DM patients' samples with BLAST at NCBI revealed six missense mutations, one deletion mutation, and three silent mutations. All mutations occurred at the same sites of the gene, which controls the rate of transcription of genetic information, indicating a relationship with T2DM. These mutations were recorded in NCBI. The physicochemical properties of TCF7L2 were determined in the present study, and the alpha-helical structure and three-dimensional structure differed from those of the gene template. In brief, the mutations affected TCF7L2, which influences the structure and physicochemical properties of the protein and the secretion of insulin, the hormone that maintains the blood glucose level.


Introduction
The relationship between polymorphisms of the Transcription Factor 7 Like 2 gene (TCF7L2) and the development of Type II Diabetes Mellitus (T2DM) remains a major area of study, especially in Iraqi diabetic patients. Bioinformatics tools provide a standardized way to compare the sequence and structure of TCF7L2 in Iraqi patients with T2DM against a standard sequence from the National Center for Biotechnology Information (NCBI). Bioinformatics is defined as conceptualizing biology in terms of molecules (in the sense of physical chemistry) and applying "informatics techniques" (derived from disciplines such as applied mathematics, computer science, and statistics) to understand and organize the information associated with these molecules on a large scale. In a nutshell, bioinformatics is a molecular biology information management system with several applications [1].
The amount of data held in databases and the available computer processing capability have been increasing at an exponential rate, which explains why bioinformatics has progressed so quickly as a scientific field. Data on COVID-19, a new and dangerous disease, are now available in bioinformatics tools and can be collected for COVID-19 monitoring through an intelligent network of medical biosensors, thereby improving healthcare quality [2]. Bioinformatics comprises data management, analysis, and modelling across all domains [3].
Type II diabetes (T2DM) is a long-term condition in which the body's ability to produce or respond to the hormone insulin is impaired, or in which the body cannot use the insulin it produces, resulting in high blood glucose levels (hyperglycemia) [4]. Type II diabetes, which affects 90-95% of diabetic patients, is characterized by insulin resistance and a relative (rather than absolute) insulin deficiency [5]. Several factors, such as lifestyle, stress, and environmental conditions, are believed to contribute to this type of diabetes; in addition, in most cases of type II diabetes, each of several genes plays a small role in the overall disease. As of 2011, more than 36 genes had been discovered that contribute to the risk of type 2 diabetes, and TCF7L2 is one of the best-studied diabetes susceptibility genes in most populations [6,7].
The TCF7L2 gene is located on chromosome 10 (10q25.2-q25.3) and contains 19 exons, 5 of which are optional. TCF7L2 is considered a member of the high mobility group box (HMGB) family; the protein has 619 amino acids and a molecular mass of 67919 Da, and it plays a vital role in the Wnt signalling pathway [8]. TCF7L2 works as a transcription factor that influences the transcription of numerous genes, allowing it to perform a wide range of actions within the cell [9].
However, TCF7L2 does not directly regulate glucose metabolism in β-cells; rather, it regulates glucose metabolism in pancreatic and liver tissues [10]. It has also been reported to activate Wnt target genes, which in turn increases the proliferation of β-cells in the pancreas. TCF7L2 variants have since become essential genetic markers for determining the risk of type II diabetes mellitus around the world; they increase susceptibility to type II diabetes by decreasing the production of glucagon-like peptide-1 (GLP-1) [9]. Among the common genetic variants, the TCF7L2 risk allele carries the highest risk, increasing the risk of diabetes 1.5-fold [11]. Our study aimed to analyze the sequence and structure of the TCF7L2 gene in Iraqi T2DM patients and to examine how mutations or variations affect the gene and increase the potential risk of diabetes. Notably, this study is the first of its kind in Iraq.

Blood Sample Collection
Ten blood samples were collected from Iraqi type II diabetes mellitus patients (aged 17-65 years) at the Al-Mustansiriya University National Diabetes Centre in Baghdad Province, Iraq. Five millilitres of whole blood were drawn from each patient under aseptic conditions and collected in a sterile EDTA tube for genetic testing [12]. The blood samples were stored frozen at -20 °C.

DNA Extraction from Blood Samples
Human DNA was extracted from the patients' whole blood following the protocol of the Quick-DNA™ Blood MiniPrep kit (Catalog Nos. D3024 & D3025), which is optimized for extracting human DNA from fresh or frozen blood.

Estimation of DNA Concentration and Purity
The concentration and purity of the DNA extracted from the whole blood of type II diabetes mellitus patients were estimated using the Quantus™ Fluorometer (Promega, USA). Purity was determined from the ratio of the absorbance at 260 nm to that at 280 nm; a ratio of 1.6-1.8 indicates high-purity DNA [13].
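The purity criterion above can be illustrated with a minimal sketch. The function names and the example readings are hypothetical and are not data from this study; only the 1.6-1.8 acceptance window comes from the text.

```python
def a260_a280_ratio(a260: float, a280: float) -> float:
    """Return the A260/A280 absorbance ratio for a DNA sample."""
    if a280 <= 0:
        raise ValueError("A280 must be positive")
    return a260 / a280


def is_high_purity(ratio: float, low: float = 1.6, high: float = 1.8) -> bool:
    """True if the ratio lies inside the 1.6-1.8 window quoted in the text."""
    return low <= ratio <= high


# Hypothetical absorbance readings, for illustration only.
readings = {"sample_1": (0.85, 0.50), "sample_2": (0.60, 0.50)}
for name, (a260, a280) in readings.items():
    ratio = a260_a280_ratio(a260, a280)
    label = "high purity" if is_high_purity(ratio) else "check sample"
    print(name, round(ratio, 2), label)
```

A ratio outside the window would normally prompt re-extraction or clean-up before PCR; that decision is left to the operator and is not encoded here.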

Agarose Gel Electrophoresis
After DNA extraction, the presence and integrity of the extracted DNA were confirmed by agarose gel electrophoresis [14]. The samples were carefully loaded into the individual wells of the gel, and electrical power was applied at 7 V/cm2 for 1-2 h. The DNA migrated from the cathode (-) to the anode (+) (Figure 1). The bands, stained with Red Safe nucleic acid stain, were visualized in the gel under UV light at a wavelength of 336 nm.

Primer for TCF7L2 Gene
The full TCF7L2 gene sequence was taken from the genomic database of NCBI (reference sequence ID: NG_012631.1). The primers (Table 1) were supplied lyophilized by Integrated DNA Technologies (IDT, Canada) and were dissolved in free ddH2O to give a stock solution at an optimal concentration of 100 pmol/µl, which was kept at -20 °C. A working solution of 10 pmol/µl was prepared by adding 10 µl of the stock solution to 90 µl of free ddH2O, giving a final volume of 100 µl.
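The ten-fold primer dilution described above follows the usual relation C1 x V1 = C2 x V2. The sketch below is illustrative only; the function name is hypothetical, and the concentrations need only share one unit and the volumes another.

```python
def dilution_volumes(c_stock: float, c_working: float, v_final: float):
    """Return (stock volume, diluent volume) from C1 * V1 = C2 * V2."""
    if c_stock <= 0 or c_working <= 0 or v_final <= 0:
        raise ValueError("concentrations and volume must be positive")
    if c_working > c_stock:
        raise ValueError("working solution cannot be more concentrated than stock")
    v_stock = c_working * v_final / c_stock
    return v_stock, v_final - v_stock


# Stock at 100 units, working solution at 10 units, final volume 100 units.
stock, water = dilution_volumes(100, 10, 100)
print(stock, water)
```

With the values in the text, this reproduces the 10 parts of stock to 90 parts of water used for the working solution.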

Polymerase Chain Reaction
The reaction mix was prepared by combining the primers (working solution) and template DNA with Taq PCR Master Mix, as listed in Table 2, and vortexing to avoid salt concentration discrepancies. The PCR tubes were placed in a thermal cycler and run with the programmed cycles given in Table 3. The PCR products were then electrophoresed on a 2% agarose gel at 5 V/cm2.

DNA Sequence of TCF7L2 Gene Translation to Amino Acid Sequence
To translate the nucleotide sequence of the TCF7L2 gene into an amino acid sequence, we used BLASTX from the Basic Local Alignment Search Tool (BLAST) (https://blast.ncbi.nlm.nih.gov/Blast.cgi?) [17].
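The translation step that BLASTX performs before searching can be sketched as a six-frame translation with the standard genetic code. This is a simplified illustration, not the BLASTX implementation; the function names and the short example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq: str, frame: int = 0) -> str:
    """Translate one reading frame; unknown codons become 'X', stops '*'."""
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame_translation(seq: str) -> dict:
    """Translate the three forward and three reverse reading frames."""
    rev = reverse_complement(seq)
    frames = {f"+{i + 1}": translate(seq, i) for i in range(3)}
    frames.update({f"-{i + 1}": translate(rev, i) for i in range(3)})
    return frames


# Hypothetical 9-base sequence, for illustration only.
for name, protein in six_frame_translation("ATGGCCTAA").items():
    print(name, protein)
```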

Physicochemical Properties Prediction
The ProtParam online server (https://web.expasy.org/protparam/) was used to investigate the physicochemical properties of the TCF7L2 protein, which depend on the amino acid sequence and on how the residues interact along it. Features of the protein such as theoretical pI, amino acid composition, molecular weight, instability index, and other parameters can be calculated with this tool [18].
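Two of the properties listed above, molecular weight and theoretical pI, can be approximated from the sequence alone. The sketch below is not the ProtParam algorithm: the residue masses are rounded average values, the pKa values are assumed textbook-style figures rather than the scale ProtParam uses, and all names are hypothetical. Results will therefore differ somewhat from the server output.

```python
from collections import Counter

# Approximate average masses (Da) of the free amino acids.
FREE_AA_MASS = {
    "G": 75.07, "A": 89.09, "S": 105.09, "P": 115.13, "V": 117.15,
    "T": 119.12, "C": 121.16, "L": 131.17, "I": 131.17, "N": 132.12,
    "D": 133.10, "Q": 146.15, "K": 146.19, "E": 147.13, "M": 149.21,
    "H": 155.16, "F": 165.19, "R": 174.20, "Y": 181.19, "W": 204.23,
}
WATER = 18.015  # mass lost per peptide bond formed

# Assumed, illustrative pKa values (not the ProtParam scale).
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
N_TERM_PKA, C_TERM_PKA = 9.0, 2.0


def molecular_weight(protein: str) -> float:
    """Sum of free residue masses minus one water per peptide bond."""
    if not protein:
        raise ValueError("empty sequence")
    return sum(FREE_AA_MASS[aa] for aa in protein) - (len(protein) - 1) * WATER


def net_charge(protein: str, ph: float) -> float:
    """Henderson-Hasselbalch estimate of the net charge at a given pH."""
    counts = Counter(protein)
    positive = 1 / (1 + 10 ** (ph - N_TERM_PKA))
    negative = 1 / (1 + 10 ** (C_TERM_PKA - ph))
    for aa, pka in POSITIVE_PKA.items():
        positive += counts[aa] / (1 + 10 ** (ph - pka))
    for aa, pka in NEGATIVE_PKA.items():
        negative += counts[aa] / (1 + 10 ** (pka - ph))
    return positive - negative


def isoelectric_point(protein: str, tol: float = 1e-4) -> float:
    """Bisect on pH 0-14 for the point where the net charge crosses zero."""
    lo, hi = 0.0, 14.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if net_charge(protein, mid) > 0:
            lo = mid  # still positively charged: pI lies at higher pH
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical short peptide, for illustration only.
peptide = "MDEKR"
print(round(molecular_weight(peptide), 2), round(isoelectric_point(peptide), 2))
```

Bisection is sufficient here because the estimated net charge decreases monotonically as pH rises.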

Protein Structure Prediction
The PSIPRED online program (http://bioinf.cs.ucl.ac.uk/psipred/psiform.html/) was used to obtain a highly accurate prediction of the secondary structure of the protein, including alpha helices, beta strands, and coils, from the amino acid sequence [19]. The PHYRE2 online tool was used for the tertiary structure (https://string-db.org/cgi/) [20].

Results and Discussion
The TCF7L2 DNA was extracted from the whole blood samples following the standard procedure used in many genetic studies [21]. The DNA concentration measured by the Quantus™ Fluorometer was 1-50 ng/µl, and the purity ranged from 1.6 to 1.8. These results demonstrate that the purity of the extracted DNA in all samples was sufficiently high for PCR analysis, as shown in Figure 2 in light fluorescent colour.

Figure 2:
Human genomic DNA extracted from whole blood, shown as bands on a 1.5% agarose gel after 30 min of gel electrophoresis, stained with Red Safe nucleic acid stain and visualized under UV light.

PCR
After pure DNA had been extracted from all blood samples, PCR was used to amplify segments of the DNA obtained from the diabetic patients. Two specific primers were used to amplify the TCF7L2 gene exon; they were designed with NCBI software, showed 100% identity with the Homo sapiens TCF7L2 gene by BLAST, and were obtained from IDT (Integrated DNA Technologies, Canada). As shown in Figure 3, amplification of the patients' DNA segments with these primers gave clean PCR bands of 888 bp after electrophoresis on a 2% agarose gel at 5 V/cm2.
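The percentage identity reported by BLAST can be understood as the share of alignment columns in which both sequences carry the same residue. The sketch below assumes two already-aligned sequences of equal length with gaps written as "-"; it is a simplified illustration with a hypothetical function name, not the BLAST scoring code.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Identical, non-gap columns as a percentage of the alignment length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    pairs = zip(aligned_a.upper(), aligned_b.upper())
    matches = sum(1 for x, y in pairs if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


# Hypothetical aligned fragments, for illustration only.
print(percent_identity("ACGTACGT", "ACGTACGT"))
print(percent_identity("ACGTACGT", "ACGAAC-T"))
```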

Sequence Analysis of Samples
The sequencing results obtained from Macrogen Corporation (Korea) by automated DNA sequencing were compared with the sequence of the TCF7L2 gene. The comparison showed six missense mutations, one deletion mutation, and three silent mutations, detailed in Table 4. These were identified against the NCBI Reference Sequence using BioEdit Pro (version 7.0.0). The primary structure of the nucleotide sequence of the TCF7L2 gene from diabetic patients was translated into an amino acid sequence using the BLASTX tool from NCBI. The structural analysis provided the physicochemical properties of TCF7L2 for all samples using the ProtParam server, as shown in Table 5. The ProtParam results show that the amino acid substitutions have many effects on the TCF7L2 protein and on its function as a transcription factor in pancreatic β-cells. Among the physicochemical properties in Table 5, the stability of the protein is indicated by the Instability Index (II), which should be less than 40 for a stable protein. The substitutions in the gene affect the II, and a protein with an II above 40 is probably unstable according to the primary structure-dependent statistical index that was created for predicting protein stability in vivo [22]. The isoelectric point (pI) of a protein is the pH at which it carries no net charge; a protein is expected to precipitate in an acidic buffer if its pI is less than 7 and to be soluble in a basic buffer if its pI is more than 7. The pI of TCF7L2 in T2DM patients was 5.16, while the pI of TCF7L2 retrieved from NCBI was 5.39, so both are likely to precipitate in acidic buffers [23]. These results agree with other studies confirming that mutations affect the primary structure and physicochemical characteristics of a protein, especially its stability [24].
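The three mutation classes reported above (missense, silent, deletion) differ in how a changed codon affects the encoded amino acid. The following sketch classifies a single codon change under the standard genetic code. It is an illustration only: the function name and example codons are hypothetical and are not the variants listed in Table 4, and a deletion is represented simply by "-" characters or a shortened codon.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_change(ref_codon: str, alt_codon: str) -> str:
    """Label a codon change as deletion, silent, nonsense, or missense."""
    ref, alt = ref_codon.upper(), alt_codon.upper()
    if "-" in alt or len(alt) < len(ref):
        return "deletion"      # one or more bases are missing
    if ref == alt:
        return "no change"
    ref_aa, alt_aa = CODON_TABLE[ref], CODON_TABLE[alt]
    if ref_aa == alt_aa:
        return "silent"        # same amino acid despite the base change
    if alt_aa == "*":
        return "nonsense"      # premature stop codon
    return "missense"          # different amino acid


# Hypothetical codon pairs, for illustration only.
for ref, alt in [("GAA", "GAG"), ("GAA", "GTA"), ("GAA", "G-A")]:
    print(ref, alt, classify_codon_change(ref, alt))
```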

Protein Structure Analysis
The secondary structure of the TCF7L2 protein was determined using the PSIPRED online program, which predicted alpha helices, beta turns, and random coils for the sample groups of T2DM patients and healthy controls. The tertiary structure of the TCF7L2 protein from patients with T2DM was compared, using Phyre2, with the 3D structure of the TCF7L2 protein from NCBI; the tertiary structure of TCF7L2 in patients with diabetes differed from the templates provided by the Phyre2 tool (Figure 6). The template with the highest percentage identity (94%) was used to predict the 3D structure. Compared with TCF7L2 retrieved from NCBI, the mutations in the TCF7L2 gene of T2DM patients had different effects on the structure of the protein, as shown by the structural analysis. This result agrees with studies [25] demonstrating that mutations cause large changes in the sequence that affect the structure of the protein.

Conclusions
In the present study, the mutations in the TCF7L2 gene of Iraqi T2DM patients affected the physicochemical properties of the TCF7L2 protein, such as molecular weight, pI, and protein stability, compared with TCF7L2 retrieved from NCBI. The mutations also affected the structure of TCF7L2 by changing the number and position of alpha helices, beta turns, and coils, which leads to loss of the function of the TCF7L2 protein as a transcription factor.