Investigation of the variants at the binding site of inflammatory transcription factor NF-κB in patients with end-stage renal disease

Background A chronic inflammatory state is a prominent feature in patients with end-stage renal disease (ESRD). Nuclear factor-kappa B (NF-κB) is a transcription factor that regulates the expression of genes involved in inflammation. Some genetic studies have demonstrated that NF-κB genetic mutations could cause kidney injury and kidney disease progression. However, the association of gene polymorphisms in the transcription factor binding site of NF-κB with kidney disease is not clear. Methods We used the Taiwan Biobank database, the University of California, Santa Cruz, reference genome, and a chromatin immunoprecipitation sequencing database to find single nucleotide polymorphisms (SNPs) at potential binding sites of NF-κB. In addition, we performed a case–control study and genotyped 847 patients with ESRD and 846 healthy controls at Tri-Service General Hospital from 2015 to 2016. Furthermore, we used a ChIP assay to identify the binding activity of different genotypes and a luciferase reporter assay to examine the function of the rs9395890 polymorphism. Results Bioinformatic screening of the databases revealed 15 SNPs at potential binding sites of NF-κB. Genotype distributions of rs9395890 were significantly different in ESRD cases and healthy controls (P = 0.049). The ChIP assay revealed an approximately 1.49-fold enrichment of NF-κB of the variant type TT when compared to that of the wild-type GG in rs9395890 (P = 0.027; TT = 3.20 ± 0.16, GT = 2.81 ± 0.20, GG = 1.71 ± 0.18). The luciferase reporter assay showed that the NF-κB binding site activity with the T allele was slightly higher than that with the G allele, although the difference was not significant. Conclusions Our findings indicate that rs9395890 is associated with susceptibility to ESRD in the Taiwanese population. Electronic supplementary material The online version of this article (10.1186/s12882-019-1471-2) contains supplementary material, which is available to authorized users.

in the pathogenesis of CKD. The heritability of ESRD is 31.1% in the Taiwanese population [9].
Nuclear factor-kappa B (NF-κB) is an important transcription factor in inflammation and promotes the expression of genes involved in inflammation, such as cytokines and adhesion molecules. NF-κB comprises a family of dimeric transcription factors that regulate the expression of numerous genes involved in inflammation and cell proliferation [10].
Several studies have shown that inhibitors of NF-κB activation can regulate the inflammatory response of glomerular mesangial cells [19]. The pathogenesis of glomerular mesangial cell inflammation in patients with kidney disease has been associated with NF-κB activation [20]. A recent study showed that when patients with kidney disease have proteinuria, the NF-κB inflammatory reaction and expression of proinflammatory genes are accelerated [21][22][23].
Some genetic studies have shown the association of NF-κB genetic mutations with kidney failure and kidney disease progression [10,24,25]. NF-κB is an important transcription factor in inflammation. The polymorphisms in NF-κB transcription binding sites have yet to be identified. Therefore, we used bioinformatics technology and the Taiwan Biobank and chromatin immunoprecipitation sequencing (ChIP-Seq) databases to find NF-κB transcription binding site polymorphisms in a Han population. We then performed a case-control study to investigate the association between the polymorphisms and ESRD.

Bioinformatics analysis in the screening of gene processes
We performed a three-step process for screening of genes (Fig. 1).

Screening of genetic variation in a Taiwanese population through a quality control program
First, we used the single nucleotide polymorphism (SNP) database from the Taiwan Biobank. This database includes 58,917,994 SNPs and 997 next-generation sequencing (NGS) samples. Then, we used a human reference genome downloaded from the University of California, Santa Cruz (UCSC; GRCh37/hg19), and all SNPs within 500 kb upstream and downstream of each candidate SNP from the UCSC genome browser (https://genome.ucsc.edu/). We deleted structural variants (insertion/deletion, deletion) because there was no way to use the multifunctional mass spectrometer (mass array) for genotyping. We kept the remaining variants for study and deleted variants with a call rate of less than 90% at the position. Finally, the remaining SNPs were used for further alignment.
Sequence alignment techniques using bioinformatics analysis of genetic variations that may affect NF-κB binding
Second, we analyzed genetic variants that may affect NF-κB binding by using bioinformatics sequence alignment techniques and identified the variants located in the transcription factor binding site (TFBS). Prior studies have confirmed that NF-κB is a dimer assembled from five related structural proteins: p50, p52, p65 (RelA), RelB, and c-Rel. The combination of the p50 protein and the p65 protein is found in almost all cells [26][27][28]; as a result, this study explored only the p50/p65 heterodimer. The previously identified TFBS consensus sequence of this transcription factor is 5′-GGGRNNYYCC-3′ (R = A or G; N = A, C, G, or T; Y = C or T). We aligned this motif against all 36,041,790 SNPs and their flanking sequences within 500 kb and found 40,137 SNPs that may affect the binding activity.
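The motif alignment step can be illustrated with a minimal Python sketch. It assumes that the degenerate consensus 5′-GGGRNNYYCC-3′ is expanded into a regular expression and that both strands are scanned; the strand handling, the function names, and the example sequence are illustrative assumptions and do not reproduce the pipeline used in this study.

```python
import re

# Degenerate codes that appear in the consensus 5'-GGGRNNYYCC-3'
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def motif_to_regex(motif):
    """Compile a degenerate motif; the lookahead reports overlapping matches."""
    return re.compile("(?=(" + "".join(IUPAC[base] for base in motif) + "))")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(seq, motif="GGGRNNYYCC"):
    """Return sorted (start, strand) hits; start is 0-based on the forward strand."""
    seq = seq.upper()
    pattern = motif_to_regex(motif)
    n, k = len(seq), len(motif)
    hits = [(m.start(), "+") for m in pattern.finditer(seq)]
    # Scan the reverse complement and convert back to forward-strand coordinates.
    hits += [(n - m.start() - k, "-")
             for m in pattern.finditer(reverse_complement(seq))]
    return sorted(hits)


def snp_in_site(snp_pos, hits, motif_length=10):
    """True if a 0-based SNP position falls inside any motif hit."""
    return any(start <= snp_pos < start + motif_length for start, _ in hits)


# Hypothetical example sequence containing the site GGGACTTTCC
example = "TTTGGGACTTTCCAAA"
example_hits = find_sites(example)
print(example_hits, snp_in_site(5, example_hits))
```

A SNP would be retained in this sketch only when `snp_in_site` returns true for its position in the scanned window.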
Confirmation by ChIP-Seq that these mutations bind to these positions
Third, we used ChIP-Seq data to confirm that NF-κB does bind at the positions of these variants. After the above screening, we further filtered the results using NF-κB ChIP-Seq data from a previous study of the human genome [29]. That study performed ChIP-Seq analysis in B cells, analyzed the TFBSs of the five NF-κB structural proteins, and was published in the online Gene Expression Omnibus (GEO) database (GSE55105). We extracted the results for the p50-p65 dimer for follow-up screening and found a total of 5112 sequences, with alignment to 15 SNPs.

Study subjects
Next, we conducted a case-control study to identify SNPs related to the NF-κB binding site associated with ESRD. In this study, we collected blood samples and social demographic data from patients admitted to the Tri-Service General Hospital in Taipei, Taiwan, between 2015 and 2016 and then performed real-time polymerase chain reaction (PCR) and genotyping.
We collected data from 847 hemodialysis (HD) patients (male, 50.8%; female, 49.2%; age, 71.84 ± 12.93 years) from the Tri-Service General Hospital in Taipei, Taiwan. CKD was defined according to the Kidney Disease Outcomes Quality Initiative definitions, and the estimated glomerular filtration rate (eGFR) was calculated using the Modification of Diet in Renal Disease study equation [30,31]. Study patients were defined as having an eGFR ≤ 15 mL/min/1.73 m² with clinical signs of uremic syndrome requiring HD. All patients were over 20 years old and had been on HD for more than 6 months. Patients were excluded if they had autoimmune disease, malignancy, or acute or chronic infection. Demographic data of the HD patients included age, sex, diabetes, HD duration, hypertension, education level, and blood biochemical values (white and red blood cells, hemoglobin, blood urea nitrogen [BUN], creatinine, albumin, proteinuria, blood sugar AC, triglycerides, cholesterol, sodium, potassium, calcium, phosphorus, and eGFR). The 846 healthy controls (male, 44.6%; female, 55.4%; age, 73.50 ± 7.21 years) had no history of renal disease, and their eGFR was ≥ 60 mL/min/1.73 m². The control group was drawn from individuals undergoing a physical examination at Tri-Service General Hospital. The healthy controls had no microalbuminuria, proteinuria, or hematuria and had normal abdominal/renal ultrasonography findings.
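For reference, a minimal sketch of the four-variable Modification of Diet in Renal Disease equation is given below. The text does not state which variant of the equation was applied, so the constant of 175 (used with standardized creatinine assays; the original equation used 186) and the omission of the ethnicity coefficient are assumptions, and the function name and input values are hypothetical.

```python
def mdrd_egfr(serum_creatinine_mg_dl, age_years, female, constant=175.0):
    """Estimated GFR in mL/min/1.73 m^2 (four-variable MDRD form).

    Assumption: no ethnicity coefficient is applied, and `constant` defaults
    to the value used with standardized creatinine measurements.
    """
    egfr = constant * serum_creatinine_mg_dl ** -1.154 * age_years ** -0.203
    if female:
        egfr *= 0.742
    return egfr


# Hypothetical inputs, checked against the thresholds used for cases and controls
case_like = mdrd_egfr(8.0, 70, female=False)
control_like = mdrd_egfr(0.9, 70, female=False)
print(case_like <= 15, control_like >= 60)
```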

Ethical statement
The study was reviewed and approved by the institutional ethics committee of the Tri-Service General Hospital (TSGH-1-104-05-006, TSGH-2-106-05-127). After full explanation of the study, written informed consent was obtained from all participants. All clinical and biological samples were collected, and DNA was genotyped following patient consent.

Genomic DNA extraction and genotyping
Genomic DNA was extracted from the blood samples in the laboratory by the phenol-chloroform method and stored at −20 °C for subsequent genotyping and experimental use. Genomic DNA was isolated from peripheral blood samples using standard procedures of proteinase K (Invitrogen, Carlsbad, CA, USA) digestion and phenol/chloroform extraction [32]; then, the samples were genotyped by iPLEX Gold SNP [33]. We assessed the genotyping experiment quality by intrareplication validation. The concordance rate of interreplication validation of 78 samples (approximately 5%) was 100%. Secondary genotyping was performed on 10 random blood samples with PCR according to a previously described protocol for intrareplication validation [34]. After genotyping replication was conducted twice, the concordance rate was 100% between the two genotyping methods.
Fig. 1 The candidate SNP screening process by bioinformatics for this study. First, we used a total of 58,917,994 SNPs (997 samples) in the Taiwan Biobank database to screen for Taiwanese-specific genetic variation. Then, through genetic alignment of GRCh37/hg19 from the NCBI, we found 271,063 potential NF-κB (p50-p65) TFBSs in the human genome based on the sequence of the above binding sites. We compared the 36,041,790 SNPs remaining from the first step with these TFBSs and found that there were 3,121,467 SNPs around the 271,063 potential TFBSs, of which 40,137 SNPs were within the TFBS of NF-κB. Finally, we validated the results of the second stage against the ChIP-Seq database to further confirm that NF-κB does bind at these positions. The 15 SNP variations may affect NF-κB binding activity.

Chromatin immunoprecipitation assay and qPCR
We included 9 ESRD patients, GG (n = 3), GT (n = 3), and TT (n = 3), for the ChIP assay. ChIP assays were conducted by using the ChIP Kit ab500 (Abcam, USA) and NF-κB antibody (Proteintech) according to the manufacturer's instructions. The immunoprecipitate was eluted with 100 μl DNA purifying slurry, and 2 μl of DNA was used in qPCR. Input DNA and NF-κB-enriched DNA fragments were amplified by using qPCR in a 7500 Fast Real-Time PCR System (Applied Biosystems) with primers 5′-ATTCTCACCATGGGAATGG-3′ and 5′-GAGGACAGCAAGGTAATAG-3′. The results are shown as percentage input.
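The percentage input calculation commonly used for ChIP-qPCR can be sketched as follows. The sketch assumes approximately 100% amplification efficiency and a known fraction of chromatin saved as input; the input fraction and the Ct values shown are hypothetical, because neither is reported in the text.

```python
import math


def percent_input(ct_ip, ct_input, input_fraction):
    """Percent of input chromatin recovered in the immunoprecipitate.

    ct_ip          : qPCR cycle threshold of the NF-kB-enriched sample
    ct_input       : cycle threshold of the input sample
    input_fraction : fraction of chromatin saved as input (e.g. 0.01 for 1%)
    Assumes the amount of product doubles every cycle.
    """
    # Shift the input Ct to what it would be for 100% of the chromatin.
    adjusted_input_ct = ct_input - math.log2(1.0 / input_fraction)
    return 100.0 * 2.0 ** (adjusted_input_ct - ct_ip)


# Hypothetical Ct values with a 1% input sample
print(round(percent_input(ct_ip=30.0, ct_input=25.0, input_fraction=0.01), 5))
```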

Transient transfection and luciferase assay
The reporter for the NF-κB binding site SNP rs9395890 (from 53820675 to 53821295, 620 bp) was amplified by polymerase chain reaction from an in-house genomic DNA library with the primer pair 5′-GGGGTACCGCATCTACGTTCTTAAATGGCC-3′ (5′ primer) and 5′-GGAAGATCTCCTACAGAACCATTACACTCTC-3′ (3′ primer) and subcloned into a pGL3 basal reporter (Promega, USA) cut at the KpnI and BglII sites. After sequence verification, we changed the T allele into the G allele using the QuikChange Lightning Site-Directed Mutagenesis Kit (Agilent Technologies). HEK293 cells were grown in Dulbecco's modified Eagle's medium supplemented with 10% charcoal/dextran-treated fetal bovine serum. The cells in each well (24-well plate) were transfected with a total of 1 μg DNA and jetPEI (PolyPlus-transfection, Illkirch, France) according to the manufacturer's protocol. Luciferase activity was assessed 24 h after transfection using the Promega Luciferase Assay kit and expressed as the mean relative light units (RLU) of two transfected sets. Results shown are representative of at least three independent experiments.

mRNA expression
We assessed the correlation between genetic variants and mRNA expression of the corresponding genes. Expression quantitative trait loci (eQTL) analysis was also performed using data from the GTEx portal database (https://www.gtexportal.org/home/) and the HapMap Project by a general linear regression model in an additive genetic model [35].
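An additive genetic model codes each genotype by the number of copies of the effect allele (0, 1, or 2) and regresses expression on that dosage. The sketch below fits this model by ordinary least squares; the genotype strings, the expression values, and the choice of T as the effect allele are illustrative assumptions and are not taken from the GTEx or HapMap data.

```python
def additive_eqtl_fit(genotypes, expression, effect_allele="T"):
    """Least-squares fit of expression on allele dosage (additive model).

    Returns (slope, intercept); the slope is the estimated change in
    expression per additional copy of the effect allele.
    """
    dosage = [g.count(effect_allele) for g in genotypes]
    n = len(dosage)
    mean_x = sum(dosage) / n
    mean_y = sum(expression) / n
    sxx = sum((x - mean_x) ** 2 for x in dosage)
    if sxx == 0:
        raise ValueError("all samples carry the same dosage")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosage, expression))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical genotypes and normalized expression values
slope, intercept = additive_eqtl_fit(
    ["GG", "GT", "TT", "GT", "GG", "TT"],
    [0.9, 1.4, 2.1, 1.6, 1.0, 1.9],
)
print(slope > 0)
```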

Statistical analysis
Statistical analysis was performed with R software, version 3.3.1 (R Project for Statistical Computing, Vienna, Austria). Demographic and clinical data between the groups were compared with Student's t-test, and the results for continuous variables were given as the mean ± SD. The allele and genotype frequencies between the different groups were compared with the χ² test when appropriate. The results of ChIP assay qPCR cycles were compared with ANOVA. The association of each genetic polymorphism with ESRD risk was evaluated using dominant and recessive models. The odds ratios (ORs) and corresponding 95% confidence intervals (CIs) for assessing the effect of the genotype distribution, allele frequencies and binding site activity on ESRD were calculated by logistic regression analysis with adjustment for relevant significant variables. Statistical significance was defined at the 95% level (P < 0.05).
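As a simplified illustration of how an OR and its 95% CI are obtained, the sketch below computes an unadjusted OR with a Wald interval from a 2 × 2 table. This is not the adjusted logistic regression used in the study, and the counts are hypothetical.

```python
import math


def odds_ratio_wald_ci(exposed_cases, unexposed_cases,
                       exposed_controls, unexposed_controls, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    "Exposed" means carrying the risk genotype under the chosen model
    (for example TT under a recessive model). z = 1.96 gives a 95% interval.
    """
    cells = (exposed_cases, unexposed_cases, exposed_controls, unexposed_controls)
    if min(cells) <= 0:
        raise ValueError("all four cells must be positive")
    a, b, c, d = cells
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, not data from this study
print(odds_ratio_wald_ci(20, 10, 10, 20))
```

An interval that excludes 1 corresponds to P < 0.05 for the unadjusted association.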

Screening of genes
Next-generation sequencing
We screened for genetic variations in 997 samples from the NGS database in the Taiwan Biobank and found a total of 58,917,994 genetic variants in Taiwanese genomes, of which 11,423,191 were structural variants (insertion/deletion, deletion). These could not be analyzed with the multifunctional mass spectrometer (mass array), and thus we kept only the remaining variants for further study. Therefore, a total of 47,494,803 SNPs were analyzed in detail. Following a quality control program that involved deleting variants with a call rate of less than 90% at the position, 36,041,790 SNPs remained; we then performed sequence alignment analysis.

National Center for Biotechnology Information
We downloaded the human reference gene sequence of GRCh37/hg19 from the National Center for Biotechnology Information (NCBI) and, in combination with the human biological database in Taiwan, found 271,063 potential NF-κB (p50-p65) binding sites in the human genome based on the sequence of the above binding sites. We compared the 36,041,790 SNPs remaining from the first step with these TFBSs and found that 3,121,467 SNPs were near the 271,063 potential TFBSs, of which 40,137 SNPs were within the TFBS of NF-κB. A mutation at such a site will likely result in NF-κB (p50-p65) being unable to bind. Finally, a total of 5766 SNPs with a minor allele frequency > 5% were selected for further follow-up by ChIP-Seq analysis because of the limited number of samples subject to subsequent analysis in this study [36].

Gene Expression Omnibus
In the GEO database, there were 5112 positions in the TFBS associated with the p50-p65 dimer. After validating these results against the results of the second stage, 15 SNPs remained; they are shown in Table 1. For each SNP, the nearby DNA sequence is shown with the SNP position at the center ±9 base pairs, and the expected NF-κB TFBS is indicated in bold. The 15 SNP variations may affect NF-κB binding activity. Finally, we used the 15 SNPs obtained from the bioinformatics technology results and the ChIP-Seq database to confirm the relationship with ESRD in this study.

Demographic characteristics
The characteristics of the 847 ESRD and 846 control group subjects are presented in Table 2. The causes of ESRD were diabetes mellitus (DM) in 215 patients (25%), hypertensive nephropathy in 164 (19%), systemic nephropathy in 252 (29%), and other and unknown causes in 136 (16%). There was no significant difference in body mass index. Significant differences in sex, age, DM, hypertension, BUN, serum creatinine, GFR, blood sugar AC, total cholesterol, and triglycerides were observed between patients with ESRD and controls (P < 0.001).

Association analyses of NF-κB binding site gene polymorphisms with susceptibility to ESRD
In the gene screening process, the call rate of all 15 SNPs was > 90%, and the genotypes of these SNPs were in Hardy-Weinberg equilibrium (P > 0.05). When we calculated our sample size, the power was > 50% and the OR was set at 1.5 to detect the real effects of expected NF-κB binding site SNPs. Two SNPs were nonfrequency SNPs under the allele model (rs2851583, rs76552560), and a suitable primer could not be found for three SNPs (rs11234413, rs3826454, rs67087171). Finally, genotyping results were obtained for 10 SNPs. Our results showed that SNP rs9395890 had a significant association with ESRD risk according to genotype (P = 0.041; Table 3).
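A Hardy-Weinberg equilibrium check of the kind reported above can be sketched with a χ² goodness-of-fit test with one degree of freedom. The text does not state which test was used, so the choice of test is an assumption, and the genotype counts are hypothetical.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) of genotype counts against Hardy-Weinberg
    proportions. Returns (chi2, p_value)."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of the chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical genotype counts (GG, GT, TT)
chi2, p_value = hwe_chi_square(25, 50, 25)
print(chi2, p_value)
```

In this sketch a SNP would be treated as consistent with equilibrium when the returned P value exceeds 0.05.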

Allele frequencies for the NF-κB binding site gene polymorphisms with susceptibility to ESRD
The allele frequency of rs9395890 was significantly associated with ESRD (P = 0.049; Table 4). There were no significant differences in genotype or allele frequencies in the other nine SNPs between patients with ESRD and controls (Additional file 1).

Discussion
Our results suggest that there is a significant association between rs9395890 and ESRD risk. This genetic association study combined bioinformatics technology with epidemiological approaches, which makes it different from other studies. Previous genetic and molecular epidemiological studies of ESRD were mostly genome-wide association studies (GWAS). GWAS can explore the etiological contribution of genetic variants throughout the whole genome without a prior hypothesis; however, very few causal variants have been detected [37]. Therefore, we used a hybrid method consisting of candidate gene and epidemiologic approaches. SNPs associated with the inflammatory transcription factor NF-κB have been investigated in a few previous studies [38][39][40][41]. However, we addressed the importance of genetic polymorphisms in determining ESRD in this study. We were able to identify loci and information about which genes were associated with complex diseases [42,43]. In our study, we used a methodological approach that combined the NGS, NCBI, and GEO online databases to find target SNPs and used epidemiological methods to confirm the findings in a case-control study. A previous study in 2014 also used publicly available genomic data and bioinformatics platforms to provide additional evidence for the TFBSs of SNPs of the ERα-regulating sequence at 21q22.3, which are important in determining breast cancer progression [43].
Fig. 2 Confirmation of NF-κB binding ability at rs9395890 by ChIP assay. Representative data show the enrichment of NF-κB for three genotypes, GG (n = 3), GT (n = 3), and TT (n = 3), at rs9395890 by chromatin immunoprecipitation (ChIP) assay. Real-time qPCR was performed to measure the amount of fragments with or without NF-κB enrichment. The ChIP assay revealed an approximately 1.49-fold enrichment of NF-κB for the variant type TT when compared to that of the wild type GG at rs9395890. The SNP at the NF-κB transcription binding site rs9395890 has a higher binding ability in the TT type than in the GG type. The results are shown in % input (ChIP/input). The mean ± SEM is given for each construct from three experiments (P = 0.027; TT = 3.20 ± 0.16, GT = 2.81 ± 0.20, GG = 1.71 ± 0.18) (Table 5).
Immune and inflammatory factors have important roles in the pathogenesis of kidney diseases [44,45]. Based on previous studies, the transcription factor NF-κB regulates the expression of various genes that have an important role in the regulation of immunity and inflammation in disease [10]. NF-κB regulates T cells, particularly the T helper 17 cells, which mainly affect the pathogenesis of autoimmunity and inflammation [46]. Several studies have shown the cell-intrinsic role of NF-κB in T cell generation [47,48]. In the NF-κB pathway, when cells are unstimulated, NF-κB is bound to IκBα and IκBβ in the cytoplasm, which prevents NF-κB from entering the nucleus [49]. When these cells are stimulated, specific kinases phosphorylate IκB, allowing its degradation by proteasomes [50,51]. The NF-κB released from IκB passes into the nucleus and binds to target sequences in the promoter regions of target genes, leading to the expression of many genes involved in immune and inflammatory responses [52].
The NF-κB signaling pathway regulates renal inflammation and the progression of ESRD. Histological evidence of NF-κB activation has been associated with human renal disease with diabetes, glomerular disease, and acute kidney injury [53]. NF-κB drives the transcription of multiple proinflammatory molecules, such as cytokines, chemokines, allograft antigens, adhesion molecules, and reactive oxygen, in response to renal injury [54]. The SNPs at NF-κB transcription binding sites are functional polymorphisms that may act as regulatory polymorphisms in the noncoding regions of genes and may affect the protein product through transcriptional alterations [55].
It was previously known that the NF-κB transcription factor binding site is involved in the regulation of downstream inflammatory genes, which in turn affects the progression of disease and the deterioration of inflammation. In this study, we found an association between ESRD risk and the NF-κB binding site SNP rs9395890. Furthermore, we used a ChIP assay to identify NF-κB binding activity with different genotypes and found that the NF-κB binding activity at SNP rs9395890 was higher with the TT type than with the GG type. We also assessed the functional effect of the rs9395890 T/G polymorphism at the NF-κB binding site on transcriptional activity by luciferase reporter assay. The results (Fig. 3) showed that the transcriptional activity of the T allele was higher than that of the G allele, but the difference in relative light units (RLU) between the T and G alleles was not significant. So far, no study has examined rs9395890 and MLIP-IT1. However, the distance between rs9395890 and MLIP-IT1 may be too great.
Furthermore, the results from the GTEx portal demonstrated that the T allele of rs9395890 was significantly associated with increased expression levels of MLIP-IT1 in multiple tissues, suggesting that rs9395890 may modulate the risk of ESRD, possibly through a mechanism of modulating gene expression [35].
SNP rs9395890 is an intron variant located at chromosome 6: 53820994, 42,694 bp upstream of the MLIP-IT1 gene. MLIP-IT1 is a noncoding RNA gene and is a responding gene of rs9395890. Noncoding RNA is not translated into protein but can affect transcription factor binding and the expression of downstream genes. We suspect that a mutation at this site affects the function of MLIP-IT1, which increases the risk of ESRD. To our knowledge, few studies have reported on MLIP-IT1 and rs9395890 [56]. DNA is transcribed to mRNA by transcription factors, which then initiate their function. Noncoding RNA arises during the transcription of DNA to RNA, when a portion of the RNA cannot become mRNA. Noncoding RNA regulates gene transcription function and protein transport. More studies have focused on noncoding RNA and its association with chromatin remodeling, gene transcription, protein transport, and trafficking. Noncoding RNA also has important roles in most human diseases, including coronary artery diseases, autoimmune diseases, neurological disorders, and various cancers [37,43,57]. Specifically, we found that the rs9395890 T allele was associated with the risk of ESRD. According to data from the GTEx portal, the mRNA expression levels with the T allele were higher than those with the G allele in thyroid, skin and mucosa inflammation disease. These results are consistent with our ChIP assay data (TT binding activity higher than GG; Fig. 2) [35].
Fig. 3 The effect of the T/G allele within the NF-κB site on MLIP-IT1 reporter activity. a HEK293 cells were transiently transfected with the indicated amount of pGL3.MLIP-IT1-LUC containing the T or G allele. b Plot of the T/G ratio. The dotted line represents the baseline of onefold. c Luciferase activity with or without overexpression of p65 for the T and G alleles, respectively. These data are the averages of three experiments (mean ± S.D.; n = 3)
(b)(c) The gene expression box plot by QTL analysis using HapMap data for MLIP-IT1 rs9395890 with thyroid and skin tissue [35]
However, we did not confirm that the NF-κB transcription binding site SNP rs9395890 and the responding gene MLIP-IT1 regulated the mechanism of ESRD risk. Therefore, an experiment to identify the association between rs9395890 MLIP-IT1 RNA expression and ESRD risk is necessary in the future.
Our study has some limitations. First, to our knowledge, no studies have related SNPs of NF-κB transcription binding sites to disease. Our study used bioinformatics technology, that is, the NGS, NCBI, and GEO online databases, to screen transcription binding site genetics. Furthermore, we used our case-control groups for genotyping to confirm that rs9395890 was associated with ESRD. We used GEO database-involved B cells for ChIP-Seq analysis, but it was difficult to obtain renal cells to repeat verification. Second, the odds ratio of rs9395890 was very low, but this is a limitation of an observational study. Third, our study sample size was not large enough. Bonferroni correction could not be performed. However, only one SNP was significantly correlated in our study, and the results of functional analysis were indeed related to ESRD risk. Our study included both a genetic association test and a functional analysis, and the results were consistent (p < 0.05 in both tests). Because of the double statistical test setting, we consider that the type 1 error rate in our setting is less than that in general genetic association studies using Bonferroni correction, and thus the evidence level provided by our study is sufficient even though we cannot conduct Bonferroni correction. In summary, we conclude that SNP rs9395890 plays a key role in the incidence of ESRD.

Conclusion
Our study demonstrated that SNP rs9395890 might contribute to NF-κB transcription binding site ability and might exert an effect on MLIP-IT1 activity. The function of MLIP-IT1 with regard to ESRD progression risk and survival should be explored further.