Abstract
Introduction
Thyroid hormones have systemic effects on the human body and play a key role in the development and function of virtually all tissues. They are regulated via the hypothalamic–pituitary–thyroid (HPT) axis and have a heritable component. Using genetic information, we applied tissue-specific transcriptome-wide association studies (TWAS) and plasma proteome-wide association studies (PWAS) to elucidate gene products related to thyrotropin (TSH) and free thyroxine (FT4) levels.
Results
TWAS identified 297 and 113 transcripts associated with TSH and FT4 levels, respectively (25 shared), including transcripts not identified by genome-wide association studies (GWAS) of these traits, demonstrating the increased power of this approach. Testing for genetic colocalization revealed a shared genetic basis of 158 transcripts with TSH and 45 transcripts with FT4, including independent, TSH-associated genetic signals within the CAPZB locus that were differentially associated with CAPZB expression in different tissues. PWAS identified 18 and 10 proteins associated with TSH and FT4, respectively (HEXIM1 and QSOX2 with both). Among these, the cognate genes of five TSH- and seven FT4-associated proteins mapped outside significant GWAS loci. Colocalization was observed for five plasma proteins each with TSH and FT4. Ten TSH-related genes and one FT4-related gene were significant in both TWAS and PWAS. Of these, ANXA5 expression and plasma annexin A5 levels were inversely associated with TSH (PWAS: P = 1.18 × 10−13, TWAS: P = 7.61 × 10−12 (whole blood), P = 6.40 × 10−13 (hypothalamus), P = 1.57 × 10−15 (pituitary), P = 4.27 × 10−15 (thyroid)), supported by colocalizations.
Conclusion
Our analyses revealed new thyroid function-associated genes and prioritized candidates in known GWAS loci, contributing to a better understanding of transcriptional regulation and protein levels relevant to thyroid function.
Introduction
Thyroid hormones (TH) affect cellular metabolism and thereby have vital functions for growth and metabolic homeostasis (1). Thyroid function is governed by the hypothalamic–pituitary–thyroid (HPT) axis, where the thyrotropin-releasing hormone from the hypothalamus stimulates the release of thyrotropin (TSH) from the anterior pituitary gland, which in turn results in the production of the main hormones of the thyroid gland, thyroxine (T4) and triiodothyronine (T3). Feedback loops at different levels of the HPT axis maintain stable thyroid function. Many of the important genes involved in the HPT axis have been uncovered by the study of monogenic disorders of thyroid function (2, 3). However, important new knowledge remains to be uncovered, as evidenced by recent GWAS of TSH and free T4 (FT4) levels that identified both a new TH transport protein encoded by SLC17A4, as well as a previously unknown TH-metabolizing enzyme encoded by AADAT (4). Hypothesis-generating screens based on genetic information are therefore suitable to identify and prioritize previously unknown players in thyroid function. A dedicated analysis of all known TH-regulating genes (5) moreover revealed that variants in these genes detected by GWAS only accounted for a small percentage of variation in thyroid function measures. This suggests the presence of undiscovered TH-regulating genes within the HPT axis. A recent large-scale GWAS meta-analysis investigated the genetics of thyroid function parameters such as TSH, FT4, and T3-related traits (6). The current study builds on this GWAS and investigates the two most commonly used thyroid function markers, TSH and FT4, which were also the focus of previous GWAS on thyroid function (4).
As a genome-wide screen, GWAS needs to be stringently corrected for multiple testing, which can cause true association signals to be missed. Moreover, significant GWAS loci often contain several genes, complicating the prioritization of the causal gene. Lastly, GWAS do not provide information about tissue-specific effects, although most common genetic variants implicated by GWAS are intronic or intergenic and may differentially affect gene expression. Transcriptome-wide association studies (TWAS) can address these three challenges by studying the effect of genetically predicted, tissue-specific differential gene expression on a disease or trait of interest (7, 8). TWAS combines GWAS summary statistics for a specific phenotype with weight matrices computed from SNP-mRNA associations (expression quantitative trait loci, eQTL) in the relevant tissues. Here, the eQTL data come from GTEx v8, which is based on postmortem donor tissue acquired through rapid autopsy and organ procurement organizations (9, 10). These matrices quantify the relationship between genetic variants and gene expression levels. TWAS thereby only needs to correct for the number of evaluated transcripts across tissues and directly implicates the investigated gene product as a molecular trait that links genetic variation to the disease or trait of interest. Proteome-wide association studies (PWAS) represent an analogous approach by studying the effect of genetically predicted protein levels, currently focused on the plasma proteome (11). In addition to the advantages of TWAS, PWAS can identify trait correlations with plasma protein levels that are mediated through mechanisms other than their differential gene expression, such as the trans effects of TSH and FT4 on the circulating proteome. This analysis can give us insights into the HPT axis, and to the best of our knowledge, such analyses have not been applied to investigate other endocrine axes.
The aims of this study were, therefore, to use TWAS and PWAS to uncover additional thyroid function-related genes previously unreported in GWAS, to prioritize potentially causal genes in TSH- and FT4-associated GWAS loci, to contrast different tissues of the HPT axis, and to characterize the identified associations through a series of downstream analyses. Here, we unravel new thyroid function-related genes, reflecting the increased power of TWAS and PWAS analyses, while also providing valuable insights into shared genetic signals underlying transcript and protein levels and thyroid function levels. The resulting prioritization of genes in GWAS loci contributes to a better understanding of the genetic architecture underlying thyroid hormone regulation and implicates numerous previously unreported relationships, such as those for ANXA5 transcript and protein levels.
Methods
Genome-wide association studies (GWAS)
GWAS of inverse normal transformed values of free T4 levels (n = 119,120 from 37 cohorts) revealed 85 independent significant SNPs (P-value < 5 × 10−8), and GWAS of inverse normal transformed TSH levels (n = 271,040 from 46 cohorts) revealed 259 independent significant SNPs. All study participants were of European ancestry. Individual study files were filtered by minor allele frequency (MAF) > 0.005 and by imputation quality > 0.4 at the marker level. For the meta-analysis, SNPs with MAF < 0.01 or SNPs present in <75% of the sample size were excluded. GWAS were performed using genome assembly GRCh37 (hg19). Variant identifiers (rsIDs) were annotated and genomic positions were converted to genome assembly GRCh38 (hg38) using the liftOver function from the rtracklayer R package (12).
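The two filtering stages described above can be illustrated with a minimal Python sketch. The function and parameter names are hypothetical and chosen for illustration only; the thresholds are those stated in the text, and the actual quality control pipeline of the contributing cohorts is not reproduced here.

```python
def passes_study_filter(maf, imputation_quality, min_maf=0.005, min_quality=0.4):
    """Study-level filter: keep markers with MAF > 0.005 and imputation quality > 0.4."""
    return maf > min_maf and imputation_quality > min_quality


def passes_meta_filter(maf, n_variant, n_total, min_maf=0.01, min_fraction=0.75):
    """Meta-analysis filter: exclude SNPs with MAF < 0.01 or present in < 75% of the sample size."""
    return maf >= min_maf and n_variant >= min_fraction * n_total


# Illustrative values only (not taken from the study data):
# a SNP with MAF 0.02 observed in 210,000 of 271,040 TSH samples is retained.
print(passes_meta_filter(maf=0.02, n_variant=210_000, n_total=271_040))
```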
Transcriptome-wide association studies (TWAS)
TWAS were performed following the FUSION workflow (13) based on weights from GTEx v8 tissues (European ancestry) considered relevant for thyroid function (brain hypothalamus, pituitary, thyroid, and whole blood). Gene expression weights and linkage disequilibrium (LD) reference data for persons of European (EUR) ancestry from 1000 Genomes Project (1000G) were downloaded from http://gusevlab.org/projects/fusion/. Prediction models were based on elastic net modeling (13). Multiple testing was accounted for by a Bonferroni adjustment for the number of unique transcripts modeled across traits and tissues (P-value < 1.74 × 10−6 = 0.05/28,768 transcripts). FT3 TWAS was performed for a selected candidate gene, ANXA5. Overlap with significant GWAS loci was determined using GWAS variants with P-value < 5 × 10−8 in a 250 kb window around the TWAS gene.
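For orientation, the summary-statistics association test underlying FUSION-style TWAS can be written as z_TWAS = w'z / sqrt(w'Σw), where w are the expression weights of a gene, z are the GWAS z-scores of the same SNPs, and Σ is their LD (correlation) matrix. The Python sketch below implements only this formula with plain lists; the function names are hypothetical, the FUSION software itself was not re-implemented, and steps such as allele harmonization and imputation of missing z-scores are omitted.

```python
import math


def twas_z(weights, gwas_z, ld):
    """Weighted burden statistic: w'z divided by sqrt(w' * LD * w)."""
    n = len(weights)
    numerator = sum(w * z for w, z in zip(weights, gwas_z))
    variance = sum(weights[i] * ld[i][j] * weights[j]
                   for i in range(n) for j in range(n))
    if variance <= 0:
        raise ValueError("predicted expression has zero variance")
    return numerator / math.sqrt(variance)


def two_sided_p(z):
    """Two-sided P-value of a standard normal z-score."""
    return math.erfc(abs(z) / math.sqrt(2))


# Toy example with two SNPs in moderate LD (values are illustrative only).
z = twas_z(weights=[1.0, 1.0], gwas_z=[2.0, 2.0], ld=[[1.0, 0.5], [0.5, 1.0]])
print(z, two_sided_p(z))
```

The resulting P-value would then be compared against the Bonferroni threshold stated above.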
Proteome-wide association studies (PWAS)
PWAS were performed based on an adapted version of the FUSION workflow (13) using plasma protein weights from the European ancestry (EA) subpopulation of the Atherosclerosis Risk in Communities Study (n = 7213) (ARIC) (11), where plasma protein levels had been measured using the aptamer-based Somascan technology (V.4.1 platform). Protein weights and LD reference files were downloaded from http://nilanjanchatterjeelab.org/pwas/. Prediction models were based on elastic net modeling (13). Multiple testing was accounted for by a Bonferroni adjustment for the number of unique aptamers modeled across traits (P-value < 3.78 × 10−5 = 0.05 / 1322 aptamers). Overlap with significant GWAS loci was determined using GWAS variants with P-value <5 × 10−8 in a 250 kb window around the PWAS gene.
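The Bonferroni thresholds quoted for TWAS and PWAS follow directly from dividing the significance level by the number of modeled transcripts or aptamers. The short Python sketch below reproduces this arithmetic; the helper name is hypothetical.

```python
def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


twas_threshold = bonferroni_threshold(28_768)  # unique transcripts across traits and tissues
pwas_threshold = bonferroni_threshold(1_322)   # unique aptamers across traits

print(f"TWAS: P < {twas_threshold:.2e}")
print(f"PWAS: P < {pwas_threshold:.2e}")
```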
Conditionally independent colocalization analysis
A colocalization analysis of independent signals was performed for each significant association. Colocalization of TWAS findings was performed using publicly available cis-eQTL summary statistics from GTEx v8 (EUR). For PWAS, we used cis-protein quantitative trait loci (pQTL) summary statistics from EA participants of the ARIC study (11). As LD reference data, we used a subset of 15,000 genomically British participants from the UK Biobank (application number 20272); genomic positions were converted to genome assembly GRCh38 (hg38) using the liftOver function from the rtracklayer R package (12). Plasma proteins and gene expression summary statistics were extracted for every gene significant by TWAS or PWAS with a 250 kb flanking region. First, independent association signals within these regions were identified based on approximate conditional analyses via the GCTA COJO-Slct algorithm (14), with default parameters (P-value < 5 × 10−8 and a collinearity of 0.9). For each conditionally independent SNP, conditional summary statistics were computed by conditioning on all other independent SNPs in the gene region using the GCTA COJO-Cond algorithm with default parameters (collinearity of 0.9) (14). Subsequently, approximate Bayes factors-based colocalization analyses were conducted for all pairwise combinations of independent eQTL/pQTL associations against the independent thyroid function GWAS associations (15) using an adapted version of the Giambartolomei colocalization method as implemented in the ‘coloc.fast’ function (https://github.com/tobyjohnson/gtx), with default parameters and prior definitions (16). Positive colocalizations were reported if the posterior probability of a shared causal variant (PPH4/p12) was ≥0.8.
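To make the final colocalization step more concrete, the following Python sketch computes the posterior probabilities of the five hypotheses of the Giambartolomei method from per-SNP approximate Bayes factors. It is an illustration under stated assumptions, not the 'coloc.fast' implementation: the priors (p1 = p2 = 1e-4, p12 = 1e-5) and the prior standard deviation of 0.15 are the conventional defaults of the coloc R package and are assumed here, the function names are hypothetical, and the preceding conditional analyses (GCTA COJO) are not reproduced.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def wakefield_log_abf(beta, se, prior_sd=0.15):
    """Wakefield's log approximate Bayes factor for one SNP (prior_sd is an assumption)."""
    z = beta / se
    r = prior_sd ** 2 / (prior_sd ** 2 + se ** 2)
    return 0.5 * (math.log(1.0 - r) + r * z * z)


def coloc_posteriors(labf1, labf2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities of H0..H4 for two traits measured on the same SNPs."""
    n = len(labf1)
    if n != len(labf2) or n < 2:
        raise ValueError("need the same SNPs (at least two) for both traits")
    joint = [a + b for a, b in zip(labf1, labf2)]
    # H3 sums over all pairs of distinct causal SNPs; this O(n^2) loop keeps the sketch simple.
    pairs = [labf1[i] + labf2[j] for i in range(n) for j in range(n) if i != j]
    log_h = {
        "H0": 0.0,                                               # no association
        "H1": math.log(p1) + log_sum_exp(labf1),                 # trait 1 only
        "H2": math.log(p2) + log_sum_exp(labf2),                 # trait 2 only
        "H3": math.log(p1) + math.log(p2) + log_sum_exp(pairs),  # distinct causal variants
        "H4": math.log(p12) + log_sum_exp(joint),                # shared causal variant
    }
    total = log_sum_exp(list(log_h.values()))
    return {h: math.exp(v - total) for h, v in log_h.items()}


# Toy region of five SNPs in which both traits peak at the third SNP.
eqtl = [wakefield_log_abf(b, 0.01) for b in [0.0, 0.0, 0.08, 0.0, 0.0]]
gwas = [wakefield_log_abf(b, 0.01) for b in [0.0, 0.0, 0.07, 0.0, 0.0]]
print(coloc_posteriors(eqtl, gwas)["H4"] >= 0.8)
```

In this toy example, a positive colocalization would be reported because the posterior probability of H4 reaches the 0.8 cut-off used in the text.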
GO, KEGG, and tissue enrichment analyses
Enrichment testing of the significant genes identified by TWAS for TSH (297 genes) and for FT4 (113 genes) was performed in different resources using gene ontology (GO) terms (17), Kyoto encyclopedia of genes and genomes (KEGG) pathways (18), Human protein atlas (HPA) tissues (19), and GTEx v8 tissues (9), with highly expressed genes in each tissue defined as the top 10% most highly expressed genes. For more details on the processing of the tissue resources, see ref. 20. For all enrichment analyses, we used hypergeometric tests implemented in the R package clusterProfiler version 4.0.5 (21), where we selected the overlap of the 15,422 genes encoding for transcripts analyzed during TWAS with the genes available in the respective resource as background genes. When testing for enrichment in GO terms and KEGG pathways, after overlapping with the available genes in the respective resource, the gene set originating from TSH contained 186 and 95 genes, the gene set originating from FT4 contained 76 and 38 genes, and the number of background genes was reduced to 9629 and 4339, respectively. When performing tissue-specific enrichment analyses, the gene set assigned to a tissue was filtered for genes encoding transcripts analyzed during TWAS. All results were filtered for terms with at least two genes. P-values were corrected for multiple testing using the Benjamini–Hochberg procedure (22) separately in each of the different resources.
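The statistical core of such an overrepresentation analysis can be sketched in a few lines of Python: a one-sided hypergeometric test per term, followed by Benjamini–Hochberg adjustment within a resource. This is an illustrative re-statement of the two procedures, not the clusterProfiler implementation, and the function names are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """P(X >= k): k of the n query genes fall in a term with K genes, given N background genes."""
    upper = min(K, n)
    numerator = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return numerator / comb(N, n)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest P-value to enforce monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Illustrative numbers only: 5 of 186 query genes annotated to a term of 40 genes,
# with the 9629 background genes reported for the GO analysis of TSH.
print(hypergeom_enrichment_p(k=5, n=186, K=40, N=9629))
```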
Investigation of significant genes for associations with human traits and diseases through PheWAS
We performed queries of phenome-wide association results using the AstraZeneca PheWAS Portal. These results are based on the UK Biobank whole-exome sequencing data (469,809 UK Biobank exomes) and contain associations for phenotypes derived from electronic health records, questionnaire data, and continuous traits released by UK Biobank (23). We limited our queries to gene-level associations to directly link them with genes detected in our TWAS (385 genes) and PWAS (28 genes). Gene-level analyses were based on collapsing analyses, which aggregated rare, putatively deleterious, qualifying variants in each gene with given criteria, including ten dominant and one recessive model, and testing them against a given phenotype. P-values for binary traits associations were determined by Fisher's exact two-sided test, and P-values for quantitative traits were determined by linear regression corrected for age, sex, and their interaction (age × sex). We considered all available binary endpoints (n = 10,088) and quantitative endpoints (n = 1927). The total number of unique phenotypes was n = 14,956. Therefore, we defined the following significance thresholds for our queries: P-value < 1.19 × 10−7 (0.05 / (28 genes × 14,956 phenotypes)) for genes implicated by PWAS, and P-value < 8.68 × 10−9 (0.05 / (385 genes × 14,956 phenotypes)) for those implicated by TWAS. The queries were further filtered to retain binary traits with more than 100 cases and controls and continuous traits with more than 30 observations. Additionally, the number of cases or controls with qualifying variants in binary phenotypes and the number of participants with qualifying variants in quantitative phenotypes had to exceed 3.
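For binary traits, the collapsing analysis reduces to a 2 × 2 table of carriers and non-carriers of qualifying variants among cases and controls. The Python sketch below shows a two-sided Fisher's exact test on such a table using exact integer arithmetic. It is a minimal illustration with a hypothetical function name, not the portal's implementation, and it is only practical for small counts; biobank-scale tables would call for a log-space implementation.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    lo, hi = max(0, col1 - row2), min(row1, col1)

    def weight(x):
        # Unnormalized hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x)

    observed = weight(a)
    extreme = sum(weight(x) for x in range(lo, hi + 1) if weight(x) <= observed)
    return extreme / comb(row1 + row2, col1)


# Toy table: carriers vs non-carriers among cases (first row) and controls (second row).
p = fisher_exact_two_sided(3, 1, 1, 3)
print(p)
```

A gene-phenotype pair would be reported only if its P-value fell below the study-wide thresholds given above.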
Results
TWAS reveal expressed genes related to thyroid hormone metabolism
We performed an HPT axis tissue-specific TWAS based on gene expression models derived from GTEx project v8 data (Supplementary Table 1, see section on supplementary materials given at the end of this article) (9, 13) and GWAS summary statistics of TSH and FT4 derived from up to 271,040 individuals of European ancestry across 46 predominantly population-based cohorts from the ThyroidOmics Consortium (Fig. 1). Each transcript was predicted using tissue-specific gene expression information, which allowed for the identification of genetic variants associated with gene expression levels in the relevant tissue (i.e. hypothalamus, pituitary, thyroid, and whole blood). Across tissues, TWAS yielded 297 and 113 transcripts significantly associated with TSH and FT4, respectively (Fig. 2). A total of 25 transcripts were significantly associated with both hormones. Of these, 20 transcripts exhibited inverse associations, meaning that their correlation with TSH was opposite to that with FT4 (Supplementary Fig. 1A and Supplementary Table 2). The thyroid gland yielded the highest number of significant findings across all studied tissues, accounting for 76.8% of significant findings from TWAS of TSH and 64.6% from TWAS of FT4 (Fig. 2, Supplementary Fig. 1B, C and Supplementary Table 2).
From the 297 unique transcripts associated with TSH in one or more tissues, 24 were not encoded by a gene mapping into a significant GWAS locus. Likewise, the genes encoding 14 out of the 113 significant FT4-associated transcripts did not map into a significant GWAS locus (Methods, Supplementary Table 2). Thus, TWAS enabled a gain in power to detect novel thyroid function-associated signals due to the reduced multiple testing burden of gene-based analysis as opposed to variant-based analysis.
To assess whether the implicated transcripts represented a molecular link between genetic variation and TSH or FT4 levels or independent associations in the same genetic region, we performed conditional genetic colocalization analyses. Strong evidence for colocalization, i.e. the posterior probability of a shared underlying causal variant (PPH4 > 0.8; Methods), was identified between thyroid function (GWAS) and gene expression for 158 (TSH) and 45 (FT4) transcripts in at least one of the studied tissues (Fig. 2, Supplementary Figs 2, 3, 4, 5, 6, 7, 8, 9 and Supplementary Table 3). These colocalization results are consistent with regions in which regulatory variants affect both gene expression and thyroid function, thereby enabling the prioritization of potentially causal genes in the GWAS loci.
Reassuringly, significant findings from TWAS of TSH and FT4 contained multiple positive controls, i.e., transcripts encoded by genes known to have important roles in thyroid hormone regulation. For example, TPO, encoding the TH-synthesizing enzyme thyroid peroxidase, and PDE8B, encoding phosphodiesterase 8B, which catalyzes the hydrolysis of cAMP, a key second messenger in TSH signaling, were associated with both TSH and FT4. Testing for colocalization confirmed a shared genetic signal underlying both higher thyroid TPO and PDE8B expression and higher levels of TSH, as well as lower levels of FT4 (Supplementary Fig. 10). In line with known physiology, the colocalization of TSH and FT4 GWAS signals with gene expression was present in thyroid tissue, but not in the hypothalamus, pituitary, or whole blood. Along the same lines, colocalization of TSH with transcript levels of TSHR, the gene encoding the TSH receptor, was present solely in thyroid tissue. This finding is in line with the fact that the TSH receptor is predominantly expressed in the thyroid (9) and exerts its main function there. The TWAS of FT4 did not show a significant association for TSHR (Supplementary Fig. 10 and Supplementary Table 3).
Tissue-specific differential CAPZB expression links independent genetic variants to altered TSH levels
We observed a significant association for CAPZB in both thyroid-specific (P-value = 2.02 × 10−215) as well as pituitary-specific (P-value = 1.57 × 10−15) TWAS of TSH (Fig. 3). CAPZB expression models were unavailable for hypothalamus- and whole blood-specific TWAS due to the lack of significant heritability of the eQTL in these datasets. Nevertheless, colocalization analyses between GWAS and gene expression could be conducted in all HPT tissues. These analyses revealed colocalization between genetically higher TSH levels and higher levels of CAPZB expression in both thyroid (PPH4 = 0.99) and hypothalamus (PPH4 = 0.89). A moderately strong colocalization signal (PPH4 = 0.75) between genetically higher TSH levels and higher levels of CAPZB expression in the pituitary was also identified. Interestingly, independent TSH-associated genetic variants seemed to differentially modulate CAPZB expression in a tissue-dependent manner, as shown in Fig. 3A: whereas a TSH-associated genetic signal upstream of CAPZB also underlies differential CAPZB expression in the thyroid, another independent genetic signal centered on the gene body underlies differential CAPZB expression in hypothalamus.
To further substantiate the regulatory potential of these variants, we analyzed functional genomics data from thyroid tissue generated by the ENCODE Project Consortium and found that rs10799824, the implicated variant upstream of CAPZB, overlapped with open chromatin (ATAC-seq). Notably, this position was not accessible in other tissues from the same tissue donor (ENCDO451RUA) (Fig. 3B), supporting tissue-specific regulatory function. These findings were confirmed in thyroid tissue of a different donor (ENCDO793LXB) (Methods, Supplementary Fig. 11A). We also queried PheWeb, which provides phenome-wide association study (PheWAS) results for 1400 broad EHR-derived codes across 57 million TOPMed-imputed variants in 400,000 white British individuals in the UK Biobank (24), and found that this variant is also associated with endocrine and metabolic traits such as goiter (P-value = 2 × 10−12) (Supplementary Fig. 11B).
Connecting TWAS discoveries to rare thyroid-related variants using PheWAS
Few TSH and FT4-associated GWAS loci contain genes known to be involved in monogenic thyroid diseases. To investigate potential relationships between the genes identified through our TWAS approach and rare damaging variants associated with thyroid disease, we investigated the individual and aggregated effects of rare, putatively damaging genetic variants across the exome using whole-exome sequencing and clinical outcome data from the UK Biobank (23), as implemented in the AZ Portal (Methods). Our analysis revealed significant associations for the TSH receptor gene with endocrine and metabolic diseases, such as hypothyroidism. We identified other associations described in Supplementary Fig. 12, Supplementary Tables 4 and 5.
Pathway enrichment for TWAS findings
We next performed overrepresentation analyses for genes encoding for significantly associated transcripts from TWAS using GO and KEGG databases to identify enriched pathways (Methods). For TSH, enriched biological processes (GO) and metabolic pathways (KEGG) were related to growth factors, cAMP-mediated signaling, and cAMP biosynthetic process. Additionally, genes encoding for associated transcripts were overrepresented in morphine addiction, relaxin signaling pathway, dopaminergic synapse, and chemical carcinogenesis, among others (Supplementary Fig. 13A and Supplementary Table 6). For FT4, enriched molecular functions and biological processes (GO), as well as metabolic pathways (KEGG), were related to metabolic processes, thyroid hormone generation, metabolism, and signaling, as well as chemical carcinogenesis (Supplementary Fig. 13B and Supplementary Table 6).
Overrepresentation analyses to identify tissues in which the target transcripts were highly expressed (Methods) revealed thyroid tissue for both TSH and FT4 and additionally minor salivary gland for TSH (Supplementary Table 7). These results provide evidence that transcripts implicated by TWAS of TSH and FT4 are indeed involved in molecular pathways that are relevant to thyroid function and molecular metabolism, including TSH and FT4 regulation.
PWAS reveal plasma proteins related to thyroid function
PWAS revealed significant associations of genetically predicted levels of 18 plasma proteins with TSH and of 10 plasma proteins with FT4 (Methods). Two proteins, HEXIM1 and sulfhydryl oxidase 2 (QSOX2), were associated with both hormones (Fig. 4, Supplementary Fig. 14A and Supplementary Table 8). Evidence for a shared causal variant from colocalization testing was observed for five TSH- and five FT4-related proteins (Supplementary Table 9). Although HEXIM1 and sulfhydryl oxidase 2 were not among the proteins for which colocalization was observed, dedicated testing for colocalization with gene expression showed a shared genetic signal underlying FT4 levels and QSOX2 expression in both thyroid and pituitary (PPH4 = 0.969 and 0.968, respectively).
From all significant plasma proteins detected by PWAS of thyroid function, the genes encoding for 5 and 7 proteins did not map into a significant GWAS locus for TSH and FT4, respectively (see Methods, Supplementary Table 8). Details about the function of these novel TH-related candidates are summarized in Supplementary Table 10 (TSH) and Supplementary Table 11 (FT4).
There were 11 significant shared findings between PWAS and TWAS: 10 with TWAS of TSH, including HEXIM1, and one with TWAS of FT4 (Supplementary Fig. 14B and C), providing evidence for a molecular chain connecting genetic variation, gene expression, and protein levels with thyroid function. In particular, ANXA5 (annexin A5) showed colocalization between TSH-associated genetic variants with both their plasma protein and their gene expression levels in HPT axis tissues (Supplementary Table 12).
Annexin A5 is inversely associated with TSH levels
We observed a significant inverse association between plasma levels of annexin A5 and TSH (PWAS P-value = 1.18 × 10−13), and an inverse association between TSH and ANXA5 expression in the hypothalamus (TWAS P-value = 6.40 × 10−13), pituitary (TWAS P-value = 1.57 × 10−15), thyroid (TWAS P-value = 4.27 × 10−15), and whole blood (TWAS P-value = 7.61 × 10−12). These findings were supported by colocalization testing: we observed colocalization between TSH GWAS and both annexin A5 plasma protein levels and gene expression in all studied tissues, with lower TSH levels related to higher protein and expression levels. In line with these findings, colocalization between plasma protein and gene expression levels was also observed with a positive correlation (Fig. 5A).
Annexin A5 (Anxa A5, annexin V) is a member of the Ca2+-dependent phospholipid-binding protein family of annexins, functioning as a membrane stabilizer. Once the membrane is bound, the function of annexins can vary: some members of the annexin family act in vesicle trafficking or membrane organization (1, 25). Furthermore, annexins, including annexin A5, have been described to function as Ca2+ channels under certain conditions (25). The latter finding is of interest because calcium is required in the iodination process and H2O2 production to form thyroid hormones (1). Additionally, a previous study in rabbits showed that continuous daily administration of calcium channel blockers led to a gradual fall in T3 and T4 levels together with a rise in TSH levels (26). Our results showed a positive, although non-significant, association of FT4 and FT3 levels with ANXA5 expression in the thyroid (TWAS P-value = 1.27 × 10−02 and TWAS P-value = 1.02 × 10−01, respectively), and a significant inverse relation between thyroid ANXA5 expression levels and TSH. This pattern is consistent with stimulation of TH production in the presence of annexin A5 and with the negative feedback loop (Fig. 5B).
Connecting PWAS discoveries to rare thyroid-related variants using PheWAS
We investigated whether rare putatively damaging variants in the cognate genes of plasma proteins identified in PWAS showed significant associations across a broad range of clinical traits and diseases. We identified potential connections between rare variants in the genes encoding plasma proteins whose predicted levels were significantly associated with thyroid traits in PWAS (Supplementary Fig. 15, Supplementary Tables 13 and 14). Gene-phenotype associations were observed for PCSK1 (from FT4 analyses) with higher fat mass, weight, and phenotypes related to body mass index. Additionally, plasma levels of peptide YY (PYY), a gut hormone involved in appetite regulation and obesity, were also associated with PCSK1 (27). PCSK1 encodes the prohormone convertase 1/3 (PC1/3), a serine endoprotease responsible for processing precursor proteins such as pro-neuropeptides and prohormones (28). Moreover, IHH, identified by PWAS of FT4, was associated with respiratory function and height-related phenotypes. It encodes a signaling molecule associated with the regulation of skeletal growth and development, phenotypes influenced by thyroid hormones. Lastly, HGFAC, identified by PWAS of FT4, encodes a protease that activates hepatocyte growth factor and was associated with sex hormone-binding globulin (SHBG). Thyroid hormones (T3 and T4) were shown to act indirectly to increase SHBG production in the liver (29), and could thus be part of the pathway from HGFAC expression to SHBG levels.
Discussion
This systematic study of the relation of genetically predicted tissue-specific levels of gene expression and the plasma proteome with the most commonly quantified thyroid function-related hormones, TSH and FT4, has several principal findings: first, we identified thyroid function-associated transcripts and proteins that are missed by conventional GWAS even at sample sizes well over 250,000 individuals. Secondly, the analyses can help prioritize genes in GWAS loci, including previously identified loci. Thirdly, conditional colocalization analyses increased confidence in candidates that share a genetic basis with TH and provided information about tissue-specific and tissue-shared effects on gene expression. Functional annotation of candidate variants implicated a potentially causal, TSH-related regulatory variant driving CAPZB expression in the thyroid. Fourthly, shared genetic causes of transcript, protein, and TSH levels support the role of annexin A5 in TH regulation.
In comparison to a prior TWAS study focusing on TSH (30), our expanded study involved a much larger underlying GWAS and incorporated PWAS of the plasma proteome, revealing a comprehensive set of transcripts and proteins associated with both TSH and FT4 levels. Our results confirmed most of the previously reported associations but also revealed new ones, thus contributing to a broader understanding of thyroid function and expanding current knowledge.
We specifically selected those tissues in which genes that influence thyroid status and disease are predominantly expressed. For TWAS, we, therefore, focused on tissues within the HPT axis (2), namely hypothalamic, pituitary, and thyroid tissue, to examine thyroid hormone production, as well as whole blood, which gives a systemic view and can capture genetic influences on thyroid function that might be reflected in peripheral tissues. Most of the significant transcripts for both TSH and FT4 were found in the thyroid, which may partially be attributable to a higher availability of transcript models for thyroid TWAS due to a larger eQTL sample size for the thyroid compared to the hypothalamus and pituitary in the GTEx data. Additionally, it should be noted that the observed number of transcripts associated in the thyroid with TSH and FT4 cannot be directly compared to hypothalamus and pituitary tissue, as the latter two have a more complex cell type composition and play a role in regulating different hormones (31). Namely, the hypothalamus has a complex and heterogeneous population of neurons and glial cells that regulate also other endocrine axes, which can affect thyroid-related gene expression (32). Similarly, the pituitary gland is composed of different hormone-producing cell types, with TSH-secreting thyrotropes making up less than 10% of the cells in the gland (33).
Our results also underline the link between thyroid hormones and trace elements by identifying not only DIO1 and DIO2 but also the selenium (Se) transport receptor LRP8 and the selenoprotein-specific translational regulator SECISBP2L. These findings are consistent with the well-studied roles of both iodine and selenium in thyroid hormone synthesis and metabolism (34, 35). This points toward potential tissue-specific links, motivating further studies in experimental models.
CAPZB – different SNPs related to different tissues
Six independent variants in the CAPZB locus were identified in the TSH GWAS. Of these, the intronic SNP rs12042004 and the SNP rs10799824 upstream of CAPZB colocalize with gene expression in the hypothalamus and thyroid, respectively. In agreement with a potential regulatory role in the thyroid, rs10799824 maps into an open chromatin peak in the thyroid but not in other studied tissues. CAPZB encodes the β-subunit of the capping protein involved in TSH-induced engulfment of the colloid by extension of microvilli and filopodia, representing a key step for thyroglobulin mobilization and subsequent proteolysis of thyroid hormones, including T3 and T4, in the thyroid gland (36). Given the location of rs10799824 in the thyroid-specific open chromatin peak, it could affect thyroid-specific gene expression of CAPZB, e.g. via differential transcription factor binding, leading to differences in the amount of produced thyroid hormones. In turn, these differences could cause changes in the observed variation of TSH via the negative feedback mechanism by T3 and T4.
Annexin A5 function in thyroid hormone production
We described a conceptual model of the function of annexin A5 in thyroid hormone production based on the inverse association between TSH levels and ANXA5 expression in the thyroid and other HPT-axis tissues. A reverse association, namely an effect of TSH levels on annexin A5 levels, has also been described: previous studies in pigs and rats have shown that annexin A5 levels depend on the cAMP pathway in thyroid cells, and that the concentrations and localization of annexin proteins are under TSH control via the same pathway, resulting in higher expression of annexins mediated by TSH-induced cAMP (37, 38). Conversely, TSH stimulation of rat thyroid cells was inversely associated with ANXA5 expression (39). Our study also yielded novel findings regarding ANXA5 expression in the other HPT-axis tissues and whole blood, as well as a relation of higher annexin A5 plasma levels with lower TSH levels and higher FT4 and FT3 levels. Recent studies supporting these associations of ANXA5 expression and protein levels with TSH levels in thyroid-related tissues are sparse. However, given the tight relation between the different tissues in the HPT-axis, it can be expected that the associations with TSH (and FT4) are also present in hypothalamic and pituitary tissue, but that they are driven by the effect of ANXA5 in the thyroid. The PWAS findings are consistent with the high abundance of annexin A5 in plasma as a reflection of the general function of annexins.
PheWAS supports the relation of PCSK1 to FT4 levels
PheWAS of thyroid function-associated genes identified by PWAS revealed associations between rare genetic variants in PCSK1 and several anthropometric phenotypes, as well as an association with PYY, a hormone related to appetite regulation. Rare mutations in PCSK1 are known to result in a loss of the encoded enzyme’s autocatalytic cleavage ability, leading to a variable pleiotropic syndrome that can include obesity (40), malabsorptive diarrhea, hypogonadotropic hypogonadism, growth hormone deficiency, altered thyroid and adrenal function, or impaired glucose regulation (28, 41). With regard to altered thyroid function, patients with PCSK1 mutations show (mild) central hypothyroidism with low FT4 levels and normal to low TSH levels. This is in agreement with our PWAS findings, which show a significant association between PCSK1-encoded plasma protein levels and FT4 levels, indicating that both rare and common variants in the gene relate to thyroid function (Supplementary Tables 8 and 11).
Study strengths and limitations
TWAS and PWAS are important approaches for gene-based prioritization and for delivering potential mechanistic information by revealing a possible correlation between gene expression or protein levels and a trait. As exemplified in this study on TSH and FT4, which are key players in the HPT-axis, this approach could also be of interest for other axes in endocrinology. Similarly, combining TWAS and PWAS with available GWAS data could be used to identify the molecular underpinnings of endocrine circuits. For instance, GWAS data on cortisol, a key player in the hypothalamic–pituitary–adrenal axis (42), or on luteinizing hormone and follicle-stimulating hormone as part of the hypothalamic–pituitary–gonadal axis (43), can provide valuable insights into the underlying genetic architecture. However, it is important to keep in mind that co-regulation of gene expression or protein levels, as well as predicted expression correlations and shared variants between expression weight models, can induce bias and therefore lead to false positive signals (7). To mitigate this potential bias, we included a conditional colocalization analysis step and confirmed that several candidate transcripts and proteins revealed by the TWAS and PWAS have a shared causal variant with TSH and FT4 levels. Additionally, the ability of TWAS and PWAS approaches to establish directionality in the observed associations is limited. While these approaches enable the identification of potential causal links between genetically regulated transcript and protein levels and TSH and FT4 levels, they do not ascertain which is the cause and which is the consequence. Experimental validation is needed to establish the temporal sequence of these associations.
Given the importance of T3 in thyroid function, larger GWAS of this hormone are needed so that T3 can also be included as an outcome in well-powered PWAS and TWAS analyses. Lastly, the underlying GWAS solely incorporated information from individuals of European descent. We used matching transcript and protein level models derived from persons of European ancestry, and our findings may therefore not be generalizable to other ancestries.
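To illustrate the kind of evidence a colocalization step provides, the following is a minimal sketch of a single-causal-variant colocalization test based on approximate Bayes factors, in the spirit of the methods cited as (15, 16). It is not the conditional analysis pipeline used in this study: the function names, the prior values (`p1`, `p2`, `p12`), the prior effect standard deviation, and the toy effect sizes are all assumptions chosen for illustration, and the sketch ignores conditioning on secondary signals.

```python
import math

def log_abf(beta, se, prior_sd=0.15):
    """Log approximate Bayes factor for one variant (assumed prior effect SD)."""
    z = beta / se
    r = prior_sd ** 2 / (prior_sd ** 2 + se ** 2)  # shrinkage factor
    return 0.5 * (math.log(1.0 - r) + r * z ** 2)

def logsumexp(xs):
    """Numerically stable log(sum(exp(x)))."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def logdiffexp(a, b):
    """log(exp(a) - exp(b)); returns -inf when the difference is not positive."""
    if b >= a:
        return float("-inf")
    return a + math.log1p(-math.exp(b - a))

def coloc_posteriors(labf1, labf2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities of hypotheses H0..H4 for one locus.

    H3: both traits have a causal variant, but different ones.
    H4: both traits share one causal variant.
    Assumes at most one causal variant per trait in the region.
    """
    both = [a + b for a, b in zip(labf1, labf2)]
    s1, s2, s12 = logsumexp(labf1), logsumexp(labf2), logsumexp(both)
    log_h = [
        0.0,                                                    # H0: no association
        math.log(p1) + s1,                                      # H1: trait 1 only
        math.log(p2) + s2,                                      # H2: trait 2 only
        math.log(p1) + math.log(p2) + logdiffexp(s1 + s2, s12), # H3: distinct variants
        math.log(p12) + s12,                                    # H4: shared variant
    ]
    total = logsumexp(log_h)
    return [math.exp(h - total) for h in log_h]

# Hypothetical toy locus with three variants (effect sizes, standard error 0.02).
se = 0.02
expr_shared = [log_abf(b, se) for b in (0.01, 0.20, 0.02)]    # expression signal
trait_shared = [log_abf(b, se) for b in (0.00, 0.15, 0.01)]   # same lead variant
expr_distinct = [log_abf(b, se) for b in (0.20, 0.00, 0.01)]  # different lead variant

pp_shared = coloc_posteriors(expr_shared, trait_shared)
pp_distinct = coloc_posteriors(expr_distinct, trait_shared)
print("shared lead variant:   PP.H3=%.3f PP.H4=%.3f" % (pp_shared[3], pp_shared[4]))
print("distinct lead variant: PP.H3=%.3f PP.H4=%.3f" % (pp_distinct[3], pp_distinct[4]))
```

In this toy setting, posterior mass shifts from H4 to H3 when the expression and trait signals peak at different variants, which is the distinction that helps separate a shared genetic basis from an association induced by nearby but distinct signals.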
Conclusion
In summary, we performed well-powered association studies of genetically inferred, tissue-specific transcript levels and plasma protein levels with the thyroid function parameters TSH and FT4. We identified novel thyroid function-associated genes, including genes outside significant GWAS loci. Tissue-specific colocalization analysis revealed associations between a thyroid-specific regulatory variant, rs10799824, and CAPZB levels. Furthermore, we developed a conceptual framework consistent with the calcium channel activity of annexin A5 in thyroid cells, revealed by inverse, colocalization-supported associations of ANXA5 transcript and protein levels with TSH levels. Our findings contribute to a better understanding of transcriptional regulation and protein levels relevant to thyroid hormone regulation. Finally, our approach can be used as a conceptual blueprint for other endocrine axes.
Supplementary materials
This is linked to the online version of the paper at https://doi.org/10.1530/ETJ-24-0067.
Declaration of interest
MM is on the Editorial Board of European Thyroid Journal. MM was not involved in the review or editorial process for this paper, on which he is listed as an author. The authors declare that there is no conflict of interest that could be prejudicing the impartiality of the research reported here.
Funding
The work of SM-M was supported by the German Research Foundation (DFG) Project-ID 192904750 – CRC 992 Medical Epigenetics. The work of AK, NS, and OB was supported by the DFG Project-ID 431984000 – CRC 1453 NephGen. The work of YC was supported by the DFG project ID 441891347-S1 – SFB 1479. This research has been conducted using the UK Biobank Resource under Application Number 20272.
Availability of data and materials
The GWAS summary results included in this project are available as stated in the respective publications. RNA-seq and ATAC-seq datasets are available from the ENCODE project. RNA-seq (ENCSR023ZXN), ATAC-seq from thyroid (ENCSR474XFV), ATAC-seq from breast epithelium (ENCSR955JSO), ATAC-seq from transverse colon (ENCSR761TKU), and ATAC-seq from stomach (ENCSR851SBY) correspond to donor ENCDO451RUA (displayed in Fig. 3B). ATAC-seq from thyroid (ENCSR914DTI) displayed in Supplementary Fig. 11 corresponds to donor ENCDO793LXB.
Author contribution statement
Research idea and study design: AK, SM-M. Data analysis: NS, OB, SM-M, YC. Data interpretation: AK, AT, MM, RS, SM-M. Supervision or mentorship: AK, AT. Every author played a crucial role in shaping the manuscript through their significant intellectual contributions. Furthermore, each author accepts personal responsibility for their contributions and is committed to addressing any inquiries regarding the accuracy or integrity of any part of the research.
Acknowledgements
We are grateful to the ThyroidOmics Consortium for making the GWAS summary results available. We acknowledge the consortium’s efforts in generating and sharing this dataset, which significantly contributed to the findings presented in this article. The Genotype-Tissue Expression (GTEx) Project was supported by the Common Fund of the Office of the Director of the National Institutes of Health, and by NCI, NHGRI, NHLBI, NIDA, NIMH, and NINDS. Summary-level plasma proteome data are available from the Atherosclerosis Risk in Communities (ARIC) study, funded by the National Heart, Lung, and Blood Institute (NHLBI) of the National Institutes of Health (NIH).
References
1. Maenhaut C, Christophe D, Vassart G, Dumont J, Roger PP, & Opitz R. Ontogeny, anatomy, metabolism and physiology of the thyroid. In Endotext. KR Feingold, B Anawalt, MR Blackman (editors). South Dartmouth, MA: MDText.com, Inc. 2015. Available at: https://www.ncbi.nlm.nih.gov/books/NBK285554/.
2. Medici M, Visser WE, Visser TJ, & Peeters RP. Genetic determination of the hypothalamic-pituitary-thyroid axis: where do we stand? Endocrine Reviews 2015 36 214–244. (https://doi.org/10.1210/er.2014-1081)
3. Hannoush ZC, & Weiss RE. Defects of thyroid hormone synthesis and action. Endocrinology and Metabolism Clinics of North America 2017 46 375–388. (https://doi.org/10.1016/j.ecl.2017.01.005)
4. Teumer A, Chaker L, Groeneweg S, Li Y, Di Munno C, Barbieri C, Schultheiss UT, Traglia M, Ahluwalia TS, Akiyama M, et al. Genome-wide analyses identify a role for SLC17A4 and AADAT in thyroid hormone regulation. Nature Communications 2018 9 4455. (https://doi.org/10.1038/s41467-018-06356-1)
5. Sterenborg RBTM, Galesloot TE, Teumer A, Netea-Maier RT, Speed D, Meima ME, Visser WE, Smit JWA, Peeters RP, & Medici M. The effects of common genetic variation in 96 genes involved in thyroid hormone regulation on TSH and FT4 concentrations. Journal of Clinical Endocrinology and Metabolism 2022 107 e2276–e2283. (https://doi.org/10.1210/clinem/dgac136)
6. Sterenborg RBTM, Steinbrenner I, Li Y, Bujnis MN, Naito T, Marouli E, Galesloot TE, Babajide O, Andreasen L, Astrup A, et al. Multi-trait analysis characterizes the genetics of thyroid function and identifies causal associations with clinical implications. Nature Communications 2024 15 888. (https://doi.org/10.1038/s41467-024-44701-9)
7. Wainberg M, Sinnott-Armstrong N, Mancuso N, Barbeira AN, Knowles DA, Golan D, Ermel R, Ruusalepp A, Quertermous T, Hao K, et al. Opportunities and challenges for transcriptome-wide association studies. Nature Genetics 2019 51 592–599. (https://doi.org/10.1038/s41588-019-0385-z)
8. Li B, & Ritchie MD. From GWAS to gene: transcriptome-wide association studies and other methods to functionally understand GWAS discoveries. Frontiers in Genetics 2021 12 713230. (https://doi.org/10.3389/fgene.2021.713230)
9. GTEx Consortium. The GTEx Consortium atlas of genetic regulatory effects across human tissues. Science 2020 369 1318–1330. (https://doi.org/10.1126/science.aaz1776)
10. Carithers LJ, Ardlie K, Barcus M, Branton PA, Britton A, Buia SA, Compton CC, DeLuca DS, Peter-Demchok J, Gelfand ET, et al. A novel approach to high-quality postmortem tissue procurement: the GTEx project. Biopreservation and Biobanking 2015 13 311–319. (https://doi.org/10.1089/bio.2015.0032)
11. Zhang J, Dutta D, Köttgen A, Tin A, Schlosser P, Grams ME, Harvey B, Yu B, Boerwinkle E, Coresh J, et al. Plasma proteome analyses in individuals of European and African ancestry identify cis-pQTLs and models for proteome-wide association studies. Nature Genetics 2022 54 593–602. (https://doi.org/10.1038/s41588-022-01051-w)
12. Lawrence M, Gentleman R, & Carey V. rtracklayer: an R package for interfacing with genome browsers. Bioinformatics 2009 25 1841–1842. (https://doi.org/10.1093/bioinformatics/btp328)
13. Gusev A, Ko A, Shi H, Bhatia G, Chung W, Penninx BWJH, Jansen R, de Geus EJC, Boomsma DI, Wright FA, et al. Integrative approaches for large-scale transcriptome-wide association studies. Nature Genetics 2016 48 245–252. (https://doi.org/10.1038/ng.3506)
14. Yang J, Lee SH, Goddard ME, & Visscher PM. GCTA: a tool for genome-wide complex trait analysis. American Journal of Human Genetics 2011 88 76–82. (https://doi.org/10.1016/j.ajhg.2010.11.011)
15. Wakefield J. Bayes factors for genome-wide association studies: comparison with P-values. Genetic Epidemiology 2009 33 79–86. (https://doi.org/10.1002/gepi.20359)
16. Giambartolomei C, Vukcevic D, Schadt EE, Franke L, Hingorani AD, Wallace C, & Plagnol V. Bayesian test for colocalisation between pairs of genetic association studies using summary statistics. PLoS Genetics 2014 10 e1004383. (https://doi.org/10.1371/journal.pgen.1004383)
17. Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nature Genetics 2000 25 25–29. (https://doi.org/10.1038/75556)
18. Kanehisa M, & Goto S. KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Research 2000 28 27–30. (https://doi.org/10.1093/nar/28.1.27)
19. Uhlén M, Fagerberg L, Hallström BM, Lindskog C, Oksvold P, Mardinoglu A, Sivertsson Å, Kampf C, Sjöstedt E, Asplund A, et al. Proteomics. Tissue-based map of the human proteome. Science 2015 347 1260419. (https://doi.org/10.1126/science.1260419)
20. Schlosser P, Scherer N, Grundner-Culemann F, Monteiro-Martins S, Haug S, Steinbrenner I, Uluvar B, Wuttke M, Cheng Y, Ekici AB, et al. Genetic studies of paired metabolomes reveal enzymatic and transport processes at the interface of plasma and urine. Nature Genetics 2023 55 995–1008. (https://doi.org/10.1038/s41588-023-01409-8)
21. Wu T, Hu E, Xu S, Chen M, Guo P, Dai Z, Feng T, Zhou L, Tang W, Zhan L, et al. clusterProfiler 4.0: a universal enrichment tool for interpreting omics data. Innovation 2021 2 100141. (https://doi.org/10.1016/j.xinn.2021.100141)
22. Benjamini Y, & Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society Series B 1995 57 289–300. (https://doi.org/10.1111/j.2517-6161.1995.tb02031.x)
23. Wang Q, Dhindsa RS, Carss K, Harper AR, Nag A, Tachmazidou I, Vitsios D, Deevi SVV, Mackay A, Muthas D, et al. Rare variant contribution to human disease in 281,104 UK Biobank exomes. Nature 2021 597 527–532. (https://doi.org/10.1038/s41586-021-03855-y)
24. Gagliano Taliun SA, VandeHaar P, Boughton AP, Welch RP, Taliun D, Schmidt EM, Zhou W, Nielsen JB, Willer CJ, Lee S, et al. Exploring and visualizing large-scale genetic associations using PheWeb. Nature Genetics 2020 52 550–552. (https://doi.org/10.1038/s41588-020-0622-5)
25. Gerke V, & Moss SE. Annexins: from structure to function. Physiological Reviews 2002 82 331–371. (https://doi.org/10.1152/physrev.00030.2001)
26. Mittal SR, Mathur AK, & Prasad N. Effect of calcium channel blockers on serum levels of thyroid hormones. International Journal of Cardiology 1993 38 131–132. (https://doi.org/10.1016/0167-5273(93)90171-c)
27. Karra E, Chandarana K, & Batterham RL. The role of peptide YY in appetite regulation and obesity. Journal of Physiology 2009 587 19–25. (https://doi.org/10.1113/jphysiol.2008.164269)
28. Stijnen P, Ramos-Molina B, O’Rahilly S, & Creemers JWM. PCSK1 mutations and human endocrinopathies: from obesity to gastrointestinal disorders. Endocrine Reviews 2016 37 347–371. (https://doi.org/10.1210/er.2015-1117)
29. Selva DM, & Hammond GL. Thyroid hormones act indirectly to increase sex hormone-binding globulin production by liver via hepatocyte nuclear factor-4alpha. Journal of Molecular Endocrinology 2009 43 19–27. (https://doi.org/10.1677/JME-09-0025)
30. Ke X, Tian X, Yao S, Wu H, Duan YY, Wang NN, Shi W, Yang TL, Dong SS, Huang D, et al. Transcriptome-wide association study identifies multiple genes and pathways associated with thyroid function. Human Molecular Genetics 2022 31 1871–1883. (https://doi.org/10.1093/hmg/ddab371)
31. Feldt-Rasmussen U, Effraimidis G, & Klose M. The hypothalamus-pituitary-thyroid (HPT)-axis and its role in physiology and pathophysiology of other hypothalamus-pituitary functions. Molecular and Cellular Endocrinology 2021 525 111173. (https://doi.org/10.1016/j.mce.2021.111173)
32. Fekete C, & Lechan RM. Central regulation of hypothalamic-pituitary-thyroid axis under physiological and pathophysiological conditions. Endocrine Reviews 2014 35 159–194. (https://doi.org/10.1210/er.2013-1087)
33. Ooi GT, Tawadros N, & Escalona RM. Pituitary cell lines and their endocrine applications. Molecular and Cellular Endocrinology 2004 228 1–21. (https://doi.org/10.1016/j.mce.2004.07.018)
34. Köhrle J. Selenium, iodine and iron – essential trace elements for thyroid hormone synthesis and metabolism. International Journal of Molecular Sciences 2023 24 3393. (https://doi.org/10.3390/ijms24043393)
35. Chaker L, Razvi S, Bensenor IM, Azizi F, Pearce EN, & Peeters RP. Hypothyroidism. Nature Reviews. Disease Primers 2022 8 30. (https://doi.org/10.1038/s41572-022-00357-7)
36. Teumer A, Rawal R, Homuth G, Ernst F, Heier M, Evert M, Dombrowski F, Völker U, Nauck M, Radke D, et al. Genome-wide association study identifies four genetic loci associated with thyroid volume and goiter risk. American Journal of Human Genetics 2011 88 664–673. (https://doi.org/10.1016/j.ajhg.2011.04.015)
37. el Btaouri H, Claisse D, Bellon G, Antonicelli F, & Haye B. In vivo modulation of annexins I, II and V expression by thyroxine and methylthiouracil. European Journal of Biochemistry 1996 242 506–511. (https://doi.org/10.1111/j.1432-1033.1996.0506r.x)
38. Elbtaouri H, Antonicelli F, Claisse D, Delemer B, & Haye B. Cyclic AMP regulation of annexins I, II, V synthesis and localization in cultured porcine thyroid cells. Biochimie 1994 76 417–422. (https://doi.org/10.1016/0300-9084(94)90118-x)
39. Lorenz S, Eszlinger M, Paschke R, Aust G, Weick M, Führer D, & Krohn K. Calcium signaling of thyrocytes is modulated by TSH through calcium binding protein expression. Biochimica et Biophysica Acta 2010 1803 352–360. (https://doi.org/10.1016/j.bbamcr.2010.01.007)
40. Ayers KL, Glicksberg BS, Garfield AS, Longerich S, White JA, Yang P, Du L, Chittenden TW, Gulcher JR, Roy S, et al. Melanocortin 4 receptor pathway dysfunction in obesity: patient stratification aimed at MC4R agonist treatment. Journal of Clinical Endocrinology and Metabolism 2018 103 2601–2612. (https://doi.org/10.1210/jc.2018-00258)
41. Wilschanski M, Abbasi M, Blanco E, Lindberg I, Yourshaw M, Zangen D, Berger I, Shteyer E, Pappo O, Bar-Oz B, et al. A novel familial mutation in the PCSK1 gene that alters the oxyanion hole residue of proprotein convertase 1/3 and impairs its enzymatic activity. PLoS One 2014 9 e108878. (https://doi.org/10.1371/journal.pone.0108878)
42. Crawford AA, Bankier S, Altmaier E, Barnes CLK, Clark DW, Ermel R, Friedrich N, van der Harst P, Joshi PK, Karhunen V, et al. Variation in the SERPINA6/SERPINA1 locus alters morning plasma cortisol, hepatic corticosteroid binding globulin expression, gene expression in peripheral tissues, and risk of cardiovascular disease. Journal of Human Genetics 2021 66 625–636. (https://doi.org/10.1038/s10038-020-00895-6)
43. Genome-wide association study with 1000 genomes imputation identifies signals for nine sex hormone-related phenotypes - PubMed [Internet]. [cited 2024 Jan 28]. Available at: https://pubmed.ncbi.nlm.nih.gov/26014426/.
44. Pruim RJ, Welch RP, Sanna S, Teslovich TM, Chines PS, Gliedt TP, Boehnke M, Abecasis GR, & Willer CJ. LocusZoom: regional visualization of genome-wide association scan results. Bioinformatics 2010 26 2336–2337. (https://doi.org/10.1093/bioinformatics/btq419)
45. ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 2012 489 57–74. (https://doi.org/10.1038/nature11247)