Proteins serve as a link between genes and diseases
Scientists have found that many diseases, such as thalassemia and sickle cell anemia, have a genetic origin. The human genome is the complete set of instructions for building a human being, and scientists estimate that the nuclear genome encodes around 20,000 protein-coding genes. Most human DNA is packaged into chromosomes in the nucleus, and a small amount is found in the mitochondria. We inherit these chromosomes from our parents. Variations in our DNA, and the resulting differences in how it functions, combine with environmental factors to influence the disease process.
Recently, a group of scientists published a study that maps the genetic data linked to particular diseases onto the proteins circulating in the blood. Let us look at how this proteo-genomic link, within and between different diseases, helps connect genes to diseases through proteins.
How do the genes connect to disease through proteins?
Genes contain the information required to make proteins. Proteins, which are built from chains of amino acids, are the vital functional units of our body. The pathway from gene to protein is called gene expression. It is a complex process, but one that is tightly controlled by our cells.
We know that malfunctioning proteins can result in several diseases. Proteins are therefore also the main drug targets when treating a particular disease.
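The gene-to-protein pathway can be illustrated with a toy sketch. The code below is a simplified model, not a bioinformatics tool: the codon table lists only the few codons it needs, the short DNA fragments are illustrative (modelled on the start of the beta-globin gene, with a stop codon appended for convenience), and the function names `transcribe` and `translate` are chosen here for illustration.

```python
# Toy model of gene expression: DNA -> mRNA -> protein.
# Only the codons used in this example are listed; the real genetic code has 64.
CODON_TABLE = {
    "AUG": "Met",  # start codon
    "GUG": "Val",
    "CAU": "His",
    "CUG": "Leu",
    "ACU": "Thr",
    "CCU": "Pro",
    "GAG": "Glu",
    "UAA": None,   # stop codon
}


def transcribe(coding_dna: str) -> str:
    """Return the mRNA for a coding-strand DNA sequence (T becomes U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> list[str]:
    """Read the mRNA three bases at a time until a stop codon is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon not in CODON_TABLE:
            raise ValueError(f"codon {codon} is not in this toy table")
        amino_acid = CODON_TABLE[codon]
        if amino_acid is None:  # stop codon ends translation
            break
        protein.append(amino_acid)
    return protein


# Illustrative fragments: a single A -> T change turns one Glu codon into Val,
# the kind of substitution that underlies sickle cell anemia.
normal_dna = "ATGGTGCATCTGACTCCTGAGGAGTAA"
variant_dna = "ATGGTGCATCTGACTCCTGTGGAGTAA"

normal_protein = translate(transcribe(normal_dna))
variant_protein = translate(transcribe(variant_dna))
```

Comparing `normal_protein` with `variant_protein` shows how a change of one DNA letter alters a single amino acid in the resulting protein.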
In a recent study of the human genome, scientists observed that variations at specific sites in the genome are associated with the abundance or function of particular proteins circulating in the blood. They performed a genome-proteome-wide association study that included 4775 protein targets measured in plasma from 10,708 individuals of European descent (mean age 48.6 years, 53.3% women). They identified 10,674 genetic variant-protein target associations covering 3892 distinct protein targets, most of which have cis-protein quantitative trait loci (cis-pQTLs). These cis-pQTLs are particularly useful for prioritizing the genes that are responsible for a given disease. A genetic variant can thus contribute to a disease through the protein it affects. The resulting proteo-genomic map of human health provides insights into the shared etiology across diseases and helps identify pathophysiological pathways through cross-domain integration.
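A cis-pQTL is a variant that lies close to the gene encoding the protein it affects, whereas a trans-pQTL lies elsewhere in the genome. The sketch below shows one simple way to make that distinction. The window size, the class names, and the coordinates are all assumptions chosen for illustration and are not taken from the study.

```python
from dataclasses import dataclass

# Assumption: an illustrative window of 1 Mb on either side of the gene.
# The study's actual definition of "cis" may differ.
CIS_WINDOW_BP = 1_000_000


@dataclass
class Gene:
    chrom: str
    start: int
    end: int


@dataclass
class Variant:
    chrom: str
    pos: int


def classify_pqtl(variant: Variant, gene: Gene, window: int = CIS_WINDOW_BP) -> str:
    """Label a variant 'cis' if it lies within `window` bp of the protein's gene."""
    if variant.chrom != gene.chrom:
        return "trans"
    if gene.start - window <= variant.pos <= gene.end + window:
        return "cis"
    return "trans"


# Hypothetical coordinates, for illustration only.
protein_gene = Gene(chrom="1", start=5_000_000, end=5_020_000)
nearby_variant = Variant(chrom="1", pos=5_300_000)
distant_variant = Variant(chrom="1", pos=9_000_000)
other_chrom_variant = Variant(chrom="7", pos=5_010_000)

labels = [
    classify_pqtl(v, protein_gene)
    for v in (nearby_variant, distant_variant, other_chrom_variant)
]
```

Because a cis-pQTL sits next to the gene that encodes the affected protein, it points fairly directly to that gene, which is why such variants are useful for prioritizing candidate genes.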
Figure 1 depicts the summary of the proteogenomic map that connects genes to diseases through proteins.
Figure 1: Mapping the proteogenomic convergence in human diseases.
(Source: Pietzner M, Wheeler E, Carrasco-Zanini J, Cortes A, Koprulu M, Wörheide MA, Oerton E, Cook J, Stewart ID, Kerrison ND, Luan JA. Mapping the proteo-genomic convergence of human diseases. Science. 2021 Oct 14;374(6569):eabj1541.)
In another study, serum proteins were measured in a group of Icelanders over the age of 65. The proteins formed network modules that were associated with heart disease, metabolic disorders, and overall survival. The study indicated that these network modules were controlled by cis- and trans-acting genetic variants.
Some examples of diseases connected to genes through proteins:
These studies link candidate causal genes for certain diseases to specific proteins. The following are a few findings:
- They assigned RSPO1 as a causal gene for endometrial cancer. The gene encodes R-spondin-1, a secreted activator protein that also acts as an adult stem cell growth factor, and a secondary cis-pQTL for this protein was the lead signal for endometrial cancer.
- They linked a single protein to several soft tissue disorders, showing gene-protein convergence across almost 37 different diseases.
- Studies have identified a gene in the human genome known as KAT8 and assigned a protein to this particular locus in people with Alzheimer’s disease.
- Another example is NSF, which encodes N-ethylmaleimide-sensitive factor (NSF). This protein is involved in the fusion of vesicles with membranes, which enables the release of neurotransmitters into the extracellular space. NSF was identified at a locus associated with Parkinson’s disease.
- The map has also highlighted ten diseases for which five or more colocalizing cis-pQTLs have been found. The diseases include coronary artery disease, hyperlipidemia, ulcerative colitis, Alzheimer’s disease, and type 2 diabetes.
The proteogenomic map points to proteins that can be targeted to treat specific genetic diseases. It helps classify diseases by their causal genes, which in turn suggests treatment targets for those diseases. Through this map, seemingly diverse diseases can be grouped by their shared genetic etiology.
Pietzner M, Wheeler E, Carrasco-Zanini J, Cortes A, Koprulu M, Wörheide MA, Oerton E, Cook J, Stewart ID, Kerrison ND, Luan JA. Mapping the proteo-genomic convergence of human diseases. Science. 2021 Oct 14;374(6569):eabj1541.
Jackson M, Marks L, May GHW, Wilson JB. The genetic basis of disease. Essays Biochem. 2018 Dec 2;62(5):643-723. doi: 10.1042/EBC20170053. Erratum in: Essays Biochem. 2020 Oct 8;64(4):681. PMID: 30509934; PMCID: PMC6279436.
Emilsson V, Ilkov M, Lamb JR, Finkel N, Gudmundsson EF, Pitts R, Hoover H, Gudmundsdottir V, Horman SR, Aspelund T, Shu L. Co-regulatory networks of human serum proteins link genetics to disease. Science. 2018 Aug 24;361(6404):769-73.