Transcription factors

Transcription factors are proteins that regulate the transcription of genes. In a sense, they are the switches that turn a gene on or off.

Regulatory regions

  • Promoters: for any gene to be expressed, RNA polymerase needs to bind to a DNA region upstream of the gene, called a promoter, to start transcription. In eukaryotes, RNA polymerase can only bind to promoters with the help of transcription factors.
  • Activators & enhancers: sometimes when a transcription factor binds to DNA, it increases the likelihood of the gene being expressed (turns the gene on, or upregulates it). These transcription factors are called activators, and the DNA regions they bind to are called enhancers.
  • Repressors and silencers: conversely, when a transcription factor decreases the likelihood of a gene’s expression (turns it off, or downregulates it), the transcription factor is called a repressor, and its DNA binding site is called a silencer.

Histones

  • Histones are the protein “beads” that DNA molecules wrap around in the “beads on a string” structure in DNA 3D Structures.
  • When the histone beads pack tightly in chromatin, the transcription factors can’t access the promoter, and genes aren’t activated.
  • The packaging has to loosen so that the promoter regions are exposed and gene expression can occur.

CpG islands

  • CpG islands are regions of the genome that are rich in the dinucleotide CG (cytosine-phosphate-guanine; the p represents the phosphate bond linking the two bases, read in the 5’ → 3’ direction). Note that because C pairs with G, each CpG dinucleotide on one strand has a corresponding CpG dinucleotide on the reverse complement strand.
  • CpG islands are usually located in or near the promoter regions of genes. CpG islands are found upstream of ~60% of protein-coding genes. In the early days of annotating the human genome, CpG islands were used as signposts of potential genes downstream.
  • GC content (the percentage of G or C in the sequence) is a predictor of gene content and varies a lot across the genome. Fun fact: chromosome 19 has both the highest CpG island density (~40/Mb) and the highest gene density (~20/Mb) in the human genome.
  • DNA methylation and gene silencing
    • For the CpG dinucleotides in CpG islands, the C (cytosine) in CG can be methylated.
    • When many CpGs in a CpG island are methylated, the methylation blocks transcription factor binding and inhibits or silences the expression of downstream genes.
    • DNA methylation can be pretty stable. It can be passed on to daughter cells when a cell divides, but it is mostly reset between generations, so it is generally not inherited when a new baby is born.
    • DNA methylation has been used as a biomarker for aging and is associated with diseases such as cancer and developmental disorders.
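
The GC content and CpG counting ideas above can be sketched in a few lines of Python. This is only a minimal illustration, assuming the sequence is a plain string of A, C, G and T; the function names (gc_content, count_cpg, reverse_complement) and the toy sequence are made up for this example and do not come from any particular library.

```python
# Hypothetical helper names and toy sequence, chosen for illustration only.
# Assumes the input contains only the letters A, C, G, T (any case).

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_content(seq: str) -> float:
    """Return the fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def count_cpg(seq: str) -> int:
    """Count CG dinucleotides, reading the string 5' -> 3'."""
    # "CG" cannot overlap with itself, so str.count finds every occurrence.
    return seq.upper().count("CG")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, also written 5' -> 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    seq = "TTCGCGAACGT"  # toy sequence, not real genomic data
    print(f"GC content: {gc_content(seq):.3f}")
    print(f"CpG count (forward strand): {count_cpg(seq)}")
    print(f"CpG count (reverse complement): {count_cpg(reverse_complement(seq))}")
```

The two CpG counts always match, which is the reverse complement symmetry noted above: every CG on one strand pairs with a CG on the other. Real CpG island detection usually scans the genome in windows and applies extra thresholds, which this sketch does not attempt.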