Understanding Gene Hunting and Disease Genes in Human Genetics
Gene hunting involves finding genes responsible for diseases by statistically linking them with markers on chromosomes. This process relies on the logical structure of chromosomes, genotypes versus phenotypes, recombination phenomena, and specific loci like the ABO locus on Chromosome 9. By analyzing alleles and haplotypes, researchers can infer the genetic basis of diseases and understand inheritance patterns in human genetics.
Presentation Transcript
Gene Hunting: find genes responsible for a given disease. Main idea: if a disease is statistically linked with a marker on a chromosome, then tentatively infer that a gene causing the disease is located near that marker. Some slides were prepared by Ma'ayan Fishelson, some by Nir, and most are mine. I have slightly edited all slides.
Human Genome. Most human cells contain 46 chromosomes: 22 pairs of chromosomes named autosomes, and 2 sex chromosomes (X, Y): XY in males, XX in females.
Sexual Reproduction. [Figure: the gametes (egg and sperm) combine to form a zygote.]
Chromosome Logical Structure. Locus: the location of a gene or other marker on the chromosome. Allele: one variant form (or state) of a gene/marker at a particular locus. Example: Locus 1 has possible alleles A1, A2; Locus 2 has possible alleles B1, B2, B3.
Genotypes versus Phenotypes. At each locus (except on the sex chromosomes) there are 2 genes, one on each chromosome of the pair. These constitute the individual's genotype at the locus. The expression of a genotype is termed a phenotype, for example hair color, weight, or the presence or absence of a disease.
Recombination Phenomenon. A recombination between 2 genes has occurred if the haplotype of the individual contains 2 alleles that resided in different haplotypes in the individual's parent. (Haplotype: the alleles at different loci that are received by an individual from one parent.)
An example: the ABO locus. The ABO locus determines detectable antigens on the surface of red blood cells. The 3 major alleles (A, B, O) interact to determine the various ABO blood types. O is recessive to A and B. Alleles A and B are codominant.

Phenotype   Genotype
A           A/A, A/O
B           B/B, B/O
AB          A/B
O           O/O

Note that the listed genotypes are unordered (we don't know which allele is from the father and which one is from the mother).
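The dominance rules for the ABO locus can be written as a small lookup. This is a minimal sketch for illustration only; the function name `abo_phenotype` is hypothetical and not part of the original slides.

```python
def abo_phenotype(genotype):
    """Map an unordered ABO genotype, e.g. ("A", "O"), to a blood type."""
    if len(genotype) != 2 or not set(genotype) <= {"A", "B", "O"}:
        raise ValueError(f"not an ABO genotype: {genotype!r}")
    alleles = set(genotype)
    if alleles == {"A", "B"}:
        return "AB"   # A and B are codominant
    if "A" in alleles:
        return "A"    # O is recessive to A
    if "B" in alleles:
        return "B"    # O is recessive to B
    return "O"        # only O/O shows the O phenotype


for g in [("A", "A"), ("A", "O"), ("B", "O"), ("A", "B"), ("O", "O")]:
    print("/".join(g), "->", abo_phenotype(g))
```

Because the genotype is treated as a set, the order of the two alleles does not matter, which mirrors the note that the listed genotypes are unordered.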
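Example: ABO near AK1 on Chromosome 9. [Pedigree figure: five individuals, each labeled with ABO alleles (A, O) and a genotype at the AK1 marker (alleles A1, A2). Marker genotypes as listed: individual 1 A1/A1, individual 2 A2/A2, individual 3 A1/A2, individual 4 A2/A2, individual 5 A1/A2. Individual 5 is marked as a recombinant.]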
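Example for Finding Disease Genes. [Pedigree figure: five individuals, each labeled with alleles H and A at the speculated disease locus and a genotype at the marker. Marker genotypes as listed: individual 1 A1/A1, individual 2 A2/A2, individual 3 A1/A2, individual 4 A2/A2, individual 5 A1/A2. Individual 5 is marked as a recombinant.] We use a marker with codominant alleles A1/A2. We speculate a locus with alleles H (healthy) / A (affected). If the expected number of recombinants is low (close to zero), then the speculated locus and the marker are tentatively physically close.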
The method just described is called genetic linkage analysis. It uses the phenomenon of recombination in families of affected individuals to locate the vicinity of a disease gene.
Comments about the example. Often: pedigrees are larger and more complex; not every individual is typed; there are more markers and they have more than two alleles; recombinants cannot always be determined.
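Usually recombination cannot be simply counted. [Pedigree figure: five individuals with marker genotypes as listed: individual 1 A1/A1, individual 2 A2/A2, individual 3 A1/A2, individual 4 A2/A2, individual 5 A1/A2. Some ABO alleles of individual 5 are unknown (?), so the figure labels it "Recombinant? Sometimes!"] One can compute the likelihood of the data given every location and choose the most likely location.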
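To illustrate what choosing the most likely value means, here is a minimal sketch for the simplest case only, where recombinants can in fact be counted. It assumes `recombinants` out of `meioses` informative meioses and a binomial likelihood in the recombination fraction, restricted to the range 0 to 0.5. The function names, the grid search, and the counts are assumptions for illustration; when recombinants cannot be counted, the likelihood has to be computed over the whole pedigree instead.

```python
def likelihood(theta, recombinants, meioses):
    """Binomial likelihood (up to a constant) of the observed counts."""
    return theta ** recombinants * (1 - theta) ** (meioses - recombinants)


def most_likely_theta(recombinants, meioses, steps=500):
    """Grid search over theta in [0, 0.5] for the highest likelihood."""
    grid = [0.5 * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda t: likelihood(t, recombinants, meioses))


# Hypothetical counts: 1 recombinant observed in 10 meioses.
print(most_likely_theta(1, 10))
```

A small estimate suggests that the two loci are close; an estimate at the upper bound of 0.5 is what unlinked loci would give.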
A Bayesian Network Model. [Network figure: nodes L11m, L11f, X11, S13m, L13m.] L13m is the maternal allele at locus 1 of person 3 (the offspring). S13m is the selector of the maternal allele at locus 1 of person 3, with P(s13m) = 1/2. Selector variables Sijm are 0 or 1 depending on which of the mother's two alleles is transmitted to person j at locus i.
P(l13m | l11m, l11f, S13m=0) = 1 if l13m = l11m
P(l13m | l11m, l11f, S13m=1) = 1 if l13m = l11f
P(l13m | l11m, l11f, s13m) = 0 otherwise
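The transmission probabilities can be sketched as a small function. This is an illustrative sketch, not the code of any particular tool: the function names are hypothetical, and the second function assumes that both selector values are equally likely (probability 1/2 each, as in Mendelian segregation).

```python
def transmission_prob(child_allele, mother_maternal, mother_paternal, selector):
    """P(l13m | l11m, l11f, s13m): 1 if the selected allele was passed on, else 0."""
    if selector not in (0, 1):
        raise ValueError("selector must be 0 or 1")
    inherited = mother_maternal if selector == 0 else mother_paternal
    return 1.0 if child_allele == inherited else 0.0


def maternal_allele_distribution(mother_maternal, mother_paternal, alleles):
    """Sum out the selector, assuming P(s = 0) = P(s = 1) = 1/2."""
    return {
        a: sum(0.5 * transmission_prob(a, mother_maternal, mother_paternal, s)
               for s in (0, 1))
        for a in alleles
    }


print(maternal_allele_distribution("A1", "A2", ["A1", "A2", "A3"]))
```

A heterozygous mother passes on each of her two alleles with probability 1/2, while a homozygous mother passes on her single allele with probability 1.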
Probabilistic model for two loci. [Network figure: two copies of the single-locus model. Model for locus 1: L11m, L11f, X11, L12m, L12f, X12, S13m, S13f, L13m, L13f, X13. Model for locus 2: L21m, L21f, X21, L22m, L22f, X22, S23m, S23f, L23m, L23f, X23.]
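Probabilistic model for Recombination. [Network figure: the two-locus model with nodes L11m, L11f, X11, L12m, L12f, X12, S13m, S13f, L13m, L13f, X13 for locus 1 and L21m, L21f, X21, L22m, L22f, X22, S23m, S23f, L23m, L23f, X23 for locus 2.]
$$P(s_{23t} \mid s_{13t}, \theta_2) = \begin{pmatrix} 1-\theta_2 & \theta_2 \\ \theta_2 & 1-\theta_2 \end{pmatrix}, \quad t \in \{m, f\}$$
where $\theta_2$ is called the recombination fraction between loci 2 & 1.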
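The dependence between the selectors at two loci can be sketched as a 2x2 table indexed by the selector at locus 1 (row) and the selector at locus 2 (column). The sketch assumes the usual form of the table, in which the two selectors differ with probability equal to the recombination fraction `theta`, and a prior of 1/2 on the first selector; the function names are hypothetical.

```python
def selector_transition(theta):
    """P(s23t | s13t, theta) as a table: rows are s13t, columns are s23t."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("recombination fraction must lie in [0, 0.5]")
    return [[1 - theta, theta],
            [theta, 1 - theta]]


def joint_selector_prob(s1, s2, theta):
    """P(s13t = s1, s23t = s2), assuming P(s13t = 0) = P(s13t = 1) = 1/2."""
    return 0.5 * selector_transition(theta)[s1][s2]


print(selector_transition(0.1))
print(joint_selector_prob(0, 1, 0.1))
```

With `theta = 0.5` the four joint probabilities are all 1/4, so the selectors are independent (unlinked loci); with `theta = 0` the two selectors always agree (no recombination).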
Modeling Phenotypes I. [Network figure: nodes L11m, L11f, X11, S13m, Y11, L13m.] Phenotype variables Yij are 0 or 1 depending on whether a phenotypic trait associated with locus i of person j is observed, e.g., sick versus healthy. For example, the model of a perfect recessive disease yields the penetrance probabilities:
P(y11 = sick | X11 = (a,a)) = 1
P(y11 = sick | X11 = (A,a)) = 0
P(y11 = sick | X11 = (A,A)) = 0
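The penetrance probabilities of the perfect recessive model can be sketched as follows. The function names and the default allele label `"a"` are chosen for illustration and are assumptions, not part of the original slides.

```python
def recessive_penetrance(genotype, disease_allele="a"):
    """P(sick | genotype) for a perfect recessive disease."""
    return 1.0 if all(allele == disease_allele for allele in genotype) else 0.0


def phenotype_prob(is_sick, genotype):
    """P(y | genotype), where y is the observed phenotype (sick or healthy)."""
    p_sick = recessive_penetrance(genotype)
    return p_sick if is_sick else 1.0 - p_sick


for g in [("a", "a"), ("A", "a"), ("A", "A")]:
    print(g, recessive_penetrance(g))
```

A model with incomplete penetrance would replace the values 1 and 0 by intermediate probabilities, while the rest of the structure stays the same.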
Introducing a tentative disease locus. [Network figure: the two-locus model. Marker locus: L11m, L11f, X11, L12m, L12f, X12, S13m, S13f, L13m, L13f, X13. Disease locus: L21m, L21f, X21, L22m, L22f, X22, S23m, S23f, L23m, L23f, X23; assume sick means xij = (a,a).]
$$P(s_{23t} \mid s_{13t}, \theta_2) = \begin{pmatrix} 1-\theta_2 & \theta_2 \\ \theta_2 & 1-\theta_2 \end{pmatrix}, \quad t \in \{m, f\}$$
The recombination fraction $\theta_2$ is unknown. Finding it can help determine whether a gene causing the disease lies in the vicinity of the marker locus.
SUPERLINK
Stage 1: each pedigree is translated into a Bayesian network.
Stage 2: value elimination is performed on each pedigree (i.e., some of the impossible values of the variables of the network are eliminated).
Stage 3: an elimination order for the variables is determined, according to some heuristic.
Stage 4: the likelihood of the pedigrees given the $\theta$ values is calculated. This is done by performing variable elimination according to the elimination order determined in stage 3.
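The idea of computing a likelihood by variable elimination can be sketched on a toy network. The code below is not SUPERLINK's implementation. It is a minimal sum-product elimination over one mother and one child at a single locus, with a hypothetical allele frequency, a hand-picked elimination order instead of a heuristic one, and the observed child allele entered as an indicator factor. All names are assumptions for illustration.

```python
from itertools import product


def multiply(f, g, domains):
    """Pointwise product of two factors; a factor is (variables, table)."""
    (f_vars, f_table), (g_vars, g_table) = f, g
    out_vars = tuple(dict.fromkeys(f_vars + g_vars))  # ordered union
    table = {}
    for values in product(*(domains[v] for v in out_vars)):
        row = dict(zip(out_vars, values))
        table[values] = (f_table[tuple(row[v] for v in f_vars)]
                         * g_table[tuple(row[v] for v in g_vars)])
    return out_vars, table


def sum_out(var, factor):
    """Marginalize one variable out of a factor."""
    variables, table = factor
    keep = [i for i, v in enumerate(variables) if v != var]
    out = {}
    for values, p in table.items():
        key = tuple(values[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return tuple(variables[i] for i in keep), out


def eliminate(factors, order, domains):
    """Sum out every variable in `order`; return the remaining scalar."""
    factors = list(factors)
    for var in order:
        touching = [f for f in factors if var in f[0]]
        factors = [f for f in factors if var not in f[0]]
        if not touching:
            continue
        combined = touching[0]
        for f in touching[1:]:
            combined = multiply(combined, f, domains)
        factors.append(sum_out(var, combined))
    result = 1.0
    for _, table in factors:
        result *= table[()]
    return result


def toy_likelihood(freq_a=0.3):
    """P(child's maternal allele is 'a') with hypothetical allele frequencies."""
    alleles = ("A", "a")
    freq = {"A": 1.0 - freq_a, "a": freq_a}
    domains = {"Lm": alleles, "Lf": alleles, "S": (0, 1), "C": alleles}
    prior_m = (("Lm",), {(x,): freq[x] for x in alleles})
    prior_f = (("Lf",), {(x,): freq[x] for x in alleles})
    selector = (("S",), {(0,): 0.5, (1,): 0.5})
    transmit = (("Lm", "Lf", "S", "C"),
                {(m, f, s, c): 1.0 if c == (m if s == 0 else f) else 0.0
                 for m, f, s, c in product(alleles, alleles, (0, 1), alleles)})
    evidence = (("C",), {("A",): 0.0, ("a",): 1.0})  # child allele observed as 'a'
    return eliminate([prior_m, prior_f, selector, transmit, evidence],
                     ["S", "Lm", "Lf", "C"], domains)


print(toy_likelihood())
```

The elimination order affects only the size of the intermediate tables, not the result, which is why a good ordering heuristic matters for large pedigrees.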