Gene Coding Potential Analysis for Multiple Genes


Coding potential analysis of 25 genes, recording each gene's start and stop coordinates, coding potential, RBS scores, and predicted function. Calls are based on Glimmer, GeneMark, Blast, HHPred, Phamerator, and Starterator. Predicted functions range from structural proteins, such as the terminase and portal protein, to families of unknown function. Agreement among the tools, together with how often each start is called across related genomes, supports the chosen start positions.
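Gene length follows directly from the start and stop coordinates (SSC) listed in the transcript. The sketch below is a minimal illustration, assuming 1-based inclusive coordinates with the stop codon included in the range (an assumption, not stated in the slides). The helper name `gene_length` is hypothetical.

```python
from typing import Tuple


def gene_length(start: int, stop: int) -> Tuple[int, int]:
    """Return (length in bp, codon count including the stop codon).

    Assumes 1-based inclusive coordinates; works for either strand
    because only the distance between the two coordinates matters.
    """
    length_bp = abs(stop - start) + 1
    if length_bp % 3 != 0:
        # A span that is not a multiple of 3 cannot be a single
        # uninterrupted reading frame.
        raise ValueError(f"{length_bp} bp is not a whole number of codons")
    return length_bp, length_bp // 3


# Coordinates copied from the transcript (Gene 1 and Gene 3).
for name, start, stop in [("Gene 1", 41, 379), ("Gene 3", 1074, 2534)]:
    print(name, gene_length(start, stop))
```

Gene 11 (6987 - 7423) spans 437 bp, which is not a multiple of 3. That is consistent with its frameshift annotation, and the sketch raises an error for such input rather than reporting a codon count.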


Uploaded on Oct 04, 2024



Presentation Transcript


  1. Gene 1 SSC: 41 - 379 CP: Includes all coding potential RBS: Raw -3.990, Z Value 2.053, Final -5.036 SCS: Glimmer and GeneMark agree on start at 41 Gap: NA (first gene) Blast Start: Bri160 agrees with 41 start, 1,1 LO: Not the longest ORF FBlast: Family of unknown function gp1, E-value 0.0e0, 95.54% identity HHPred: top hit Family of unknown function, 41.47% probability, E-value 280, 32% alignment FS: HHpred, Blast, Phamerator agree on family of unknown function Syn: NA ST: Start 3 confirmed at 41 as most frequent start, 65/65 genomes

  2. Gene 2 SSC: 376-1071 CP: Includes all coding potential RBS: Raw SD: -3.452, Z-value: 2.321, Final score: -4.498 SCS: Glimmer and GeneMark agree on a start at 376 Gap: -3 bp Blast Start: PaoPu agrees with 376 start, 1,1 alignment LO: Not the longest ORF FBlast: PaoPu, hypothetical protein, E-value: 0.0E0, 99.57% identity HHPred: No significant match FS: HHpred, Blast, Phamerator agree on NKF Syn: N/A ST: Start 4 at 376 confirmed as most frequent with 84/84 genomes

  3. Gene 3 SSC: 1074-2534 CP: Includes all coding potential RBS: Raw SD: -4.385, Z-value: 1.856, Final Score: -5.987 SCS: Glimmer and Genemark agree on start at 1074 Gap: 3 bp gap Blast Start: PaoPu agrees with 1074 start, 1,1 alignment LO: Longest ORF FBlast: PaoPu, terminase, E-value: 0.0E0, 100% identity HHPred: Top hit: terminase large subunit, probability 100%, E-value: 1.8e-41, 84% alignment FS: Phamerator, Blast, and HHPred all agree with terminase function Syn: Agrees with synteny ST: Start 1 at 1074 confirmed in 86/86 genomes

  4. Gene 4 SSC: 2660-3787 CP: Includes all coding potential RBS: Raw -4.154, Z-value 1.971, Final -5.756 SCS: Glimmer calls start 2660, GeneMark 2729 Gap: 126 bp gap Blast Start: PaoPu agrees with 2660 start, alignment 1,1 LO: Yes FBlast: PaoPu, Portal Protein, E-value 0.0e0, Identity 100% HHPred: Phage Portal Protein, 100% probability to phage HK97, 82% identity, E-value 2.1e-36 FS: Phamerator, Blast, HHpred all agree with Portal Protein as function, PaoPu gp4 Syn: Agrees with synteny ST: PaoPu (start 20) agrees with Glimmer call at 2660 as most common start

  5. Gene 5 SSC: 3784 - 5370 CP: Includes all coding potential RBS: Raw -1.907, Z Value 3.091, Final -3.509 SCS: Glimmer and GeneMark agree on start at 3784 Gap: -3 bp Blast Start: PaoPu agrees with 3784 start, 1,1 LO: Not longest ORF FBlast: PaoPu, E-value 0.0e0, 100% identity, Major Capsid and protease fusion protein HHPred: top hit Major Capsid protein in phage Jellyroll and Spike, 99.9% probability, E-value 2.3e-24, 70% alignment FS: HHpred, Blast, Phamerator agree on Major Capsid and Protease fusion protein Syn: Agrees with synteny ST: Start 2 confirmed at 3784 as most frequent start, 63/65 genomes

  6. Gene 6 SSC: 5374 - 5727 CP: Includes all coding potential RBS: Raw -2.757, Z Value 2.667, Final -4.535 SCS: Glimmer and GeneMark agree on start at 5374 Gap: 3 bp Blast Start: PaoPu agrees with 5374 start, 1,1 LO: Longest ORF FBlast: PaoPu head-to-tail adapter gp6, E-value 0.0e0, 100% identity HHPred: top hit head-to-tail adapter protein in phage HK97, 99.52% probability, E-value 2.5e-13, 87% alignment; second hit 5A21_D, probability 99.31, E-value 4.8e-11 FS: HHpred, Blast, Phamerator agree on head-to-tail adapter protein Syn: Agrees with synteny ST: Start 2 confirmed at 5374 as most frequent start, 65/65 genomes

  7. Gene 7 SSC: 5724-6104 CP: Includes all coding potential SCS: Glimmer and GeneMark agree on start 5724 ST: Start 2 confirmed at 5724 in 65/65 non-draft gene annotations Blast Start: PaoPu agrees with start 5724, 1,1 Gap: 3 bp overlap (-3 bp) LO: Yes, Longest ORF RBS: Raw -5.068, Z Value 1.505, Final -6.591 FBlast: E-value 0.0e0, 100% identity, Tail terminator with PaoPu HHPred: Top hit for Tail terminator gene in Virus 6TE9, 98% probability, E-value 0.0012, 87% alignment; also hit 3FZ2_D, 98.52 probability, E-value 0.000022 Syn: Agrees with synteny FS: HHPred, Phamerator, and Blast agree on Tail terminator protein

  8. Gene 8 SSC: 6143-6577 CP: Includes all coding potential RBS: Raw -5.040, Z Value 1.529, Final -5.797 SCS: Glimmer and GeneMark agree on start at 6143 Gap: 39 bp Blast Start: PaoPu and Scamander agree with 6143, 1,1 LO: Longest ORF FBlast: PaoPu and Scamander, E-value 0.0e0, 100% identity, Major tail protein HHPred: top hit Phage Major tail protein 6TE9_G, 99.3% probability, E-value 5.8e-11, 89.1% alignment FS: HHpred, Blast, Phamerator agree on Major tail protein Syn: Agrees with synteny ST: Start 2 confirmed at 6143 as most frequent start, 86/93 genomes (92.5%)

  9. Gene 9 SSC: 6590-6973 CP: Includes all coding potential RBS: Raw -3.599, Z value 2.247, Final -4.294, not best score SCS: Glimmer and GeneMark agree on start at 6590 Gap: 12 bp Blast Start: PaoPu, alignment 1:1 LO: Longest ORF FBlast: NKF, PaoPu gp9, E-value 0.0, 100% identity HHPred: NKF, 99.4%, HK97 GP10, 6.6e-10, 85% alignment FS: HHPred, Phamerator, and Blast agree Syn: N/A ST: Start 3 confirmed as most frequent at 6590, 86 out of 132 genes called here (65.2%)

  10. Gene 10 SSC: 6987 - 7307 CP: Contains all coding potential RBS: Raw: -3.131 Z: 2.481 Final: -3.826 SCS: Glimmer and GeneMark agree at 6987 Gap: 14 bp Blast Start: Lola20 1:1 alignment, 98.11 similarity LO: Longest ORF FBlast: Tail Assembly Chaperone - Lola20, E-value 0.0E0, 97.17% identity HHPred: Chromosomal Replication Initiator protein, 77.9 probability, E-value 14 FS: Blast, HHPred, Phamerator agree on Tail Assembly Chaperone Syn: Agrees with synteny ST: Start 6 - most called start - 130/130

  11. Gene 11 (-1 frameshift) SSC: 6987 - 7423 CP: Contains all coding potential RBS: Raw: -3.131 Z: 2.481 Final: -3.826 (frameshift) SCS: -1 frameshift. Shift occurs at GGGAAA (glycine/lysine), 7271 bp Gap: N/A Blast Start: Bri160, alignment 1:117 LO: Longest ORF FBlast: Tail Assembly Chaperone - Bri160, E-value 2.0e-10, 100% identity HHPred: Chromosomal Replication Initiator protein, 77.9 probability, E-value 14 FS: Blast and Phamerator agree on Tail Assembly Chaperone Syn: Agrees with synteny ST: Start 6 - most called start - 130/130

  12. Gene 12 SSC: 7541-9646 CP: Includes all coding potential RBS: Raw -2.590, Z Value 2.751, Final -4.386 SCS: Glimmer and GeneMark agree on start at 7541 Gap: 118 bp Blast Start: PaoPu and Dongwon agree with 7541, 1,1 LO: Longest ORF FBlast: PaoPu and Minima, E-value 0.0e0, 100% identity, Tape measure protein HHPred: top hit Tape measure protein 6V8I, 99.82% probability, E-value 2e-12, 50% alignment FS: HHpred, Blast, Phamerator agree on Tape measure protein Syn: Agrees with synteny ST: Start 2 confirmed as most frequent, found in 86 out of 87 genes in pham (98.9%)

  13. Gene 13 SSC: 9643-10602 CP: Contains all coding potential RBS: Raw SD -4.088 Z-value 2.003 Final -4.924 SCS: Glimmer and genemark agree 9643 Gap: 3 overlap Blast Start: Paopu and Minima agree 1,1 LO: Longest ORF FBlast: Paopu and Minima E-value 0.0e0 100% identity Minor tail protein HHPred: Glycoside hydrolase family 9; cellulase, CbhA, Clostridium thermocellum, CBM4, Ig-like, cellulosome, CBM, SUGAR BINDING 98.9% probability E-value 4.2e-7 76.4% alignment FS: NCBI and phamerator agree minor tail protein Syn: agrees with synteny ST: Start 1 confirmed as most frequent, found in 73 of 73 genes in pham 100%

  14. Gene 14 SSC: 10602-12671 CP: contains all coding potential RBS: Raw SD -3.496 Z-value 2.299 Final -5.098 SCS: Glimmer and genemark agree 10602 Gap: 0 Blast Start: PaoPu and Dongwon 1,1 LO: Longest ORF FBlast: PaoPu and Dongwon E-value 0.0e0 100% identity minor tail protein HHPred: Fibronectin (Homo sapiens) 1.754A 99.6% probability E-value 7e-14 78.3% alignment FS: NCBI and phamerator agree minor tail protein Syn: agrees with synteny ST: Start 1 confirmed as most frequent, found in 86 of 86 genes in pham 100%

  15. Gene 15 SSC: 12673-13221 CP: Includes all coding potential RBS: Raw score: -1.559, Z-value: 3.265, final score: -2.394 SCS: Glimmer and Genemark agree with start at 12673 Gap: 2 bp gap Blast Start: Paopu agrees with start at 12673, 1,1 alignment LO: Longest ORF FBlast: PaoPu, minor tail protein, E-value: 0.0E0, 99.45% identity HHPred: Fibronectin, 99.7% probability, E-value: 7.7e-14, 77.8% alignment FS: Blast and Phamerator agree on minor tail protein, HHPred calls the protein Fibronectin Syn: Agrees with synteny ST: Agrees with most annotated start 7, 67 of 90 genes in the pham

  16. Gene 16 SSC: 13258-13545 CP: Includes all coding potential RBS: Raw -2.814 Z-value 2.639 Final -3.589 SCS: Glimmer calls start at 13258, GM at 13234. RBS and Starterator confirm start 13258 best start Gap: 37 Blast Start: Minima and Dongwon agree 13258 1,1 LO: Not the longest ORF FBlast: Minima and Dongwon E-value 2.4e-27 Hypothetical protein HHPred: NKF FS: Phamerator and Blast agree Syn: N/A ST: Start 5 confirmed at 13258 found in 82 of 83 ( 98.8% ) of genes in pham

  17. Gene 17 SSC: 13564-14250 CP: Contains all coding potential RBS: Raw SD -3.142 Z-Value 2.475 Final -3.978 SCS: Glimmer and GeneMark agree 13564 Gap: 19 Blast Start: PaoPu and Minima agree 1,1 LO: Longest ORF FBlast: PaoPu and Minima 100% identity 0.0E0 Endolysin HHPred: Peptidoglycan hydrolase b.84.3.2 probability 98.05 E-value 0.0024 36.2% alignment FS: NCBI and Phamerator agree endolysin Syn: N/A ST: Agrees with most called start 6, 67 of 68 genes in the pham

  18. Gene 18 SSC: 14247-14480 CP: Does not include all coding potential SCS: Glimmer and GeneMark do not call this start RBS: Raw Score: -3.800, Z-Value: 2.147, Final Score: -5.596, best start Gap: 3 bp overlap Blast Start: JoBros 1:1 LO: Not longest ORF FBlast: Hypothetical Protein, Membrane Protein, Microbacterium phage JoBros, E-value 0.0E0, 100% identity HHPred: NKF, Probability: 95.53%, E-Value: 0.41, Percent Alignment: 57/95 = 0.6, 60% FS: Blast, Phamerator, HHPred agree NKF Syn: N/A ST: Most called start

  19. Gene 19 SSC: 14477 - 14701 CP: Includes all coding potential SCS: Glimmer and GeneMark agree RBS: Raw Score: -2.366 Z-value: 2.893 Final Score: -3.366 Gap: 3 bp Blast Start: PaoPu 1:1 LO: Longest ORF FBlast: PaoPu Hypothetical Protein, E-value: 2.2E-29, Identity: 74, Ratio: 1:1 HHPred: Murein Hydrolase Activator, Probability: 98.11%, E-Value: 0.00056, % Alignment: 25.12% FS: Blast, Phamerator agree NKF Syn: N/A ST: Most called start (5), 68 of 70 genes in the pham

  20. Gene 20 SSC: 14770-14982 (reverse) CP: Contains all coding potential RBS: Raw SD -2.791 Z-value 2.650 Final -3.627 SCS: Glimmer and GeneMark agree 14982 Gap: 69 Blast Start: PaoPu and Azizam agree 1,1 LO: Longest ORF FBlast: PaoPu and Azizam 100% identity 1.4e-20 LSR2-like DNA bridging protein HHPred: Protein LSR2 anti-parallel beta sheet, dimer DNA binding protein 1.728A 99.88% probability E-value 1e-21 96.7% alignment FS: Phamerator, HHpred, and NCBI agree LSR2-like bridging protein Syn: N/A ST: Start 50 is not the most annotated start (29.3% of the time; called 80.5% of the time when present), but it is a common start among EE phages

  21. Gene 21 SSC: 14985-15422(reverse) CP: Contains all coding potential RBS: Raw: -2.095 Z: 3.03 Final: -3.166 best score SCS: Glimmer and GeneMark disagree. GeneMark: 15485 Gap: 2 LO: No Blast Start: Minima 1:26 FBlast: helix-turn-helix DNA Binding domain protein, MerR-like, Minima E-value 0.0E0, 100% identity HHPred: DNA Binding Protein Probability: 99.88% e-value: 1e-21 %aligned: 96.72% FS: Blast, HHPred, phamerator Starterator: start 12, not most annotated start

  22. Gene 22 SSC: 15565-15795 (reverse) CP: Contains all coding potential RBS: Raw Score: -6.371 Z-value: 0.875 Final Score: -7.593, not best score SCS: Glimmer and GeneMark agree Gap: 143 Blast Start: Gardevoir Helix-turn-Helix binding protein 1:1 LO: Yes FBlast: Helix-turn-helix DNA Binding Protein Gardevoir - Score 378 E-Value: 1.4E-45 Length: 76 HHPred: Helix-turn-helix Endothelial differentiation-related factor, E-value: 7.1e-9, 76.92% alignment FS: Helix-turn-helix DNA binding domain; Blast, HHPred, Phamerator Syn: N/A ST: Most called start

  23. Gene 23 SSC: 16305 - 16691 CP: Contains all Coding Potential RBS: Raw -4.59 Z score: 1.772 Final: -5.716 SCS: Glimmer and GeneMark disagree (16509) Gap: 739 Blast Start: Dongwon - Aztec 1:1 LO: Longest ORF FBlast: Theoretical Protein E-Value: 0 100%Alignment 128 Identity HHPred: DNA Repair Protein 62.5% probability e-value:10 5.9% alignment FS: No Known Function, blast, hhpred, phamerator agree Syn: N/A ST: Start 3 Most called start

  24. Gene 24 SSC: 16775 - 16993 CP: Includes all coding potential RBS: Raw SD: -5.529, Z-Value: 1.285, Final Score: -6.575 not best score SCS: Glimmer calls start at 16775, GeneMark calls start at 16784 Gap: 84 bp Blast Start: PaoPu, Scamander agrees with start at 16775 1,1 LO: Longest ORF FBlast: PaoPu E-Value: 7.4E-44, 100% identity, helix-turn-helix DNA binding domain MerR-like HHPred: Q-box helicase domain of DEAD-like helicase RecG family proteins 66.41%, E-Value: 22, 14.6% FS: HHPred: Q-box helicase domain of DEAD-like helicase RecG family proteins, Blast: MerR-like helix-turn-helix DNA binding domain protein, Phamerator: Unknown Syn: NA ST: Start 3 confirmed as most frequent, found in 59 out of 62 genes in pham (95.2%)

  25. Gene 25 SSC: 16990-17289 CP: Contains all coding potential RBS: Raw SD -5.276 Z-value 1.411 Final -6.051 SCS: Glimmer and GeneMark agree 16990 Gap: overlap 3 Blast Start: PaoPu and Scamander agree 1,1 LO: Longest ORF FBlast: PaoPu and Scamander 100% identity E-value 0.0e0 HNH endonuclease HHPred: CRISPR-associated endonuclease 2.606A 97.78% probability E-value 0.000035 41% alignment FS: Phamerator, HHpred, and NCBI agree HNH endonuclease Syn: N/A ST: Start 145 is not the most annotated start (20.7% of the time; called 97.7% of the time when present)
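Most Gap values in the transcript equal a gene's start coordinate minus the previous gene's stop coordinate, with negative values marking an overlap. The sketch below illustrates that arithmetic. The convention is inferred from the listed numbers rather than stated by the authors, and a few reported gaps do not match it, so treat this as illustrative. The helper name `gap_to_previous` is hypothetical.

```python
def gap_to_previous(prev_stop: int, start: int) -> int:
    """Gap between consecutive forward-strand genes.

    Assumed convention: start minus previous stop, so a negative
    result means the two genes overlap.
    """
    return start - prev_stop


# (name, start, stop) copied from the transcript for Genes 1-4.
genes = [
    ("Gene 1", 41, 379),
    ("Gene 2", 376, 1071),
    ("Gene 3", 1074, 2534),
    ("Gene 4", 2660, 3787),
]

# Pair each gene with its upstream neighbour and report the gap.
for (_, _, prev_stop), (name, start, _) in zip(genes, genes[1:]):
    print(f"{name}: gap {gap_to_previous(prev_stop, start)} bp")
```

For these four genes the computed gaps (-3, 3, and 126 bp) agree with the Gap fields reported for Genes 2, 3, and 4.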
