Genome Assembly

Sakshi Chavan
Contents
Introduction
Types
Differences
Factors affecting genome assembly results
Application
References
Introduction
In bioinformatics, genome assembly is the process of putting a large number of short DNA sequences back together to recreate the original chromosomes from which the DNA originated.
Sequence assembly is one of the basic steps after performing next-generation sequencing, PacBio SMRT sequencing, or Nanopore sequencing.
A finished genome assembly can be submitted to databases such as the European Nucleotide Archive, NCBI Assembly, and Ensembl Genomes. You can also browse these databases for genome sequences assembled by other researchers.
Types of Genome Assembly
De novo genome assembly is a strategy that assembles a novel genome from scratch, without the aid of reference genomic data. De novo genome assemblies assume no prior knowledge of the length, layout, or composition of the source DNA sequence.
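To make the "from scratch" idea concrete, here is a minimal sketch of a greedy overlap-merge assembler. It is a toy illustration, not how any production assembler works: the function names, the `min_len` threshold, and the example reads are all assumptions made for this sketch, and it ignores sequencing errors, reverse complements, repeats, and reads contained inside other reads.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if the longest such overlap is shorter than `min_len`.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len: int = 3):
    """Repeatedly merge the pair of sequences with the longest overlap.

    Stops when no pair overlaps by at least `min_len` bases and returns
    the remaining contigs.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            break  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


# Hypothetical reads sampled from the made-up sequence ATGGCGTACCTGA
reads = ["ATGGCGT", "GCGTACC", "TACCTGA"]
print(greedy_assemble(reads))  # ['ATGGCGTACCTGA']
```

No reference sequence appears anywhere in the sketch: the only information used is the overlap between reads, which is what distinguishes this approach from the reference-based one.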
Reference-based genome assembly maps reads to a reference genome by identifying reads whose nucleotide sequences are similar to the reference. A reference genome is a digital nucleic acid sequence database, assembled by scientists as a representative example of the set of genes in one idealized individual organism of a species.
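The mapping step can be sketched as a seed-and-verify lookup: index every k-mer of the reference, use exact k-mer matches from the read as candidate positions, then count mismatches at each candidate. This is a simplified illustration under assumed names and parameters (`build_index`, `map_read`, `k`, `max_mismatches`, and the example sequences are all hypothetical); real read mappers also handle insertions, deletions, and reverse-complement reads.

```python
from collections import defaultdict


def build_index(reference: str, k: int):
    """Map every k-mer in the reference to the list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index


def map_read(read: str, reference: str, index, k: int, max_mismatches: int = 1):
    """Return (start, mismatches) for the best ungapped placement, or None."""
    best = None
    seen = set()
    for offset in range(len(read) - k + 1):
        for hit in index.get(read[offset:offset + k], []):
            start = hit - offset  # where the read would begin on the reference
            if start < 0 or start + len(read) > len(reference) or start in seen:
                continue
            seen.add(start)
            window = reference[start:start + len(read)]
            mismatches = sum(1 for x, y in zip(read, window) if x != y)
            if mismatches <= max_mismatches and (best is None or mismatches < best[1]):
                best = (start, mismatches)
    return best


# Made-up reference and reads, for illustration only
reference = "ACGTTGCATGCCGATAGGCTA"
index = build_index(reference, k=4)
print(map_read("TTGCATGC", reference, index, k=4))  # (3, 0): exact match
print(map_read("GCATGACG", reference, index, k=4))  # (5, 1): one mismatch
print(map_read("AAAAAAAA", reference, index, k=4))  # None: no seed found
```

A read that differs from the reference at one position still maps, and the mismatch itself is the kind of signal later used for variant detection.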
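Difference Between Types
De novo assembly
Advantages: does not rely on a reference genome; can be used to search for unknown genes/transcripts; good for structural variations.
Disadvantages: requires very high-quality raw data; slow, and requires substantial infrastructure.
Reference-based alignment
Advantages: good for SNVs and small indels; works for deletions and duplications by using coverage information; a quick method to assemble the genome; hides raw data limitations; more tools are available to work with the results; easier annotation and comparison.
Disadvantages: requires a reference genome; limited by read length for feature detection.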
Factors Affecting Genome Assembly Results
Properties of the genome
Genome size. The larger the genome, the more data is needed. Before ordering sequence data, you therefore need to estimate the genome size, which may be inferred from the genome sizes of closely related species.
Repeats. The amount and distribution of repeated sequences in a genome strongly influence the assembly results. Repeats can lead to misassemblies and to incorrect estimates of the size of the repeats.
Ploidy level. If possible, it is better to sequence haploid tissue, which avoids problems caused by heterozygosity.
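The link between genome size and the amount of data to order can be made explicit with the standard coverage relation: coverage = (number of reads × read length) / genome size. The sketch below rearranges it to estimate how much sequence to order. The 5 Mb genome, 50x target coverage, and 150 bp read length are illustrative assumptions, not recommendations from the slides, and the function names are hypothetical.

```python
def required_bases(genome_size: int, coverage: int) -> int:
    """Total bases to sequence for a target average coverage."""
    return genome_size * coverage


def required_reads(genome_size: int, coverage: int, read_length: int) -> int:
    """Number of reads of a fixed length needed for the target coverage (rounded up)."""
    bases = required_bases(genome_size, coverage)
    return -(-bases // read_length)  # ceiling division on integers


# Assumed example: genome size estimated from a related species
genome_size = 5_000_000   # 5 Mb (assumption)
coverage = 50             # 50x target (assumption)
read_length = 150         # bp per read (assumption)

print(required_bases(genome_size, coverage))               # 250000000
print(required_reads(genome_size, coverage, read_length))  # 1666667
```

Because the required data scales linearly with genome size, an estimate that is off by a factor of two leaves the assembly with half the intended coverage.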
Nucleic Acid Extraction
For DNA isolation or RNA isolation, there are several things to be aware of: DNA/RNA integrity, DNA/RNA purification, a sufficient amount of DNA/RNA, etc. Compared with resequencing, de novo sequencing requires higher-quality nucleic acid.
The most important nucleic acid quality parameters for NGS are chemical purity and structural integrity.
Raw data processing
Although some assembly tools prefer to work with the raw data, including potential adapter sequences, we highly recommend that researchers study the manual to determine whether the program requires quality-trimmed data.
If data trimming is required, remove poor-quality data by trimming low-quality read ends and filtering out low-quality reads. Multiple tools are available for this purpose, such as PRINSEQ and Trimmomatic.
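The two operations named above, trimming low-quality read ends and filtering out low-quality reads, can be sketched as follows. This is an illustration of the idea only and is not the algorithm used by PRINSEQ or Trimmomatic. It assumes Phred+33 quality encoding, and the thresholds `min_q` and `min_len` as well as the function names are hypothetical.

```python
def phred_scores(quality: str, offset: int = 33):
    """Decode a FASTQ quality string into integer Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality]


def trim_low_quality_ends(seq: str, quality: str, min_q: int = 20):
    """Remove bases from both ends while their quality is below `min_q`."""
    scores = phred_scores(quality)
    start, end = 0, len(seq)
    while start < end and scores[start] < min_q:
        start += 1
    while end > start and scores[end - 1] < min_q:
        end -= 1
    return seq[start:end], quality[start:end]


def filter_reads(records, min_q: int = 20, min_len: int = 5):
    """Trim each (sequence, quality) pair and drop reads that end up too short."""
    kept = []
    for seq, qual in records:
        trimmed_seq, trimmed_qual = trim_low_quality_ends(seq, qual, min_q)
        if len(trimmed_seq) >= min_len:
            kept.append((trimmed_seq, trimmed_qual))
    return kept


# Made-up reads: '#' encodes Phred 2 and 'I' encodes Phred 40 in Phred+33
records = [
    ("ACGTACGTAC", "##IIIIII##"),  # low-quality ends get trimmed
    ("ACGT", "####"),              # entirely low quality, gets dropped
]
print(filter_reads(records))  # [('GTACGT', 'IIIIII')]
```

Whether to run a step like this at all depends on the assembler, which is why checking the tool's manual comes first.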
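Application
Whole genome sequencing. Applied to understanding disease development at the molecular level.
SNP detection. A SNP is a single-nucleotide variation that differs between members of a species.
Transcriptome assembly. Transcriptome reads can be used to distinguish gene expression levels.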
References
1. Wajid B, Serpedin E. Do it yourself guide to genome assembly. Briefings in Functional Genomics, 2014, 15(1): 1-9.
2. Victoria D D A, Erik H, Lieven S, et al. Ten steps to get started in Genome Assembly and Annotation. F1000Research, 2018, 7.
Thank you

Genome assembly is a technique for assembling short DNA fragments into an organized format.

  • Bioinformatics
  • genomics
  • genomedataanalysis

Uploaded on Jul 25, 2024 | 0 Views


Presentation Transcript


  1. Sakshi Chavan Genome Assembly

  2. Contents: Introduction, Types, Differences, Factors affecting genome assembly results, Application, References

  3. Introduction. In bioinformatics, genome assembly is the process of putting a large number of short DNA sequences back together to recreate the original chromosomes from which the DNA originated. Sequence assembly is one of the basic steps after performing next-generation sequencing, PacBio SMRT sequencing, or Nanopore sequencing. A finished genome assembly can be submitted to databases such as the European Nucleotide Archive, NCBI Assembly, and Ensembl Genomes. You can also browse these databases for genome sequences assembled by other researchers.

  4. Types of Genome Assembly

  5. Types of Genome Assembly. De novo genome assembly is a strategy that assembles a novel genome from scratch, without the aid of reference genomic data. De novo genome assemblies assume no prior knowledge of the length, layout, or composition of the source DNA sequence. Reference-based genome assembly maps reads to a reference genome by identifying reads whose nucleotide sequences are similar to the reference. A reference genome is a digital nucleic acid sequence database, assembled by scientists as a representative example of the set of genes in one idealized individual organism of a species.

  6. Difference Between Types

  7. Differences
     De novo assembly
     Advantages: does not rely on a reference genome; can be used to search for unknown genes/transcripts; good for structural variations.
     Disadvantages: requires very high-quality raw data; slow, and requires substantial infrastructure.
     Reference-based alignment
     Advantages: good for SNVs and small indels; works for deletions and duplications by using coverage information; a quick method to assemble the genome; hides raw data limitations; more tools are available to work with the results; easier annotation and comparison.
     Disadvantages: requires a reference genome; limited by read length for feature detection.

  8. Factors Affecting Genome Assembly Results

  9. Factors Affecting Genome Assembly Results: Properties of the genome. Genome size. The larger the genome, the more data is needed. Before ordering sequence data, you therefore need to estimate the genome size, which may be inferred from the genome sizes of closely related species. Repeats. The amount and distribution of repeated sequences in a genome strongly influence the assembly results. Repeats can lead to misassemblies and to incorrect estimates of the size of the repeats. Ploidy level. If possible, it is better to sequence haploid tissue, which avoids problems caused by heterozygosity.

  10. Factors Affecting Genome Assembly Results: Nucleic Acid Extraction. For DNA isolation or RNA isolation, there are several things to be aware of: DNA/RNA integrity, DNA/RNA purification, a sufficient amount of DNA/RNA, etc. Compared with resequencing, de novo sequencing requires higher-quality nucleic acid. The most important nucleic acid quality parameters for NGS are chemical purity and structural integrity.

  11. Factors Affecting Genome Assembly Results: Raw data processing. Although some assembly tools prefer to work with the raw data, including potential adapter sequences, we highly recommend that researchers study the manual to determine whether the program requires quality-trimmed data. If data trimming is required, remove poor-quality data by trimming low-quality read ends and filtering out low-quality reads. Multiple tools are available for this purpose, such as PRINSEQ and Trimmomatic.

  12. Application
     Whole genome sequencing. Applied to understanding disease development at the molecular level.
     SNP detection. A SNP is a single-nucleotide variation that differs between members of a species.
     Transcriptome assembly. Transcriptome reads can be used to distinguish gene expression levels.

  13. References

  14. References
     1. Wajid B, Serpedin E. Do it yourself guide to genome assembly. Briefings in Functional Genomics, 2014, 15(1): 1-9.
     2. Victoria D D A, Erik H, Lieven S, et al. Ten steps to get started in Genome Assembly and Annotation. F1000Research, 2018, 7.

  15. Thank you
