Basic Biopython Training for Bioinformatics

Biopython

Introduction

Biopython is a freely available Python package for computational molecular biology. Biopython can:

  • parse BLAST results (standalone and web)
  • run biology-related programs (blastall, clustalw, EMBOSS)
  • handle FASTA-formatted files
  • parse GenBank files
  • parse PubMed and Medline records and work with online resources
  • parse ExPASy, SCOP, Rebase, UniGene and SwissProt data
  • represent and manipulate sequences
  • classify data (k-nearest neighbors, Bayes, SVMs)
  • align sequences
  • interact with BioPerl and BioJava through CORBA
  • store data in SQL databases through BioSQL
  • use neural networks, genetic algorithms and hidden Markov models
  • create PDF files for posters
  • format flat files with random access to entries
  • work with structural biology data (PDB, FSSP)
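As a small illustration of the FASTA handling mentioned above, the sketch below parses two records with Bio.SeqIO. It assumes Biopython is installed; the record names and sequences are made up for illustration, and an in-memory string stands in for a real file.

```python
from io import StringIO

from Bio import SeqIO

# Two made-up FASTA records (hypothetical data, for illustration only).
fasta_text = """>seq1 example record one
ATGCGTACGTTAG
>seq2 example record two
GGGCCCATAT
"""

# SeqIO.parse yields one SeqRecord per FASTA entry; StringIO lets us
# treat the string like an open file handle.
records = list(SeqIO.parse(StringIO(fasta_text), "fasta"))

for record in records:
    # record.id is the first word of the header line,
    # record.seq is the sequence itself.
    print(record.id, len(record.seq), record.seq)
```

With a real file, the same call would take a file name or an open handle in place of the StringIO object.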

Course Objectives

  • Sequence manipulation using Biopython
  • Annotating sequences
  • Sequence alignments
  • BLAST
  • Accessing NCBI databases

Duration

7 hours (one-day course)

Mode of Delivery

Classroom-based, Instructor-led Training

Course Outline

  1. Introduction to Biopython
    1. What is Biopython?
    2. Biopython packages
    3. Installing Biopython
    4. Biopython website & resources
  2. Working with Sequences
    1. Parsing
    2. Slicing
    3. Adding
    4. Concatenating
    5. Reverse complementing
  3. Annotating Sequences
    1. FASTA record
    2. GenBank record
    3. Chromosomal Location
    4. Sequence type
  4. Working with Sequence files
    1. Parsing a file
    2. Reading from a file
    3. Writing to file
    4. Converting file formats
  5. Sequence Alignment
    1. Parsing an alignment file
    2. Reading an alignment file
    3. Writing alignments to file
    4. Converting file formats
    5. Manipulating alignments
  6. BLAST
    1. What is BLAST?
    2. Running BLAST
    3. Parsing BLAST output
    4. Searching within BLAST output
  7. Accessing Entrez Databases at NCBI
    1. Connect to Entrez
    2. List accessible Entrez Databases
    3. Search Entrez Databases
    4. Upload identifiers for searching
    5. Return search results
    6. Parsing results
  8. Simple Plotting
    1. Plot %GC
    2. Plot sequence similarity (nucleotide dot plot)
    3. Plot quality scores of sequencing reads
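To give a flavor of the sequence operations in topic 2 of the outline (slicing, concatenating, reverse complementing) and the %GC calculation behind the first plotting exercise, here is a minimal sketch. It assumes Biopython is installed; the sequence is an arbitrary example, and the GC percentage is computed by hand rather than with a library helper so that the snippet does not depend on a particular Biopython version.

```python
from Bio.Seq import Seq

# Arbitrary example sequence (not taken from the course material).
dna = Seq("ATGGCCATTGTAATGGGCCGC")

# Slicing works like Python strings and returns a new Seq.
first_codon = dna[:3]
last_codon = dna[-3:]

# Concatenation with "+" joins two Seq objects.
joined = first_codon + last_codon

# Reverse complement of the whole sequence.
rev_comp = dna.reverse_complement()

# Percentage of G and C bases, computed manually.
gc_percent = 100 * (dna.count("G") + dna.count("C")) / len(dna)

print("First codon:       ", first_codon)
print("Joined codons:     ", joined)
print("Reverse complement:", rev_comp)
print(f"GC content:         {gc_percent:.1f}%")
```

Applying the same GC calculation to successive windows of a longer sequence would produce the values needed for a %GC plot.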