Biotechnology/Bioinformatics Discovery!
Introduction to Bioinformatics
Biomedical Version
Credits: Geospiza, Austin Community College, ASM Instructional Library, S. Wefer in American Biology Teacher 65:610, 2003.
This exercise is best done online: start at our program web site www.occc.edu/BBDiscovery, click the Modules link, and then click the Introduction to Bioinformatics link.
Go to this site: http://wiki.bioinformatics.org/Bioinformatics_FAQ
I. What is bioinformatics?
II. More on Bioinformatics
To learn what the notations on the results page of a BLAST search mean, work through the tutorial “Blast for Beginners” at http://www.geospiza.com/outreach/BLAST/index.html. It takes you through the steps and terminology of a BLAST search. Follow the green arrows to complete the 12-slide tutorial.
As you work through this tutorial, you should be able to answer the following questions.
1. What does BLAST stand for?
2. The BLAST site is maintained by what agency?
3. How long is the first sequence that the tutorial pasted in the BLAST database box?
4. How many sequences are in the database for comparison?
5. What organism is the source of the sequence?
6. Note the blue letters (hyperlinks) that are given to the left of the sequence description. In general, what are they used for?
7. What is the definition of the E value? Is a higher or lower E value better? _________ Why?
8. Even though the tutorial searched 4183 bits, how many bits from the query matched the sequence stored in the database? __________
9. In this same tutorial example, what % of the query matched exactly with the database?
10. Using the tutorial page that lists the accession number at the top left of the screen, find out:
(a) the taxonomy of this organism (just list the first 3)
(b) the journal in which this sequence was first published, and the year
(c) the authors of the journal article
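As background for question 9, the percentage that BLAST reports for an alignment can be reproduced by hand: count the aligned positions where the query and the database sequence carry the same base, then divide by the length of the alignment. The sketch below illustrates that arithmetic only. The function name `percent_identity` and the two short aligned strings are hypothetical and are not taken from the tutorial.

```python
def percent_identity(query: str, subject: str) -> float:
    """Percentage of aligned positions where both sequences carry the same base."""
    if len(query) != len(subject):
        raise ValueError("aligned sequences must have the same length")
    if not query:
        raise ValueError("alignment is empty")
    # A gap character ('-') never counts as a match.
    matches = sum(
        1
        for q, s in zip(query.upper(), subject.upper())
        if q == s and q != "-"
    )
    return 100.0 * matches / len(query)


# Hypothetical 10-column alignment with one gap and one mismatch.
aligned_query = "ATGGC-ACCT"
aligned_subject = "ATGGCGACGT"
print(f"{percent_identity(aligned_query, aligned_subject):.1f}% identity")
```

Here 8 of the 10 aligned columns match, so the sketch prints 80.0% identity.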
III. Try your hand at bioinformatics.
Here is a simple exercise in using a national database to identify a DNA sequence. It’s as easy as cutting and pasting! You will be able to identify the human disease from these short DNA sequences.
**HINT: This is easier if you have 2 browser windows open at the same time.**
1. Open this page: http://www.ncbi.nlm.nih.gov. Next, select “BLAST” from the top navigation bar. Then, under the heading “nucleotide”, select the link “Nucleotide BLAST”.
2. You are now ready to “copy” each of the 8 sequences listed below and “paste” it into the BLAST search box to find the database information available. In the database box of this window, leave the “human genomic” and “highly similar sequences” options selected.
3. As you find each of the 8 sequences, fill out the attached worksheet. Scroll down the results page to the best match, then click on the L box (LocusLink) to get the chromosome location of this sequence, and on the O box to go to OMIM for more information about the disease.
1 ATGGCGACCCTGGAAAAGCTGATGAAGGCCTTCGAGTCCCTCAAGT CCTTCCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGC AGCAGCAGCAGCAGCAGCAGCAGCAACAGCCGCC

2 ATGGCGGGTCTGACGGCGGCGGCCCCGCGGCCCGGAGTCCTCCTG CTCCTGCTGTCCATCCTCCACCCCTCTCGGCCTGGAGGGGTCCCTG GGGCCATTCCTGGTGGAGTTCCTGGAGGAGTCT

3 ATGCTCACATTCATGGCCTCTGACAGCGAGGAAGAAGTGTGTGATG AGCGGACGTCCCTAATGTCGGCCGAGAGCCCCACGCCGCGCTCCTG CCAGGAGGGCAGGCAGGGCCCAGAGGATGGAG

4 ATGTTTTATACAGGTGTAGCCTGTAAGAGATGAAGCCTGGTATTTA TAGAAATTGACTTATTTTATTCTCATATTTACATGTGCATAATTTTCC ATATGCCAGAAAAGTTGAATAGTATCAGATTCCAAATCT

5 ATGCGTCGAGGGCGTCTGCTGGAGATCGCCCTGGGATTTACCGTGCT TTTAGCGTCCTACACGAGCCATGGGGCGGACGCCAATTTGGAGGC TGGGAACGTGAAGGAAACCAGAGCCAGTCGGGCC

6 ATGCCGCCCAAAACCCCCCGAAAAACGGCCGCCACCGCCGCCGCTGC CGCCGCGGAACCCCCGGCACCGCCGCCGCCGCCCCCTCCTGAGGAG GACCCAGAGCAAGGACAGCGGCCCGGAGGAC

7 ATGTTGTGCAATATCCATCTACTGTAGTTAAGATATTCAGTAG TTTGTTTTTCATAAGCATGTAATTGATCATATTTCTGCCAAGGATGT GCCTTCAACTTTATAATTATAGTGTTGTAAAATATTTTTGTCTG

8 ATGCCATCTTCCTTGATGTTGGAGGTACCTGCTCTGGCAGATTTCA ACCGGGCTTGGACAGAACTTACCGACTGGCTTTCTCTGCTTGATC AAGTTATAAAATCACAGAGGGTGATGGTGGGTGACCTT
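The worksheet asks for the number of bases in each sequence. Counting by eye is error-prone because the sequences above contain spaces and a leading sequence number, so a small script can help check the count. The following is a minimal sketch, not part of the original exercise: the helper names `clean_sequence` and `base_counts` are hypothetical, and the sample string is just the first 18 bases of sequence 1 with a line break and stray characters added for illustration.

```python
import re


def clean_sequence(raw: str) -> str:
    """Keep only letters, uppercased; drop digits, spaces, pipes and line breaks."""
    return re.sub(r"[^A-Za-z]", "", raw).upper()


def base_counts(seq: str) -> dict:
    """Count how many times each of the four DNA bases occurs."""
    return {base: seq.count(base) for base in "ACGT"}


# First 18 bases of sequence 1, pasted with the kind of clutter copying can add.
raw = "1 ATGGCG ACCCTG\nGAAAAG |"
seq = clean_sequence(raw)
print(seq)               # cleaned sequence
print(len(seq))          # number of bases for the "# bases" column
print(base_counts(seq))  # per-base breakdown
```

To use it on a full sequence, paste the sequence text in place of `raw`; `len(seq)` is then the value for the “# bases” column.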
IV. To Summarize –
1. Now that you have used a computer and the Internet to obtain identities of DNA sequences with relative ease, imagine doing this task – taking a DNA sequence and searching a database of sequences – without a computer. In your own words, describe why bioinformatics is a part of today’s biology.
2. For a description of the role of bioinformatics in the Human Genome Project, go to the DNA Learning Center's DNA Interactive site, www.dnai.org. Click on “Genome”, then on “The Project”. In the menu at the top of that page, click on “Pieces of puzzle”, then on the two puzzle pieces entitled “dealing with the data” and “finding genes”.
To learn more about the Human Genome Project, use our BBDiscovery module located at www.occc.edu/bbdiscovery/documents/Modules/Human%20Genome%20Project%20Module.htm
**Hint:** once you get your search results, find and use the blue “G” box (Gene), located to the right of the E-value under “sequences producing significant alignments”. It leads you to the Entrez Gene page, which gives you the name(s) of that gene, the chromosome number and location on the chromosome, and the MIM number (the number for that gene in the OMIM database). Another approach is to go back to the NCBI Home Page and click on “Genes and Disease” in the menu on the right side; this gives you an overview of the gene and disease with links to more information.
| Sequence | # bases | Human Disease | Name of Gene | Other |
|---|---|---|---|---|
| 1 | | | MIM# | What chromosome? Use the blue G box to get the MIM #, then go to OMIM and search with the MIM #. What is a triplet repeat? How many in this disease? |
| 2 | | | | What chromosome? |
| 3 | | | | What chromosome? |
| 4 | | | | What chromosome? |
| 5 | | | | What chromosome? |
| 6 | | | | What chromosome? |
| 7 | | | | What chromosome? |
| 8 | | | | What chromosome? |
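The worksheet row for sequence 1 asks about triplet repeats, that is, the same three bases repeated back to back. Once you have looked up what a triplet repeat is, a short script can count how many consecutive copies of a given triplet a sequence contains. This is only an illustrative sketch: the function name `longest_triplet_run` and the toy sequence are made up for this example and are not one of the 8 worksheet sequences.

```python
import re


def longest_triplet_run(seq: str, triplet: str) -> int:
    """Largest number of back-to-back copies of `triplet` found in `seq`."""
    if len(triplet) != 3:
        raise ValueError("triplet must be exactly 3 bases long")
    # One or more consecutive copies of the triplet, e.g. GAAGAAGAA.
    pattern = re.compile(f"(?:{re.escape(triplet.upper())})+")
    runs = [len(m.group(0)) // 3 for m in pattern.finditer(seq.upper())]
    return max(runs, default=0)


# Made-up sequence containing a run of 4 GAA copies and a later run of 2.
toy = "ATGGAAGAAGAAGAATTCGAAGAACC"
print(longest_triplet_run(toy, "GAA"))
```

The toy sequence contains two separate GAA runs, and the function reports the longer one (4 copies). The same call can be pointed at a cleaned worksheet sequence and the triplet you identify from OMIM.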
Follow up questions: