On August 31, the course "Finding Hidden Messages in DNA (Bioinformatics I)" begins on Coursera (University of California, San Diego).
Finding Hidden Messages in DNA (Bioinformatics I)
This course begins a series of classes illustrating the power of computing in modern biology. Please join us on the frontier of bioinformatics to look for hidden messages in DNA without ever needing to put on a lab coat. After warming up our algorithmic muscles, we will learn how to apply popular bioinformatics software tools to real experimental datasets.
About the Course
A genome may look like an incomprehensible string of the letters A, C, G, and T. Yet hidden in the three billion nucleotides of your genome is a secret language. This course offers an introduction to how we can start to understand this language by using algorithms to find hidden messages in DNA.
What do these hidden messages say? In the first chapter of the course, hidden DNA messages indicate where a bacterium starts replicating its genome, a problem with applications in genetic engineering and beyond. In the second chapter, hidden DNA messages tell us how organisms know whether it is day or night as well as how the bacterium causing tuberculosis is able to hide from antibiotics. We will see how randomized algorithms, which toss coins and roll dice, can be used to find these messages.
Each of the two chapters builds the algorithmic knowledge you will need for the Bioinformatics Application Challenge. By the end of the course, you will be able to apply popular bioinformatics software tools to find regulatory motifs in experimental datasets and to solve the challenge.
The series to which this class belongs covers exactly the same material as the core bioinformatics class in the "Bioinformatics and Systems Biology" program at the University of California at San Diego, one of the top bioinformatics programs in the world. In other words, you will have exactly the same lectures and homework assignments as students at UCSD. Moreover, many leading universities have adopted this material for their offline classes. Our goal is to give you the same high-quality materials that those students use.
Course Syllabus
Where in the Genome Does Replication Begin? (Algorithmic Warmup)
- Introduction to DNA replication
- Hidden messages in the replication origin
- Some hidden messages are more surprising than others
- An explosion of hidden messages
- The simplest way to replicate DNA
- Asymmetry of replication
- Peculiar statistics of the forward and reverse half-strands
- Some hidden messages are more elusive than others
- A final attempt at finding DnaA boxes in E. coli
- Epilogue: Complications in oriC predictions
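As a taste of the kind of warm-up this chapter suggests, here is a minimal sketch of one standard statistic for comparing the two half-strands: GC skew, the running count of G minus C along the genome. That the course uses exactly this formulation is an assumption, and the function name `min_skew_positions` is made up for illustration. The positions where the running count reaches its minimum are commonly taken as candidate locations for the replication origin.

```python
def min_skew_positions(genome):
    """Return every prefix length at which the running (G count - C count) is minimal.

    Position i refers to the skew after reading the first i nucleotides,
    so position 0 is the empty prefix with skew 0.
    """
    skew = 0
    best = 0
    positions = [0]
    for i, base in enumerate(genome, start=1):
        if base == "G":
            skew += 1
        elif base == "C":
            skew -= 1
        # A and T leave the skew unchanged.
        if skew < best:
            best = skew
            positions = [i]
        elif skew == best:
            positions.append(i)
    return positions


if __name__ == "__main__":
    toy = "TAAAGACTGCCGAGAGGCCAACACGAGTGCTAGAACGAGGGGCGTAAACGCGGGTCCGTCT"
    print(min_skew_positions(toy))
```

On a real bacterial genome the same loop runs in a single pass, which is why it works well as a first exercise before any heavier algorithm.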
Which DNA Patterns Play the Role of Molecular Clocks? (Randomized Algorithms)
- Do we have a "clock" gene?
- Motif finding is more difficult than you think
- Scoring motifs
- From motif finding to finding a median string
- Greedy motif search
- Motif finding meets Oliver Cromwell
- Randomized motif search
- How can a randomized algorithm perform so well?
- Gibbs sampling
- Gibbs sampling in action
- Complications in motif finding
- Epilogue: How does tuberculosis hibernate to hide from antibiotics?
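To make "algorithms that toss coins and roll dice" concrete, the following is a rough sketch of a randomized motif search in its common textbook form, not necessarily the exact version taught in the course. It assumes the input strings contain only A, C, G, and T. All function names, the use of pseudocounts, and the restart count are choices made for this illustration.

```python
import random


def score(motifs):
    """Count the letters that disagree with the most common letter in each column."""
    total = 0
    for j in range(len(motifs[0])):
        column = [m[j] for m in motifs]
        total += len(column) - max(column.count(b) for b in "ACGT")
    return total


def profile_with_pseudocounts(motifs):
    """Per-column nucleotide frequencies, adding 1 to every count to avoid zeros."""
    total = len(motifs) + 4
    return [
        {b: (sum(m[j] == b for m in motifs) + 1) / total for b in "ACGT"}
        for j in range(len(motifs[0]))
    ]


def most_probable_kmer(text, k, profile):
    """Return the k-mer in text with the highest probability under the profile."""
    best_kmer, best_p = text[:k], -1.0
    for i in range(len(text) - k + 1):
        kmer = text[i:i + k]
        p = 1.0
        for j, b in enumerate(kmer):
            p *= profile[j][b]
        if p > best_p:
            best_kmer, best_p = kmer, p
    return best_kmer


def randomized_motif_search(dna, k, rng):
    """One run: start from random k-mers, then refine until the score stops improving."""
    motifs = []
    for text in dna:
        start = rng.randrange(len(text) - k + 1)
        motifs.append(text[start:start + k])
    best = motifs
    while True:
        profile = profile_with_pseudocounts(motifs)
        motifs = [most_probable_kmer(text, k, profile) for text in dna]
        if score(motifs) < score(best):
            best = motifs
        else:
            return best


def best_of_restarts(dna, k, restarts=200, seed=0):
    """Repeat the randomized run and keep the lowest-scoring motif set."""
    rng = random.Random(seed)
    best = randomized_motif_search(dna, k, rng)
    for _ in range(restarts - 1):
        candidate = randomized_motif_search(dna, k, rng)
        if score(candidate) < score(best):
            best = candidate
    return best
```

A single run often ends in a poor local optimum, so the sketch relies on many restarts. Calling `best_of_restarts(dna, 8)` on a list of DNA strings returns one 8-mer per string.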
Bioinformatics Application Challenge: Searching for regulatory motifs in Mycobacterium tuberculosis
Recommended Background
If you are aiming to earn a standard certificate in this class, then you do not need any experience in biology or programming. The only prerequisite is the enthusiasm to learn how computational approaches are used in modern biology :)
If you are aiming to earn a certificate with distinction, then you should either know the basics of programming in the language of your choice (there is no required language for this course) or be willing to learn programming before the course begins. In that case, we suggest the following resources to help you learn programming.
- The language tracks on Codecademy, particularly the Python track.
- An Introduction to Interactive Programming with Python, the acclaimed Coursera course.
- Introductory problems on Rosalind, a resource for learning bioinformatics created by the course instructors.
Suggested Readings
The printed course companion is Bioinformatics Algorithms: An Active Learning Approach, by Compeau & Pevzner.
Course Format
This course covers two chapters from Bioinformatics Algorithms: An Active Learning Approach, by Compeau & Pevzner. A PDF e-book covering these two chapters can be downloaded from Leanpub. The course also contains summary quizzes and lecture videos.
To earn a standard certificate in the class, you must complete weekly quizzes in addition to a Bioinformatics Application Challenge in which you apply popular bioinformatics software tools to a real experimental dataset. If you are eager to learn about bioinformatics, you should be able to complete the Application Challenge and earn the course certificate even if you do not know how to program.
To earn a certificate with distinction, you must complete programming assignments found in the course's interactive text instead of the Application Challenge. The distinction track is a "hacker track" aimed at learners who know how to program and would like to explore the nuts and bolts of bioinformatics algorithms.