You are required to work on this lab in teams of two. You must complete this lab with your teammate only, although you may discuss lab concepts with other students. Please keep the Academic Integrity Policy in mind: do not show your code to anyone outside of the professors and tutors, and do not look at anyone else's code for this lab. If you need help, please post a question on Piazza, or contact the professor.
In this project, you will practice computing with strings, if statements and loops. You will also use comments to explain the structure of your program. Save your code in a file whose name consists of the first letter of your first name followed by your last name and Prg4. For example, if I submitted an assignment, I would name the file mgagnePrg4.py.
The video I created for this assignment can be seen here, and you can see the notes I wrote during that video here.
DNA is composed of 4 nucleotides: A, C, T and G, and the A always binds with the T, the C always binds with the G. A DNA sequence forms a palindromic sequence (they are real, I swear, they are thought to play a role in gene activation) if its first half can bind with its second half. For example, "ACCTAGGT" is a palindromic sequence.
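To make the binding rule concrete, here is a small sketch that walks through the example "ACCTAGGT". It assumes, as the examples in this handout suggest, that the first base is matched with the last one, the second with the second-to-last, and so on. The names BINDS_WITH and pairChecks are only illustrative.

```python
# Which base each nucleotide binds with, as stated above.
BINDS_WITH = {"A": "T", "T": "A", "C": "G", "G": "C"}

sequence = "ACCTAGGT"
pairChecks = []  # one (front, back, canBind) entry per pair of bases

# Assumption: base i counted from the start is matched with base i counted from the end.
for i in range(len(sequence) // 2):
    front = sequence[i]
    back = sequence[len(sequence) - 1 - i]
    canBind = BINDS_WITH[front] == back
    pairChecks.append((front, back, canBind))
    print(front, "with", back, "->", canBind)
```

Every one of the four pairs (A with T, C with G, C with G, T with A) can bind, which is why "ACCTAGGT" counts as a palindromic sequence.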
Geno-Thingy is a biotech company that analyses DNA. They would like a program that will tell them if a sequence of DNA is a palindromic sequence.
The program should ask the user for a string containing a DNA sequence.
Unfortunately, Geno-Thingy's equipment is a little faulty so there could be non-DNA characters in the input.
For example, a possible input could be "AGb TTA-GC*C ACAT". So one of your first tasks will be to 'clean up' this input by removing all the characters that are not 'A', 'C', 'T' or 'G' (lower case variants of these letters should also be removed).
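One possible way to do this clean-up is sketched below, using a loop that builds the result one character at a time. The variable names are my own choices for illustration, and this is only one of several reasonable approaches.

```python
def cleanSequence(noisySequence):
    """Return only the upper-case A, C, T and G characters of noisySequence."""
    cleaned = ""  # accumulator: starts empty and grows one valid base at a time
    for character in noisySequence:
        if character in "ACTG":  # lower-case letters and symbols fail this test
            cleaned = cleaned + character
    return cleaned


print(cleanSequence("AGb TTA-GC*C ACAT"))  # prints AGTTAGCCACAT
```

Note that the lower-case 'b', the spaces, the '-' and the '*' are all dropped, leaving the twelve valid bases.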
The program should print one of the following. Either:
The sequence <cleaned-up sequence> is a palindromic sequence.
or:
The sequence <cleaned-up sequence> is not a palindromic sequence.
For example:
The sequence AGTTAGCCACAT is not a palindromic sequence.
The sequence ACTTGAAGT is a palindromic sequence.
The sequence is a palindromic sequence.
The sequence A is a palindromic sequence.
The sequence ATAT is a palindromic sequence.
This whole task may seem quite difficult, so I will guide you by giving you a number of functions you should implement. Once you have them, putting things together to solve the problem should not be too hard. Your program should contain the following functions:
cleanSequence
should take one input parameter, a string representing a noisy DNA sequence, and should return a string containing a cleaned-up DNA sequence, one that contains only the upper-case characters 'A', 'C', 'T' or 'G' (lower-case versions of these characters should also be removed). You should build the new sequence using a loop with an accumulator.
removeMiddle
should take one input parameter, a string representing a DNA sequence, and return a string containing an even number of characters, having removed the middle character if the sequence in the input parameter contained an odd number of characters.
reverseSequence
should take one input parameter, a string representing a DNA sequence, and return a string that is the reverse of the string received as input. To do this, you are required to use a loop with an accumulator; you are not allowed to use any other methods.
complementBases
should take one input parameter, a string representing a DNA sequence, and return a string such that each character is the complement of the original (that is, the nucleotide it could bind to). This will also be done with a loop and an accumulator.
isPalindromicSequence
should take one input parameter, a string representing a clean DNA sequence, and return True if the sequence is a palindromic sequence, and False otherwise. Use the strategy outlined in the video to do this.
main
should take no input parameter, ask the user to enter a (possibly noisy) DNA sequence, clean it and test if it is a palindromic sequence. The function should print one of the two sentences described in the Output section depending on the result.
Like the previous assignment, your submitted assignment should start with a header containing the name of the author(s), filename, description of the program, input and output.
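To give a feel for how the helper functions fit together, here is a rough sketch of four of them. The strategy used in isPalindromicSequence is an assumption on my part (compare the first half with the reversed complement of the second half); the video's strategy is not reproduced here and may differ in its details. The local variable names are illustrative only.

```python
def removeMiddle(sequence):
    """Return sequence without its middle character if its length is odd."""
    middle = len(sequence) // 2
    if len(sequence) % 2 == 1:
        return sequence[:middle] + sequence[middle + 1:]
    return sequence


def reverseSequence(sequence):
    """Return the reverse of sequence, built with a loop and an accumulator."""
    reversedSequence = ""
    for base in sequence:
        reversedSequence = base + reversedSequence  # prepend, so the order flips
    return reversedSequence


def complementBases(sequence):
    """Return the bases that each base of sequence binds with."""
    complement = ""
    for base in sequence:
        if base == "A":
            complement = complement + "T"
        elif base == "T":
            complement = complement + "A"
        elif base == "C":
            complement = complement + "G"
        elif base == "G":
            complement = complement + "C"
    return complement


def isPalindromicSequence(sequence):
    """Assumed strategy: first half == complement of the reversed second half."""
    evenSequence = removeMiddle(sequence)
    half = len(evenSequence) // 2
    firstHalf = evenSequence[:half]
    secondHalf = evenSequence[half:]
    return firstHalf == complementBases(reverseSequence(secondHalf))
```

With these definitions, isPalindromicSequence("ACTTGAAGT") returns True (the middle G is dropped first), while isPalindromicSequence("AGTTAGCCACAT") returns False, matching the examples given earlier.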
I recommend that you sketch a plan of the tests you intend to use before you start writing code; otherwise this could get messy. Five minutes of planning could save you an hour of debugging! You can also test your program by printing the result at the end of each step to see if each step is doing what you expected it to do (just remember to remove those prints before submitting the program).
Test your program carefully. You should try to have a test case that explores every possible "path" in your code, and think of special inputs that might cause problems.
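One way to organize such a test plan is to write the cases down as data and check them all in one loop. The sketch below uses the example sequences from this handout; the names TEST_CASES, findFailures and alwaysTrue are hypothetical, and alwaysTrue is a deliberately wrong stand-in so you can see what a reported failure looks like. You would pass your own isPalindromicSequence instead.

```python
# (cleaned sequence, expected answer) pairs taken from the examples in this handout.
TEST_CASES = [
    ("AGTTAGCCACAT", False),
    ("ACTTGAAGT", True),  # odd length: the middle base is ignored
    ("", True),           # empty sequence
    ("A", True),          # single base
    ("ATAT", True),
]


def findFailures(functionUnderTest, cases):
    """Return the inputs for which functionUnderTest gives the wrong answer."""
    failures = []
    for sequence, expected in cases:
        if functionUnderTest(sequence) != expected:
            failures.append(sequence)
    return failures


def alwaysTrue(sequence):
    """Deliberately wrong stand-in, used only to show a failure being reported."""
    return True


print(findFailures(alwaysTrue, TEST_CASES))  # prints ['AGTTAGCCACAT']
```

An empty list means every case passed. It is worth adding your own cases too, such as noisy inputs for cleanSequence.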
The program is getting a little large now, so picking meaningful names for your variables will make the program that much easier to read and make all the information easier to access.
You can write more comments in your code to explain what is going on if you feel it is appropriate. It's never harmful to explain your code to the next person reading it (or grading it).