Problem Set 8: Overlap Graphs
Assigned: Monday April 18, 2016
Due: Monday April 25, 2016
Points: 9 Points (up to 12 Points with extra credit)
This is an individual problem set. You may consult with friends,
but each of you must author your code independently. The problem
set has a required part and an optional extra-credit part.
Part 1: (Required, 9 Points): Overlap Graphs
Rosalind is a terrific website with problems in bioinformatics. The
website is named after Dr. Rosalind Franklin, whose X-ray diffraction
work was central to the discovery of the helical structure of DNA.
Solve the Overlap Graphs problem. Your solution should be a
self-contained OCaml program in a single file named overlap.ml.
When compiled from the Unix shell with:
> ocamlc -o overlap overlap.ml
and then run from the shell as in:
> ./overlap a.fas 3
where a.fas is the name of a FASTA file of the form shown on the
Overlap Graphs page and 3 is the overlap length k, the program should
create a new file a.graph containing a representation of the overlap
graph as specified on the Rosalind page.
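One way to produce the output file is sketched below. The names graphName and writeLines are hypothetical helpers, not part of the assignment, and the sketch assumes the graph is written as one edge per line; check the Rosalind page for the exact format. Note that Filename.chop_extension raises Invalid_argument if the input name has no extension.

```ocaml
(* graphName : string -> string
 *
 * Hypothetical helper: maps an input name such as "a.fas" to the
 * output name "a.graph" by swapping the file extension.
 *)
let graphName fastaName =
  Filename.chop_extension fastaName ^ ".graph"

(* writeLines : string -> string list -> unit
 *
 * Hypothetical helper: writes each string in lines to filename,
 * one string per line, then closes the file.
 *)
let writeLines filename lines =
  let ouch = open_out filename in
  List.iter (fun line -> output_string ouch (line ^ "\n")) lines;
  close_out ouch
```

With these, a call such as (writeLines (graphName "a.fas") edges) would write a list of already-formatted edge strings to a.graph.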
Notes
- There is no harness code for this problem set. Use
SublimeText to create sample FASTA files as well as the
source file overlap.ml. Feel free to use a
variation of the following snippet of code for reading the
input text from the FASTA file.
(* readLines : string -> string list
 *
 * The call (readLines filename) returns a list of strings with
 * one list-entry for each line in filename, in file order.
 *)
let readLines filename =
  let inch = open_in filename in
  (* lines accumulates in reverse, so it is reversed at end of file. *)
  let rec repeat lines =
    match input_line inch with
    | line -> repeat (line :: lines)
    | exception End_of_file ->
      close_in inch;
      List.rev lines
  in
  repeat []
- Remember that the Unix command-line arguments to your program can be
found in the string array Sys.argv. The entry Sys.argv.(0) is the
program name, so the arguments proper start at index 1.
- You'll want to make use of several of the functions in OCaml's
String module.
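The last two notes can be sketched as follows. All of the names here (parseArgs, prefix, suffix, isHeader) are hypothetical helpers chosen for illustration, and the sketch assumes the program is invoked exactly as in ./overlap a.fas 3. Note that String.sub raises Invalid_argument when k is larger than the length of the string.

```ocaml
(* parseArgs : string array -> string * int
 *
 * Hypothetical helper: expects an array shaped like Sys.argv, e.g.
 * [| "./overlap"; "a.fas"; "3" |], and returns the file name and k.
 *)
let parseArgs argv =
  if Array.length argv <> 3 then
    failwith "usage: overlap <fasta-file> <k>"
  else
    (argv.(1), int_of_string argv.(2))

(* prefix : string -> int -> string
 * The first k characters of s. *)
let prefix s k = String.sub s 0 k

(* suffix : string -> int -> string
 * The last k characters of s. *)
let suffix s k = String.sub s (String.length s - k) k

(* isHeader : string -> bool
 * True when line is a FASTA header line, i.e. it starts with '>'. *)
let isHeader line = String.length line > 0 && line.[0] = '>'
```

In a full program, the arguments would be obtained with (parseArgs Sys.argv). Comparing (suffix s k) with (prefix t k) is one way to test whether two strings overlap by k characters.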
Part 2: (Optional, 3 Points): Locating Restriction Sites
Do the Locating Restriction Sites problem on Rosalind.