"Contigs should not know where they are." That's a phrase uttered by James Bonfield when presenting his work on gap5, the successor to gap4, a much-used assembly software suite. So you think: "Wait a second: you're talking about assembly, and the contigs should not store their position?"

This statement addresses a problem that we encounter often when working with genomic data: how to handle features. The approach often used is to give the feature a 'chromosome', 'start position' and 'stop position'. Seems reasonable, right? So if you want all features on chromosome 1 between positions 6,124,627 and 6,827,197 you just loop over all features and check if their range overlaps with this query range. Indeed: seems reasonable. Unless your collection of features goes into the millions.
2
Welcome
Welcome
Hi there, and welcome to SaaienTist, a blog by me, for me and you. It started out long ago as a personal notebook to help me remind how to do things, but evolved to cover more opinionated posts as well. After a hiatus of 3 to 4 years (basically since I started my current position in Belgium), I resurrect it to help me organize my thoughts. It might or might not be useful to you.

Why "Saaien tist"? Because it's pronounced as 'scientist', and means 'boring bloke' in Flemish.
About Me
About Me
Tags
Blog Archive
Links
Loading
Dynamic Views theme. Powered by Blogger. Report Abuse.