asivache
b4ef16ced2
extractIndels() now should deal correctly with soft- and hard-clipped bases
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@936 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-08 16:04:49 +00:00
asivache
0bb4565798
added AlignmentUtils.getNumAlignmentBlocks(read) - a faster alternative to read.getAlignmentBlocks().size(); IntervalCleaner updated accordingly.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@923 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-06 19:35:21 +00:00
asivache
92b054b71b
moved another variant of numMismatches to AlignmentUtils
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@922 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-06 18:07:48 +00:00
ebanks
092a754071
Make sure indel position from SW alignment is leftmost possible
...
(and improve printouts)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@912 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-05 15:36:10 +00:00
hanna
5e8c08ee63
Update to latest version of picard. Change imports in all classes dependent on picard public from import edu.mit.broad.picard... to import net.sf.picard...
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@849 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-28 20:13:01 +00:00
asivache
4b718688d5
no changes, really, just synchronizing (instead of reversing) to increase the amount of entropy
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@801 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-22 17:27:28 +00:00
hanna
01a3cb27c7
@Required / @Allows flags for main arguments.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@751 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-19 23:26:17 +00:00
asivache
7b59f63f12
and don't forget to close sam writer after we are done...
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@692 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-13 20:46:36 +00:00
asivache
de0cce87ea
new optional arg added that allows to specify a separate bam file to send all piles that fail to realign to; plus minor fixes
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@691 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-13 20:24:23 +00:00
hanna
23e9e29964
Changed reads traversals from providing a LocusContext from which the reference sequence
...
could be extracted to a char[] containing the reference bases.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@657 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-11 22:45:11 +00:00
asivache
072808858e
added COUNT_CUTOFF arg: it is nor possible to tell the code to try to realign all read piles over trains of nearby indels with at least one indel observed in COUNT_CUTOFF or more different alignments (set the arg to 1 to realign around all indels); also, some diagnostic printouts added to the output (time spent on loading the reference, time spent on scrolling through the input bam file, counts of discarded reads)
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@611 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-06 21:59:33 +00:00
ebanks
7de5da7065
Start getting the cleaner working in Walker
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@561 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-29 14:59:53 +00:00
ebanks
758db73b98
Fixed SLOWNESS issue.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@469 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-17 20:10:34 +00:00
asivache
55537c0d1e
chnage class name, now it compiles...
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@451 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-16 16:51:00 +00:00
asivache
e8a6cdb386
renamed standalone main
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@449 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-16 15:56:46 +00:00
asivache
832afd3d60
renamed standalone main
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@448 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-16 15:56:27 +00:00
asivache
85308f4ddc
resurrected indel tool's standalone main
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@447 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-16 15:55:52 +00:00
asivache
240eb18564
fix a few related issues when not all the reads were written into the output files. now cleaned output still contains all reads either with modified alignments or untouched
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@444 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-16 03:56:47 +00:00
asivache
baae98c6d5
and don't allocate new 200M string every time please, just pass byte array!
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@417 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 21:55:33 +00:00
asivache
9d56355abe
bug fixed when reference name was passed as a string instead of actual reference bases
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@416 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 21:46:27 +00:00
ebanks
647827b18c
Transitioned indel code to use GATK and Walkers
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@410 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 19:14:15 +00:00
asivache
bc43c0eefc
there are really cases when we can not merge until we get just two pilesant now we do not crash in those cases but print a warning and just show the resulting n piles even when n>2
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@390 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 00:45:47 +00:00
asivache
d44c30154a
added MAX_READ_LENGTH - now we can ignore long reads (454?); a bad idea in general, but the performance hit is to hard to take, at least for preliminary testing runs...
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@384 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-13 16:53:12 +00:00
asivache
b4136b6d6e
a few tweaks to make it more robust: ignore reads with cigars containing anything but I,D,M; don't set up contig ordering manually, rely upon reference sequence and its dictionary; don't die if a record does not have NM tag, but faal back to direct counting instead; now requires reference as a cmdline arg
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@378 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-13 04:49:19 +00:00
depristo
d7c0bcc223
Reorganized GenomeLoc code to more clearly and better use the picard SequenceDictionary information.
...
All GenomeLoc[] are not ArrayList<GenomeLoc> for clarity and consistency
Parsing now recursively merges contiguous elements chr1:1-10;chr1:11-20 => chr1:1-20
Added support for TraversingByLoci over all reference positions specified by the provided location array. System dynamically determines which traversal system to use.
Pileup now marks, very clearly, reference positions without covered reads.
Made changes around the codebase to deal with new GenomeLoc structure.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@218 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-28 20:37:27 +00:00
asivache
c6d9848d08
synchronizing latest changes
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@212 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-27 14:15:44 +00:00
asivache
f47a214f96
massive changes everywhere; lots of bugs fixed; methods moved around; computation and printout of overall stats added; now decides whether to accept or reject 'improvement'; writes alignments into two output sam files (unmodified reads/failed piles into one, realigned piles into the other); special treat for paranoids: writes third sam file with all the analyzed reads, unmodified
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@197 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-26 02:26:17 +00:00
asivache
4c29dca70d
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@186 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-25 09:23:42 +00:00
asivache
71d3e8e99b
fixed another bug in gapped alignment computation
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@185 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-25 08:33:57 +00:00
asivache
40f45c2333
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@184 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-25 05:48:10 +00:00
asivache
4222016bf5
stop printing sw matrix and other debug infoant
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@171 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-24 18:15:52 +00:00
asivache
8ea8a74fbf
fixed bug in calculation of alignment start offset for negative offsets; toString() added
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@170 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-24 18:05:28 +00:00
asivache
9aa1ccd9b7
fixed some bugs in calling the optimal path; parameters adjusted (?)
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@169 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-24 17:27:51 +00:00
asivache
786a7845dd
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@167 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-24 14:06:44 +00:00
asivache
3d1e0bf079
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@166 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-24 14:06:24 +00:00
asivache
908065125f
computes Smith-Waterman pairwise alignment
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@164 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-24 05:36:37 +00:00
hanna
63cd1fe201
Push core / playground lower into the tree.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@160 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-23 23:19:54 +00:00