ebanks
e6200fe5b5
don't ignore reads when maxReadLength isn't set
...
also, print out LOD score for cleaning
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@771 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-21 19:24:10 +00:00
andrewk
0219d33e10
QualityUtils: added reverse function to reverse an array of bytes (and not complement it), BaseUtils: split qualToProb into itself and qualToErrProb, CovariateCounterWalker and LogisticRecalibrationWalker: several changes, including proper accounting (only partly complete) for reversing AND complementing bases that are on the negative strand, PrintReadsWalker: created option to output reads to a BAM file rather than just to the screen (useful for creating a downsampled BAM file)
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@770 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-21 18:30:45 +00:00
asivache
7e77c62b49
auxiliary class, a simple struct to keep together info like numbers of covered, assessed, ref/variant bases across the sample
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@769 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-21 16:30:16 +00:00
ebanks
34f9820299
update mapping quality score and edit distance attribute for reads when they are cleaned
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@763 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-20 17:51:31 +00:00
hanna
01a3cb27c7
@Required / @Allows flags for main arguments.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@751 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-19 23:26:17 +00:00
jmaguire
3441795d9c
better handling of edge cases (zero coverage, reference mistakes, etc.)
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@747 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-18 18:04:37 +00:00
asivache
a39c8839c8
print percentage sign!
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@745 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-18 14:38:20 +00:00
jmaguire
94e324b844
Write N for the alt allele when we're hom-ref.
...
Stop EM loop when we've converged (likelihood[t-1] == likelihood[t]).
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@737 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-17 13:58:11 +00:00
kcibul
bd53bc18f9
added new required annotations
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@736 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-17 12:24:06 +00:00
ebanks
81fac73c01
LOD checks for normal and brute force versions
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@732 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-17 02:56:03 +00:00
jmaguire
527df6e57b
Massive speed-up, clean-up and tabular output.
...
This program is going to rule.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@731 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-16 16:52:40 +00:00
jmaguire
3b57a35009
don't be tricked by multiple read groups with the same sample id!
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@730 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-16 15:28:55 +00:00
jmaguire
947bac5cdc
vast speedup
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@729 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-16 15:27:58 +00:00
ebanks
f33f3c0434
added LOD threshold for determining when to clean
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@725 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-15 20:23:59 +00:00
kcibul
d1f3000afa
bed-style output for IGV
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@721 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-15 17:58:44 +00:00
jmaguire
641afc4e76
fix a crash in the event that the input file has no read groups!
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@714 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-14 19:27:41 +00:00
ebanks
7a1f85ff86
option to print out the indels found by the cleaner to a file
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@709 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-14 17:50:08 +00:00
ebanks
5dda448ae0
1. Add printouts for the cleaner
...
2. First pass at the entropy interval walker (still needs work)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@696 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-14 13:59:48 +00:00
asivache
7b59f63f12
and don't forget to close sam writer after we are done...
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@692 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-13 20:46:36 +00:00
asivache
de0cce87ea
new optional arg added that allows specifying a separate BAM file to which all piles that fail to realign are sent; plus minor fixes
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@691 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-13 20:24:23 +00:00
jmaguire
7084ecdeb6
a few changes; checked in to allow debugging.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@688 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-13 15:50:48 +00:00
kiran
4e4767e5de
Moved to org.broadinstitute.sting.secondarybase
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@682 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-12 20:26:43 +00:00
kiran
219eb60716
Added newly-required documentation to arguments so that build can complete successfully.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@681 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-12 20:26:10 +00:00
kiran
688358190c
Moved secondary base stuff out of playground for the purpose of making it a core utility. Modified package names and imports such that things would build properly.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@680 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-12 20:24:18 +00:00
kcibul
8079acb1d3
basic step0 implementation
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@679 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-12 19:49:39 +00:00
kiran
57ecb7fbf1
Nicer reporting functions.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@678 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-12 19:48:30 +00:00
hanna
ee99320c83
Removed at Mark's request.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@677 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-12 19:48:21 +00:00
kiran
f1de3d6366
Minor tweaks to how probs are supplied.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@676 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-12 19:47:41 +00:00
kiran
095dacd154
Experimental refactoring.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@675 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-12 19:46:50 +00:00
kiran
758f8aa89b
Experimental refactoring.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@674 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-12 19:46:34 +00:00
andrewk
1518f8f9bf
Update training data creation in CovariateCounterWalker to output much smaller files by counting the number of occurrences of each data point combination rather than outputting a line for each data point (i.e. each base). Also fixed bug in LogisticRecalibrationWalker where a null SAMHeader was being pulled from a function that is now marked deprecated.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@673 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-12 19:23:14 +00:00
ebanks
4c12df372c
Dumb, dumb bug.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@672 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-12 19:21:33 +00:00
ebanks
630066cc0a
1. Merge LocusWindows whose reads overlap.
...
2. Fix bug (we weren't clearing the "to emit" list)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@670 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-12 17:33:23 +00:00
jmaguire
c4d89997ca
put in a dummy sample_name so it'll compile
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@668 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-12 15:12:42 +00:00
jmaguire
c8d7223789
do pooled calling properly for 1kg
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@667 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-12 15:12:13 +00:00
jmaguire
313a6d0fb5
lots of changes to facilitate calling indels and 1kG
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@666 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-12 15:11:42 +00:00
jmaguire
add7b6cf65
add sample_name to constructor, misc bug fixes
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@665 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-12 15:10:17 +00:00
jmaguire
0267ccae7f
add code for computing indel genotype likelihoods
...
make reference lods negative
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@664 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-12 15:09:29 +00:00
hanna
ee9077fc69
LocusIterator iterated through LocusContexts, which was fine until now when we need something
that iterates through loci (GenomeLocs). Rename LocusIterator to LocusContextIterator.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@662 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-12 13:54:57 +00:00
hanna
0bca588629
Botched some boolean logic.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@658 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-11 22:53:52 +00:00
hanna
23e9e29964
Changed reads traversals from providing a LocusContext from which the reference sequence
could be extracted to a char[] containing the reference bases.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@657 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-11 22:45:11 +00:00
hanna
052819bed5
Switched dependencies of GenomeAnalysisTK to depend on GenomeAnalysisEngine.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@656 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-11 22:33:00 +00:00
ebanks
009e71fcd9
We need to sort cleaned reads ourselves (instead of letting SAMFileWriter
do it) because the SAM headers are often screwed up and claim to be
"unsorted". While here, I broke off the module from the SortSamIterator
in case someone else wants to use it.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@654 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-11 15:43:42 +00:00
ebanks
3aabc144c6
Added functionality to allow for a contract between LocusWindowTraversalEngine and LocusWindowWalker which allows the Walker to act upon reads outside of the provided intervals.
...
(Really, all we want to do is spit out all reads, but this allows the Walker to do other things with the reads if it wants)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@641 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-08 17:28:16 +00:00
hanna
226edbdef6
Hyphen-style xml output. Much sexier.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@635 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-08 01:04:40 +00:00
aaron
21536df308
Change the sample XML marshalling code over to simple XML, and take out the castor lines in the ivy.xml
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@633 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-08 00:08:25 +00:00
depristo
5a6892900e
fixing oddities in duplicates
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@628 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-07 18:55:45 +00:00
depristo
4a26f35caa
new default syntax
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@627 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-07 18:16:53 +00:00
ebanks
283a4d1b54
Fix some special-case cleaner issues.
...
We now do the same as brute force in all examples to date.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@626 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-07 18:16:35 +00:00
depristo
2204be43eb
System for traversing duplicate reads, along with a walker to compute quality scores among duplicates and a smarter method to combine quality scores across duplicates -- v1
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@624 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-07 18:06:02 +00:00