depristo
076d21d394
Minor bug workaround in GenotypeConcordance module (see todo). General platform read filter. You can say -rl Platform illumina to remove all SLX reads
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3054 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-22 02:47:09 +00:00
hanna
6cd97b78ab
An additional safety check to ensure that we only walk over coordinate-sorted
data when doing locus traversals.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3053 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-21 23:31:45 +00:00
hanna
b4b4e8d672
For Sarah Calvo: initial implementation of read pair traversal, for BAM files
sorted by read name.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3052 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-21 23:22:25 +00:00
hanna
c0eb5c27ea
Lower memory support for merged sharding. Merged sharding is still not available.
WARNING: If you update frequently, you might have to rm -rf ~/.ant/cache -- this is an unfortunate side effect of the way we
distribute picard-private.jar.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3050 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-19 22:03:47 +00:00
ebanks
4d4db7fe63
Renaming for consistency
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3049 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-19 18:45:01 +00:00
ebanks
4c4d048f14
Moving VariantFiltration over to use VariantContext.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3048 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-19 18:35:23 +00:00
ebanks
c88a2a3027
Fixing/cleaning up the vcf merge util
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3047 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-19 15:13:32 +00:00
rpoplin
cdec84aa8f
Bug fix for variant optimizer. Remember to close the PrintStreams it uses to output the cluster files.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3046 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-19 15:07:32 +00:00
depristo
d8ff552311
Support for EXPERIMENTAL sampling-based genotype likelihoods
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3044 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-19 13:19:40 +00:00
depristo
7b17bcd0af
Refactoring a few useful routines for detecting mendelian violations
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3043 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-19 13:19:01 +00:00
depristo
56092a0fc2
Slight cleanup for mathutils
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3042 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-19 13:18:08 +00:00
depristo
b221ce94ce
Trio-aware genotyper that calculates P(de novo); still being tested
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3041 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-19 13:11:39 +00:00
ebanks
03480c955c
And now the UnifiedGenotyper can officially annotate genotype (FORMAT) fields too.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3039 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-19 04:58:37 +00:00
ebanks
e757f6f078
Missing value for arbitrary format entries is empty string (need to revisit at some point, but it will require updating the VCF spec).
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3038 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-19 03:56:27 +00:00
ebanks
0311980668
The VariantAnnotator can now officially annotate genotype (FORMAT) fields.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3037 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-19 03:30:14 +00:00
hanna
9b61d95d9c
Khalid found an out-of-memory condition with the new sharding system when
merging lots of BAMs, and the fix is taking longer than I thought. Disable
experimental sharding when merging until the fix is ready.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3036 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-19 02:43:46 +00:00
ebanks
b8e8852b4f
Better interface for the Annotator in how it interacts with VariantContext.
Also, added a proof of concept genotype-level annotation (not working yet, almost there).
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3035 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-18 20:41:57 +00:00
hanna
96662d8d1b
Moving from GATK dependencies on isolated classes checked into the GATK
codebase to a dependency on a jar file compiled from my private picard branch.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3034 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-18 17:43:42 +00:00
aaron
8a5f0b746e
some cleanup for the output system.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3032 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-18 12:54:39 +00:00
rpoplin
c78fc23ec5
Minor updates to output of variant optimizer.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3031 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-18 12:46:47 +00:00
ebanks
0247548400
Fixed one test and (temporarily) punted on another
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3030 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-18 06:22:48 +00:00
ebanks
ee0e833616
Some significant changes to the annotator:
1. Annotations can now be "decorated" with any arbitrary interface description - not just standard or experimental.
2. Users can now specify not only individual annotations to use, but also the interface names from #1. Any number of them can be specified, e.g. -G Standard -G Experimental -A RankSumTest.
3. These same arguments can be used with the Unified Genotyper for when it calls into the Annotator.
4. There are now two types of annotations: those that are applied to the INFO field and those that are applied to specific genotypes (the FORMAT field) in the VCF (however, I haven't implemented any of these latter annotations just yet; coming soon).
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3029 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-18 05:38:32 +00:00
rpoplin
58a31bab6a
Variant optimizer now outputs VCF files via ApplyVariantClustersWalker. Documentation to be added to the wiki. It is ready to be used by other people but only with great caution.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3028 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-17 20:41:42 +00:00
hanna
d9398dc347
Remove some of the restrictions on getStart() and getStop(); getStart() and getStop()
now do the minimum validation rather than the more rigorous only-within-the-contig-bounds
header validation.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3027 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-17 19:39:30 +00:00
aaron
182f1061ff
Bamboo isn't picking up commits for some reason; updating a copyright to see if it'll get this commit.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3025 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-17 17:56:48 +00:00
ebanks
5e29d0c219
Be smarter about dealing with infinite quals for ref calls
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3024 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-17 17:35:23 +00:00
rpoplin
1bb4394aa9
Adding a skeleton for the second step of the variant optimization process.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3023 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-17 17:03:40 +00:00
ebanks
ded4ba8966
Let's make artificial reads that actually adhere to the specs...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3022 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-17 16:51:42 +00:00
bthomas
5b34bb9ab0
Adding three minor new features:
+ -L all now walks over all intervals
+ if a -L argument is passed with a .list extension and the file does not exist, returns a \
File Not Found error instead of a "bad interval" error. We plan to soon revisit interval \
lists and generate a concrete list of filenames, so this is likely temporary.
+ An error is thrown if the start position of an interval is higher than the end position.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3021 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-17 16:24:10 +00:00
ebanks
4340601c26
-Pushed base quals back down into SAMRecord; if -OQ is used, the SAMRecord quals get updated automatically
-Better integration test
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3020 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-17 16:00:10 +00:00
ebanks
76d14d17dc
oops, need to update class names too
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3019 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-17 14:01:31 +00:00
ebanks
85a030069d
renaming for consistency
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3018 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-17 14:00:28 +00:00
ebanks
af5fd99444
Added filter for bad cigars (based on consecutive indels) - and cleaned up bad mates filter.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3017 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-17 13:53:42 +00:00
hanna
2cc040aa1c
New sharding system is live. Disable with -ds.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3016 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-17 03:32:45 +00:00
ebanks
1fd909cdaf
Fix for Kiran: -1 is a valid value for genotype qualities in VCF, so VariantContext shouldn't die. Cleaned up the relevant VCF code while I was in there.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3015 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-17 00:20:15 +00:00
hanna
849bd1f451
Set the eagerDecode flag in such a way that the binary data block in the BAM will always be considered dirty.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3014 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-16 22:01:23 +00:00
rpoplin
933823c8bc
Removed the StingException when mkdir fails for Sendu in AnalyzeCovariates. Incremental updates to VariantOptimizer.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3013 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-16 19:45:02 +00:00
hanna
2525ecaa43
Oops. Commented out some tests to improve performance and then checked in the commented out tests. Reverted.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3012 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-16 16:34:50 +00:00
hanna
59045ccb28
Filter,merge performs much better than merge,filter. Many thanks to Eric for checking in an integration test that so compellingly demonstrates this.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3011 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-16 16:23:37 +00:00
hanna
6dd5f192e7
Performance improvements for RODs in conjunction with new sharding system.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3010 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-16 14:54:12 +00:00
kiran
f20f78d77f
Don't crash if the tracker is null. Reset the alternate alleles based on the alts present in the subset of samples.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3009 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-16 04:00:04 +00:00
aaron
10e76abbbc
adding some VE2 report infrastructure; work-in-progress.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3008 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-16 03:57:42 +00:00
ebanks
586f87fa35
Quick fix
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3007 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-16 02:59:26 +00:00
ebanks
202231141c
-Push the --use_original_qualities argument into the engine.
-Check that base and qual strings are the same lengths
-Fix one more bug in the clipper.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3006 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-16 02:06:11 +00:00
ebanks
035d4170aa
fix bug in read clipper: output bam can be null, so check for it.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3005 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-15 18:49:26 +00:00
ebanks
411d25c8d1
-Integration tests for walkers that use original quals.
-framework for pushing -OQ into GATK (not done)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3004 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-15 18:46:31 +00:00
aaron
e365d308d4
add a new JEXLContext that lazy-evaluates JEXL expressions given the VariantContext.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3003 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-15 16:00:55 +00:00
kcibul
9f519af06d
new method to filter out overlapping PE reads
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3002 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-15 15:40:09 +00:00
hanna
45f70de6df
Fixed bug that failed to reset an accumulator when crossing contig boundaries,
meaning that in special cases of shallow coverage, an interval might get dropped.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2999 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-15 04:45:55 +00:00
ebanks
73d6167bd6
Fixing broken integration tests
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2998 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-14 23:18:49 +00:00