Mark DePristo
44348a3761
Merge pull request #363 from broadinstitute/md_het_docs
...
Better docs on the meaning of heterozygosity
2013-08-07 04:28:16 -07:00
Mark DePristo
318f7e74e4
Better docs on the meaning of heterozygosity
...
-- [delivers #53522209 ]
2013-08-07 07:27:45 -04:00
Eric Banks
dd0e6409c6
Merge pull request #367 from broadinstitute/md_hc_ref_fix
...
Bugfix for ReferenceConfidenceModel
2013-08-06 20:37:08 -07:00
Mark DePristo
40bc7d6a9c
Bugfix for ReferenceConfidenceModel
...
-- In the case where there's some variation to assemble and evaluate but the resulting haplotypes don't produce any called variants, the reference model would throw "java.lang.IllegalArgumentException: calledHaplotypes must contain the refHaplotype". Now we detect this case and emit the standard no-variation output.
-- [delivers #54625060 ]
2013-08-06 16:00:32 -04:00
Ryan Poplin
6dfd17122f
Merge pull request #362 from broadinstitute/rp_single_sample_hc_pipeline
...
Adding single sample HC qscript for Mauricio.
2013-08-06 12:11:50 -07:00
Ryan Poplin
ee0aba224c
Adding single sample HC qscript for Mauricio.
2013-08-06 15:10:15 -04:00
Mark DePristo
81a74351fd
Merge pull request #360 from broadinstitute/rp_vqsr_ordering_of_annotations_bug
...
Fix for the VQSR visualization script with the new ordering of annotatio...
2013-08-03 09:49:42 -07:00
Ryan Poplin
a46f633bd6
Fix for the VQSR visualization script with the new ordering of annotations.
2013-08-02 19:10:45 -04:00
lbergelson
af36c7ce9a
Update QScript.scala
...
Relaxing addAll parameter type from Seq to Traversable to make it slightly more flexible.
2013-08-02 14:09:26 -04:00
Eric Banks
08a7ef6620
Merge pull request #358 from broadinstitute/md_tribble_reuse_query_stream
...
Rev picard to get optimized tribble feature reads
2013-08-02 10:29:39 -07:00
Mark DePristo
d5dd3b23db
Rev picard to get optimized tribble feature reads
...
-- The previous version of TribbleIndexedFeatureReader.query() would open a RandomAccessFile each time the GATK crossed a shard boundary. When running with -L wex.intervals (or any time there were lots of intervals) we'd be opening and closing enormous numbers of files, radically slowing down the GATK. With these patched versions of Tribble we see something like the following performance improvements:
SelectVariants with -L wex.intervals on my local machine against non-local file
pre-patch => 3 hours
post-patch => 30 seconds
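A minimal sketch of the idea behind this change, assuming a simplified reader: keep one file handle open and seek within it rather than reopening the file on every query. `ReusableReader`, `query`, and `open_count` are hypothetical names for illustration, not the Tribble API.

```python
import os
import tempfile


class ReusableReader:
    """Hypothetical reader that keeps one file handle open across queries."""

    def __init__(self, path):
        self.path = path
        self._handle = None
        self.open_count = 0  # how many times the file was actually opened

    def query(self, offset, length):
        # Open lazily on first use, then reuse the same handle afterwards.
        if self._handle is None:
            self._handle = open(self.path, "rb")
            self.open_count += 1
        self._handle.seek(offset)
        return self._handle.read(length)

    def close(self):
        if self._handle is not None:
            self._handle.close()
            self._handle = None


def demo_reuse():
    # Write a small temporary file and query it at several offsets,
    # standing in for many interval queries against one feature file.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"ACGTACGTACGT")
        path = tmp.name
    reader = ReusableReader(path)
    try:
        chunks = [reader.query(offset, 4) for offset in (0, 4, 8)]
        return chunks, reader.open_count
    finally:
        reader.close()
        os.remove(path)
```

With many intervals, the number of open/close calls stays at one per file instead of one per query.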
2013-08-02 10:31:36 -04:00
jmthibault79
9316a70d1e
Merge pull request #355 from broadinstitute/eb_add_error_handling_to_kb
...
Added error handling to the newly added sites iterator so that it doesn't NPE when it encounters a bad record
2013-08-02 06:40:41 -07:00
Eric Banks
ae5fc4c726
Merge pull request #356 from broadinstitute/mc_refbias
...
Reference bias walker
2013-08-02 06:30:56 -07:00
Eric Banks
8a1e2d58ef
Merge pull request #357 from broadinstitute/mc_resort_haplotypes
...
Better caching for the HaplotypeCaller (significant speed up on NA12878 chr20!!)
2013-08-02 06:29:41 -07:00
Mauricio Carneiro
3e75262a3e
Reference bias walker
...
Calculates reference bias based on the AD genotype field instead of AB. This is slightly more meaningful for indels and still a good estimator for SNPs.
2013-08-02 01:44:57 -04:00
Mauricio Carneiro
285ab2ac62
Better caching for the HaplotypeCaller
...
Problem
-------
The caching strategy is incompatible with the current sorting of the haplotypes, which renders the cache nearly useless.
Before the PairHMM updates, we realized that a lexicographically sorted list of haplotypes would optimize the use of the cache. This held only until we added the initial condition to the first row of the deletion matrix, which depends on the length of the haplotype. Because of that, every time consecutive haplotypes differ in length, the cache has to be wiped. A lexicographic sort places haplotypes of different lengths next to each other, therefore wasting *tons* of re-compute.
Solution
-------
Very simple. Sort the haplotypes by LENGTH and then in lexicographic order.
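The ordering described above can be sketched as follows, assuming haplotypes are plain strings. `count_length_changes` is a hypothetical helper that counts how often adjacent haplotypes differ in length, which stands in here for the number of cache wipes.

```python
def sort_haplotypes(haplotypes):
    # Sort by length first, then lexicographically within each length.
    return sorted(haplotypes, key=lambda h: (len(h), h))


def count_length_changes(ordered):
    # Each change in length between neighbours would force a cache wipe.
    return sum(1 for a, b in zip(ordered, ordered[1:]) if len(a) != len(b))


haplotypes = ["ACGT", "ACG", "ACGTT", "ACT", "AGGT"]

lexicographic = sorted(haplotypes)
by_length = sort_haplotypes(haplotypes)

print(lexicographic, count_length_changes(lexicographic))
print(by_length, count_length_changes(by_length))
```

In this toy input the purely lexicographic order changes length four times, while the length-then-lexicographic order changes length only twice.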
2013-08-02 01:27:29 -04:00
Eric Banks
0b062e7f22
Merge pull request #354 from broadinstitute/eb_fix_rr_count_encoding
...
Two reduce reads updates/fixes
2013-08-01 12:34:19 -07:00
Eric Banks
e5be038f1a
Added error handling to the newly added sites iterator so that it doesn't NPE when it encounters a bad record.
...
Added a unit test that exactly replicates the behavior.
2013-08-01 15:25:20 -04:00
Eric Banks
1e396af4d0
Two reduce reads updates/fixes:
...
1. Removing old legacy code that was capping the positional depth for reduced reads to 127.
Unfortunately this cap effectively performs biased down-sampling and throws off e.g. FS numbers.
Added an end-to-end unit test verifying that depth counts in RR can be higher than max byte.
Some md5s change in the RR tests because depths are now (correctly) no longer capped at 127.
2. Down-sampling in ReduceReads was not safe as it could remove het compressed consensus reads.
Refactored it so that it can only remove non-consensus reads.
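To illustrate why a cap of 127 acts as biased down-sampling, here is a small sketch with made-up per-strand depths. The function names and numbers are assumptions for illustration only; they do not reflect how ReduceReads stores its counts.

```python
MAX_SIGNED_BYTE = 127  # largest value a signed 8-bit integer can hold


def capped(depth, cap=MAX_SIGNED_BYTE):
    # Mimics the legacy behaviour: any depth above the cap is truncated.
    return min(depth, cap)


def forward_fraction(forward, reverse):
    # Fraction of observations on the forward strand.
    total = forward + reverse
    return forward / total if total else 0.0


# Hypothetical depths: 300 forward-strand and 100 reverse-strand observations.
true_fraction = forward_fraction(300, 100)
capped_fraction = forward_fraction(capped(300), capped(100))

print(true_fraction, capped_fraction)
```

Only the deeper strand is truncated, so the forward fraction shifts from 0.75 to roughly 0.56. Strand-based statistics computed from capped counts would be skewed in the same way.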
2013-08-01 14:34:59 -04:00
Eric Banks
ec3c885a25
Merge pull request #353 from broadinstitute/rp_HC_updates_for_1000G_and_WGS_calling
...
Max number of haplotypes to evaluate no longer grows unbounded with the ...
2013-07-31 08:29:06 -07:00
Ryan Poplin
4f3411f3d4
Max number of haplotypes to evaluate no longer grows unbounded with the number of samples. This is necessary for multi-sample calling projects with over 100 samples.
2013-07-31 10:48:55 -04:00
Yossi Farjoun
00cedd0bd3
Merge pull request #352 from broadinstitute/yf_SNPEFF_Stratifier
...
moved SnpEffUtilUnitTest to public tree
2013-07-30 14:52:33 -07:00
Yossi Farjoun
284176cd7b
moved SnpEffUtilUnitTest to public tree
2013-07-30 17:51:40 -04:00
droazen
b8709b1942
Merge pull request #332 from broadinstitute/st_fpga_hmm
...
FPGA support for PairHMM
2013-07-30 14:21:21 -07:00
Eric Banks
ac06829194
Merge pull request #349 from broadinstitute/yf_SNPEFF_Stratifier
...
Adding a representation of the hierarchy of flags output by snpEff (Yoss...
2013-07-30 12:42:25 -07:00
Joseph Rose
d2860a5486
Adding a representation of the hierarchy of flags output by snpEff (Yossi) and a stratifier whose output states are coding regions, genes, stop_gain, stop_lost and splice sites, all determined by the snpEff hierarchy (J. Rose)
2013-07-30 15:38:32 -04:00
Mauricio Carneiro
7b731dd596
Removed native method call
...
and fixed indentation.
2013-07-30 13:59:58 -04:00
chartl
cf46256356
Merge pull request #350 from broadinstitute/chartl_genotypeconcordance_doc_cleanup
...
Add <pre> tags to the Genotype Concordance docs. Tables were not being d...
2013-07-29 16:17:26 -07:00
Chris Hartl
464a5b229d
Add <pre> tags to the Genotype Concordance docs. Tables were not being displayed properly.
2013-07-29 15:48:17 -07:00
Eric Banks
678d038c76
Merge pull request #348 from broadinstitute/gg_gatkdoc_fixes
...
Gg gatkdoc fixes
2013-07-26 13:17:51 -07:00
Geraldine Van der Auwera
3063d82797
Fixed example in CallableLoci gatkdoc
2013-07-26 15:51:31 -04:00
Geraldine Van der Auwera
fc4a8b1dd0
Fixed example in DoC gatkdoc
2013-07-26 15:51:30 -04:00
Geraldine Van der Auwera
660b075900
Added deprecation notice for SomaticIndelDetector
2013-07-26 15:51:30 -04:00
Geraldine Van der Auwera
5ad99c362d
Added caveat to gatkdocs for MAPQ read transformers & cleaned up AB annotation gatkdocs
2013-07-26 15:51:30 -04:00
Geraldine Van der Auwera
0ea3f8ca58
Added function to gatkdocs to specify what VCF field an annotation goes in (INFO or FORMAT)
2013-07-26 15:51:30 -04:00
Geraldine Van der Auwera
edbd17b8e0
Added note of caution to VQSR gatkdocs for option BOTH of recalibration mode
2013-07-26 15:51:29 -04:00
Ryan Poplin
f52196496d
Merge pull request #347 from broadinstitute/eb_more_dnagling_tail_improvements
...
More specific fix for the dangling tail edge case with a single leading deletion.
2013-07-26 07:25:47 -07:00
Ryan Poplin
66db412ad0
Merge pull request #345 from broadinstitute/rp_vqsr_sort_annotations
...
Automatically order the annotation dimensions in the VQSR by their stand...
2013-07-26 07:23:42 -07:00
Ryan Poplin
8c205dda1b
Automatically order the annotation dimensions in the VQSR by their standard deviation instead of the order they were specified on the command line.
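A minimal sketch of ordering annotation dimensions by standard deviation rather than by command-line order. The sort direction (largest first), the use of the population standard deviation, and the example values are all assumptions; the commit message does not specify them.

```python
import statistics


def order_annotations_by_stdev(values_by_annotation):
    # Rank annotation names by the spread of their values.
    # Descending order is an assumption made for this illustration.
    return sorted(
        values_by_annotation,
        key=lambda name: statistics.pstdev(values_by_annotation[name]),
        reverse=True,
    )


# Hypothetical annotation values, listed in "command line" order.
annotations = {
    "QD": [1.0, 2.0, 3.0],
    "MQ": [60.0, 60.0, 60.0],
    "FS": [0.0, 10.0, 20.0],
}

print(order_annotations_by_stdev(annotations))
```

The result depends only on the data, so specifying the same annotations in a different order on the command line yields the same dimension order.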
2013-07-26 10:22:43 -04:00
Eric Banks
924d9b7ef4
Merge pull request #344 from lbergelson/lb_library_read_filter
...
Adding LibraryReadFilter.
2013-07-26 06:44:53 -07:00
Louis Bergelson
7c43b5f26a
Adding LibraryReadFilter.
...
--Moving LibraryReadFilter which has been part of Mutect into gatk public.
--Added an additional check for null values.
2013-07-26 09:32:14 -04:00
Eric Banks
9372c5ef41
Merge pull request #334 from broadinstitute/mc_generic_input_for_qualify_missing_intervals
...
QualifyMissingIntervals: support different formats
2013-07-25 12:39:26 -07:00
sathibault
71eb944e62
Adding CnyPairHMMUnitTest
2013-07-25 14:19:50 -05:00
Eric Banks
1b25cf471c
Merge pull request #341 from broadinstitute/eb_make_all_rr_stranded
...
Eb make all rr stranded
2013-07-25 11:50:43 -07:00
Eric Banks
5dfa863caa
Fully stranded implementation of RR (plus bug fix for insertions and het compression).
...
Now only filtered reads are unstranded. All consensus reads have strand, so that we
emit 2 consensus reads in general now: one for each strand.
This involved some refactoring of the sliding window which cleaned it up a lot.
Also included is a bug fix:
insertions downstream of a variant region weren't triggering a stop to the compression.
2013-07-25 14:48:53 -04:00
Eric Banks
0a2b5ddadf
More specific fix for the dangling tail edge case with a single leading deletion.
...
The previous fix was too general (and therefore incorrect) and caused the HC to throw an exception.
Added "unit" test for this exact case.
2013-07-25 12:24:46 -04:00
Mauricio Carneiro
31ab0824b1
quick indentation fixes to FPGA code
2013-07-24 14:09:49 -04:00
Ryan Poplin
e5aab22680
Merge pull request #342 from broadinstitute/eb_fix_mq_in_rbp
...
Fixing ReadBackedPileup to represent mapping qualities as ints, not (signed) bytes.
2013-07-24 09:42:13 -07:00
Eric Banks
6df43f730a
Fixing ReadBackedPileup to represent mapping qualities as ints, not (signed) bytes.
...
Having them as bytes caused problems for downstream programmers who had data with high MQs.
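The problem with signed bytes can be shown with a short sketch: any value above 127 wraps to a negative number when squeezed into a signed 8-bit integer. `as_signed_byte` is a hypothetical helper that reproduces that narrowing in Python; it is not GATK code.

```python
def as_signed_byte(value):
    # Keep the low 8 bits and reinterpret them as a signed two's-complement byte,
    # which is what happens when an int is narrowed to a signed byte.
    return int.from_bytes(bytes([value & 0xFF]), "big", signed=True)


for mapping_quality in (60, 127, 128, 200, 255):
    print(mapping_quality, as_signed_byte(mapping_quality))
```

Values up to 127 survive unchanged, while 128, 200, and 255 come back as -128, -56, and -1. Storing mapping qualities as ints avoids the wraparound.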
2013-07-23 23:47:15 -04:00
Eric Banks
71222bff45
Merge pull request #340 from broadinstitute/eb_fix_okaytomiss_arg
...
Various updates for KB, mostly so that reviews through IGV work properly...
2013-07-21 19:01:01 -07:00