Eric Banks
0187f04a90
Proper fix for a previous RR bug fix: only remove reads from the header if they were actually used in the creation of the polyploid consensus.
2012-09-23 00:39:19 -04:00
Eric Banks
344083051b
Reverting the fix to the generalized ploidy exact model since it cannot handle it computationally. Will file this in the JIRA.
2012-09-22 23:07:28 -04:00
Eric Banks
ced652b3dd
RR bug: we need to call removeFromHeader() for reads that were used in creating a polyploid consensus or else they are reused later in creating synthetic reads. In the worst case, this bug caused the tool to create 2 copies of the reduced read.
2012-09-22 21:50:10 -04:00
Eric Banks
60b93acf7d
RR bug: we need to test that the mapping and base quals are >= the MIN values and not just >. This was causing us to drop Q20 bases.
2012-09-22 21:32:29 -04:00
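The comparison fix in 60b93acf7d can be illustrated with a minimal sketch. It is written in Python for brevity and is not the tool's actual code: the function name and the threshold values are assumptions chosen only to match the Q20 case mentioned in the message.

```python
MIN_BASE_QUAL = 20      # hypothetical threshold, for illustration only
MIN_MAPPING_QUAL = 20   # hypothetical threshold, for illustration only


def passes_min_quals(base_qual, mapping_qual):
    """Keep a base when both qualities are at or above the minimums."""
    # A strict '>' here would drop bases sitting exactly at the minimum (e.g. Q20).
    return base_qual >= MIN_BASE_QUAL and mapping_qual >= MIN_MAPPING_QUAL
```

With `>=`, a base at exactly the minimum quality is retained; with `>`, it would be filtered out.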
Eric Banks
21251c29c2
Off-by-one error in the sliding window manifests itself at the end of a coverage region, dropping the last covered base.
2012-09-21 17:22:30 -04:00
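The class of bug described in 21251c29c2 can be sketched as follows. The log does not show the sliding window code, so this Python snippet only illustrates how an exclusive upper bound loses the final position of a region whose bounds are both inclusive; `covered_positions` is a hypothetical helper.

```python
def covered_positions(start, stop):
    """Return every position in a coverage region whose bounds are both inclusive."""
    # range() excludes its upper bound, so 'stop + 1' is needed to keep the
    # last covered base; range(start, stop) would silently drop it.
    return list(range(start, stop + 1))
```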
Mauricio Carneiro
2c3dc291c0
Added positive/negative strand to the synthetic reads
2012-09-21 10:00:48 -04:00
Mauricio Carneiro
51cb5098e4
Fixed the alignment issues with reads that started with empty consensus headers
2012-09-21 10:00:47 -04:00
Mauricio Carneiro
aa1d2f3a5b
Not every consensus is well aligned. Need to check more, but starting position has been fixed.
2012-09-21 10:00:45 -04:00
Mauricio Carneiro
97874b92d1
Program runs, but the consensus reads are all out of place and need more tags
2012-09-21 10:00:44 -04:00
Mauricio Carneiro
3494a52ddc
another intermediate commit to update changes from stable
2012-09-21 10:00:43 -04:00
Mauricio Carneiro
a89ff7b5dd
Intermediate commit to resolve conflicts coming from stable
2012-09-21 10:00:41 -04:00
Eric Banks
1316b579f0
Bad news folks: BQSR scatter-gather was totally busted; you absolutely cannot trust any BQSR table that was a product of SG (for any version of BQSR). I fixed BQSR-gathering, rewrote (and enabled) the unit test, and confirmed that outputs are now identical whether or not SG is used to create the table.
2012-09-20 14:14:34 -04:00
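The invariant that 1316b579f0 restores (a gathered table must equal the table built without scatter-gather) can be sketched as below. This is a simplified Python model, not the BQSR implementation: it assumes a table is just a mapping from a covariate key to a pair of counts (observations, errors), and the function names are hypothetical.

```python
def build_table(events):
    """Count (observations, errors) per key from (key, is_error) events."""
    table = {}
    for key, is_error in events:
        obs, err = table.get(key, (0, 0))
        table[key] = (obs + 1, err + int(is_error))
    return table


def gather(tables):
    """Merge per-shard tables by summing the counts for each key."""
    merged = {}
    for table in tables:
        for key, (obs, err) in table.items():
            m_obs, m_err = merged.get(key, (0, 0))
            merged[key] = (m_obs + obs, m_err + err)
    return merged
```

Under these assumptions, splitting the events into shards, building one table per shard, and gathering the results should give exactly the table built from all events at once, which is the property the rewritten unit test is described as checking.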
Eric Banks
4b7edc72d1
Fixing edge case bug in the Exact model (both standard and generalized) where we could abort prematurely in the special case of multiple polymorphic alleles and samples with widely different depths of coverage (e.g. exome and low-pass). In these cases it was possible to call the site bi-allelic when in fact it was multi-allelic (but it wouldn't cause it to create a monomorphic call).
2012-09-20 10:59:42 -04:00
Mauricio Carneiro
ee31a54a03
Merged bug fix from Stable into Unstable
2012-09-19 16:09:45 -04:00
Mauricio Carneiro
7cf9911924
Fixed ReduceReads bug where variant regions were missing.
This affected variant regions with more than 100 reads and fewer than 250 reads. Only BAMs reduced with GATK v2 and 2.1 were affected.
2012-09-19 16:09:08 -04:00
Ryan Poplin
7a7103a757
Merge branch 'master' of ssh://gsa2.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-09-19 10:39:18 -04:00
Guillermo del Angel
bebd5c14b8
Update general ploidy md5s: the previous commit contained a bad merge of md5s, and the new shortened interval definition for EMIT_ALL_CONFIDENT_SITES was buggy
2012-09-18 20:12:15 -04:00
Guillermo del Angel
6b37350bc0
Two hairy bugs in pool caller: a) Site error model wasn't counting errors in insertions correctly - Alleles passed in had padded ref byte, but event base in PileupElement doesn't have it. As a result, mismatch rate was grossly overestimated with insertions and we missed several calls we should have made. Integration test reflects changes. b) Adding a ref GL to the exact model is correct mathematically but AFResult wasn't filled properly. As a result, QUAL was junk in pure ref sites, and in all other sites the last ref GL introduced wasn't properly updating Pr(AF>0). c) Added integration test that covers -out_mode EMIT_ALL_CONFIDENT_SITES. Not fully sure if the math is 100% correct (for both diploid and generalized case) but at least now diploid and non-diploid cases behave similarly. md5 of this new test will fail since it's taking me a long time to run so I'll update from Bamboo output shortly
2012-09-14 13:13:22 -04:00
Ryan Poplin
35d15278af
Merge branch 'master' of ssh://gsa2.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-09-11 14:34:17 -04:00
Guillermo del Angel
13831106d5
Fix GSA-535: storing likelihoods in allele map was busted when running HaplotypeCaller, only the last likelihood of a haplotype was being stored, as opposed to the max likelihood of all haplotypes mapping to an allele
2012-09-11 11:01:26 -04:00
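The difference between "last likelihood" and "max likelihood" described in 13831106d5 can be sketched as follows. This is an illustrative Python model only: the dictionaries and the function name are assumptions, not the HaplotypeCaller data structures.

```python
def best_likelihood_per_allele(haplotype_likelihoods, haplotype_to_allele):
    """For each allele, keep the maximum likelihood over all haplotypes mapping to it."""
    best = {}
    for haplotype, likelihood in haplotype_likelihoods.items():
        allele = haplotype_to_allele[haplotype]
        # An unconditional 'best[allele] = likelihood' would keep whichever
        # haplotype happened to be visited last, which is the bug described above.
        if allele not in best or likelihood > best[allele]:
            best[allele] = likelihood
    return best
```

For example, if two haplotypes with log-likelihoods -1.0 and -5.0 map to the same allele, the allele should be stored with -1.0 regardless of iteration order.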
Ryan Poplin
aa9829b55c
fixing typo
2012-09-10 13:36:37 -04:00
Guillermo del Angel
10c720cbba
Merge branch 'master' of ssh://gsa4/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-09-10 09:56:47 -04:00
Guillermo del Angel
2d4b00833b
Bug fix for logging likelihoods in new read allele map: reads which were filtered out were being excluded from map, but they should be included in annotations
2012-09-09 20:35:45 -04:00
Ryan Poplin
36913706c0
Bug fix in HC GenotypingEngine to ensure that all the merged complex events get properly added to the priority list used by VariantContextUtils when combining multiallelic events.
2012-09-09 13:47:54 -04:00
Ryan Poplin
688fc9fb56
Bug fix in HC GenotypingEngine to ensure that all the merged complex events get properly added to the priority list used by VariantContextUtils when combining multiallelic events.
2012-09-09 10:36:09 -04:00
David Roazen
cb84a6473f
Downsampling: experimental engine integration
-Off by default; engine fork isolates new code paths from old code paths,
so no integration tests change yet
-Experimental implementation is currently BROKEN due to a serious issue
involving file spans. No one can/should use the experimental features
until I've patched this issue.
-There are temporarily two independent versions of LocusIteratorByState.
Anyone changing one version should port the change to the other (if possible),
and anyone adding unit tests for one version should add the same unit tests
for the other (again, if possible). This situation will hopefully be extremely
temporary, and last only until the experimental implementation is proven.
2012-09-06 15:03:27 -04:00
Yossi Farjoun
d6884e705a
Revert "fixed a typo in StringText.properties"
This reverts commit b74c1c17e748f75e59d23545084b983e2a8d2fa6.
2012-09-05 15:21:00 -04:00
Yossi Farjoun
f4b39a7545
Merge branch 'master' of ssh://gsa4/humgen/gsa-scr1/gsa-engineering/git/unstable
merging trivially after a commit
2012-09-05 14:33:39 -04:00
Yossi Farjoun
6e517df5d9
fixed a typo in StringText.properties
2012-09-05 14:33:08 -04:00
Ryan Poplin
9cc1a9931b
Resolving merge conflicts.
2012-09-04 10:47:38 -04:00
Ryan Poplin
c9944d81ef
Skip array needs to also be used in the updateDataForRead function of the delocalized BQSR.
2012-09-04 10:33:37 -04:00
Mark DePristo
817ece37a2
General infrastructure for ReadTransformers
-- These are like read filters but can be applied either on input, on output, or handled by the walker
-- Previous example of BAQ now uses the general framework
-- Resulted in massive conceptual cleanup of SAMDataSource and ReadProperties! Yeah!
-- BQSR now uses this framework. We can now do BQSR on input, on output, or within a walker
-- PrintReads now handles all read transformers in the walker in map, enabling us to parallelize PrintReads with BAQ and BQSR
-- Currently BQSR throws an exception when run in parallel, which a subsequent commit will fix
-- Removed global variable setting in GenomeAnalysisEngine for BAQ, as command line parameters are cleanly handled by ReadTransformer infrastructure
-- In principle ReadFilters are just a special kind of ReadTransformer, but this refactoring is larger than I can do. It's a JIRA entry
-- Many files touched simply due to the refactoring and renaming of classes
2012-08-31 13:42:41 -04:00
Mark DePristo
1200848bbf
Part II of GSA-462: Consistent RODBinding access across Ref and Read trackers
-- Deleted ReadMetaDataTracker
-- Added function to ReadShard to give us the span from the left most position of the reads in the shard to the right most, which is needed for the new view
2012-08-30 10:15:10 -04:00
Ryan Poplin
e12ae65d33
Changing the commenting style in the BQSR
2012-08-29 11:27:45 -04:00
Ryan Poplin
18eca3544e
Initial commit of the delocalized BQSR written as a read walker.
2012-08-28 15:24:20 -04:00
Mark DePristo
b3fd74f0c4
HaplotypeCaller forbids BAQ
2012-08-24 13:25:05 -04:00
Ryan Poplin
fe3069b278
Merged bug fix from Stable into Unstable
2012-08-22 14:40:34 -04:00
Ryan Poplin
e5cfdb4811
Bug fix for popular _Duplicate allele added to VariantContext_ error reported on the forum. It seems to be due to lower case bases in the reference being treated as reference mismatches. We would try to turn these mismatches into SNP events, for example c/C. We now uppercase the result from IndexedFastaSequenceFile.getSubsequenceAt()
2012-08-22 14:39:35 -04:00
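The effect of the uppercasing fix in e5cfdb4811 can be shown with a small sketch. It is written in Python and does not use the real `IndexedFastaSequenceFile` API; `mismatch_positions` is a hypothetical helper that stands in for the mismatch detection step.

```python
def mismatch_positions(ref_bases, read_bases):
    """Return the indices where the read differs from the reference."""
    # Soft-masked (lower-case) reference bases are normalized first, so that
    # 'c' versus 'C' is not reported as a mismatch and turned into a c/C event.
    ref = ref_bases.upper()
    return [i for i, (r, b) in enumerate(zip(ref, read_bases)) if r != b]
```

Without the `upper()` call, the first two positions of a reference such as `acGT` would be flagged against the read `ACGA`, even though only the last base truly differs.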
Guillermo del Angel
6a8cf1c84a
Enable and adapt HaplotypeScore and MappingQualityZero as active region annotations now that we have per-read likelihoods passed in to annotations
2012-08-21 14:35:40 -04:00
Guillermo del Angel
d0644b3565
Merge branch 'master' of ssh://gsa4.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-08-21 10:35:23 -04:00
Ryan Poplin
10961db3ce
Another round of FindBugs fixes. Object returns its internal reference to an externally mutable array. Very dangerous.
2012-08-21 09:35:55 -04:00
Ryan Poplin
605acaae9c
Another round of FindBugs fixes. Object internally stores a reference to an externally mutable array. Very dangerous.
2012-08-21 09:33:58 -04:00
Ryan Poplin
55b7949d68
Another round of FindBugs fixes. Comparator doesn't implement Serializable.
2012-08-21 09:20:55 -04:00
Guillermo del Angel
7bbd2a7a20
Fixing merge conflicts
2012-08-20 20:38:25 -04:00
Ryan Poplin
77fbaec044
Another round of FindBugs fixes. Class implements its own compareTo() but uses base Object.equals() which can lead to unpredictable behavior.
2012-08-20 16:55:00 -04:00
Ryan Poplin
a9472c1980
Another round of FindBugs fixes. Inefficient use of keySet iterator instead of entrySet iterator.
2012-08-20 16:11:45 -04:00
Ryan Poplin
464d49509a
Pulling out common caller arguments into its own StandardCallerArgumentCollection base class so that every caller isn't exposed to the unused arguments from every other caller.
2012-08-20 15:28:39 -04:00
Ryan Poplin
c67d708c51
Bug fix in HaplotypeCaller for non-regular bases in the reference or reads. Those events don't get created any more. Bug fix for advanced GenotypeFullActiveRegion mode: custom variant annotations created by the HC don't make sense when in this mode so don't try to calculate them.
2012-08-20 13:41:08 -04:00
Guillermo del Angel
963ad03f8b
Second step of interface cleanup for variant annotator: several bug fixes, don't hash pileup elements to Maps because the hashCode() for a pileup element is not implemented and strange things can happen. Still several things to do, not done yet
2012-08-19 21:18:18 -04:00
Guillermo del Angel
b61ecc7c19
Fix merge conflicts
2012-08-16 20:45:52 -04:00