Mark DePristo
24356f11b7
Merged bug fix from Stable into Unstable
...
-- Resolved conflict
Conflicts:
public/java/src/org/broadinstitute/sting/gatk/datasources/reads/SAMDataSource.java
2012-02-27 17:13:17 -05:00
Mark DePristo
0b29d54937
Changed most BAMSchedule ReviewedStingExceptions to UserExceptions
...
-- As these represent the bulk of the StingExceptions coming from BAMSchedule and are caused by simple problems like the user providing bad input tmp directories, etc.
2012-02-27 17:08:41 -05:00
Mark DePristo
f9e8e82e33
Removed unused class variable from VCFHeaderLineTranslator
2012-02-27 17:07:19 -05:00
Mark DePristo
100ddef930
Fix typo in VariantContextBuilder
2012-02-27 17:06:45 -05:00
Mark DePristo
ca0931c01f
Adding test for reading samtools VCF file
2012-02-27 17:05:50 -05:00
Menachem Fromer
33cf1368ba
Added options to add XHMM command-line parameters for discovery and genotyping
2012-02-27 16:03:02 -05:00
Eric Banks
bd944ab04f
Another test where we no longer print out 'NaN' for the AF.
2012-02-27 15:19:08 -05:00
Mark DePristo
5f7ccdcc01
Avoid calling getBasePileup when there's no pileup in NBaseCount annotation
2012-02-27 15:12:25 -05:00
Eric Banks
52871187d7
Adding integration test for file with no GTs. Also updated md5 for one other test (since we no longer print out 'NaN' for the AF).
2012-02-27 15:09:56 -05:00
Mark DePristo
729bb954e2
Throws ReviewedStingException for a bug when parent VariantContext argument is null
2012-02-27 15:09:00 -05:00
Eric Banks
998ed8fff3
Bug fix to deal with VCF records that don't have GTs. While in there, optimized a bunch of related functions (including removing a copy of the method calculateChromosomeCounts(); why did we have 2 copies? very dangerous).
2012-02-27 14:56:10 -05:00
Mark DePristo
4d9582de77
More general catching of Exceptions in interval reading to throw MalformedFile exception in all cases
...
-- Now throws UserException no matter what happens during the reading of the intervals file.
2012-02-27 14:02:26 -05:00
Mark DePristo
9712fed7a5
Trap SAMFormatException and rethrow as MalformatedBAM exception
...
-- Trap errors in header and rethrow
-- Wrap underlying iterator in MalformatedBAMErrorReformattingIterator
2012-02-27 13:52:50 -05:00
Eric Banks
1ea34058c2
Updating integration tests now that standard annotations support multiple alleles
2012-02-27 11:32:26 -05:00
Eric Banks
64754e7870
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-02-27 11:31:41 -05:00
Eric Banks
850c5d0db2
Enabling Rank Sum Tests for multi-allelics: use ref vs any alt allele.
2012-02-27 09:59:36 -05:00
Eric Banks
dfdf4f989b
Enabling Fisher Strand for multi-allelics: use the alt allele with max AC. Added minor optimization to the method in the VC.
2012-02-27 09:50:09 -05:00
Guillermo del Angel
16122bea8d
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-02-25 13:57:54 -05:00
Guillermo del Angel
dea35943d1
a) Bug fix in calling new functions that give indel bases and length from regular pileup in LocusIteratorByState, b) Added unit test to cover these.
2012-02-25 13:57:28 -05:00
Mark DePristo
c8a06e53c1
DoC now properly handles reference N bases + misc. additional cleanups
...
-- DoC now by default ignores bases with reference Ns, so these are not included in the coverage calculations at any stage.
-- Added option --includeRefNSites that will include them in the calculation
-- Added integration tests that ensure the per base tables (and so all subsequent calculations) work with and without reference N bases included
-- Reorganized command line options, tagging advanced options with @Advanced
2012-02-25 11:32:50 -05:00
Mark DePristo
50de1a3eab
Fixing bad VCFIntegration tests
...
-- Left disabled a test that should have been enabled
-- Didn't add the md5 to the test I actually added
-- Now VCFIntegrationTests should be working!
2012-02-25 11:26:36 -05:00
Mark DePristo
9bad51877e
Generalized gsafolkLSFLogs.py to gsafolkLogsForTableau.py
...
-- Now updates both LSF logs and filesystem sizes
-- New Tableau emails will include both LSF and FS info!
2012-02-24 15:58:24 -05:00
Mark DePristo
80b5c7ad21
Fix gitVersionNumbers script to not print git status messages to our file
2012-02-24 15:58:22 -05:00
Mark DePristo
747e1a728f
Script to recreate entire GATKLog db from scratch
...
Useful primarily as a reference. Sometimes necessary when low-level changes are made to the scripts, requiring all of the data to be reprocessed
2012-02-24 15:58:21 -05:00
Mark DePristo
e94a534076
Added dry run and verbose options to gsafolkLSFLogs
2012-02-24 15:58:20 -05:00
Mark DePristo
253bb46bcd
Add support to analyzeRunReports to tag xml logs with git version numbers
2012-02-24 15:58:19 -05:00
Guillermo del Angel
c9a4c74f7a
a) Bug fixes for last commit related to PileupElements (unit tests are forthcoming). b) Changes needed to make pool caller work in GENOTYPE_GIVEN_ALLELES mode. c) Bug fix (yet again) for UG when GENOTYPE_GIVEN_ALLELES and EMIT_ALL_SITES are on, there's no coverage at the site, and the input vcf has genotypes: the output vcf would still inherit genotypes from the input vcf. Now we build the vc from scratch instead of initializing it from the input vc, taking only the location and alleles from it.
2012-02-24 10:27:59 -05:00
Mauricio Carneiro
470375db58
added integration test for the ReduceReadsStash bug reported by Adam
2012-02-23 18:59:27 -05:00
Mauricio Carneiro
ee9a56ad27
Fix subtle bug in the ReduceReads stash reported by Adam
...
* The tailSet generated every time we flush the reads stash is still affected by subsequent clears because it is only a view backed by the original TreeSet, not an independent copy. This is dangerous, and there is a weird condition where the clear will affect it.
* Fix by creating a new set from the tailSet instead of trying to do magic with just the pointer.
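A minimal sketch of the underlying java.util.TreeSet behavior, not the actual ReduceReads code. The class name, method name, and integer elements are made up for illustration. It shows that a tailSet view empties when the original set is cleared, while a set copied from the view keeps its elements.

```java
import java.util.Arrays;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical demo class; the real stash holds reads, not integers.
class TailSetCopyDemo {
    // Returns {size of the live view, size of the copy} after clearing the original set.
    static int[] sizesAfterClear() {
        TreeSet<Integer> stash = new TreeSet<>(Arrays.asList(1, 2, 3, 4, 5));

        // tailSet returns a view backed by the original set, so later changes show through.
        SortedSet<Integer> view = stash.tailSet(3);
        // Copying the view into a new TreeSet detaches it from the original.
        SortedSet<Integer> copy = new TreeSet<>(stash.tailSet(3));

        stash.clear();
        return new int[] { view.size(), copy.size() };
    }

    public static void main(String[] args) {
        int[] sizes = sizesAfterClear();
        System.out.println("view size after clear: " + sizes[0]); // 0
        System.out.println("copy size after clear: " + sizes[1]); // 3
    }
}
```

The copy costs one extra allocation per flush, which is the price of not sharing state with the stash.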
2012-02-23 18:35:25 -05:00
Mark DePristo
e0c189909f
Added support for breakpoint alleles
...
-- See https://getsatisfaction.com/gsa/topics/support_vcf_4_1_structural_variation_breakend_alleles?utm_content=topic_link&utm_medium=email&utm_source=new_topic
-- Added integration test to ensure that we can parse and write out breakpoint example
2012-02-23 12:14:48 -05:00
Menachem Fromer
522ace6d57
CNV discovery is also a long-running job (depending on the number of samples)
2012-02-23 11:28:22 -05:00
Guillermo del Angel
6866a41914
Added functionality in pileups to not only determine whether there's an insertion or deletion following the current position, but to also get the indel length and involved bases - definitely needed for extended event removal, and needed for pool caller indel functionality.
2012-02-23 09:45:47 -05:00
Eric Banks
d34f07dba0
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-02-22 20:41:03 -05:00
Ryan Poplin
2b6c0939ab
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-02-22 19:00:38 -05:00
Ryan Poplin
8695738400
Bug fix in HaplotypeCaller's GENOTYPE_GIVEN_ALLELES mode for insertions greater than length 1. The allele being genotyped was off by one base pair.
2012-02-22 19:00:04 -05:00
Christopher Hartl
2c1b14d35e
Mostly small changes to my own scala scripts: .vcf.gz compatibility for output files, smarter beagle generation, simple script to scatter-gather combine variants. Whole genome indel calling now uses the gold standard indel set.
2012-02-22 17:20:04 -05:00
Christopher Hartl
9b61a398b3
Merge branch 'master' of ssh://ni.broadinstitute.org/humgen/gsa-scr1/chartl/dev/unstable
2012-02-22 17:18:10 -05:00
Ryan Poplin
ca7b5e068f
updating HaplotypeCaller integration tests after change to separate insertion and deletion GOP.
2012-02-22 15:23:24 -05:00
Ryan Poplin
e39638323b
Misc cleanup in HaplotypeCaller's HMM code now that we have separate GOP for insertions and deletions
2012-02-22 12:24:43 -05:00
Ryan Poplin
a611f86558
CalibrateGenotypeLikelihoods now accepts any number of external likelihood VCFs. We decided in the dev group to have the assigned name be a combination of the sample name provided in the VCF and the name provided to the rod binding.
2012-02-22 12:23:45 -05:00
Mauricio Carneiro
75783af6fc
int <-> BitSet conversion utils for MathUtils
...
* added unit tests.
2012-02-21 14:10:36 -05:00
Christopher Hartl
685bcaced2
Merge branch 'master' of ssh://ni.broadinstitute.org/humgen/gsa-scr1/chartl/dev/unstable
2012-02-21 13:53:37 -05:00
Guillermo del Angel
0f5674b95e
Redid fix for corner case when forming consensus with reads that start/end with insertions and that don't agree with each other in inserted bases: since I can't iterate over the elements of a HashMap because keys might change during iteration, and since I can't use ConcurrentHashMaps, the code now copies structure of (bases, number of times seen) into ArrayList, which can be addressed by element index in order to iterate on it.
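A rough sketch of the snapshot-then-index pattern described above, not the actual consensus code. The class, the mergePrefixes method, and the prefix-merging rule are hypothetical stand-ins. For simplicity it snapshots only the keys into an ArrayList, on the assumption that counts can be looked up in the map as needed, then loops by index so the map can be modified freely without a ConcurrentModificationException.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class; keys stand in for inserted bases, values for times seen.
class SnapshotIterationDemo {
    // Folds the count of each key that is a proper prefix of another key into the longer key.
    static Map<String, Integer> mergePrefixes(Map<String, Integer> counts) {
        Map<String, Integer> result = new HashMap<>(counts);
        // Snapshot the keys so the map can change while we loop by index.
        List<String> keys = new ArrayList<>(result.keySet());
        for (int i = 0; i < keys.size(); i++) {
            String shortKey = keys.get(i);
            if (!result.containsKey(shortKey)) continue; // already merged away
            for (int j = 0; j < keys.size(); j++) {
                String longKey = keys.get(j);
                boolean isProperPrefix = longKey.length() > shortKey.length()
                        && longKey.startsWith(shortKey);
                if (isProperPrefix && result.containsKey(longKey)) {
                    // Removing and updating keys here is safe: no map iterator is active.
                    int moved = result.remove(shortKey);
                    result.put(longKey, result.get(longKey) + moved);
                    break;
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("AC", 2);
        counts.put("ACGT", 3);
        counts.put("TT", 1);
        System.out.println(mergePrefixes(counts).get("ACGT")); // 5
    }
}
```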
2012-02-20 09:12:51 -05:00
Ryan Poplin
fe102a5d47
Fix for my renaming of the BQSR walker
2012-02-18 11:13:20 -05:00
Ryan Poplin
3d9eee4942
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-02-18 10:55:29 -05:00
Ryan Poplin
a8be96f63d
This caching in the BQSR seems to be too slow now that there are so many keys
2012-02-18 10:54:39 -05:00
Ryan Poplin
78718b8d6a
Adding Genotype Given Alleles mode to the HaplotypeCaller. It constructs the possible haplotypes via assembly and then injects the desired allele to be genotyped.
2012-02-18 10:31:26 -05:00
Guillermo del Angel
e724c63f2b
Reverting last commit until I learn how to effectively replicate and debug pipeline test failures, and until I also learn how to effectively remove a key from a HashMap that's being iterated on
2012-02-17 17:18:43 -05:00
Guillermo del Angel
f2ef8d1d23
Reverting last commit until I learn how to effectively replicate and debug pipeline test failures, and until I also learn how to effectively remove a key from a HashMap that's being iterated on
2012-02-17 17:15:53 -05:00
Guillermo del Angel
3e031a540f
Solve merge conflict
2012-02-17 10:56:03 -05:00