7986 Commits (d3a533b82e29d39d32af2c3a6992ef0a466ff09a)

Author SHA1 Message Date
Christopher Hartl d3a533b82e Revert "a"
This reverts commit 1175f50ddbf389f5da74d27dc725596582ae15af.
2011-11-09 11:22:26 -05:00
Christopher Hartl 5eaf800281 a 2011-11-09 11:22:20 -05:00
Christopher Hartl 091229e4db MVLikelihoodRatio now checks if the family string is provided before attempting to instantiate. Also check that variant contexts have both genotypes and genotype likelihoods.
Table codec now yells at users for not providing a HEADER with the table - parsing tables without a header line was causing the first line of the file to be eaten.
Table feature now has a toString method.

These are minor bug fixes.
2011-11-09 11:03:29 -05:00
Ryan Poplin b0e6afec48 Bug fix for HMM optimization. Need to also check the gap continuation penalty array for the index with the first discrepancy. 2011-11-08 14:51:25 -05:00
Ryan Poplin 0b181be61f Bug fix in SelectVariants when using a discordance track but no sample specifications. Added integration test to test this. 2011-11-07 15:25:16 -05:00
Eric Banks 6297561326 Adding the new jar 2011-11-07 15:08:19 -05:00
Eric Banks 86312a956a Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/stable 2011-11-07 14:59:49 -05:00
Ryan Poplin 2d1e385ca4 Adding note to VQSR docs about Rscript being needed in the environment PATH. 2011-11-07 14:04:13 -05:00
Eric Banks aa0c8c3600 Revving Tribble jar to v40. Our last jar was busted. 2011-11-07 11:30:08 -05:00
Eric Banks cdd40d1222 Removing contracts for the SimpleTimer 2011-11-06 22:22:49 -05:00
Eric Banks 90a053ea93 Don't change the mapping quality of MQ=255 reads in IR 2011-11-05 22:40:45 -04:00
Eric Banks de07e06cbc Merge remote-tracking branch 'unstable/master' 2011-11-04 15:57:13 -04:00
Mark DePristo e99871f587 Bug fix for decode loc
-- decodeLoc() wasn't skipping input header lines, so the system blew up when there was an = line being split.
2011-11-04 13:20:54 -04:00
Mark DePristo a340a1aeac Bug fix. decodeLoc() should update lineNo so you get a meaningful line number when indexing fails due to malformed VCF files.
2011-11-04 11:44:24 -04:00
Mark DePristo 849c0757f2 Bug fix for LocusScatterFunction when no intervals are provided
-- Now correctly grabs the reference contigs and cuts them all up, rather than throwing an NPE because intervalString == null.
2011-11-04 10:55:09 -04:00
Mark DePristo 9f260c0dc1 Zero byte index bug fix for RandomlySplitVariants + cleanup
-- vcfWriter2 was never being closed in onTraversalDone(), so the on-the-fly index file was being created but never properly written to disk.

-- This bug is ultimately due to the inability of the GATK to allow multiple VCF output writers as @Output arguments, though

-- Removed the unnecessary local variable iFraction (= 1000 * the input fraction argument). Now the system just draws a random double and compares it to the input fraction directly. Is there some subtle reason I don't appreciate for this programming construct?
2011-11-04 09:45:20 -04:00
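
The two points in the commit above (comparing a random double to the fraction directly, and closing every output writer) can be illustrated with a minimal sketch. This is not the RandomlySplitVariants code: the function names split_by_fraction and write_split, the plain-text records, and the seed parameter are all hypothetical, and Python is used only for brevity.

```python
import random
from typing import Iterable, List, Tuple


def split_by_fraction(records: Iterable[str], fraction: float,
                      rng: random.Random) -> Tuple[List[str], List[str]]:
    """Route each record to the first output with probability `fraction`."""
    first: List[str] = []
    second: List[str] = []
    for record in records:
        # Compare a uniform double in [0, 1) with the fraction directly;
        # no scaled integer threshold (such as 1000 * fraction) is needed.
        if rng.random() < fraction:
            first.append(record)
        else:
            second.append(record)
    return first, second


def write_split(records: Iterable[str], fraction: float,
                path1: str, path2: str, seed: int = 0) -> Tuple[int, int]:
    """Write the two halves to two files (hypothetical helper)."""
    first, second = split_by_fraction(records, fraction, random.Random(seed))
    # Both writers are opened in one `with` statement, so both are closed
    # on exit, even if writing to one of them raises.
    with open(path1, "w") as out1, open(path2, "w") as out2:
        out1.writelines(line + "\n" for line in first)
        out2.writelines(line + "\n" for line in second)
    return len(first), len(second)
```

With fraction 0.0 every record goes to the second output, and with fraction 1.0 every record goes to the first, because random() never returns 1.0.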
Mark DePristo 5a47c3c8a0 Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-11-04 09:36:42 -04:00
Mauricio Carneiro 020b8b88ef GATKSAMRecord refactor in the tools
No tools should create SAMRecords internally. This commit should move all internals of the current tools to GATKSAMRecord.
2011-11-03 17:33:42 -04:00
Mauricio Carneiro e89ff063fc GATKSAMRecord refactor
The GATK engine will now provide all tools with a GATKSAMRecord, which incorporates the functionality the GATK adds on top of the BAM file (ReadGroups, Reduced Reads, ...).

* No tools should create SAMRecord anymore, use GATKSAMRecord instead *
2011-11-03 15:43:26 -04:00
Ryan Poplin f1df6c0c81 Misc cleanup in haplotype caller after incorporating Mark's FragmentCollection to merge overlapping read pairs. 2011-11-03 13:51:19 -04:00
Mark DePristo 748f8f1edc Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-11-03 10:06:48 -04:00
Mark DePristo c7f51e92a0 Mostly working version of multi-sample analysis qscript 2011-11-02 23:00:17 -04:00
Eric Banks e8bceb1eaa Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-11-02 21:13:54 -04:00
Eric Banks 78a00d2ddc Updating UG integration tests (needed updating only because the -mbq default is different from the old -mmq one). 2011-11-02 21:13:44 -04:00
Eric Banks 52b16bf739 Must check whether there's a normal vs. extended pileup before asking for it. 2011-11-02 20:45:24 -04:00
Eric Banks e1edd6bd12 Removing the min mapping quality argument since it wasn't being used in the normal processing of the pileups in UG - only for indel pileups. Instead, we apply the min base quality to the reads in the pileup for indels and define it to be the min 'confidence' of the base. Docs are updated but I didn't rename the argument as I don't want people to complain. 2011-11-02 20:32:58 -04:00
Mauricio Carneiro c22a14ee3b Merged bug fix from Stable into Unstable 2011-11-02 17:53:56 -04:00
Mauricio Carneiro e4a583a53f Fixing docs: No -I in this walker 2011-11-02 17:53:32 -04:00
Ryan Poplin e94fcf537b Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-11-02 16:29:19 -04:00
Ryan Poplin 4d35272916 Bug fixes with Mauricio to functions in ReadUtils used by reduced reads and the haplotype caller. 2011-11-02 16:29:10 -04:00
Mark DePristo 8a2929c1dd Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-11-02 16:21:00 -04:00
Mark DePristo e2f40da27f Scala script to run multi-sample analysis 2011-11-02 16:20:57 -04:00
Mark DePristo bd977c2d92 Bug fix to avoid infinite loop in GATKScatterFunction 2011-11-02 16:20:42 -04:00
Eric Banks 967ff647b8 Reduced reads shouldn't contribute to Fisher Strand calculations 2011-11-02 13:07:20 -04:00
Eric Banks cf0e699226 QualByDepth was inefficiently iterating over the pileup 2 times for some reason. Removed non-useful annotation classes. 2011-11-02 12:58:38 -04:00
Eric Banks 4501dce58d Fixing merge conflict 2011-11-02 12:50:32 -04:00
Eric Banks 54331b44e9 New way of looking at the size of a pileup: there's a physical number of elements in the data structure and there's a representative depth of coverage (since a reduced read represents depth >= 1). The size() method has been removed because its meaning is ambiguous. Updated several annotations and the UG engine to make use of the representative depths. 2011-11-02 12:47:30 -04:00
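
The distinction drawn in the commit above, between the physical number of elements and the representative depth, can be sketched as follows. This is an illustration only: the PileupElement class, its representative_count field, and both function names are hypothetical and are not the GATK pileup API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PileupElement:
    base: str
    # Hypothetical field: how many original reads this element stands for.
    # An ordinary read counts as 1; a reduced read may count as more.
    representative_count: int = 1


def number_of_elements(pileup: List[PileupElement]) -> int:
    """Physical number of elements stored in the data structure."""
    return len(pileup)


def depth_of_coverage(pileup: List[PileupElement]) -> int:
    """Representative depth: each element weighted by the reads it represents."""
    return sum(element.representative_count for element in pileup)
```

A single size() method cannot say which of the two numbers it returns, which is the ambiguity the commit message describes.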
Mark DePristo c1da8cd5e7 Final version of bp-resolved locus scatter/gather
-- Minor refactoring to allow LocusScatterFunction to have maxIntervals be the original scatter count, rather than capping this by the interval count as Contig and Interval do
2011-11-02 11:26:34 -04:00
Mark DePristo 392e0aeace Moved unit tests into master IntervalUtilsUnitTest 2011-11-02 10:52:00 -04:00
Mark DePristo c2b97030a4 IntervalUtils for completely balanced locus-based scatter/gather
-- scatterLocusIntervals master utility
-- Moved around some general functionality from GenomeLocSortedSet to GenomeLoc
-- Util function for reversing a list (List<T> -> List<T>, unlike Collections version)
-- DoC is PartitionType.INTERVAL
-- Significant unit tests on new functionality (all passing)
-- Ready for real-world testing, as soon as I can get LocusScatterFunction.scala to actually work
2011-11-02 10:49:40 -04:00
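
One way to picture a completely balanced locus-based scatter is sketched below. It is an assumption about the general technique, not the scatterLocusIntervals implementation: the function name scatter_by_locus, the tuple representation of an interval, and the rule for distributing leftover bases are all hypothetical. Intervals are assumed to be sorted and non-overlapping.

```python
from typing import List, Tuple

# (contig, start, stop), 1-based and inclusive; a hypothetical representation.
Interval = Tuple[str, int, int]


def scatter_by_locus(intervals: List[Interval],
                     n_parts: int) -> List[List[Interval]]:
    """Split intervals into n_parts whose base counts differ by at most one."""
    total = sum(stop - start + 1 for _, start, stop in intervals)
    if n_parts < 1 or n_parts > total:
        raise ValueError("n_parts must be between 1 and the total base count")
    pending = list(reversed(intervals))  # stack with the next interval on top
    parts: List[List[Interval]] = []
    for i in range(n_parts):
        # The first (total % n_parts) parts take one extra base.
        quota = total // n_parts + (1 if i < total % n_parts else 0)
        part: List[Interval] = []
        while quota > 0:
            contig, start, stop = pending.pop()
            size = stop - start + 1
            if size <= quota:
                part.append((contig, start, stop))
                quota -= size
            else:
                # Cut the interval: keep `quota` bases, push back the rest.
                part.append((contig, start, start + quota - 1))
                pending.append((contig, start + quota, stop))
                quota = 0
        parts.append(part)
    return parts
```

Because intervals are cut at base-pair resolution, the number of parts is limited by the total base count rather than by the interval count, which matches the later note that LocusScatterFunction keeps the original scatter count.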
Mark DePristo 5fc613f972 Better default partition types for walkers
-- Added PartitionType.READ, and associated ReadScatterFunction.  ReadScatterFunction is literally just ContigScatterFunction until someone wants to implement something better
-- LocusWalkers (and subclasses RodWalkers and RefWalkers) are by default PartitionType.LOCUS.
2011-11-01 19:47:10 -04:00
Mauricio Carneiro 53c9f49050 Fixing contracts!
Forgot to revert the contract changes. This will fix bamboo.
2011-11-01 18:09:29 -04:00
Mauricio Carneiro 7d194afda8 Revert "Using isReduceRead from GATKSAMRecord"
Apparently casting SAMRecord to GATKSAMRecord is not allowed.
2011-11-01 17:54:30 -04:00
Mauricio Carneiro 36600fd8e9 added MQ of low MQ/BQ to consensus RMS
Bases that were excluded for MQ and BQ filters are now contributing to the MQ RMS (but not to consensus base counts and variant/not variant region triggers).
2011-11-01 17:46:12 -04:00
Mauricio Carneiro 18f4c63d44 Using isReduceRead from GATKSAMRecord
Centralizing functionality of the reduced reads.
2011-11-01 17:15:58 -04:00
Mauricio Carneiro b004489c6d Moving ReduceRead TAG to GATKSAMRecord
ReduceReads are now a feature of a GATKSAMRecord, so the tag and the special methods needed to use it will now be housed by the GATKSAMRecord.
2011-11-01 17:12:09 -04:00
Mauricio Carneiro 2b200c34a6 Removing testEqualBases
No need for the test if ReduceReads is not producing '=' bases anymore.
2011-11-01 17:05:27 -04:00
Mauricio Carneiro 17cc484dbd Revert "ReduceReads ref bases are now output as '='"
Reducing the reference bases to '=' results in an extra compression of 13% on average. The GATK is not ready to handle files with '=' bases, and the decision was to implement this as engine support, not as part of ReduceReads.
2011-11-01 16:35:07 -04:00
Mauricio Carneiro 76c32f5409 Revert "Compressed read group information"
We decided not to compress read group information because read groups should be universally unique. The gain of 3% compression was not worth it.

This reverts commit 79f1c3b70de240d8060ecb9a86d2f1d4ff2a8efb.
2011-11-01 16:33:21 -04:00
Eric Banks 0839c75c8d More minor fixes to docs 2011-10-31 21:49:27 -04:00