Ryan Poplin
4a1e8ecbb7
Updating HaplotypeCaller qscript to work with most recent Queue
2011-11-06 12:53:54 -05:00
Ryan Poplin
5c565d28b9
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-11-06 10:26:19 -05:00
Ryan Poplin
ebdced412c
misc cleanup
2011-11-06 10:26:07 -05:00
Eric Banks
3517489a22
Better --sample selection integration test for VE. The previous one would return true even if --sample was not working at all.
2011-11-06 01:07:49 -04:00
Eric Banks
1c4e429a1c
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-11-06 00:05:56 -04:00
Eric Banks
a12bc63e5c
Get rid of support for BAMs without sample information in the read groups. This hidden option wasn't being used anyway because it wasn't hooked up properly in the AlignmentContext.
2011-11-05 23:54:28 -04:00
Eric Banks
ad57bcd693
Adding integration test to cover using expressions with IDs (-E foo.ID)
2011-11-05 23:53:15 -04:00
Eric Banks
e89fe7770e
Merged bug fix from Stable into Unstable
2011-11-05 22:41:21 -04:00
Eric Banks
90a053ea93
Don't change the mapping quality of MQ=255 reads in IR
2011-11-05 22:40:45 -04:00
Ryan Poplin
611a395783
Now properly extending candidate haplotypes with bases from the reference context instead of filling with padding bases. Functionality in the private Haplotype class is no longer necessary so removing it. No need to have four different Haplotype classes in the GATK.
2011-11-05 12:18:56 -04:00
Mauricio Carneiro
fd09d92801
Added contracts to SyntheticRead class
2011-11-04 16:27:11 -04:00
Mauricio Carneiro
839cab7427
Generalizing RunningConsensus to SyntheticRead
...
the class structure of the RunningConsensus is now going to represent both consensus and missing data objects.
2011-11-04 16:27:11 -04:00
Mauricio Carneiro
56674e3518
Removing SlidingReads
...
Since we're not doing '=' conversion inside the Reduce Reads walker, there is no real need for this class.
2011-11-04 16:27:11 -04:00
Mauricio Carneiro
18837e3e32
Moving SlidingRead functionality into SlidingWindow
2011-11-04 16:27:11 -04:00
Mauricio Carneiro
bf822172c7
Renaming the compressor classes
...
Naming scheme was confusing for the consensus classes. Now that we'll have multiple types of running consensus, I felt like a new naming scheme was necessary.
2011-11-04 16:27:11 -04:00
Mauricio Carneiro
56a6bb6e98
Fixing BAMRecord illegal access
...
Can't use getReadName on a read that has been completely hardclipped. The BAMRecord doesn't like it.
2011-11-04 16:27:11 -04:00
Eric Banks
de07e06cbc
Merge remote-tracking branch 'unstable/master'
2011-11-04 15:57:13 -04:00
Ryan Poplin
888d3b4fdc
Initial graph pruning algorithm for the assembler.
2011-11-04 14:02:50 -04:00
Mark DePristo
e99871f587
Bug fix for decode loc
...
-- decodeLoc() wasn't skipping input header lines, so the system blew up when there was an = line being split.
2011-11-04 13:20:54 -04:00
Mark DePristo
a340a1aeac
Bug fix. decodeLoc() should update lineNo so you get meaningful line no when indexing
...
due to malformed VCF files.
2011-11-04 11:44:24 -04:00
Mark DePristo
849c0757f2
Bug fix for LocusScatterFunction when no intervals are provided
...
-- Now correctly grabs reference contigs and cuts them all up, rather than NPE as intervalString == null.
2011-11-04 10:55:09 -04:00
Mark DePristo
9f260c0dc1
Zero byte index bug fix for RandomlySplitVariants + cleanup
...
-- vcfWriter2 was never being closed in onTraversalDone(), so the on-the-fly index file was being created but never actually written out properly.
-- This bug is ultimately due to the inability of the GATK to allow multiple VCF output writers as @Output arguments, though
-- Removed the unnecessary local variable iFraction (= 1000 * the input fraction argument). Now the system just draws a random double and compares it directly to the input fraction. Is there some subtle reason I don't appreciate for this programming construct?
2011-11-04 09:45:20 -04:00
Mark DePristo
5a47c3c8a0
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-11-04 09:36:42 -04:00
Mauricio Carneiro
020b8b88ef
GATKSAMRecord refactor in the tools
...
No tools should create SAMRecords internally. This commit should move all internals of the current tools to GATKSAMRecord.
2011-11-03 17:33:42 -04:00
Mauricio Carneiro
e89ff063fc
GATKSAMRecord refactor
...
The GATK engine will now provide all tools with a GATKSAMRecord, which incorporates the functionality the GATK adds on top of the BAM file (ReadGroups, Reduced Reads, ...).
* No tools should create SAMRecord anymore, use GATKSAMRecord instead *
2011-11-03 15:43:26 -04:00
Ryan Poplin
f1df6c0c81
Misc cleanup in haplotype caller after incorporating Mark's FragmentCollection to merge overlapping read pairs.
2011-11-03 13:51:19 -04:00
Mark DePristo
748f8f1edc
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-11-03 10:06:48 -04:00
Mark DePristo
c7f51e92a0
Mostly working version of multi-sample analysis qscript
2011-11-02 23:00:17 -04:00
Eric Banks
e8bceb1eaa
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-11-02 21:13:54 -04:00
Eric Banks
78a00d2ddc
Updating UG integration tests (needed updating only because the -mbq default is different from the old -mmq one).
2011-11-02 21:13:44 -04:00
Eric Banks
52b16bf739
Must check whether there's a normal vs. extended pileup before asking for it.
2011-11-02 20:45:24 -04:00
Eric Banks
e1edd6bd12
Removing the min mapping quality argument since it wasn't being used in the normal processing of the pileups in UG - only for indel pileups. Instead, we apply the min base quality to the reads in the pileup for indels and define it to be the min 'confidence' of the base. Docs are updated but I didn't rename the argument as I don't want people to complain.
2011-11-02 20:32:58 -04:00
Mauricio Carneiro
c22a14ee3b
Merged bug fix from Stable into Unstable
2011-11-02 17:53:56 -04:00
Mauricio Carneiro
e4a583a53f
Fixing docs: No -I in this walker
2011-11-02 17:53:32 -04:00
Ryan Poplin
e94fcf537b
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-11-02 16:29:19 -04:00
Ryan Poplin
4d35272916
Bug fixes with Mauricio to functions in ReadUtils used by reduced reads and the haplotype caller.
2011-11-02 16:29:10 -04:00
Mark DePristo
8a2929c1dd
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-11-02 16:21:00 -04:00
Mark DePristo
e2f40da27f
Scala script to run multi-sample analysis
2011-11-02 16:20:57 -04:00
Mark DePristo
bd977c2d92
Bug fix to avoid infinite loop in GATKScatterFunction
2011-11-02 16:20:42 -04:00
Eric Banks
967ff647b8
Reduced reads shouldn't contribute to Fisher Strand calculations
2011-11-02 13:07:20 -04:00
Eric Banks
cf0e699226
QualByDepth was inefficiently iterating over the pileup 2 times for some reason. Removed non-useful annotation classes.
2011-11-02 12:58:38 -04:00
Eric Banks
4501dce58d
Fixing merge conflict
2011-11-02 12:50:32 -04:00
Eric Banks
54331b44e9
New way of looking at the size of a pileup: there's a physical number of elements in the data structure and there's a representative depth of coverage (since a reduced read represents depth >= 1). The size() method has been removed because its meaning is ambiguous. Updated several annotations and the UG engine to make use of the representative depths.
2011-11-02 12:47:30 -04:00
Mark DePristo
c1da8cd5e7
Final version of bp-resolved locus scatter/gather
...
-- Minor refactoring to allow LocusScatterFunction to have maxIntervals be the original scatter count, rather than capping this by the interval count as Contig and Interval do
2011-11-02 11:26:34 -04:00
Mark DePristo
392e0aeace
Moved unit tests into master IntervalUtilsUnitTest
2011-11-02 10:52:00 -04:00
Mark DePristo
c2b97030a4
IntervalUtils for completely balanced locus-based scatter/gather
...
-- scatterLocusIntervals master utility
-- Moved around some general functionality from GenomeLocSortedSet to GenomeLoc
-- Util function for reversing a list (List<T> -> List<T>, unlike Collections version)
-- DoC is PartitionType.INTERVAL
-- Significant unit tests on new functionality (all passing)
-- Ready for real-world testing, as soon as I can get LocusScatterFunction.scala to actually work
2011-11-02 10:49:40 -04:00
Mark DePristo
5fc613f972
Better default partition types for walkers
...
-- Added PartitionType.READ, and associated ReadScatterFunction. ReadScatterFunction is literally just ContigScatterFunction until someone wants to implement something better
-- LocusWalkers (and subclasses RodWalkers and RefWalkers) are by default PartitionType.LOCUS.
2011-11-01 19:47:10 -04:00
Mauricio Carneiro
53c9f49050
Fixing contracts!
...
Forgot to revert the contract changes. This will fix bamboo.
2011-11-01 18:09:29 -04:00
Mauricio Carneiro
7d194afda8
Revert "Using isReduceRead from GATKSAMRecord"
...
Apparently casting SAMRecord to GATKSAMRecord is not allowed.
2011-11-01 17:54:30 -04:00
Mauricio Carneiro
36600fd8e9
added MQ of low MQ/BQ to consensus RMS
...
Bases that were excluded for MQ and BQ filters are now contributing to the MQ RMS (but not to consensus base counts and variant/not variant region triggers).
2011-11-01 17:46:12 -04:00