Mauricio Carneiro
020b8b88ef
GATKSAMRecord refactor in the tools
...
No tools should create SAMRecords internally. This commit should move all internals of the current tools to GATKSAMRecord.
2011-11-03 17:33:42 -04:00
Mauricio Carneiro
e89ff063fc
GATKSAMRecord refactor
...
The GATK engine will now provide a GATKSAMRecord to all tools which incorporates the functionality used by the GATK to the bam file (ReadGroups, Reduced Reads, ...).
* No tools should create SAMRecord anymore, use GATKSAMRecord instead *
2011-11-03 15:43:26 -04:00
Ryan Poplin
f1df6c0c81
Misc cleanup in haplotype caller after incorporating Mark's FragmentCollection to merge overlapping read pairs.
2011-11-03 13:51:19 -04:00
Eric Banks
e8bceb1eaa
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-11-02 21:13:54 -04:00
Eric Banks
78a00d2ddc
Updating UG integration tests (needed updating only because the -mbq default is different from the old -mmq one).
2011-11-02 21:13:44 -04:00
Eric Banks
52b16bf739
Must check whether there's a normal vs. extended pileup before asking for it.
2011-11-02 20:45:24 -04:00
Eric Banks
e1edd6bd12
Removing the min mapping quality argument since it wasn't being used in the normal processing of the pileups in UG - only for indel pileups. Instead, we apply the min base quality to the reads in the pileup for indels and define it to be the min 'confidence' of the base. Docs are updated but I didn't rename the argument as I don't want people to complain.
2011-11-02 20:32:58 -04:00
Mauricio Carneiro
c22a14ee3b
Merged bug fix from Stable into Unstable
2011-11-02 17:53:56 -04:00
Mauricio Carneiro
e4a583a53f
Fixing docs: No -I in this walker
2011-11-02 17:53:32 -04:00
Ryan Poplin
e94fcf537b
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-11-02 16:29:19 -04:00
Ryan Poplin
4d35272916
Bug fixes with Mauricio to functions in ReadUtils used by reduced reads and the haplotype caller.
2011-11-02 16:29:10 -04:00
Mark DePristo
8a2929c1dd
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-11-02 16:21:00 -04:00
Mark DePristo
e2f40da27f
Scala script to run multi-sample analysis
2011-11-02 16:20:57 -04:00
Mark DePristo
bd977c2d92
Bug fix to avoid infinite loop in GATKScatterFunction
2011-11-02 16:20:42 -04:00
Eric Banks
967ff647b8
Reduced reads shouldn't contribute to Fisher Strand calculations
2011-11-02 13:07:20 -04:00
Eric Banks
cf0e699226
QualByDepth was inefficiently iterating over the pileup 2 times for some reason. Removed non-useful annotation classes.
2011-11-02 12:58:38 -04:00
Eric Banks
4501dce58d
Fixing merge conflict
2011-11-02 12:50:32 -04:00
Eric Banks
54331b44e9
New way of looking at the size of a pileup: there's a physical number of elements in the data structure and there's a representative depth of coverage (since a reduced read represents depth >= 1). The size() method has been removed because its meaning is ambiguous. Updated several annotations and the UG engine to make use of the representative depths.
2011-11-02 12:47:30 -04:00
Mark DePristo
c1da8cd5e7
Final version of bp-resolved locus scatter/gather
...
-- Minor refactoring to allow LocusScatterFunction to have maxIntervals be the original scatter count, rather than capping this by the interval count as Contig and Interval do
2011-11-02 11:26:34 -04:00
Mark DePristo
392e0aeace
Moved unit tests into master IntervalUtilsUnitTest
2011-11-02 10:52:00 -04:00
Mark DePristo
c2b97030a4
IntervalUtils for completely balanced locus-based scatter/gather
...
-- scatterLocusIntervals master utility
-- Moved around some general functionality from GenomeLocSortedSet to GenomeLoc
-- Util function for reversing a list (List<T> -> List<T>, unlike Collections version)
-- DoC is PartitionType.INTERVAL
-- Significant unit tests on new functionality (all passing)
-- Ready for real-world testing, as soon as I can get LocusScatterFunction.scala to actually work
2011-11-02 10:49:40 -04:00
Mark DePristo
5fc613f972
Better default partition types for walkers
...
-- Added PartitionType.READ, and associated ReadScatterFunction. ReadScatterFunction is literally just ContigScatterFunction until someone wants to implement something better
-- LocusWalkers (and subclasses RodWalkers and RefWalkers) are by default PartitionType.LOCUS.
2011-11-01 19:47:10 -04:00
Mauricio Carneiro
53c9f49050
Fixing contracts!
...
Forgot to revert the contract changes. This will fix Bamboo.
2011-11-01 18:09:29 -04:00
Mauricio Carneiro
7d194afda8
Revert "Using isReduceRead from GATKSAMRecord"
...
Apparently, casting SAMRecord to GATKSAMRecord is not allowed.
2011-11-01 17:54:30 -04:00
Mauricio Carneiro
36600fd8e9
added MQ of low MQ/BQ to consensus RMS
...
Bases that were excluded for MQ and BQ filters are now contributing to the MQ RMS (but not to consensus base counts and variant/not variant region triggers).
2011-11-01 17:46:12 -04:00
Mauricio Carneiro
18f4c63d44
Using isReduceRead from GATKSAMRecord
...
centralizing functionality of the reduced reads.
2011-11-01 17:15:58 -04:00
Mauricio Carneiro
b004489c6d
Moving ReduceRead TAG to GATKSAMRecord
...
ReduceReads are now a feature of a GATKSAMRecord, so the tag and the special methods needed to use it will now be housed by the GATKSAMRecord.
2011-11-01 17:12:09 -04:00
Mauricio Carneiro
2b200c34a6
Removing testEqualBases
...
No need for the test if ReduceReads is not producing '=' bases anymore.
2011-11-01 17:05:27 -04:00
Mauricio Carneiro
17cc484dbd
Revert "ReduceReads ref bases are now output as '='
...
Reducing the reference bases to '=' results in an extra compression of 13% on average. The GATK is not ready to handle files with '=' bases, and the decision was to implement this as engine support, not as part of ReduceReads.
2011-11-01 16:35:07 -04:00
Mauricio Carneiro
76c32f5409
Revert "Compressed read group information"
...
We decided not to compress read group information because read groups should be universally unique. The gain of 3% compression was not worth it.
This reverts commit 79f1c3b70de240d8060ecb9a86d2f1d4ff2a8efb.
2011-11-01 16:33:21 -04:00
Eric Banks
0839c75c8d
More minor fixes to docs
2011-10-31 21:49:27 -04:00
Eric Banks
74b018a1f3
Minor fixes to docs
2011-10-31 21:41:43 -04:00
Mauricio Carneiro
b51c36abb3
TEST: equal '=' base test implementation
2011-10-31 18:01:22 -04:00
Eric Banks
31ee5432c5
Merged bug fix from Stable into Unstable
2011-10-31 14:56:59 -04:00
David Roazen
cdde32acbd
Merged bug fix from Stable into Unstable
2011-10-31 14:21:15 -04:00
Eric Banks
f62af0291b
Check for invalid VCF records (not enough tokens) instead of assuming they are there.
2011-10-31 14:09:51 -04:00
Andrey Sivachenko
bed0acaed4
nWayOut now adds a PG tag to the header as it should. Also, an additional hidden option was added: keepPGTags. If invoked, IndelRealigner PG tags from previous runs (if any) are kept in the header and the new PG tag is simply added, instead of overriding them.
2011-10-31 12:28:28 -04:00
Mauricio Carneiro
7a8e49154d
Fixing contracts
...
forgot to rename the variables in the contracts. This should resolve Bamboo failures.
2011-10-31 09:21:19 -04:00
Mauricio Carneiro
ab4f58a5e9
A base for a test walker for reduce reads
...
this walker allows general testing on reduce reads given two bam files to make sure the reduce reads are not screwing up the format. This is just the base of the walker, the tests need to be implemented independently inside the framework.
2011-10-30 22:26:16 -04:00
Mauricio Carneiro
e220507ff9
Compressed read group information
...
Adding another 3% reduction in file size by compressing the read group ID.
2011-10-30 21:09:06 -04:00
Mauricio Carneiro
389380a590
ReduceReads ref bases are now output as '=' to save space
...
Restructured the sliding window framework to manipulate a wrapped version of the SAMRecord that contains information about the reference.
2011-10-30 12:04:39 -04:00
Mauricio Carneiro
dbd8c25787
No more R resources in the DPP
...
updating the DPP to conform with Analyze Covariates changes.
2011-10-28 16:57:01 -04:00
Khalid Shakir
e25d40882a
Swapping Thread.sleep(0) with Object.wait(0) caused Queue to lock up. Thanks to rpoplin for pointing it out.
2011-10-28 15:51:03 -04:00
Eric Banks
0ca7428e76
Allow processing of empty intervals, but warn user when this case is encountered.
2011-10-28 12:12:14 -04:00
Eric Banks
649dfe98f0
Add VCF header for any expressions that are requested
2011-10-28 10:22:19 -04:00
Eric Banks
8b1a62da27
Adding unit test to cover overlapping intervals from the same source with the intersection rule.
2011-10-28 09:59:43 -04:00
Eric Banks
057a79f598
This argument should be annotated as @Input
2011-10-28 09:44:49 -04:00
Eric Banks
4ba7c0cecd
Moving to private
2011-10-28 09:29:28 -04:00
Eric Banks
1bdd76c2f2
These tools now use the IntervalBinding system to handle intervals instead of doing it all manually
2011-10-28 09:28:12 -04:00
Eric Banks
6ba08a103d
Empty ROD files should generate an exception when used for creating intervals. Moved some now obsolete files to the archive as the realigner will now read all target intervals into memory.
2011-10-28 09:23:25 -04:00