Matt Hanna
3ba918aff1
Error message cleanup in BAM indexing code.
2012-01-17 11:05:42 -05:00
Matt Hanna
cd43f016ce
Fixed NPE in getNextOverlappingBAMScheduleEntry() when mixed mapped/unmapped interval lists are used. Added integration test to verify behavior.
2012-01-12 13:29:11 -05:00
Mark DePristo
2e47336a81
Only print out error report for most recent release in runGATKReport.py
2012-01-11 08:54:46 -05:00
Khalid Shakir
ef50e77ee2
When running Queue jobs locally, merge the stderr to the stdout log if the error file is NOT specified.
...
Updated VE strats in the HSP for plotting Ka/Ks by AC.
2012-01-10 16:10:25 -05:00
Matt Hanna
dc60757b68
Eliminate unnecessary strong references (and therefore memory held) by tree reduce entries that have already been processed.
...
Thanks to Tim Fennell for the bug report.
2012-01-09 23:04:53 -05:00
Mark DePristo
845c0b1c66
Merge branch 'master' of ssh://depristo@gsa1/humgen/gsa-scr1/gsa-engineering/git/stable
2012-01-09 08:40:59 -05:00
Mark DePristo
f5add25c72
Improved formatting of queueStatus
2012-01-09 08:40:53 -05:00
Matt Hanna
1f1233b669
Fix for a rare but insidious bug in position tracking during async BAM file reading.
...
Thanks to Khalid for spotting and reporting the issue.
2012-01-08 22:03:35 -05:00
Mark DePristo
63b7a70c44
Removing very costly analyses of all GATK versions. Will be replaced by Tableau website
2012-01-06 18:13:19 -05:00
Mark DePristo
c96fee477c
Bug fix for VariantSummary
...
-- In call sets, indels > 50 bp in length are tagged as CNVs (following the 1000 Genomes convention), and the code was unconditionally checking whether each CNV is already known by looking at the known CNVs file, which is optional. Fixed. This has the annoying side effect that indels > 50 bp in size are not counted as indels, and so are subtracted from both the novel and known counts for indels. C'est la vie
-- Added integration test to check for this case, using Mauricio's most recent VCF file for NA12878, which has many large indels. Using this more recent and representative file is probably a good idea for future tests in VE and other tools. The file is NA12878.HiSeq.WGS.b37_decoy.indel.recalibrated.vcf in Validation_Data
2012-01-05 21:51:06 -05:00
Eric Banks
18ed954741
Compute Ti/Tv only if bi-allelic
2012-01-05 15:33:26 -05:00
Khalid Shakir
253a07fdb1
Implicit conversion issue/bug: the QScript String<==>File shortcuts apply at compile time only, so they do not make String.equals(File) return true at runtime.
2012-01-03 18:43:45 -05:00
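The pitfall named in commit 253a07fdb1 can be illustrated in plain Java. This is only a sketch, not the QScript code. The class name StringFileEquality and the helper samePath are hypothetical. It shows that String.equals(Object) returns false for a File argument even when the path text matches, so the comparison has to be made on the path string explicitly.

```java
import java.io.File;

public class StringFileEquality {
    // Hypothetical helper: compare the path text explicitly instead of
    // relying on String.equals(Object), which is false for any non-String.
    static boolean samePath(String path, File file) {
        return file != null && path.equals(file.getPath());
    }

    public static void main(String[] args) {
        String path = "sample.bam";
        File file = new File("sample.bam");

        // Compiles because equals takes an Object, but a String never equals a File.
        System.out.println(path.equals(file));    // prints false

        // Comparing the underlying path strings gives the intended answer.
        System.out.println(samePath(path, file)); // prints true
    }
}
```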
Mauricio Carneiro
9b55505c03
Fixing PairHMMIndelErrorModel array out of bounds
...
This error was due to the ReadClipper change of contract. Previously the read utils would return null if a read was entirely clipped; now they return an empty (safe) GATKSAMRecord.
2012-01-03 18:08:46 -05:00
David Roazen
ea6e718cb8
SnpEff 2.0.5 support. Re-enabled SnpEff in the HybridSelectionPipeline.
...
For now, we recommend only running with the GRCh37.64 database.
2012-01-03 15:18:36 -05:00
David Roazen
f3f01da1af
Enforce serial dependencies in RecalibrationWalkersIntegrationTest
...
Some tests in this class were intermittently not being executed due
to being randomly scheduled before tests whose results they depend on.
Now the serial dependencies are enforced to avoid problematic orderings.
2012-01-03 10:42:41 -05:00
Mauricio Carneiro
1b6d52817e
fixing adaptor clipping effect on recalibration integration test
2012-01-01 22:20:06 -05:00
Eric Banks
b0d68eb0e3
Merge remote-tracking branch 'unstable/master'
2011-12-31 20:26:44 -05:00
Mauricio Carneiro
55cfa76cf3
Updated integration tests for the new adaptor clipping fix.
2011-12-30 18:47:14 -05:00
Mauricio Carneiro
c7d0a9ebee
Forgot to test for inter-chromosomal mates in the adaptor clipping
...
* Fixing bug caught by Eric (and Kristian)
2011-12-30 00:19:53 -05:00
Matt Hanna
a259bfefd4
First commit addressing problems running RTC in parallel.
...
Turns out that because the RTC is the first walker to 'correctly' tree reduce according to functional programming
standards, the RTC has revealed a few problems with the tree reducer holding on to too much data. This is the first
and smaller of two commits to reduce memory consumption. The second commit will likely be pushed after GATK1.4 is
released.
2011-12-29 16:22:14 -05:00
Matt Hanna
e6e80e8d3f
Update Picard to fix a bug Mauricio found in Picard where Picard unnecessarily depends on Snappy during some usages of SortingCollection.
2011-12-29 14:35:02 -05:00
Roger Zurawicki
efe33a0a1b
BUG FIX: Output is now correct
...
The output reported zero coverage because the pileup was filtered using the wrong method
Signed-off-by: Mauricio Carneiro <carneiro@broadinstitute.org>
2011-12-28 23:05:43 -05:00
Roger Zurawicki
5672688a73
Optimized CoverageByRG and Added GCContent
...
- CoverageByRG now uses a hashmap for its value instead of a list. It runs about 4 times faster.
- Cleaned up some of the code
- CoverageByRG now calculates GCContent
Signed-off-by: Mauricio Carneiro <carneiro@broadinstitute.org>
2011-12-28 15:25:07 -05:00
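The map-based accumulation mentioned in commit 5672688a73 might look roughly like the following. This is a minimal sketch under assumptions: the class CoverageByReadGroupSketch and the method countByReadGroup are hypothetical and are not taken from the CoverageByRG walker. It only shows why keying counts by read group in a HashMap avoids repeated scans of a list.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CoverageByReadGroupSketch {
    // Hypothetical helper: count how many reads belong to each read group.
    // A HashMap gives constant-time lookup per read; a list of
    // (readGroup, count) pairs would need a linear scan for each read.
    static Map<String, Integer> countByReadGroup(List<String> readGroupIds) {
        Map<String, Integer> counts = new HashMap<>();
        for (String readGroup : readGroupIds) {
            counts.merge(readGroup, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> pileup = Arrays.asList("rg1", "rg2", "rg1");
        Map<String, Integer> counts = countByReadGroup(pileup);
        System.out.println(counts.get("rg1")); // prints 2
        System.out.println(counts.get("rg2")); // prints 1
    }
}
```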
Roger Zurawicki
0c05998c4c
Added CoverageByRG LocusWalker
...
Will take any number of input bams and intervals
Returns a ReportTable with Average Coverage of each Read Group per Interval
Signed-off-by: Mauricio Carneiro <carneiro@broadinstitute.org>
2011-12-28 15:25:07 -05:00
Mauricio Carneiro
f692911903
GATKSAMRecord emptyRead static constructor
...
* Creates an empty GATKSAMRecord with empty (not null) Cigar, bases and quals. Allows empty reads to be probed without breaking.
* All ReadClipper utilities now emit empty reads for fully clipped reads
2011-12-27 17:01:17 -05:00
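The contract described in commit f692911903 (return an empty read rather than null when a read is fully clipped) can be sketched as below. The Read class, emptyRead(), and clipStart() here are hypothetical stand-ins for illustration and are not the GATKSAMRecord or ReadClipper API.

```java
public class EmptyReadSketch {
    // Hypothetical, simplified read: just bases and base qualities.
    static final class Read {
        final String bases;
        final String quals;

        Read(String bases, String quals) {
            this.bases = bases;
            this.quals = quals;
        }

        // Empty (not null) bases and quals, so callers can probe it safely.
        static Read emptyRead() {
            return new Read("", "");
        }

        boolean isEmpty() {
            return bases.isEmpty();
        }

        int length() {
            return bases.length();
        }
    }

    // Clip the first n bases. A fully clipped read yields an empty read
    // instead of null, so callers need no null check.
    static Read clipStart(Read read, int n) {
        int clip = Math.max(0, n);
        if (clip >= read.length()) {
            return Read.emptyRead();
        }
        return new Read(read.bases.substring(clip), read.quals.substring(clip));
    }

    public static void main(String[] args) {
        Read read = new Read("ACGT", "IIII");
        System.out.println(clipStart(read, 2).bases);     // prints GT
        System.out.println(clipStart(read, 4).isEmpty()); // prints true
        System.out.println(clipStart(read, 4).length());  // prints 0
    }
}
```

With this contract, downstream code can call length() or isEmpty() on the result unconditionally, which is the kind of probing the commit message says should no longer break.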
Mauricio Carneiro
8259c748f2
No more Filtered Reads tag.
...
All synthetic reads are marked with the reduced read tag.
2011-12-27 17:01:17 -05:00
Ryan Poplin
ef31b2f0a7
fixing merge conflicts.
2011-12-27 14:26:36 -05:00
Ryan Poplin
4f09a95221
Updating HaplotypeCaller for the new contracts in the adapter clipping.
2011-12-27 14:25:03 -05:00
Mauricio Carneiro
17bfe48d5e
Made all class methods private in the ReadClipper
...
* ReadClipperUnitTest now uses static methods
* Haplotype caller now uses static methods
* Exon Junction Genotyper now uses static methods
2011-12-27 02:11:32 -05:00
Mauricio Carneiro
ce493bf257
Added adaptor clipping to ReduceReads
...
* made all clipping steps optional with arguments.
2011-12-27 01:19:06 -05:00
Mauricio Carneiro
f7a5752025
Let this one slip through my commits.
2011-12-26 21:55:02 -05:00
Mauricio Carneiro
c1eaf7cf81
ReduceReads now allows different context sizes for different events
...
* Rename contextSize to contextSizeMismatches
* Indel context size is now different from mismatches context size
2011-12-26 21:17:29 -05:00
Mauricio Carneiro
4633637af6
Moved ReduceReads to static ReadClipper
...
* all clipping done in ReduceReads is done using the static methods of the ReadClipper now.
2011-12-26 21:14:40 -05:00
Mauricio Carneiro
9aa1c0c6e5
Better documentation and contracts for ReduceReads
...
* added javadoc to all methods
* added GATKDocs style documentation to the ReduceReadsWalker
* revised contracts and made explicit in the documentation
2011-12-26 21:12:23 -05:00
Mauricio Carneiro
3051cdf9c5
fixed reduced reads integration tests
2011-12-26 21:12:22 -05:00
Mauricio Carneiro
256a7d8bd2
fixing the arguments for RRead script
2011-12-26 21:12:22 -05:00
Eric Banks
dd990061f6
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-12-26 14:45:35 -05:00
Eric Banks
2130b39f33
Found the bug in the engine: RodLocusView was using the wrong seek method so that it would only move to the first locus of a shard (and with multi-locus shards, this meant that we never processed RODs from the other positions). In fact, because the seek(Shard) method is extremely misleading and now no longer used, I think it's safer to delete it and make everyone use the much more transparent seek(GenomeLoc). Note that I have not re-enabled my improvements to the intervals accumulation of ReferenceDataSource because that inefficiency is still present downstream in RodLocusView; need to discuss those changes with Matt.
2011-12-26 14:45:19 -05:00
Mauricio Carneiro
02495a5fd5
renaming script, once more
2011-12-23 20:01:25 -05:00
Mauricio Carneiro
afc58b81b2
changing permissions on the scala script
2011-12-23 19:47:48 -05:00
Mauricio Carneiro
5198f3a287
Making -e optional and renaming script
...
* Expanding intervals should be optional, not mandatory
2011-12-23 19:36:57 -05:00
Mauricio Carneiro
35c41409a1
Better contracts and docs for the ReadClipper
...
* Described the ReadClipper contract in the top of the class
* Added contracts where applicable
* Added descriptive information to all tools in the read clipper
* Organized public members and static methods together with the same javadoc
2011-12-23 19:36:57 -05:00
David Roazen
506c0e9c97
Disabling SnpEff support in the GATK and SnpEff annotation in the HybridSelectionPipeline
...
SnpEff support will remain disabled until SnpEff 2.0.4 has been officially released
and we've verified the quality of its annotations.
2011-12-23 19:12:57 -05:00
Eric Banks
24c84da60d
'Fixing' the changes in ReferenceDataSource so that a shard properly contains a list of GenomeLocs instead of a single merged one. However, that uncovered a probable bug in the engine, so instead of letting this code fester unfixed in the build (affecting everyone in the group) I've decided to revert to the previous (slow, but working) version and fix the engine in my own branch.
2011-12-23 15:39:12 -05:00
Eric Banks
8762313a0d
Better TODO message
2011-12-22 20:54:35 -05:00
Eric Banks
a815e875a8
Removing debugging output
2011-12-22 15:49:11 -05:00
Eric Banks
deef542a38
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-12-22 15:44:58 -05:00
Eric Banks
6d260ec6ae
Start printing traversal stats after 30 seconds. I can't stand waiting 2 minutes.
2011-12-22 15:40:59 -05:00
David Roazen
510c71158c
Merged bug fix from Stable into Unstable
2011-12-22 10:49:52 -05:00
David Roazen
32cdef9682
Rename *PerformanceTest test classes to *LargeScaleTest
...
This is in preparation for the installation of the new performance test suite in Bamboo.
Note that "ant performancetest" is now "ant largescaletest"
2011-12-22 10:38:49 -05:00