Mark DePristo
e4d49357ce
Further cleanup of R
2012-03-22 21:24:37 -04:00
Mark DePristo
503e2ea29e
Cleanup R directory
2012-03-22 21:24:37 -04:00
Mark DePristo
5725f72904
Cleanup unused python programs
...
-- If you happen to use one of these files you can always revert it.
2012-03-22 21:24:36 -04:00
Mark DePristo
9ddd5aec93
More eval modules being removed from VariantEval
...
-- IndelStatistics is superseded by IndelStatistics
2012-03-22 21:24:36 -04:00
Mark DePristo
bd5b6d1aba
Remove no longer in use Eval modules from VariantEval
...
-- No more IndelLengthHistogram (superseded by IndelSummary in subsequent commit)
-- No more SamplePreviousGenotypes or PhaseStats
-- No more MultiallelicAFs
2012-03-22 21:24:36 -04:00
Mark DePristo
6c2290fb6e
Performance optimization for gsa.read.gatkreport.R
...
-- Instead of using y = rbind(x, y), which is O(n^2) in a loop when processing lines into a data structure in R, preallocate a matrix and explicitly assign each row into it. This results in a radical performance improvement when reading large tables into R. With this optimization it's possible to read in a 70MB table for variantQCReport.R with 200K lines for 800 samples.
2012-03-22 21:24:36 -04:00
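The growth-by-copy versus preallocation point in the commit above is not specific to R. The following is a minimal Python sketch of the same idea, not the code from gsa.read.gatkreport.R; the function names and the whitespace splitting of each line are assumptions made purely for illustration.

```python
def parse_rows_by_copying(lines):
    """Quadratic pattern: every step builds a new table, copying all earlier rows."""
    table = []
    for line in lines:
        # `table + [...]` allocates a fresh list and copies every existing entry,
        # which is the analogue of y = rbind(x, y) inside a loop.
        table = table + [line.split()]
    return table


def parse_rows_preallocated(lines):
    """Linear pattern: allocate every slot once, then assign each row in place."""
    table = [None] * len(lines)  # single allocation up front
    for i, line in enumerate(lines):
        table[i] = line.split()
    return table


# Hypothetical input lines; both approaches produce the same table.
example_lines = ["sample%d 0.5 12" % i for i in range(5)]
assert parse_rows_by_copying(example_lines) == parse_rows_preallocated(example_lines)
```

Both functions return the same result; only the amount of copying differs, which is what matters once the input reaches hundreds of thousands of lines.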
Menachem Fromer
7faa9938b1
Merge branch 'master' of ssh://copper.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-03-22 17:43:44 -04:00
Menachem Fromer
b9b9219ac7
Added respectPhaseInInput flag to RBP and integration tests
2012-03-22 17:40:21 -04:00
Ryan Poplin
39c8edf85b
Improvements to the node merging / pruning algorithm in the assembler
2012-03-22 17:16:45 -04:00
Guillermo del Angel
0a56a14d09
Build fixes to merge pool calculation models with latest interface changes. Reverted build.xml's private debug changes
2012-03-22 16:07:07 -04:00
Guillermo del Angel
b92fee711b
Added missing new files from previous commit
2012-03-22 15:47:23 -04:00
Guillermo del Angel
f198cec5e2
Temp commit: new structure for pool caller, now all work is in the same framework as in UG. There's a new genotype calculation model, PoolGenotypeCalculationModel, that does all the work and plugs into UnifiedGenotyperEngine. A new AF module for pools is upcoming. Old pool caller will be removed once all work is migrated
2012-03-22 15:46:39 -04:00
Menachem Fromer
1dfaacfeb5
Check for consistency of the BAM and VCF sample names, with a command-line option to disable the exception if you know what you are doing
2012-03-22 12:40:15 -04:00
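A check of this kind might look roughly like the Python sketch below. It is not the RBP implementation: the exception class, the function name, the `disable_check` parameter, and the choice to require the two sample sets to match exactly are all assumptions for illustration.

```python
class SampleNameMismatchError(Exception):
    """Hypothetical error type raised when BAM and VCF samples disagree."""


def check_sample_names(bam_samples, vcf_samples, disable_check=False):
    """Raise if the two sample sets differ, unless the caller opts out."""
    if disable_check:
        return  # the user has explicitly accepted the risk
    bam_only = sorted(set(bam_samples) - set(vcf_samples))
    vcf_only = sorted(set(vcf_samples) - set(bam_samples))
    if bam_only or vcf_only:
        raise SampleNameMismatchError(
            "BAM-only samples: %s; VCF-only samples: %s" % (bam_only, vcf_only)
        )


# Matching names pass silently; a mismatch passes only when the check is disabled.
check_sample_names(["NA12878"], ["NA12878"])
check_sample_names(["NA12878"], ["NA12891"], disable_check=True)
```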
Guillermo del Angel
b02ef95bcf
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-03-22 12:14:12 -04:00
Guillermo del Angel
92676c63ca
Make constructor of IndelGenotypeLikelihoodsCalculationModel public so it can be used in unit tests
2012-03-22 12:13:59 -04:00
Guillermo del Angel
58965d6a6e
Merged bug fix from Stable into Unstable
2012-03-22 11:04:11 -04:00
Mark DePristo
256e9f001e
analyzeRunReports now includes full stack traces with causes
2012-03-22 10:15:44 -04:00
Guillermo del Angel
b8cd959461
Potential corner condition bug fix: protect against null pointer exceptions when computing consensus indel bases when UG is discovering alt alleles. If an alt allele has non-standard bases, skip allele gracefully instead of adding null object into list
2012-03-22 10:06:22 -04:00
Ryan Poplin
5c98424783
Add the reference haplotype to the assembly graph. The philosophy of the assembly is now to traverse paths alongside the reference instead of finding all possible paths.
2012-03-21 17:42:54 -04:00
Ryan Poplin
a29fc6311a
New debug option to output the assembly graph in dot format. Merge nodes in assembly graph when possible.
2012-03-21 15:48:55 -04:00
Eric Banks
8c09ff9459
Merged bug fix from Stable into Unstable
2012-03-21 12:44:43 -04:00
Eric Banks
58245bfa2f
Bug fix: check to see whether there's a BasePileup before asking for one.
2012-03-21 12:44:09 -04:00
Eric Banks
07c3bd32b3
Bug fix: merge NO_VARIATION records with those of another type. The sad part is that this WAS covered by integration tests but someone updated the MD5s without actually paying attention...
2012-03-21 12:42:13 -04:00
Eric Banks
dcf2fa361d
Minor cleanup
2012-03-21 12:14:31 -04:00
Eric Banks
ab1c48745b
Need to catch RuntimeExceptions coming out of Picard too so that they show up as UserErrors (some BAM errors are thrown as REs).
2012-03-21 12:13:52 -04:00
Ryan Poplin
9e10779fa7
Caching log calculations cut the non-Map runtime of HaplotypeCaller in half. Moved the qual log cache used in HC and PairHMM into a common place and added unit tests.
2012-03-21 08:45:42 -04:00
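The caching idea in the commit above can be sketched as a lookup table computed once and indexed by quality score. This Python sketch is an illustration only: the commit does not say which quantity the cache holds, so caching log10 of the probability that a base is correct, the table name, and the upper bound of 93 are all assumptions.

```python
import math

MAX_QUAL = 93  # assumed upper bound on Phred quality scores


def _compute_log10_prob_correct(qual):
    """log10(1 - 10^(-Q/10)), the log probability that a base call is correct."""
    error = 10.0 ** (-qual / 10.0)
    if error >= 1.0:
        return float("-inf")  # Q0 means the call is certainly wrong
    return math.log10(1.0 - error)


# Computed once at import time; later lookups are a plain list index.
LOG10_PROB_CORRECT = [_compute_log10_prob_correct(q) for q in range(MAX_QUAL + 1)]


def log10_prob_correct(qual):
    """Return the cached value instead of calling math.log10 in a hot loop."""
    return LOG10_PROB_CORRECT[qual]
```

The benefit comes from inner loops that evaluate the same small set of quality values many times, so each logarithm is paid for once rather than per base.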
Mauricio Carneiro
0e93cf5297
Taking care of bad cigars in the GATK
...
* fixed BadCigarFilter to filter out reads starting/ending in deletion and that have adjacent I/D events.
* added Unit tests for BadCigarFilter
* updated all exceptions in LocusIteratorByState to tell users that they can instead run with -rf BadCigar
* added the BadCigar filter to ReduceReads and RealignTargetCreator (if your walker blows up with these malformed reads, you may want to add it too)
2012-03-20 14:32:57 -04:00
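The two conditions listed for BadCigarFilter above (a read starting or ending in a deletion, or having adjacent I/D events) can be sketched in Python as follows. This is not the GATK implementation; the function names are hypothetical, and the sketch assumes the very first and very last CIGAR operators are checked without skipping over any clipping.

```python
import re

_CIGAR_PATTERN = re.compile(r"(\d+[MIDNSHP=X])+")
_CIGAR_ELEMENT = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Split a CIGAR string such as '5M2I3M' into (length, operator) pairs."""
    if not _CIGAR_PATTERN.fullmatch(cigar):
        raise ValueError("malformed CIGAR string: %r" % cigar)
    return [(int(n), op) for n, op in _CIGAR_ELEMENT.findall(cigar)]


def is_bad_cigar(cigar):
    """True if the read starts or ends in a deletion or has adjacent I/D events."""
    ops = [op for _, op in parse_cigar(cigar)]
    if ops[0] == "D" or ops[-1] == "D":
        return True
    # An insertion directly next to a deletion, in either order.
    return any({a, b} == {"I", "D"} for a, b in zip(ops, ops[1:]))


# Reads for which is_bad_cigar() returns True would be filtered out.
reads = ["10M", "2D10M", "5M2I3D5M", "5M2I5M3D5M"]
kept = [c for c in reads if not is_bad_cigar(c)]
```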
Eric Banks
b290152542
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-03-20 08:56:04 -04:00
Eric Banks
5e79046c98
Minor change but I realized from Mark's commit that the code I stole it from was flawed
2012-03-20 08:55:56 -04:00
Mark DePristo
5ecfc49f74
Minor cleanup of MergeIntervalLists (example, please look)
...
-- Note that isDone() is overridden to return true. This causes the GATK to cleanly stop processing early.
2012-03-20 07:49:27 -04:00
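The early-stop pattern described in the commit above can be sketched in Python as a traversal loop that polls the walker. This is not the GATK engine: the class names, the `traverse` function, and the choice to poll `is_done()` after each record are assumptions for illustration.

```python
class Walker:
    """Hypothetical base class: by default a walker never asks to stop early."""

    def map(self, record):
        raise NotImplementedError

    def is_done(self):
        return False


class OneShotWalker(Walker):
    """Does all of its work on the first record, then asks the engine to stop."""

    def __init__(self):
        self.seen = []

    def map(self, record):
        self.seen.append(record)

    def is_done(self):
        return True  # overriding this is what ends the traversal early


def traverse(walker, records):
    """Feed records to the walker until the input ends or the walker is done."""
    processed = 0
    for record in records:
        walker.map(record)
        processed += 1
        if walker.is_done():
            break  # clean early exit, no exception needed
    return processed


# The one-shot walker sees a single record even though many are available.
processed_count = traverse(OneShotWalker(), range(100))
```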
Mark DePristo
36636eb323
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-03-20 07:47:24 -04:00
Eric Banks
ade1971581
Since we allow any generic header types, there's no longer any reason to check for supported types
2012-03-20 00:12:17 -04:00
Eric Banks
4910ef86d9
Added a to-do for Khalid
2012-03-19 23:12:58 -04:00
Eric Banks
5a3afd768d
Walker to merge multiple bed/interval files into a single consensus. 'Walker' is used loosely here; there must be a better way to do this, but I don't know how within the GATK framework.
2012-03-19 22:42:48 -04:00
Eric Banks
2324c5a74f
Simplified the interface for simple VCF header lines by making the VCFSimpleHeaderLine not abstract anymore - now any arbitrary header line with an ID (e.g. the contig and ALT lines) can be part of this class without having to define new classes. Also, renamed the 'named' header line to 'id' since that's more accurate.
2012-03-19 21:29:24 -04:00
Ryan Poplin
069ccdfdd4
Fixing broken HC integration tests while changes to exact model are being formulated.
2012-03-19 16:56:51 -04:00
Mauricio Carneiro
633b5c687d
Fixing MD5's (new GATKReport header was missing from old md5's)
2012-03-19 15:28:45 -04:00
Mauricio Carneiro
9cf4df15e5
BQSR recal script (just so we can scatter-gather)
2012-03-19 15:28:45 -04:00
Khalid Shakir
875dc5ef95
Re-added non-verbose MultiallelicSummary to HSP eval.
2012-03-19 14:40:31 -04:00
Khalid Shakir
e8b083ac20
Merged bug fix from Stable into Unstable
2012-03-19 14:37:36 -04:00
Khalid Shakir
d0056d6c71
Updated HSP dbsnp from 132 to 135 along with other minor patches.
2012-03-19 14:36:38 -04:00
Roger Zurawicki
7afb333811
GATK Report code cleanup
...
- Updated the documentation on the code
- Made the table.write() method private and updated necessary files.
- Added a constructor to GATKReport that takes GATKReportTables
- Optimized my code
Signed-off-by: Mauricio Carneiro <carneiro@broadinstitute.org>
2012-03-19 11:53:57 -04:00
Mauricio Carneiro
0d4ea30d6d
Updating the BQSR Gatherer to the new file format
...
This is important for quick turnaround in the analysis cycle of the new covariates. Also added a dummy unit test that doesn't really test anything (disabled), but helps in debugging.
2012-03-19 09:02:27 -04:00
Mark DePristo
37d979d98d
GATK performance over time includes GATK 1.5
2012-03-18 19:49:26 -04:00
Ryan Poplin
1c67a62fc0
Updating LikelihoodCalculationEngineUnitTest
2012-03-18 16:39:58 -04:00
Ryan Poplin
943b1d34f8
intermediate commit to aid in debugging HC / exact model changes. HC integration tests will still fail
2012-03-18 15:50:27 -04:00
Ryan Poplin
c4f4d16490
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-03-18 14:27:42 -04:00
Eric Banks
9223e451a3
Merged bug fix from Stable into Unstable
2012-03-18 00:54:19 -04:00
Eric Banks
5c5d8e7cd3
Minor: cleaner way of turning off index-on-the-fly checking in case we want to turn it back on.
2012-03-18 00:53:29 -04:00
Eric Banks
344a938a70
When checking to make sure that we have cached enough data in the PL array, use the converted index value since that's what will be used as an index into the array.
2012-03-18 00:36:30 -04:00