6749 Commits (1a0e5ab4ba6563b9a7e13ea4495d59e2cd0e9d8f)

Author SHA1 Message Date
Eric Banks 1a0e5ab4ba Merge branch 'master' of ssh://gsa1.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-08 13:08:25 -04:00
Eric Banks a06f341685 qscript to assess BQSR known sets 2011-08-08 13:08:17 -04:00
Ryan Poplin 99e3a72343 Merged bug fix from Stable into Unstable 2011-08-08 12:36:17 -04:00
Ryan Poplin 8072bd9831 Updating resource bundle generation qscript for changeover to git 2011-08-08 12:35:39 -04:00
Mark DePristo 0db79207e8 Refactored dependency from CommandLineGATK from javadocs
This allows us to run the GATK again in environments without Javadoc loading by default in the classpath
2011-08-08 12:27:13 -04:00
Mauricio Carneiro 0db46d0648 Merged bug fix from Stable into Unstable 2011-08-08 10:50:09 -04:00
Mauricio Carneiro 2fd101135c Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/stable 2011-08-08 10:49:43 -04:00
Mauricio Carneiro 4d6cb33612 removing temporary bam index
The clean bai file was left behind after the data processing pipeline was done
2011-08-08 10:49:28 -04:00
Mark DePristo 88061ed5fa rmdir the empty tmp dir if possible 2011-08-08 09:18:20 -04:00
Ryan Poplin 6693407bd8 Merged bug fix from Stable into Unstable 2011-08-07 17:39:03 -04:00
Ryan Poplin 738e94efcb Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/stable 2011-08-07 17:36:45 -04:00
Mark DePristo b3e57c329a Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-07 15:18:15 -04:00
Mark DePristo 5f8bc3aa8a Documenting classes, and name cleanup 2011-08-07 15:17:50 -04:00
Mark DePristo 1c63d43176 Help now points to GATKDocs instead of spitting out full, garbled description 2011-08-07 15:02:46 -04:00
Mark DePristo 3c7a96cf2d Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-07 12:12:47 -04:00
Mark DePristo be0fc8a1e0 Now creates .gz archive of s3 downloads 2011-08-07 12:11:50 -04:00
Khalid Shakir f534c2e7bb Merged bug fix from Stable into Unstable 2011-08-06 10:43:52 -04:00
Khalid Shakir eaa2f16d83 When a job finishes successfully in the ShellJobRunner, mark it as DONE instead of FAILED. 2011-08-06 10:42:04 -04:00
Guillermo del Angel a8eb8c27f0 a) Minor changes to indel consensus scripts to better reflect good default values, b) Fixed up Mills/Devine codec so it always produces correct ref padded bases, and added option to VariantsToVCF to fix reference base 2011-08-04 15:34:49 -04:00
Ryan Poplin 98a96f07c1 Updated standard deviation parameter in VQSR to our current recommended value 2011-08-04 14:06:26 -04:00
Eric Banks 406982284c Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-04 12:49:02 -04:00
Eric Banks e48492f3c3 Validate that the reference padding base for indels is correct. 2011-08-04 12:48:56 -04:00
Eric Banks f10588420c Fixing path to dbSNP file as the other one was replaced 2011-08-04 12:36:24 -04:00
Ryan Poplin 21dc9a5543 Adding mills/devine indel dataset to the resource bundle 2011-08-04 12:31:28 -04:00
Mauricio Carneiro 0739b7f75b Merged bug fix from Stable into Unstable 2011-08-04 11:07:25 -04:00
Mauricio Carneiro aff681e407 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/stable 2011-08-04 11:05:25 -04:00
Mauricio Carneiro fa97bd8ac1 Merged bug fix from Stable into Unstable 2011-08-04 09:52:10 -04:00
Mauricio Carneiro 23ec5b94cf fixed a missing check for null
There was a missed check for the case when you don't provide an indels vcf for the cleaner.
2011-08-04 09:50:02 -04:00
Eric Banks a831af1166 Another misprint when removing the references to -D 2011-08-03 21:29:21 -04:00
Mauricio Carneiro 8981367307 Updating memory usage for picard programs 2011-08-03 15:48:28 -04:00
Eric Banks f62f47d476 Not sure why this didn't fail before, but bringing VE up to date with previous changes 2011-08-03 14:27:07 -04:00
Eric Banks 3de10b1ef8 Fixing misprint from Ryan's commit 2011-08-03 12:37:50 -04:00
Eric Banks db2e0aaa1a Darn, forgot to update unit tests. 2011-08-03 12:31:08 -04:00
Eric Banks 020b2408a8 Adding integration test for left alignment of indels 2011-08-03 12:19:44 -04:00
Eric Banks f6648e0144 Don't left-align complex indels because it's too complicated. 2011-08-03 12:03:50 -04:00
Eric Banks 5dc324ff35 Dealing with merge conflict 2011-08-03 11:03:47 -04:00
Eric Banks 7c89fe01b3 Instead of having the padded reference base be some hackish attribute it is now an actual variable in the Variant Context class. More importantly, we now always require that it be present when padding is necessary - and validate as such upon construction of the VC. This cleans up the interface significantly because we no longer require that a reference base be passed in when writing a VC/VCF record. 2011-08-03 11:00:36 -04:00
Khalid Shakir 3e043a633c Merged bug fix from Stable into Unstable 2011-08-03 02:23:16 -04:00
Khalid Shakir a587f38808 Fixed example unified genotyper pipeline to wrap filter expressions with quotes and use rod binding name "variant" instead of "vcf". 2011-08-03 02:21:01 -04:00
Khalid Shakir 5dcac7b064 GATKReport v0.2:
- Floating point column widths are measured correctly
- Using fixed width columns instead of white space separated which allows spaces embedded in cell values
- Legacy support for parsing white space separated v0.1 tables where the columns may not be fixed width
- Enforcing that table descriptions do not contain newlines so that tables can be parsed correctly
Replaced GATKReportTableParser with existing functionality in GATKReport
2011-08-03 00:24:47 -04:00
David Roazen d3437e62da Added a simple utility method Utils.optimumHashSize() to calculate the optimum
initial size for a Java hash table (HashMap, HashSet, etc.) given an expected
maximum number of elements. The optimum size is the smallest size that's
guaranteed not to result in any rehash / table-resize operations.

Example Usage:
Map<String, Object> hash = new HashMap<String, Object>(Utils.optimumHashSize(expectedMaxElements));

I think we're paying way too heavy a price in unnecessary rehash operations across
the GATK. If you don't specify an initial size, you get a table of size 16 that gets
completely rehashed and doubles in size every time it becomes 75% full. This means you
do at least twice as much work as you need to in order to populate your table:

(n + n/2 + n/4 + ... + 16) ~= (1 + 1/2 + 1/4 + ...) * n ~= 2 * n
2011-08-02 21:59:06 -04:00
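
The sizing rule described in the commit message above can be sketched as follows. This is only an illustration, not the actual Utils.optimumHashSize() code, which this log does not show. The class name HashSizeSketch is hypothetical, and the sketch assumes the table uses java.util.HashMap's default load factor of 0.75.

```java
import java.util.HashMap;
import java.util.Map;

final class HashSizeSketch {
    // Load factor java.util.HashMap uses when none is specified.
    private static final double DEFAULT_LOAD_FACTOR = 0.75;

    /**
     * Hypothetical stand-in for Utils.optimumHashSize() (assumption: the real
     * method may be implemented differently). Returns the smallest initial
     * capacity c such that c * 0.75 >= maxElements, so that inserting up to
     * maxElements entries should not trigger a table resize.
     */
    static int optimumHashSize(int maxElements) {
        if (maxElements < 0) {
            throw new IllegalArgumentException("maxElements must be non-negative");
        }
        return (int) Math.ceil(maxElements / DEFAULT_LOAD_FACTOR);
    }

    public static void main(String[] args) {
        int expectedMaxElements = 1000;

        // Pre-size the map once instead of growing it from the default 16 buckets.
        Map<String, Object> hash =
                new HashMap<String, Object>(optimumHashSize(expectedMaxElements));

        for (int i = 0; i < expectedMaxElements; i++) {
            hash.put("key" + i, i);
        }
        System.out.println("entries stored: " + hash.size());
    }
}
```

HashMap rounds the requested capacity up to a power of two internally, so the table may end up somewhat larger than the value returned here; the sketch only ensures the requested capacity is large enough.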
Guillermo del Angel df37716857 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-02 18:27:13 -04:00
Khalid Shakir daeabe6a2f Updated HybridSelectionPipelineTest's variant count and titv to match expectations based on ebanks removing strand bias.
TODO: update filtering/recalibrations in HSP.
2011-08-02 17:09:16 -04:00
Ryan Poplin 20afe2a437 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-02 15:34:52 -04:00
Ryan Poplin b2cde87378 Removing --DBSNP syntax from BQSR integration tests 2011-08-02 15:34:38 -04:00
Khalid Shakir c2dc7c9c99 Updated flanks eval the same way the targets eval was updated by ebanks. 2011-08-02 15:30:04 -04:00
Ryan Poplin c0653514b3 minor update to comment in UG 2011-08-02 13:34:48 -04:00
Ryan Poplin 2ba57bb502 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-02 13:30:46 -04:00
Ryan Poplin 38e4ae4176 minor update to comment in UG 2011-08-02 13:30:38 -04:00
Guillermo del Angel 821bbfa9e0 Bug fixes and enhancements to run whole-genome indel VQSR, removed old chr20-only code and cleanup 2011-08-02 13:17:20 -04:00