Khalid Shakir
5dcac7b064
GATKReport v0.2:
- Floating point column widths are measured correctly
- Using fixed-width columns instead of whitespace-separated columns, which allows spaces embedded in cell values
- Legacy support for parsing whitespace-separated v0.1 tables where the columns may not be fixed width
- Enforcing that table descriptions do not contain newlines so that tables can be parsed correctly
Replaced GATKReportTableParser with existing functionality in GATKReport
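The difference between the two column styles can be sketched as follows. This is only an illustration, not the GATKReport code: the class FixedWidthColumns, both method names, and the idea of passing column start offsets explicitly are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of GATKReport.
final class FixedWidthColumns {
    /** Splits a line into cells at the given column start offsets, trimming the padding. */
    static List<String> splitFixedWidth(String line, int[] starts) {
        List<String> cells = new ArrayList<>();
        for (int i = 0; i < starts.length; i++) {
            // Clamp offsets so short lines yield empty cells instead of throwing.
            int begin = Math.min(starts[i], line.length());
            int end = (i + 1 < starts.length)
                    ? Math.min(starts[i + 1], line.length())
                    : line.length();
            cells.add(line.substring(begin, end).trim());
        }
        return cells;
    }

    /** Whitespace-separated splitting: breaks any cell that contains a space. */
    static List<String> splitWhitespace(String line) {
        return Arrays.asList(line.trim().split("\\s+"));
    }
}
```

With a row such as "sample one  12.50  PASS" and offsets 0, 12 and 19, the fixed-width split keeps "sample one" as a single cell, while the whitespace split breaks it into two.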
2011-08-03 00:24:47 -04:00
David Roazen
d3437e62da
Added a simple utility method Utils.optimumHashSize() to calculate the optimum
initial size for a Java hash table (HashMap, HashSet, etc.) given an expected
maximum number of elements. The optimum size is the smallest size that's
guaranteed not to result in any rehash / table-resize operations.
Example Usage:
Map<String, Object> hash = new HashMap<String, Object>(Utils.optimumHashSize(expectedMaxElements));
I think we're paying way too heavy a price in unnecessary rehash operations across
the GATK. If you don't specify an initial size, you get a table of size 16 that gets
completely rehashed and doubles in size every time it becomes 75% full. This means you
do at least twice as much work as you need to in order to populate your table:
n + n/2 + n/4 + ... + 16 ~= (1 + 1/2 + 1/4 + ...) * n ~= 2 * n
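A minimal sketch of the idea, assuming the default HashMap load factor of 0.75. The class name HashSizing, the sizing formula, and the rehashCopies helper are assumptions for illustration and may differ from the actual Utils.optimumHashSize() implementation.

```java
// Hypothetical stand-in for Utils; the formula below is an assumption.
final class HashSizing {
    // Default load factor documented for java.util.HashMap.
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    /** Initial capacity large enough that maxElements entries never trigger a resize. */
    static int optimumHashSize(int maxElements) {
        // Capacity * load factor must stay above maxElements; +2 absorbs truncation.
        return (int) (maxElements / DEFAULT_LOAD_FACTOR) + 2;
    }

    /**
     * Simplified model of the default growth policy: start at capacity 16 and
     * double whenever the size exceeds 75% of capacity. Returns how many
     * existing entries get moved by resizes while inserting n elements.
     */
    static long rehashCopies(int n) {
        long copies = 0;
        long capacity = 16;
        for (long size = 1; size <= n; size++) {
            long threshold = (long) (capacity * DEFAULT_LOAD_FACTOR);
            if (size > threshold) {
                copies += size; // every entry present is moved to the doubled table
                capacity *= 2;
            }
        }
        return copies;
    }
}
```

Under this model, a table sized with optimumHashSize(n) up front never pays the copies that rehashCopies(n) counts for the default-sized table.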
2011-08-02 21:59:06 -04:00
Guillermo del Angel
df37716857
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-02 18:27:13 -04:00
Khalid Shakir
daeabe6a2f
Updated HybridSelectionPipelineTest's variant count and titv to match expectations based on ebanks removing strand bias.
TODO: update filtering/recalibrations in HSP.
2011-08-02 17:09:16 -04:00
Ryan Poplin
20afe2a437
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-02 15:34:52 -04:00
Ryan Poplin
b2cde87378
Removing --DBSNP syntax from BQSR integration tests
2011-08-02 15:34:38 -04:00
Khalid Shakir
c2dc7c9c99
Updated flanks eval the same way the targets eval was updated by ebanks.
2011-08-02 15:30:04 -04:00
Ryan Poplin
c0653514b3
minor update to comment in UG
2011-08-02 13:34:48 -04:00
Ryan Poplin
2ba57bb502
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-02 13:30:46 -04:00
Ryan Poplin
38e4ae4176
minor update to comment in UG
2011-08-02 13:30:38 -04:00
Guillermo del Angel
821bbfa9e0
Bug fixes and enhancements to run whole-genome indel VQSR, removed old chr20-only code and cleanup
2011-08-02 13:17:20 -04:00
Eric Banks
65c5d55b72
Not sure how I missed these. These lines are now superfluous.
2011-08-02 12:48:36 -04:00
Eric Banks
b9d0d2af22
Adding back temporarily removed integration test now that the file permissions have been fixed.
2011-08-02 12:39:11 -04:00
Eric Banks
9497c303e6
Temporarily comment out annotation from the pipeline test; David will re-enable.
2011-08-02 12:38:47 -04:00
Eric Banks
1c387848de
No more use of -D in the integration tests but instead stick with VCFs only. Since all of these tests were duplicated (one each for dbSNP format and for VCF), we don't actually lose coverage in the integration tests.
2011-08-02 10:39:50 -04:00
Eric Banks
a2ca994e9a
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-02 10:34:53 -04:00
Eric Banks
2c5e526eb7
Don't use the mismatch fraction by default in the RealignerTargetCreator (since it's only useful when using SW in the indel realigner). Also, no more use of -D but instead move over to using VCFs. One integration test is temporarily commented out while I wait for a VCF file to get fixed.
2011-08-02 10:34:46 -04:00
Eric Banks
5626199bb6
The Unified Genotyper now does NOT emit SLOD/SB by default; to compute SB use --computeSLOD
2011-08-02 10:14:21 -04:00
Guillermo del Angel
a6a87006a5
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-02 08:26:05 -04:00
Guillermo del Angel
3b2d1dee66
Final changes to indel project consensus script
2011-08-02 08:25:50 -04:00
Eric Banks
3a9b6eacdf
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-01 11:23:18 -04:00
Mauricio Carneiro
e7b4959ebe
Merged bug fix from Stable into Unstable
2011-07-30 02:06:45 -04:00
Mauricio Carneiro
2d94037ad0
Remove temporary index files (*.bai)
some temporary index files were not being removed.
2011-07-30 02:05:22 -04:00
Ryan Poplin
b06deac9ea
Merged bug fix from Stable into Unstable
2011-07-29 10:02:36 -04:00
Ryan Poplin
c0d4110ffd
Correcting redundant warning text.
2011-07-29 10:01:11 -04:00
Eric Banks
33b32c4211
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-28 13:57:22 -04:00
Eric Banks
7a2a65155f
Merged bug fix from Stable into Unstable
2011-07-28 13:56:43 -04:00
Eric Banks
1afc49a297
There are some really 'interesting' (but apparently valid) records in the Mus musculus dbSNP file. Generalized the handling of complex cases in the dbSNP adaptor to handle it all. I just grabbed the actual Mus musculus dbSNP file as a test, ran it whole genome, and confirmed that we finally produce a valid VCF on it. Should be the last commit needed on this adaptor.
2011-07-28 13:55:58 -04:00
Eric Banks
1865211b6d
Merged bug fix from Stable into Unstable
2011-07-27 22:52:06 -04:00
Eric Banks
6230315ff2
Along with my half-written commit message from earlier, I also forgot to commit the integration test updates. This is what happens when you try to do things 30 seconds before you leave for the day. To finish up from before: complex events weren't being padded with the reference base as per the VCF spec. They are now.
2011-07-27 22:51:21 -04:00
Eric Banks
ff31fa7990
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-27 16:15:23 -04:00
Eric Banks
5809a61b20
Merged bug fix from Stable into Unstable
2011-07-27 16:14:59 -04:00
Eric Banks
64aad67b5f
Fixing dbSNP adaptor for complex indels (wasn)
2011-07-27 16:13:45 -04:00
Kiran V Garimella
ab69b8e4ee
Merged bug fix from Stable into Unstable
2011-07-27 12:37:34 -04:00
Kiran V Garimella
6ebd83478b
Fixed build.xml to reflect path changes for gsalib
2011-07-27 12:37:00 -04:00
Kiran V Garimella
fe52f2dd8c
Merged bug fix from Stable into Unstable
2011-07-27 12:30:15 -04:00
Kiran V Garimella
ca35defdcd
Moved gsalib sources from private/ to public/
2011-07-27 12:29:43 -04:00
Kiran V Garimella
ada2f21976
Revert "Merge branch 'master' of ssh://copper.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable"
This reverts commit 9c81ef835a3ac581d4eb9cf1243e30df20a46795, reversing
changes made to f23d3ad5aec1c70cc1ecc48b295258aa70d30c7d.
2011-07-27 12:27:17 -04:00
Kiran V Garimella
86d38b7f0b
Merge branch 'master' of ssh://copper.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-27 10:35:19 -04:00
Kiran V Garimella
dc8061e7a6
Moved gsalib from private/ to public/
2011-07-27 10:34:56 -04:00
Mauricio Carneiro
e607461db1
leftover </ol> removed...
2011-07-26 19:31:33 -04:00
Mauricio Carneiro
20a3b31b61
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-26 19:29:45 -04:00
Mauricio Carneiro
321afac4e8
Updates to the help layout.
*New style.css, new template for the walker auto-generated html. Short description is no longer repeated in the long description of the walker.
*Updated DiffObjectsWalker and ContigStatsWalker as "reference" documented walkers.
2011-07-26 19:29:25 -04:00
Kiran V Garimella
405e521d44
Merge branch 'master' of ssh://copper.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-26 17:56:48 -04:00
Kiran V Garimella
92a11ed8dc
Updated MD5 for PhaseByTransmissionIntegrationTest
2011-07-26 17:52:25 -04:00
Kiran V Garimella
412c466de6
Bug fix, wherein triple-hets after genotype refinement need to be left unphased, not just prior to refinement
2011-07-26 17:43:43 -04:00
Mark DePristo
81f8e05bfa
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-26 17:35:46 -04:00
Mark DePristo
f6a5e0e36a
Go for global integrationtest path first, if possible.
2011-07-26 17:35:30 -04:00
Kiran V Garimella
36daaa7bda
Extract GA, AR2, and DR2 from the BEAGLE output
2011-07-26 17:29:23 -04:00
Matt Hanna
fec495e292
Fix a nasty little bug in the sharding system: if the last shard in contig n
overlaps exactly on disk with the first shard in contig n+1, the shards
would be merged together to avoid duplicate extraction. Unfortunately,
the interval overlap filter couldn't handle shards spanning contigs, and
was choosing to filter out reads from contig n+1 which should have been
included.
I'm not completely sure why the BAM indexing code would ever specify that the
end of one chromosome had the same on-disk location as the start of the next
one. I suspect that this is an indexer performance bug.
2011-07-26 15:43:20 -04:00