166 Commits (65a9159ac67c988d0600a0ea21d4c2b4b4daf3b6)

Author SHA1 Message Date
Matt Hanna 2b2a4e0795 Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/stable 2011-08-17 16:26:45 -04:00
Matt Hanna d170187896 Disable optimization that increases marginal speed of the GATK slightly but
can produce data loss in a narrow corner case where the BGZF block(s) locations
and offsets in the last index bucket of contig n overlap exactly with the BGZF
block locations and offsets in the last index bucket of contig n+1.

A proper fix that keeps the optimization has already been introduced into
unstable, but disabling the optimization is a low risk way to make sure that
users of stable experience no data loss.
2011-08-17 16:16:05 -04:00
Ryan Poplin 9d4add3268 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/stable 2011-08-16 14:18:03 -04:00
Ryan Poplin 170d1ff7b6 Fix in UG for trying to call indels at IUPAC code bases when in EMIT_ALL_SITES mode 2011-08-16 14:17:46 -04:00
Andrey Sivachenko 9f3328db53 fixing read group name collision: before writing the read into respective stream in nway-out mode we now retrieve the original rg, not the merged/modified one 2011-08-16 13:45:40 -04:00
Eric Banks 9ddbfdcb9f Check filtered status before applying to alt reference 2011-08-15 12:25:23 -04:00
Guillermo del Angel 8325cb8c26 Fixing up apparent source control/merge snafu: the fix to correctly output PL ordering in multi-allelic sites by UG was only half-committed and hence not working. This completes the fix. 2011-08-10 15:31:49 -04:00
Ryan Poplin 98a96f07c1 Updated standard deviation parameter in VQSR to our current recommended value 2011-08-04 14:06:26 -04:00
Ryan Poplin c0d4110ffd Correcting redundant warning text. 2011-07-29 10:01:11 -04:00
Eric Banks 1afc49a297 There are some really 'interesting' (but apparently valid) records in the Mus musculus dbSNP file. Generalized the handling of complex cases in the dbSNP adaptor to handle it all. I just grabbed the actual Mus musculus dbSNP file as a test, ran it whole genome, and confirmed that we finally produce a valid VCF on it. Should be the last commit needed on this adaptor. 2011-07-28 13:55:58 -04:00
Eric Banks 6230315ff2 Along with my half-written commit message from earlier, I also forgot to commit the integration test updates. This is what happens when you try to do things 30 seconds before you leave for the day. To finish up from before: complex events weren't being padded with the reference base as per the VCF spec. They are now. 2011-07-27 22:51:21 -04:00
Eric Banks 64aad67b5f Fixing dbSNP adaptor for complex indels (wasn) 2011-07-27 16:13:45 -04:00
Matt Hanna f50145b872 Reinitialize random seed in the bwa bindings from the fixed seed stored in the
BWA support files every time the support files are loaded.
2011-07-22 13:41:53 -04:00
Khalid Shakir 8b8f121cfb Merge branch 'master' of ssh://gsa3.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-21 23:01:11 -04:00
Khalid Shakir 59eb1f4663 Memory limits changed from Int to Double.
Updated LSF calls to read memory units from config along with tweaks to select hosts.
Moved some common code from GridEngine and LSF to super classes.
2011-07-21 22:57:18 -04:00
Christopher Hartl 2f5d10d16b Fix bug wherein aligner could be closed prior to its being used to lowercase sequences. 2011-07-21 13:21:48 -04:00
Matt Hanna 7054c5342f When using the BWA bindings, you have to explicitly call close() to get the
bindings to release memory.
It may or may not be possible to have the GC trigger an implicit close; I'll add a JIRA.
2011-07-21 12:13:29 -04:00
Christopher Hartl 15610ce0c3 Per Matt's request, disabling BWA-based integration tests so he can assess bamboo memory usage. 2011-07-21 11:04:22 -04:00
Guillermo del Angel 0a1d2df8cb Merged bug fix from Stable into Unstable 2011-07-20 13:19:35 -04:00
Guillermo del Angel f15023b7d2 Bad bug fix: output GLs in multiallelic records were in incorrect order (misread spec) 2011-07-20 12:10:48 -04:00
Guillermo del Angel b9c9e0e952 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-20 10:45:16 -04:00
Guillermo del Angel 7140280bf6 Further bug fixes/cleanups for PrintReadsWalker 2011-07-20 10:44:37 -04:00
Guillermo del Angel a2d90a3590 Bug fix: reverted logic so that default behavior skips over sample lookup 2011-07-20 10:23:10 -04:00
Guillermo del Angel e8409c80fa Further protection vs null pointers in PrintReadsWalker 2011-07-19 21:59:24 -04:00
Christopher Hartl 5d706c9e92 Merge branch 'master' of ssh://chartl@tin.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
Removing PSP and CSM

Conflicts:

	public/java/src/org/broadinstitute/sting/gatk/walkers/sequenom/CreateSequenomMask.java
	public/java/src/org/broadinstitute/sting/gatk/walkers/sequenom/PickSequenomProbes.java
2011-07-19 20:25:33 -04:00
Guillermo del Angel fb2d475c22 Bug fix to prevent null pointer 2011-07-19 20:13:56 -04:00
Christopher Hartl 92c7cfa1c8 BWA bindings and tests moved to public (was required for ValidationAmplicons)
Integration tests for ValidationAmplicons. New argument to disable BWA, lowercase letters only for repetitiveness instead.
2011-07-19 20:11:31 -04:00
David Roazen baae381acb Revert "Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable"
This reverts commit 039a6bb01f345322ce2be50ae3634308bb24e77e, reversing
changes made to b9c9973d1c638dfc9f8c19b5eb845e99844f9d29.
2011-07-19 18:38:53 -04:00
Christopher Hartl 07e716d23a PickSequenomProbes2 expanded functionality: lowercasing based on sequence uniqueness, preserving reference base prior to indel (not a part of the VC as I thought it was), masking deletion bases with 'N's, flanking insertion with 'N's, output is a fasta formatted file. Renamed to ValidationAmplicons since this is really not for picking sequenom probes, but for generating amplicon sequence from which other applications (like sequenom) can choose PCR primers. Moved from private to public. 2011-07-19 15:21:47 -04:00
Guillermo del Angel 6181d1e4cb Fixed integration test for VariantsToTable: now the * in REF column is not output 2011-07-19 14:42:11 -04:00
Guillermo del Angel e6d306458c Merge bug fixes 2011-07-19 14:36:20 -04:00
Guillermo del Angel 989dd17f95 a) Add ability in PrintReads to specify a sample file to easily subset samples, useful for IGV visualization, b) VariantsToTable is more R-friendly with Indels when printing ref/alt columns, c) Changes to SelectVariants ability to specify a mask to randomly sample from a given AF distribution 2011-07-19 14:29:07 -04:00
Mark DePristo 8f0badc52b Updating md5s, as the diffobjects walker now emits the summary in reverse order. 2011-07-18 15:44:21 -04:00
Mark DePristo c05451047c Support for multiple records at the same site. The first record gets chr:start, and subsequent records get chr:start_2, chr:start_3, etc. 2011-07-18 15:43:52 -04:00
Mark DePristo 782a05e9b5 Support for sorting the diff output in reverse order. 2011-07-18 15:43:01 -04:00
Mark DePristo 45702d3084 Now supports a mode where the primary key isn't sorted. In this case the records are displayed in the order in which they are added to the table. 2011-07-18 15:40:15 -04:00
Eric Banks 83ba2c066a Making it deterministic 2011-07-18 13:59:02 -04:00
Eric Banks 92fa410450 Check that it's a valid bam file before parsing or bad things can happen 2011-07-18 13:43:34 -04:00
Eric Banks 80b5c5261a CombineVariants no longer combines records of different types. So now when combining SNP and indel callsets, overlapping calls get their own records. Useful for Khalid in the pipeline. For those interested, it turns out the previous behavior was doing the wrong thing occasionally (and this was even captured in the integration tests). 2011-07-18 13:42:45 -04:00
Eric Banks bc8b5da698 Added docs while I was reading through the code to understand it 2011-07-18 12:25:54 -04:00
Mark DePristo 51b0dd01c3 Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-18 10:47:29 -04:00
Mark DePristo d6e2e89f99 Walker test system refactoring. All MD5DB related functions are now in MD5DB.java.
System has the concept of a local and a global MD5 db. The local one operates as it did previously. The global one lives in /humgen/gsa-hpprojects/GATK/data/integrationtests. If the system can find this directory then MD5s will also be read / written to this location. This means that gsabamboo will print differences as appropriate. And all users will in effect have access to a complete history of MD5 file results.
A few minor code reshuffles changed VariantRecalibration and VCFHeader test files.
2011-07-18 10:46:01 -04:00
Mark DePristo 6f26c07b85 Removed the SpecificDifference class. Now Difference classes always have the option to remember specific master and test values. This means that all summarized differences carry with them specific examples of their differences. Consequently, now even summarized differences give at least one example of the specific difference, even when the count of the difference is > 1. Unit tests updated. Added DiffObjects integrationtest. VCFDiffableReader now specifically reads the first line of the VCF file to capture the version number. 2011-07-18 10:42:35 -04:00
Kiran V Garimella b2b7d27fed Merge branch 'laptop' 2011-07-18 00:25:46 -04:00
Kiran V Garimella 497721a799 Added class documentation string. 2011-07-18 00:25:21 -04:00
Kiran V Garimella ac9c66138d Merge branch 'master' of ssh://copper.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-18 00:20:33 -04:00
Kiran V Garimella 824100e57f Corrected typo in MergeAndMatchHaplotypes integration test 2011-07-17 22:50:54 -04:00
Kiran V Garimella 8167aba601 Moved (poorly named) MergeAndMatchHaplotypes to public. Added integration test 2011-07-17 22:47:32 -04:00
Kiran V Garimella afb506e128 Added MD5s for PhaseByTransmission integration tests 2011-07-17 21:55:33 -04:00
Kiran V Garimella 558e197989 Integration test for PhaseByTransmission 2011-07-17 21:25:08 -04:00