Commit Graph

8720 Commits (3164c8dee57cb84a3c60c38f67e196a7fc25038e)

Author SHA1 Message Date
Mark DePristo 3164c8dee5 S3 upload now directly creates the XML report in memory and puts that in S3
-- This is a partial fix for the problem with uploading S3 logs reported by Mauricio.  The problem there is that java.io.tmpdir is not accessible (the network just hangs), so the S3 upload fails because the underlying system uses tmpdir for caching, etc.  As far as I can tell there's no way around this bug -- you cannot override java.io.tmpdir programmatically, and even if I could, what value would we use?  The only solution, it seems to me, is to detect that tmpdir is hanging (how?!) and fail with a meaningful error.
2012-01-29 15:14:58 -05:00
Menachem Fromer 0e17cbbce9 Merged bug fix from Stable into Unstable 2012-01-27 16:03:16 -05:00
Menachem Fromer a9671b73ca Fix to permit proper handling of mapping qualities from 128 to 255 (which get converted to byte values of -128 to -1) 2012-01-27 16:01:30 -05:00
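The conversion mentioned in a9671b73ca follows from Java's signed byte type: values from 128 to 255 wrap to -128 to -1 when narrowed to a byte, and masking with 0xFF recovers the unsigned value. The sketch below only illustrates that arithmetic. The class and method names are hypothetical and are not taken from the commit.

```java
// Hypothetical helper, for illustration only; not the code from the commit.
class MappingQualitySketch {

    // Narrow an unsigned mapping quality (0..255) to the signed byte that stores it.
    // Values from 128 to 255 wrap around to -128 to -1.
    static byte toStoredByte(int mappingQuality) {
        if (mappingQuality < 0 || mappingQuality > 255) {
            throw new IllegalArgumentException("mapping quality out of range: " + mappingQuality);
        }
        return (byte) mappingQuality;
    }

    // Recover the unsigned value by masking off the sign extension.
    static int toUnsigned(byte stored) {
        return stored & 0xFF;
    }
}
```

Reading the stored byte directly as an int would yield a negative mapping quality for anything above 127, which is why the mask is needed wherever the value is compared or reported.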
Ryan Poplin f7ac1f4a69 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2012-01-27 15:12:55 -05:00
Ryan Poplin fc08235ff3 Bug fix in active region traversal: locusView.getNext() skips over pileups with zero coverage, but they still need to be counted in the active probability integrator 2012-01-27 15:12:37 -05:00
Mauricio Carneiro 052a4bdb9c Turning off PHONE HOME option in the MDCP
* MDCP is for internal use and there is no need to report to the Amazon cloud.
   * Reporting to AWS_S3 is not allowing jobs to finish, this is probably a bug.
2012-01-27 11:13:30 -05:00
Mauricio Carneiro f8f2152f9c fixing ReduceReads MD5s
now that we're using OP instead of OS.
2012-01-27 10:53:24 -05:00
Mark DePristo 0f2e8400b5 Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2012-01-27 10:12:50 -05:00
Mauricio Carneiro ec9920b04f Updating the SAM TAG for Original Alignment Start to "OP"
per Mark's recommendation to reuse the Indel Realigner tag that made it to the SAM spec. The Alignment end tag is still "OE" as there is no official tag to reuse.
2012-01-27 08:51:39 -05:00
Mark DePristo 13d1626f51 Minor improvements in ref QC walker. Unfortunately this doesn't actually catch Chris's error 2012-01-27 08:24:22 -05:00
Mark DePristo cb04c0bf11 Removing Javassist 3.7 and Lucene library dependencies 2012-01-27 08:24:22 -05:00
Mauricio Carneiro 2a565ebf90 embarrassing fix-up, thanks Khalid. 2012-01-26 19:58:42 -05:00
Menachem Fromer 0de6aca29b Merge branch 'master' of ssh://copper.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2012-01-26 18:26:41 -05:00
Menachem Fromer d54e237671 Take advantage of Eric's fix for multiAllelic AC calculation, and also add fix to have the original allele's INFO field be passed through for batch merging 2012-01-26 18:25:30 -05:00
Mauricio Carneiro 246e085ec9 Unit tests for GATKSAMRecord class
* new unit tests for the alignment shift properties of reduce reads
   * moved to it the unit tests from ReadUtils that were actually testing GATKSAMRecord, not ReadUtils.
   * cleaned up ReadUtilsUnitTest
2012-01-26 17:06:36 -05:00
Mauricio Carneiro 0d4027104f Reduced reads are now aware of their original alignments
* Added annotations for reads that had been soft clipped prior to being reduced so that we can later recover their original alignments (start and end).
   * Tags keep the alignment shifts, not real alignment, for better compression
   * Tags are defined in the GATKSAMRecord
   * GATKSAMRecord has new functionality to retrieve original alignment start of all reads (trimmed or not) -- getOriginalAlignmentStart() and getOriginalAligmentEnd()
   * Updated ReduceReads MD5s accordingly
2012-01-26 17:06:36 -05:00
Eric Banks 07f72516ae Unsupported platform should be a user error 2012-01-26 16:14:25 -05:00
Ryan Poplin cdff23269d HaplotypeCaller now uses insertions and softclipped bases as possible triggers. LocusIteratorByState tags pileup elements with the required info to make this calculation efficient. The days of the extended event pileup are coming to a close. 2012-01-26 15:56:33 -05:00
Ryan Poplin 44c3a41f67 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2012-01-26 13:25:33 -05:00
Ryan Poplin dbe9eb70fe Updating HC integration tests after merge 2012-01-26 13:25:22 -05:00
Guillermo del Angel bd27f3bc9b Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2012-01-26 12:52:16 -05:00
Guillermo del Angel 67c89cadad Fixes for pool caller to match UG outputs at certain sites: implemented a min base qual/min mapping qual read filter so those reads are filtered from pileups, and implemented a filter for sites that have too large a fraction of deletions 2012-01-26 12:52:00 -05:00
Ryan Poplin 25532bdc37 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2012-01-26 11:43:32 -05:00
Ryan Poplin 390d493049 Updating ActiveRegionWalker interface to output a probability of active status instead of a boolean. Integrator runs a band-pass filter over this probability to produce actual active regions. First version of HaplotypeCaller which decides for itself where to trigger and assembles those regions. 2012-01-26 11:37:08 -05:00
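As a rough illustration of the idea in 390d493049 (per-locus activity probabilities are filtered and then turned into regions), the sketch below smooths the probabilities and reports the runs that exceed a threshold. It substitutes a plain centered moving average for the band-pass filter named in the commit. The class name, method names, window half-width, and threshold are all assumptions made for this example, not the ActiveRegionWalker API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; a moving average stands in for the commit's band-pass filter.
class ActiveRegionSketch {

    // Smooth per-locus probabilities with a centered moving average.
    // The window is truncated at the ends of the array.
    static double[] smooth(double[] prob, int halfWidth) {
        double[] out = new double[prob.length];
        for (int i = 0; i < prob.length; i++) {
            int lo = Math.max(0, i - halfWidth);
            int hi = Math.min(prob.length - 1, i + halfWidth);
            double sum = 0.0;
            for (int j = lo; j <= hi; j++) {
                sum += prob[j];
            }
            out[i] = sum / (hi - lo + 1);
        }
        return out;
    }

    // Return inclusive {start, end} index pairs for each run of loci whose
    // smoothed probability is strictly above the threshold.
    static List<int[]> activeRegions(double[] prob, int halfWidth, double threshold) {
        double[] smoothed = smooth(prob, halfWidth);
        List<int[]> regions = new ArrayList<>();
        int start = -1;
        for (int i = 0; i < smoothed.length; i++) {
            boolean active = smoothed[i] > threshold;
            if (active && start < 0) {
                start = i;                               // a run begins
            } else if (!active && start >= 0) {
                regions.add(new int[] {start, i - 1});   // a run ends
                start = -1;
            }
        }
        if (start >= 0) {
            regions.add(new int[] {start, smoothed.length - 1});
        }
        return regions;
    }
}
```

Returning a probability rather than a boolean leaves the thresholding decision to the integrator, so isolated spikes or dips at single loci can be smoothed away before region boundaries are drawn.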
Eric Banks 9a63a9ae3c Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2012-01-26 00:38:25 -05:00
Eric Banks 859dd882c9 Don't make it standard for now 2012-01-26 00:38:16 -05:00
Eric Banks c5e81be978 Adding pairwise AF table. Not polished at all, but usable nonetheless. 2012-01-26 00:37:06 -05:00
Eric Banks 774e540042 Fixing broken test 2012-01-26 00:31:41 -05:00
Eric Banks 5b9c8ab01b Another quick update missed in the merge 2012-01-25 21:53:20 -05:00
Eric Banks 702a2d768f Initial version of multi-allelic summary module in VariantEval 2012-01-25 19:42:55 -05:00
Eric Banks 9a60887567 Lost an import in the merge 2012-01-25 19:41:41 -05:00
Eric Banks cba5f1a8b1 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2012-01-25 19:19:03 -05:00
Eric Banks ddaf51a50f Updated one integration test for indels 2012-01-25 19:18:51 -05:00
Eric Banks add6918f32 Cleaner, more efficient way of determining the last dependent set in the queue. 2012-01-25 16:21:10 -05:00
Menachem Fromer db645a94ca Added options to make the batch-merger more all-inclusive: keep all indels and SNPs (even filtered ones) but maintain their annotations. Also, VariantContextUtils.simpleMerge can now merge variants of all types using the hidden non-default enum MultipleAllelesMergeType=MIX_TYPES 2012-01-25 16:10:59 -05:00
Guillermo del Angel d405ec2a0d Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2012-01-25 15:53:32 -05:00
Guillermo del Angel 4337dcd7e4 More pool caller bug fixes: the QUAL field was actually multiplied by 10 (accounting for a lot of singletons that shouldn't have been there), and correct AD output 2012-01-25 15:53:03 -05:00
Guillermo del Angel 66772d0ebf Next iteration in the pool caller: more bug fixes, start of big refactoring to clean up interfaces, moved a lot of attributes that really belonged to a site up from the Pool class, and added an option, on by default, to filter out a call if there's no reference depth (instead of just skipping the call, which makes it hard to figure out what happened afterwards). 2012-01-25 15:41:08 -05:00
Eric Banks ef335a5812 Better implementation of the fix; PL index is now traversed in order. 2012-01-25 15:15:42 -05:00
Eric Banks 8e2d372ab0 Use remove instead of setting the value to null 2012-01-25 14:41:34 -05:00
Eric Banks 05816955aa It was possible that we'd clean up a matrix column too early when a dependent column aborted early (with not enough probability mass) because we weren't being smart about the order in which we created dependencies. Fixed. 2012-01-25 14:28:21 -05:00
Eric Banks 2799a1b686 Catch exception for bad type and throw as a TribbleException 2012-01-25 12:15:51 -05:00
Eric Banks 96b62daff3 Minor tweak to the warning message. 2012-01-25 11:55:33 -05:00
Eric Banks fb863dc6a7 Warn user when trying to run with EMIT_ALL_SITES with indels; better docs for that option. 2012-01-25 11:50:12 -05:00
Eric Banks e349b4b14b Allow appending with the dbSNP ID even if a (different) ID is already present for the variant rod. 2012-01-25 11:35:54 -05:00
Eric Banks ea3d4d60f2 This annotation requires rods and should be annotated as such 2012-01-25 11:35:13 -05:00
Ryan Poplin 7a26fcb86f Setting the max alternate alleles for the exact model in the HaplotypeCaller's copy of the UG engine. 2012-01-25 09:51:13 -05:00
Ryan Poplin bbefe4a272 Added option to be able to write out the active regions to an interval list file 2012-01-25 09:47:06 -05:00
Ryan Poplin 9818c69df6 Can now specify active regions to process at the command line, mainly for debugging purposes 2012-01-25 09:32:52 -05:00
Guillermo del Angel 22f0caccac Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2012-01-24 16:22:39 -05:00