Commit Graph

8996 Commits (ff26f2bf688048bbd6e2b9ffcf31cedce4fa99dd)

Author SHA1 Message Date
Christopher Hartl 25d943f706 Merge branch 'master' of ssh://chartl@ni.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2012-02-01 10:32:11 -05:00
Ryan Poplin e8528bc526 updating HaplotypeCaller integration tests 2012-02-01 09:43:19 -05:00
Ryan Poplin dc23265640 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2012-02-01 09:22:43 -05:00
Menachem Fromer 579627568e Limit to 3 ALT alleles 2012-01-31 23:39:39 -05:00
Mauricio Carneiro 08c7c07f25 Added the option of not compressing read names to ReduceReads
* When scatter/gathering, name compression cannot guarantee uniqueness. If uniqueness is important, it is recommended to turn compression off for scatter/gathering ReduceReads.
2012-01-31 17:14:57 -05:00
Ryan Poplin 056b24ccd6 Resolving merge conflicts with LocusIteratorByState 2012-01-31 16:13:32 -05:00
Ryan Poplin febc634557 Changing PileupElement's isSoftClipped to isNextToSoftClip since soft clipped bases aren't actually added to pileups, oops. Removing the intrinsic clustered variants filter from the HaplotypeCaller 2012-01-31 16:06:14 -05:00
Matt Hanna 7f70612beb Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2012-01-31 11:59:25 -05:00
Matt Hanna a630db1703 Oops...HierarchicalMicroScheduler was transforming any exception from the walker level into a ReviewedStingException.
Thanks to Ryan for pointing this out.
2012-01-31 11:58:21 -05:00
Mauricio Carneiro a7f5d26326 No more synthetic reads starting/ending with deletions
bug reported by Kristian Cibulskis that we were generating filtered data synthetic reads with leading deletions. Added integration test.
2012-01-31 11:41:36 -05:00
Mark DePristo 2f2f039c37 Better flow for byNegTrainingFraction 2012-01-31 10:49:46 -05:00
Mark DePristo d8a4d78854 Bugfix for exceptions with unknown source whose error was not being shown in tableau 2012-01-31 10:49:06 -05:00
Christopher Hartl faba3dd530 Merge branch 'master' of ssh://chartl@ni.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2012-01-31 10:25:29 -05:00
Mauricio Carneiro 17dbe9a95d A few cleanups in the LocusIteratorByState
* No more N's in the extended event pileups
   * Only add to the pileup MQ0 counter if the read actually goes into the pileup
2012-01-31 09:40:51 -05:00
Menachem Fromer e7ace8efc4 Fix NullPointerException caused in cases with too many ALT alleles 2012-01-30 21:00:16 -05:00
Ryan Poplin f9162ea705 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2012-01-30 19:45:19 -05:00
Ryan Poplin abb91cf26b Increasing the size of the active regions that are produced by the active probability integrator; more context is needed to call more complex events 2012-01-30 15:36:12 -05:00
Mauricio Carneiro d5d4fa8a88 Fixed discordance bug reported by Brad Chapman
discordance now reports discordance between genotypes as well (just like concordance)
2012-01-30 09:50:45 -05:00
Menachem Fromer f1e07f169e Only apply filters if there are filters to apply 2012-01-30 02:22:13 -05:00
Menachem Fromer d1aa5204d7 Merge branch 'master' of ssh://copper.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2012-01-29 23:39:34 -05:00
Menachem Fromer 3186f0f1b0 Try more memory and fewer ALT alleles so that we don't run out of memory 2012-01-29 23:38:32 -05:00
Mark DePristo 3164c8dee5 S3 upload now directly creates the XML report in memory and puts that in S3
-- This is a partial fix for the problem with uploading S3 logs reported by Mauricio. There, the problem is that java.io.tmpdir is not accessible (the network just hangs), so the S3 upload fails because the underlying system uses tmpdir for caching, etc. As far as I can tell there's no way around this bug -- you cannot overload java.io.tmpdir programmatically, and even if we could, what value would we use? The only solution, it seems to me, is to detect that tmpdir is hanging (how?!) and fail with a meaningful error.
2012-01-29 15:14:58 -05:00
Menachem Fromer 0e17cbbce9 Merged bug fix from Stable into Unstable 2012-01-27 16:03:16 -05:00
Menachem Fromer a9671b73ca Fix to permit proper handling of mapping qualities from 128 to 255 (which get converted to byte values of -128 to -1) 2012-01-27 16:01:30 -05:00
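The signed-byte wraparound described in the commit above can be illustrated with a minimal sketch. The class and method names here are hypothetical and are not taken from the GATK source; the sketch only assumes that a mapping quality in the range 0 to 255 was stored in a Java `byte` and needs to be read back as an unsigned value.

```java
// Hypothetical helper for illustration only; not the code changed in the commit.
final class MappingQualityBytes {
    private MappingQualityBytes() {}

    /**
     * Interprets a signed Java byte as an unsigned value in [0, 255].
     * Masking with 0xFF widens to int and discards the sign extension,
     * so -128..-1 map back to 128..255 and 0..127 are unchanged.
     */
    static int toUnsigned(byte stored) {
        return stored & 0xFF;
    }

    public static void main(String[] args) {
        byte stored = (byte) 200;               // narrowing cast wraps 200 to -56
        System.out.println(stored);             // prints -56
        System.out.println(toUnsigned(stored)); // prints 200
    }
}
```

Since Java 8, `Byte.toUnsignedInt(byte)` performs the same conversion; the explicit mask is shown because it also works on older runtimes.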
Ryan Poplin f7ac1f4a69 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2012-01-27 15:12:55 -05:00
Ryan Poplin fc08235ff3 Bug fix in active region traversal: locusView.getNext() skips over pileups with zero coverage, but we still need to count them in the active probability integrator 2012-01-27 15:12:37 -05:00
Mauricio Carneiro 052a4bdb9c Turning off PHONE HOME option in the MDCP
* MDCP is for internal use and there is no need to report to the Amazon cloud.
   * Reporting to AWS_S3 is not allowing jobs to finish, this is probably a bug.
2012-01-27 11:13:30 -05:00
Mauricio Carneiro f8f2152f9c fixing ReduceReads MD5s
now that we're using OP instead of OS.
2012-01-27 10:53:24 -05:00
Mark DePristo 0f2e8400b5 Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2012-01-27 10:12:50 -05:00
Mauricio Carneiro ec9920b04f Updating the SAM TAG for Original Alignment Start to "OP"
per Mark's recommendation to reuse the Indel Realigner tag that made it to the SAM spec. The Alignment end tag is still "OE" as there is no official tag to reuse.
2012-01-27 08:51:39 -05:00
Mark DePristo 13d1626f51 Minor improvements in ref QC walker. Unfortunately this doesn't actually catch Chris's error 2012-01-27 08:24:22 -05:00
Mark DePristo cb04c0bf11 Removing javaassist 3.7, lucene library dependencies 2012-01-27 08:24:22 -05:00
Mauricio Carneiro 2a565ebf90 embarrassing fix-up, thanks Khalid. 2012-01-26 19:58:42 -05:00
Menachem Fromer 0de6aca29b Merge branch 'master' of ssh://copper.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2012-01-26 18:26:41 -05:00
Menachem Fromer d54e237671 Take advantage of Eric's fix for multiAllelic AC calculation, and also add fix to have the original allele's INFO field be passed through for batch merging 2012-01-26 18:25:30 -05:00
Mauricio Carneiro 246e085ec9 Unit tests for GATKSAMRecord class
* new unit tests for the alignment shift properties of reduce reads
   * moved to it the unit tests from ReadUtils that were actually testing GATKSAMRecord rather than any of the ReadUtils
   * cleaned up ReadUtilsUnitTest
2012-01-26 17:06:36 -05:00
Mauricio Carneiro 0d4027104f Reduced reads are now aware of their original alignments
* Added annotations for reads that had been soft clipped prior to being reduced so that we can later recuperate their original alignments (start and end).
   * Tags keep the alignment shifts, not real alignment, for better compression
   * Tags are defined in the GATKSAMRecord
   * GATKSAMRecord has new functionality to retrieve original alignment start of all reads (trimmed or not) -- getOriginalAlignmentStart() and getOriginalAligmentEnd()
   * Updated ReduceReads MD5s accordingly
2012-01-26 17:06:36 -05:00
Eric Banks 07f72516ae Unsupported platform should be a user error 2012-01-26 16:14:25 -05:00
Ryan Poplin cdff23269d HaplotypeCaller now uses insertions and softclipped bases as possible triggers. LocusIteratorByState tags pileup elements with the required info to make this calculation efficient. The days of the extended event pileup are coming to a close. 2012-01-26 15:56:33 -05:00
Ryan Poplin 44c3a41f67 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2012-01-26 13:25:33 -05:00
Ryan Poplin dbe9eb70fe Updating HC integration tests after merge 2012-01-26 13:25:22 -05:00
Christopher Hartl 673ceadd11 While this fix worked for the evaluator module, it could potentially have bad effects in the phasing walkers. Special-case nocalls in the PhasingEvaluator and return AllelePair to previous state. 2012-01-26 13:06:36 -05:00
Christopher Hartl 9c6fda7e15 Yup. I was right. 2012-01-26 12:54:11 -05:00
Guillermo del Angel bd27f3bc9b Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2012-01-26 12:52:16 -05:00
Guillermo del Angel 67c89cadad Fixes for pool caller to match UG outputs at certain sites: implemented a min base qual/min mapping qual read filter so those reads are filtered from pileups, and implemented a filter for sites that have too large a fraction of deletions 2012-01-26 12:52:00 -05:00
Christopher Hartl 7d059540a4 Allow segments of genome to be excluded in generating a reference panel. Occasionally targets would contain no variation (typically, in the middle of the centromere), which beagle doesn't particularly like, and errors out rather than producing empty output files. The best way to deal with these is to just exclude the regions on a second-pass, and the remaining bits will be gathered with no additional work.
AllelePair is being mean and not telling me what genotype it sees when it finds a non-diploid genotype, but I suspect it's a no-call (".") rather than a no-call ("./.").
2012-01-26 12:43:52 -05:00
Christopher Hartl 9d4b84f6bd Merge branch 'master' of ssh://chartl@ni.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2012-01-26 12:38:24 -05:00
Ryan Poplin 25532bdc37 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2012-01-26 11:43:32 -05:00
Ryan Poplin 390d493049 Updating ActiveRegionWalker interface to output a probability of active status instead of a boolean. Integrator runs a band-pass filter over this probability to produce actual active regions. First version of HaplotypeCaller which decides for itself where to trigger and assembles those regions. 2012-01-26 11:37:08 -05:00
Eric Banks 9a63a9ae3c Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2012-01-26 00:38:25 -05:00