Commit Graph

8681 Commits (8e2d372ab0649a003745ffbd319e64e6f5df2e25)

Author SHA1 Message Date
Eric Banks 8e2d372ab0 Use remove instead of setting the value to null 2012-01-25 14:41:34 -05:00
Eric Banks 05816955aa It was possible that we'd clean up a matrix column too early when a dependent column aborted early (with not enough probability mass) because we weren't being smart about the order in which we created dependencies. Fixed. 2012-01-25 14:28:21 -05:00
Eric Banks 2799a1b686 Catch exception for bad type and throw as a TribbleException 2012-01-25 12:15:51 -05:00
Eric Banks 96b62daff3 Minor tweak to the warning message. 2012-01-25 11:55:33 -05:00
Eric Banks fb863dc6a7 Warn user when trying to run with EMIT_ALL_SITES with indels; better docs for that option. 2012-01-25 11:50:12 -05:00
Eric Banks e349b4b14b Allow appending with the dbSNP ID even if a (different) ID is already present for the variant rod. 2012-01-25 11:35:54 -05:00
Eric Banks ea3d4d60f2 This annotation requires rods and should be annotated as such 2012-01-25 11:35:13 -05:00
Ryan Poplin 7a26fcb86f Setting the max alternate alleles for the exact model in the HaplotypeCaller's copy of the UG engine. 2012-01-25 09:51:13 -05:00
Ryan Poplin bbefe4a272 Added option to be able to write out the active regions to an interval list file 2012-01-25 09:47:06 -05:00
Ryan Poplin 9818c69df6 Can now specify active regions to process at the command line, mainly for debugging purposes 2012-01-25 09:32:52 -05:00
Guillermo del Angel 22f0caccac Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2012-01-24 16:22:39 -05:00
Mauricio Carneiro 97499529c7 another small bug with the file extension. 2012-01-24 16:14:35 -05:00
Mauricio Carneiro ffd61f4c1c Refactor the Pileup Element with regards to indels
Eric reported this bug due to the reduced reads failing with an index out of bounds on what we thought was a deletion, but turned out to be a read starting with insertion.

   * Refactored PileupElement to distinguish clearly between deletions and read starting with insertion
   * Modified ExtendedEventPileup to correctly distinguish elements with deletion when creating new pileups
   * Refactored most of the lazyLoadNextAlignment() function of the LocusIteratorByState for clarity and to create clear separation between what is a pileup with a deletion and what's not one. Got rid of many useless if statements.
   * Changed the way LocusIteratorByState creates extended event pileups to differentiate between insertions in the beginning of the read and deletions.
   * Every deletion now has an offset (start of the event)
   * Fixed bug when LocusITeratorByState found a read starting with insertion that happened to be a reduced read.
   * Separated the definitions of deletion/insertion (in the beginning of the read) in all UG annotations (and the annotator engine).
   * Pileup depth of coverage for a deleted base will now return the average coverage around the deletion.
   * Indel ReadPositionRankSum test now uses the deletion true offset from the read, changed all appropriate md5's
   * The extra pileup elements now properly read by the Indel mode of the UG made any subsequent call have a different random number and therefore all RankSum tests have slightly different values (in the 10^-3 range). Updated all appropriate md5s after extremely careful inspection -- Thanks Ryan!

 phew!
2012-01-24 16:07:21 -05:00
Matt Hanna 4aacaf8916 Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2012-01-24 15:32:26 -05:00
Matt Hanna c312bd5960 Weirdly, PicardException inherits from SAMException, which means that our specialty code for
reporting malformed BAMs was actually misreporting any error that happened in the Picard layer
as a BAM ERROR.

Specifically changing PicardException to report as a ReviewedStingException; we might want to
change it in the future.  I'll followup with the Picard team to make sure they really, really
want PicardException to inherit from SAMException.
2012-01-24 15:30:04 -05:00
David Roazen 47f4440aea Merged bug fix from Stable into Unstable 2012-01-24 15:19:29 -05:00
David Roazen b07fdb1089 Rename alltests* targets in build.xml
"ant alltests" is now "ant committests"
"ant alltests.public" is now "ant committests.public"
"ant alltests.gatk.packagejar" is now "ant releasetests.gatk.packagejar"
"ant alltests.queue.packagejar" is now "ant releasetests.queue.packagejar"

This is going into both Stable + Unstable so that all Bamboo
plans can be properly updated at the same time.
2012-01-24 14:58:30 -05:00
Guillermo del Angel 8aaca83e91 Next intermediate iteration on pool caller: start using UnifiedArgumentCollection in order to have similar syntax to UG, several numerical bug fixes, add more logging to vcf (not done yet) 2012-01-24 11:45:19 -05:00
Mauricio Carneiro 7c7ca0d799 fixing bug with fastq extension
* PPP only recognized .fasta and .fq, failing when the user provided a .fastq file. Fixed.
2012-01-24 11:02:15 -05:00
Mark DePristo 0a3172a9f1 Fix for ref 0 bases for Chris
-- Disturbingly, fixing this bug doesn't actually cause an test failures.
-- Wrote a new QCRefWalker to actually check in detail that the reference bases coming into the RefWalker are all correct when comparing against a clean uncached load of the contig bases directly.
-- However, I cannot run this tool due to some kind of weird BAM error -- sending this on to Matt
2012-01-24 10:55:09 -05:00
Mauricio Carneiro 945cf03889 IntelliJ ate my import! 2012-01-23 21:46:45 -05:00
Mauricio Carneiro 2bb9525e7f Don't set base qualities if fastQ is provided
* Pacbio Processing pipeline now works with the new fastQ files outputted by the Pacbio instrument
2012-01-23 17:57:29 -05:00
Mark DePristo b6c816fe12 Turn off unnecessary printing in analyzeRunReports 2012-01-23 17:49:37 -05:00
Mark DePristo 1172517abb Bugfix for version parsing.
-- Now maps anything that doesn't exactly fit our git / svn schemes to unknown
-- Added max records and specific id options
2012-01-23 17:49:35 -05:00
Mark DePristo ceca7e0b37 Bugfix to now separate completed, sting and user exceptions. Added dry run mode 2012-01-23 17:49:34 -05:00
Mark DePristo 1f620c79e6 Add busers and bugroup information to queueStatus 2012-01-23 17:49:32 -05:00
Mark DePristo 10bc26079d bugfix to actually run correct python script 2012-01-23 17:49:31 -05:00
Mark DePristo 4b17fc3cc1 Parallel implementation of random forest training. Very cool (and easy) example of parallel processing in R 2012-01-23 17:49:29 -05:00
Mark DePristo bb203ccf0a combined analyses of snps and indels. 2012-01-23 17:49:28 -05:00
Mark DePristo 0ec6f86c21 Tests for event length, combined snps and indels. Partial infrastructure to train and eval trees. 2012-01-23 17:49:26 -05:00
Khalid Shakir c18beadbdb Device files like /dev/null are now tracked as special by Queue and are not used to generate .out file paths, scattered into a temporary directory, gathered, deleted, etc.
Attempted workaround for xdr_resourceInfoReq unsatisfied link during loading of libbat.so.
2012-01-23 16:17:04 -05:00
Christopher Hartl cc4ba7372f Why is reference_bases even an option anymore? 2012-01-23 15:18:59 -05:00
Christopher Hartl 3392d67c1a Maybe a switch to reference bases will fix this 2012-01-23 15:10:03 -05:00
Christopher Hartl 15c0c294c1 Adding in this walker to try to debug the 0-byte ref bases 2012-01-23 14:51:24 -05:00
Mark DePristo 02450e4b12 Merged bug fix from Stable into Unstable 2012-01-23 12:08:39 -05:00
Christopher Hartl 798596257b Enable the Genotype Phasing Evaluator. Because it didn't have the same argument structure as the base class, update2 of VariantEvaluator was being called, rather than update2 of the actual module. 2012-01-23 10:50:16 -05:00
Mark DePristo 80a4ce0edf Bugfix for incorrect error messages for missing BAMs and VCFs
-- Missing BAMs were appearing as StingExceptions
-- Missing VCFs were showing up as CommandLineErrors, but it's clearer for them to be CouldNotReadInputFile exceptions
-- Added integration tests to ensure missing BAMs, VCFs, and -L files are properly thrown as CouldNotReadInputFile exceptions
-- Added path to standard b37 BAM to BaseTest
-- Cleaned up code in SAMDataSource, removing my parallel loading code as this just didn't prove to be useful.
2012-01-23 09:52:07 -05:00
Guillermo del Angel 31d2f04368 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2012-01-23 09:23:03 -05:00
Guillermo del Angel 966387ca0b Next intermediate commit in the pool caller. Lots of bug fixes and now we can emit true vcf's with calls in discovery mode (still of unknown quality) - old validation mode is temporarily broken,will be fixed in next refactoring. 2012-01-23 09:22:31 -05:00
Christopher Hartl 4a08e8ca6e Minor tweaks to T2D-related qscripts. Replacing old md5s from the BeagleIntegrationTest. All differences boiled down either to the accounting of genotypes changed (./. --> 0/0 is no longer a "changed" genotype, and original genotypes that were ./. are represented as OG=. rather than OG=./. .)
This is somewhat of an arbitrary decision, and is negotiable. I could see treating

GT:PL   ./.:.

differently from

GT:PL   .:0,3,6

but am not sure the worth of doing so.
2012-01-23 08:25:34 -05:00
Ryan Poplin 4d6312d4ea HaplotypeCaller is now an ActiveRegionWalker. 2012-01-22 14:31:01 -05:00
Christopher Hartl 3b1aad4f17 After a minor and abject freakout, alter the T2D script to seek out truth sensitivities between 80 and 100, rather than between 0.8 and 1. Also, don't consider a genotype "changed by beagle" if the initial genotype is a no-call. 2012-01-20 23:43:51 -05:00
Christopher Hartl 9b4f6afa21 Alterations to scripts for better performance. Grid search now expands the sens/spec tradeoff (90 was far too aggressive against hapmap chr20), and 20 max gaussians was too many, and caused errors. For consensus genotypes: remember to gunzip the beagle outputs before converting to VCF. Also, beagle can in fact create 'null' alleles in certain circumstances. I'm not sure what exactly those circumstances are, but those sites should be ignored. When it does, all alleles apear to be set to null, so this should not affect the actual phasing in the output VCF. 2012-01-20 23:07:59 -05:00
Christopher Hartl f3564bbf43 Ugh. Darn intelliJ not telling me I was missing an import statement. 2012-01-20 13:25:11 -05:00
Christopher Hartl b902d778ca . 2012-01-20 13:22:46 -05:00
Christopher Hartl 7c6a9471e8 After ensuring MultiplyLikelihoods does what I want it to do, add a quick and simple integration test to ensure I don't break it. 2012-01-20 13:20:13 -05:00
Christopher Hartl e245cde47f A new beagle script for generating a reference panel from lowpass, exome, and chip data. This is for T2D, but potentially useful. 2012-01-20 12:48:32 -05:00
Christopher Hartl a91dd5d137 Merge branch 'master' of ssh://tin.broadinstitute.org/humgen/gsa-scr1/chartl/dev/unstable 2012-01-20 12:45:16 -05:00
Christopher Hartl 3fe73f155c Merge branch 'master' of ssh://chartl@tin.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2012-01-20 12:44:22 -05:00
Ryan Poplin 4b18786b5d Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2012-01-19 22:05:20 -05:00