Commit Graph

8709 Commits (2a565ebf90ea53e0a252fe097c778365aff3714a)

Author SHA1 Message Date
Mauricio Carneiro 2a565ebf90 embarrassing fix-up, thanks Khalid. 2012-01-26 19:58:42 -05:00
Menachem Fromer 0de6aca29b Merge branch 'master' of ssh://copper.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2012-01-26 18:26:41 -05:00
Menachem Fromer d54e237671 Take advantage of Eric's fix for multiAllelic AC calculation, and also add fix to have the original allele's INFO field be passed through for batch merging 2012-01-26 18:25:30 -05:00
Mauricio Carneiro 246e085ec9 Unit tests for GATKSAMRecord class
* new unit tests for the alignment shift properties of reduce reads
   * moved unit tests from ReadUtils that were actually testing GATKSAMRecord, not any of the ReadUtils to it.
   * cleaned up ReadUtilsUnitTest
2012-01-26 17:06:36 -05:00
Mauricio Carneiro 0d4027104f Reduced reads are now aware of their original alignments
* Added annotations for reads that had been soft clipped prior to being reduced so that we can later recuperate their original alignments (start and end).
   * Tags keep the alignment shifts, not real alignment, for better compression
   * Tags are defined in the GATKSAMRecord
   * GATKSAMRecord has new functionality to retrieve original alignment start of all reads (trimmed or not) -- getOriginalAlignmentStart() and getOriginalAligmentEnd()
   * Updated ReduceReads MD5s accordingly
2012-01-26 17:06:36 -05:00
Eric Banks 07f72516ae Unsupported platform should be a user error 2012-01-26 16:14:25 -05:00
Ryan Poplin cdff23269d HaplotypeCaller now uses insertions and softclipped bases as possible triggers. LocusIteratorByState tags pileup elements with the required info to make this calculation efficient. The days of the extended event pileup are coming to a close. 2012-01-26 15:56:33 -05:00
Ryan Poplin 44c3a41f67 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2012-01-26 13:25:33 -05:00
Ryan Poplin dbe9eb70fe Updating HC integration tests after merge 2012-01-26 13:25:22 -05:00
Guillermo del Angel bd27f3bc9b Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2012-01-26 12:52:16 -05:00
Guillermo del Angel 67c89cadad Fixes for pool caller to match UG outputs at certain sites: implement min base qual/min mapping qual read filter so those reads are filtered from pileups, and implemented filter for sites that have a too large a fraction of deletions 2012-01-26 12:52:00 -05:00
Ryan Poplin 25532bdc37 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2012-01-26 11:43:32 -05:00
Ryan Poplin 390d493049 Updating ActiveRegionWalker interface to output a probability of active status instead of a boolean. Integrator runs a band-pass filter over this probability to produce actual active regions. First version of HaplotypeCaller which decides for itself where to trigger and assembles those regions. 2012-01-26 11:37:08 -05:00
Eric Banks 9a63a9ae3c Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2012-01-26 00:38:25 -05:00
Eric Banks 859dd882c9 Don't make it standard for now 2012-01-26 00:38:16 -05:00
Eric Banks c5e81be978 Adding pairwise AF table. Not polished at all, but usable none-the-less. 2012-01-26 00:37:06 -05:00
Eric Banks 774e540042 Fixing broken test 2012-01-26 00:31:41 -05:00
Eric Banks 5b9c8ab01b Another quick update missed in the merge 2012-01-25 21:53:20 -05:00
Eric Banks 702a2d768f Initial version of multi-allelic summary module in VariantEval 2012-01-25 19:42:55 -05:00
Eric Banks 9a60887567 Lost an import in the merge 2012-01-25 19:41:41 -05:00
Eric Banks cba5f1a8b1 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2012-01-25 19:19:03 -05:00
Eric Banks ddaf51a50f Updated one integration test for indels 2012-01-25 19:18:51 -05:00
Eric Banks add6918f32 Cleaner, more efficient way of determining the last dependent set in the queue. 2012-01-25 16:21:10 -05:00
Menachem Fromer db645a94ca Added options to make the batch-merger more all-inclusive: keep all indels, SNPs (even filtered ones) but maintain their annotations. Also, VariantContextUtils.simpleMerge can now merge variants of all types using the Hidden non-default enum MultipleAllelesMergeType=MIX_TYPES 2012-01-25 16:10:59 -05:00
Guillermo del Angel d405ec2a0d Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2012-01-25 15:53:32 -05:00
Guillermo del Angel 4337dcd7e4 More pool caller bug fixes: the QUAL field was actually multiplied by 10 (accounting for a lot of singletons that shouldn't have been there), and correct AD output 2012-01-25 15:53:03 -05:00
Guillermo del Angel 66772d0ebf Next iteration in the pool caller: more bug fixes, start of big refactoring to clean up interfaces, moved up a lot of attributes that really belonged to a site up from the Pool class, added by default option to filter out a call if there's no reference depth (instead of just skipping the call which makes it hard to figure out what happened afterwards). 2012-01-25 15:41:08 -05:00
Eric Banks ef335a5812 Better implementation of the fix; PL index is now traversed in order. 2012-01-25 15:15:42 -05:00
Eric Banks 8e2d372ab0 Use remove instead of setting the value to null 2012-01-25 14:41:34 -05:00
Eric Banks 05816955aa It was possible that we'd clean up a matrix column too early when a dependent column aborted early (with not enough probability mass) because we weren't being smart about the order in which we created dependencies. Fixed. 2012-01-25 14:28:21 -05:00
Eric Banks 2799a1b686 Catch exception for bad type and throw as a TribbleException 2012-01-25 12:15:51 -05:00
Eric Banks 96b62daff3 Minor tweak to the warning message. 2012-01-25 11:55:33 -05:00
Eric Banks fb863dc6a7 Warn user when trying to run with EMIT_ALL_SITES with indels; better docs for that option. 2012-01-25 11:50:12 -05:00
Eric Banks e349b4b14b Allow appending with the dbSNP ID even if a (different) ID is already present for the variant rod. 2012-01-25 11:35:54 -05:00
Eric Banks ea3d4d60f2 This annotation requires rods and should be annotated as such 2012-01-25 11:35:13 -05:00
Ryan Poplin 7a26fcb86f Setting the max alternate alleles for the exact model in the HaplotypeCaller's copy of the UG engine. 2012-01-25 09:51:13 -05:00
Ryan Poplin bbefe4a272 Added option to be able to write out the active regions to an interval list file 2012-01-25 09:47:06 -05:00
Ryan Poplin 9818c69df6 Can now specify active regions to process at the command line, mainly for debugging purposes 2012-01-25 09:32:52 -05:00
Guillermo del Angel 22f0caccac Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2012-01-24 16:22:39 -05:00
Mauricio Carneiro 97499529c7 another small bug with the file extension. 2012-01-24 16:14:35 -05:00
Mauricio Carneiro ffd61f4c1c Refactor the Pileup Element with regards to indels
Eric reported this bug due to the reduced reads failing with an index out of bounds on what we thought was a deletion, but turned out to be a read starting with insertion.

   * Refactored PileupElement to distinguish clearly between deletions and read starting with insertion
   * Modified ExtendedEventPileup to correctly distinguish elements with deletion when creating new pileups
   * Refactored most of the lazyLoadNextAlignment() function of the LocusIteratorByState for clarity and to create clear separation between what is a pileup with a deletion and what's not one. Got rid of many useless if statements.
   * Changed the way LocusIteratorByState creates extended event pileups to differentiate between insertions in the beginning of the read and deletions.
   * Every deletion now has an offset (start of the event)
   * Fixed bug when LocusITeratorByState found a read starting with insertion that happened to be a reduced read.
   * Separated the definitions of deletion/insertion (in the beginning of the read) in all UG annotations (and the annotator engine).
   * Pileup depth of coverage for a deleted base will now return the average coverage around the deletion.
   * Indel ReadPositionRankSum test now uses the deletion true offset from the read, changed all appropriate md5's
   * The extra pileup elements now properly read by the Indel mode of the UG made any subsequent call have a different random number and therefore all RankSum tests have slightly different values (in the 10^-3 range). Updated all appropriate md5s after extremely careful inspection -- Thanks Ryan!

 phew!
2012-01-24 16:07:21 -05:00
Matt Hanna 4aacaf8916 Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2012-01-24 15:32:26 -05:00
Matt Hanna c312bd5960 Weirdly, PicardException inherits from SAMException, which means that our specialty code for
reporting malformed BAMs was actually misreporting any error that happened in the Picard layer
as a BAM ERROR.

Specifically changing PicardException to report as a ReviewedStingException; we might want to
change it in the future.  I'll followup with the Picard team to make sure they really, really
want PicardException to inherit from SAMException.
2012-01-24 15:30:04 -05:00
David Roazen 47f4440aea Merged bug fix from Stable into Unstable 2012-01-24 15:19:29 -05:00
David Roazen b07fdb1089 Rename alltests* targets in build.xml
"ant alltests" is now "ant committests"
"ant alltests.public" is now "ant committests.public"
"ant alltests.gatk.packagejar" is now "ant releasetests.gatk.packagejar"
"ant alltests.queue.packagejar" is now "ant releasetests.queue.packagejar"

This is going into both Stable + Unstable so that all Bamboo
plans can be properly updated at the same time.
2012-01-24 14:58:30 -05:00
Guillermo del Angel 8aaca83e91 Next intermediate iteration on pool caller: start using UnifiedArgumentCollection in order to have similar syntax to UG, several numerical bug fixes, add more logging to vcf (not done yet) 2012-01-24 11:45:19 -05:00
Mauricio Carneiro 7c7ca0d799 fixing bug with fastq extension
* PPP only recognized .fasta and .fq, failing when the user provided a .fastq file. Fixed.
2012-01-24 11:02:15 -05:00
Mark DePristo 0a3172a9f1 Fix for ref 0 bases for Chris
-- Disturbingly, fixing this bug doesn't actually cause an test failures.
-- Wrote a new QCRefWalker to actually check in detail that the reference bases coming into the RefWalker are all correct when comparing against a clean uncached load of the contig bases directly.
-- However, I cannot run this tool due to some kind of weird BAM error -- sending this on to Matt
2012-01-24 10:55:09 -05:00
Mauricio Carneiro 945cf03889 IntelliJ ate my import! 2012-01-23 21:46:45 -05:00
Mauricio Carneiro 2bb9525e7f Don't set base qualities if fastQ is provided
* Pacbio Processing pipeline now works with the new fastQ files outputted by the Pacbio instrument
2012-01-23 17:57:29 -05:00