1093 Commits (d00b2c6599b2d20b4b926a0aa6c5649e3fbae20a)

Author SHA1 Message Date
Mauricio Carneiro d00b2c6599 Adding a synthetic read for filtered data
* Generalized the concept of a synthetic read to create both the running consensus and synthetic reads of filtered data.
* Synthetic reads can now have deletions (but not insertions)
* New reduced read tag for filtered data synthetic reads *(RF)*
* Sliding window header now keeps information of consensus and filtered data
* Synthetic reads are created simultaneously, new functionality is controlled internally by addToSyntheticReads
2011-11-09 20:16:22 -05:00
Eric Banks 21bf43f3bb Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-11-09 15:34:40 -05:00
Eric Banks 02d5e3025e Added integration test for intervals from bed file 2011-11-09 15:34:19 -05:00
Christopher Hartl 85bffe1dca Merged bug fix from Stable into Unstable 2011-11-09 15:29:14 -05:00
Christopher Hartl d828eba7f4 Allow comments in a table-formatted file to precede the header line. 2011-11-09 15:27:38 -05:00
Eric Banks 8205efbb29 Merge branch 'master' into intervals 2011-11-09 15:27:15 -05:00
Eric Banks d64f8a89a9 Instead of the SelfScopingFeatureCodec interface, pushed this functionality into Tribble itself. Now we can e.g. determine that a file can be parsed by the BedCodec on the fly. 2011-11-09 15:24:29 -05:00
Mauricio Carneiro f080f64f99 Preserve RG information on new GATKSAMRecord from SAMRecord 2011-11-09 14:39:20 -05:00
Mauricio Carneiro f9530e0768 Clean unnecessary attributes from the read
this gives on average 40% file size reduction.
2011-11-09 14:39:20 -05:00
Mauricio Carneiro 9427ada498 Fixing no cigar bug
empty GATKSAMRecords will have a null cigar. Treat them accordingly.
2011-11-09 14:39:20 -05:00
Christopher Hartl 149b79eaad Merged bug fix from Stable into Unstable 2011-11-09 11:26:30 -05:00
Christopher Hartl 11abb4f9d1 Better error message. 2011-11-09 11:25:28 -05:00
Christopher Hartl d3a533b82e Revert "a"
This reverts commit 1175f50ddbf389f5da74d27dc725596582ae15af.
2011-11-09 11:22:26 -05:00
Christopher Hartl 5eaf800281 a 2011-11-09 11:22:20 -05:00
Christopher Hartl 5451fbc2b2 Merged bug fix from Stable into Unstable 2011-11-09 11:06:15 -05:00
Christopher Hartl 091229e4db MVLikelihoodRatio now checks if the family string is provided before attempting to instantiate. Also check that variant contexts have both genotypes and genotype likelihoods.
Table codec now yells at users for not providing a HEADER with the table - parsing tables without a header line was causing the first line of the file to be eaten.
Table feature now has a toString method.

These are minor bug fixes.
2011-11-09 11:03:29 -05:00
Mauricio Carneiro e1b4c3968f Fixing GATKSAMRecord bug
when constructing a GATKSAMRecord from scratch, we should set "mRestOfBinaryData" to null so the BAMRecord doesn't try to retrieve missing information from the non-existent bam file.
2011-11-08 16:50:36 -05:00
Ryan Poplin e973ca2010 fixing merge conflict. 2011-11-08 14:55:05 -05:00
Ryan Poplin b0e6afec48 Bug fix for HMM optimization. Need to also check the gap continuation penalty array for the index with the first discrepancy. 2011-11-08 14:51:25 -05:00
Ryan Poplin 94dc447a70 Merged bug fix from Stable into Unstable 2011-11-07 15:26:35 -05:00
Ryan Poplin 0b181be61f Bug fix in SelectVariants when using a discordance track but no sample specifications. Added integration test to test this. 2011-11-07 15:25:16 -05:00
Ryan Poplin 0534149708 Merged bug fix from Stable into Unstable 2011-11-07 14:07:08 -05:00
Ryan Poplin 2d1e385ca4 Adding note to VQSR docs about Rscript being needed in the environment PATH. 2011-11-07 14:04:13 -05:00
Eric Banks 759f4fe6b8 Moving unclaimed walker with bad integration test to archive 2011-11-07 13:16:38 -05:00
Eric Banks c1986b6335 Add notes to the GATKdocs as to when a particular annotation can/cannot be calculated. 2011-11-07 11:06:19 -05:00
Eric Banks 724e3f3b0d Merged bug fix from Stable into Unstable 2011-11-06 22:23:22 -05:00
Eric Banks cdd40d1222 Removing contracts for the SimpleTimer 2011-11-06 22:22:49 -05:00
Ryan Poplin 5c565d28b9 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-11-06 10:26:19 -05:00
Eric Banks 3517489a22 Better --sample selection integration test for VE. The previous one would return true even if --sample was not working at all. 2011-11-06 01:07:49 -04:00
Eric Banks 1c4e429a1c Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-11-06 00:05:56 -04:00
Eric Banks a12bc63e5c Get rid of support for bams without sample information in the read groups. This hidden option wasn't being used anyways because it wasn't hooked up properly in the AlignmentContext. 2011-11-05 23:54:28 -04:00
Eric Banks ad57bcd693 Adding integration test to cover using expressions with IDs (-E foo.ID) 2011-11-05 23:53:15 -04:00
Eric Banks 90a053ea93 Don't change the mapping quality of MQ=255 reads in IR 2011-11-05 22:40:45 -04:00
Ryan Poplin 611a395783 Now properly extending candidate haplotypes with bases from the reference context instead of filling with padding bases. Functionality in the private Haplotype class is no longer necessary so removing it. No need to have four different Haplotype classes in the GATK. 2011-11-05 12:18:56 -04:00
Mark DePristo e99871f587 Bug fix for decode loc
-- decodeLoc() wasn't skipping input header lines, so the system blew up when there was an = line being split.
2011-11-04 13:20:54 -04:00
Mark DePristo a340a1aeac Bug fix. decodeLoc() should update lineNo so you get a meaningful line number when indexing
fails due to malformed VCF files.
2011-11-04 11:44:24 -04:00
Mark DePristo 9f260c0dc1 Zero byte index bug fix for RandomlySplitVariants + cleanup
-- vcfWriter2 was never being closed in onTraversalDone(), so the on-the-fly index file was being created but never properly written to disk.

-- This bug is ultimately due to the inability of the GATK to allow multiple VCF output writers as @Output arguments, though

-- Removed the unnecessary local variable iFraction (= 1000 * the input fraction argument). Now the system just uses a double random number and compares it to the input fraction. Is there some subtle reason I don't appreciate for this programming construct?
2011-11-04 09:45:20 -04:00
Mauricio Carneiro e89ff063fc GATKSAMRecord refactor
The GATK engine will now provide a GATKSAMRecord to all tools which incorporates the functionality used by the GATK to the bam file (ReadGroups, Reduced Reads, ...).

* No tools should create SAMRecord anymore, use GATKSAMRecord instead *
2011-11-03 15:43:26 -04:00
Eric Banks e8bceb1eaa Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-11-02 21:13:54 -04:00
Eric Banks 78a00d2ddc Updating UG integration tests (needed updating only because the -mbq default is different from the old -mmq one). 2011-11-02 21:13:44 -04:00
Eric Banks 52b16bf739 Must check whether there's a normal vs. extended pileup before asking for it. 2011-11-02 20:45:24 -04:00
Eric Banks e1edd6bd12 Removing the min mapping quality argument since it wasn't being used in the normal processing of the pileups in UG - only for indel pileups. Instead, we apply the min base quality to the reads in the pileup for indels and define it to be the min 'confidence' of the base. Docs are updated but I didn't rename the argument as I don't want people to complain. 2011-11-02 20:32:58 -04:00
Ryan Poplin e94fcf537b Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-11-02 16:29:19 -04:00
Ryan Poplin 4d35272916 Bug fixes with Mauricio to functions in ReadUtils used by reduced reads and the haplotype caller. 2011-11-02 16:29:10 -04:00
Mark DePristo 8a2929c1dd Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-11-02 16:21:00 -04:00
Eric Banks 967ff647b8 Reduced reads shouldn't contribute to Fisher Strand calculations 2011-11-02 13:07:20 -04:00
Eric Banks cf0e699226 QualByDepth was inefficiently iterating over the pileup 2 times for some reason. Removed non-useful annotation classes. 2011-11-02 12:58:38 -04:00
Eric Banks 4501dce58d Fixing merge conflict 2011-11-02 12:50:32 -04:00
Eric Banks 54331b44e9 New way of looking at the size of a pileup: there's a physical number of elements in the data structure and there's a representative depth of coverage (since a reduced read represents depth >= 1). The size() method has been removed because its meaning is ambiguous. Updated several annotations and the UG engine to make use of the representative depths. 2011-11-02 12:47:30 -04:00
Mark DePristo 392e0aeace Moved unit tests into master IntervalUtilsUnitTest 2011-11-02 10:52:00 -04:00