Commit Graph

702 Commits (e76f3816289954bb2d5d2b31bb4ddb9dbce1ead5)

Author SHA1 Message Date
Mark DePristo e76f381628 Moved sample package from DataSources to gatk, and renamed it samples
-- All associated changes to the codebase are just header updates
2011-09-29 09:57:15 -04:00
Mark DePristo e197dcd1f3 Pre-cleanup commit of Sample and SampleDataSource
-- SampleDataSource has all reader functionality disabled
2011-09-29 09:44:18 -04:00
Mark DePristo 4d31673cc5 No longer supporting YAML file allows us to delete 75% of the sample's codebase 2011-09-29 09:43:31 -04:00
Eric Banks 1b45f21774 Removing this command-line tool. Purposely not doing this in stable so that users who may still use it have time to find other options. But the docs are no longer on the wiki. 2011-09-28 13:18:32 -04:00
Eric Banks 1f0e354fae Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-09-28 13:13:21 -04:00
Eric Banks bb619a9a3c Fixing docs 2011-09-28 13:13:03 -04:00
Mark DePristo 5812004e06 Merge branch 'stable' 2011-09-28 11:36:40 -04:00
Mark DePristo a5006831d7 Shows "" not empty space when default string value is "" 2011-09-28 11:35:52 -04:00
Mark DePristo 1e32281a15 Fix to not show -null when missing short name argument 2011-09-28 11:31:20 -04:00
Mauricio Carneiro 89544c209c Fixing contracts
changed return type to Pair, changing contracts accordingly.
2011-09-28 11:19:17 -04:00
Eric Banks eacbee3fe5 Merged bug fix from Stable into Unstable 2011-09-27 20:35:18 -04:00
Eric Banks 43b0c98298 Fix docs 2011-09-27 20:34:46 -04:00
Eric Banks 232a6df11c Add longhand form to the error message. 2011-09-27 20:29:31 -04:00
Eric Banks 1d6fcb6eb1 Revert "Add longhand form to the error message to prevent users from posting borderline dumb posts to GS."
This reverts commit 75b2600527cfce05ae683cb394290ff2a80e8552.
2011-09-27 20:27:00 -04:00
Eric Banks 269b9826b6 Add longhand form to the error message to prevent users from posting borderline dumb posts to GS. 2011-09-27 20:26:36 -04:00
Mauricio Carneiro 3b6e43b7c4 Use reads that span multiple intervals
* RR will now compress reads that span across multiple intervals correctly and output them in the correct order.
* Fixed bug in getReadCoordinateForReferenceCoordinate where if the requested reference coordinate fell inside a deletion in the read the read would be clipped up to one element past the deletion.
2011-09-27 18:39:06 -04:00
Khalid Shakir 84bd355690 Merged bug fix from Stable into Unstable 2011-09-27 14:34:39 -04:00
Khalid Shakir b090751f62 Fixed Ant / PluginManager issue where reflections was picking up all class files under current working directory due to "." in jar manifest classpaths.
Updates to HybridSelectionPipeline:
- Added annotations back via snpEff
- Minor updates to VQSR paths and lowered memory
2011-09-27 14:33:57 -04:00
Eric Banks 26e71f6688 The Omni files have multiple records (with the same ALT) at a particular location, with one PASSing and the other(s) filtered. Chris, this is why using this file as both eval and comp leads to ref/no-call cells in the GenotypeConcordance table. However, this led to non-determinism in VE because the VCs were placed in a HashSet; we use a LinkedHashMap instead to bring back determinism. 2011-09-27 11:03:17 -04:00
Mark DePristo e99ff3caae Removed lots of old, and not to be used, HMM options
-- resulted in massive code cleanup
-- GdA will integrate his new banded algorithm here
-- Removed: DO_CONTEXT_DEPENDENT_PENALTIES, GET_GAP_PENALTIES_FROM_DATA, INDEL_RECAL_FILE, dovit, GSA_PRODUCTION_ONLY
2011-09-27 10:08:40 -04:00
Mark DePristo fa0efbc4ca Refactoring of PairHMM to support reduced reads 2011-09-26 13:28:56 -04:00
Mark DePristo a6b65d6347 Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-09-26 13:26:21 -04:00
Mark DePristo 4f09453470 Refactored reduced read utilities
-- UnitTests for key functions on reduced reads
-- PileupElement calls static functions in ReadUtils
-- Simple routine that takes a reduced read and fills in its quals with its reduced qual
2011-09-26 12:58:31 -04:00
Eric Banks 234b74dd05 Merged bug fix from Stable into Unstable 2011-09-26 11:47:23 -05:00
Eric Banks 317b95fa57 Fixing some annotator docs 2011-09-26 11:46:45 -05:00
Mauricio Carneiro b76dbc72f0 Fixed interval navigation bug.
If a read was hard clipped away from the current interval, all subsequent reads within that interval (not hardclipped) would be filtered out. Fixed.
2011-09-26 08:13:44 -04:00
Guillermo del Angel 9afccd11b1 Minor refactoring: add ability to MathUtils.normalizeFromLog10 to not go to linear domain but just substract max value from log values and return. Use this function in snp and indel GL computation. 2011-09-25 21:18:56 -04:00
Guillermo del Angel 3eef800889 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-09-24 21:20:11 -04:00
Guillermo del Angel 203517fbb7 a) Cleanups/bug fixes to previous commit to CombineVariants.
b) Change md5 to reflect records that are now merged correctly.
c) Change unit merge alleles test to reflect the fact that a null non-variant vc object is not valid and not supported because there's no way to codify such object in a vcf. The code correctly converts this to a non-variant single-base event with whatever the reference is at that location.
2011-09-24 19:08:00 -04:00
Mauricio Carneiro c31f4cb2f6 Cleaning leading insertions
With the current implementation, a read cannot start with a deletion or an insertion. Maybe this will change in the future, but for now, chop the leading insertion off.
2011-09-24 14:33:32 -04:00
Guillermo del Angel cd058dd10f a) Fixed md5 for legit change in UG output that now also no-calls genotypes w/0,0,0 in PL's in SNP case.
b) First reimplementation of new vc merger of different types. Previous version did it in two steps, first merging all vc's per type and then trying to see if resulting vc's would be merged if alleles of one type were a subset of another, but this won't work when uniquifying genotypes since sample names would be messed up and GT sample names wouldn't match VC sample names. Now, it's actually simpler: when splitting vc's by type before merging, we check for alleles of one vc being a subset of alleles of vc of another type and if so we put them together in same list.
2011-09-24 13:40:11 -04:00
Mark DePristo bb11951255 Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-09-24 09:26:45 -04:00
Mark DePristo 8d9e136bba Merge branch 'stable' 2011-09-24 09:26:28 -04:00
Mark DePristo 6804ab6d2f Bug fix for NPE in very short GATK runs
-- Was already in unstable, but not stable...
2011-09-24 09:25:29 -04:00
Mark DePristo 92acff46e5 Moved Haplotype into Utils root 2011-09-24 09:14:05 -04:00
Mark DePristo f792353dcd Framework for genotype unit test 2011-09-24 08:56:45 -04:00
Mark DePristo c0bb0cb465 Make DiploidGenotype enum private to walkers.genotyper 2011-09-24 08:48:33 -04:00
Guillermo del Angel 3a4469a236 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-09-23 21:58:34 -04:00
Guillermo del Angel 0e74cc3c74 a) Treat SNP genotype likelihoods just as indels, in the sense that they're always normalized as PL's so one of them will always be zero. This creates minor numerical differences in Qual and annotations due to numerical approximations in AF computation.
b) Intermediate CombineVariants fixes, not ready yet
2011-09-23 21:58:20 -04:00
Mauricio Carneiro 7cac75ae1d Merged bug fix from Stable into Unstable 2011-09-23 19:00:43 -04:00
Mauricio Carneiro fbe3c1e0b3 Adding warning on HardClipping
Hard Clipping is still under heavy development and should not be used by anyone less prepared than MacGyver.
2011-09-23 19:00:19 -04:00
Mark DePristo b66841f179 Static cache for binomial probability
-- Very low level performance optimization
2011-09-23 17:29:34 -04:00
Mauricio Carneiro 1a45c331b2 bringing the latest bug fixes to Reduce Reads 2011-09-23 16:40:06 -04:00
Mauricio Carneiro 9ea40f2e41 Deletions/Insertions in hard clip and bug fixes
* Deletions now count as hard clipped bases in order to recover the original alignment start of a clipped read.
* Insertions do not  count as hard clipped bases for the same reason.
* This created a bug in the previous cigar cleaning function. Fixed.
2011-09-23 16:37:08 -04:00
David Roazen 40202c85e0 Merged bug fix from Stable into Unstable 2011-09-23 16:35:55 -04:00
David Roazen e1cb5f6459 SnpEff annotator now assigns a functional class to each effect and distinguishes between actual effects and mere modifiers.
-We now assign a functional class (nonsense, missense, silent, or none) to each SnpEff effect, and add a
 SNPEFF_FUNCTIONAL_CLASS annotation to the INFO field of the output VCF.
-Effects are now prioritized according to both biological impact and functional class, instead of impact only.
-Many of SnpEff's "low-impact" effects are now classified as "modifiers" with lower priority than every
 other effect. This includes such "effects" as DOWNSTREAM, UPSTREAM, INTRON, GENE, EXON, and others that
 really describe the location of the variant rather than its biological effect.

This code will be short-lived (likely 1.2-only), as the next version of SnpEff will include most of these
features directly.

Checking this change into Stable+Unstable instead of Unstable because the current functional class stratification
in VariantEval is basically broken and urgently needs to be fixed for production purposes.
2011-09-23 16:06:52 -04:00
Matt Hanna e388c357ca Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-09-23 14:53:28 -04:00
Matt Hanna cc23b0b8a9 Fix for recent change modelling unmapped shards: don't invoke optimization to combine mapped and unmapped shards. 2011-09-23 14:52:31 -04:00
Mark DePristo e3d4efb283 Remove N2 EXACT model code, which should never be used 2011-09-23 11:55:21 -04:00
Mark DePristo 27ce3c822e Merge branch 'stable' 2011-09-23 09:04:52 -04:00