Menachem Fromer
74aa49e423
Merged bug fix from Stable into Unstable
2011-07-13 12:12:42 -04:00
Menachem Fromer
fa3ff53508
Filters should only be applied to the new VC if the old VC had filters applied
2011-07-13 11:58:16 -04:00
Eric Banks
969227c657
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-13 10:01:28 -04:00
Eric Banks
6007eea3ff
Allowing VCF records without GTs in VCF4.1
2011-07-13 09:56:08 -04:00
Guillermo del Angel
1e81d521c0
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-12 20:12:29 -04:00
Ryan Poplin
837fb8f689
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-12 15:39:26 -04:00
Ryan Poplin
5077c94d85
Adding MappingQualityUnavailableReadFilter to the SNP and indel CountCovariates
2011-07-12 15:39:07 -04:00
Mark DePristo
01fd6a6949
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-12 15:20:44 -04:00
Mark DePristo
ccedd6ff4c
Difference is now the general form -- used to be SummarizedDifference. The old Difference class is now a subclass of Difference that includes pointers to the specific master and test DiffElements.
Added a size() function that calculates the number of elements in the tree from a DiffElement.
2011-07-12 15:20:28 -04:00
Eric Banks
a2597e7f00
This commit incorporates several different changes that each pretty much break all the VCF-based integration tests, so I bunched them all together. We now officially emit VCF4.1 files (woo hoo), which means that the VCF headers are now all different (header version is 4.1 plus counts for some of the annotations are 'A' or 'G'). Also, I've added a Read Filter for reads with MQ=255 ('unavailable' in the SAM spec) and have applied this to the UG and the RMS MQ annotation.
2011-07-12 14:11:53 -04:00
Ryan Poplin
329c3d8050
Merged bug fix from Stable into Unstable
2011-07-12 13:55:51 -04:00
Ryan Poplin
73735863b0
Fix for the case of requesting genotype for a sample that doesn't exist in a VariantContext
2011-07-12 13:55:21 -04:00
Guillermo del Angel
c4c145afb9
Merged bug fix from Stable into Unstable
2011-07-12 13:44:48 -04:00
Guillermo del Angel
cfe43e3971
Bug fix for Genotype given alleles: if we are in INDEL mode ignore SNPs and MNPs instead of emitting an empty site with alleles but no annotations
2011-07-12 13:43:46 -04:00
Guillermo del Angel
bfbca8b194
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-12 12:11:58 -04:00
Mark DePristo
05212aea62
reader now takes an argument for the maximum number of elements to read from the file.
2011-07-12 08:53:19 -04:00
Mark DePristo
8056a3fe89
getElement() now uses O(1) get from hash instead of linear O(n) search. Enables us to read large files easily.
2011-07-12 08:52:31 -04:00
Eric Banks
d7d15019dd
Adding support for other simple header line types (e.g. ALT) and cleaning up the interface a bit.
2011-07-12 01:16:21 -04:00
Eric Banks
400b0d4422
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-11 23:38:57 -04:00
Mark DePristo
d5056ad899
Merge branch 'master' into diffit
2011-07-11 23:16:15 -04:00
Mark DePristo
893cc2e103
Making the package public, so there are no dependencies from public -> private
2011-07-11 23:15:08 -04:00
Eric Banks
e3748675db
Support for VCF 4.1 header counts
2011-07-11 17:40:45 -04:00
Guillermo del Angel
f54c2ae3b4
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-11 16:26:27 -04:00
Christopher Hartl
d6517adb42
Merge branch 'master' of ssh://chartl@tin.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-11 16:16:37 -04:00
Christopher Hartl
86890c6357
N and K (in binomial probability) got switched in RFA Walker with the last commit. No longer will NaNs be produced.
Added: TableToVCF. Kind of a longer-term project, but there are lots of variant calls available in a weird tabular format. I used this to convert the Ju et al. small indels to VCF. I'll check against the 1000G ASN superpopulation calls to see if we see a good amount of recapitulation, and if so, I'll put them in unvalidated comparisons. Minor changes to the TableCodec and TableFeatures to allow for this (the codec can sometimes drop a column, and the feature now allows you to grab on to its header).
2011-07-11 16:16:15 -04:00
Guillermo del Angel
d587856f2d
Private feature to input a list of family descriptions from a file and to look for MVs on all of these. The feature can also output a detailed description of each violation into a separate file
2011-07-11 14:17:59 -04:00
Guillermo del Angel
6e7b5e1e7a
Merged bug fix from Stable into Unstable
Merge branch 'master' into unstable
2011-07-08 21:19:45 -04:00
Guillermo del Angel
7fbc5987d0
Merge branch 'master' of ssh://delangel@nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/stable
2011-07-08 21:17:32 -04:00
Mark DePristo
bd29236684
Merge branch 'master' into diffengine
2011-07-08 14:08:17 -04:00
Guillermo del Angel
224574424e
Bug fix: if we're genotyping a very long indel (>100 bp) fail gracefully instead of with an array out of bounds exception
2011-07-08 12:48:49 -04:00
Ryan Poplin
2a4b3ae4a2
Cleaning up / removing most of the monkeying around with annotation values that happens in VariantDataManager
2011-07-08 12:48:33 -04:00
Mark DePristo
8add2a3866
Merge branch 'master' into diffengine
2011-07-08 09:15:54 -04:00
Eric Banks
cc143493e3
Merged bug fix from Stable into Unstable
2011-07-07 23:01:24 -04:00
Eric Banks
4cfe0dd857
Test for bad alleles so that we don't generate IndexOutOfBoundsExceptions
2011-07-07 23:01:03 -04:00
Mark DePristo
3d4f0e9dd7
Now supports the case where you have multiple AC values in the info field.
2011-07-07 17:21:15 -04:00
Ryan Poplin
212e9a1a0c
Fixing unstable build after stable commit
2011-07-07 15:18:57 -04:00
Ryan Poplin
11d9a0473a
Merged bug fix from Stable into Unstable
2011-07-07 15:03:58 -04:00
Ryan Poplin
50111db2b7
Fixing non-determinism in single-threaded VQSR by moving references to cern.Normal over to the static random generator available in GenomeAnalysisEngine
2011-07-07 15:02:48 -04:00
Guillermo del Angel
4d565b0811
Merge branch 'incoming'
2011-07-07 06:21:05 -04:00
Guillermo del Angel
55c8c05060
Merge branch 'master' of ssh://delangel@nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-07 06:18:29 -04:00
Guillermo del Angel
5ab2e83904
a) Cosmetic modifications to IndelType annotation, b) Add ability to select samples from a file in PrintReads, c) Fixes to shaped AF random selection in SelectVariants
2011-07-07 06:15:10 -04:00
Eric Banks
52f6f9fdcc
Merged bug fix from Stable into Unstable
2011-07-06 16:05:48 -04:00
Eric Banks
54121eb082
Catch malformed bams that cause the writer to run in infinite loops
2011-07-06 16:05:08 -04:00
Eric Banks
76a01a7453
Merged bug fix from Stable into Unstable
2011-07-06 12:53:09 -04:00
Eric Banks
14fee4ccbd
Patch from Bob to deal with symbolic alleles: these weren't getting padded but they should be.
2011-07-06 12:51:44 -04:00
Ryan Poplin
bdef233d4d
Merged bug fix from Stable into Unstable
2011-07-06 10:05:02 -04:00
Ryan Poplin
e8ed6b7f0f
Adding more comments to main VQSR walker. Fixing copyright lines. Bug fix for default paths to now point to public/R/ instead of R/. Bug fix in VQSR for the path to the R scripts not ending in a slash.
2011-07-06 10:01:14 -04:00
Guillermo del Angel
8e8b901d12
Merged bug fix from Stable into Unstable
Merge branch 'master' into unstable
2011-07-06 09:57:55 -04:00
Guillermo del Angel
81a4d18468
Mark several indel-related arguments as @Hidden
2011-07-06 09:56:38 -04:00
Guillermo del Angel
9124c84a7c
bug fixes
2011-07-04 21:10:44 -04:00
Guillermo del Angel
bb85f232b9
bug fixes
2011-07-04 21:04:49 -04:00
Guillermo del Angel
f26ffeaea0
bug fixes
2011-07-04 20:48:45 -04:00
Guillermo del Angel
04df153f47
bug fixes
2011-07-04 20:45:10 -04:00
Guillermo del Angel
7a04872a3f
bug fixes
2011-07-04 20:33:59 -04:00
Guillermo del Angel
08bc843d4c
SelectVariants can get a table to boost AF when choosing randomly
2011-07-04 20:23:22 -04:00
Guillermo del Angel
fac082de64
Report only highest AF and AC in multiallelic records in VariantsToTable or else R can't parse table
2011-07-03 14:32:12 -04:00
Guillermo del Angel
abe9480c6d
Merge branch 'master' of ssh://delangel@nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-02 21:19:15 -04:00
Ryan Poplin
fb315b5f8c
Merge branch 'incoming'
2011-07-02 18:10:48 -04:00
Ryan Poplin
41d46059e7
fixing bad format statement
2011-07-02 18:09:17 -04:00
Ryan Poplin
3804afeb8a
Merge branch 'incoming'
2011-07-02 17:55:39 -04:00
Ryan Poplin
781c0c33a4
Use the worst X% of calls in addition to the bad training sites list. Don't include the already added calls in the calculation of X%
2011-07-02 17:55:10 -04:00
Ryan Poplin
6b8af6afd8
Merge branch 'master' of ssh://gsa1.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-02 17:15:56 -04:00
Ryan Poplin
fdc2ebb321
Adding ability to specify in VQSR a list of bad sites to use when training the negative model. Just add bad=true to the list of rod tags for your bad sites track.
2011-07-02 17:15:13 -04:00
Guillermo del Angel
09af6bbc6c
Ugh - backed out experimental code, not for public consumption, that was unintentionally committed
2011-07-02 16:58:57 -04:00
Guillermo del Angel
c6c0dba040
Merge branch 'master' of ssh://delangel@nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-02 16:45:34 -04:00
Ryan Poplin
4532a84314
Merged bug fix from Stable into Unstable
2011-07-02 10:48:55 -04:00
Ryan Poplin
5faf40b79d
Moving AnalyzeAnnotations into the archive because it has outlived its usefulness.
2011-07-02 10:39:53 -04:00
Ryan Poplin
17ff5bb094
Variant records coming out of the VQSR are now annotated with which input annotation was most divergent from the Gaussian mixture model. This gives a general sense for why each variant was removed from the callset.
2011-07-02 09:55:35 -04:00
Khalid Shakir
c65e52f88a
Merged bug fix from Stable into Unstable
2011-07-01 20:50:56 -04:00
Khalid Shakir
b6bc64a0c8
Cleanup of the utils.broad package.
Using Picard IoUtils on sample names.
2011-07-01 20:47:03 -04:00
Eric Banks
0c9105ca22
Minor fix of description
2011-07-01 18:07:35 -04:00
David Roazen
d647ea4fdc
Long-delayed change to CachingIndexedFastaSequenceFile. Made the cache
non-static to avoid problems when multiple references are used within the same
thread (e.g., during integration tests). This should kill the intermittent
IndelRealignerIntegrationTest failures.
2011-07-01 16:04:30 -04:00
Eric Banks
761347b8d5
The VariantContext utility method used by SelectVariants wasn't checking the filter status (unfiltered vs. passing filters) and always returned a VC that was passing filters. This is fixed and the md5 from the VCF Streaming test has been re-updated.
2011-06-30 15:26:09 -04:00
Mark A. DePristo
defa3cfe85
Moved around private walkers into appropriate directories in private gatk.walkers. Moved a few public walkers into private qc package, and some private qc walkers into the public directory. Removed several obviously broken and/or unused walkers.
2011-06-30 14:59:58 -04:00
Eric Banks
804d5f22d5
Reverting previous change, as promised.
2011-06-30 13:18:30 -04:00
Eric Banks
9e234cf5d6
This is a temporary commit for Picard. It will absolutely break integration tests, but I'm going to revert it in 1 minute. Because we don't want them in unstable, I need to push this into stable.
2011-06-30 13:17:14 -04:00
Guillermo del Angel
331b47afbd
Merge branch 'master' of ssh://delangel@nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-06-30 08:29:11 -04:00
Guillermo del Angel
50c32ce52e
VariantsToTableFix
2011-06-29 21:39:53 -04:00
Guillermo del Angel
9b134f3b96
VariantsToTableFix
2011-06-29 21:33:41 -04:00
Guillermo del Angel
2b88033ef4
Enable considering 454 reads, just lower GOP by 15
2011-06-29 16:12:55 -04:00
Guillermo del Angel
dc4f63a1a8
a) consensus goes to week queue
b) New experimental TechnologyComposition annotation
c) SelectVariants fixes
2011-06-29 16:00:23 -04:00
Eric Banks
70ba851478
Might as well check for the illegal state and throw an exception
2011-06-29 15:59:10 -04:00
Eric Banks
1f19afe1d9
Fixed bug in the IndelRealigner: now that variants are correctly typed in VariantContext, it is possible that a variant can be an indel but neither an insertion nor a deletion; added an isComplexIndel() method and now we check for such an event in the realigner (we don't use them to generate alternate consensuses). Also, added an isMNP() method while I was there so that it would be consistent with other variant types.
2011-06-29 15:54:09 -04:00
Guillermo del Angel
e91ae6b265
AF matching when selecting random variants
2011-06-29 15:00:26 -04:00
Guillermo del Angel
dee10140dd
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/stable
2011-06-29 13:58:04 -04:00
Eric Banks
8586c86bc4
My commit from last week to fix the old dbsnp rod conversion only worked for locus traversals. Updated now to work for all traversals.
2011-06-29 13:56:37 -04:00
Guillermo del Angel
5b6d279a2e
Two bug fixes:
a) Modified the way clipped bases are dealt with in ReadPosRankSumTest when annotating indels. Cigar string cannot be trusted because BWA can clip good high quality bases and some sites get incorrect ReadPos annotations if BWA systematically clips at an indel breakpoint.
b) PL header needs to specify "." as length. Otherwise we fail VCF validation if multiallelic sites are present.
2011-06-29 10:21:27 -04:00
David Roazen
139c6b84a1
Modified build.xml and the help extractor doclet to use the output of "git
describe" as an absolute version number (if the repository has at least one
tag), using the raw SHA-1 hash value as a fallback version number in the case
where there are no tags.
2011-06-28 08:37:05 -04:00
David Roazen
3c9497788e
Reorganized the codebase beneath top-level public and private directories,
removing the playground and oneoffprojects directories in the process. Updated
build.xml accordingly.
2011-06-28 06:55:19 -04:00