Christopher Hartl
92c7cfa1c8
BWA bindings and tests moved to public (was required for ValidationAmplicons)
...
Integration tests for ValidationAmplicons. New argument to disable BWA, lowercase letters only for repetitiveness instead.
2011-07-19 20:11:31 -04:00
Christopher Hartl
07e716d23a
PickSequenomProbes2 expanded functionality: lowercasing based on sequence uniqueness, preserving reference base prior to indel (not a part of the VC as I thought it was), masking deletion bases with 'N's, flanking insertion with 'N's, output is a fasta formatted file. Renamed to ValidationAmplicons since this is really not for picking sequenom probes, but for generating amplicon sequence from which other applications (like sequenom) can choose PCR primers. Moved from private to public.
2011-07-19 15:21:47 -04:00
Christopher Hartl
95040d95b9
Adding in check for filtered site (Sorry Mark, looks like it wasn't checking the validated rod, only the mask). Also by allowing user to lowercase SNPs, could miss 'SNP_TOO_NEAR_PROBE', now we properly check for that.
2011-07-12 19:21:26 -04:00
Christopher Hartl
61dad4f090
Merge branch 'master' of ssh://chartl@tin.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-12 18:33:30 -04:00
Christopher Hartl
30768eccbb
Big change to PSP2: Amplicon sequence no longer lower-cased for repetitiveness, but instead for non-uniqueness via alignment by bwa. Performance heavily dependent on length of sequence (duh), with size=30 a good balance, but default is 20 because that's the default length of a sequenom primer. Indentation changes to other stuff.
2011-07-12 18:33:12 -04:00
Mauricio Carneiro
60870b360a
Fixed the calculation of the allele count prior, it was using ac instead of maxAlleleCount.
2011-07-12 17:18:52 -04:00
Mauricio Carneiro
d21388f0cb
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-12 15:56:38 -04:00
Mauricio Carneiro
775d2c2598
Added VQSR to the ReducedBAM evaluation script. Our tests need to address the annotations used by VQSR so we can actually measure the impact of changing the parameters in the ReduceReads walker (especially context size).
2011-07-12 15:56:24 -04:00
Ryan Poplin
c944019678
Adding dev qscript I used to perform the exome t2d+1kg calling experiment
2011-07-12 15:44:45 -04:00
Ryan Poplin
837fb8f689
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-12 15:39:26 -04:00
Ryan Poplin
5077c94d85
Adding MappingQualityUnavailableReadFilter to the SNP and indel CountCovariates
2011-07-12 15:39:07 -04:00
Mark DePristo
2092fb439c
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-12 15:29:52 -04:00
Roger Zurawicki
6ee8a86197
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-12 15:26:13 -04:00
Roger Zurawicki
7991d69ba4
I added an experimental function to ReducedBAMEvaluation.
...
At the beginning of the script you can specify a list of values to test, and it will run the ReduceReadsWalker for each value of your parameter.
I also added a method to convert the number used when naming files into a sort-friendly form.
2011-07-12 15:25:43 -04:00
Mark DePristo
01fd6a6949
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-12 15:20:44 -04:00
Mark DePristo
ccedd6ff4c
Difference is now the general form -- used to be SummarizedDifference. The old Difference class is now a subclass of Difference that includes pointers to the specific master and test DiffElements.
...
Added a size() function that calculates the number of elements in the tree rooted at a DiffElement.
2011-07-12 15:20:28 -04:00
Eric Banks
a2597e7f00
This commit incorporates several different changes that each pretty much break all the VCF-based integration tests, so I bunched them all together. We now officially emit VCF4.1 files (woo hoo), which means that the VCF headers are now all different (header version is 4.1 plus counts for some of the annotations are 'A' or 'G'). Also, I've added a Read Filter for reads with MQ=255 ('unavailable' in the SAM spec) and have applied this to the UG and the RMS MQ annotation.
2011-07-12 14:11:53 -04:00
Ryan Poplin
329c3d8050
Merged bug fix from Stable into Unstable
2011-07-12 13:55:51 -04:00
Ryan Poplin
73735863b0
Fix for the case of requesting genotype for a sample that doesn't exist in a VariantContext
2011-07-12 13:55:21 -04:00
Guillermo del Angel
c4c145afb9
Merged bug fix from Stable into Unstable
2011-07-12 13:44:48 -04:00
Guillermo del Angel
cfe43e3971
Bug fix for Genotype given alleles: if we are in INDEL mode, ignore SNPs and MNPs instead of emitting an empty site with alleles but no annotations
2011-07-12 13:43:46 -04:00
Mark DePristo
05212aea62
reader now takes an argument for the maximum number of elements to read from the file.
2011-07-12 08:53:19 -04:00
Mark DePristo
8056a3fe89
getElement() now uses O(1) get from hash instead of linear O(n) search. Enables us to read large files easily.
2011-07-12 08:52:31 -04:00
Mark DePristo
f313e14e4e
Now deletes the dump directory on ant clean
...
Moving diffengine tests from private to public
2011-07-12 08:50:58 -04:00
Eric Banks
d7d15019dd
Adding support for other simple header line types (e.g. ALT) and cleaning up the interface a bit.
2011-07-12 01:16:21 -04:00
Eric Banks
400b0d4422
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-11 23:38:57 -04:00
Mark DePristo
d5056ad899
Merge branch 'master' into diffit
2011-07-11 23:16:15 -04:00
Mark DePristo
893cc2e103
Making the package public, so there are no dependencies from public -> private
2011-07-11 23:15:08 -04:00
Mark DePristo
5e593793af
DiffEngine utility function simpleDiffFiles
...
printSummaryReport now uses GATKReport for nice formatting
Moved print formatting arguments into inner class provided to printing functions themselves, not the class
BAMDiffableReader only reads 1000 entries to avoid performance issues. Workaround for BAM files with non-unique names
Uncommented all of the incorrectly commented out CombineVariants integration tests
BaseTest now uses DiffEngine to provide inline differences to VCF and BAM files
2011-07-11 23:10:27 -04:00
Khalid Shakir
d11155ce2e
Merge branch 'master' of ssh://gsa3.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-11 19:19:54 -04:00
Khalid Shakir
e93052a51e
When generating the QGraph, don't regenerate if there aren't scatter/gather jobs.
...
Fixed a display issue with the number of milliseconds that Queue has tried to contact LSF.
2011-07-11 19:17:58 -04:00
Eric Banks
e3748675db
Support for VCF 4.1 header counts
2011-07-11 17:40:45 -04:00
Christopher Hartl
d6517adb42
Merge branch 'master' of ssh://chartl@tin.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-11 16:16:37 -04:00
Christopher Hartl
86890c6357
N and K (in binomial probability) got switched in RFA Walker with the last commit. No longer will NaNs be produced.
...
Added: TableToVCF. Kind of a longer-term project, but there are lots of variant calls available in a weird tabular format. I used this to convert the Ju et al. small indels to VCF. I'll check against the 1000G ASN superpopulation calls to see if we see a good amount of recapitulation, and if so, I'll put them in unvalidated comparisons. Minor changes to the TableCodec and TableFeatures to allow for this (the codec can sometimes drop a column, and the feature now allows you to grab on to its header).
2011-07-11 16:16:15 -04:00
Mark DePristo
b327fa3779
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-11 15:20:45 -04:00
Mark DePristo
41db509a17
A simple Python program for downloading S3 logs in the cron script.
2011-07-11 15:20:01 -04:00
David Roazen
a18380ab96
Merged bug fix from Stable into Unstable
2011-07-11 12:16:50 -04:00
David Roazen
8a78414432
Removed TileCovariate as a dependency for AnalyzeCovariates.jar
2011-07-11 12:10:11 -04:00
Guillermo del Angel
6e7b5e1e7a
Merged bug fix from Stable into Unstable
...
Merge branch 'master' into unstable
2011-07-08 21:19:45 -04:00
Guillermo del Angel
7fbc5987d0
Merge branch 'master' of ssh://delangel@nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/stable
2011-07-08 21:17:32 -04:00
David Roazen
68e19edf59
Merged bug fix from Stable into Unstable, and resolved merge conflicts.
...
Conflicts:
build.xml
settings/ivysettings.xml
2011-07-08 15:50:31 -04:00
David Roazen
a3c9d9c3ff
Fixing Contracts for Java, and enabling contracts by default for unit/integration tests.
...
The NullPointerException we were seeing when trying to run with contracts enabled was being caused
by an outdated version of the asm library.
To run tests without contracts and disable their compilation, pass in "-Duse.contracts=false" to ant.
Also did some minor unrelated cleanup in build.xml
2011-07-08 15:34:39 -04:00
Mark DePristo
bd29236684
Merge branch 'master' into diffengine
2011-07-08 14:08:17 -04:00
Mark DePristo
8de82f3974
Updated names to be more reflective of the fact that this works for exomes and WG now.
2011-07-08 14:07:28 -04:00
Mark DePristo
ae02eabc93
Since it now works with all classes of variants, should really be renamed
2011-07-08 14:04:59 -04:00
Mark DePristo
2ea36b06cc
Really works now with files where (1) there's no functional annotation and (2) there's no indel calls.
2011-07-08 14:04:00 -04:00
Christopher Hartl
38d9b9b568
A printf from debugging made it in with some prior commit.
...
The read transform adding the AI tag can cause an exception for widowed reads -- added a check for this case, preventing blowup.
2011-07-08 13:13:58 -04:00
Ryan Poplin
51338cbe07
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-08 12:49:00 -04:00
Guillermo del Angel
224574424e
Bug fix: if we're genotyping a very long indel (>100 bp) fail gracefully instead of with an array out of bounds exception
2011-07-08 12:48:49 -04:00
Ryan Poplin
2a4b3ae4a2
Cleaning up / removing most of the monkeying around with annotation values that happens in VariantDataManager
2011-07-08 12:48:33 -04:00