Commit Graph

6268 Commits (6450b48951a28fceed4e4a46cd9edf7ec080ad34)

Author SHA1 Message Date
Mark DePristo 6450b48951 manageGATKS3Logs now downloads and deletes files in groups of 100 files by default. This should significantly improve the performance of the S3 log synchronization. 2011-07-13 10:14:38 -04:00
Eric Banks 969227c657 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-13 10:01:28 -04:00
Eric Banks 797c50e689 Fixing integration tests I broke yesterday; removing batch merging test since we don't support that anymore. 2011-07-13 10:01:23 -04:00
Eric Banks 6007eea3ff Allowing VCF records without GTs in VCF4.1 2011-07-13 09:56:08 -04:00
Christopher Hartl 95040d95b9 Adding in check for filtered site (Sorry Mark, looks like it wasn't checking the validated rod, only the mask). Also by allowing user to lowercase SNPs, could miss 'SNP_TOO_NEAR_PROBE', now we properly check for that. 2011-07-12 19:21:26 -04:00
Christopher Hartl 61dad4f090 Merge branch 'master' of ssh://chartl@tin.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-12 18:33:30 -04:00
Christopher Hartl 30768eccbb Big change to PSP2: Amplicon sequence no longer lower-cased for repetitiveness, but instead for non-uniqueness via alignment by bwa. Performance heavily dependent on length of sequence (duh), with size=30 a good balance, but default is 20 because that's the default length of a sequenom primer. Indentation changes to other stuff. 2011-07-12 18:33:12 -04:00
Mauricio Carneiro 60870b360a Fixed the calculation of the allele count prior, it was using ac instead of maxAlleleCount. 2011-07-12 17:18:52 -04:00
Mauricio Carneiro d21388f0cb Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-12 15:56:38 -04:00
Mauricio Carneiro 775d2c2598 Added VQSR to the ReducedBAM evaluation script. Our tests need to address the annotations used by VQSR so we can actually measure the impact of changing the parameters in the ReduceReads walker (especially context size). 2011-07-12 15:56:24 -04:00
Ryan Poplin c944019678 Adding dev qscript I used to perform the exome t2d+1kg calling experiment 2011-07-12 15:44:45 -04:00
Ryan Poplin 837fb8f689 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-12 15:39:26 -04:00
Ryan Poplin 5077c94d85 Adding MappingQualityUnavailableReadFilter to the SNP and indel CountCovariates 2011-07-12 15:39:07 -04:00
Mark DePristo 2092fb439c Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-12 15:29:52 -04:00
Roger Zurawicki 6ee8a86197 Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-12 15:26:13 -04:00
Roger Zurawicki 7991d69ba4 I added an experimental function in the ReducedBAMEvaluation.
At the beginning of the script you can specify a list of values to test and it will process the ReduceReadsWalker for your parameter.

I also added a method to convert the number when naming files to be sort-friendly.
2011-07-12 15:25:43 -04:00
Mark DePristo 01fd6a6949 Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-12 15:20:44 -04:00
Mark DePristo ccedd6ff4c Difference is now the general form -- used to be SummarizedDifference. The old Difference class is now a subclass of Difference that includes pointers to the specific master and test DiffElements.
Added a size() function that calculates the number of elements in the tree from a DiffElement.
2011-07-12 15:20:28 -04:00
Eric Banks a2597e7f00 This commit incorporates several different changes that each pretty much break all the VCF-based integration tests, so I bunched them all together. We now officially emit VCF4.1 files (woo hoo), which means that the VCF headers are now all different (header version is 4.1 plus counts for some of the annotations are 'A' or 'G'). Also, I've added a Read Filter for reads with MQ=255 ('unavailable' in the SAM spec) and have applied this to the UG and the RMS MQ annotation. 2011-07-12 14:11:53 -04:00
Ryan Poplin 329c3d8050 Merged bug fix from Stable into Unstable 2011-07-12 13:55:51 -04:00
Ryan Poplin 73735863b0 Fix for the case of requesting genotype for a sample that doesn't exist in a VariantContext 2011-07-12 13:55:21 -04:00
Guillermo del Angel c4c145afb9 Merged bug fix from Stable into Unstable 2011-07-12 13:44:48 -04:00
Guillermo del Angel cfe43e3971 Bug fix for Genotype given alleles: if we are in INDEL mode ignore SNPs and MNPs instead of emitting an empty site with alleles but no annotations 2011-07-12 13:43:46 -04:00
Mark DePristo 05212aea62 reader now takes an argument for the maximum number of elements to read from the file. 2011-07-12 08:53:19 -04:00
Mark DePristo 8056a3fe89 getElement() now uses O(1) get from hash instead of linear O(n) search. Enables us to read large files easily. 2011-07-12 08:52:31 -04:00
Mark DePristo f313e14e4e Now deletes the dump directory on ant clean
Moving diffengine tests from private to public
2011-07-12 08:50:58 -04:00
Eric Banks d7d15019dd Adding support for other simple header line types (e.g. ALT) and cleaning up the interface a bit. 2011-07-12 01:16:21 -04:00
Eric Banks 400b0d4422 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-11 23:38:57 -04:00
Mark DePristo d5056ad899 Merge branch 'master' into diffit 2011-07-11 23:16:15 -04:00
Mark DePristo 893cc2e103 Making the package public, so there are no dependencies from public -> private 2011-07-11 23:15:08 -04:00
Mark DePristo 5e593793af DiffEngine utility function simpleDiffFiles
printSummaryReport now uses GATKReport for nice formatting
Moved print formatting arguments into inner class provided to printing functions themselves, not the class
BAMDiffableReader only reads 1000 entries to avoid performance issue.  Work around for BAM files with non-unique names
Uncommented all of the incorrectly commented out CombineVariants integrationtests
BaseTest now uses DiffEngine to provide inline differences to VCF and BAM files
2011-07-11 23:10:27 -04:00
Khalid Shakir d11155ce2e Merge branch 'master' of ssh://gsa3.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-11 19:19:54 -04:00
Khalid Shakir e93052a51e When generating the QGraph, don't regenerate if there aren't scatter/gather jobs.
Fixed a display issue with the number of milliseconds that Queue has tried to contact LSF.
2011-07-11 19:17:58 -04:00
Eric Banks e3748675db Support for VCF 4.1 header counts 2011-07-11 17:40:45 -04:00
Christopher Hartl d6517adb42 Merge branch 'master' of ssh://chartl@tin.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-11 16:16:37 -04:00
Christopher Hartl 86890c6357 N and K (in binomial probability) got switched in RFA Walker with the last commit. No longer will NaNs be produced.
Added: TableToVCF. Kind of a longer-term project, but there are lots of variant calls available in a weird tabular format. I used this to convert Ju Et Al small indels to VCF. I'll check against the 1000G ASN superpopulation calls to see if we see a good amount of recapitulation, and if so, I'll put them in unvalidated comparisons. Minor changes to the TableCodec and TableFeatures to allow for this (the codec can sometimes drop a column, and the feature now allows you to grab on to its header).
2011-07-11 16:16:15 -04:00
Mark DePristo b327fa3779 Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-11 15:20:45 -04:00
Mark DePristo 41db509a17 A simple python program for downloading S3 logs in the cron script. 2011-07-11 15:20:01 -04:00
David Roazen a18380ab96 Merged bug fix from Stable into Unstable 2011-07-11 12:16:50 -04:00
David Roazen 8a78414432 Removed TileCovariate as a dependency for AnalyzeCovariates.jar 2011-07-11 12:10:11 -04:00
Guillermo del Angel 6e7b5e1e7a Merged bug fix from Stable into Unstable
Merge branch 'master' into unstable
2011-07-08 21:19:45 -04:00
Guillermo del Angel 7fbc5987d0 Merge branch 'master' of ssh://delangel@nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/stable 2011-07-08 21:17:32 -04:00
David Roazen 68e19edf59 Merged bug fix from Stable into Unstable, and resolved merge conflicts.
Conflicts:
	build.xml
	settings/ivysettings.xml
2011-07-08 15:50:31 -04:00
David Roazen a3c9d9c3ff Fixing Contracts for Java, and enabling contracts by default for unit/integration tests.
The NullPointerException we were seeing when trying to run with contracts enabled was being caused
by an outdated version of the asm library.

To run tests without contracts and disable their compilation, pass in "-Duse.contracts=false" to ant.

Also did some minor unrelated cleanup in build.xml
2011-07-08 15:34:39 -04:00
Mark DePristo bd29236684 Merge branch 'master' into diffengine 2011-07-08 14:08:17 -04:00
Mark DePristo 8de82f3974 Updated names to be more reflective of the fact that this works for exomes and WG now. 2011-07-08 14:07:28 -04:00
Mark DePristo ae02eabc93 Since it now works with all classes of variants, should really be renamed 2011-07-08 14:04:59 -04:00
Mark DePristo 2ea36b06cc Really works now with files where (1) there's no functional annotation and (2) there's no indel calls. 2011-07-08 14:04:00 -04:00
Christopher Hartl 38d9b9b568 A printf from debugging made it in in some prior commit.
The read transform adding the AI tag can cause an exception for widowed reads -- added a check for this case, preventing blowup.
2011-07-08 13:13:58 -04:00
Ryan Poplin 51338cbe07 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-08 12:49:00 -04:00