
111 Commits (83ba2c066a75ac7ebb10a60934b4dfa7dc89c2cf)

Author SHA1 Message Date
Eric Banks 83ba2c066a Making it deterministic 2011-07-18 13:59:02 -04:00
Eric Banks 92fa410450 Check that it's a valid BAM file before parsing or bad things can happen 2011-07-18 13:43:34 -04:00
Eric Banks 80b5c5261a CombineVariants no longer combines records of different types. So now when combining SNP and indel callsets, overlapping calls get their own records. Useful for Khalid in the pipeline. For those interested, it turns out the previous behavior was doing the wrong thing occasionally (and this was even captured in the integration tests). 2011-07-18 13:42:45 -04:00
Eric Banks bc8b5da698 Added docs while I was reading through the code to understand it 2011-07-18 12:25:54 -04:00
Mark DePristo 51b0dd01c3 Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-18 10:47:29 -04:00
Mark DePristo d6e2e89f99 Walker test system refactoring. All MD5DB related functions are now in MD5DB.java.
The system now has the concept of a local and a global MD5 db. The local one operates as it did previously. The global one lives in /humgen/gsa-hpprojects/GATK/data/integrationtests. If the system can find this directory, then MD5s will also be read from and written to this location. This means that gsabamboo will print differences as appropriate, and all users will in effect have access to a complete history of MD5 file results.
A few minor code reshuffles changed VariantRecalibration and VCFHeader test files.
2011-07-18 10:46:01 -04:00
Mark DePristo 6f26c07b85 Removed the SpecificDifference class. Now Difference classes always have the option to remember specific master and test values. This means that all summarized differences carry with them specific examples of their differences. Consequently, now even summarized differences give at least one example of the specific difference, even when the count of the difference is > 1. Unit tests updated. Added DiffObjects integrationtest. VCFDiffableReader now specifically reads the first line of the VCF file to capture the version number. 2011-07-18 10:42:35 -04:00
Kiran V Garimella b2b7d27fed Merge branch 'laptop' 2011-07-18 00:25:46 -04:00
Kiran V Garimella 497721a799 Added class documentation string. 2011-07-18 00:25:21 -04:00
Kiran V Garimella ac9c66138d Merge branch 'master' of ssh://copper.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-18 00:20:33 -04:00
Kiran V Garimella 824100e57f Corrected typo in MergeAndMatchHaplotypes integration test 2011-07-17 22:50:54 -04:00
Kiran V Garimella 8167aba601 Moved (poorly named) MergeAndMatchHaplotypes to public. Added integration test 2011-07-17 22:47:32 -04:00
Kiran V Garimella afb506e128 Added MD5s for PhaseByTransmission integration tests 2011-07-17 21:55:33 -04:00
Kiran V Garimella 558e197989 Integration test for PhaseByTransmission 2011-07-17 21:25:08 -04:00
Mark DePristo 9992c373be Optimize imports run on the whole project, public and private. I just got too tired of all of the unused imports floating around. Confirmed that the system builds after the changes. 2011-07-17 20:29:58 -04:00
Kiran V Garimella 4ea433f8e1 Moved PhaseByTransmission to public 2011-07-17 19:42:00 -04:00
Mark DePristo 9ca9cf52ac Uncommenting a stray commented test. 2011-07-17 15:38:33 -04:00
Mark DePristo 4db2b13e9e Rev tribble.
Just added more documentation for diffEngine and pointer to new wiki:

http://www.broadinstitute.org/gsa/wiki/index.php/DiffEngine
2011-07-17 13:05:04 -04:00
Mark DePristo 92a1c0c278 Moved the varianteval/tags/DataPoint.java and varianteval/tags/Analysis.java to varianteval/utils. This allows rsync to see these files with the -C option, as tags is some kind of reserved CVS keyword. 2011-07-17 10:14:23 -04:00
Mark DePristo eacf205f40 Tests needed to be updated to reflect the code reorg of tribble. 2011-07-16 09:22:34 -04:00
Menachem Fromer 72f4cf9c0e Walker to perform deterministic annotation of phasing by transmission (to be compatible with RBP's definition of consecutive pairwise phasing) 2011-07-15 17:44:31 -04:00
Mark DePristo c0bbeb23ba Now providing more information when the index on the fly isn't equal to the one created by reading the file from disk. 2011-07-14 15:12:28 -04:00
Mark DePristo 5ffeddd3b1 better to use _ instead of ., as this is a special case later. 2011-07-14 14:45:16 -04:00
Eric Banks 9540df6998 Oops, forgot to update unit test 2011-07-14 14:00:19 -04:00
Eric Banks ed6beae1f3 Adding headers to diffable reading for VCFs 2011-07-14 13:55:35 -04:00
Eric Banks 66c652d687 Added some extra error checks in the VCF codec. Now that we've moved this back into the GATK, changed some of the standard exceptions to be UserErrors (instead of TribbleExceptions). 2011-07-14 11:56:10 -04:00
Eric Banks 0c54c796ed Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-13 14:57:33 -04:00
Eric Banks bb0e3a26fc Added integration test for VCF writing. Also, bug fix for writing the GT-free records. 2011-07-13 14:57:21 -04:00
Eric Banks 6a431da554 Don't output source and ref header lines anymore. Short-term motivation for this is that I'd like this tool when run on a VCF to emit the exact same VCF. Long-term motivation is that these tags should be output by the VCF writer itself for all tools. 2011-07-13 14:40:01 -04:00
Menachem Fromer 74aa49e423 Merged bug fix from Stable into Unstable 2011-07-13 12:12:42 -04:00
Menachem Fromer fa3ff53508 Filters should only be applied to the new VC if the old VC had filters applied 2011-07-13 11:58:16 -04:00
Eric Banks 969227c657 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-13 10:01:28 -04:00
Eric Banks 797c50e689 Fixing integration tests I broke yesterday; removing batch merging test since we don't support that anymore. 2011-07-13 10:01:23 -04:00
Eric Banks 6007eea3ff Allowing VCF records without GTs in VCF4.1 2011-07-13 09:56:08 -04:00
Ryan Poplin 837fb8f689 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-12 15:39:26 -04:00
Ryan Poplin 5077c94d85 Adding MappingQualityUnavailableReadFilter to the SNP and indel CountCovariates 2011-07-12 15:39:07 -04:00
Mark DePristo 01fd6a6949 Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-12 15:20:44 -04:00
Mark DePristo ccedd6ff4c Difference is now the general form -- used to be SummarizedDifference. The old Difference class is now a subclass of Difference that includes pointers to the specific master and test DiffElements.
Added a size() function that calculates the number of elements in the tree rooted at a DiffElement.
2011-07-12 15:20:28 -04:00
Eric Banks a2597e7f00 This commit incorporates several different changes that each pretty much break all the VCF-based integration tests, so I bunched them all together. We now officially emit VCF4.1 files (woo hoo), which means that the VCF headers are now all different (header version is 4.1 plus counts for some of the annotations are 'A' or 'G'). Also, I've added a Read Filter for reads with MQ=255 ('unavailable' in the SAM spec) and have applied this to the UG and the RMS MQ annotation. 2011-07-12 14:11:53 -04:00
Ryan Poplin 329c3d8050 Merged bug fix from Stable into Unstable 2011-07-12 13:55:51 -04:00
Ryan Poplin 73735863b0 Fix for the case of requesting genotype for a sample that doesn't exist in a VariantContext 2011-07-12 13:55:21 -04:00
Guillermo del Angel c4c145afb9 Merged bug fix from Stable into Unstable 2011-07-12 13:44:48 -04:00
Guillermo del Angel cfe43e3971 Bug fix for Genotype given alleles: if we are in INDEL mode ignore SNPs and MNPs instead of emitting an empty site with alleles but no annotations 2011-07-12 13:43:46 -04:00
Mark DePristo 05212aea62 reader now takes an argument for the maximum number of elements to read from the file. 2011-07-12 08:53:19 -04:00
Mark DePristo 8056a3fe89 getElement() now uses O(1) get from hash instead of linear O(n) search. Enables us to read large files easily. 2011-07-12 08:52:31 -04:00
Mark DePristo f313e14e4e Now deletes the dump directory on ant clean
Moving diffengine tests from private to public
2011-07-12 08:50:58 -04:00
Eric Banks d7d15019dd Adding support for other simple header line types (e.g. ALT) and cleaning up the interface a bit. 2011-07-12 01:16:21 -04:00
Eric Banks 400b0d4422 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-11 23:38:57 -04:00
Mark DePristo d5056ad899 Merge branch 'master' into diffit 2011-07-11 23:16:15 -04:00
Mark DePristo 893cc2e103 Making the package public, so there are no dependencies from public -> private 2011-07-11 23:15:08 -04:00