Commit Graph

373 Commits (75985c2fa0e248ae7bca4479616a8c7cb55c4e09)

Author SHA1 Message Date
Mark DePristo 3a27a25cfc Validates that the tribble binding provides the right object types at startup
Tests to ensure this remains working
2011-08-02 20:11:24 -04:00
Guillermo del Angel df37716857 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-02 18:27:13 -04:00
Ryan Poplin b2cde87378 Removing --DBSNP syntax from BQSR integration tests 2011-08-02 15:34:38 -04:00
Mark DePristo e4a67f3df1 RefMetaDataTracker has complete set of get() functions for List<RodBinding<T>>
Including unit tests
2011-08-02 14:28:35 -04:00
Mark DePristo 03741fb640 Merge branch 'master' into rodRefactor
Conflicts:
	public/java/src/org/broadinstitute/sting/gatk/walkers/annotator/VariantAnnotatorEngine.java
	public/java/test/org/broadinstitute/sting/gatk/walkers/indels/IndelRealignerIntegrationTest.java
	public/java/test/org/broadinstitute/sting/gatk/walkers/indels/IndelRealignerPerformanceTest.java
	public/java/test/org/broadinstitute/sting/utils/variantcontext/VariantContextIntegrationTest.java
2011-08-02 14:21:58 -04:00
Mark DePristo a366f9a18d Updating tools to use the RodBinding<T> syntax 2011-08-02 14:05:51 -04:00
Ryan Poplin c0653514b3 minor update to comment in UG 2011-08-02 13:34:48 -04:00
Ryan Poplin 2ba57bb502 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-02 13:30:46 -04:00
Ryan Poplin 38e4ae4176 minor update to comment in UG 2011-08-02 13:30:38 -04:00
Guillermo del Angel 821bbfa9e0 Bug fixes and enhancements to run whole-genome indel VQSR, removed old chr20-only code and cleanup 2011-08-02 13:17:20 -04:00
Eric Banks 65c5d55b72 Not sure how I missed these. These lines are now superfluous. 2011-08-02 12:48:36 -04:00
Eric Banks b9d0d2af22 Adding back temporarily removed integration test now that the file permissions have been fixed. 2011-08-02 12:39:11 -04:00
Eric Banks 1c387848de No more use of -D in the integration tests but instead stick with VCFs only. Since all of these tests were duplicated (one each for dbSNP format and for VCF), we don't actually lose coverage in the integration tests. 2011-08-02 10:39:50 -04:00
Eric Banks 2c5e526eb7 Don't use the mismatch fraction by default in the RealignerTargetCreator (since it's only useful when using SW in the indel realigner). Also, no more use of -D but instead move over to using VCFs. One integration test is temporarily commented out while I wait for a VCF file to get fixed. 2011-08-02 10:34:46 -04:00
Eric Banks 5626199bb6 The Unified Genotyper now does NOT emit SLOD/SB by default; to compute SB use --computeSLOD 2011-08-02 10:14:21 -04:00
Mark DePristo 184030dd56 RefMetaDataTracker no longer automagically converts inputs to VariantContexts
This was no longer working properly given that DBSNP indels needed to be moved around.  The adaptor system is being refactored and you will need to convert files from X -> VCF for many tools to work.
2011-08-01 15:21:16 -04:00
Mark DePristo 8b1adb8c95 Removed getVariantContext() code 2011-08-01 13:41:09 -04:00
Mark DePristo f69bff5dd6 Commented out, because these fail the now removed dbSNP conversion. 2011-08-01 13:34:25 -04:00
Eric Banks 3a9b6eacdf Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-01 11:23:18 -04:00
Mark DePristo 7b07c4e04e RefMetaDataTracker now has get() methods accepting RodBindings
RodBinding no longer duplicates the get() methods in RMDT.  This is just an object now that connects the command line system to the RMDT.
Updated programs to use new style
Added UnitTests for the RodBinding accessors.
2011-07-30 15:34:11 -04:00
Mark DePristo a6691ab2fd List<RodBinding<T>> now working (sort of).
At least the argument parsing system tolerates it.
2011-07-29 16:11:22 -04:00
Mark DePristo 6acb4aad3b RodBinding<T> are properly generic now.
VariantContextRodBinding removed, as RodBinding<VariantContext> is the right style now.
2011-07-29 14:37:12 -04:00
Mark DePristo 3b799db61a RefMetaDataTracker cleanup and unit tests
You now have to provide an explicit list of RODRecordLists upfront to the constructor.  RefMetaDataTracker is now immutable.  Changes in engine to incorporate these differences
Extensive UnitTests for RefMetaDataTracker now.
2011-07-29 13:23:17 -04:00
Ryan Poplin b06deac9ea Merged bug fix from Stable into Unstable 2011-07-29 10:02:36 -04:00
Ryan Poplin c0d4110ffd Correcting redundant warning text. 2011-07-29 10:01:11 -04:00
Mark DePristo 39b4e76fde Continuing refactoring of RefMetaDataTracker.
On the path towards converging getVariantContext() and getValues() in tracker so that we can have a single approach to get values from RODs with the new RodBinding() types
2011-07-28 17:48:28 -04:00
Mark DePristo 7c5c656b46 Uncovered fundamental accounting bug in VariantEval. Will be fixed by dev. team
Problem is that Novelty sees multiple records at a site (SNP, INDEL) to calculate whether a site is novel, but VariantEvalWalker makes an arbitrary decision which to use for analysis and CompOverlap may not see a comp record of the same type as eval.  So you get lines where the stratification is known but there are 10 novel sites!
2011-07-28 14:19:27 -04:00
Eric Banks 33b32c4211 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-28 13:57:22 -04:00
Eric Banks 7a2a65155f Merged bug fix from Stable into Unstable 2011-07-28 13:56:43 -04:00
Eric Banks 1afc49a297 There are some really 'interesting' (but apparently valid) records in the Mus musculus dbSNP file. Generalized the handling of complex cases in the dbSNP adaptor to handle it all. I just grabbed the actual Mus musculus dbSNP file as a test, ran it whole genome, and confirmed that we finally produce a valid VCF on it. Should be the last commit needed on this adaptor. 2011-07-28 13:55:58 -04:00
Mark DePristo f7a126722b Cleaned up VariantContext accessors in RefMetaDataTracker
It's no longer possible to provide allowed types, as this was a very rarely used feature in the engine.  These get methods have been removed and local uses replaced with tests directly in their code.  This simplified the RefMetaDataTracker significantly
VariantContextRodBinding now forwards along all of the RefMetaDataTracker methods, so it is possible to create a full equivalent VariantContextRodBinding now as a walker field variable.
All walkers updated to the new RefMetaDataTracker function call style
2011-07-28 00:16:34 -04:00
Mark DePristo c83f9432eb Cleaned up RefMetaDataTracker
Renamed many functions to more clearly state what they are actually doing
Removed unnecessary / unused functionality, reducing interface complexity
Updated all uses of this code in GATK
Added generic, type-safe accessors to RefMetaDataTracker such as public <T> List<T> getValues(final String name, Class<T> clazz)
Added standard refMetaDataTracker accessors to RodBinding, so you can do everything you can for generic rods with the tracker directly with the RodBinding
2011-07-27 23:25:52 -04:00
Eric Banks 1865211b6d Merged bug fix from Stable into Unstable 2011-07-27 22:52:06 -04:00
Eric Banks 6230315ff2 Along with my half-written commit message from earlier, I also forgot to commit the integration test updates. This is what happens when you try to do things 30 seconds before you leave for the day. To finish up from before: complex events weren't being padded with the reference base as per the VCF spec. They are now. 2011-07-27 22:51:21 -04:00
Mark DePristo f3ad4ec94b Removed annoying FastaSequenceIndexBuilderProgressListener infrastructure that was just a boolean switch on whether to print progress or not. 2011-07-27 22:06:23 -04:00
Eric Banks ff31fa7990 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-27 16:15:23 -04:00
Eric Banks 5809a61b20 Merged bug fix from Stable into Unstable 2011-07-27 16:14:59 -04:00
Eric Banks 64aad67b5f Fixing dbSNP adaptor for complex indels (wasn) 2011-07-27 16:13:45 -04:00
Mark DePristo 15be383d5b Merge branch 'master' into rodRefactor 2011-07-27 15:36:49 -04:00
Mark DePristo 38a2518668 Merge branch 'master' into rodRefactor 2011-07-27 15:34:54 -04:00
Mark DePristo 60db6cc836 Warnings for old ROD system use.
Removed unused class GATKRODFeature
2011-07-27 12:39:12 -04:00
Mark DePristo 097828a466 ParsingEngine now maintains the list of rodBindings
No longer try to reparse objects to find the right fields
Direct support in RodBinding for getTags()
2011-07-27 11:36:53 -04:00
Mauricio Carneiro 20a3b31b61 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-26 19:29:45 -04:00
Mauricio Carneiro 321afac4e8 Updates to the help layout.
* New style.css, new template for the walker auto-generated html. Short description is no longer repeated in the long description of the walker.
* Updated DiffObjectsWalker and ContigStatsWalker as "reference" documented walkers.
2011-07-26 19:29:25 -04:00
Kiran V Garimella 405e521d44 Merge branch 'master' of ssh://copper.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-26 17:56:48 -04:00
Kiran V Garimella 92a11ed8dc Updated MD5 for PhaseByTransmissionIntegrationTest 2011-07-26 17:52:25 -04:00
Kiran V Garimella 412c466de6 Bug fix, wherein triple-hets after genotype refinement need to be left unphased, not just prior to refinement 2011-07-26 17:43:43 -04:00
Mark DePristo 81f8e05bfa Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-26 17:35:46 -04:00
Mark DePristo f6a5e0e36a Go for global integrationtest path first, if possible. 2011-07-26 17:35:30 -04:00
Matt Hanna fec495e292 Fix a nasty little bug in the sharding system: if the last shard in contig n
overlaps exactly on disk with the first shard in contig n+1, the shards
would be merged together to avoid duplicate extraction.  Unfortunately,
the interval overlap filter couldn't handle shards spanning contigs, and
was choosing to filter out reads from contig n+1 which should have been
included.
I'm not completely sure why the BAM indexing code would ever specify that the
end of one chromosome had the same on-disk location as the start of the next
one.  I suspect that this is an indexer performance bug.
2011-07-26 15:43:20 -04:00
Mark DePristo 9dfb57168a RodBinding source is no longer assumed to be a file 2011-07-26 13:59:44 -04:00
Mark DePristo d0badd5bd6 RodBinding subclassed to VariantContextRodBinding for easy access to VariantContext providing RODs 2011-07-26 13:54:55 -04:00
Mark DePristo 7ab8b53339 Support for List<RodBinding> argument type 2011-07-26 11:37:31 -04:00
Mark DePristo 38969b9783 Prototype of RODBinding @Arguments instead of -B syntax
Initial version of RodBinding class.
Flow from walker Rodbinding @Arguments -> RMDTriplet (old system) -> GATK engine (standard).  Will need refactoring.
2011-07-26 11:09:06 -04:00
Matt Hanna 088fc39308 Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-25 15:54:56 -04:00
Eric Banks a53aeb75ab Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-25 15:10:35 -04:00
Eric Banks a29554e565 Removing the Genomic Annotator and its supporting classes 2011-07-25 15:10:25 -04:00
Mark DePristo 3afcb3415d Max of 1000 records will be loaded and compared to avoid heap size problem. 2011-07-25 14:58:31 -04:00
Mark DePristo 2a51543693 Actually should have been gone... 2011-07-25 13:27:42 -04:00
Mark DePristo ebfd8df06c Restoring accidentally deleted unit test 2011-07-25 13:25:30 -04:00
Mark DePristo f3049fba63 refdata directory cleanup
Removing unused files RODRecordIterator, ReferenceOrderedData, QueryableTrack, RMDTrackCreationException, GATKFeatureIterator, ReferenceOrderedDataUnitTest
Refactored dbSNP and refseq utilities to be closer to the other files implementing these features
2011-07-25 13:21:52 -04:00
Matt Hanna 8014fad6ff Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-25 13:20:44 -04:00
Matt Hanna 2ac490dbdf Fix improper detection of command-line arguments with missing values. 2011-07-25 13:20:00 -04:00
Mark DePristo 90947ab359 Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-25 12:53:56 -04:00
Mark DePristo 44bd9ae703 Restoring UninstantiableWalker, as it is not going to be possible to run ant test; ant gatkdocs without ant clean in between 2011-07-25 12:53:06 -04:00
Mark DePristo acda8eb09c Commented out test that causes new CommandLineGATK() to fail 2011-07-25 12:43:27 -04:00
Kiran V Garimella 357f503a21 Merge branch 'desktop' 2011-07-25 11:36:27 -04:00
Kiran V Garimella 0b43ee117c Added the required=false tag to the -noST and -noEV arguments so the auto-help output doesn't look weird (i.e. listing arguments as required when their value has already been specified by default). 2011-07-25 11:35:34 -04:00
Kiran V Garimella bbb8473f03 Merge branch 'master' of ssh://copper.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-25 10:59:00 -04:00
Mark DePristo 1a268ff1fd Refactor so that GenotypeAnnotation and InfoFieldAnnotation share common superclass VariantAnnotatorAnnotation 2011-07-25 10:55:09 -04:00
Mark DePristo 7f8e6a97ee InfoFieldAnnotation now an abstract class extended by annotations so doc system works 2011-07-25 10:47:11 -04:00
Mauricio Carneiro 4c6c16f895 Documented following the new gatkdoc framework 2011-07-25 00:25:08 -04:00
Mark DePristo 2039ce6102 Default values now displayed in arguments
DiffEngine fixed so that newInstance() would work.  Pretty quickly encountered a situation where newInstance() failed.  Debug output now written when this occurs in the log.
Logger now used instead of standard out, with INFO the default level.
2011-07-24 22:56:55 -04:00
Mark DePristo c43b5981f2 Hidden variables are hidden by default. Settable by command line option
DiffObjectsWalker test arguments removed.
Minor refactoring of GATKDoclet
2011-07-24 20:52:44 -04:00
Mark DePristo 1c1f1da349 Fixing compilation 2011-07-24 20:01:59 -04:00
Mark DePristo 9f06f6c493 Split GATKDoclet from ResourceBundleDoclet. Refactored GATKDocWorkUnit 2011-07-24 20:00:04 -04:00
Mark DePristo ff85687679 Merge branch 'master' into help 2011-07-24 18:14:32 -04:00
Mark DePristo 83996f7951 Enumerated types are working. 2011-07-24 18:14:21 -04:00
Mark DePristo 3c34e9fa65 Cleanup enums and tables 2011-07-24 17:45:58 -04:00
Mark DePristo c620d96c96 Inline enum documentation is working 2011-07-24 17:22:14 -04:00
Mark DePristo 793e7d3d1d Improved header and argument details
Argument detail structure cleaned up. Only relevant pieces of information are shown now, and in a cleaner layout.
Misc. cleanup in the code.
2011-07-24 16:36:25 -04:00
Mark DePristo c6af4efcdc Implemented see also and version header 2011-07-24 16:10:17 -04:00
Mark DePristo 5e0fe2d0f9 Support for style.css via refactored common.html included in all files 2011-07-24 15:42:39 -04:00
Mark DePristo d0ab6bf7a9 Now links to sub and superclass documentation, where possible. 2011-07-24 09:56:17 -04:00
Mark DePristo e2dabb70b8 Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-24 08:57:47 -04:00
Mauricio Carneiro 1ef964c92c Merge branch 'contig' 2011-07-24 02:40:42 -04:00
Mauricio Carneiro 7ffedf211c Contig comparator -- sorting contigs like Picard
This is very useful if you want to output your text files or manipulate data in the usual chromosome ordering:
 1
 2
 3
 ...
 21
 22
 X
 Y
 GL???
 ...

 Just use this comparator in any SortedSet class constructor and your data will be sorted like in the BAM file.
2011-07-24 02:33:19 -04:00
Mark DePristo 6b501e267b Includes non-concrete classes in docs
CommandLineGATK has extraDocs to ReadFilter and UserException as well
2011-07-23 22:15:01 -04:00
Mark DePristo 7420ed098e Semi-working version of extraDocs tag in annotation to refer to one capability being accessible in another
Required a significant refactoring of the GATKDoclet, which now has a unified place where the ClassDoc, class, annotation, and handler are all stored together.
2011-07-23 22:07:30 -04:00
Mark DePristo 999acacfa1 Merge branch 'master' into help 2011-07-23 20:19:33 -04:00
Mark DePristo 1d3bcce2c4 Merge branch 'master' into NoDistributedGATK 2011-07-23 20:04:50 -04:00
Mark DePristo e262f4e10b gatkdoc now generalized to use @Annotation. Multiple subsystems now use annotation to receive docs
Index expanded to use summary() annotation field
UserExceptions, ReadFilters, GATK engine all use the system to generate docs
Doclet expanded to handle lots of new cases
2011-07-23 20:00:35 -04:00
Kiran V Garimella 0b36b6540f Merge branch 'laptop' 2011-07-23 01:44:54 -04:00
Kiran V Garimella e23cb27451 Modified MD5 to account for the triple hets that shouldn't be phased 2011-07-23 01:44:44 -04:00
Kiran V Garimella 1dba8b768c Merge branch 'laptop' 2011-07-23 01:39:15 -04:00
Kiran V Garimella 57e3d136eb Don't try to phase triple-hets either. 2011-07-23 01:38:58 -04:00
Kiran V Garimella f366124778 Merge branch 'laptop' 2011-07-23 01:25:36 -04:00
Kiran V Garimella 45f2ca8d99 Changed MD5 to reflect latest changes to PhaseByTransmission. 2011-07-23 01:21:07 -04:00
Kiran V Garimella 5af9d50183 Merge branch 'laptop' 2011-07-23 01:12:06 -04:00
Kiran V Garimella 5521919cc9 Fixed bug where variants to phase were not being selected properly. 2011-07-23 01:11:28 -04:00
Kiran V Garimella 7da99388ac Merge branch 'laptop' 2011-07-23 01:01:11 -04:00
Kiran V Garimella 58eed20b83 Copy all entries from the attributes map, rather than attempting to modify an unmodifiable map. 2011-07-23 01:00:46 -04:00
Kiran V Garimella b5deff48e6 Merge branch 'laptop' 2011-07-23 00:56:50 -04:00
Kiran V Garimella 5638017137 Removed the nofilters argument specification in the integrationtest 2011-07-23 00:56:23 -04:00
Kiran V Garimella ffa361f57f Merge branch 'laptop' 2011-07-23 00:50:38 -04:00
Kiran V Garimella 9417ba8c2c Modified to accept multi-sample VCFs, removed the application of filters, and changed transmission probability field to be a genotype field rather than an INFO field. 2011-07-23 00:48:26 -04:00
Mark DePristo 28b9432d26 Docs for read filters, the engine, and the UserExceptions. 2011-07-22 16:09:21 -04:00
Kiran V Garimella 051c1dc639 Merge branch 'master' of ssh://copper.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-22 15:59:00 -04:00
Mark DePristo f0be7348be Generalized handler to allow it to be used with any arbitrary class structure.
DocumentedGATKFeature now includes a field for the group name.
Build.xml works with public / private now.
2011-07-22 14:07:40 -04:00
Matt Hanna f50145b872 Reinitialize random seed in the bwa bindings from the fixed seed stored in the
BWA support files every time the support files are loaded.
2011-07-22 13:41:53 -04:00
Mark DePristo 453954182e Generalized the documentation system to use a class-specific annotation and processor.
Need to generalize and bug fix the system.  But at a high level it's working now.
2011-07-22 13:18:33 -04:00
Kiran V Garimella b8a0fd2a8d Multiply fractionRandom by 100.0 so that the line that indicates the percentage of variants that will be output says (for instance) 90%, not 0.9% 2011-07-22 11:54:59 -04:00
Mark DePristo 9e88d51db9 Removed now unused @version tags from walker docs. 2011-07-22 09:57:03 -04:00
Mark DePristo 421b70ca4f Removed previous, and largely unused, help system extensions.
This involved deleting the utils/help/*Taglet.java classes, which parsed out these fields unnecessarily
This also involved removing the few uses of these from the codebase.  For these uses, though, almost all were an identical copy of the first line of the docs, which is the default javadoc behavior anyway.
2011-07-22 09:42:44 -04:00
Mark DePristo 172b35372b Moved all of the distributed GATK code to archive. 2011-07-22 09:20:32 -04:00
Khalid Shakir 8b8f121cfb Merge branch 'master' of ssh://gsa3.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-21 23:01:11 -04:00
Khalid Shakir 59eb1f4663 Memory limits changed from Int to Double.
Updated LSF calls to read memory units from config along with tweaks to select hosts.
Moved some common code from GridEngine and LSF to super classes.
2011-07-21 22:57:18 -04:00
Mark DePristo 81d0cab27e Walker index html now emitted. 2011-07-21 16:01:54 -04:00
Mark DePristo e892489696 V2 of the document system.
Now uses GATKDoc class to organize documentation for arguments.
Arguments now listed by feature (required, optional, hidden, etc) and link to detailed information about the argument in the html
Lots of code moving between Class and ClassDoc objects.  Should be refactored into a single static utility class.
2011-07-21 15:20:34 -04:00
Christopher Hartl 2f5d10d16b Fix bug wherein aligner could be closed prior to its being used to lowercase sequences. 2011-07-21 13:21:48 -04:00
Matt Hanna 7054c5342f When using the BWA bindings, you have to explicitly call close() to get the
bindings to release memory.
It may or may not be possible to have the GC trigger an implicit close; I'll add a JIRA.
2011-07-21 12:13:29 -04:00
Christopher Hartl 15610ce0c3 Per Matt's request, disabling BWA-based integration tests so he can assess bamboo memory usage. 2011-07-21 11:04:22 -04:00
Mark DePristo 6fa17d86ae Completely hacked together version of a FreeMarker + javadoc + custom doclet walker documentation generator 2011-07-21 00:18:07 -04:00
Mark DePristo 45c73ff0e5 Runs and emits an HTML document 2011-07-20 17:16:33 -04:00
Mark DePristo d31b176e15 Removed GATK use of distributed parallelism framework.
Moved distributed GATK prototype code into distributedutils, separating from threading package
2011-07-20 16:26:09 -04:00
Guillermo del Angel 0a1d2df8cb Merged bug fix from Stable into Unstable 2011-07-20 13:19:35 -04:00
Guillermo del Angel f15023b7d2 Bad bug fix: output GLs in multiallelic records were in incorrect order (misread spec) 2011-07-20 12:10:48 -04:00
Guillermo del Angel b9c9e0e952 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-20 10:45:16 -04:00
Guillermo del Angel 7140280bf6 Further bug fixes/cleanups for PrintReadsWalker 2011-07-20 10:44:37 -04:00
Guillermo del Angel a2d90a3590 Bug fix: reverted logic so that default behavior skips over sample lookup 2011-07-20 10:23:10 -04:00
Guillermo del Angel e8409c80fa Further protection vs null pointers in PrintReadsWalker 2011-07-19 21:59:24 -04:00
Christopher Hartl 5d706c9e92 Merge branch 'master' of ssh://chartl@tin.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
Removing PSP and CSM

Conflicts:

	public/java/src/org/broadinstitute/sting/gatk/walkers/sequenom/CreateSequenomMask.java
	public/java/src/org/broadinstitute/sting/gatk/walkers/sequenom/PickSequenomProbes.java
2011-07-19 20:25:33 -04:00
Guillermo del Angel fb2d475c22 Bug fix to prevent null pointer 2011-07-19 20:13:56 -04:00
Christopher Hartl 92c7cfa1c8 BWA bindings and tests moved to public (was required for ValidationAmplicons)
Integration tests for ValidationAmplicons. New argument to disable BWA, lowercase letters only for repetitiveness instead.
2011-07-19 20:11:31 -04:00
David Roazen baae381acb Revert "Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable"
This reverts commit 039a6bb01f345322ce2be50ae3634308bb24e77e, reversing
changes made to b9c9973d1c638dfc9f8c19b5eb845e99844f9d29.
2011-07-19 18:38:53 -04:00
Christopher Hartl 07e716d23a PickSequenomProbes2 expanded functionality: lowercasing based on sequence uniqueness, preserving reference base prior to indel (not a part of the VC as I thought it was), masking deletion bases with 'N's, flanking insertion with 'N's, output is a fasta formatted file. Renamed to ValidationAmplicons since this is really not for picking sequenom probes, but for generating amplicon sequence from which other applications (like sequenom) can choose PCR primers. Moved from private to public. 2011-07-19 15:21:47 -04:00
Guillermo del Angel 6181d1e4cb Fixed integration test for VariantsToTable: now the * in REF column is not output 2011-07-19 14:42:11 -04:00
Guillermo del Angel e6d306458c Merge bug fixes 2011-07-19 14:36:20 -04:00
Guillermo del Angel 989dd17f95 a) Add ability in PrintReads to specify a sample file to easily subset samples, useful for IGV visualization, b) VariantsToTable is more R-friendly with Indels when printing ref/alt columns, c) Changes to SelectVariants ability to specify a mask to randomly sample from a given AF distribution 2011-07-19 14:29:07 -04:00
Mark DePristo 8f0badc52b Updating md5s, as the diffobjects walker now emits the summary in reverse order. 2011-07-18 15:44:21 -04:00
Mark DePristo c05451047c Support for multiple records at the same site. The first record gets chr:start, and subsequent records get chr:start_2, chr:start_3, etc. 2011-07-18 15:43:52 -04:00
Mark DePristo 782a05e9b5 Support for sorting the diff output in reverse order. 2011-07-18 15:43:01 -04:00
Mark DePristo 45702d3084 Now supports a mode where the primary key isn't sorted. In this case the records are displayed in the order in which they are added to the table. 2011-07-18 15:40:15 -04:00
Eric Banks 83ba2c066a Making it deterministic 2011-07-18 13:59:02 -04:00
Eric Banks 92fa410450 Check that it's a valid bam file before parsing or bad things can happen 2011-07-18 13:43:34 -04:00
Eric Banks 80b5c5261a CombineVariants no longer combines records of different types. So now when combining SNP and indel callsets, overlapping calls get their own records. Useful for Khalid in the pipeline. For those interested, it turns out the previous behavior was doing the wrong thing occasionally (and this was even captured in the integration tests). 2011-07-18 13:42:45 -04:00
Eric Banks bc8b5da698 Added docs while I was reading through the code to understand it 2011-07-18 12:25:54 -04:00
Mark DePristo 51b0dd01c3 Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-18 10:47:29 -04:00
Mark DePristo d6e2e89f99 Walker test system refactoring. All MD5DB related functions are now in MD5DB.java.
System has the concept of a local and a global MD5 db.  The local one operates as it did previously.  The global one lives in /humgen/gsa-hpprojects/GATK/data/integrationtests.  If the system can find this directory then MD5s will also be read / written to this location.  This means that gsabamboo will print differences as appropriate.  And all users will in effect have access to a complete history of MD5 file results.
A few minor code reshuffles changed VariantRecalibration and VCFHeader test files.
2011-07-18 10:46:01 -04:00
Mark DePristo 6f26c07b85 Removed the SpecificDifference class. Now Difference classes always have the option to remember specific master and test values. This means that all summarized differences carry with them specific examples of their differences. Consequently, now even summarized differences give at least one example of the specific difference, even when the count of the difference is > 1. Unit tests updated. Added DiffObjects integrationtest. VCFDiffableReader now specifically reads the first line of the VCF file to capture the version number. 2011-07-18 10:42:35 -04:00