Ryan Poplin
c0653514b3
minor update to comment in UG
2011-08-02 13:34:48 -04:00
Ryan Poplin
2ba57bb502
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-02 13:30:46 -04:00
Ryan Poplin
38e4ae4176
minor update to comment in UG
2011-08-02 13:30:38 -04:00
Guillermo del Angel
821bbfa9e0
Bug fixes and enhancements to run whole-genome indel VQSR, removed old chr20-only code and cleanup
2011-08-02 13:17:20 -04:00
Eric Banks
65c5d55b72
Not sure how I missed these. These lines are now superfluous.
2011-08-02 12:48:36 -04:00
Eric Banks
b9d0d2af22
Adding back temporarily removed integration test now that the file permissions have been fixed.
2011-08-02 12:39:11 -04:00
Eric Banks
1c387848de
No more use of -D in the integration tests but instead stick with VCFs only. Since all of these tests were duplicated (one each for dbSNP format and for VCF), we don't actually lose coverage in the integration tests.
2011-08-02 10:39:50 -04:00
Eric Banks
2c5e526eb7
Don't use the mismatch fraction by default in the RealignerTargetCreator (since it's only useful when using SW in the indel realigner). Also, no more use of -D but instead move over to using VCFs. One integration test is temporarily commented out while I wait for a VCF file to get fixed.
2011-08-02 10:34:46 -04:00
Eric Banks
5626199bb6
The Unified Genotyper now does NOT emit SLOD/SB by default; to compute SB use --computeSLOD
2011-08-02 10:14:21 -04:00
Mark DePristo
184030dd56
RefMetaDataTracker no longer automagically converts inputs to VariantContexts
...
This was no longer working properly given that DBSNP indels needed to be moved around. The adaptor system is being refactored and you will need to convert files from X -> VCF for many tools to work.
2011-08-01 15:21:16 -04:00
Mark DePristo
8b1adb8c95
Removed getVariantContext() code
2011-08-01 13:41:09 -04:00
Mark DePristo
f69bff5dd6
Commented out, because these fail the now removed dbSNP conversion.
2011-08-01 13:34:25 -04:00
Eric Banks
3a9b6eacdf
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-01 11:23:18 -04:00
Mark DePristo
7b07c4e04e
RefMetaDataTracker now has get() methods accepting RodBindings
...
RodBinding no longer duplicates the get() methods in RMDT. This is just an object now that connects the command line system to the RMDT.
Updated programs to use new style
Added UnitTests for the RodBinding accessors.
2011-07-30 15:34:11 -04:00
Mark DePristo
a6691ab2fd
List<RodBinding<T>> now working (sort of).
...
At least the argument parsing system tolerates it.
2011-07-29 16:11:22 -04:00
Mark DePristo
6acb4aad3b
RodBinding<T> are properly generic now.
...
VariantContextRodBinding removed, as RodBinding<VariantContext> is the right style now.
2011-07-29 14:37:12 -04:00
Mark DePristo
3b799db61a
RefMetaDataTracker cleanup and unit tests
...
You now have to provide an explicit list of RODRecordLists upfront to the constructor. RefMetaDataTracker is now immutable. Changes in engine to incorporate these differences
Extensive UnitTests for RefMetaDataTracker now.
2011-07-29 13:23:17 -04:00
Ryan Poplin
b06deac9ea
Merged bug fix from Stable into Unstable
2011-07-29 10:02:36 -04:00
Ryan Poplin
c0d4110ffd
Correcting redundant warning text.
2011-07-29 10:01:11 -04:00
Mark DePristo
39b4e76fde
Continuing refactoring of RefMetaDataTracker.
...
On the path towards converging getVariantContext() and getValues() in tracker so that we can have a single approach to get values from RODs with the new RodBinding() types
2011-07-28 17:48:28 -04:00
Mark DePristo
7c5c656b46
Uncovered fundamental accounting bug in VariantEval. Will be fixed by dev. team
...
Problem is that Novelty sees multiple records at a site (SNP, INDEL) to calculate whether a site is novel, but VariantEvalWalker makes an arbitrary decision which to use for analysis and CompOverlap may not see a comp record of the same type as eval. So you get lines where the stratification is known but there are 10 novel sites!
2011-07-28 14:19:27 -04:00
Eric Banks
33b32c4211
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-28 13:57:22 -04:00
Eric Banks
7a2a65155f
Merged bug fix from Stable into Unstable
2011-07-28 13:56:43 -04:00
Eric Banks
1afc49a297
There are some really 'interesting' (but apparently valid) records in the Mus musculus dbSNP file. Generalized the handling of complex cases in the dbSNP adaptor to handle it all. I just grabbed the actual Mus musculus dbSNP file as a test, ran it whole genome, and confirmed that we finally produce a valid VCF on it. Should be the last commit needed on this adaptor.
2011-07-28 13:55:58 -04:00
Mark DePristo
f7a126722b
Cleaned up VariantContext accessors in RefMetaDataTracker
...
It's no longer possible to provide allowed types, as this was a very rarely used feature in the engine. These get methods have been removed and local uses replaced with tests directly in their code. This simplified the RefMetaDataTracker significantly
VariantContextRodBinding now forwards along all of the RefMetaDataTracker methods, so it is possible to create a full equivalent VariantContextRodBinding now as a walker field variable.
All walkers updated to the new RefMetaDataTracker function call style
2011-07-28 00:16:34 -04:00
Mark DePristo
c83f9432eb
Cleaned up RefMetaDataTracker
...
Renamed many functions to more clearly state what they are actually doing
Removed unnecessary / unused functionality, reducing interface complexity
Updated all uses of this code in GATK
Added generic, type-safe accessors to RefMetaDataTracker such as public <T> List<T> getValues(final String name, Class<T> clazz)
Added standard RefMetaDataTracker accessors to RodBinding, so you can do everything you can for generic rods with the tracker directly with the RodBinding
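As an illustration of the generic, type-safe accessor style named above, here is a minimal sketch. Only the signature `public <T> List<T> getValues(final String name, Class<T> clazz)` comes from the commit message; the class name `TrackerSketch`, the map-backed storage, and the choice to silently skip values of another type are assumptions, and the real RefMetaDataTracker may behave differently.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a tracker; not the GATK class.
public class TrackerSketch {
    // Assumption: values are stored per binding name as untyped objects.
    private final Map<String, List<Object>> bindings;

    public TrackerSketch(final Map<String, List<Object>> bindings) {
        // Copy the input so later changes to the caller's map do not leak in.
        final Map<String, List<Object>> copy = new HashMap<>();
        for (final Map.Entry<String, List<Object>> e : bindings.entrySet()) {
            copy.put(e.getKey(), new ArrayList<>(e.getValue()));
        }
        this.bindings = Collections.unmodifiableMap(copy);
    }

    // Returns the values bound to 'name' that are instances of 'clazz'.
    // Assumption: values of other types are skipped rather than rejected.
    public <T> List<T> getValues(final String name, final Class<T> clazz) {
        final List<T> out = new ArrayList<>();
        final List<Object> values = bindings.getOrDefault(name, Collections.<Object>emptyList());
        for (final Object o : values) {
            if (clazz.isInstance(o)) {
                out.add(clazz.cast(o));
            }
        }
        return out;
    }
}
```

The caller names the expected type once, at the call site, and gets back a `List<T>` with no unchecked casts in its own code.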
2011-07-27 23:25:52 -04:00
Eric Banks
1865211b6d
Merged bug fix from Stable into Unstable
2011-07-27 22:52:06 -04:00
Eric Banks
6230315ff2
Along with my half-written commit message from earlier, I also forgot to commit the integration test updates. This is what happens when you try to do things 30 seconds before you leave for the day. To finish up from before: complex events weren't being padded with the reference base as per the VCF spec. They are now.
2011-07-27 22:51:21 -04:00
Mark DePristo
f3ad4ec94b
Removed annoying FastaSequenceIndexBuilderProgressListener infrastructure that was just a boolean switch on whether to print progress or not.
2011-07-27 22:06:23 -04:00
Eric Banks
ff31fa7990
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-27 16:15:23 -04:00
Eric Banks
5809a61b20
Merged bug fix from Stable into Unstable
2011-07-27 16:14:59 -04:00
Eric Banks
64aad67b5f
Fixing dbSNP adaptor for complex indels (wasn)
2011-07-27 16:13:45 -04:00
Mark DePristo
15be383d5b
Merge branch 'master' into rodRefactor
2011-07-27 15:36:49 -04:00
Mark DePristo
38a2518668
Merge branch 'master' into rodRefactor
2011-07-27 15:34:54 -04:00
Mark DePristo
60db6cc836
Warnings for old ROD system use.
...
Removed unused class GATKRODFeature
2011-07-27 12:39:12 -04:00
Mark DePristo
097828a466
ParsingEngine now maintains the list of rodBindings
...
No longer try to reparse objects to find the right fields
Direct support in RodBinding for getTags()
2011-07-27 11:36:53 -04:00
Mauricio Carneiro
20a3b31b61
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-26 19:29:45 -04:00
Mauricio Carneiro
321afac4e8
Updates to the help layout.
...
*New style.css, new template for the walker auto-generated html. Short description is no longer repeated in the long description of the walker.
*Updated DiffObjectsWalker and ContigStatsWalker as "reference" documented walkers.
2011-07-26 19:29:25 -04:00
Kiran V Garimella
405e521d44
Merge branch 'master' of ssh://copper.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-26 17:56:48 -04:00
Kiran V Garimella
92a11ed8dc
Updated MD5 for PhaseByTransmissionIntegrationTest
2011-07-26 17:52:25 -04:00
Kiran V Garimella
412c466de6
Bug fix, wherein triple-hets after genotype refinement need to be left unphased, not just prior to refinement
2011-07-26 17:43:43 -04:00
Mark DePristo
81f8e05bfa
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-26 17:35:46 -04:00
Mark DePristo
f6a5e0e36a
Go for global integrationtest path first, if possible.
2011-07-26 17:35:30 -04:00
Matt Hanna
fec495e292
Fix a nasty little bug in the sharding system: if the last shard in contig n
...
overlaps exactly on disk with the first shard in contig n+1, the shards
would be merged together to avoid duplicate extraction. Unfortunately,
the interval overlap filter couldn't handle shards spanning contigs, and
was choosing to filter out reads from contig n+1 which should have been
included.
I'm not completely sure why the BAM indexing code would ever specify that the
end of one chromosome had the same on-disk location as the start of the next
one. I suspect that this is an indexer performance bug.
2011-07-26 15:43:20 -04:00
Mark DePristo
9dfb57168a
RodBinding source is no longer assumed to be a file
2011-07-26 13:59:44 -04:00
Mark DePristo
d0badd5bd6
RodBinding subclassed to VariantContextRodBinding for easy access to VariantContext providing RODs
2011-07-26 13:54:55 -04:00
Mark DePristo
7ab8b53339
Support for List<RodBinding> argument type
2011-07-26 11:37:31 -04:00
Mark DePristo
38969b9783
Prototype of RODBinding @Arguments instead of -B syntax
...
Initial version of RodBinding class.
Flow from walker RodBinding @Arguments -> RMDTriplet (old system) -> GATK engine (standard). Will need refactoring.
2011-07-26 11:09:06 -04:00
Matt Hanna
088fc39308
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-25 15:54:56 -04:00
Eric Banks
a53aeb75ab
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-25 15:10:35 -04:00
Eric Banks
a29554e565
Removing the Genomic Annotator and its supporting classes
2011-07-25 15:10:25 -04:00
Mark DePristo
3afcb3415d
Max of 1000 records will be loaded and compared to avoid heap size problem.
2011-07-25 14:58:31 -04:00
Mark DePristo
2a51543693
Actually should have been gone...
2011-07-25 13:27:42 -04:00
Mark DePristo
ebfd8df06c
Restoring accidentally deleted unit test
2011-07-25 13:25:30 -04:00
Mark DePristo
f3049fba63
refdata directory cleanup
...
Removing unused files RODRecordIterator, ReferenceOrderedData, QueryableTrack, RMDTrackCreationException, GATKFeatureIterator, ReferenceOrderedDataUnitTest
Refactored dbSNP and refseq utilities to be closer to the other files implementing these features
2011-07-25 13:21:52 -04:00
Matt Hanna
8014fad6ff
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-25 13:20:44 -04:00
Matt Hanna
2ac490dbdf
Fix improper detection of command-line arguments with missing values.
2011-07-25 13:20:00 -04:00
Mark DePristo
90947ab359
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-25 12:53:56 -04:00
Mark DePristo
44bd9ae703
Restoring UninstantiableWalker, as it is not going to be possible to run ant test; ant gatkdocs without ant clean in between
2011-07-25 12:53:06 -04:00
Mark DePristo
acda8eb09c
Commented out test that causes new CommandLineGATK() to fail
2011-07-25 12:43:27 -04:00
Kiran V Garimella
357f503a21
Merge branch 'desktop'
2011-07-25 11:36:27 -04:00
Kiran V Garimella
0b43ee117c
Added the required=false tag to the -noST and -noEV arguments so the auto-help output doesn't look weird (i.e. listing arguments as required when their value has already been specified by default).
2011-07-25 11:35:34 -04:00
Kiran V Garimella
bbb8473f03
Merge branch 'master' of ssh://copper.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-25 10:59:00 -04:00
Mark DePristo
1a268ff1fd
Refactor so that GenotypeAnnotation and InfoFieldAnnotation share common superclass VariantAnnotatorAnnotation
2011-07-25 10:55:09 -04:00
Mark DePristo
7f8e6a97ee
InfoFieldAnnotation now an abstract class extended by annotations so doc system works
2011-07-25 10:47:11 -04:00
Mauricio Carneiro
4c6c16f895
Documented following the new gatkdoc framework
2011-07-25 00:25:08 -04:00
Mark DePristo
2039ce6102
Default values now displayed in arguments
...
DiffEngine fixed so that newInstance() would work. Pretty quickly encountered a situation where newInstance() failed. Debug output now written when this occurs in the log.
Logger now used instead of standard out, with INFO the default level.
2011-07-24 22:56:55 -04:00
Mark DePristo
c43b5981f2
Hidden variables are hidden by default. Settable by command line option
...
DiffObjectsWalker test arguments removed.
Minor refactoring of GATKDoclet
2011-07-24 20:52:44 -04:00
Mark DePristo
1c1f1da349
Fixing compilation
2011-07-24 20:01:59 -04:00
Mark DePristo
9f06f6c493
Split GATKDoclet from ResourceBundleDoclet. Refactored GATKDocWorkUnit
2011-07-24 20:00:04 -04:00
Mark DePristo
ff85687679
Merge branch 'master' into help
2011-07-24 18:14:32 -04:00
Mark DePristo
83996f7951
Enumerated types are working.
2011-07-24 18:14:21 -04:00
Mark DePristo
3c34e9fa65
Cleanup enums and tables
2011-07-24 17:45:58 -04:00
Mark DePristo
c620d96c96
Inline enum documentation is working
2011-07-24 17:22:14 -04:00
Mark DePristo
793e7d3d1d
Improved header and argument details
...
Argument detail structure cleaned up. Only relevant pieces of information are shown now, and in a cleaner layout.
Misc. cleanup in the code.
2011-07-24 16:36:25 -04:00
Mark DePristo
c6af4efcdc
Implemented see also and version header
2011-07-24 16:10:17 -04:00
Mark DePristo
5e0fe2d0f9
Support for style.css via refactored common.html included in all files
2011-07-24 15:42:39 -04:00
Mark DePristo
d0ab6bf7a9
Now links to sub and superclass documentation, where possible.
2011-07-24 09:56:17 -04:00
Mark DePristo
e2dabb70b8
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-24 08:57:47 -04:00
Mauricio Carneiro
1ef964c92c
Merge branch 'contig'
2011-07-24 02:40:42 -04:00
Mauricio Carneiro
7ffedf211c
Contig comparator -- sorting contigs like Picard
...
This is very useful if you want to output your text files or manipulate data in the usual chromosome ordering:
1
2
3
...
21
22
X
Y
GL???
...
Just use this comparator in any SortedSet class constructor and your data will be sorted like in the BAM file.
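A minimal sketch of the usage described above, passing a comparator to a SortedSet constructor. The class name `ContigOrderSketch` and the hard-coded ranking (numeric contigs, then X, then Y, then everything else) are assumptions made for illustration; the comparator added in this commit may instead derive the order from the sequence dictionary.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical illustration; not the comparator from this commit.
public class ContigOrderSketch {
    // Assigns each contig name to an ordering group.
    private static int group(final String contig) {
        if (contig.matches("\\d{1,9}")) return 0; // numeric contigs: 1, 2, ... 22
        if (contig.equals("X")) return 1;
        if (contig.equals("Y")) return 2;
        return 3;                                  // everything else, e.g. GL contigs
    }

    public static final Comparator<String> CONTIG_ORDER = (a, b) -> {
        final int ga = group(a);
        final int gb = group(b);
        if (ga != gb) return Integer.compare(ga, gb);
        if (ga == 0) {
            // Compare numeric contigs by value, so that 2 sorts before 10.
            final int c = Integer.compare(Integer.parseInt(a), Integer.parseInt(b));
            if (c != 0) return c;
        }
        // Same group (or equal numeric value): fall back to string order.
        return a.compareTo(b);
    };

    public static void main(final String[] args) {
        final SortedSet<String> contigs = new TreeSet<>(CONTIG_ORDER);
        contigs.addAll(Arrays.asList("X", "10", "GL000192.1", "2", "Y", "1", "22"));
        System.out.println(contigs); // prints [1, 2, 10, 22, X, Y, GL000192.1]
    }
}
```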
2011-07-24 02:33:19 -04:00
Mark DePristo
6b501e267b
Includes non-concrete classes in docs
...
CommandLineGATK has extraDocs to ReadFilter and UserException as well
2011-07-23 22:15:01 -04:00
Mark DePristo
7420ed098e
Semi-working version of extraDocs tag in annotation to refer to one capability being accessible in another
...
Required a significant refactoring of the GATKDoclet, which now has a unified place where the ClassDoc, class, annotation, and handler are all stored together.
2011-07-23 22:07:30 -04:00
Mark DePristo
999acacfa1
Merge branch 'master' into help
2011-07-23 20:19:33 -04:00
Mark DePristo
1d3bcce2c4
Merge branch 'master' into NoDistributedGATK
2011-07-23 20:04:50 -04:00
Mark DePristo
e262f4e10b
gatkdoc now generalized to use @Annotation. Multiple subsystems now use annotation to receive docs
...
Index expanded to use summary() annotation field
UserExceptions, ReadFilters, GATK engine all use the system to generate docs
Doclet expanded to handle lots of new cases
2011-07-23 20:00:35 -04:00
Kiran V Garimella
0b36b6540f
Merge branch 'laptop'
2011-07-23 01:44:54 -04:00
Kiran V Garimella
e23cb27451
Modified MD5 to account for the triple hets that shouldn't be phased
2011-07-23 01:44:44 -04:00
Kiran V Garimella
1dba8b768c
Merge branch 'laptop'
2011-07-23 01:39:15 -04:00
Kiran V Garimella
57e3d136eb
Don't try to phase triple-hets either.
2011-07-23 01:38:58 -04:00
Kiran V Garimella
f366124778
Merge branch 'laptop'
2011-07-23 01:25:36 -04:00
Kiran V Garimella
45f2ca8d99
Changed MD5 to reflect latest changes to PhaseByTransmission.
2011-07-23 01:21:07 -04:00
Kiran V Garimella
5af9d50183
Merge branch 'laptop'
2011-07-23 01:12:06 -04:00
Kiran V Garimella
5521919cc9
Fixed bug where variants to phase were not being selected properly.
2011-07-23 01:11:28 -04:00
Kiran V Garimella
7da99388ac
Merge branch 'laptop'
2011-07-23 01:01:11 -04:00
Kiran V Garimella
58eed20b83
Copy all entries from the attributes map, rather than attempting to modify an unmodifiable map.
2011-07-23 01:00:46 -04:00
Kiran V Garimella
b5deff48e6
Merge branch 'laptop'
2011-07-23 00:56:50 -04:00
Kiran V Garimella
5638017137
Removed the nofilters argument specification in the integrationtest
2011-07-23 00:56:23 -04:00
Kiran V Garimella
ffa361f57f
Merge branch 'laptop'
2011-07-23 00:50:38 -04:00
Kiran V Garimella
9417ba8c2c
Modified to accept multi-sample VCFs, removed the application of filters, and changed transmission probability field to be a genotype field rather than an INFO field.
2011-07-23 00:48:26 -04:00