d35e20ce21Better error checking for missing .dict file.
hanna
2009-05-17 21:57:12 +0000
7161b8f927Disable support for short name values directly abutting their arguments.
hanna
2009-05-17 16:09:32 +0000
4ab9bfe662Upped sam-jdk jar to new version in public picard repository.
hanna
2009-05-17 15:03:05 +0000
d152c2b911New GATKArgumentCollection caused a subtle bug with argument grouping and the help system. Fixed.
hanna
2009-05-17 14:54:25 +0000
94e324b844Write N for the alt allele when we're hom-ref. Stop EM loop when we've converged (likelihood[t-1] == likelihood[t]).
jmaguire
2009-05-17 13:58:11 +0000
bd53bc18f9added new required annotations
kcibul
2009-05-17 12:24:06 +0000
6f1559bd77Cleaned up a bit. Added some documentation.
kiran
2009-05-15 21:22:24 +0000
2c4de7b5c5Switch TraverseByLoci over to new sharding system, and clean up some code in passing read files along the pathway from command line to traversal engine.
hanna
2009-05-15 21:02:12 +0000
57e5f22987We now only build the files that have changed. It should speed up compile time as our source tree grows.
aaron
2009-05-15 20:48:01 +0000
f33f3c0434added LOD threshold for determining when to clean
ebanks
2009-05-15 20:23:59 +0000
99d4ebc26dAdded functionality to return the final accumulator of a traversal, so external tools can get the result of a walker.
aaron
2009-05-15 20:20:27 +0000
dae77bf14aFixed a typo in a comment.
kiran
2009-05-15 20:07:31 +0000
bfc40f54f0Nicer output when training off of perfect reads. Not that that works yet...
kiran
2009-05-15 20:07:08 +0000
d1f3000afabed-style output for IGV
kcibul
2009-05-15 17:58:44 +0000
36db44620bImproved output. Can optionally limit the number of reads actually called.
kiran
2009-05-15 00:07:57 +0000
7834b969b4Better interface to the tabular ROD, now makes writing files easier. Also has corresponding test files
depristo
2009-05-14 23:20:11 +0000
50f32b7f61Added a shard strategy for the reduce-by-interval traversals. Also fixed bugs that I found along the way.
aaron
2009-05-14 21:20:18 +0000
8e9e2f4502Revised ROD system. Split the system into a Basic type and an interface. Enabled more control over rod accessing, including an initialize() function to fetch headers and other options from the file. Added general tabular rod, which has named columns and supports a map<String,String> interface. Comes with shiny new JUnit system for RODs. Also, added simple python script for accessing picard data.
depristo
2009-05-14 21:06:28 +0000
67293168e7Support periods in sequence names.
hanna
2009-05-14 20:17:57 +0000
641afc4e76fix a crash in the event that the input file has no read groups!
jmaguire
2009-05-14 19:27:41 +0000
d8c1b010f1Fixing the naming of the function I checked in earlier.
aaron
2009-05-14 19:27:10 +0000
7a1f85ff86option to print out the indels found by the cleaner to a file
ebanks
2009-05-14 17:50:08 +0000
b62bddee42The header was never being set. Added this hack for now and will alert the authorities ASAP...
ebanks
2009-05-14 17:18:51 +0000
959cf09d4bRemoved some debugging print statements.
kiran
2009-05-14 17:12:42 +0000
2f42a643a8A new, much simpler (and now, complete) driver program for four-base probs. Serves as a model for anyone who wants to write their own driver program that trains and calls with data from a different source than the raw Illumina data.
kiran
2009-05-14 16:58:22 +0000
5824dea0c1Trains and calls a read at a time rather than a base at a time (which, given its name, it should have done in the first place)
kiran
2009-05-14 16:57:00 +0000
e4770885fdThe four-probs for all bases in a single read. Some utility functions for generating the primary and secondary base strings, as well as generating the SQ tag byte array in a manner that's consistent with the Bustard base calls (meaning the primary Bustard call and the secondary Four-Prob call are not permitted to be the same).
kiran
2009-05-14 16:55:49 +0000
fdd123fe16A parser for the raw Illumina data. Allows one to arbitrarily jump from one tile to another.
kiran
2009-05-14 16:53:07 +0000
7aa90757acMoved the iterators over to the StingSAMIterator interface. This will help us ensure that iterators that need to be closed get closed.
aaron
2009-05-14 16:52:18 +0000
6d98234555Holds raw intensities, sequence, and quality scores.
kiran
2009-05-14 16:52:03 +0000
241de0b235A class that implements multiple training strategies and presents the training data in a common form.
kiran
2009-05-14 16:51:29 +0000
64c65c7751New methods to generate compressed SQ quality elements in line with the SAM spec.
kiran
2009-05-14 16:50:31 +0000
c3b2c66911The GATK doesn't need the rest
aaron
2009-05-14 16:20:45 +0000
0215905bb6Added an adapter class that will adapt plain iterators and closeable iterators of SAMRecords into StingSAMIterators. Also unit tests.
aaron
2009-05-14 15:17:32 +0000
5dda448ae01. Add printouts for the cleaner 2. First pass at the entropy interval walker (still needs work)
ebanks
2009-05-14 13:59:48 +0000
80c13f7127Added a getter for command-line arguments.
hanna
2009-05-14 13:55:52 +0000
307c6e4ecfOops. Forgot to add new file to svn.
hanna
2009-05-14 00:52:30 +0000
d14cab0be7Added IterableLocusContextQueue and test. Cleaned up tests, adding BaseTest where it didn't exist. Enhanced test runner to run only classes ending in ...Test.java, so that utility classes can sit alongside the tests but won't be run by JUnit.
hanna
2009-05-13 21:32:05 +0000
7b59f63f12and don't forget to close sam writer after we are done...
asivache
2009-05-13 20:46:36 +0000
de0cce87eanew optional arg added that allows specifying a separate bam file to which all piles that fail to realign are sent; plus minor fixes
asivache
2009-05-13 20:24:23 +0000
8cce3d908fBumped sam to latest.
hanna
2009-05-13 19:19:55 +0000
12ae3a22b6Break locus context data access providers into modular components in preparation for traverse by loci.
hanna
2009-05-13 18:51:16 +0000
7084ecdeb6a few changes; checked in to allow debugging.
jmaguire
2009-05-13 15:50:48 +0000
5f924c46e0Added documentation for calling the GATK from Matlab. This is to document the extremely basic and experimental support for using Matlab to call the GATK, and is more of a placeholder for when we have time to revisit supporting this.
aaron
2009-05-13 15:25:51 +0000
4f2c8bf0a3Fixed an import statement that broke when all the files were moved to this directory.
kiran
2009-05-12 20:43:16 +0000
cedc4c9ccbRefactored into oblivion.
kiran
2009-05-12 20:33:07 +0000
01de5cc0eeMoved to org.broadinstitute.sting.secondarybase
kiran
2009-05-12 20:28:29 +0000
4e4767e5deMoved to org.broadinstitute.sting.secondarybase
kiran
2009-05-12 20:26:43 +0000
219eb60716Added newly-required documentation to arguments so that build can complete successfully.
kiran
2009-05-12 20:26:10 +0000
688358190cMoved secondary base stuff out of playground for the purpose of making it a core utility. Modified package names and imports such that things would build properly.
kiran
2009-05-12 20:24:18 +0000
1518f8f9bfUpdate training data creation in CovariateCounterWalker to output much smaller files by counting the number of occurrences of each data point combination rather than outputting a line for each data point (i.e. each base). Also fixed bug in LogisticRecalibrationWalker where a null SAMHeader was being pulled from a function that is now marked deprecated.
andrewk
2009-05-12 19:23:14 +0000
6e69193e3cDeprecated calls to getSamReader on both the GenomeAnalysisEngine and the TraversalEngine. This call fails in the new style traversals, but it won't disappear until the cut-over to the new traversals is complete.
aaron
2009-05-12 18:52:42 +0000
9f942fdfa0Added code to correct the violation of the parsing interface. Now the analysis type resides in the command line arg, but is stored into the argument collection before it's passed to the genomeAnalysisEngine.
aaron
2009-05-12 15:33:55 +0000
c4d89997caput in a dummy sample_name so it'll compile
jmaguire
2009-05-12 15:12:42 +0000
c8d7223789do pooled calling properly for 1kg
jmaguire
2009-05-12 15:12:13 +0000
313a6d0fb5lots of changes to facilitate calling indels and 1kG
jmaguire
2009-05-12 15:11:42 +0000
0267ccae7fadd code for computing indel genotype likelihoods; make reference lods negative
jmaguire
2009-05-12 15:09:29 +0000
11723fbcc2added method indelPileup. Generates a pileup of indel alleles given reads and offsets (as from a locus walker).
jmaguire
2009-05-12 15:08:24 +0000
ee9077fc69LocusIterator iterated through LocusContexts, which was fine until now when we need something that iterates through loci (GenomeLocs). Rename LocusIterator to LocusContextIterator.
hanna
2009-05-12 13:54:57 +0000
608948210cCheck for a reference before extraction.
hanna
2009-05-12 13:29:44 +0000
32696b13f5Fixed method override issue with old-style traversals.
hanna
2009-05-12 01:22:18 +0000
862b8a6787intervals_file + genome_loc => intervals.
hanna
2009-05-12 01:04:18 +0000
0bca588629Botched some boolean logic.
hanna
2009-05-11 22:53:52 +0000
23e9e29964Changed reads traversals from providing a LocusContext from which the reference sequence could be extracted to a char[] containing the reference bases.
hanna
2009-05-11 22:45:11 +0000
052819bed5Switched dependencies of GenomeAnalysisTK to depend on GenomeAnalysisEngine.
hanna
2009-05-11 22:33:00 +0000
ff1b92acc4Switch over to the GenomeAnalysisEngine/CommandLineGATK system from the GenomeAnalysisTK code.
aaron
2009-05-11 22:05:58 +0000
009e71fcd9We need to sort cleaned reads ourselves (instead of letting SAMFileWriter do it) because the SAM headers are often screwed up and claim to be "unsorted". While here, I broke off the module from the SortSamIterator in case someone else wants to use it.
ebanks
2009-05-11 15:43:42 +0000
e8b8ab5985Added code to extend Matt's getReferenceBases out to the read walkers, so they can see the corresponding reference for each read.
aaron
2009-05-11 03:42:38 +0000
4ce3feba4dmy move ended up being a copy, so this is to delete duplicate files.
aaron
2009-05-11 02:10:26 +0000
898f65547eAdded code to split GenomeAnalysisTK.java into an object concerned with loading command line args, and one that runs the engines. This will allow us to run the GATK from other tools (like Matlab). Also some cleanup to separate out the legacy traversals and the new style traversals. This is not live yet, and any modifications you need should be made to GenomeAnalysisTK.java for now.
aaron
2009-05-11 02:07:20 +0000
8d43ec3d7ea fix for a situation where a chromosome on the reference file contains no reads, and doesn't align to the bam file. This came up using reference 18, which has chromosomes like chr1_random that aren't in all BAM files.
aaron
2009-05-11 01:39:25 +0000
ee02b61068added support for the argument collections code
aaron
2009-05-09 07:07:33 +0000
742840017badded the argument collection annotation for situations where fields in command line args have embedded fields that should be checked for command line args
aaron
2009-05-09 06:59:17 +0000
55c1b688bdFix mediocre javadoc.
hanna
2009-05-08 22:31:16 +0000
522f8b58beAdded second method for getting large sequences of the reference for use in reads traversals.
hanna
2009-05-08 22:18:04 +0000
517f27f331Added sharding strat. code that picks the right kind of shard, based on the traversal engine
aaron
2009-05-08 21:55:10 +0000
6e394490cbCleanup in preparation for ByLoci traversal. Also did some work minimizing unit tests.
hanna
2009-05-08 21:27:54 +0000
ee777c89deChange the default mechanism for adding ROD bindings to the new system. TODO: create a new object type for these triplets.
hanna
2009-05-08 18:43:00 +0000