ee2f022c71Make new TraverseByLociByReference the default.
hanna
2009-04-24 19:50:11 +0000
e50ae97fe1Introduce new index-based fasta reader. Clean up MicroManager code, pushing necessary code back into TraversalEngine.
hanna
2009-04-24 19:40:21 +0000
3739682befActually has working version of the python script to merge multiple bam files
depristo
2009-04-24 19:15:55 +0000
40a2b3eeb3Basic logistic regression support for calibrating qualities; mostly for Andrew to experiment with
depristo
2009-04-24 19:09:50 +0000
38c2f73457LogRegression.py script that converts parameter files for each dinucleotide regression into one file to be read in by correction script.
andrewk
2009-04-24 18:31:26 +0000
061f4328b1Covariate counter now outputs files used by R to do logistic regression.
andrewk
2009-04-24 17:11:57 +0000
4e4fd33584First draft of actual pooled EM caller.
jmaguire
2009-04-24 13:43:41 +0000
dd408a2a9aFirst draft of actual pooled EM caller.
jmaguire
2009-04-24 13:42:15 +0000
13d4692d2e1. Added a by-interval traversal. 2. Added a shell for the indel cleaner walker (it's currently being used to test the interval traversal). 3. Fixed small bug in downsampling (make sure to downsample the offsets too) 4. GenomeAnalysisTK.execute => anyone object to my change to "instanceof" instead of trying to catch a ClassCastException (yuck)?
ebanks
2009-04-24 04:33:35 +0000
1984bb2d13Made num_loci_total public because I'm lazy. I'll change it back later.
kiran
2009-04-24 03:57:23 +0000
7ce11e152bSimplified. Added option to perform four-base retest of a putative variant.
kiran
2009-04-24 03:56:15 +0000
135d3eabebNow only distributes 80% of the residual probability to the secondary base, 10% each to the other two bases. Nicer labelling for stringified probability distribution output.
kiran
2009-04-24 03:34:43 +0000
3cda85f2e3New implementation of binomial probability that accurately computes values down to around 1e-237.
kiran
2009-04-24 03:32:04 +0000
305584b69eTest class for MathUtils with a test for binomialProbability().
kiran
2009-04-24 03:31:02 +0000
bd4cacb832Added code to make a read group and sample name for BAM files that don't annotate them on reads. The defaults for both are now the filename, but this may be shortened in the future.
aaron
2009-04-24 00:31:00 +0000
45d962e491I understood the contig index incorrectly when I initially wrote this code. Fixed.
hanna
2009-04-23 22:31:43 +0000
635bfd8604Added a bit of a hack to get the header back to the walker by initialization time, which was before sharding in the last version.
aaron
2009-04-23 21:07:11 +0000
0208d201c7Forgot this in the last commit...
aaron
2009-04-23 20:47:22 +0000
3dc2afd7abAdded the ability to get a merged header in a LociByReference traversal
aaron
2009-04-23 20:34:52 +0000
282f1d88b8Make the operation 'read from the iterator and place on the queue' atomic with respect to hasNext(), next().
hanna
2009-04-23 20:16:26 +0000
998763950cOops, contig index is a zero-based value, not one-based
aaron
2009-04-23 19:08:16 +0000
8c13940c5aA lot of changes to support by-read sharding and some from debugging of the by loci traversals
aaron
2009-04-23 19:03:14 +0000
32715a6c47First check-in of walker that produces tables showing covariation of read cycle, and dinucleotide with quality score in a format usable for R analysis and for doing logistic regression.
andrewk
2009-04-23 18:58:25 +0000
0720d248ceAdding the test case for by reads sharding of BAM data sources
aaron
2009-04-23 18:01:22 +0000
cae54ec52dWalker for creating intervals to be used in the indel cleaner
ebanks
2009-04-23 17:58:19 +0000
96db1477d4I meant for default lod threshold to be 5.0, not 0.0.
kiran
2009-04-23 17:46:08 +0000
ca66cccd2fPrivatized constructor to prevent instantiation.
kiran
2009-04-23 17:45:39 +0000
77e1e9e2f1Added a static class to house useful math methods. All this has at the moment are methods for comparing doubles and floats, but I suggest that the bulk of our little math methods should be added here to avoid filling up Utils.java with so much random stuff.
kiran
2009-04-23 17:45:19 +0000
3d7575bbb8Oops...omitted walker.initialize().
hanna
2009-04-23 17:35:28 +0000
11e85f1969Four-base mode now estimates the genotype using the one-base method and retests the site if the one-base method suggests the site is a het.
kiran
2009-04-23 17:23:24 +0000
bd719f9c06When checking that values are not infinite, also prints out the position so that I know which site was giving the error and I can just go there and debug it.
kiran
2009-04-23 17:21:58 +0000
efba30f1a1Added a constructor in which the lod threshold can be set.
kiran
2009-04-23 17:20:48 +0000
8c1905c7d9Simple walker to print all of the sample names present in a merged bam file.
jmaguire
2009-04-23 12:26:56 +0000
ef4a107548Updated the hello world document to reflect system changes.
aaron
2009-04-22 23:25:15 +0000
a3a1c9dae8Suppressed emission of duplicate paths through a four-base pileup.
kiran
2009-04-22 21:08:45 +0000
b8a6f6e830Support for indexBAM command
depristo
2009-04-22 19:39:07 +0000
d99d67d51cRefactored to clean it up a bit
ebanks
2009-04-22 19:18:46 +0000
1bf4d040d8Increase default shard size from 5 to 100000.
hanna
2009-04-22 18:29:44 +0000
3af66a462eMake PrintLocusContextWalker less verbose.
hanna
2009-04-22 18:28:02 +0000
ffcd672c1cIntermediate commit while working on getting four-base probs to work in the single sample genotyper. Has infrastructure for the new combinatorial approach and just choosing the best base more intelligently given a probability distribution over bases and the reference base.
kiran
2009-04-22 18:06:50 +0000
4cafb95be8TraverseByLoci / TraverseByLociByReference suffered from the same sam-triggered off-by-one (?) bug as TraverseByReference; it was just less obvious here because these versions don't shard.
hanna
2009-04-22 15:48:20 +0000
cb2f621d01reverting accidental commit of change to shard size
kcibul
2009-04-22 00:33:28 +0000
b820130dce* added ability to load multiple BAM files from command line
kcibul
2009-04-22 00:28:08 +0000
5b8502745aAdded an epsilon (1e-4) to the tertiary and quaternary base hypotheses.
kiran
2009-04-22 00:01:37 +0000
2ac240d78bRemoved an extraneous print statement.
kiran
2009-04-21 23:36:36 +0000
0149c887ffFixed a bug wherein the residual probability was not being distributed properly when a file had secondary probs and the best and next-best base agreed.
kiran
2009-04-21 23:36:09 +0000
5abfc7d079Added an argument ('extended' or 'ext') that outputs the four-base probs in a long format.
kiran
2009-04-21 22:27:26 +0000
dac76f041bAdded some methods to retrieve the probability distributions of individual bases.
kiran
2009-04-21 22:26:25 +0000
5b2a7c9c23Added some methods to complement a single simple base ([AaCcGgTt]) and reverse-complement a byte-array of bases.
kiran
2009-04-21 22:25:33 +0000
55ca272919reimplemented; now implements Genotype interface instead of AllelicVariant
asivache
2009-04-21 21:06:42 +0000
5f37ba8f26now can be asked to log at INFO level all concordant or discordant sites, or both
asivache
2009-04-21 21:03:44 +0000
1f84b9647dauxiliary data structure for mendelian concordance reporting; it's nice to have the latest version checked in in order for the code to compile...
asivache
2009-04-21 21:02:40 +0000
ece3e9969eone trivial walker to filter reads; bam in -> filter -> bam out
asivache
2009-04-21 20:39:29 +0000
64b2fd866f* extracted core quality-score based genotype likelihood code * precompute expensive operations (log/pow) based on Picard experience
kcibul
2009-04-21 18:58:43 +0000
11c520b283completed my old draft of the old school single sample genotype walker
jmaguire
2009-04-21 05:38:04 +0000
b8233d92c8Simple IO walker to test / crush file systems and evaluate I/O performance in general
depristo
2009-04-20 14:07:14 +0000
bf76eab955whoops; fix a comment line.
jmaguire
2009-04-19 17:54:54 +0000
bcba1ff424Fix a minor rounding bug and putz around with fractional counts in the pooled caller.
jmaguire
2009-04-19 17:52:24 +0000
af6788fa3dMisc: 1. Added logGamma function to utils 2. Required asserts to be enabled in the allele caller (run with java -ea) 3. put checks and asserts of NaN and Infinity in AlleleFrequencyEstimate 4. Added option FRACTIONAL_COUNTS to the pooled caller (not working right yet)
jmaguire
2009-04-19 15:35:07 +0000
eafb4633baTemporary workaround for samtools index bug: there seems to be an off-by-one error. Will file bug report.
hanna
2009-04-17 23:14:41 +0000
2a937fa8d3set SAM file header's sorting order to unsorted, hopefully it will help to speed things up
asivache
2009-04-17 19:32:24 +0000
03ec3452f2a first, simplest version of a walker that filters out reads based on user-specified criteria and writes remaining reads into a new bam file
asivache
2009-04-17 18:51:39 +0000
1660379753Matt's current status.
hanna
2009-04-17 18:03:11 +0000
01d000411dadded some updates to the omniplan
aaron
2009-04-17 17:44:04 +0000
d639ec3776Remove some copied code to make sure the traversal engine stays in sync with the locus context provider.
hanna
2009-04-17 16:41:56 +0000
df5aae5ed4got rid of a couple of warnings and added percentage(x,base) methods
asivache
2009-04-17 15:15:21 +0000
50ae1763f7Support for -continue_after_errors flag in the validating pileup walker in case you want to see errors as they arise, rather than aborting greedily
depristo
2009-04-17 03:13:11 +0000
ee5ab9536ftrivial checking / flagging issues to enable testing of merging iterator performance
depristo
2009-04-17 03:11:59 +0000
dbf2344cefFixes for including duplicate reads in the locus traversal; now checks that the ref arg is provided when needed
depristo
2009-04-17 01:27:36 +0000
01be8f09e3Exception cleanup. All our non-runtime exceptions should extend from StingException, StingException needs to be lower in the tree to build.
hanna
2009-04-16 22:17:25 +0000
e5c80e59dcfixed the case when you're not seeking, it didn't initialize
aaron
2009-04-16 22:16:03 +0000
f47f640df6Better debugging output and testing
depristo
2009-04-16 21:54:56 +0000
165e504d1cTurning on the new TraverseLociByReference is now dependent only on the -et flag. REGION_STR does not matter.
hanna
2009-04-16 19:45:47 +0000
12e1f192c4Fixed a bug in this code where it would eat reads that didn't start at the beginning of the provided interval. This should fix / help fix Kristian's problem
aaron
2009-04-16 18:42:00 +0000
835f1067d8added isHom() and isHet() queries to the Genotype interface (with the obvious meaning)
asivache
2009-04-16 18:41:39 +0000
55537c0d1echange class name, now it compiles...
asivache
2009-04-16 16:51:00 +0000
4f9bc7206fsome cleanup, also ensuring that all reads get written into output
asivache
2009-04-16 16:49:25 +0000
e8a6cdb386renamed standalone main
asivache
2009-04-16 15:56:46 +0000
832afd3d60renamed standalone main
asivache
2009-04-16 15:56:27 +0000
85308f4ddcresurrected indel tool's standalone main
asivache
2009-04-16 15:55:52 +0000
6f56938d42* added a bit more debugging output
kcibul
2009-04-16 15:20:26 +0000
d35a542bb9* fixed bug where the merged header was not being set on the read (although the read group was)
kcibul
2009-04-16 12:53:07 +0000
240eb18564fix a few related issues when not all the reads were written into the output files. now cleaned output still contains all reads either with modified alignments or untouched
asivache
2009-04-16 03:56:47 +0000
0d324354aeseparate interface for genotypes as opposed to (population) allelic variants
asivache
2009-04-16 03:55:16 +0000
7e05b43f40* added some error checking for read groups
kcibul
2009-04-16 03:22:49 +0000