2eabcfedb7Fixed potential bug with next() operation returning empty contexts when a read contains a large deletion. We can now use the look ahead safely...
depristo
2009-04-15 21:41:30 +0000
7261787b71Fixed potential bug with next() operation returning empty contexts when a read contains a large deletion. We can now use the look ahead safely...
depristo
2009-04-15 21:38:28 +0000
e70aecf518bug fix, but important
aaron
2009-04-15 21:07:20 +0000
feebd8cd55Latest version of sam.
hanna
2009-04-15 21:02:25 +0000
1edfe48194Better debugging output with .debug
depristo
2009-04-15 19:09:18 +0000
9cc808104eFixed subtle bug in permitting EXPAND_WINDOW to be > 1. We now use the right window size so we avoid including empty hangers. There's still a rare bug to sort out, which occurs in the case where a read with an indel can generate empty hangers.
depristo
2009-04-15 19:08:26 +0000
180ff13290Added a bunch of changes to support the new MicroManager code
aaron
2009-04-15 18:29:38 +0000
339261c4a9Load the dictionary and sanity check it against the index.
hanna
2009-04-15 18:04:13 +0000
26e84d7fd6Added index iteration for ReferenceSequenceFile interface compatibility. Added better error checking for querying past the end of a contig. Lots more testing.
hanna
2009-04-15 17:17:11 +0000
3fda8613c3* minor formatting changes * support for "extended" output
kcibul
2009-04-15 15:11:05 +0000
12407b5b1aDeleted the old file
aaron
2009-04-15 13:55:01 +0000
6db9127f90Added changes to shattering, refactored SAMBAM into SAM
aaron
2009-04-15 13:52:56 +0000
182626576fBasic indexed fasta POC in place. Requires a more complete implementation of the ReferenceSequenceFile interface, and much more testing.
hanna
2009-04-15 13:46:56 +0000
7949e377e4Intermediate commit. Refactored some simple base manipulation stuff into BaseUtils.java. Generalized some likelihood computation logic to make future possible EM-ing easier.
kiran
2009-04-15 04:18:07 +0000
d0b8d311e6Can now optionally print the read and the alignment region of the reference.
kiran
2009-04-15 04:10:30 +0000
d4aaa1bef4* fixed (with Matt's help) the argument parsing * outputting UCSC wiggle format
kcibul
2009-04-15 02:17:39 +0000
e6fb122d7dAdded some fixes and new iterator tests
aaron
2009-04-14 22:19:36 +0000
13b0995d54Adding an iterator that bounds the number of reads
aaron
2009-04-14 22:18:31 +0000
72a3d84ed2General purpose pileup code -- you can use these features to obtain detailed pileup data from reads and offsets. Useful for all pileup based walkers. Expanded support for rodSAMPileup to enable the new ValidatingPileupWalker, which takes a samtools pileup output and checks that GATK gives identical output as samtools on a per base and per qual pileup. It's going to be a very useful validation tool.
depristo
2009-04-14 22:13:10 +0000
baae98c6d5and don't allocate new 200M string every time please, just pass byte array!
asivache
2009-04-14 21:55:33 +0000
9d56355abebug fixed when reference name was passed as a string instead of actual reference bases
asivache
2009-04-14 21:46:27 +0000
222c4e5865Commented out some debugging lines
kiran
2009-04-14 20:15:41 +0000
49d76014d1Commented out a debugging line
kiran
2009-04-14 20:15:11 +0000
b39e584787Primary or secondary bases that got a quality score of literally zero led to unfortunate infinities. Added an epsilon (1e-5) to every prob.
kiran
2009-04-14 20:04:49 +0000
d28e9f9b98search over q's for finding argmax[q] p(D|q)
jmaguire
2009-04-14 19:15:45 +0000
96248cdec4Added some output to all the classes, including built-in runtime analysis
aaron
2009-04-14 19:14:53 +0000
647827b18cTransitioned indel code to use GATK and Walkers
ebanks
2009-04-14 19:14:15 +0000
b363eedd2cDeal with screwy reads by changing logic to determine whether we are past the last interval
ebanks
2009-04-14 19:13:16 +0000
0629f79049Moved fasta support files into their own package.
hanna
2009-04-14 18:13:23 +0000
7a4a5a17c0Made sequence index compatible with Aaron's junit changes.
hanna
2009-04-14 17:53:20 +0000
88ebf1a05bFixed some documentation
aaron
2009-04-14 17:41:38 +0000
186c799ffcClass to read an .fai file.
hanna
2009-04-14 17:37:18 +0000
704f1bd634It helps if I check the new base class in with my changes
aaron
2009-04-14 17:18:16 +0000
4b3578e1deAdded the base test case, fixed the rest of the test cases to follow suit. Added more verbose output to ant for junit tests.
aaron
2009-04-14 17:11:38 +0000
961dbbd4efNow output bases and qhat and qstar into the GFF.
jmaguire
2009-04-14 15:23:00 +0000
dafdff1974All bases are now indexed as A:0, C:1, G:2, T:3.
kiran
2009-04-14 14:49:43 +0000
40ea22eb17Added some methods to return the cross-talk partner base of a given base or base index.
kiran
2009-04-14 14:49:12 +0000
eb4b4a053bA bunch of updates to the SAM/BAM data source, along with test cases for the merging of multiple files (it works!).
aaron
2009-04-14 14:19:20 +0000
30121534edOutputs the secondary bases and quals (if available) in verbose mode. Prefixed with the tag 'SQ='.
kiran
2009-04-14 13:58:28 +0000
998fad76c6Some utility methods for creating pileups of secondary bases and secondary quals.
kiran
2009-04-14 13:57:54 +0000
8b2c2e677bUses the cleaner new GenomeLoc(read) syntax
depristo
2009-04-14 00:55:43 +0000
1cee7948abAdded lots of assertions to check for problems.
depristo
2009-04-14 00:55:19 +0000
794360c410Added verbose option to show mapping qualities and base qualities as ints!
depristo
2009-04-14 00:54:48 +0000
cc75e8f712Uses the cleaner new GenomeLoc(read) syntax
depristo
2009-04-14 00:53:58 +0000
11377ef390Added lots of assertions to check for problems. The current GenomeLoc needs to be cleaned up and refactored but at least it runs. We need unit tests ASAP
depristo
2009-04-14 00:53:08 +0000
bb666ce392Added mappingQualPileup function for use in the verbose mode of Pileup
depristo
2009-04-14 00:51:26 +0000
bc43c0eefcthere really are cases when we cannot merge until we get just two piles; now we do not crash in those cases but print a warning and just show the resulting n piles even when n>2
asivache
2009-04-14 00:45:47 +0000
8e6093d5a5remove mom/dad/kid cmd line arguments that were needed for mendelian walker; now we can use generic track binding!!
asivache
2009-04-14 00:45:34 +0000
f838a5e511Changed some double comparisons of the form a == b to abs(a - b) <= precision. Now we shouldn't be passing or failing some if conditions due to floating-point precision.
kiran
2009-04-13 20:05:46 +0000
887adcfc7fSome minor fixes to the last check-in
aaron
2009-04-13 18:24:51 +0000
f2d0d73309removed old shard strategy code
aaron
2009-04-13 18:13:45 +0000
dd604799dcAdded some new code for shard support over reads
aaron
2009-04-13 18:11:43 +0000
d44c30154aadded MAX_READ_LENGTH - now we can ignore long reads (454?); a bad idea in general, but the performance hit is too hard to take, at least for preliminary testing runs...
asivache
2009-04-13 16:53:12 +0000
e91a429c58A class to print out as much context about the given locus site as is possible. Useful for testing traversal engines -- run old and new code across a given region and diff the output to make sure they have the same context.
hanna
2009-04-13 15:29:55 +0000
cf929a8275Get rid of test case's dependence on transient methods.
hanna
2009-04-13 15:16:42 +0000
6e180ed44eUnified caller is go.
jmaguire
2009-04-13 12:29:51 +0000
f39092526dAdded function RandomSubset
jmaguire
2009-04-13 12:14:53 +0000
b4136b6d6ea few tweaks to make it more robust: ignore reads with cigars containing anything but I,D,M; don't set up contig ordering manually, rely upon reference sequence and its dictionary; don't die if a record does not have NM tag, but fall back to direct counting instead; now requires reference as a cmdline arg
asivache
2009-04-13 04:49:19 +0000
32e000bbfeAdded MatchSQTagToStrand jar target.
kiran
2009-04-13 00:50:36 +0000
756e6c61d8Strictness args are presented as lowercase in the help, but only accepted if uppercase. Changed help to list the valid arguments in uppercase.
kiran
2009-04-13 00:50:19 +0000
c51f51f255Make sure we always write at least 1000 points per base in each cycle's scatterplot. Print the disagreement rate between Bustard and FourBaseRecaller.
kiran
2009-04-13 00:49:41 +0000
1fb16d54e0For SAM files that have no alignments and when no reference is specified, contigInfo.getSequence() is null, causing an error when getSequenceName() is called on the resulting null pointer. Check for null instead and return that instead of barfing here.
kiran
2009-04-13 00:48:21 +0000
5e96ab6161Helpful functions for converting a base (char) to a base index (A:0, C:1, G:2, T:3), alphabetical and consistent with Illumina conventions to minimize confusion.
kiran
2009-04-13 00:46:23 +0000
35fc002d5dDebugging information is now written in such a way to make it easier to import into R.
kiran
2009-04-12 19:45:33 +0000
6ee4fe5a20Fixed a Bustard/Firecrest file synchronization bug.
kiran
2009-04-12 19:44:07 +0000
817278be46If a SAMRecord is on the negative strand, reverse complement the SQ tag.
kiran
2009-04-12 19:42:24 +0000
1d5a22cacfExtracts a Fastq file and the SQ tags to a separate file.
kiran
2009-04-12 19:41:44 +0000
e410c005c0A debugging tool to ensure the SQ tag in a four-prob SAM file matches the SAMRecord strand orientation.
kiran
2009-04-12 19:40:42 +0000
9c37400c4fAdded basic performance testing so I can make sure concurrent access doesn't slow down overall fasta access.
hanna
2009-04-12 18:05:56 +0000
c7777d46d6* re-enabled setting of sequence dictionary information on GenomeLoc
kcibul
2009-04-12 02:44:14 +0000
ce72932a45* refactored GenomeLoc to use contigIndex internally for performance and fixed several calling classes * added basic unit test for GenomeLoc * fixed bug when parsing genome locations like chr1:5000, where the start position was being left as maxint rather than being set to the same value as the stop position.
kcibul
2009-04-12 02:25:17 +0000
49fd951d8cInitial test suite for FastaSequenceFile2, so I can add parallelism support with abandon.
hanna
2009-04-11 21:10:42 +0000
608a66e6abTbyLocibyRef previously didn't seem to support traversals with no interval specified. Put in a temporary fix until the threaded approach is in place.
hanna
2009-04-10 22:14:06 +0000
c2669021b8Cleanup, and support either by-interval traversals or full traversals in data source-backed code.
hanna
2009-04-10 22:09:01 +0000
2322bb7d86Workaround: use a single ReferenceIterator for an entire micromanaged traversal. We'll have to do something about ReferenceIterator thread safety later.
hanna
2009-04-10 20:50:28 +0000
95753e1b34Should've been calling queryOverlapping in locus mode.
hanna
2009-04-10 20:22:04 +0000
a2a38a4bbbRemoved RepairBadlyCombinedSamFile jar target.
kiran
2009-04-10 04:21:19 +0000
2b59110dcaCombineSamAndFourProbs is better.
kiran
2009-04-10 04:19:53 +0000
2ef2c9e121Fixed an issue wherein the SQ field was only being pulled from the first read of the pileup, no matter what. Fixed an issue wherein Andrew enumerates his bases as A:0, C:1, T:2, G:3, and Kiran's QualityUtils methods enumerate bases as A:0, C:1, G:2, T:3 (we should standardize this). Fixed an issue wherein the remaining probability was being divided by 3 rather than 2 when four-base probs are enabled.
kiran
2009-04-10 04:17:53 +0000
17b3d5b554New ROD accessing system, including a generalized interface for binding ROD on the command line that doesn't require you to change GenomeAnalysisTK.java
depristo
2009-04-09 22:04:59 +0000
f5cc2d8b0bCommented out import of IlluminaParser.
kiran
2009-04-09 21:30:29 +0000
0d825ccfc1Oops. Fixed duplicate reference to the reference.
hanna
2009-04-09 21:27:57 +0000
9afa101465Add interval support to the
aaron
2009-04-09 21:23:43 +0000
c5220c0822Four-base probs are now decoded with the relevant method in QualityUtils
kiran
2009-04-09 20:52:17 +0000
9bc763a835A better (aka 'working') tool for combining four-base probs with an aligned sam file.
kiran
2009-04-09 20:51:37 +0000
b7a2e82b46Can optionally process raw or corrected intensities.
kiran
2009-04-09 20:50:11 +0000
6cdad10dd1Make output type identical to the bustard parser so the values can be easily swapped for one another.
kiran
2009-04-09 20:49:34 +0000
d0ce56e018Remember to take the strand flag into account when calculating error rate per cycle as a surrogate for instrument performance.
kiran
2009-04-09 20:48:45 +0000
8a1207e4dbBringing up scaffolding for integration of locus traversals by reference with Aaron's data source code. Reverts to original TraverseByLociByReference behavior unless a special combination of command-line flags is used.
hanna
2009-04-09 20:28:17 +0000
49b2622e3dHelper utility for merging BAM files
depristo
2009-04-09 20:10:41 +0000
8e2f5471a1Some cleanup to the data source, and another JUnit test case.
aaron
2009-04-09 14:58:05 +0000
d56193b6dfCleanup of a couple of output statements
aaron
2009-04-09 14:09:07 +0000
c556a97f17Skeleton of Somatic Coverage tool
kcibul
2009-04-09 02:34:03 +0000