Commit Graph

  • 1df23b0417 Added a definitely inappropriately placed testing of the new fasta seeking system at the bottom of the file -- it's not called but it probably should be moved to somewhere more appropriate. depristo 2009-03-22 19:57:52 +0000
  • 611ab0bdb3 Uses the new FastaSequenceFile2 for high-performance seeks. Added far superior error checking (and reporting!) messages for incorrect usage of the location string. Prevents users from seeing complex FunctionalJ error message depristo 2009-03-22 19:56:54 +0000
  • e77d735e08 New reference iterator that works with the new FastaSequenceFile seek operations. Greatly improves performance of jumping around in the genome. depristo 2009-03-22 19:54:02 +0000
  • c8d7207a8e Fixed problem with GenomeLoc logic -- optimization was causing assertion failure. depristo 2009-03-22 19:53:00 +0000
  • 52ad08298a New FastaSequenceFile with support for poor-man's seek and querying the next contig name without loading the whole next contig into memory. Vastly speeds up the performance of jumping to distant parts of the genome with the location operator. depristo 2009-03-22 19:43:56 +0000
  • 4888df97c7 Added averageDouble function. How can we write a generic average function?! depristo 2009-03-22 19:41:30 +0000
  • cf407168cf keep track of the position you're called on. jmaguire 2009-03-22 16:47:49 +0000
  • 096f0dbc68 don't run off the end of the list of loci. jmaguire 2009-03-22 16:47:29 +0000
  • 4e0cd6ab84 Now works on single samples and computes metrics. jmaguire 2009-03-22 15:45:12 +0000
  • f7ad17016d some reformatting and logic cleanup in the comparison functions jmaguire 2009-03-22 15:36:56 +0000
  • dfe50ce773 optionally check that the records are sorted. jmaguire 2009-03-22 15:36:24 +0000
  • 149ac3d96c Now iterate over a large set of tiny intervals efficiently. jmaguire 2009-03-22 12:04:11 +0000
  • df2a7039cb Heinous bug fixed: only rookies forget that external conditions need to be re-checked after loop ends on some other condition, duh! In addition, msa piles are now seeded with a single read sequence each (if there are fewer than 4 reads it might be hard to seed with two pairs) asivache 2009-03-21 18:32:18 +0000
  • 411e5cf647 Added FourBaseCaller as a jar build target. kiran 2009-03-21 17:59:13 +0000
  • 6e1fa7d61a Java version of basecaller that estimates probability distribution over four-base hypothesis space via an internal-control-initialized Gaussian mixture model over base channel intensities. kiran 2009-03-21 17:58:50 +0000
  • 3e350006e0 Added a directory to house some Illumina output parsers. Hopefully this will be merged back into Picard at some point. kiran 2009-03-21 17:55:56 +0000
  • 497eea2e5c minor changes and shuffling code around; also, now when realigned piles are printed they are sorted by start position asivache 2009-03-21 17:43:49 +0000
  • 0ea44a5805 1st draft of support for a file containing a list of intervals. jmaguire 2009-03-21 16:07:32 +0000
  • 1fcf4c0cbf Update picard to work with new samtools. hanna 2009-03-20 21:51:26 +0000
  • 5dca560c3c A bunch of refactoring, and more on the way. jmaguire 2009-03-20 21:31:07 +0000
  • b806a9cf68 Updated for new version of samtools, which returns a sequence dictionary rather than a simple list of sequences. hanna 2009-03-20 20:38:24 +0000
  • 6e2d939905 Added subversion rev 180 of the sam library. hanna 2009-03-20 20:17:51 +0000
  • c5433a3120 dumps out base qualities per position for use in making boxplots ebanks 2009-03-20 17:01:18 +0000
  • 1161c261ac made all data members public. switched logOddsVarRef to LOD. jmaguire 2009-03-20 16:44:17 +0000
  • 9b5e5e06f9 Now supports checking that the input files exist and are good depristo 2009-03-20 16:40:54 +0000
  • f3f1b47808 deal with reverse complemented reads ebanks 2009-03-20 16:01:49 +0000
  • 9ec96414c7 git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@114 348d0f76-0448-11de-a6fe-93d51630548a asivache 2009-03-20 15:54:29 +0000
  • 322f4b944f Better stress test depristo 2009-03-20 15:52:54 +0000
  • 3565b50ff5 main class (argument processing and traversing the reference) and implementation of all the Receiver functionality for building read piles over indels asivache 2009-03-20 05:18:04 +0000
  • 4c3b92b860 comparator for interval objects asivache 2009-03-20 05:15:13 +0000
  • f810412d75 equals(), hashCode() updated/added, also a few minor changes asivache 2009-03-20 05:13:07 +0000
  • 4badd54216 Indel also implements Interval interface but has its quirks asivache 2009-03-20 05:11:17 +0000
  • 501e92d441 an interface for an interval object and simple minimum implementation; note: in contrast to arachne, this is closed interval asivache 2009-03-20 05:09:56 +0000
  • 29d2d460f3 a trivial interface and even more trivial implementations that do nothing (ignore the data they receive) asivache 2009-03-20 05:08:15 +0000
  • b83c8319c7 Crushed subtle and potentially insidious bug in seeking within the fasta; a beer for anyone who can tell me the situation where this might arise... depristo 2009-03-20 00:07:06 +0000
  • 34ee48fd82 Fixing output printing issues in the code, as well as adding more safety checks depristo 2009-03-19 23:02:49 +0000
  • 6fdd622160 Describe how GATK finds walkers. Change the example to avoid copying the class file into the walkers directory. hanna 2009-03-19 22:41:12 +0000
  • 104e2811ec Configure the plugin directory. hanna 2009-03-19 22:12:25 +0000
  • 6bcdac5c62 Restructured AlleleFrequency classes into 3 classes: AlleleFrequencyWalker, AlleleFrequencyMetricsWalker, AlleleFrequencyEstimate. AlleleFrequencyMetricsWalker class now calls mapper function of AlleleFrequencyWalker and works with the result. AlleleFrequencyEstimate is now a separate class instead of a subclass of AlleleFrequencyWalker. andrewk 2009-03-19 22:06:01 +0000
  • 41fec1565c Hello, world! for GATK. hanna 2009-03-19 21:46:22 +0000
  • 7bc45b68aa Added dependencies on two libraries: the Colt package, which is a collection of high performance computing libraries from CERN; and Log4j, which will be our new logging platform. aaron 2009-03-19 16:16:31 +0000
  • 5fa99f430e One line format is useable and two levels of debug output are available (debug = 1: one line format, debug = 2: table of sampled probs for each locus). Class AlleleFrequencyMetrics computes %dbSNP and frequency of SNPs. andrewk 2009-03-19 15:05:05 +0000
  • f1034f3dfd Stress Test utility for pushing the GATK to its limits. Takes a list of sam files and runs Analyses on them all, optionally in the queue depristo 2009-03-19 03:15:00 +0000
  • 4242dba295 Remove endless iterator. hanna 2009-03-18 23:53:40 +0000
  • 225ea64bd9 Moved extra walkers at Mark's request. hanna 2009-03-18 23:52:08 +0000
  • ffb6f8f5da Move the basic gatk framework into the core subtree. hanna 2009-03-18 23:39:00 +0000
  • 69316f1873 removed unused import statement asivache 2009-03-18 21:56:15 +0000
  • 875272e5c5 moved counted object to utils asivache 2009-03-18 21:54:04 +0000
  • e09af2ef70 changed variable declaration from concrete class to interface asivache 2009-03-18 21:50:47 +0000
  • 708ada3e99 an accessory for CountedObject: builds a comparator for CountedObject<T> given a comparator for T; compares the underlying objects T themselves, *not* the associated counters asivache 2009-03-18 21:45:54 +0000
  • 37101045af a simple wrapper class; less overhead than keeping a separate Integer counter object and going through object reallocation and/or autoboxing on each counter increment asivache 2009-03-18 21:44:30 +0000
  • 45d2a9acd8 Added walker to print out a histogram of where mismatches occur in alignments ebanks 2009-03-18 19:46:42 +0000
  • 1096bbd4d9 Moved build.xml, ivy.xml and settings to root of Sting repository. hanna 2009-03-18 19:13:19 +0000
  • d46ee96269 Added support for loose Walker class files in walkers directory. hanna 2009-03-18 17:32:24 +0000
  • fe9e52c47e allow on fly sorting AND validation ebanks 2009-03-18 15:50:17 +0000
  • bb94c853f8 Added WalkerManager -- a class that dynamically loads available walkers from the jar file. For now, added placeholder Walker interface so that WalkerManager could work with classes of type Walker rather than classes of type Object. hanna 2009-03-17 23:22:37 +0000
  • d9fa04f65c Fixed logic ebanks 2009-03-17 22:20:03 +0000
  • 1aa3958644 Added ability to sort reads on the fly ebanks 2009-03-17 20:29:09 +0000
  • 0362cb9e59 added Utils.filterInPlace() - purges elements directly from the passed collection object without creating new list for results asivache 2009-03-17 19:06:40 +0000
  • 58aa2aab43 Rough draft of patch to use bam indices when available. hanna 2009-03-17 16:39:03 +0000
  • 151c37591e removed unnecessary import that produced a warning. where did it come from in the first place?? asivache 2009-03-17 15:46:27 +0000
  • 478425b3d8 Better error messages depristo 2009-03-17 15:37:02 +0000
  • 0fd55d91d2 Fixed bug in unsafe mode depristo 2009-03-17 15:28:04 +0000
  • c74bd871b1 added module for aligned reads ebanks 2009-03-17 14:08:54 +0000
  • 28cc670a92 Walker to print out a histogram of aligned reads per mismatches allowed ebanks 2009-03-17 14:05:29 +0000
  • 9ae551e858 Lots of error checking added, fixed bugs associated with reading files out of order, added support for U (unsafe) flag for processing reads depristo 2009-03-16 23:22:04 +0000
  • 36b8b34490 Main tool that builds the clusters (multiple alignments) - so far; to be heavily refactored; most methods should find their proper homes in other classes asivache 2009-03-16 22:03:31 +0000
  • b9ffcdf047 matrix as the name suggests; utilizes special property (zeros at diagonal and below) to use less memory at the expense of slower access; this one is built directly on primitive data type (double) so it should not have any overhead associated with java classes asivache 2009-03-16 22:01:53 +0000
  • a17ed3cbf1 this class really computes (and keeps) a gapless pairwise alignment between the two sequences, ILT-style asivache 2009-03-16 21:59:26 +0000
  • 4972b03059 a class that keeps a pile of reads and can perform some simple computations on them; does not perform multiple alignments (so far) - external tools do the job asivache 2009-03-16 21:58:05 +0000
  • 6d481c64e7 just a square matrix of arbitrary stuff; the stuff must be full fledged Java type, however, not a primitive type. Hooray Java! asivache 2009-03-16 21:56:45 +0000
  • c68e0cc1fe Walks along the sequence and emits a sequence of subsequent, encoded Kmers (uses short int, so currently it's up to K=8) asivache 2009-03-16 21:54:45 +0000
  • 34d9af4702 Remove orphaned modules directory. hanna 2009-03-16 21:53:05 +0000
  • 1e89dbfcb1 Sequence bundled with its Kmer-based lookup index (same thing as old lookup table) asivache 2009-03-16 21:52:57 +0000
  • 02dec23628 Set of classes that perform clustering of reads without regard for the reference; at this stage the code actually knows nothing about 'indels' despite the package name; kept in a separate project-specific package for now - a playground within a playground asivache 2009-03-16 21:51:13 +0000
  • 685fc8bd61 Partial implementation of single sample allele calling andrewk 2009-03-16 19:30:42 +0000
  • 4808dff110 Added latest snapshot of picard. hanna 2009-03-16 19:05:14 +0000
  • 8ccbcc4101 Mismatch counter depristo 2009-03-16 15:51:48 +0000
  • 3ac3592ee4 IntelliJ cleanup. Added docs on build environment. hanna 2009-03-16 15:26:34 +0000
  • 3b5003bd11 Added support for accessing the reference in read traversal depristo 2009-03-16 14:46:19 +0000
  • feb70ff627 Added instructions for 3rd party sources and remote debugging. hanna 2009-03-16 01:29:48 +0000
  • 28d1e33e3b Added description of available ant tasks, help using intellij on Linux. hanna 2009-03-16 00:03:55 +0000
  • d93e9613f4 fixes to build.xml depristo 2009-03-15 22:46:43 +0000
  • 69aa669928 git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@56 348d0f76-0448-11de-a6fe-93d51630548a depristo 2009-03-15 22:42:24 +0000
  • 01e950ef2f Reference ordered data files depristo 2009-03-15 22:37:20 +0000
  • 24ae381c97 Renaming of ATK to GATK, the genome analysis TK. Also added several more layers of error checking depristo 2009-03-15 22:23:25 +0000
  • c9cb7a3596 Renaming of ATK to GATK, the genome analysis TK. Also added several more layers of error checking depristo 2009-03-15 22:21:48 +0000
  • a38a038204 fixed spelling mistake of method ebanks 2009-03-13 18:51:53 +0000
  • 968281e460 fixed spelling mistake of method ebanks 2009-03-13 18:50:29 +0000
  • c33a779f33 corrected spelling of method ebanks 2009-03-13 18:48:11 +0000
  • 45d69397b7 Added the beginnings of a guide to integrating our code with Intellij aaron 2009-03-13 18:44:13 +0000
  • a2ff12ee06 Add javadoc task. Lots of work remaining to get clean generation of javadoc w/o warnings. hanna 2009-03-13 18:43:07 +0000
  • 09307f768c Fix for leniency actually working depristo 2009-03-13 16:26:18 +0000
  • 10427f8191 Bad merge has been fixed! depristo 2009-03-13 16:00:23 +0000
  • 2a8dc05f2e Support for threaded IO! depristo 2009-03-13 14:50:45 +0000
  • 851254970c First somewhat functional version of AlleleFrequency caller! andrewk 2009-03-13 04:10:43 +0000
  • 7c7d4b3a95 Added support for HangingLocusIterator depristo 2009-03-12 23:31:00 +0000
  • 04befb942e Added support for HangingLocusIterator depristo 2009-03-12 23:30:19 +0000
  • 8a63606e11 Add publication date to Picard jar. hanna 2009-03-12 21:23:30 +0000
  • 727822f0c1 Add debugging info to executables. hanna 2009-03-12 20:04:00 +0000