gatk-3.8

Commit Graph

Author	SHA1	Message	Date
ebanks	d549347f25	Refactored GenotypeLikelihoods to use an underlying 4-base model. It needs to be modified a bit and then hooked up to a pooled model, but that is now possible. At this point, there is no difference to the Unified Genotyper. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1978 348d0f76-0448-11de-a6fe-93d51630548a	2009-11-05 21:59:25 +00:00
aaron	aacd72854f	a fix for a bug Andrey discovered: in read-based interval traversals we're duplicating reads in rare cases. The problem was that to accommodate a bug in SAM JDK indexing, we were forced to add one to the stop of our QueryOverlapping() calls to ensure we always got all of the overlapping reads. Added a PlusOneFixIterator that wraps other iterators, and eliminates reads that start outside of our intended interval (interval stop - 1). Updated and checked BamToFastqIntegrationTest MD5 sums. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1976 348d0f76-0448-11de-a6fe-93d51630548a	2009-11-05 05:26:33 +00:00
ebanks	a545859c62	Joint Estimation model now emits a reasonable slod git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1969 348d0f76-0448-11de-a6fe-93d51630548a	2009-11-03 21:12:42 +00:00
ebanks	11d950abe0	No longer allow the lod_threshold argument - use confidence instead. Have UG output qscores in all cases. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1968 348d0f76-0448-11de-a6fe-93d51630548a	2009-11-03 16:18:51 +00:00
asivache	2fb45dbd73	Make window size a command line argument git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1967 348d0f76-0448-11de-a6fe-93d51630548a	2009-11-03 16:13:35 +00:00
asivache	55f61b1f88	Bug fix in adjustment of the shift position. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1966 348d0f76-0448-11de-a6fe-93d51630548a	2009-11-03 16:08:11 +00:00
ebanks	3a33401822	2nd stage of the genotyper output refactoring is complete. Now, all output is generalized and all of the intelligence lies where it is supposed to. Next stage is syncing up old and new models and making sure we're outputting exactly what we should. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1960 348d0f76-0448-11de-a6fe-93d51630548a	2009-11-02 22:43:08 +00:00
aaron	ba67c7f02b	added a warning for those using bed files; we properly convert bed to the internal representation but the user needs to be aware that any output will be one-based closed intervals git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1959 348d0f76-0448-11de-a6fe-93d51630548a	2009-11-02 21:09:18 +00:00
aaron	b71b66bd88	the underlying parameter is a float so we need to use Float.valueOf() instead; Noticed by external user Hou Huabin git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1958 348d0f76-0448-11de-a6fe-93d51630548a	2009-11-02 20:22:25 +00:00
ebanks	af6d0003f8	-Generalized the GenotypeConcordance module to deal with any number of individuals (although it will default to its old behavior if the -samples argument is left out). -Make rods return the appropriate type of Genotype calls from getGenotype(). git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1954 348d0f76-0448-11de-a6fe-93d51630548a	2009-11-01 05:35:47 +00:00
asivache	4b0796ba58	After fixing a few glitches and bugs, this version finally works as intended git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1952 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-31 04:59:58 +00:00
asivache	ea8d5c7077	Some internal refactoring. Now "safely" ignores duplicate records (NOT duplicate reads but rather malformed bam files!) resulting from the bug/feature in CleanedReadInjector. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1949 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-30 17:50:51 +00:00
ebanks	4ee1d6f733	-Have the calculation models determine whether a call passes the lod/confidence thresholds (as opposed to returning everything and letting the UG decide); this way, walkers which call map() will get only the good calls. -Do the right thing in all models for all-base-mode (for Kiran). git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1940 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-30 02:35:51 +00:00
asivache	e3b4d4cbed	Genotyper reimplemented. Does the same thing, at least for now, but internal data structures redesign enables collecting various statistics for indel-containing/reference-matching reads. The statistics are not yet used by the caller itself to make a better judgement w.r.t. the validity of the calls it makes, but they are now printed into the output stream (--verbose). The statistics (for both normal and tumor) include: indel observation count/total coverage, av. number of mismatches per indel-containing and per ref-matching read, av. mapping quality, av. mismatch rate and av. base quality within an NQS window around the indel, numbers of indel and ref observations per strand. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1936 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-29 19:09:16 +00:00
ebanks	5cdbdd9e5b	now that the design is stable, pull the setReference and setLocation methods back out of Genotype and stick them into constructors of implementing classes git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1931 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-29 13:27:37 +00:00
ebanks	3091443dc7	Sweeping changes to the genotype output system, as per several discussions with Matt & Aaron. Some things still need to be changed, but it will entail some more design decisions first (which means I get to bug M&A again tomorrow!). git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1930 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-29 03:46:41 +00:00
depristo	86573177d1	Reverting rod walkers to use underlying refwalker implementation while we work on ROD2 and reenable the system. Added some serious sparse file parsing to variant eval tests git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1929 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-29 01:04:37 +00:00
aaron	5a3bd50537	adding error log reporting to the GATK, and a stream based output method for the argument collection git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1926 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-28 19:56:05 +00:00
aaron	04e9a494e9	removed the GenotypesBacked interface, which is currently unused. Also cleaned up some documentation lines git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1924 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-28 18:08:14 +00:00
depristo	186a8dd698	Trivial protection for null value git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1918 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-27 21:52:52 +00:00
depristo	726378be8b	Almost ready to stop doing eager decoding; waiting on Eric git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1914 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-27 15:28:05 +00:00
aaron	3fb3773098	a fix for traverse duplicates bug: GSA-202. Also removed some debugging output from FastaAltRef walker git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1912 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-26 20:18:55 +00:00
hanna	a1e8a532ad	Support for initialize() and onTraversalDone() output from parallelized walkers. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1911 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-26 20:18:31 +00:00
ebanks	75ad6bbef7	Check that map isn't being called passing in null arguments. (This seems wrong; see JIRA entry GSA-211) git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1907 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-25 02:30:36 +00:00
hanna	65b98470f3	Temporary fix: have RodLocusView manage and close its RODs. Really the relationship between these two classes needs to be rethought; see JIRA GSA-207. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1904 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-23 16:00:12 +00:00
aaron	ad1fc511b1	intermediate commit for some changes in the Variation system, so Eric can go ahead with his changes. Everything is pretty set, but the Variation interface could use a convenience method that joins all the alternate alleles. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1903 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-23 06:31:15 +00:00
ebanks	6c338eccb8	Joint Estimation model now emits calls in all formats. The whole GenotypeCall framework needs to be changed, but this will work for the time being. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1902 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-23 03:07:28 +00:00
ebanks	54c61c663c	-Cleanup of the Joint Estimation code -Don't print verbose/debugging output to logger, but instead specify a file in the argument collection (and then we only need to print conditionally) git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1899 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-22 15:25:29 +00:00
asivache	2cab4c68d4	Added method: isCodingExon(). Returns true if position is simultaneously within an exon AND within coding interval of any single transcript from the list. The old method of detecting coding positions as isExon() && isCoding() is buggy, as the position could be in the UTR part of one transcript (isExon() is true), and within coding region bounds (but not in the exon) of another transcript (isCoding() is true). As a result UTR positions would be erroneously annotated as coding. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1898 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-22 14:55:07 +00:00
ebanks	55fa1cfa06	-Renamed new calculation model and worked out some significant changes with Mark -Allow walkers calling the UG to pass in their own argument collections git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1896 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-21 20:49:36 +00:00
ebanks	9b9744109c	Mark's new unified calculation model is now officially implemented. Because it doesn't actually use EM, it's no longer a subclass of the EM model. Note that you can't use it just yet because it doesn't actually emit calls (just prints to logger). I need to deal with general UG output tomorrow. Hold off until then, Mark, and then you can go wild. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1891 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-21 02:39:23 +00:00
depristo	caa3187af8	Enabling correct high-performance ROD walker and moved VariantEval over to it. Performance improvements in variantEval in general. See wiki for full description git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1890 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-20 23:31:13 +00:00
depristo	449a6ba75a	Deleting lots of code as part of my cleanup. More classes tagged for removal. Many more walkers have their days numbered. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1885 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-20 12:23:36 +00:00
ebanks	b8ab77c91c	Don't filter out reads without proper read groups. Instead, allow the user (or another walker calling UG) to specify an assumed sample to use (but then we assume single-sample mode). git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1883 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-20 01:30:53 +00:00
ebanks	c29924e7cf	Reverting previous change. Aaron, it's all yours... git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1881 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-20 00:55:24 +00:00
aaron	d21b582b18	memory leak, where the Resource Pool was releasing based on the value and not the key, resulting in the resourceAssignments map growing with each additional shard git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1880 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-20 00:39:42 +00:00
ebanks	761a730758	assertBiAllelic -> assertMultiAllelic. Chris, if this breaks an integration test, you get it. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1879 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-20 00:09:46 +00:00
aaron	cfa86d52c2	ensure that in the indel case we don't allow identification as both an insertion and deletion at the same location in the VCF ROD git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1875 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-19 18:21:00 +00:00
ebanks	51f9ec0a5c	subtract largest posterior value from all values; this hopefully solves any precision issues git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1870 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-18 05:20:15 +00:00
ebanks	b9e8867287	-push allele frequency and genotype likelihood variable definitions down into the subclasses so that they can use different data structures -use slightly more stringent stability metric -better integration test git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1869 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-18 04:22:17 +00:00
chartl	ad777a9c14	@BasicPileup - made the counts public so they can be used @PoolUtils - split reads by indel/simple base @BaseTransitionTable - complete refactoring, nicer now @UnifiedArgumentCollection - added PoolSize as an argument @UnifiedGenotyper - checks to ensure pooled sequencing uses the appropriate model @GenotypeCalculationModel - instantiates with the new PoolSize argument git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1867 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-16 21:56:56 +00:00
andrewk	d1a4cd2f73	Added ValidationData analysis type to VariantEvalWalker; this eval takes a GFF file with validated truth data positions (bound to "validation") and calculates the accuracy of the genotype calls bound to "eval". git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1862 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-16 15:39:08 +00:00
ebanks	418e007ca6	A cleaner interface: now everyone can use UG's initialize method git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1860 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-16 14:09:16 +00:00
aaron	96972c3a5c	a fix for a bug Eric found: if your first call contains fewer samples than calls at other loci, your VCFHeader got setup incorrectly. Also moved a bunch of Lists over to Sets for consistency. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1859 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-16 04:57:50 +00:00
aaron	a69ea9b57c	Cleaning up the VCF code, adding lots of tests for a variety of edge cases. Two issues are still outstanding: updating the no call string with the standard 1000g decided on today, and fixing Eric's issue where not all the VCF sample names are present initially. also: their, I hope your happy Eric, from now on I'll try not to flout my awesomest grammer in the future accept when I need to illicit a strong response :-) git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1858 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-16 04:11:34 +00:00
ebanks	993c567bd8	I had to remove some of my more aggressive optimizations, as they were causing us to get slightly different results from MSG. Results in only small cost to running time. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1856 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-16 00:59:32 +00:00
asivache	7d7ff09f54	throw an exception if read has no associated read group git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1855 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-15 18:11:32 +00:00
depristo	0c2016c19a	Improved error messages -- now easier to read, points to the GATK Error Messages wiki, and avoids double printing of stack traces git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1850 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-15 12:07:44 +00:00
ebanks	a32470cea1	Deal with the fact that walkers can call UG's init/map functions directly. We need to filter contexts in that case since the calling walkers don't get UG's traversal-level filters. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1848 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-15 02:31:45 +00:00
ebanks	e740e7a7ce	Because walkers call UG's map function, we need to move the actual writing out to UG's reduce function. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1845 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-14 20:49:26 +00:00
ebanks	52d2e0ca07	All walkers now use read.getReadGroup() git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1839 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-14 19:27:40 +00:00
aaron	eb90e5c4d7	changes to VCF output, and updated MD5's in the integration tests git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1836 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-14 18:42:48 +00:00
ebanks	89771fef05	-Use read.getReadGroup() -Add another filter for read groups for Chris git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1835 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-14 18:08:32 +00:00
ebanks	311ab8da5a	A helper class to create the masks for the sequenom design maker. This project is now officially done. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1834 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-14 17:28:51 +00:00
ebanks	0c95d6906f	Merge both versions of the Sequenom assay design maker: use Jared's base code and add in indels. [Jared, this still emits the same output for SNPs as your original version] Remove all sequenom stuff from the FastaAlternateReferenceMaker so it can just concentrate on making alternate references... git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1831 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-14 17:11:45 +00:00
ebanks	f2886d88e0	We now emit genotype calls git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1828 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-14 02:49:56 +00:00
ebanks	96b8499a31	Remodeled version of the UnifiedGenotyper. We currently get identical lods and slods as MultiSampleCaller (except slods for ref calls, as I discussed with Jared) and are a bit faster in my few test cases. Single-sample mode still emulates SSG. The remaining to do items: 1. more testing still needed 2. we currently only output lods/slods, but I need to emit actual calls 3. stubs are in place for Mark's proposed version of the EM calculation and now I need to add the actual code. More check-ins coming soon... git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1821 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-13 20:27:01 +00:00
aaron	77499e35ac	fixes for GSA-199: Need easier way to write binary outputs to standard output. GLF and VCF now have stream constructors, and can get dumped to standard out. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1818 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-13 15:50:20 +00:00
ebanks	caf689821f	added method to get normalized posteriors git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1809 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-12 02:33:22 +00:00
ebanks	cf7a26759d	-use the getReadGroup() function that was added to picard for us -clean up some include lines git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1808 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-12 01:39:32 +00:00
hanna	d844d1c496	SAMFileWriters specified as command-line arguments were sometimes incorrectly altering the default short name. Make sure short name is not specified if shortName is not specified but fullName is. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1807 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-09 19:16:46 +00:00
hanna	da084357db	Fixed minor typo in output message. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1806 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-09 18:56:54 +00:00
ebanks	a9f3d46fa8	Your time has come, SSG. Fare thee well. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1799 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-08 20:27:56 +00:00
aaron	98e3a0bf1a	VCF can now be emitted from SSG. The basics are there (the genotype, read depth, our error estimate), but more fields need to be added for each record as necessary. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1797 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-08 19:50:04 +00:00
kiran	29ad6cd876	Made redundant by BCMMarkDupes git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1795 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-08 18:47:20 +00:00
ebanks	15bf014e0b	logger.info -> logger.debug (don't want to risk filling up my log on genome-wide calls) git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1792 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-08 17:53:11 +00:00
ebanks	04fe50cadd	* We no longer have a separate model for the single-sample case. * For now, a single sample input will be special-cased in the EM model - but that will change when the EM model degenerates to the single sample output with a single sample as input. For now, the EM code for multi-samples isn't finished; I'm planning on checking that in soon. The SingleSampleIntegrationTest now uses the UnifiedCaller instead of SSG, and so should all of you. More on that in a separate email. Other minor cleanups added too. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1785 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-08 14:08:57 +00:00
kiran	829e99413b	Rescores a variant after removing duplicates (defined very strictly as reads with the same start points). git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1782 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-08 03:07:36 +00:00
ebanks	1905b5defa	Hash by chromosome for now to reduce memory. This is a temporary solution until we decide how to retire the Injector for good. Also, with Picard's latest changes, we need to make sure we don't double-close the sam writer. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1779 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-07 20:06:25 +00:00
ebanks	203c626fc2	A wrapper around the GenotypeLikelihoods class for the UnifiedGenotyper. This wrapper incorporates both strand-based likelihoods and a combined likelihoods over both strands. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1777 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-07 19:57:37 +00:00
depristo	8dd0924b37	Minor performance improvements to VariantEval -- now all of the CPU time is spent dealing with the ROD system... git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1772 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-06 23:40:30 +00:00
aaron	4554ca1b28	more cleanup, deprecated the old genotype, corrected SNPCallsFromGenotypes' imports and two other classes that depend on it. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1771 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-06 19:09:27 +00:00
aaron	3aec76136f	Removing the AllelicVariant interface, which is replaced by the Variation interface. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1770 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-06 17:44:24 +00:00
aaron	66fc8ea444	GSA-182: Adding support for BED interval files. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1767 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-06 02:45:31 +00:00
hanna	aec83b401d	SSG multithreading doesn't play well with some I/O changes made since I last svn up'd. Reverting until I can find the reason. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1766 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-05 19:48:57 +00:00
hanna	8a503c86b6	Code supporting SSG proof-of-concept shared memory parallelism. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1765 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-05 18:56:16 +00:00
ebanks	fb619bd593	-Refactoring: make GenotypeCalculationModel constructors empty so that they don't have to be updated every time we add a new parameter; instead put that logic in the super class's initialize method (making everything protected so that only the factory can access them) -Adding initial version of Multi-sample calculation model. This still needs much work: it needs to be cleaned up and finished. Right now, it (purposely) throws a RuntimeException after completing the EM loop. Also: -Fix logic in GenotypeLikelihoods.setPriors -Add logger to the models for output git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1764 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-05 18:10:36 +00:00
aaron	7fc4472e6d	A big fix for MergingSamRecordIterator, where we weren't handling the comparisons of SAMRecords correctly (we weren't applying the new reference index first, so sometimes the MT contig would be ID 23, sometimes 24 in different records). Also a fix to the GLF tests, and a correction to PrintReadsWalker to remove the close() on the output source, the source handles that itself (and you get a double close). git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1758 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-02 19:35:35 +00:00
ebanks	53a4bd7f51	A better understanding of what's going on means no need for clearing the cache git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1755 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-02 18:07:46 +00:00
aaron	e885cc4b21	changes for corrected GLF likelihood output, along with better tests git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1754 348d0f76-0448-11de-a6fe-93d51630548a	2009-10-01 20:45:05 +00:00
aaron	2e4949c4d6	Rev'ing Picard, which includes the update to get all the reads in the query region (GSA-173). With it come a bunch of fixes, including retiring the FourBaseRecaller code, and updated md5 for some walker tests. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1751 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-30 20:37:59 +00:00
ebanks	303972aa4b	Yup, I broke the build... git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1750 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-30 20:20:43 +00:00
ebanks	841d25cc44	Added ability to set the priors after construction (and requiring a flushing of the likelihoods cache) git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1749 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-30 19:55:49 +00:00
hanna	70e1aef550	Better integrate the @ArgumentCollection into the command-line argument parser. Walkers can now specify their own @ArgumentCollections. Also cleaned up a bit of the CommandLineProgram template method pattern to minimize duplicate code. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1746 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-29 22:23:19 +00:00
aaron	b1c321f161	Adjusted Genotype concordance to more accurately use the new Genotyping code, fixed the VCF rod, and temp. fix the build by reintroducing Shermans ReadCigarFormatter git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1745 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-29 21:28:21 +00:00
ebanks	9ef80e3c3c	One minor addition: to incorporate Pooled calling (and to be as general as possible), we allow the genotype calculation model to use rods if it wants. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1741 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-29 17:05:59 +00:00
ebanks	19bfe43173	First pass at a unified caller, being checked in now so Mark can give feedback if he chooses and so Matt can debug issues with the ArgumentCollection class. Some notes: 1. This design should be flexible enough to include pooled calling (for now) after discussions with Chris. 2. Using the unified caller with the SingleSampleCalculationModel emits the exact same output as SSG over all of chr20 for NA12878. Additionally, when we include the "max deletions allowed at a locus" argument (so we don't try to call SNPs at deletion sites), it removed 233 SNP calls in chr20 that were clearly indel artifacts. 3. The MultiSampleEMCalculationModel is still a work in progress and will be checked in later this week. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1740 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-29 16:48:15 +00:00
andrewk	5dab95aa5a	Fix getMergedReadGroupsByReaders so that it provides read groups in the same way Picard does so that it works correctly when input read files have no clashes in their read groups and retain their original read group names. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1737 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-29 06:35:50 +00:00
asivache	bce2f0d7cf	Now instantiates the list of alternative consensuses to evaluate as LinkedHashSet to guarantee iterator traversal order. Old implementation used HashSet and exhibited unstable behavior when two alt consensuses turned out to be equally good: depending on the run conditions (including size of the interval set being cleaned??), either one could be seen first and selected as the 'best' one git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1734 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-28 06:15:46 +00:00
asivache	663175e868	Bug fix: when jumping onto next contig (chromosome), the walker was erasing last mismatch interval from the previous chr it was still holding without printing it; now it gets printed. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1733 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-25 22:24:34 +00:00
asivache	aec61c558b	moving IndelGenotyper out from playground git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1731 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-25 19:43:53 +00:00
aaron	2b7d39035a	switched over the FastaAlternateReferenceWalker to the Variation system git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1726 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-25 16:09:43 +00:00
aaron	7ffc1d97ef	Cut DeNovoSNPWalker over to the new Variation system, some renaming of methods on the Variation interface, and some corrections on the interface. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1724 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-25 04:35:52 +00:00
aaron	d2af26e81f	Pooled EM SNP Rod converted over to the Variation interface git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1719 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-24 16:33:11 +00:00
ebanks	97105ac001	We need to return a null RODRecordList when the default value is null (as opposed to a list with a single null value), because that's what everyone is expecting. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1718 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-24 16:23:12 +00:00
ebanks	d4b40bc06f	Filter for reads with missing read groups so we can safely assume all reads have valid read groups git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1717 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-24 16:10:26 +00:00
ebanks	90de2e0cde	Added ability to specify whether you want to use a point estimate or fair coin test calculation; for now you can use either but fair coin test is still experimental as it needs to be parametrized correctly. This job will hopefully be done by the future Bioinformatic Analyst... git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1716 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-24 15:29:50 +00:00
aaron	d262cbd41c	changes to add VCF to the rod system, fix VCF output in VariantsToVCF, and some other minor changes git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1715 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-24 15:16:11 +00:00
ebanks	423a3ee894	Added a sequenom rod to empower Carrie to convert 1KG validation SNPs to sequenom format git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1706 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-23 20:22:09 +00:00
hanna	856bbd0320	Let Picard specify the default compression level. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1701 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-23 19:01:48 +00:00
aaron	f783cb30e0	adding an interface so that the current @Requires with ROD annotations work in walkers like VariantEval git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1700 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-23 18:24:05 +00:00
hanna	ebfbe56b43	Make sure compression level always gets pushed into SAMFileWriterFactory. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1699 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-23 18:20:26 +00:00
asivache	bf7cd66d53	New, simpler rodRefSeq. Fully relies on the ROD system standard mechanisms. Multiple transcripts over a given location will be now returned by the ROD system itself as RodRecordList<rodRefSeq>; and yes, rodRefSeq does represent a single transcript record now and implements Transcript interface git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1697 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-23 18:18:25 +00:00
asivache	8fa4c93f5a	Transcript is now simply an interface git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1696 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-23 18:13:31 +00:00
asivache	1bd4c0077c	Now that ROD system supports overlapping RODs, we do not need rodRefSeq to be too smart and read in all the overlapping records (transcripts) on its own; leave it to the generic ROD mechanism. PARTIAL commit; new, simpler rodRefSeq will reappear in a seq. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1694 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-23 18:11:16 +00:00
aaron	11c32b588f	fixing VariantEvalWalkerIntegrationTest md5 sums, a couple comment changes, and a little bit of cleanup git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1690 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-22 20:54:47 +00:00
ebanks	0748d80baa	Added a convenience method in rodDbSNP to deal with Andrey's changes to the rod. Now you can just ask for the first real SNP rod from the list and not have to think about how it works. CountCovariates uses it. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1688 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-22 20:15:40 +00:00
ebanks	682b765536	bug: need to upper case chars so that == works throughout git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1684 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-22 18:20:43 +00:00
asivache	57d31b8e9b	Filter that discards reads from specific lanes; and also its friend that helps blacklisting a set of lanes from GATK command line a one-liner. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1681 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-22 16:46:06 +00:00
ebanks	5ce42cbab3	After thinking about this a bit more, it makes sense to pull this functionality out of my walker and into the GenomeLocParser where everyone else can benefit from it... git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1677 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-22 01:32:35 +00:00
ebanks	b1dc6d65e4	interval merging is now blazingly fast git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1674 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-21 21:15:04 +00:00
asivache	15135788ca	OK, let's bite the bullet. Now rodDbSNP objects are 'isSNP()' only when they are annotated as 'exact', not a 'range'. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1673 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-21 19:25:16 +00:00
asivache	8ad181f46f	Note to myself: do 'ant clean' now and then or old versions of the code that suddenly became invalid will stick around. The world is not perfect, and neither is automatic dependency resolution. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1672 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-21 17:40:52 +00:00
asivache	d2d1354199	Now uses BrokenRODSimulator class to pass the test. CHANGE the code to use new ROD system directly and MODIFY MD5 in corresponding tests, since a few snps are seen differently now. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1670 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-21 17:03:49 +00:00
asivache	29adc0ca1c	Little class that can be used to simulate the results returned by the old ROD system. This is needed to keep a couple of tests from breaking. All the code that uses this class must be changed urgently to accommodate the data as returned by the new ROD system, and the corresponding tests (MD5 sums) have to be modified as well since some data as seen through the new ROD system is indeed different. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1668 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-21 16:58:56 +00:00
asivache	a6bd509593	Changing the carpet under your feet!! New incremental update to the ROD system has arrived. All the updated classes now make use of new SeekableRodIterator instead of RODIterator. RODIterator class deleted. This batch makes only trivial updates to tests dictated by the change in the ROD system interface. Few less trivial updates to follow. This is a partial commit; a few walkers also still need to be updated, hold on... git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1667 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-21 16:55:22 +00:00
asivache	4c67a49ccb	Removed unused imports git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1666 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-21 16:45:22 +00:00
hanna	e7f44ada98	Make unpackList public static so that Doug can use it in the scatter/gather framework. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1665 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-21 15:32:49 +00:00
ebanks	7b627fd622	Check for empty interval lists to merge git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1664 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-21 04:34:26 +00:00
hanna	7f5778c966	Update gsadevelopers -> gsahelp. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1663 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-20 23:36:54 +00:00
aaron	3a487dd64e	little fixes; also fixed a tyPo git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1662 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-18 22:38:51 +00:00
aaron	b6d7d6acc6	fix for the eval tests, and a change to the backedbygenotypes interface, more changes to come git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1661 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-18 22:25:16 +00:00
depristo	4318f75910	tiny cleanup git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1660 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-18 21:04:25 +00:00
depristo	3a341b2f06	Fixes for VariantEval for genotyping mode git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1659 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-18 21:01:43 +00:00
aaron	7b39aa4966	Adding the VCF ROD. Also changed the VCF objects to be much more user friendly. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1658 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-18 20:19:34 +00:00
ebanks	b19fd4d45c	Damn unit tests have a null Toolkit()... git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1654 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-18 17:10:49 +00:00
ebanks	90626c843d	oops - we don't need reference bases, but we still need reference git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1653 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-18 16:24:45 +00:00
ebanks	2b2df4e1ba	- Fix the CleanedReadInjector to deal with -L intervals correctly. - Some walkers don't use the ref base, so speed up traversals by not requiring it git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1652 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-18 16:17:58 +00:00
asivache	94618044e8	Starting an update of ROD system. These basic classes will completely replace old ones, but with this update they are not linked to anything, so this checkpoint should be safe. The main reason for the change is that there can be (and are!) multiple RODs overlapping with a single reference base position in a single track. There can be two "trivial" RODs at the same location (e.g. samtools pileup will have two point-like records at putative indel sites: one for the reference, the other one for the indel itself). Or there can be one or more "extended" RODs (length >1), eg. dbSNP can report an indel at Z:510-525 AND a SNP at Z:515. The ReferenceOrderedDatum object (and children) will not be changed, but it is now explicitly interpreted as a single data record, possibly out of many available from a given track for the current site. As long as single data record occupies one line in a data file, the new ROD system will take care of loading and keeping multiple records, including extended (length > 1) ones, and will automatically drop the records when they finally go out of scope. For one-line-per-record, multiple-records-per-site RODs, there is no need anymore for the hack used so far that involved passing ROD's own implementation of iterator through reflection mechanism (though it will still work) * RODRecordList: the ROD system (its iterators) will now always return a LIST of all RODs available at current position or at current query interval (see below). This class is a trivial wrapper for a list of ROD objects, with added location argument for the whole collection. The location of the RODRecordList is where the ROD system is currently sitting at: a single, current base on the reference (if next() traversal is performed), or the location of the query interval when returned by seekForward() (see below). The ROD objects themselves will have their locations set according to the original data in the file. 
Hence, pursuing the above example of a dbSNP indel at Z:510-525 and SNP at Z:515, when moving to the position Z:515 the ROD system will return a RODRecordList with location Z:515, and with two ROD objects packaged inside, one with location Z:510-525, the other with Z:515. RODRecordIterator: Almost identical to old SimpleRODIterator used by ReferenceOrderedData; this is a low-level iterator that walks over records in the data file (with a callback to ROD's ::parseLine() to parse real data). SeekableRODIterator: a decorator class that wraps around Iterator<ROD> (such as RODRecordIterator) and makes the data traversable by reference position, rather than record by record. This is a reimplementation of the old RODIterator. SeekableRODIterator's ::next() moves to the next position on the ref and returns all RODs overlapping with that position (as a RODRecordList). This iterator also adds a seekForward(loc) operation that allows fast forwarding to a specified position or interval. Length > 1 query arguments (extended intervals) are fully supported by seekForward(); the returned RODRecordList will contain all RODs overlapping with the specified interval, and the location of the returned RODRecordList object will be set to that query interval. NOTE: it is ILLEGAL to perform next() after a seekForward() query with length > 1 interval. seekForward() with point-like (length=1) interval reenables next(). git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1650 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-18 15:58:37 +00:00
hanna	355136928e	Play nice with other jobs in this VM -- don't close stdout / stderr. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1646 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-17 18:55:08 +00:00
ebanks	5d85bd9671	By default, VF should ask for deleted bases so that they show up in coverage. The Strand filter then needs to ignore those bases when determining bias. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1636 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-16 16:46:09 +00:00
hanna	01a9b1c63b	Fix for problem where err stream remapped to output stream in certain cases, (hopefully) completing Matt's hat trick of fail. Thanks, unit tests. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1634 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-16 08:33:56 +00:00
hanna	9f7cf73411	Output stream management fixes. I completely screwed up the output stream management system, but cleverly masked this fact by breaking some other stream management functionality that masked the problem. Sigh. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1630 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-15 21:06:45 +00:00
hanna	17758b381c	Properly initialize redirected output streams in case of out and err. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1629 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-15 19:47:43 +00:00
andrewk	00dfe014b7	Added option to FastaReferenceWalker to change output FASTA file format's line width and to remove header lines; allows dumping raw sequence using intervals git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1628 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-15 18:00:30 +00:00
hanna	b69eb208a6	Always create output files, even if no output was written to them. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1627 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-15 17:58:14 +00:00
aaron	b401929e41	incremental clean-up and changes for VariantEval, moved DiploidGenotype to a better home, and fixed a spelling error. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1624 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-15 04:48:42 +00:00
ebanks	01e7b39c8d	1. Don't print out values in filter field of the VCF. 2. Fix ratio printouts (for params file) 3. Rename ratio filter's get counts method to avoid confusion; more changes on the way this week. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1616 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-14 21:03:39 +00:00
ebanks	436f543b3b	I owe Doug a beer for finding this: don't print out intervals to be merged if they're not within the global -L intervals git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1615 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-14 20:22:30 +00:00
aaron	e03fccb223	Changes to switch Variant Eval over to the new Variation system. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1611 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-14 05:34:33 +00:00
aaron	5b41ef5f70	rod DBSNP had a bug where the reference wasn't calculated correctly under certain conditions. Fixed getRefBasesFWD and getRefSnpFWD so that they were more in line with getAltBasesFWD and getAltSnpFWD. Also updated Variant Eval tests to reflect this change. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1609 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-13 23:48:58 +00:00
ebanks	c669e8d5ad	Use constant seed in the random generator so we can be stable (and thus unit tests will work) git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1607 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-13 17:40:56 +00:00
depristo	6c7a300664	Missing file git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1601 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-12 19:17:09 +00:00
depristo	6e13a36059	Framework for ROD walkers -- totally experimental and not working right now git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1600 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-12 19:13:15 +00:00
depristo	e8d544869d	Alignment context now supports the idea of skipped bases -- not currently in use git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1598 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-12 19:11:38 +00:00
depristo	3949b4ac72	commented out version of next() and hasNext() that appear to be correct but are causing testing problems git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1596 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-12 19:09:21 +00:00
depristo	58105636c8	getBoundRods() convenience method git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1595 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-12 19:07:57 +00:00
depristo	4e1eded389	Fixed bad compareTo operator git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1594 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-12 19:07:10 +00:00
depristo	7c8b17b456	fix for SSG with pl name git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1591 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-11 20:39:34 +00:00
chartl	d6a0b65ac9	Changes: Rollback of Variant-related changes of r1585, additional PGC code git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1586 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-11 16:23:01 +00:00
chartl	0c54aba92a	Changes: @VariantEvalWalker - added a command line option to input a file path to a pooled call file for pooled genotype concordance checking. This string is to be passed to the PooledGenotypeConcordance object. @AllelicVariant - added a method isPooled() to distinguish pooled AllelicVariants from unpooled ones. @ all the rest - implemented isPooled(); for everything other than PooledEMSNProd it simply returns false, for PooledEMSNProd it returns true. Added: @PooledGenotypeConcordance - takes in a filepath to a pool file with the names of hapmap individuals for concordance checking with pooled calls and does said concordance checking over all pools. Commented out as all the methods are as yet unwritten. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1585 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-11 15:01:50 +00:00
aaron	5a64a80ab5	changes to the variation class, updates to SSG, updated tests based on changes to the SSGenotypeCall, and added the ability to run a single integration test from using the build script. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1577 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-10 04:31:33 +00:00
depristo	c988205884	Notes for Aaron in SSG git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1576 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-10 03:18:51 +00:00
ebanks	1362a56227	Added fasta tests and small fix to cleaner test git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1575 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-10 03:13:11 +00:00
depristo	0093482c62	N reference base fix for SSG git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1572 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-09 21:19:36 +00:00
ebanks	cb31d5a0ab	VariantFiltration now outputs VCF. Important changes: 1. VariantsToVCF can now be called statically to output VCF for a single ROD instance; this is temporary until we have a VCF ROD. 2. VariantFiltration now outputs only 2 files, both mandatory: all variants that pass filters in geli text, and all variants in VCF. If there are any problems, go find Aaron. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1569 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-09 20:04:32 +00:00
asivache	dd0085c428	1) now is tolerant to sloppy cigar strings with 0-length elements (at the price of extra recursive call) 2) when reads with deletions are requested, adds to the pile just those: reads with 'D' over the current reference base, but not 'N' 3) next() now implements a loop: recursive forward iteration calls to next() until ref. position with non-zero coverage is encountered were OK for (short) deletions, but with long stretches of N's they end up with stack overflow git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1568 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-09 20:04:04 +00:00
ebanks	542af6402e	output correct format for Sequenom SNPs git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1567 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-09 19:21:53 +00:00
kiran	3b1e966b4c	Lowercases the sequencing platform so that a difference in case doesn't lead to the failure to look up an entry in the hash. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1565 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-09 17:35:45 +00:00
kiran	d82d6c0665	Excludes variants that fall below a certain LOD that changes as a function of depth. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1564 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-09 17:34:16 +00:00
kiran	06eae52292	Throws an exception if you attempt to use a filter that doesn't exist. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1563 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-09 17:33:27 +00:00
asivache	1060b36288	Bug fix: 'N' cigar elements now treated properly; for all practical intents and purposes, N is the same as D and should be treated as such, the difference is only in logical interpretation. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1562 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-09 17:08:35 +00:00
chartl	9d69bd2c84	Modifications: @CoverageAndPowerWalker - removed a hanging colon that was being printed after the reference position @VariantEvalWalker - added a command line argument for pool size for eventual use in doing pooled caller evaluations. As now, the variable is unused. @AlignmentContext - altered the scope of class variables from private to protected in order that child objects might have access to them New Additions: Filtered Contexts Sometimes we want to filter or partition reads by some aspect (quality score, read direction, current base, whatever) and use only those reads as part of the alignment context. Prior to this I've been doing the split externally and creating a new AlignmentContext object. This new approach makes it a bit easier, as each of these objects are children of AlignmentContext, and can be instantiated from a "raw" AlignmentContext. @FilteredAlignmentContext is an abstract class that defines the behavior. The abstract method 'filter' is called on the input AlignmentContext, filtering those reads and offsets by whatever you can think of. The filtered reads/offsets are then maintained in the reads and offsets fields. These classes can be passed around as AlignmentContexts themselves. Writing a new kind of read-filtered alignment context boils down to implementing the filter method. @ReverseReadsContext - a FilteredAlignmentContext that takes only reads in the reverse direction @ForwardReadsContext - a FilteredAlignmentContext that takes only reads in the forward direction @QualityScoreThresholdContext - a FilteredAlignmentContext that takes only reads above a given quality score threshold (defaults to 22 if none provided). A unit test bamfile and associated unit tests for these are in the works. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1559 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-09 15:49:52 +00:00
depristo	d9588e6083	bug fixes to LIBS and LIBH following ultra-aggressive regression testing across 454, solid, and solexa git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1558 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-09 15:36:12 +00:00
asivache	df11618092	Set default value of useLocusIteratorByHanger to FALSE. Otherwise the -LIBH flag is useless and there'd be no way to "unset" the 'true' value. Old version was (always) using LocusIteratorByHanger. Now default iterator is indeed LocusIteratorByState, and -LIBH will switch back to the old one git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1556 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-09 15:09:09 +00:00
depristo	eeb9b6eb13	GenotypeLikelihoods now support a cache per subclass, avoiding genotyping clashes git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1554 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-09 10:39:14 +00:00
ebanks	0cc219c0df	-Added unit test for walkers dealing with intervals for cleaning -I also uncovered a corner case in the cleaner that for some reason was commented out but shouldn't have been. Hooray for unit tests! git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1553 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-09 02:35:17 +00:00
depristo	ec0f6f23c7	LocusIteratorByState is now the system default. Fixed Aaron's build problem git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1552 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-09 01:28:05 +00:00
ebanks	5dbba6711c	Lots of changes: (I'll send email out in a sec) 1) Moved various disparate concordance / set splitting functionalities to a new parent tool which works like VariantFiltration (i.e. people can write various modules that fit inside and can be run through it). 2) Fixed up argument parsing in VariantFiltration to use key=value format so we don't accidentally mix up values (like I had been doing). 3) Have indel rod print samples git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1540 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-07 01:12:09 +00:00
depristo	1c3d67f0f3	Improvements to the CountCovariates and TableRecalibration, as well as regression tests for SLX and 454 data git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1539 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-04 22:26:57 +00:00
depristo	2b0d1c52b2	General WalkerTest framework. Includes some minor changes to GATK core to enable creation of true command-line like GATK modules in the code. Extensive first-pass tests for SSG git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1538 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-04 19:13:37 +00:00
aaron	0cc634ed5d	-Renamed rodVariants to RodGeliText -Remove KGenomesSNPROD -Remove rodFLT -Renamed rodGFF to RodGenotypeChipAsGFF -Fixed a problem in SSGenotypeCall -Added basic SSGenotype Test class -Make VCFHeader constructors public git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1536 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-04 18:40:43 +00:00
ebanks	fd1c72c151	Fixed package name git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1535 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-04 15:40:06 +00:00
ebanks	6c476514f8	Moved to core. Wiki pages are going up; unit tests will be written soon. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1533 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-04 15:09:11 +00:00
ebanks	849dce799d	This rod was all wrong for generating the alternate snp alleles (it returned null or even the wrong value); fixed. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1531 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-04 14:21:46 +00:00
depristo	a08c68362e	Renaming error to getNegLog10PError(); added Cached clearing method to GL; SSG now has a CallResult that counts calls; No more Adding class to System.out, now to logger.info; First major testing piece (and general approach too) to unit testing of a walker -- SingleSampleGenotyper now knows how many calls to make on a particular 1mb region on NA12878 for each call type and counts the number of calls AND then compares the geli MD5 sum to the expected one! git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1530 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-04 12:39:06 +00:00
aaron	3c2ae55859	changes for the genotype overhaul. Lots of changes focusing on the output side, from single sample genotyper to the output file formats like GLF and geli. Of note the genotype formats are still emitting posteriors as likelihoods; this is the way we've been doing it but it may change soon. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1529 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-04 05:31:15 +00:00
ebanks	5bd99fc1c4	VariantFiltration moved to core. Another win for the team. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1517 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-03 15:41:41 +00:00
depristo	bdd0a6f9fa	change to make build work git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1511 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-03 13:43:10 +00:00
depristo	b01ac9de0c	High performance LocusIterator implementation. Now with greatly reduced memory impact and 2x (and more potentially) speed ups of raw locus iteration. General performance improvements to SSG with empirical probs. You can enable high-performance locus iteration with the -LIBS arg. It's still in testing but passes validating pileup. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1510 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-03 03:06:25 +00:00
ebanks	3dfc77dc89	Add an indel rod which represents the initial point of the indel only (useful for alternate reference making) git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1507 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-02 19:32:29 +00:00
aaron	0e6feff8f2	fixed locus pile-up limiting problem git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1505 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-02 16:56:44 +00:00
aaron	05c164ec69	changing the default behavior to allow any sized read pile-up (which may exceed the memory limit); the user can then select their own read limit. The default of 100K was arbitrary. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1498 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-01 14:46:00 +00:00
ebanks	54c0b6c430	Allow this ROD to consist of just the positions git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1497 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-01 12:43:18 +00:00
aaron	4a1d79cd7b	added a flag, maximum_reads_at_locus, shortName "mrl", which limits the number of reads we add to the locusByHanger. In some bam files misalignment produces pile-ups of 750K or more reads. We now limit this to the default of 100K reads. The user is warned if a locus exceeds this threshold, and no more reads are added. Also CombineDup walker had an incorrect package name. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1496 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-01 04:21:58 +00:00
ebanks	0addae967a	IndelArtifact filter can now handle filtering false SNPs that occur within the span of an indel but after the first position git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1495 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-01 03:34:39 +00:00
ebanks	8e3c3324fa	Added filter for SNPs cleaned out by the realigner. It uses the realigner output for filtering; in addition, dbsnp indels partially work; IndelGenotyper calls don't yet work. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1489 348d0f76-0448-11de-a6fe-93d51630548a	2009-08-31 04:32:32 +00:00
ebanks	8bc7afe781	Smarter SW penalties git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1488 348d0f76-0448-11de-a6fe-93d51630548a	2009-08-31 04:29:19 +00:00
ebanks	1a299dd459	Require each filter or feature to declare whether or not they want mapping quality zero reads in the alignment context git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1486 348d0f76-0448-11de-a6fe-93d51630548a	2009-08-31 03:31:37 +00:00
ebanks	215e908a11	Reworking of the VariantFiltration system to allow for a windowed view of variants and inclusion of more data to the various filters. This now allows us to incorporate both the clustered SNP filter and a SNP-near-indels filter, which otherwise wasn't possible. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1484 348d0f76-0448-11de-a6fe-93d51630548a	2009-08-31 02:16:39 +00:00
depristo	813a4e838f	Removing old code git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1482 348d0f76-0448-11de-a6fe-93d51630548a	2009-08-30 19:27:11 +00:00
depristo	49a7babb2c	Better organization of Genotype likelihood calculations. NewHotness is now just GenotypeLikelihoods. There are 1, 3, and empirical base error models available as subclasses, along with a simple way to make this (see the factory). git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1481 348d0f76-0448-11de-a6fe-93d51630548a	2009-08-30 19:16:30 +00:00
depristo	522e4a77ae	Caching support across multiple technologies git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1480 348d0f76-0448-11de-a6fe-93d51630548a	2009-08-30 18:10:14 +00:00
depristo	5af4bb628b	Intermediate checking before code reorganization. Full blown support for empirical transition probs in SSG for all platforms. Support for defaultPlatform arg in SSG. Renaming classes for final cleanup git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1479 348d0f76-0448-11de-a6fe-93d51630548a	2009-08-30 17:34:43 +00:00
depristo	bde67428fd	Better formatting of the code git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1477 348d0f76-0448-11de-a6fe-93d51630548a	2009-08-29 21:46:47 +00:00
aaron	8331c195fb	changed the full name of maximum_reads to maximum_iterations for consistency git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1475 348d0f76-0448-11de-a6fe-93d51630548a	2009-08-28 16:03:46 +00:00
depristo	8e129d76fd	Support for original quality scores OQ flag. pQ flag in TableRecalibration to preserve quality scores below a threshold (defaulting to 5) git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1474 348d0f76-0448-11de-a6fe-93d51630548a	2009-08-28 14:14:21 +00:00
depristo	bf60980653	Experimental support for empirical P(B_true | B_miscall). --useEmpiricalTransitions flag to SSG enables this support. Much better implementation of Genotype likelihoods -- the system should scream along now. Continuing progress towards deleting old model git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1469 348d0f76-0448-11de-a6fe-93d51630548a	2009-08-28 00:17:24 +00:00
depristo	7cf9a54b64	change for new char/byte in BaseUtils git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1467 348d0f76-0448-11de-a6fe-93d51630548a	2009-08-27 23:47:56 +00:00
hanna	e5115409fa	Force columnSpacing to be at least one. We need a general-purpose, working tool for outputting columnar data to a PrintStream; will add JIRA. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1457 348d0f76-0448-11de-a6fe-93d51630548a	2009-08-25 19:54:54 +00:00

819 Commits (40c2d7a4bc9c427b730fbad4aebec3bba3f928ad)