gatk-3.8

Commit Graph

Author	SHA1	Message	Date
Mark DePristo	6a5a70cdf1	Done GSA-539: SimpleTimer should use System.nanoTime for nanoSecond resolution	2012-09-05 15:45:23 -04:00
Mark DePristo	59109d5eeb	NanoScheduler tracks time outside of its execute call	2012-09-05 15:45:23 -04:00
Mark DePristo	800a27c3a7	NanoScheduler tracks time within input, map, and reduce -- Helpful for understanding where the time goes to each bit of the code. -- Controlled by a local static boolean, to avoid the potential overhead in general	2012-09-05 15:45:23 -04:00
Mark DePristo	7087b22ea3	No debugging output (even conditional) for ReadTransformers in PrintReads	2012-09-05 15:45:23 -04:00
Mark DePristo	e01258b261	NanoScheduler now supports printProgress. Bugfixes to printProgress -- TraverseReadsNano prints progress at the end of each traversal unit -- Fix bugs in TraversalEngine printProgress -- Synchronize the method so we don't get multiple logged outputs when two or more HMSs call printProgress before initialization at the start! -- Fix the logic for mustPrint, which actually had the logic of mustNotPrint. Now we see the done log line that was always supposed to be there -- Fix output formatting, as the done() line was incorrectly shifting over the % complete by 1 char as 100.0% didn't fit in %4.1f -- Add clearer doc on -PF argument so that people know that the performance log can be generated to standard out if one wants	2012-09-05 15:45:23 -04:00
Mark DePristo	6055101df8	NanoScheduler no longer groups inputs, each map() call is interlaced now -- Maximizes the efficiency of the threads -- Simplifies interface (yea!) -- Reduces number of combinatorial tests that need to be performed	2012-09-05 15:45:22 -04:00
Mark DePristo	397a5551ef	More memory for gatkdocs and extracthelp targets	2012-09-05 15:45:22 -04:00
Mark DePristo	e3b4cc02aa	Done GSA-282: Unindexed traversals crash if a read goes off the end of a contig -- Already fixed in the codebase. Added unindexed bam and integration tests to ensure this is fine going forward.	2012-09-05 15:45:22 -04:00
Yossi Farjoun	d6884e705a	Revert "fixed a typo in StringText.properties" This reverts commit b74c1c17e748f75e59d23545084b983e2a8d2fa6.	2012-09-05 15:21:00 -04:00
Yossi Farjoun	ad5fa449e7	fixed a typo in the string comment	2012-09-05 14:46:10 -04:00
Yossi Farjoun	f4b39a7545	Merge branch 'master' of ssh://gsa4/humgen/gsa-scr1/gsa-engineering/git/unstable merging trivially after a commit	2012-09-05 14:33:39 -04:00
Yossi Farjoun	6e517df5d9	fixed a typo in StringText.properties	2012-09-05 14:33:08 -04:00
Eric Banks	fc06f39411	Fixed docs for Pileup walker	2012-09-05 09:55:34 -04:00
Christopher Hartl	d795437202	- New UserExceptions added for when ReadFilters or Walkers specified on the command line are not found. When -rf xxxx cannot find the class corresponding to xxxx, all read filters are printed in a better formatted way, with links to their gatk docs. - VariantAnnotatorEngine changed to call genotype annotations even if pileups and allele -> likelihood mappings are not present. Current genotype annotations altered to check for null pileups and null mappings.	2012-09-04 16:41:44 -04:00
Ryan Poplin	9cc1a9931b	Resolving merge conflicts.	2012-09-04 10:47:38 -04:00
Ryan Poplin	c9944d81ef	Skip array needs to also be used in the updateDataForRead function of the delocalized BQSR.	2012-09-04 10:33:37 -04:00
Mark DePristo	0892f2b8b2	Closing GSA-287:LocusReferenceView doesn't do very well in the case where contigs land off the end of the reference -- Confirmed that reads spanning off the end of the chromosome don't cause an exception by adding integration test for a single read that starts 7 bases from the end of chromosome 1 and spans 90 bases or so off. Added pileup integration test to ensure this behavior continues to work	2012-09-03 20:18:56 -04:00
Mark DePristo	52d6bea804	a few more useful git ignores	2012-09-01 11:08:36 -04:00
Mark DePristo	1b0ce511a6	Updating BQSR tests due to my change to reset BQSR calibration data	2012-08-31 19:51:09 -04:00
Eric Banks	277ba94c7b	Update from dbsnp135 to dbsnp137.	2012-08-31 14:06:29 -04:00
Eric Banks	5ea7cd6dcc	Updating resource bundle: no reason to include both genotype and sites files for Omni and HM3, sites are enough. Also, don't include duplicate entry for the Mills indels.	2012-08-31 14:01:54 -04:00
Mark DePristo	f066a02f3e	Merge branch 'applyRecalibration'	2012-08-31 13:43:52 -04:00
Mark DePristo	c9ea213c9b	Make BaseRecalibration thread-safe -- In the process uncovered two strange things 1 -- qualityScoreByFullCovariateKey was created but never used. Seems like a cache? 2 -- Discovered nasty bug in BaseRecalibrator: https://jira.broadinstitute.org/browse/GSA-534	2012-08-31 13:42:42 -04:00
Mark DePristo	27ddebee53	Protect PrintReads from strange state from TraverseReadsUnitTests	2012-08-31 13:42:41 -04:00
Mark DePristo	e028901d54	Fixed bad contract in ReadTransformer	2012-08-31 13:42:41 -04:00
Mark DePristo	cf91d894e4	Fix build problems with tests	2012-08-31 13:42:41 -04:00
Mark DePristo	817ece37a2	General infrastructure for ReadTransformers -- These are like read filters but can be applied either on input, on output, or handled by the walker -- Previous example of BAQ now uses the general framework -- Resulted in massive conceptual cleanup of SAMDataSource and ReadProperties! Yeah! -- BQSR now uses this framework. We can now do BQSR on input, on output, or within a walker -- PrintReads now handles all read transformers in the walker in map, enabling us to parallelize PrintReads with BAQ and BQSR -- Currently BQSR is excepting in parallel, which a subsequent commit will fix -- Removed global variable setting in GenomeAnalysisEngine for BAQ, as command line parameters are cleanly handled by ReadTransformer infrastructure -- In principle ReadFilters are just a special kind of ReadTransformer, but this refactoring is larger than I can do. It's a JIRA entry -- Many files touched simply due to the refactoring and renaming of classes	2012-08-31 13:42:41 -04:00
Ryan Poplin	ff6ebbf3fd	Resolving merge conflicts.	2012-08-31 11:25:55 -04:00
Ryan Poplin	e22bd09477	Initial fix for delocalized BQSR to make it work with new RefMetaDataTracker	2012-08-31 11:23:08 -04:00
Christopher Hartl	143fbead03	Adding an experimental format field annotation that calculates the per-sample residual dosage after accounting for LD. It's meant to be run in a single pass over a chromosome, for instance. Currently it does not work due to a bug in the variant annotator engine, see GSA-532. When that's fixed it'll likely reveal broken code.	2012-08-31 04:04:00 -04:00
Eric Banks	ac0c44720b	I started to put together a set of unit tests for the PileupElement creation functionality of LocusIteratorByState and found pretty quickly that it's definitely still busted for indels. The data provider is nowhere near comprehensive yet, but I need to sit back and think about how to really test some of the functionality of LIBS. Committing what I have for now because at the very least it'll be helpful going forward (failing tests are commented out with TODO).	2012-08-30 22:49:13 -04:00
Mark DePristo	39400c56a9	Update md5s for VQSR, as VQSLOD is now a double and gets the standard double precision treatment in VCF	2012-08-30 19:41:49 -04:00
Mark DePristo	2f749b5e52	Added ThreadSafeMapReduce interface, super of TreeReducible -- A higher level interface to declare parallelism capability of a walker. This interface means that the walker can be multi-threaded, but doesn't necessarily support TreeReducible interface, which forces you to have a combine ReduceType operation that isn't appropriate for parallel read walkers -- Updated ReadWalkers to implement ThreadSafeMapReduce not TreeReducible	2012-08-30 19:41:49 -04:00
Mark DePristo	544740d45d	tasking for n threads should give you n threads in NanoScheduler, not n - 1	2012-08-30 19:41:49 -04:00
Mark DePristo	1212dfd2ef	Reduce the number of test combinations in ReadBasedReferenceOrderedView	2012-08-30 19:41:49 -04:00
Mark DePristo	7a462399ce	Fix GSA-529: Fix RODs for parallel read walkers -- TraverseReadsNano modified to read in all input data before invoking maps, so the input to TraverseReadsNano is a MapData object holding the sam record, the ref context, and the refmetadatatracker. -- Update ValidateRODForReads to be tree reducible, using synchronized map and explicitly sort the output map from locations -> counts in onTraversalDone -- Expanded integration tests to test nt 1, 2, 4.	2012-08-30 19:41:49 -04:00
Mark DePristo	7d95176539	Bugfix to compareTo and equals in GenomeLoc -- Yes, GenomeLoc.compareTo was broken. The compareTo function only considered the contig and start position, but not the stop, when comparing genome locs. -- Updated GenomeLoc.compareTo function to account for stop. Updated GATK code where necessary to fix resulting problems that depended on this. -- Added unit tests to ensure that hashcode, equals, and compareTo are all correct for GenomeLocs	2012-08-30 19:41:49 -04:00
Mark DePristo	5a9610d875	ReadShards now default to 10K (up from 1K) reads per samFile up to 250K -- This should help make the inputs for parallel read walkers a little meatier, and avoid spinning the shard creation infrastructure so often	2012-08-30 19:41:49 -04:00
Christopher Hartl	5a142fe265	After discussion with Ryan/Eric, the Structural_Indel variant type is now gone, and has been entirely replaced with the access pattern .isStructuralIndel(). This makes it a strict subtype of indel. I agree that this method is a bit more sensible. In addition, fix for GSA-310. If supplied -rf argument does not match a known read filter, the list of read filters will be printed, and users directed to the documentation for more information.	2012-08-30 17:57:31 -04:00
Mark DePristo	82b2845b9f	Fix: GSA-531 ApplyRecalibration writing to BCF: java.lang.String cannot be cast to java.lang.Double -- LOD must be added as a double to attributes, not as string, so that it can be written out as BCF	2012-08-30 16:59:57 -04:00
Mark DePristo	7b4caec8cb	Fix: GSA-531 ApplyRecalibration writing to BCF: java.lang.String cannot be cast to java.lang.Double -- LOD must be added as a double to attributes, not as string, so that it can be written out as BCF	2012-08-30 16:56:36 -04:00
Mark DePristo	863a3d73b8	Added ThreadSafeMapReduce interface, super of TreeReducible -- A higher level interface to declare parallelism capability of a walker. This interface means that the walker can be multi-threaded, but doesn't necessarily support TreeReducible interface, which forces you to have a combine ReduceType operation that isn't appropriate for parallel read walkers -- Updated ReadWalkers to implement ThreadSafeMapReduce not TreeReducible	2012-08-30 16:21:17 -04:00
Mark DePristo	59508f8266	tasking for n threads should give you n threads in NanoScheduler, not n - 1	2012-08-30 15:57:29 -04:00
Mark DePristo	27d1c63448	Reduce the number of test combinations in ReadBasedReferenceOrderedView	2012-08-30 15:56:58 -04:00
Mark DePristo	72cf6bdd9f	Fix GSA-529: Fix RODs for parallel read walkers -- TraverseReadsNano modified to read in all input data before invoking maps, so the input to TraverseReadsNano is a MapData object holding the sam record, the ref context, and the refmetadatatracker. -- Update ValidateRODForReads to be tree reducible, using synchronized map and explicitly sort the output map from locations -> counts in onTraversalDone -- Expanded integration tests to test nt 1, 2, 4.	2012-08-30 15:10:58 -04:00
Mark DePristo	7f166c3198	Bugfix to compareTo and equals in GenomeLoc -- Yes, GenomeLoc.compareTo was broken. The compareTo function only considered the contig and start position, but not the stop, when comparing genome locs. -- Updated GenomeLoc.compareTo function to account for stop. Updated GATK code where necessary to fix resulting problems that depended on this. -- Added unit tests to ensure that hashcode, equals, and compareTo are all correct for GenomeLocs	2012-08-30 15:07:02 -04:00
Ryan Poplin	7b366d4049	misc cleanup in active region traversal.	2012-08-30 11:01:01 -04:00
Mark DePristo	792092b891	ReadShards now default to 10K (up from 1K) reads per samFile up to 250K -- This should help make the inputs for parallel read walkers a little meatier, and avoid spinning the shard creation infrastructure so often	2012-08-30 10:39:16 -04:00
Mark DePristo	76853806b0	Print out the time when downloads finished from S3	2012-08-30 10:15:11 -04:00
Mark DePristo	21dd70ed36	Test to ensure that ReadBasedReferenceOrderedView produces stateless objects -- Stateless objects are required for nano-scheduling. This means you can take the RefMetaDataTracker provided by ReadBasedReferenceOrderedView, store it away, get another from the same view, and the original one behaves the same.	2012-08-30 10:15:11 -04:00

10472 Commits (6a5a70cdf1a80751d1fe54594c0d0d2ee6a3fa87)