Commit Graph

11433 Commits (a1de84c6c1affc62eff2360047d48c433e9576da)

Author SHA1 Message Date
Ami Levy-Moonshine a1de84c6c1 remove annotation from LasrgeScalePipeline and change eval to work with the standard interval file 2013-01-06 23:28:38 -05:00
Ami Levy-Moonshine c554d9db25 add TODO 2013-01-06 23:04:38 -05:00
Ami Levy-Moonshine 81eef3aa37 merge development branches of log-less HMM and FastGatherer to master 2013-01-06 23:01:58 -05:00
Ami Levy-Moonshine 80b531f695 emit all sites where more than 90% of the samples have good coverage 2013-01-04 14:27:50 -05:00
Ami Levy-Moonshine 10a705b27f Merge branch 'master' of github.com:broadinstitute/gsa-unstable 2013-01-03 13:42:31 -05:00
Ami Levy-Moonshine 2018285a39 better error message 2013-01-03 13:41:03 -05:00
Eric Banks c7039a9b71 Pushing in implementation of the Bayesian estimate of Qemp for the BQSR.
This isn't hooked up yet with BQSR; it's just a static method used in my testing walker.  I'll hook this into BQSR after more testing and the addition of unit tests.
Most of the changes in this commit are actually documentation-related.
2013-01-02 15:21:44 -05:00
Chris Hartl 03294ae1c8 Merge branch 'master' of github.com:broadinstitute/gsa-unstable 2013-01-02 15:11:30 -05:00
Chris Hartl ea2d9aa4fe Merge branch 'incoming' 2013-01-02 15:10:28 -05:00
Chris Hartl 3753209584 One md5sum slipped past in the HC integration test. 2013-01-02 15:09:28 -05:00
Joel Thibault c515175313 Ensure that active region extensions stay on contig 2013-01-02 14:46:24 -05:00
Joel Thibault dcb7735d3c Active Region extensions must stay on contig 2013-01-02 14:46:24 -05:00
Chris Hartl 09199366b7 Merge branch 'master' of github.com:broadinstitute/gsa-unstable 2013-01-02 14:44:49 -05:00
Chris Hartl e1d09ab0db QD is now divided by the average length of the alternate allele (weighted by the allele count). The average length is stored in a related annotation, "AAL", which can be used to re-compute the "old" QD by simple multiplication. Integration tests *should* all pass. 2013-01-02 14:41:29 -05:00
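The QD change described in e1d09ab0db can be illustrated with a small sketch. This is not the GATK annotation code: the class and method names are hypothetical, "length" is taken as a plain integer per alternate allele, and the fallback of 1.0 when no alternate alleles are counted is an assumption made only to keep the example total.

```java
// Hypothetical sketch of the arithmetic only; not the GATK implementation.
final class QdSketch {
    /** Mean alternate-allele length, weighted by each allele's count ("AAL"). */
    static double averageAltAlleleLength(int[] altLengths, int[] alleleCounts) {
        long weighted = 0;
        long total = 0;
        for (int i = 0; i < altLengths.length; i++) {
            weighted += (long) altLengths[i] * alleleCounts[i];
            total += alleleCounts[i];
        }
        // Assumption: with no counted alternate alleles, leave QD unscaled.
        return total == 0 ? 1.0 : (double) weighted / total;
    }

    /** New QD: the previous QD divided by the average alternate-allele length. */
    static double normalizedQd(double oldQd, double aal) {
        return oldQd / aal;
    }

    /** Recover the "old" QD from the new one by simple multiplication. */
    static double recoverOldQd(double newQd, double aal) {
        return newQd * aal;
    }
}
```

For example, two alternate alleles of lengths 1 and 3 with allele counts 3 and 1 give an average length of (1*3 + 3*1) / 4 = 1.5, so an old QD of 12.0 becomes 8.0, and multiplying 8.0 by 1.5 gives back 12.0.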
Joel Thibault a15f368bdc Re-enable testIsActiveRangeLow/High 2013-01-02 11:57:50 -05:00
Chris Hartl 7188a4a921 Merge branch 'master' of gsa2:/humgen/gsa-scr1/chartl/dev/unstable 2013-01-02 11:39:53 -05:00
Mark DePristo 12f4c6307e AutoFormattingTime cleanup and complete unittests
-- Underlying system now uses long nano times to be more consistent with standard java practice
-- Updated a few places in the code that were converting from nanoseconds to double seconds to use the new nanoseconds interface directly
-- Bringing us to 100% test coverage with clover with AutoFormattingTimeUnitTest
2013-01-02 11:29:25 -05:00
Mark DePristo c1c5737f80 Re-enabling contracts in tests in build.xml
-- Left contracts turned off in original clover commit
2013-01-02 11:29:25 -05:00
Mark DePristo 3d1c107f9d Detect if clover is present in build.xml. Automatically clean clover db in ant clean, if present 2013-01-02 11:29:24 -05:00
Joel Thibault 429567cd3f Rename to TraverseActiveRegionsUnitTest 2013-01-01 19:20:30 -05:00
Joel Thibault 57d38aac8a Temporarily disable due to unknown contracts problem 2013-01-01 19:20:04 -05:00
Joel Thibault 7748b3816f Delete the test BAI file as well as the BAM 2013-01-01 19:20:02 -05:00
Joel Thibault 5afeb465aa TODOs 2013-01-01 19:19:17 -05:00
Mark DePristo 03780578bc Archiving SomaticIndelDetector, RemapAlignments, ReadPair, and associated library code 2012-12-29 14:37:22 -05:00
Mark DePristo 5558a6b8f7 Deleting / archiving classes that are no longer used
-- AminoAcidTable and AminoAcid go to the archive
-- Removing two unused SAMRecord classes
2012-12-29 14:34:17 -05:00
Mark DePristo 38cc496de8 Move SomaticIndelDetector and associated tools and libraries into private/andrey package
-- Intermediate commit on the way to archiving SomaticIndelDetector and other tools.
-- SomaticIndelDetector, PairMaker and RemapAlignments tools have been refactored into the private andrey package.  All utility classes refactored into here as well.  At this point, the SomaticIndelDetector builds in this version of the GATK.
-- Subsequent commit will put this code into the archive so it no longer builds in the GATK
2012-12-29 14:34:08 -05:00
Mark DePristo 5f84a4ad82 Clover report excludes test files and other non-interesting files from the clover reports 2012-12-29 13:31:07 -05:00
Ami Levy-Moonshine f450cbc1a3 Merge branch 'master' of github.com:broadinstitute/gsa-unstable 2012-12-27 21:23:59 -05:00
Ami Levy-Moonshine 609ad7dbab can't override var help 2012-12-27 15:10:33 -05:00
Ami Levy-Moonshine 3c972175fc add tsv option to RRScript 2012-12-27 14:50:01 -05:00
Ami Levy-Moonshine 53f8be96d2 add tsv option to RRScript 2012-12-27 14:37:54 -05:00
Eric Banks 275575462f Protect against non-standard ref bases. Ryan, please review. 2012-12-26 15:46:21 -05:00
Eric Banks 75d5b88a3d Enabling the Recal Report unit test (which looks like it was never ever enabled) 2012-12-26 15:35:50 -05:00
Eric Banks efceb0d48c Check for well-encoded reads while fixing mis-encoded ones 2012-12-26 14:30:51 -05:00
Ami Levy-Moonshine fe427cdd77 add a few Queue scripts and the CatVariantsGatherer Scala class 2012-12-26 13:06:36 -05:00
Chris Hartl f2b7c6f0e1 Merge branch 'master' of gsa2:/humgen/gsa-scr1/chartl/dev/unstable 2012-12-26 09:20:32 -05:00
Mark DePristo 64c3a0ff62 Remove dependence on clover in build.xml 2012-12-24 13:53:29 -05:00
Mark DePristo af9746af52 Fix merge failure 2012-12-24 13:43:04 -05:00
Mark DePristo 04cc75aaec Minor cleanup and expansion of the RecalDatum unit tests 2012-12-24 13:35:58 -05:00
Mark DePristo 7ec7a5d6b6 Misc. improvements to clover in build.xml
-- Allow instrument level to be overridden on command line with -Dclover.instrument.level=statement
2012-12-24 13:35:58 -05:00
Mark DePristo 7bf1f67273 BQSR optimization: read group x quality score calibration table is thread-local
-- AdvancedRecalibrationEngine now uses a thread-local table for the quality score table, and in finalizeData merges these thread-local tables into the final table.  Radically reduces the contention for RecalDatum in this very highly used table
-- Refactored the utility function to combine two tables into RecalUtils, and created UnitTests for this function, as well as all of RecalibrationTables.  Updated combine in RecalibrationReport to use this table combiner function
-- Made several core functions in RecalDatum into final methods for performance
-- Added RecalibrationTestUtils, a home for recalibration testing utilities
2012-12-24 13:35:58 -05:00
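The thread-local table idea in 7bf1f67273 can be sketched roughly as follows. This is a minimal illustration, not AdvancedRecalibrationEngine: the class name is hypothetical, the table is reduced to a plain `long[]` of counts, and `demo` exists only to drive the example. It assumes `finalizeData` is called after all worker threads have finished.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical sketch: each thread updates its own table, merged once at the end.
final class ThreadLocalTableSketch {
    private final int size;
    // Every per-thread table is registered here so it can be merged later.
    private final List<long[]> allTables = new CopyOnWriteArrayList<>();
    private final ThreadLocal<long[]> local;

    ThreadLocalTableSketch(int size) {
        this.size = size;
        this.local = ThreadLocal.withInitial(() -> {
            long[] table = new long[size];
            allTables.add(table);
            return table;
        });
    }

    /** Hot path: no lock and no shared cell, so no contention between threads. */
    void record(int index) {
        local.get()[index]++;
    }

    /** Merge the per-thread tables; call only after the workers have been joined. */
    long[] finalizeData() {
        long[] merged = new long[size];
        for (long[] table : allTables) {
            for (int i = 0; i < size; i++) {
                merged[i] += table[i];
            }
        }
        return merged;
    }

    /** Drives the sketch: each thread records perThread hits in every cell. */
    static long[] demo(int threads, int perThread) {
        ThreadLocalTableSketch tables = new ThreadLocalTableSketch(2);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    tables.record(0);
                    tables.record(1);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while joining workers", e);
            }
        }
        return tables.finalizeData();
    }
}
```

The trade-off is memory for contention: each thread holds a full copy of the table, and the cost of merging is paid once at the end rather than on every update.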
Mark DePristo 7d250a789a ArtificialReadPileupTestProvider now creates GATKSamRecords with good header values 2012-12-24 13:35:57 -05:00
Mark DePristo 295455eee2 NanoScheduler optimizations and simplification
-- The previous model was to enqueue individual map jobs (with a resolution of 1 map job per map call), to track the number of map calls submitted via a counter and a semaphore, and to use this information in each map job and reduce to control the number of map jobs, when reduce was complete, etc.  All hideously complex.
-- This new model is vastly simpler.  The reducer basically knows nothing about the control mechanisms in the NanoScheduler.  It just supports multi-threaded reduce.  The NanoScheduler enqueues exactly nThreads jobs to be run, which continually loop reading, mapping, and reducing until they run out of material to read, when they shut down.  The master thread of the NS just holds a CountDownLatch, initialized to nThreads, and when each thread exits it reduces the latch by 1.  The master thread gets the final reduce result once it is freed by the latch reaching 0.  It's all super super simple.
-- Because this model uses vastly fewer synchronization primitives within the NS itself, it's naturally much faster at getting things done, without any of the overhead obvious in profiles of BQSR -nct 2.
2012-12-24 13:35:57 -05:00
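The control flow described in 295455eee2 can be sketched with standard java.util.concurrent classes. This is an illustration under stated assumptions, not the NanoScheduler source: the class and method names are hypothetical, "map" is squaring an integer, and "reduce" is an order-independent sum held in an AtomicLong, whereas the real reducer processes map results in job order.

```java
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: nThreads looping workers plus a CountDownLatch for the master.
final class LatchSchedulerSketch {
    static long sumOfSquares(List<Integer> input, int nThreads) {
        final Iterator<Integer> reader = input.iterator();
        final AtomicLong sum = new AtomicLong();
        final CountDownLatch done = new CountDownLatch(nThreads);
        final ExecutorService pool = Executors.newFixedThreadPool(nThreads);

        // Enqueue exactly nThreads jobs; each loops until the input is exhausted.
        for (int t = 0; t < nThreads; t++) {
            pool.execute(() -> {
                try {
                    while (true) {
                        Integer next;
                        synchronized (reader) {           // read
                            if (!reader.hasNext()) break;
                            next = reader.next();
                        }
                        long mapped = (long) next * next; // map
                        sum.addAndGet(mapped);            // reduce
                    }
                } finally {
                    done.countDown();                     // this worker has exited
                }
            });
        }

        try {
            done.await(); // the master is released when the latch reaches 0
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        } finally {
            pool.shutdown();
        }
        return sum.get();
    }
}
```

Calling `LatchSchedulerSketch.sumOfSquares(java.util.Arrays.asList(1, 2, 3, 4), 3)` returns 30. The only synchronization left is the guarded read of the shared iterator and the latch itself.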
Mark DePristo aa3ee29929 Handle case where the ReadGroup is null in GATKSAMRecord 2012-12-24 13:35:57 -05:00
Mark DePristo bf81db40f7 NanoScheduler reducer optimizations
-- reduceAsMuchAsPossible no longer blocks threads via synchronization, but instead uses an explicit lock to manage access.  If the lock is already held (because some thread is doing reduce) then the thread attempting to reduce immediately exits the call and continues doing productive work.  This removes one major source of blocking contention in the NanoScheduler
2012-12-24 13:35:57 -05:00
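The non-blocking reduce described in bf81db40f7 resembles the standard `ReentrantLock.tryLock()` pattern, sketched below. Only the method name `reduceAsMuchAsPossible` comes from the commit message; the class, the pending queue, and the summing reduce are hypothetical, and the commit does not say which lock class the real code uses.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: reduce only if nobody else is reducing, otherwise return at once.
final class TryLockReducerSketch {
    private final ConcurrentLinkedQueue<Integer> pending = new ConcurrentLinkedQueue<>();
    private final ReentrantLock reduceLock = new ReentrantLock();
    private long sum = 0; // guarded by reduceLock

    /** Called by mapper threads to hand over a map result. */
    void submit(int mapped) {
        pending.add(mapped);
    }

    /**
     * Reduces every pending result if the lock is free.
     * Returns the number of results reduced, or 0 if another thread holds the lock.
     */
    int reduceAsMuchAsPossible() {
        if (!reduceLock.tryLock()) {
            return 0; // someone else is reducing; go back to productive work
        }
        try {
            int reduced = 0;
            Integer value;
            while ((value = pending.poll()) != null) {
                sum += value;
                reduced++;
            }
            return reduced;
        } finally {
            reduceLock.unlock();
        }
    }

    /** Blocking read of the running total, for use once the workers are done. */
    long currentSum() {
        reduceLock.lock();
        try {
            return sum;
        } finally {
            reduceLock.unlock();
        }
    }
}
```

One consequence of this design is that a result submitted just as the active reducer finishes may stay pending until some thread calls `reduceAsMuchAsPossible` again, so a final call after all mappers have exited is needed to drain the queue.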
Mark DePristo 161487b4a4 MapResult compareTo() is now unit tested
-- Thanks clover!
2012-12-24 13:35:57 -05:00
Mark DePristo 940816f16a GATKSamRecord now checks that the read group is a GATKReadGroupRecord, and if not makes one 2012-12-24 13:35:57 -05:00
Mark DePristo 14944b5d73 Incorporating clover into build.xml
-- See http://gatkforums.broadinstitute.org/discussion/2002/clover-coverage-analysis-with-ant for use docs
-- Fix for artificial reads not having proper read groups, causing NPE in some tests
-- Added clover itself to private/resources
2012-12-24 13:35:57 -05:00
Mark DePristo 7796ba7601 Minor optimizations for NanoScheduler
-- Reducer.maybeReleaseLatch is no longer synchronized
-- NanoScheduler only prints progress every 100 or so map calls
2012-12-24 13:35:56 -05:00
Mark DePristo 0f04485c24 NanoScheduler optimization: don't use a PriorityBlockingQueue for the MapResultsQueue
-- Created a separate MapResultsQueue object with a limited interface; previously this was set to the PriorityBlockingQueue.
-- The MapResultsQueue is now backed by a synchronized ExpandingArrayList, since job ids are integers incrementing from 0 to N.  This means we avoid the n log n sort in the priority queue which was generating a lot of cost in the reduce step
-- Had to update ReducerUnitTest because the test itself was brittle, and broke when I changed the underlying code.
-- A few bits of minor code cleanup through the system (removing unused constructors, local variables, etc)
-- ExpandingArrayList called ensureCapacity so that we increase the size of the arraylist once to accommodate the upcoming size needs
2012-12-24 13:35:56 -05:00
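The data-structure change in 0f04485c24 relies on job ids being consecutive integers starting at 0, so a result can be stored at index `jobId` and consumed in order without any sorting. The sketch below illustrates that idea with a plain `java.util.ArrayList`; it is not the GATK MapResultsQueue or ExpandingArrayList, and the method names are hypothetical.

```java
import java.util.ArrayList;

// Hypothetical sketch: an index-addressed results queue that hands results back in job order.
final class MapResultsQueueSketch<T> {
    private final ArrayList<T> slots = new ArrayList<>();
    private int nextToReduce = 0;

    /** Stores the result for jobId, growing the list once to the needed size. */
    synchronized void put(int jobId, T result) {
        slots.ensureCapacity(jobId + 1);
        while (slots.size() <= jobId) {
            slots.add(null); // placeholder for jobs that have not finished yet
        }
        slots.set(jobId, result);
    }

    /** True if the result for the next job in sequence has arrived. */
    synchronized boolean nextValueIsAvailable() {
        return nextToReduce < slots.size() && slots.get(nextToReduce) != null;
    }

    /** Returns the next in-order result, or null if it has not arrived yet. */
    synchronized T take() {
        if (!nextValueIsAvailable()) {
            return null;
        }
        T result = slots.get(nextToReduce);
        slots.set(nextToReduce, null); // release the reference
        nextToReduce++;
        return result;
    }
}
```

Insertion and retrieval are both constant-time index operations, at the cost of holding a slot for every job id seen so far.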