Ryan Poplin
b85ded8389
Merge branch 'master' of ssh://gsa2.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-08-30 10:11:48 -04:00
Ryan Poplin
650ff29e62
oops, I meant to use binary OR here.
2012-08-30 10:11:29 -04:00
Ryan Poplin
57d997f06f
Fixing bug from when FragmentUtils merging function moved over to the soft clipped start instead of the unclipped start
2012-08-30 10:10:43 -04:00
Ryan Poplin
f9bab37015
Merged bug fix from Stable into Unstable
2012-08-30 09:21:24 -04:00
Ryan Poplin
eb63221875
Merge branch 'master' of ssh://gsa2.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/stable
2012-08-30 09:19:35 -04:00
Ryan Poplin
81d5eca975
Merge branch 'master' of ssh://gsa2.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-08-30 09:10:56 -04:00
Ryan Poplin
35baf0b155
This along with Mauricio's previous commit (thanks!) fixes GSA-522. There are no longer any modifications to reads in the map calls of ActiveRegion walkers. Added the bam which identified this error as a new integration test.
2012-08-30 09:07:36 -04:00
Eric Banks
1acf0f0b2c
Fixing bug in fasta .fai generation: trim the contig names to the first whitespace if one appears. We now generate indexes identical to samtools.
2012-08-29 22:36:27 -04:00
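
Note on 1acf0f0b2c: the contig-name rule described above can be sketched as follows. This is an illustrative Python sketch, not the code from the commit; the function name `fai_contig_name` is hypothetical, and it assumes the header line starts with `>` and that the name is everything up to the first whitespace character.

```python
import re


def fai_contig_name(header_line: str) -> str:
    """Return the contig name for a FASTA header line.

    Assumption: the name is the text after '>' up to (not including)
    the first whitespace character, as the commit message describes.
    """
    if not header_line.startswith(">"):
        raise ValueError("not a FASTA header line: %r" % header_line)
    # Split once on the first whitespace character and keep the left part.
    return re.split(r"\s", header_line[1:], maxsplit=1)[0]


if __name__ == "__main__":
    print(fai_contig_name(">chr1 Homo sapiens chromosome 1\n"))
```

With this rule, a header such as `>chr1 Homo sapiens chromosome 1` is indexed under `chr1` rather than under the full description line.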
Eric Banks
4d38befe86
Merged bug fix from Stable into Unstable
2012-08-29 15:13:56 -04:00
Eric Banks
150a969279
Be careful with String manipulation when constructing alleles in SomaticIndelDetector
2012-08-29 15:13:28 -04:00
Eric Banks
ce55ba98f4
Don't try to left align indels in unmapped reads (which for some reason can still have CIGARs) because the ref context is null.
2012-08-29 15:01:11 -04:00
Ryan Poplin
4ea38bbfe8
Merge branch 'master' of ssh://gsa2.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-08-29 11:39:30 -04:00
Ryan Poplin
b132a33dac
Delocalized BQSR now BAQs the reads on the fly instead of on input.
2012-08-29 11:39:07 -04:00
Mauricio Carneiro
69b56e11c8
ReadClipper won't modify the original read
...
Reverting back to the original implementation, but now including write N's and write Q0's due to walkers that look at the same read multiple times in different reference windows
2012-08-29 11:33:19 -04:00
Ryan Poplin
e12ae65d33
Changing the commenting style in the BQSR
2012-08-29 11:27:45 -04:00
Mark DePristo
19cc0b373e
Some code review comments for Ryan
2012-08-28 17:06:08 -04:00
Khalid Shakir
f45226f01e
Updated HSPTest expected values to match FS changes in depristo's 3baf52 commit.
2012-08-28 16:34:55 -04:00
Ryan Poplin
6d6ca090c6
RecalDatums now hold doubles so the test for equality needs an epsilon.
2012-08-28 16:00:52 -04:00
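
Note on 6d6ca090c6: once values are stored as floating-point doubles, exact equality is fragile, which is why the test needs an epsilon. A minimal Python sketch of the idea follows; the helper name `nearly_equal` and the default tolerance are assumptions chosen for illustration, not values taken from the commit.

```python
def nearly_equal(a: float, b: float, epsilon: float = 1e-6) -> bool:
    """Compare two floats with an absolute tolerance.

    The default epsilon is an arbitrary illustrative value.
    """
    return abs(a - b) <= epsilon


if __name__ == "__main__":
    total = 0.1 + 0.2
    print(total == 0.3)              # exact comparison is unreliable
    print(nearly_equal(total, 0.3))  # tolerance-based comparison
```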
Ryan Poplin
18eca3544e
Initial commit of the delocalized BQSR written as a read walker.
2012-08-28 15:24:20 -04:00
Eric Banks
e74c527d47
Register the deprecated walkers as deprecated starting in v2.2 so that users get a helpful error message
2012-08-28 10:19:18 -04:00
Eric Banks
67d348a31d
Retiring the alignment walkers and related integration test since we don't want to support them anymore.
2012-08-28 10:16:49 -04:00
Mark DePristo
0f4acaae1b
Update MD5s with new FS score
2012-08-28 08:06:47 -04:00
Mark DePristo
4b8d9c3915
Actually load the library necessary to compactPDF
...
-- Old version was buggy in that if you didn't load "tools" package in your script it wouldn't compact the resulting PDF! Fixed
2012-08-28 08:06:47 -04:00
Mark DePristo
2996693c9f
FisherStrand now computed with and without filtering low-qual bases, and least significant pvalue is kept
...
-- Old way (filtering for Q > 17 bases) resulted in biased FS when the site was good but there was a
systematic shift in the QUAL of REF and ALT between strands of the reads (sometimes happens)
-- New way (taking all bases) was consistent with BaseQualRankSum and other tests, but there can be
a lot of low qual reference bases on one strand in some techs (ION/PROTON/PACBIO) because of the
preference for introducing an indel vs. a mismatch.
-- This implementation allows us to have our cake and eat it too by computing both p-values, and
taking the maximum one (i.e., least significant).
-- No integration tests updated yet -- still exploring the consequences of this change
2012-08-28 08:06:47 -04:00
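
Note on 2996693c9f: the "compute both p-values and keep the least significant" idea can be sketched as below. This is a hedged illustration, not the FisherStrand implementation: the function names, the table layout `(ref_fwd, ref_rev, alt_fwd, alt_rev)`, and the use of a plain two-sided Fisher exact test are all assumptions made for the example.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one;
    # the small relative slack guards against floating-point ties.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def least_significant_pvalue(all_bases, high_qual_bases) -> float:
    """Take the larger (least significant) of the two strand-bias p-values.

    Each argument is a hypothetical (ref_fwd, ref_rev, alt_fwd, alt_rev) tuple:
    one counted over all bases, one over bases passing a quality filter.
    """
    return max(fisher_exact_two_sided(*all_bases),
               fisher_exact_two_sided(*high_qual_bases))


if __name__ == "__main__":
    print(least_significant_pvalue((3, 0, 0, 3), (5, 5, 5, 5)))
```

Taking the maximum means a site is flagged for strand bias only when both the filtered and the unfiltered counts agree that the bias is significant.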
Eric Banks
bedcdbdc5f
Fixing merge conflict
2012-08-27 12:16:51 -04:00
Eric Banks
3d476487c6
LIBS is totally busted for deletions. Putting a check in AD for bad pileup event bases so that we don't produce busted alleles. We must fix LIBS ASAP.
2012-08-27 12:13:12 -04:00
Mark DePristo
63a9ae817a
Ensure thread-safety of CachingIndexedFastaSequenceFile
...
-- Cosmetic cleanup of ReadReferenceView
-- TraverseReadsNano provides the reference context, since it's thread-safe
-- Cleanup CachingIndexedFastaSequenceFile. Add docs, remove unnecessary setters
-- Expand CachingIndexedFastaSequenceFileUnitTest to test explicitly multi-threaded safety.
2012-08-27 12:11:54 -04:00
Mark DePristo
e5b1f1c7f4
Add simple main function to unit test so we can run the nano scheduler test from the command line
2012-08-27 12:11:54 -04:00
Khalid Shakir
2d1ea7124b
One less Queue command line requirement: -tempDir now defaults to .queue/tmp.
...
Also moved queueScatterGather to .queue/scatterGather.
2012-08-27 12:04:50 -04:00
Mark DePristo
68c5142d2d
numThreads > 1 any time you have -nt > 1 silly
2012-08-26 14:36:13 -04:00
Mark DePristo
faacacd6c0
Increase runtime of nano scheduler tests to 1 min
2012-08-26 08:42:58 -04:00
Mark DePristo
846e0c11bc
Add TimeOuts to new threading tests, in case there's an underlying deadlock
2012-08-26 08:18:43 -04:00
Mark DePristo
fde9824765
Optimizations for parallel read walkers
...
-- TraverseReadsNano only creates the NanoScheduler once, and shuts it down onTraversalDone
-- Nicer debugging output in NanoScheduler
-- ReadShard has a getBufferSize() method now
2012-08-25 17:21:12 -04:00
Mark DePristo
5066b14335
Parallel FlagStat
2012-08-25 17:21:12 -04:00
Mark DePristo
af540888f1
Limited version of parallel read walkers
...
-- Currently doesn't support accessing reference or ROD data
-- Parallel versions of PrintReads and CountReads
2012-08-25 17:21:12 -04:00
Mark DePristo
e060b148e2
Minor cleanup of TraverseReads
2012-08-25 17:21:11 -04:00
Mark DePristo
275a5e5439
More tests for NanoScheduler
...
-- Add more contracts
-- Test in the UnitTest that the reduce is being called in the correct order
2012-08-25 17:21:11 -04:00
Christopher Hartl
6db0988898
Merge branch 'master' of ssh://gsa2.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-08-25 15:40:32 -04:00
Christopher Hartl
db2e88c7cb
Fix for badIndelLength() throwing NPE at non-indel sites. Added integration test.
2012-08-25 12:38:23 -07:00
Mark DePristo
59b5913b54
Merged bug fix from Stable into Unstable
2012-08-25 14:53:22 -04:00
Mark DePristo
dcc972a557
Usability cleanup for BQSR
...
-- I'm seeing a lot of people trying to use BinaryTagCovariate in the community. They really shouldn't do this, so I moved it to private.
-- Throw an exception if its required bintag argument is missing
-- Check explicitly if user is requesting DinucCovariate and tell them that it's been retired in favor of ContextCovariate
-- Show the type (Required, Experimental, Standard) of the covariates when running --list
2012-08-25 14:53:00 -04:00
Mark DePristo
1044ddbc26
Minor improvement in naming of BQSR tests
2012-08-25 14:06:55 -04:00
Mark DePristo
58ca3f61df
Fix horrible bug in the classification of runs as successes, sting-exceptions, or user-exceptions in analyzeRunReports
...
-- Old logic was just busted.
2012-08-25 14:06:55 -04:00
Christopher Hartl
b59948709f
Code improvements re: JIRA GSA-510. Trio class migrated into the Samples package - because the trio structure is so ubiquitously used, it makes sense, I think, to have a class which imposes the structure on the samples. Existing functions which slightly duplicated the getTrios() method look like they have bugs. These functions are now deprecated.
...
A number of functions in the sampleDB looked to be assuming that samples could not share IDs (e.g. sample IDs are unique, so a sample present in two families could not be represented by multiple Sample objects). Added an assertion in the SampleDBBuilder to document/test this assumption.
MVLikelihoodRatio now uses the trio methods from SampleDB.
2012-08-25 08:48:27 -07:00
Mark DePristo
0996bbd548
Comments for Chris on cleanup
2012-08-24 16:04:58 -04:00
Mark DePristo
649b82ce85
Merge branch 'nanoScheduler'
...
Conflicts:
private/scala/qscript/org/broadinstitute/sting/queue/qscripts/performance/GATKPerformanceOverTime.scala
2012-08-24 15:59:36 -04:00
Mark DePristo
801b910b9e
GATKPerformanceOverTime is finalized (mark II)
...
-- Make BQSR run longer
-- Use Dinuc not context covariates for BQSR v1
2012-08-24 15:57:48 -04:00
Mark DePristo
62aa0ac77e
GATKPerformanceOverTime is finalized
...
-- Update BQSR to run v1 and v2. Use new single read group extracted BAM
-- Bug fixes
2012-08-24 15:57:48 -04:00
Mark DePristo
3bbdccb0ae
Refactor and cleanup GATKPerformanceOverTime
...
-- Use single read group BAM file for BQSR
-- Implement terrible (but clever) hack to support BQSR v1 and v2 in a single Scala class.
2012-08-24 15:57:48 -04:00
Mark DePristo
9f0eff4c4c
MySQLdb required to run analyzeRunReports, despite my best efforts
2012-08-24 15:57:48 -04:00