Mauricio Carneiro
7532be7f5a
Allow clipping after AlignmentEnd if the end is soft clipped.
...
Read clipper now identifies and clips even if the requested coordinate is outside the alignment but the read contains soft clipped bases in that region.
2011-08-30 02:44:46 -04:00
Mauricio Carneiro
90a1f5e15c
Several bug fixes
...
* When hard clipping a read that had insertions in it, the insertion was being added to the cigar string's hard clip element. This way, the old UnclippedStart() was being modified and so was the calculation of the new AlignmentStart(). Fixed it by subtracting the number of insertions clipped from the total number of hard clipped bases.
* Walker was sending read instead of filtered read when deleting a read that contains only Q2 bases
* Sliding the window was causing reads that started on the new start position to be entirely clipped.
2011-08-30 02:44:19 -04:00
Mauricio Carneiro
66a8b36cf5
Fixed most indexing bugs
...
* added bases and quals to consensus
* fixed consensus read cigar generation.
2011-08-30 02:43:41 -04:00
Khalid Shakir
6a4a47568c
Added sample script for generating per sample metrics and updated the queue.sh used for running pipelines to be renamed.
2011-08-29 22:10:35 -04:00
Khalid Shakir
077b6a58da
Merging (un-merging?) reverts into unstable. Current unstable uses "leftAligned" while current stable does not use "leftAligned".
2011-08-29 20:09:48 -04:00
Khalid Shakir
5fdd10340a
Merged bug fix from Stable into Unstable
2011-08-29 20:08:04 -04:00
Khalid Shakir
cf2430322a
Manually fixing unintentional path changes for dbsnps.
2011-08-29 20:06:28 -04:00
Khalid Shakir
2125ba1f23
Merged bug fix from Stable into Unstable
...
Conflicts:
private/java/src/org/broadinstitute/sting/pipeline/ReferenceData.java
2011-08-29 19:36:43 -04:00
Khalid Shakir
20ac24464d
Rev'ved picard to read new analysis_files.txt with a blank line after header and no reference sequence.
...
Updated error messages and unit tests.
2011-08-29 19:33:04 -04:00
Mark DePristo
427c643ce7
The missing tribble jar
2011-08-29 18:46:40 -04:00
Mark DePristo
c6d8df8639
queueJobReport is a public feature of Queue
2011-08-29 17:20:54 -04:00
Mark DePristo
1e5001b447
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-29 17:04:21 -04:00
Mark DePristo
5defaf5fac
Continuing to improve Tribble
...
-- ProfileRodSystem now has a just load index mode, allowing us to optimize the profiler
-- assessFarmNodes R script for making nice plots of performance of jobs on the farm
-- Rev. tribble to use new, optimized index loading (performance win when loading many many indices)
2011-08-29 17:02:57 -04:00
Mark DePristo
3af001fff2
Bugfix for file that must not exist on disk
2011-08-29 17:00:10 -04:00
David Roazen
71680e3fd6
Extremely minor tweaks to the GSAPipelineIndexer
2011-08-29 16:55:54 -04:00
Mark DePristo
3b09d42ed6
Now only prints 1 warning message about duplicate headers in simpleMerge
2011-08-29 14:41:29 -04:00
Eric Banks
c2f0db969b
Don't use the default deletion value from UG if not asking to have it set
2011-08-29 13:48:10 -04:00
Eric Banks
bb7a37e8f2
We need to allow reference calls in the input VCF for the GenotypeAndValidate walker when using the BAM as truth so that we can test supposed monomorphic calls against the truth.
2011-08-29 13:19:35 -04:00
Ryan Poplin
5b56c83401
Don't output variants that are on the edges of the haplotype, far away from the original interval.
2011-08-29 11:15:07 -04:00
Ryan Poplin
661d658a0e
Updating smith-waterman parameters for the general case of aligning both with indels and snps simultaneously
2011-08-29 09:49:34 -04:00
Ryan Poplin
bc252a0d62
misc minor bug fixes in assembly. Increasing the minimum number of bad variants to be used in negative model training in the VQSR
2011-08-29 08:11:31 -04:00
Ryan Poplin
f9afc5876a
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-29 08:04:39 -04:00
Mark DePristo
61633c95a8
Default jobreport is now jobPrefix, so you see logs like Q-2508.jobreport.txt
2011-08-28 19:19:45 -04:00
Mark DePristo
a5c65fc133
Debugging information to print out the Query tracks
2011-08-28 18:54:49 -04:00
Mark DePristo
d1b2b4ece9
Simple R script to analyse node performance
2011-08-28 16:40:38 -04:00
Mark DePristo
796ba34f6d
Simple script that can be used to assess farm node performance
...
-- spawns bundle dbsnp countrod that takes ~15 minutes to complete in general
-- import cleanup for RodPerformanceGoals
2011-08-28 15:09:50 -04:00
Mark DePristo
7542d29507
Oops, enabling default displays again
2011-08-28 12:07:38 -04:00
Mark DePristo
84704aaee3
Support for exechosts. Only displays by default in the gantt chart as jobname @ exechosts.
2011-08-28 12:06:41 -04:00
Mark DePristo
b38de1fa35
Now captures the exechost in the job report
...
-- Works for in process, shell, and LSF runners
-- Cleanup of debugging output
2011-08-28 12:05:56 -04:00
Mark DePristo
7bf006278d
Moved ResolveHostname to general utils as a static function
2011-08-28 12:04:16 -04:00
Ryan Poplin
77426d0fe1
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-28 09:31:15 -04:00
Mauricio Carneiro
32e74affe1
Sliding Window refactor to account for indels
...
* Sliding window now operates over the cigar string to handle indels correctly
* window slides first, then adds the read.
* fixed consensus generation at the end of a variable region
* Sliding reads no longer keep duplicate information of bases and qualities
* consensus read is now a valid sam record including RG information
* BaseIndex is no longer a private class of BaseCounts as it's a useful utility for other tools in the reduced reads
* Some optimizations to the code in general.
2011-08-27 14:23:15 -04:00
Mark DePristo
ccec0b4d73
AnalyzeCovariates uses the general RScript system now
...
-- Convenience constructor for collection for testing
-- callRScript() now accepts Objects not Strings, for convenience
2011-08-27 12:54:13 -04:00
Mark DePristo
810a71c631
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-27 10:50:15 -04:00
Mark DePristo
1ceb020fae
UnitTests for RScript
2011-08-27 10:50:05 -04:00
Mark DePristo
ede4b0a116
Now includes the standard deviation in the average parameter results
2011-08-27 09:57:10 -04:00
David Roazen
50908fe285
Enable conditional inclusion of documentation for hidden features in GATKDocs.
...
To include documentation for hidden features in the generated GATKDocs,
run with -Dgatkdocs.include.hidden=true
I will enable this flag when bamboo generates GATKDocs for unstable.
2011-08-27 03:38:01 -04:00
David Roazen
ccfed5d64d
Enable Contracts for Java by default for test targets.
...
Contracts remain disabled for non-test build targets. To enable for
non-test targets, run with -Duse.contracts=true. To disable for test
targets, run with -Duse.contracts=false.
2011-08-27 02:45:47 -04:00
Mauricio Carneiro
732f1f12b7
Refactoring Sliding window
...
(*** THIS COMMIT BREAKS THE BUILD ***)
2011-08-26 18:25:49 -04:00
David Roazen
beb947d3cc
Standalone program to create an XML index of the GSA pipeline directory suitable for loading in IGV.
...
This is a replacement for an ancient Perl script that will soon be retired.
2011-08-26 14:48:38 -04:00
Mauricio Carneiro
6c8cafba63
CountsWithBases now handles indels
...
Adding base counts now iterates over the read's cigar string instead of the bases, so that insertions and deletions are handled correctly.
* This commit broke other functionality that was relying on the incorrectly formed base counts.
2011-08-26 14:41:16 -04:00
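The cigar-driven iteration described in the commit above can be illustrated with a minimal sketch. This is not the actual CountsWithBases code: the class name CigarWalkSketch, the method referencePositions, and the use of -1 for bases with no reference position are all assumptions made for illustration. It walks a cigar string and reports which reference position each read base lands on, which is the mapping a per-position base counter needs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of the GATK code base.
class CigarWalkSketch {
    // For each read base, returns the reference position it aligns to,
    // or -1 when the base consumes no reference (insertion or soft clip).
    static List<Integer> referencePositions(String cigar, int alignmentStart) {
        List<Integer> positions = new ArrayList<>();
        int refPos = alignmentStart;
        int length = 0;
        for (char c : cigar.toCharArray()) {
            if (Character.isDigit(c)) {
                // Accumulate the element length digit by digit.
                length = length * 10 + (c - '0');
                continue;
            }
            switch (c) {
                case 'M': case '=': case 'X':
                    // Consumes both read and reference.
                    for (int i = 0; i < length; i++) positions.add(refPos++);
                    break;
                case 'I': case 'S':
                    // Consumes read bases only.
                    for (int i = 0; i < length; i++) positions.add(-1);
                    break;
                case 'D': case 'N':
                    // Consumes reference only: skip ahead, no read base emitted.
                    refPos += length;
                    break;
                case 'H': case 'P':
                    // Consumes neither read nor reference.
                    break;
                default:
                    throw new IllegalArgumentException("Unknown CIGAR operator: " + c);
            }
            length = 0;
        }
        return positions;
    }
}
```

Iterating over the bases alone would assign the inserted base to a reference position and shift every base after a deletion by one, which is the kind of malformed count the commit message refers to.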
Ryan Poplin
dd2cf7d81e
Evaluating the likelihood of a haplotype should only use the passing reads
2011-08-26 14:36:28 -04:00
Mark DePristo
5a6ae954bf
Added VCF streaming to tribbleVsGATK
2011-08-26 14:04:48 -04:00
Mark DePristo
bd92a1b220
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-26 13:24:26 -04:00
Mark DePristo
e37a638e09
Fix for disallowed characters in GATKReportTable
...
-- Illegal characters are automatically replaced with _
2011-08-26 13:24:06 -04:00
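The replacement rule in the commit above could look roughly like the following sketch. The commit does not say which characters GATKReportTable disallows, so the character class here (anything other than ASCII letters, digits, and underscore) is an assumption, as are the class and method names.

```java
// Hypothetical sketch; the real set of disallowed characters is not stated in the commit.
class TableNameSketch {
    // Replaces every character outside [A-Za-z0-9_] with an underscore.
    static String sanitize(String name) {
        return name.replaceAll("[^A-Za-z0-9_]", "_");
    }
}
```

Under that assumption, a name such as "my table:v1" would become "my_table_v1".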
Ryan Poplin
6e66f1c243
Removing code that cleans up the assembly graph for purposes of display. There seem to be bugs
2011-08-26 12:36:12 -04:00
Ryan Poplin
8f150c6764
Assembly debug mode now uses smith-waterman to locally align all haplotypes and outputs to a bam file instead of first writing a fasta file and using bwa-sw outside of the GATK.
2011-08-26 10:43:40 -04:00
Ryan Poplin
8f90a22555
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-26 10:16:56 -04:00
Mark DePristo
0cb1605df0
Clean documentation for JobRunInfo
2011-08-26 09:22:58 -04:00
Mark DePristo
415d5d5301
LSF long times are in seconds; convert to milliseconds to meet the standard
2011-08-26 09:18:28 -04:00
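For reference, the unit conversion described in the commit above can be written with the standard library. The class and method names below are hypothetical and are not taken from the Queue source.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; names are illustrative, not from the Queue code base.
class LsfTimeSketch {
    // Converts a time reported by LSF in seconds to milliseconds.
    static long toMillis(long lsfSeconds) {
        return TimeUnit.SECONDS.toMillis(lsfSeconds);
    }
}
```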