7318 Commits (0037b61e5d206e3eb94de7076bae22c6f001c34f)

Author SHA1 Message Date
Mauricio Carneiro 39d8dccc9c Don't close an empty sliding window
Sliding window may be empty due to a slide triggered by a previous read that didn't pass the minimum mapping quality filters.
2011-08-30 02:45:20 -04:00
Mauricio Carneiro fd540592ab Added RMS calculation for consensus MQ
Consensus MQ is now the average of the RMS of the mapping qualities of the reads making each site.
2011-08-30 02:45:20 -04:00
Mauricio Carneiro 7271998735 Ignore insertions at the beginning of a read.
Still use the read, but don't mark any base as having insertions for the purpose of the consensus.
2011-08-30 02:45:19 -04:00
Mauricio Carneiro f2cc483c22 Adding mapping quality filter to consensus
Only reads with a mapping quality above the minimum are now used to build the consensus.
2011-08-30 02:45:19 -04:00
Mauricio Carneiro b2f39fef8e don't output deletions as bases in the read (consensus)
The running consensus accepts deletions as long as they are homozygous and (with the current parameters) there is only one read in the pileup. It was generating the cigar string correctly but adding the "D" base to the read bases. Fixed.
2011-08-30 02:45:19 -04:00
Mauricio Carneiro dc140b9f18 Fixed window header construction for out of order reads
Reads can come in out of order due to clipping and the window header may need to have elements added to the head of the list. Fixed.
2011-08-30 02:45:10 -04:00
Mauricio Carneiro 6f9264d2b3 Hard Clipping no longer leaves indels on the tails
The clipper could leave an insertion or deletion as the start or end of a read after hardclipping a read if the element adjacent to the clipping point was an indel. Fixed.
2011-08-30 02:44:58 -04:00
Mauricio Carneiro 943876c6eb Added QUAL/MINVAR parameters to the walker 2011-08-30 02:44:46 -04:00
Mauricio Carneiro c81675be4c Never allow deletions in the consensus
This makes it more in line with the insertion treatment and avoids the following problems:
* consensus reads starting with deletions
* variant regions ending abruptly and turning into one consensus with a long insertion, making it difficult to call the deletion in that location.
2011-08-30 02:44:46 -04:00
Mauricio Carneiro 7532be7f5a Allow clipping after AlignmentEnd if the end is soft clipped.
Read clipper now identifies and clips even if the requested coordinate is outside the alignment but the read contains soft clipped bases in that region.
2011-08-30 02:44:46 -04:00
Mauricio Carneiro 90a1f5e15c Several bug fixes
* When hard clipping a read that had insertions in it, the insertion was being added to the cigar string's hard clip element. This way, the old UnclippedStart() was being modified and so was the calculation of the new AlignmentStart(). Fixed it by subtracting the number of insertions clipped from the total number of hard clipped bases.
* Walker was sending read instead of filtered read when deleting a read that contains only Q2 bases
* Sliding the window was causing reads that started on the new start position to be entirely clipped.
2011-08-30 02:44:19 -04:00
Mauricio Carneiro 66a8b36cf5 Fixed most indexing bugs
* added bases and quals to consensus
* fixed consensus read cigar generation.
2011-08-30 02:43:41 -04:00
Khalid Shakir 6a4a47568c Added sample script for generating per sample metrics and updated the queue.sh used for running pipelines to be renamed. 2011-08-29 22:10:35 -04:00
Khalid Shakir 077b6a58da Merging (un-merging?) reverts into unstable. Current unstable uses "leftAligned" while current stable does not use "leftAligned". 2011-08-29 20:09:48 -04:00
Khalid Shakir 5fdd10340a Merged bug fix from Stable into Unstable 2011-08-29 20:08:04 -04:00
Khalid Shakir cf2430322a Manually fixing unintentional path changes for dbsnps. 2011-08-29 20:06:28 -04:00
Khalid Shakir 2125ba1f23 Merged bug fix from Stable into Unstable
Conflicts:
	private/java/src/org/broadinstitute/sting/pipeline/ReferenceData.java
2011-08-29 19:36:43 -04:00
Khalid Shakir 20ac24464d Rev'ved picard to read new analysis_files.txt with a blank line after header and no reference sequence.
Updated error messages and unit tests.
2011-08-29 19:33:04 -04:00
Mark DePristo 427c643ce7 The missing tribble jar 2011-08-29 18:46:40 -04:00
Mark DePristo c6d8df8639 queueJobReport is a public feature of Queue 2011-08-29 17:20:54 -04:00
Mark DePristo 1e5001b447 Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-29 17:04:21 -04:00
Mark DePristo 5defaf5fac Continuing to improve Tribble
-- ProfileRodSystem now has a just load index mode, allowing us to optimize the profiler
-- assessFarmNodes R script for making nice plots of performance of jobs on the farm
-- Rev. tribble to use new, optimized index loading (performance win when loading many many indices)
2011-08-29 17:02:57 -04:00
Mark DePristo 3af001fff2 Bugfix for file that must not exist on disk 2011-08-29 17:00:10 -04:00
David Roazen 71680e3fd6 Extremely minor tweaks to the GSAPipelineIndexer 2011-08-29 16:55:54 -04:00
Mark DePristo 3b09d42ed6 Now only prints 1 warning message about duplicate headers in simpleMerge 2011-08-29 14:41:29 -04:00
Eric Banks c2f0db969b Don't use the default deletion value from UG if not asking to have it set 2011-08-29 13:48:10 -04:00
Eric Banks bb7a37e8f2 We need to allow reference calls in the input VCF for the GenotypeAndValidate walker when using the BAM as truth so that we can test supposed monomorphic calls against the truth. 2011-08-29 13:19:35 -04:00
Ryan Poplin 5b56c83401 Don't output variants that are on the edges of the haplotype, far away from the original interval. 2011-08-29 11:15:07 -04:00
Ryan Poplin 661d658a0e Updating smith-waterman parameters for the general case of aligning both with indels and snps simultaneously 2011-08-29 09:49:34 -04:00
Ryan Poplin bc252a0d62 misc minor bug fixes in assembly. Increasing the minimum number of bad variants to be used in negative model training in the VQSR 2011-08-29 08:11:31 -04:00
Ryan Poplin f9afc5876a Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-29 08:04:39 -04:00
Mark DePristo 61633c95a8 Default jobreport is now jobPrefix, so you see logs like Q-2508.jobreport.txt 2011-08-28 19:19:45 -04:00
Mark DePristo a5c65fc133 Debugging information to print out the Query tracks 2011-08-28 18:54:49 -04:00
Mark DePristo d1b2b4ece9 Simple R script to analyse node performance 2011-08-28 16:40:38 -04:00
Mark DePristo 796ba34f6d Simple script that can be used to assess farm node performance
-- spawns bundle dbsnp countrod that takes ~15 minutes to complete in general
-- import cleanup for RodPerformanceGoals
2011-08-28 15:09:50 -04:00
Mark DePristo 7542d29507 Oops, enabling default displays again 2011-08-28 12:07:38 -04:00
Mark DePristo 84704aaee3 Support for exechosts. Only displays by default in the gantt chart as jobname @ exechosts. 2011-08-28 12:06:41 -04:00
Mark DePristo b38de1fa35 Now captures the exechost in the job report
-- Works for in process, shell, and LSF runners
-- Cleanup of debugging output
2011-08-28 12:05:56 -04:00
Mark DePristo 7bf006278d Moved ResolveHostname to general utils as a static function 2011-08-28 12:04:16 -04:00
Ryan Poplin 77426d0fe1 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-28 09:31:15 -04:00
Mauricio Carneiro 32e74affe1 Sliding Window refactor to account for indels
* Sliding window now operates over the cigar string to handle indels correctly
* window slides first, then adds the read.
* fixed consensus generation at the end of a variable region
* Sliding reads no longer keep duplicate information of bases and qualities
* consensus read is now a valid sam record including RG information
* BaseIndex is no longer a private class of BaseCounts as it's a useful utility for other tools in the reduced reads
* Some optimizations to the code in general.
2011-08-27 14:23:15 -04:00
Mark DePristo ccec0b4d73 AnalyzeCovariates uses the general RScript system now
-- Convenience constructor for collection for testing
-- callRScript() now accepts Objects not Strings, for convenience
2011-08-27 12:54:13 -04:00
Mark DePristo 810a71c631 Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-27 10:50:15 -04:00
Mark DePristo 1ceb020fae UnitTests for RScript 2011-08-27 10:50:05 -04:00
Mark DePristo ede4b0a116 Now includes the standard deviation in the average parameter results 2011-08-27 09:57:10 -04:00
David Roazen 50908fe285 Enable conditional inclusion of documentation for hidden features in GATKDocs.
To include documentation for hidden features in the generated GATKDocs,
run with -Dgatkdocs.include.hidden=true

I will enable this flag when bamboo generates GATKDocs for unstable.
2011-08-27 03:38:01 -04:00
David Roazen ccfed5d64d Enable Contracts for Java by default for test targets.
Contracts remain disabled for non-test build targets. To enable for
non-test targets, run with -Duse.contracts=true. To disable for test
targets, run with -Duse.contracts=false.
2011-08-27 02:45:47 -04:00
Mauricio Carneiro 732f1f12b7 Refactoring Sliding window
(*** THIS COMMIT BREAKS THE BUILD ***)
2011-08-26 18:25:49 -04:00
David Roazen beb947d3cc Standalone program to create an XML index of the GSA pipeline directory suitable for loading in IGV.
This is a replacement for an ancient Perl script that will soon be retired.
2011-08-26 14:48:38 -04:00
Mauricio Carneiro 6c8cafba63 CountsWithBases now handles indels
adding base counts now iterates over the read's cigar string instead of the bases to handle insertions and deletions correctly.

* This commit broke other functionality that was relying on the incorrectly formed base counts.
2011-08-26 14:41:16 -04:00