179 Commits (3c8445b934c127581919d6be960ebc372be21342)

Author SHA1 Message Date
Mark DePristo 3c8445b934 Performance bugfix for GenomeLoc.hashcode
-- old version overflowed, so most GenomeLocs had a hashcode of 0.  Now uses or, not plus, to combine
2011-09-09 14:25:37 -04:00
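A minimal sketch of combining fields with bitwise or, as the commit message above describes. The class name, field names, and shift widths here are assumptions for illustration; the log does not show the actual GenomeLoc code. One Java fact worth noting: `+` binds tighter than `<<`, so shifts mixed with `+` should be parenthesized explicitly. Whether that was the cause of the original bug is not stated in the log.

```java
// Hypothetical stand-in for a genomic interval; NOT the actual GenomeLoc class.
final class LocHashSketch {
    // Combine the three fields with bitwise OR of explicitly parenthesized shifts.
    // The shift widths (16 and 4) are assumptions chosen for illustration.
    static int hash(int contigIndex, int start, int stop) {
        return (start << 16) | (stop << 4) | contigIndex;
    }

    public static void main(String[] args) {
        // Two intervals that differ only in contig index get different hash codes.
        System.out.println(hash(1, 100, 200));
        System.out.println(hash(2, 100, 200));
    }
}
```

A hash like this only needs to spread values well; equal objects must still produce equal hash codes.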
Mark DePristo c6436ee5f0 Whitespace cleanup 2011-09-09 14:24:29 -04:00
Mark DePristo 87dc5cfb24 Whitespace cleanup 2011-09-09 14:23:42 -04:00
Mark DePristo 06cb20f2a5 Intermediate commit cleaning up scatter intervals
-- Adding unit tests to ensure uniformity of intervals
2011-09-09 12:56:45 -04:00
Eric Banks 3a04955a30 We already had isPolymorphic and isMonomorphic in the VariantContext, but the implementation was incorrect for many edge cases (e.g. sites-only files, sites with samples who were no-called). Fixing. Moving on to VE now. 2011-09-07 14:01:42 -04:00
Mauricio Carneiro 131cb7effd Bringing Reduce Reads bug fixes to the main repository 2011-09-07 12:25:53 -04:00
Mark DePristo 3db7ecb920 ReducedRead flag cached in GATKSAMRecord. 20% performance improvement 2011-09-06 15:11:38 -04:00
Roger Zurawicki 47607a7eff Fixed bug where deletions messed up interval clipping
- Instead of using readLength, the ReadUtil functions are used to get a proper read coordinate
- Added debug info in interval clipping (with -dl)

  NOTE: method might not be safe for production and checks need to be added to the ClippingOp code
2011-09-06 14:25:57 -04:00
Mauricio Carneiro 08ae6c0c61 ReadClipper is now handling unmapped reads 2011-09-02 11:32:30 -04:00
Mauricio Carneiro 7d79de91c5 Merge branch 'master' into rr 2011-08-30 02:50:19 -04:00
Mauricio Carneiro 0cd9438ac2 fixed soft unclipped calculation
* getRefCoordSoftUnclippedEnd was not resetting the shift when hitting insertions. Fixed.
* getReadCoordinateForReferenceCoordinateBeforeAlignmentEnd was returning the wrong read coordinate position. Fixed.
2011-08-30 02:45:29 -04:00
Mauricio Carneiro fd540592ab Added RMS calculation for consensus MQ
Consensus MQ is now the average of the RMS of the mapping qualities of the reads making each site.
2011-08-30 02:45:20 -04:00
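One reading of the commit message above, sketched under assumptions: compute the root mean square (RMS) of the mapping qualities at each site, then average those per-site values. The class and method names are hypothetical, and the log does not show how the walker actually groups reads per site.

```java
// Hypothetical sketch; names and data layout are assumptions, not the walker's code.
final class ConsensusMqSketch {
    // RMS of the mapping qualities at one site. Assumes a non-empty array.
    static double rms(int[] mapQs) {
        double sumSq = 0.0;
        for (int q : mapQs) {
            sumSq += (double) q * q;
        }
        return Math.sqrt(sumSq / mapQs.length);
    }

    // Average of the per-site RMS values. Assumes at least one site.
    static double consensusMq(int[][] perSiteMapQs) {
        double total = 0.0;
        for (int[] site : perSiteMapQs) {
            total += rms(site);
        }
        return total / perSiteMapQs.length;
    }

    public static void main(String[] args) {
        int[][] sites = { {60, 60}, {30, 30} };
        System.out.println(consensusMq(sites));
    }
}
```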
Mauricio Carneiro 6f9264d2b3 Hard Clipping no longer leaves indels on the tails
The clipper could leave an insertion or deletion as the start or end of a read after hardclipping a read if the element adjacent to the clipping point was an indel. Fixed.
2011-08-30 02:44:58 -04:00
Mauricio Carneiro 943876c6eb Added QUAL/MINVAR parameters to the walker 2011-08-30 02:44:46 -04:00
Mauricio Carneiro 7532be7f5a Allow clipping after AlignmentEnd if the end is soft clipped.
Read clipper now identifies and clips even if the requested coordinate is outside the alignment but the read contains soft clipped bases in that region.
2011-08-30 02:44:46 -04:00
Mauricio Carneiro 90a1f5e15c Several bug fixes
* When hard clipping a read that had insertions in it, the insertion was being added to the cigar string's hard clip element. This way, the old UnclippedStart() was being modified and so was the calculation of the new AlignmentStart(). Fixed it by subtracting the number of insertions clipped from the total number of hard clipped bases.
* Walker was sending read instead of filtered read when deleting a read that contains only Q2 bases
* Sliding the window was causing reads that started on the new start position to be entirely clipped.
2011-08-30 02:44:19 -04:00
Mauricio Carneiro 66a8b36cf5 Fixed most indexing bugs
* added bases and quals to consensus
* fixed consensus read cigar generation.
2011-08-30 02:43:41 -04:00
Mark DePristo 3b09d42ed6 Now only prints 1 warning message about duplicate headers in simpleMerge 2011-08-29 14:41:29 -04:00
Mark DePristo 7bf006278d Moved ResolveHostname to general utils as a static function 2011-08-28 12:04:16 -04:00
Mark DePristo ccec0b4d73 AnalyzeCovariates uses the general RScript system now
-- Convenience constructor for collection for testing
-- callRScript() now accepts Objects not Strings, for convenience
2011-08-27 12:54:13 -04:00
Mark DePristo 1ceb020fae UnitTests for RScript 2011-08-27 10:50:05 -04:00
Mark DePristo eef1ac415a Merge branch 'master' into rodTesting
Conflicts:
	public/java/src/org/broadinstitute/sting/gatk/walkers/variantutils/VariantsToTable.java
2011-08-26 00:35:41 -04:00
Mark DePristo e01273ca7c Queue now writes out queueJobReport.pdf
-- General purpose RScript executor in java (please use when invoking RScripts)
-- Removed groupName.  This is now analysisName
-- Explicitly added capability to enable/disable individual QFunction
2011-08-25 16:57:11 -04:00
Eric Banks 09a729da3a Removing incorrect comment 2011-08-25 15:42:52 -04:00
Eric Banks 8bbef79fc2 Create clipped alleles during allele parsing instead of creating a full VC, clipping alleles, and regenerating the VC from scratch. 2011-08-25 15:37:26 -04:00
Ryan Poplin e5008aba00 Output the top two haplotypes as a variant call by running Smith-Waterman alignment against the reference and calling any difference as variation. This is the first version that runs end-to-end by taking in reads as a BAM file and writing out variant calls in VCF. 2011-08-24 15:18:44 -04:00
Roger Zurawicki ac36271457 Fixed extra reads showing up in Variable Sites
Reads that were not hard clipped for the variable site no longer show up in output file
Walker now uses unclippedStart of Read to determine position in the sliding Window
2011-08-23 11:26:00 -04:00
Mauricio Carneiro feeab6075f Merging ReduceReads development with unstable repo
It is time to bring the ReadClipper class to the main repo. Read Clipper has tested functionality for soft and hard clipping reads. I will prepare thorough documentation for it as it will be very useful for the assembler and the GATK in general.
2011-08-22 23:03:03 -04:00
Eric Banks dc42571dd9 Only create the genotype map when necessary 2011-08-22 15:40:36 -04:00
Eric Banks 2c24b68a96 Working implementation of DecodeLoc for VCF parsing. Makes indexing 3x faster. 2011-08-22 15:11:21 -04:00
Eric Banks 518b3dd291 Don't let the genotypes map be null 2011-08-22 15:10:30 -04:00
Guillermo del Angel 4939648fd4 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-20 08:50:43 -04:00
Mark DePristo 8b3cfb2f1c Final documented version of GATKDoclet and associated classes
-- Docs on everything.
-- Feature complete.  At this point only minor improvements and bugfixes are anticipated
2011-08-19 16:52:17 -04:00
Mark DePristo b08d63a6b8 Documentation and code cleanup for ClipReads, CallableLoci, and VariantsToTable
-- Swapped -o [summary] and -ob [bam] for more standard -o [bam] and -os [summary] arguments.
-- @Advanced arguments
2011-08-19 15:06:37 -04:00
Mark DePristo 4d1fd17a97 GATKDoclet cleanup and documentation
-- Fixed bug in the way ArgumentCollections were handled that led to failure in handling the dbsnp argument collection.
2011-08-19 13:13:41 -04:00
Mark DePristo 198955f752 GATKDoc descriptions for all standard codecs, or TODO for their owners
-- Also added vcf.gz support in the VCF codec.  This wasn't committed in the last round, because it was missed by the parallel documentation effort.
2011-08-19 09:57:21 -04:00
Guillermo del Angel 269ed1206c Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-19 09:32:20 -04:00
Mark DePristo 5fbdf968f7 ArgumentSource no longer comparable. Arguments sorted by GATKDoclet 2011-08-18 22:20:14 -04:00
Mark DePristo d1892cd0d7 Bug fixes
-- Sorting of ArgumentSources now done in GATKDoclet, not in the ParsingEngine, as the system depends on the LinkedTreeMap
-- Fixed broken exception throwing in the case where a file's type could not be determined
2011-08-18 21:58:36 -04:00
Mark DePristo c5efb6f40e Usability improvements to GATKDocs
-- ArgumentSources are now sorted by case insensitive names, so arguments are shown in alphabetical order (Ryan)
-- @Advanced annotation can be used to indicate that an argument is an advanced option and should be visually deemphasized in the GATKDocs.  There's now an advanced section.  Mauricio or Ryan -- could you figure out how to make this section less prominent in the style.css?
2011-08-18 21:39:11 -04:00
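A minimal sketch of case-insensitive alphabetical ordering, as described for argument names in the commit above. It sorts plain strings with the standard `String.CASE_INSENSITIVE_ORDER` comparator; the class name and the sample argument names are hypothetical, and the real ArgumentSource type is not shown in this log.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; sorts plain names rather than real ArgumentSource objects.
final class ArgumentSortSketch {
    // Return a new list sorted alphabetically, ignoring case.
    static List<String> sortedCaseInsensitive(List<String> names) {
        List<String> copy = new ArrayList<>(names);
        copy.sort(String.CASE_INSENSITIVE_ORDER);
        return copy;
    }

    public static void main(String[] args) {
        // Sample names are made up for illustration.
        System.out.println(sortedCaseInsensitive(Arrays.asList("out", "Input", "bam")));
    }
}
```

Without the case-insensitive comparator, natural `String` ordering would place every uppercase name before any lowercase one.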
Mark DePristo d94da0b1cf Moved CG and SOAP codecs to private 2011-08-18 21:20:26 -04:00
Mark DePristo f7414e39bc Improvements to GATKDocs
-- Allowed values for RodBinding<T> are displayed in the GATKDocs
-- Longest name up to 30 characters is chosen for main argument list (suggested by Ryan/Mauricio)
-- Features are listed in alphabetical order
-- Moved useful getParameterizedType() function to JVMUtils
-- Tests of these features in the Documentation Test
2011-08-18 21:20:09 -04:00
Mauricio Carneiro 6ef01e40b8 Complete rewrite of Hard Clipping (ReadClipper)
Hard clipping is now completely independent from softclipping and plows through previously hard or soft clipped reads.
2011-08-18 18:35:45 -04:00
Guillermo del Angel 58560a6d50 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-18 16:17:52 -04:00
Guillermo del Angel 3dfb60a46e Fixing up and refactoring usage of indel categories. On a variant context, isInsertion() and isDeletion() are now removed because behavior before was wrong in case of multiallelic sites. Now, methods isSimpleInsertion() and isSimpleDeletion() will return true only if sites are biallelic. For multiallelic sites, isComplex() will return true in all cases.
VariantEval module CountVariants is corrected and an additional column is added so that we log mixed events and complex indels separately (before they were being conflated).
VariantEval module IndelStatistics is considerably simplified, as the sample stratification was wrong and redundant; it should now work with the VE-generic Sample stratification. Several columns are renamed or removed since they're not really useful.
2011-08-18 16:17:38 -04:00
Mark DePristo faa3f8b6f6 Only concrete classes are now documented 2011-08-18 14:04:47 -04:00
Mark DePristo 5772766dd5 Improvements to GATKDocs
-- Now supports a static list of root classes / interfaces that should receive docs.  A complementary approach to documenting features to the DocumentedGATKFeature annotation
-- Tribble codecs are now documented!
-- No longer displays sub- and superclasses
2011-08-18 14:00:09 -04:00
Mark DePristo e03db30ca0 Now uses DocumentedGATKFeatureObject instead of annotation directly
-- Step 1 on the way to creating a static list of additional classes that we want to document.
2011-08-18 12:31:04 -04:00
Mark DePristo cbec69a130 Merge branch 'master' into help
Conflicts:
	public/java/src/org/broadinstitute/sting/utils/help/HelpUtils.java
2011-08-18 11:33:27 -04:00
Mark DePristo f5d7cabb20 Fix for reintroducing an already solved problem. 2011-08-18 11:20:12 -04:00