277 Commits (edf852d47dc74823201102ac7387c2f6eec3659c)

Author SHA1 Message Date
Mauricio Carneiro ff2f4df043 Fixed hardclipping inside indel (right tail)
When hard clipping the right tail of a read falls inside a deletion, clipping should fall back to the last base before the deletion to follow the ReadClipper's contract.
2011-09-28 16:07:34 -04:00
Mauricio Carneiro 3c7b7f74ef Optimized interval iteration
Using a TreeSet to manipulate getToolkit.getIntervals() and being smart about which intervals to test makes interval clipping O(1) instead of O(n).
2011-09-28 16:07:34 -04:00
Mauricio Carneiro 5c9b659c02 clipping both ends of the reads was modifying the original read
This goes against the ReadClipper contract, and was affecting the second part of the read that spans over multiple intervals. Fixed.
2011-09-28 16:07:34 -04:00
Mark DePristo 5812004e06 Merge branch 'stable' 2011-09-28 11:36:40 -04:00
Mark DePristo a5006831d7 Shows "" not empty space when default string value is "" 2011-09-28 11:35:52 -04:00
Mark DePristo 1e32281a15 Fix to not show -null when missing short name argument 2011-09-28 11:31:20 -04:00
Mauricio Carneiro 89544c209c Fixing contracts
Changed return type to Pair and updated contracts accordingly.
2011-09-28 11:19:17 -04:00
Eric Banks 232a6df11c Add longhand form to the error message. 2011-09-27 20:29:31 -04:00
Eric Banks 1d6fcb6eb1 Revert "Add longhand form to the error message to prevent users from posting borderline dumb posts to GS."
This reverts commit 75b2600527cfce05ae683cb394290ff2a80e8552.
2011-09-27 20:27:00 -04:00
Eric Banks 269b9826b6 Add longhand form to the error message to prevent users from posting borderline dumb posts to GS. 2011-09-27 20:26:36 -04:00
Mauricio Carneiro 3b6e43b7c4 Use reads that span multiple intervals
* RR will now compress reads that span across multiple intervals correctly and output them in the correct order.
* Fixed bug in getReadCoordinateForReferenceCoordinate where, if the requested reference coordinate fell inside a deletion in the read, the read would be clipped up to one element past the deletion.
2011-09-27 18:39:06 -04:00
Khalid Shakir 84bd355690 Merged bug fix from Stable into Unstable 2011-09-27 14:34:39 -04:00
Khalid Shakir b090751f62 Fixed Ant / PluginManager issue where reflections was picking up all class files under current working directory due to "." in jar manifest classpaths.
Updates to HybridSelectionPipeline:
- Added annotations back via snpEff
- Minor updates to VQSR paths and lowered memory
2011-09-27 14:33:57 -04:00
Mark DePristo 4f09453470 Refactored reduced read utilities
-- UnitTests for key functions on reduced reads
-- PileupElement calls static functions in ReadUtils
-- Simple routine that takes a reduced read and fills in its quals with its reduced qual
2011-09-26 12:58:31 -04:00
Mauricio Carneiro b76dbc72f0 Fixed interval navigation bug.
If a read was hard clipped away from the current interval, all subsequent reads within that interval (not hardclipped) would be filtered out. Fixed.
2011-09-26 08:13:44 -04:00
Guillermo del Angel 9afccd11b1 Minor refactoring: add the ability for MathUtils.normalizeFromLog10 to stay out of the linear domain and just subtract the max value from the log values and return. Use this function in SNP and indel GL computation. 2011-09-25 21:18:56 -04:00
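As an illustration of the idea in the commit above, here is a minimal Python sketch of log10 normalization with an option to stay in log space. It is not the GATK code: the function name `normalize_from_log10` and the `keep_in_log_space` flag are hypothetical, chosen only to mirror the commit message.

```python
def normalize_from_log10(log10_values, keep_in_log_space=False):
    """Normalize a list of log10-scaled values.

    Hypothetical sketch, not the MathUtils implementation.
    With keep_in_log_space=True, only the maximum is subtracted, so the
    largest entry becomes 0.0 and everything stays in log10 space.
    Otherwise the shifted values are exponentiated and scaled to sum to 1.
    """
    if not log10_values:
        raise ValueError("need at least one value")

    # Subtracting the max avoids underflow when exponentiating very
    # negative log10 likelihoods.
    max_value = max(log10_values)
    shifted = [v - max_value for v in log10_values]
    if keep_in_log_space:
        return shifted

    linear = [10.0 ** v for v in shifted]
    total = sum(linear)
    return [x / total for x in linear]
```

Staying in log space keeps the relative differences between values intact, which is all that is needed when the result is later converted to a Phred-style scale.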
Guillermo del Angel 3eef800889 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-09-24 21:20:11 -04:00
Guillermo del Angel 203517fbb7 a) Cleanups/bug fixes to previous commit to CombineVariants.
b) Change md5 to reflect records that are now merged correctly.
c) Change unit merge alleles test to reflect the fact that a null non-variant vc object is not valid and not supported, because there's no way to codify such an object in a VCF. The code correctly converts this to a non-variant single-base event with whatever the reference is at that location.
2011-09-24 19:08:00 -04:00
Mauricio Carneiro c31f4cb2f6 Cleaning leading insertions
With the current implementation, a read cannot start with a deletion or an insertion. Maybe this will change in the future, but for now, chop the leading insertion off.
2011-09-24 14:33:32 -04:00
Guillermo del Angel cd058dd10f a) Fixed md5 for legit change in UG output that now also no-calls genotypes w/0,0,0 in PL's in SNP case.
b) First reimplementation of the new merger for vc's of different types. The previous version did it in two steps: first merging all vc's per type, then checking whether the resulting vc's could be merged when the alleles of one type were a subset of another. That doesn't work when uniquifying genotypes, since sample names would be messed up and GT sample names wouldn't match VC sample names. Now it's actually simpler: when splitting vc's by type before merging, we check whether the alleles of one vc are a subset of the alleles of a vc of another type and, if so, put them together in the same list.
2011-09-24 13:40:11 -04:00
Mark DePristo bb11951255 Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-09-24 09:26:45 -04:00
Mark DePristo 8d9e136bba Merge branch 'stable' 2011-09-24 09:26:28 -04:00
Mark DePristo 92acff46e5 Moved Haplotype into Utils root 2011-09-24 09:14:05 -04:00
Mark DePristo f792353dcd Framework for genotype unit test 2011-09-24 08:56:45 -04:00
Mark DePristo c0bb0cb465 Make DiploidGenotype enum private to walkers.genotyper 2011-09-24 08:48:33 -04:00
Guillermo del Angel 3a4469a236 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-09-23 21:58:34 -04:00
Guillermo del Angel 0e74cc3c74 a) Treat SNP genotype likelihoods just as indels, in the sense that they're always normalized as PL's so one of them will always be zero. This creates minor numerical differences in Qual and annotations due to numerical approximations in AF computation.
b) Intermediate CombineVariants fixes, not ready yet
2011-09-23 21:58:20 -04:00
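To make the "one of them will always be zero" remark concrete, here is a small Python sketch of turning log10 genotype likelihoods into normalized PLs. The helper name `likelihoods_to_pls` is hypothetical, and the rounding shown is an assumption about the usual Phred-scaling convention, not a copy of the UnifiedGenotyper code.

```python
def likelihoods_to_pls(log10_likelihoods):
    """Convert log10 genotype likelihoods to normalized PL values.

    Hypothetical sketch: each likelihood is expressed relative to the
    most likely genotype and Phred-scaled (-10 * log10), so the best
    genotype always gets PL 0.
    """
    best = max(log10_likelihoods)
    return [int(round(-10.0 * (v - best))) for v in log10_likelihoods]


# Example: hom-ref is the most likely genotype, so its PL is 0.
print(likelihoods_to_pls([-0.5, -3.2, -7.0]))
```

Because of the integer rounding, values computed from the normalized PLs can differ slightly from those computed from the raw likelihoods, which is consistent with the minor numerical differences the commit mentions.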
Mauricio Carneiro 7cac75ae1d Merged bug fix from Stable into Unstable 2011-09-23 19:00:43 -04:00
Mauricio Carneiro fbe3c1e0b3 Adding warning on HardClipping
Hard Clipping is still under heavy development and should not be used by anyone less prepared than MacGyver.
2011-09-23 19:00:19 -04:00
Mauricio Carneiro 1a45c331b2 bringing the latest bug fixes to Reduce Reads 2011-09-23 16:40:06 -04:00
Mauricio Carneiro 9ea40f2e41 Deletions/Insertions in hard clip and bug fixes
* Deletions now count as hard clipped bases in order to recover the original alignment start of a clipped read.
* Insertions do not count as hard clipped bases for the same reason.
* This created a bug in the previous cigar cleaning function. Fixed.
2011-09-23 16:37:08 -04:00
Mark DePristo 27ce3c822e Merge branch 'stable' 2011-09-23 09:04:52 -04:00
Mark DePristo dfce301beb Looks for @Hidden annotation on all classes and excludes them from the docs 2011-09-23 09:03:04 -04:00
Mark DePristo 4397ce8653 Moved removePLs to VariantContextUtils 2011-09-23 08:24:20 -04:00
Mark DePristo c49cc623de Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-09-22 17:26:21 -04:00
Mark DePristo 5cf82f9236 simpleMerge UnitTest tests filtered VC merging 2011-09-22 17:05:12 -04:00
Mauricio Carneiro 96c875399c Merging many bug fixes to reduce reads 2011-09-22 17:04:11 -04:00
Mauricio Carneiro 39b54211d0 Fixed hard clipping soft clipped bases after hard clips
If soft clipped bases were after a hard clipped section of the read, the hard clip was clipping the left soft clip tail as if it were a right tail. Mayhem.
2011-09-22 15:46:55 -04:00
Mauricio Carneiro 1acf7945c5 Fixed hard clipped cigar and alignment start
* Hard clipped Cigar now includes all insertions that were hard clipped and not the deletions.
* The alignment start is now recalculated according to the new hard clipped cigar representation
2011-09-22 14:51:14 -04:00
Mauricio Carneiro 4e9020c9f7 Fixed alignment start for hard clipping insertions 2011-09-22 13:28:25 -04:00
Mark DePristo ba5f83fee2 start of VariantContextUtils UnitTest
-- tests rsID merging
2011-09-22 12:10:39 -04:00
Mark DePristo 93dd1faa5f Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-09-22 11:20:10 -04:00
Mark DePristo a05c959e5a Empty unit tests for VariantContextUtils
-- will be expanded over the day
2011-09-22 11:20:07 -04:00
Christopher Hartl 4f4a0fc38a Merge branch 'master' of ssh://gsa2/humgen/gsa-scr1/chartl/dev/git 2011-09-22 11:01:58 -04:00
Christopher Hartl 982c47bfa7 Remove duplicate effort in ReadUtils (with apologies to Mauricio)
Big (but not major) cleanup of code in ILG - mostly excising the old likelihood model
Activated the early-abort check for ILG. I think it should be better this way.
2011-09-22 10:58:26 -04:00
Eric Banks 8f8b59a932 My interpretation of the VCF spec is that the FORMAT field should only be present if there is genotype/sample data. So the VCFCodec now throws an exception when it encounters such a case. I had to fix one of the integration test VCFs. 2011-09-21 22:23:28 -04:00
Christopher Hartl dc96f6da79 Merge branch 'master' of ssh://chartl@gsa2/humgen/gsa-scr1/chartl/dev/git 2011-09-21 18:18:41 -04:00
Christopher Hartl f9cdc119af Added a method to ReadUtils that converts reads of the form 10S20M10S to 40M (just unclips the soft-clips).
Be careful when using this - if you're writing a bam file it will be potentially written out of order (since the previous alignment start was at the M, not the S).
2011-09-21 18:16:42 -04:00
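The soft-clip conversion described above can be sketched in Python on plain CIGAR strings. This is only an illustration: `unclip_soft_clips` is a hypothetical helper, not the ReadUtils method, and the assumption that the alignment start moves back by the length of the leading soft clip is inferred from the commit's warning about output order.

```python
import re

# One CIGAR element: a length followed by an operator from the SAM spec.
CIGAR_ELEMENT = re.compile(r"(\d+)([MIDNSHP=X])")


def unclip_soft_clips(cigar, alignment_start):
    """Turn soft clips (S) into matches (M), e.g. 10S20M10S -> 40M.

    Hypothetical sketch. Returns the new CIGAR string and the new
    alignment start, shifted left by the leading soft clip length.
    """
    elements = [(int(n), op) for n, op in CIGAR_ELEMENT.findall(cigar)]

    # Length of the leading soft clip, skipping any hard clips before it.
    leading = 0
    for length, op in elements:
        if op == "H":
            continue
        if op == "S":
            leading = length
        break

    # Relabel S as M, then merge neighbouring elements with the same operator.
    merged = []
    for length, op in elements:
        op = "M" if op == "S" else op
        if merged and merged[-1][1] == op:
            merged[-1] = (merged[-1][0] + length, op)
        else:
            merged.append((length, op))

    new_cigar = "".join(f"{length}{op}" for length, op in merged)
    return new_cigar, alignment_start - leading
```

Under that assumption, a read with CIGAR 10S20M10S starting at position 100 becomes 40M starting at position 90, so it may now sort before reads that previously preceded it. That is why a BAM written from unclipped reads can end up out of order.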
Mauricio Carneiro 96768c8a18 Sending latest bug fixes to Reduce Reads to the main repository 2011-09-21 17:43:11 -04:00
Mauricio Carneiro 70335b2b0a Hard clipping soft clipped reads to fix misalignments.
Pre-softclipped reads (with high qual) are a complicated event to deal with in the Reduced Reads environment. I chose to hard clip them out for now and added a todo item to bring them back on in the future, perhaps as a variant region.
2011-09-21 17:12:01 -04:00