ebanks
c6ad26e04f
1) When quals/GQs are really integers (x.00), strip off the floating points.
...
2) Keep track of whether vcf records are unfiltered vs. pass filters in the variant context so we can regenerate the records on output.
3) No more "ID" hard-coded all over the code to set the VariantContext ID. Use a static variable instead.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3840 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-20 18:01:45 +00:00
ebanks
f742980864
1. Refactoring of GenoypeWriters so that parallelization now works again with VCF4.0. We now have just a single reference to the old VCF classes, and that one will be purged soon.
...
2. Moved Jared's VCFTool code into archive so that everything would compile.
3. Added the vcf reference base (needed for indels) as an attribute to the VariantContext from the reader.
4. TribbleRMDTrackBuilderUnitTest was complaining that a validation file didn'r exist, so I commented it out.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3835 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-20 06:16:45 +00:00
depristo
c47a5ff5ab
Official parallel CountCovariates, passes all integration tests. Now poster-child example of parallelism in GATK (Matt H). Apparent general performance improvements throughout too.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3833 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-19 22:13:18 +00:00
rpoplin
0b56003d1a
Remove stray commented out line
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3832 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-19 19:14:39 +00:00
rpoplin
8e31c01680
Solid processing in base quality recalibrator now has several options for how to handle no calls in the color space. --ignore_nocall_colorspace is removed and replace by --solid_nocall_strategy. Fixed some of the @Deprecated tags in BaseUtils. LocusWalkers now filter out FailsVendorQualityCheck reads. HLA caller integration test bam file had bad vendor reads so its integration test changed.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3831 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-19 19:10:29 +00:00
aaron
18b0114e25
remove FixBAMSortOrder walker.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3830 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-19 17:27:23 +00:00
aaron
f4cfb0f990
The first step in integrating Jim's tree based index scheme:
...
- changed to a better method for getting headers from Codecs
- some removal of old commented out code in the GATKAgrumentCollection
- changes for the rename of FeatureReader to FeatureSource
- removed the old Beagle ROD
- cleaned up some of the code in SampleUtils
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3826 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-19 04:49:27 +00:00
depristo
7c42e6994f
FindBugs fixes throughout the code base
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3823 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-18 16:29:59 +00:00
ebanks
693672a461
Refactoring the VCF writer code; now no longer uses VCFRecord or any of its related classes, instead writing directly to the writer. Integration tests pass, but some are actually broken and will be fixed this week.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3822 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-18 13:19:56 +00:00
depristo
14b21e487b
always 4.0
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3817 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-17 22:28:48 +00:00
hanna
9207c58b8f
A fix for the integration test I broke on Friday on my way out the door --
...
some workflows using AlignmentContext were working with it in a way I didn't
expect and wound up treating extended pileups as base pileups. I'll work to
make sure the AlignmentContext interface is crystal clear.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3815 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-17 22:22:44 +00:00
asivache
6aedede7f3
Added Type.MNP to allowed variant context types; this does not break the tests (yet)
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3808 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-16 15:50:25 +00:00
depristo
b29eda83bb
Parallelized CountCovarites! percent_ref_called_var now a standard genotype concordance module (for validation!). Really much smarter merging of headers for combineVariants. VCF codecs now actually look at the file version and blow up if they are the wrong versions. setHeaderVersion() in VCFHeaderLine.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3802 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-16 14:10:18 +00:00
ebanks
f293eb7de1
Fix for Kim: for some ungodly reason, I was initializing the bins that were maintaining counts to 1 instead of 0.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3801 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-16 03:40:29 +00:00
ebanks
e7e58d7129
The SAM spec has now officially reserved my new tags for original cigar and original alignment start... except that OS has been named OP ('original POS')
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3800 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-16 00:09:36 +00:00
ebanks
ab84ed8c68
Fix for Mark: get rid of old program tags whose IDs clash with the recalibrator/realigner tag (including if the id has a .1 at the end, etc.). Keeping them around is dangerous because we don't know which one refers to the latest run of the tool on the bam.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3798 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-15 19:13:50 +00:00
ebanks
78a4d8ec3d
Removing more references to VCFRecord
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3790 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-14 16:34:15 +00:00
ebanks
460283f6d2
No more manually converting VariantContexts to VCFRecords. You should be utilizing VCs and not VCFRecords.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3787 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-14 05:21:28 +00:00
ebanks
6b5c88d4d6
The GATK no longer writes vcf3.3; welcome to the world of vcf4.0. Needed to fix a few output bugs to get this to work, but it's looking great. Much more still to come. Guillermo: hopefully this doesn't break your local build too badly.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3786 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-14 04:56:58 +00:00
chartl
9d2a485532
Update to AminoAcidTransition eval module
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3783 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-13 17:12:03 +00:00
ebanks
d896d03554
Moving VF to vcf 4.0. Still need to fix genotype filters.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3777 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-13 11:39:51 +00:00
ebanks
1bef7dd170
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3775 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-13 00:56:12 +00:00
ebanks
52c534a8f2
Updating to VCF 4.0
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3770 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-12 20:18:30 +00:00
ebanks
e50627a49e
1. Updated tests and added integration test for liftover code.
...
2. Updated liftover code (and scripts) to emit vcf 4.0 and no longer depend on VCFRecord.
3. Beagle walker now also emits vcf 4.0.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3767 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-12 17:58:18 +00:00
ebanks
2a7112302a
More archiving
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3766 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-12 17:04:41 +00:00
ebanks
221e01fb27
deleting/archiving as instructed
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3765 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-12 16:59:45 +00:00
ebanks
8086ab1f75
Pulled sample/header merging routines out of CombineVariants and into util classes. Added more generalized methods for retrieving samples. Updated the Beagle walkers to use these methods.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3764 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-12 16:51:54 +00:00
ebanks
fb717fe128
First pass needed to remove old VCF code: moving all VCF-related constants into a single unified class
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3759 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-11 07:19:16 +00:00
ebanks
6b960bd9c5
Fix for Steve: genotype filters still want to see the values from the VC
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3758 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-11 04:30:15 +00:00
depristo
c3c66e853c
Improvements for Jason
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3756 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-09 20:18:37 +00:00
ebanks
405be230d0
Various code improvements based on FindBugs
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3755 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-09 15:04:48 +00:00
chartl
ea8fd506bf
Update to PickSequenomProbes: Option to ignore mask sites within X bp of a variant (very useful for indels where dbSNP entries near the indel are almost always false SNP calls). Also fixed an integration test where the variant site itself, being in dbSNP, was represented as [N/C] rather than [A/C]. Added integration test for 1bp no-mask window.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3753 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-09 04:03:19 +00:00
depristo
45fb614296
Fixes to VE for obscure bug, as well as disabled integration test for CombineVariants
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3749 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-09 00:13:07 +00:00
rpoplin
67f1589652
--fdr_filter_level isn't mandatory
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3748 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-08 22:48:30 +00:00
rpoplin
5d39cd5db8
Added --fdr_filter_level to ApplyVariantCuts so that you can create beautiful tranche plots and also decide which tranche level to filter at. The previous version always filtered at the smallest tranche. The tranche filter names are appropriately added to the VCF header.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3747 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-08 22:44:10 +00:00
depristo
760aaeda88
Update to CombineVariants. Now splits merge options into variant and genotype options separately.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3746 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-08 20:09:48 +00:00
depristo
56a0c7ee6f
All headers are now converted to VCF4 by default.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3741 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-08 14:14:17 +00:00
ebanks
ed0d0d78fa
corresponding fix for dealing with insertions
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3739 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-08 05:25:03 +00:00
ebanks
9a81f1d7ef
Fixed this tool for chartl so that it now properly handles deletions. Added deletion case to integration tests.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3737 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-08 04:45:59 +00:00
ebanks
4bc3ad2194
Shame on me: UG was emitting negative QUALs (-0) in all_bases mode. Thanks, Matt.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3732 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-07 20:30:22 +00:00
ebanks
30714ec8d9
As per quick chat with Richard Durban, don't increase the mapping quality of realigned reads too much; for now, arbitrarily increase the MQ by 10. We need to figure out a better solution.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3731 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-07 20:12:59 +00:00
ebanks
8ff1a4b929
Don't try to clean reads that fail the PF, in preparation for Ryan
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3730 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-07 19:49:36 +00:00
depristo
b934cc7554
Updates to fix some bugs in merger. Now able to merge into project wide indel VCF files. Integration teests coming tomorrow
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3727 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-07 03:16:33 +00:00
hanna
773a72e6ea
An initial fix for performance issues when filtering UG with new StratifiedAlignmentContext.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3724 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-07 01:07:46 +00:00
aaron
86031f4034
part two: todo's in combine variants, fixes for InferredGeneticContext, and some other tests and clean-up.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3721 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-05 21:07:53 +00:00
ebanks
36edc60ccc
Connected UG to the new comp track annotation system in VA. Also, when emit confidence is lower than call confidence (so that we emit records filtered with LowQual), add a corresponding FILTER header field to the VCF so that the validator doesn't complain.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3720 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-05 13:04:24 +00:00
aaron
3347d1ca7c
part one of combining format and info header lines code into a single abstract class for Mark; plus some 'm' removals from access methods for Eric. Adding fixes for CombineVariants next.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3719 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-05 05:57:58 +00:00
ebanks
07945040f8
Set VariantFiltration's JEXL engine to silent for warning messages
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3716 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-04 18:11:19 +00:00
depristo
cd2e4b0a1e
merging now very close to working. Bug todo in writer and vcf infrastructure. Can almost create merged snp and indel files
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3712 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-02 20:09:25 +00:00
depristo
61e2b2e39b
Nearly finalize merging capabilities for CombineVariants. Support for dealing with inconsistent indel alleles at loci. Improvements to Allele and removal of addAllele to MutableGenotype. We are close to being able to merge all of 1000 genomes -- snps and indels -- into a single combined vcf
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3710 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-02 13:32:33 +00:00