caveat: Right now bwa only supports one read group, so if the original file had multiple @RG lines, only the first one will be kept. (working on a solution to this)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5931 348d0f76-0448-11de-a6fe-93d51630548a
Renamed it and moved to core. Happy to support it.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5930 348d0f76-0448-11de-a6fe-93d51630548a
ReplicationValidationWalker: Just the skeleton of what will be the implementation of the replication/validation model.
dataProcessingV2: Committing an UNTESTED implementation of BWA alignment. I am running tests on it over the weekend.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5900 348d0f76-0448-11de-a6fe-93d51630548a
Added GridEngine to pipeline tests.
Removed passing -jobProject since GridEngine projects must be predefined.
Writing the HybridSelectionPipelineTest yaml into the temp directory.
Disabled job priority as it needs to be refactored for use by GridEngine and LSF.
Fixed WholeGenomePipeline variantmergeoption rename to filteredRecordsMergeType.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5872 348d0f76-0448-11de-a6fe-93d51630548a
Reviewed pipelines with dev team.
HSP updates:
- Calling SNPs and Indels at the same time then using SelectVariants to separate them for filtering
- Moved logs next to the files like in WGP
- Flattened outputs into one directory
- The file names for the final outputs are now <projectName>.vcf and <projectName>.eval
- Updated test to pass the chr20 intervals instead of a boolean
- Removed MultiFCP
WGP updates:
- Only cleaning and calling chromosomes 1-22, X, Y, MT
- Splitting SNPs from indels, filtering indels, then merging the selected SNPs and selected Indels back together to make sure there are no collisions in CombineVariants
- Still running VQSR on the recombined SNPs plus hard filtered indels
- Using hard indel filters from delangel
- Reduced number of tranches with rpoplin
- Changed prior for dbsnp from 10 to 8 with rpoplin
- Assuming identical samples on both CombineVariants
- Explicitly using variant merge option UNION even though it's the default
- Not setting the default genotype merge option PRIORITIZE
- Generating a vcf and eval for each tranche
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5825 348d0f76-0448-11de-a6fe-93d51630548a
Removed job priority as temp space isn't as tight at the moment and planning on changing the priority interface.
Updated chunk calling with ebanks:
- Using "the bundle" of resources.
- Using dbsnp 132 and 1000G indel RODs for both RTC & IR.
- Using the default maxIntervalSize in RTC.
- Removed use of UG.exactCalculation argument.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5814 348d0f76-0448-11de-a6fe-93d51630548a
Using hapmap training and truth based on wiki.
Explicitly setting the ts_filter_level even though 99.0 is the default.
Recal file path now ends with with .recal.
Added ar's vcf input.
Omni rod name now omni instead of 1kg.
The VR RodBind tags had spaces in them.
Was passing both the full intervals and the chunk intervals to chunk jobs.
Switched back to chr20 for default since the VR crashes on small intervals sets with "MESSAGE: Matrix is singular."
Log files names based on the file paths + .out.
Added eval statifications by sample based on the Hybrid Selection / Whole Exome pipeline.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5800 348d0f76-0448-11de-a6fe-93d51630548a
Hardcoded the reference and dbsnp since the training rods are also hardcoded, for now.
Changed freeze/chr20 to wg/chr20/cent1 to also test the heaviest known shard.
Other cleanup.
TODO: Memory command line options or have the script figure it out using FLS or similar.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5799 348d0f76-0448-11de-a6fe-93d51630548a
Also added the old model of indel calling to the FCP.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5749 348d0f76-0448-11de-a6fe-93d51630548a