David Roazen
3c9497788e
Reorganized the codebase beneath top-level public and private directories,
removing the playground and oneoffprojects directories in the process. Updated
build.xml accordingly.
2011-06-28 06:55:19 -04:00
carneiro
b46279d62e
required RODs are now checked by annotations.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6080 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-25 06:38:19 +00:00
ebanks
3879b02cdd
updating a package
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6079 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-24 20:13:28 +00:00
ebanks
86aa82caf8
Missed this integration test during my move of VC from Tribble
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6078 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-24 20:07:25 +00:00
ebanks
c2ec2891d1
Other people besides Mark also wanted VariantContext moved to the GATK, so I listened. I am moving VariantContext and all codecs that rely on it (VCF, SoapSNP, HapMap, and CGvar) to the GATK - including relevant unit tests and data files. Additionally, Matt has modified build.xml to generate the necessary jar files so that people can use our VCF codec with Tribble.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6077 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-24 16:56:04 +00:00
carneiro
be123d1399
missed a check for null on sampleNames. Fixed.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6076 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-23 22:42:00 +00:00
carneiro
9c1b8ea796
Updated BQSR script to be more general and work with the new PacBio BAM files - for Kristian Cibulskis
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6075 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-23 21:05:28 +00:00
carneiro
087a25d9e3
quick memory upgrade to BWA classes.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6074 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-23 20:53:32 +00:00
carneiro
fbe157137f
removing the old processing pipeline.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6073 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-23 20:19:13 +00:00
carneiro
91fb664135
Many updates to SelectVariants:
1) There is now a separate parameter for sample name (-sn), sample file (-sf) and sample expression (-se). The unexpected behavior of the previous implementation was too tricky to leave unchecked: if you had a file or directory named after a sample name, SV wouldn't work.
1b) Fixed a TODO added by Eric -- now the output vcf always has the samples sorted alphabetically regardless of input (this came as a byproduct of the implementation of 1)
2) Discordance and Concordance now work in combination with all other parameters.
3) Discordance now follows Guillermo's suggestion where the discordance track is your VCF and the variant track is the one you are comparing to. I have updated the example in the wiki to reflect this change in interpretation.
4) If you DON'T provide any samples (-sn, -se or -sf), SelectVariants works with all samples from the VCF and ignores sample/genotype information when doing concordance or discordance. That is, it will report every "missing line" or "concordant line" in the two vcfs, regardless of sample or genotype information.
5) When samples are provided (-sn, -se or -sf), discordance and concordance go down to the genotypes to determine whether or not you have a discordance/concordance event. In this case, concordance happens only when the two VCFs display the same sample/genotype information for that locus, and discordance happens when the disc track is missing the line or has different genotype information for that sample.
6) When dealing with multiple samples, concordance only happens if ALL your samples agree, and discordance happens if AT LEAST ONE of your samples disagrees.
---
Integration tests:
1) Discordance and concordance test added
2) All other tests updated to comply with the new 'sorted output' format and different inputs for samples.
---
Methods for handling sample expressions and files with list of samples were added to SampleUtils. I recommend *NOT USING* the old getSamplesFromCommandLineInput as this mixing of sample names with expressions and files creates a rogue error that can be challenging to catch.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6072 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-23 20:18:45 +00:00
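The concordance and discordance rules in items 4 to 6 of the SelectVariants commit above can be sketched as follows. This is a minimal Python illustration, not the GATK implementation: the function names, the dict-based genotype representation, and the choice to treat a sample absent from either track as a disagreement are all assumptions made for the example.

```python
from typing import Dict, Iterable, Optional

# Hypothetical representation: sample name -> genotype string such as "0/1".
Genotypes = Dict[str, str]


def is_concordant(variant: Genotypes,
                  comp: Optional[Genotypes],
                  samples: Iterable[str] = ()) -> bool:
    """Return True when the comparison track agrees with the variant track.

    comp is None when the comparison track has no line at this locus.
    """
    if comp is None:
        # A missing line can never be concordant.
        return False
    selected = list(samples)
    if not selected:
        # No samples given: the line being present in both tracks is enough,
        # and genotype information is ignored.
        return True
    # Samples given: ALL of them must show the same genotype in both tracks.
    # Assumption: a sample absent from either track counts as a disagreement.
    return all(
        s in variant and s in comp and variant[s] == comp[s]
        for s in selected
    )


def is_discordant(variant: Genotypes,
                  comp: Optional[Genotypes],
                  samples: Iterable[str] = ()) -> bool:
    """Discordance is the complement: a missing line, or AT LEAST ONE
    selected sample whose genotype differs."""
    return not is_concordant(variant, comp, samples)
```

Under these assumptions a locus is always exactly one of concordant or discordant, which matches the "ALL agree" versus "AT LEAST ONE disagrees" wording of item 6.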
droazen
48055d45cb
Added support for PICARD functions to QUEUE after following Khalid's pointers on where to do it. I have added the 6 functions used by the Data Processing Pipeline, but from now on it should be a matter of seconds to copy/paste and create bindings to more functions.
Updated the Data Processing Pipeline to use the new Picard classes and reorganized the pre-processing of the pipeline accordingly.
Will only update the wiki once this change goes live.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6071 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:56:14 +00:00
droazen
658e65d26c
2 unrelated changes: 1) fix the variant context adaptor for dbsnp; conversion of deletions was totally broken. 2) stop using paths that include gsa-scr1 in integration tests.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6070 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:56:07 +00:00
droazen
d92055d1f9
Checkpointing some bugfixes with zero-length version directories and missing
Picard metrics files before the push back into svn.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6069 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:56:01 +00:00
droazen
171e20a111
Updated the tribble jar to revision 351
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6068 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:55:57 +00:00
droazen
3d27e5eb98
Default operating parameters in addition to the parameterized Rscript version.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6067 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:55:53 +00:00
droazen
0d07c979e9
added comments on how to use this very useful script!
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6066 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:55:50 +00:00
droazen
ab1de3bfda
Updated the tribble jar to revision 350
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6065 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:55:46 +00:00
droazen
c8124496d0
now with the new 'consensus model' parameter to the cleaner.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6064 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:55:42 +00:00
droazen
c956e154a0
Kill silly plots.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6063 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:55:39 +00:00
droazen
772291c38f
Error model is now built by lane and each pool is called separately.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6062 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:55:36 +00:00
droazen
28d8b28bdf
Density plots.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6061 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:55:33 +00:00
droazen
d323ef0461
As promised, VariantFiltration can now mask out sites within a user-specified window around the provided mask rod. By default the window is 0, but you can now use the --maskExtension argument to increase that value. Added integration tests to cover this new functionality.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6060 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:55:29 +00:00
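The window masking described in the VariantFiltration commit above can be pictured with a small sketch. Only the --maskExtension argument and its default of 0 come from the commit message; the function name, the tuple-based intervals, and the inclusive single-contig coordinates are assumptions for illustration, and the actual GATK logic may differ.

```python
from typing import Iterable, Tuple

# Hypothetical representation: (start, end) positions on one contig,
# both ends inclusive.
Interval = Tuple[int, int]


def is_masked(pos: int,
              mask_intervals: Iterable[Interval],
              mask_extension: int = 0) -> bool:
    """Return True if pos lies within mask_extension bases of any mask
    interval. With the default of 0, only sites inside a mask interval
    are masked."""
    if mask_extension < 0:
        raise ValueError("mask_extension must be non-negative")
    return any(
        start - mask_extension <= pos <= end + mask_extension
        for start, end in mask_intervals
    )
```

Widening each interval by the extension on both sides keeps the default case (extension 0) identical to a plain overlap check.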
droazen
ea47ccf032
Implemented HET case with binomial distribution. Separated events from normal events and for now skip all extended events.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6059 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:55:24 +00:00
droazen
26d837f59e
Factorial and log Factorial utilities avoiding overflow using the gamma function. Lots of unit tests. Everything is working great.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6058 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:55:20 +00:00
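The overflow-avoiding factorial mentioned in the commit above relies on the identity n! = Gamma(n + 1), so that log(n!) can be computed directly without ever forming n!. The following is a minimal Python sketch of that idea, not the committed Java code; the function name is hypothetical, and the choice of log10 is an assumption taken from the neighbouring commits.

```python
import math


def log10_factorial(n: int) -> float:
    """Return log10(n!) using ln(Gamma(n + 1)) = ln(n!).

    Working in log space keeps the result finite even when n! itself
    would overflow a floating-point number.
    """
    if n < 0:
        raise ValueError("factorial is undefined for negative n")
    # math.lgamma returns the natural log; divide by ln(10) to convert.
    return math.lgamma(n + 1) / math.log(10)
```

For small n the value can be checked against math.log10(math.factorial(n)); for large n only the log-space form stays representable as a float.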
droazen
8d5b4af8ca
Binomial and Multinomial interfaces for probability and coefficients in log and real space. Passed all unit tests.
BinomialCumulativeProbability was reformatted to follow the now standard parameter order.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6057 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:55:15 +00:00
droazen
4abb7c424b
implementation of the Gamma function and log10 Binomial / Multinomial coefficients. Unit tests for gamma and binomial passed with honors.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6056 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:55:09 +00:00
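The log10 binomial and multinomial coefficients named in the commit above follow from the log-gamma function: log10 C(n, k) = log10 n! - log10 k! - log10 (n - k)!, and the multinomial case subtracts one term per category. The sketch below is a Python illustration under that reading; the function names and signatures are hypothetical and do not mirror the committed Java code.

```python
import math
from typing import Sequence


def log10_factorial(n: int) -> float:
    """log10(n!) via the log-gamma function, since Gamma(n + 1) = n!."""
    return math.lgamma(n + 1) / math.log(10)


def log10_binomial_coefficient(n: int, k: int) -> float:
    """log10 of 'n choose k'."""
    if not 0 <= k <= n:
        raise ValueError("require 0 <= k <= n")
    return log10_factorial(n) - log10_factorial(k) - log10_factorial(n - k)


def log10_multinomial_coefficient(counts: Sequence[int]) -> float:
    """log10 of n! / (k_1! * ... * k_m!), where n is the sum of counts."""
    if any(k < 0 for k in counts):
        raise ValueError("counts must be non-negative")
    n = sum(counts)
    return log10_factorial(n) - sum(log10_factorial(k) for k in counts)
```

With two categories the multinomial form reduces to the binomial one, which gives a convenient consistency check between the two functions.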
droazen
3392c67e0f
Support for command-line arguments.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6055 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:55:05 +00:00
droazen
d9973d3da7
Adding in a template for many other plots based on Mark's initial list of metrics.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6054 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:55:02 +00:00
droazen
237f73c1b1
Initial fingerprint boxplot for exome PreQC metrics.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6053 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:54:59 +00:00
droazen
ff6386c29b
binomial coefficient was in log2, changed to log10.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6052 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:54:55 +00:00
droazen
082abfd84f
Implementation of the truth allele, with different cases for REF, HOMVAR, FILTERED and HET.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6051 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:54:51 +00:00
droazen
3f974c62e6
Reorganized init() to check for RODs (reference / truth)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6050 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:54:48 +00:00
droazen
6f5a08ddc6
Simple walker to look at SNPs near indels. Didn't need to make this a walker and commit it, but used it as an opportunity to play with GIT in unstable.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6049 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:54:44 +00:00
droazen
29a0e08aa2
Testing bug fix process #3 (changes are irrelevant)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6048 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:54:40 +00:00
droazen
e148a75c32
Testing the 'bug fix' process #2 (changes are irrelevant)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6047 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:54:37 +00:00
droazen
2e3d6754cd
First implementation of the Error Model.
Added stratification by lane to ReadBackedPileup.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6046 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:54:33 +00:00
droazen
27b1418b84
PSP2 output much better. Good masking of repetitive regions. Flagging of invalid amplicons rather than omission of them, reasons properly given. Kiran doesn't like the trailing comma, but the trailing comma also doesn't like Kiran.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6045 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:54:27 +00:00
droazen
f7fa373643
Incorporate lists of fingerprint data rather than summaries.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6044 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:54:23 +00:00
droazen
9a00d81d57
Is git commit -a different than git commit?
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6043 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:54:19 +00:00
droazen
84dd72e6cb
Adding in some read filters, updating MathUtils.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6042 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:54:15 +00:00
droazen
e0d203434f
Add a column summing the fingerprint LOD scores.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6041 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:54:09 +00:00
droazen
4f7a64a798
Fixing broken walker as per GS; adding integration test to cover it.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6040 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:54:04 +00:00
droazen
0e057276ae
Changing the default behavior of the IndelRealigner to run without Smith-Waterman. Changed around the integration tests accordingly.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6039 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:53:58 +00:00
droazen
751aa8bfa6
Partial rewrite of the summary metrics aggregator to accumulate all metrics
from sample-level summaries, rather than only specific metrics. Continues to
manually handle fingerprinting.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6038 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:53:53 +00:00
droazen
4288ca1c24
Fix doc bug.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6037 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:53:49 +00:00
droazen
cc1f94310d
A prototype script and library dependencies to extract a BAM list from a
reasonably well-formed PM's xls{x}-format spreadsheet or tsv file.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6036 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:53:45 +00:00
droazen
df71d5b965
bye bye
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6035 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:53:42 +00:00
droazen
9b90e9385d
Putting new association files, some qscripts, and the new pick sequenom probes file under local version control. I notice some dumb emacs backup files, I'll kill those off momentarily. Also minor changes to GenomeLoc (getStartLoc() and getEndLoc() convenience methods)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6034 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:53:37 +00:00
droazen
95614ce3d6
Updated the tribble jar to revision 345
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6033 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:53:25 +00:00
droazen
53c089949e
Added integration test for -n parameter
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6032 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:53:22 +00:00