depristo
4f4eec12dd
Minor improvement
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4659 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-12 19:30:54 +00:00
ebanks
b51762c279
When you commit code late at night you tend to make careless mistakes... like forgetting to update integration tests.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4658 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-12 14:41:10 +00:00
depristo
988da428ae
Bug fix for old style tranches file. ApplyVariantCuts moved over, and passes integration tests
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4657 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-12 14:38:26 +00:00
depristo
c5f8c4dd0d
VariantEval test for tranches file, plus cutting over VE to use the generic Tranches framework
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4656 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-12 13:52:40 +00:00
ebanks
69de3e51bf
Better precision for the calculated AF value. Now looks at the total number of samples to determine how much precision is necessary. Also, changing default min BQ used for calling in UGv2 to Q17.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4655 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-12 08:31:40 +00:00
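A rough illustration of the AF precision change in the entry above. The commit does not say how the number of digits is derived, so the rule here is an assumption: print enough decimal places that a change of one allele out of `ploidy * num_samples` chromosomes stays visible. The names `af_decimal_places`, `format_af`, `ploidy`, and `minimum` are hypothetical and are not taken from the GATK code.

```python
def af_decimal_places(num_samples, ploidy=2, minimum=2):
    """Decimal places needed so that 1 / (ploidy * num_samples) is not rounded away.

    Assumed rule, not the committed implementation: the smallest d with
    10**d >= number of chromosomes, but never fewer than `minimum`.
    """
    if num_samples < 1:
        raise ValueError("num_samples must be at least 1")
    chromosomes = ploidy * num_samples
    places = 0
    while 10 ** places < chromosomes:
        places += 1
    return max(places, minimum)


def format_af(alt_count, num_samples, ploidy=2):
    """Format an allele frequency with sample-size-dependent precision."""
    places = af_decimal_places(num_samples, ploidy)
    return f"{alt_count / (ploidy * num_samples):.{places}f}"
```

With a fixed two-decimal format, a singleton in 1000 diploid samples (AF = 1/2000) would print as 0.00; under the assumed rule it is printed with four decimal places.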
depristo
ec83a4b765
Initial commit, without any tool changes, of a new infrastructure for determining tranches. This new version walks up from the lowest-quality SNPs and determines Ti/Tv. This is marginally more stable than moving in the other direction when there are few novel variants (exomes). Can make a substantial difference in the size of the call set (10-20%). I'll hook it into the main system now. Includes a new class Tranche and isolated reading/writing utilities that are now tested in TestVariantRecalibrator, which should be moved to UnitTest as soon as I can figure out how to do this on my mac.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4654 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-11 23:52:49 +00:00
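One way to picture the "walk up from the lowest quality SNPs" idea in the entry above. This is a sketch under stated assumptions, not the committed Tranche code: variants are reduced to `(quality, is_transition)` pairs, the walk discards the lowest-quality variants first, and it stops at the first quality cut whose remaining set reaches a target Ti/Tv ratio. The function name `find_tranche_cut` and its parameters are hypothetical.

```python
from itertools import groupby


def find_tranche_cut(variants, target_titv):
    """Return the lowest quality cut whose retained set reaches target_titv.

    variants: iterable of (quality, is_transition) pairs (assumed representation).
    The retained set for a cut q is every variant with quality >= q.
    Returns None if no cut reaches the target.
    """
    ordered = sorted(variants, key=lambda v: v[0])
    ti = sum(1 for _, is_ti in ordered if is_ti)
    tv = len(ordered) - ti
    # Walk upward in quality; variants sharing a quality are dropped together.
    for quality, group in groupby(ordered, key=lambda v: v[0]):
        if ti > 0 and (tv == 0 or ti / tv >= target_titv):
            return quality
        for _, is_ti in group:
            if is_ti:
                ti -= 1
            else:
                tv -= 1
    return None
```

Starting from the full set and removing variants from the bottom means each early ratio is computed over many variants, which is one plausible reading of why the commit calls this direction more stable when novel variants are scarce.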
depristo
ed6396ed43
No longer getting the inet, it seems to potentially hang the JVM
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4653 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-11 23:49:42 +00:00
ebanks
2f6666a988
Correcting traversal statistics
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4652 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-11 22:46:58 +00:00
depristo
dbde721dd0
Bug fix for filtered records
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4651 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-11 18:54:51 +00:00
aaron
698e5cf345
for GATK style codecs, make sure we fill in their GenomeLocParser from the RMDIndexer
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4650 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-11 18:44:15 +00:00
aaron
fd78ce6c86
include the codecs into the RMD indexer that are available in the GATK, not just Tribble
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4649 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-11 06:35:04 +00:00
depristo
0e062ae040
V1 of the data processing paper, produced results for the manuscript we presented. Commit for archival purposes
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4648 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-11 01:43:21 +00:00
delangel
2f3be24a00
Improvement in exact allele frequency calculation model (still under test, but this is definitely better than what I had before). Instead of approximating log(10^x+10^y) as max(x,y), approximate full Jacobian formula max(x,y)+log(1+10^-abs(x-y)) with static lookup table for the second term.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4647 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-11 01:22:35 +00:00
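The approximation described in the entry above, often called the Jacobian logarithm, can be sketched as follows, reading "log" as log10 since the inputs are powers of ten. The table size and cutoff (`MAX_DIFF`, `TABLE_SIZE`) are assumptions chosen for illustration; the commit does not state the values used in the GATK.

```python
import math

# Hypothetical table parameters; the commit does not give the real ones.
MAX_DIFF = 8.0        # beyond this gap the correction term is treated as 0
TABLE_SIZE = 8000     # number of table steps between 0 and MAX_DIFF
STEP = MAX_DIFF / TABLE_SIZE

# Precomputed second term: log10(1 + 10^-d) for d = 0, STEP, ..., MAX_DIFF.
_CORRECTION = [math.log10(1.0 + 10.0 ** (-i * STEP))
               for i in range(TABLE_SIZE + 1)]


def approx_log10_sum(x, y):
    """Approximate log10(10**x + 10**y) as max(x, y) + table[|x - y|]."""
    diff = abs(x - y)
    if diff >= MAX_DIFF:
        # The smaller term contributes less than about 1e-8 in log10 space.
        return max(x, y)
    return max(x, y) + _CORRECTION[int(round(diff / STEP))]
```

The plain `max(x, y)` approximation is worst when the two terms are equal, where it is off by log10(2), roughly 0.301; the table lookup recovers that term without calling `log10` in the inner loop.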
kshakir
f35d1aa43f
Moving all file cleanup to IOUtils for easier debugging.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4646 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-10 21:00:58 +00:00
asivache
2e0296fef9
NWayOut logic slightly changed: 1) results.list file is gone; 2) now with -nWayOut one can specify either a) suffix to attach to every output file (i.e. cleaned reads from inputK.bam will be sent to inputK.suffix.bam) or b) *.map tab-separated file that must list <input_name> <output_name> mappings, one per line, for every input file.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4645 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-10 20:32:16 +00:00
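The two -nWayOut modes in the entry above can be illustrated with a small sketch. The helper names `load_map` and `output_name_for` are hypothetical, and the suffix is assumed to be given without a leading dot; the commit only states that inputK.bam goes to inputK.suffix.bam and that a *.map file lists tab-separated input and output names, one pair per line.

```python
import os


def load_map(lines):
    """Parse '<input_name>\\t<output_name>' lines into a dict."""
    mapping = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # tolerate blank lines (an assumption)
        input_name, output_name = line.split("\t")
        mapping[input_name] = output_name
    return mapping


def output_name_for(input_bam, suffix=None, mapping=None):
    """Resolve the output file for one input under either -nWayOut mode."""
    if mapping is not None:
        # Map mode: every input file must be listed.
        if input_bam not in mapping:
            raise KeyError(f"no output mapping for {input_bam}")
        return mapping[input_bam]
    # Suffix mode: inputK.bam -> inputK.<suffix>.bam
    stem, ext = os.path.splitext(input_bam)
    return f"{stem}.{suffix}{ext}"
```

For example, `output_name_for("input1.bam", suffix="cleaned")` gives `input1.cleaned.bam`.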
asivache
a1adfb91ce
And now @Hidden tags are really in place :-/
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4644 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-10 20:28:40 +00:00
asivache
68ce55148e
(pseudo-)genotyping functionality added: force-emits calls (including REF) at specified locations. Currently @Hidden for testing.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4643 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-10 20:25:40 +00:00
hanna
8e36a07bea
Convert GenomeLocParser into an instance variable. This change is required
for anything that needs to be simultaneously aware of multiple references, eg
Queue's interval sharding code, liftover support, distributed GATK etc.
GenomeLocParser instances must now be used to create/parse GenomeLocs.
GenomeLocParser instances are available in walkers by calling either
-getToolkit().getGenomeLocParser()
or
-refContext.getGenomeLocParser()
This is an intermediate change; GenomeLocParser will eventually be merged
with the reference, but we're not clear exactly how to do that yet. This
will become clearer when contig aliasing is implemented.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4642 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-10 17:59:50 +00:00
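A language-neutral sketch of why the entry above moves from a static parser to parser instances. This is a Python analogy, not the GATK API: `LocParser` and its `contig_lengths` argument are hypothetical stand-ins for a parser bound to one reference's sequence dictionary.

```python
class LocParser:
    """Hypothetical per-reference location parser (not the GATK class)."""

    def __init__(self, contig_lengths):
        # Each instance carries its own reference dictionary.
        self.contig_lengths = dict(contig_lengths)

    def parse(self, text):
        """Parse 'contig', 'contig:pos' or 'contig:start-stop' into a tuple."""
        contig, _, span = text.partition(":")
        if contig not in self.contig_lengths:
            raise ValueError(f"unknown contig {contig!r}")
        length = self.contig_lengths[contig]
        if not span:
            return (contig, 1, length)
        start_text, _, stop_text = span.partition("-")
        start = int(start_text)
        stop = int(stop_text) if stop_text else start
        if not 1 <= start <= stop <= length:
            raise ValueError(f"interval out of bounds: {text}")
        return (contig, start, stop)
```

With a single static parser, only one contig dictionary can be active at a time. With instances, a liftover-style tool can hold one parser per reference and validate each location against the right one.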
depristo
760f06cf8c
now prints a nice report, can be invoked from command line
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4641 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-09 19:44:10 +00:00
depristo
5ef4b234d8
Updates for broken integration tests. Counting annotations (AC, AF) now work correctly for AC = 0 sites
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4640 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-09 19:43:43 +00:00
kshakir
33b3f2adaa
GSA-413 Explicitly set all compilers to use a 512m memory maximum instead of the 64m default maximum, and run them outside of the ant process.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4639 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-09 19:32:19 +00:00
depristo
3c08a1c061
Basic script for assessing simulation sensitivity and specificity
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4638 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-08 21:02:10 +00:00
depristo
4759fdd2ac
V1 of read and variant simulator and assessor. SimulateReadsForVariants generates BAM and VCF with given combinations of variant and read properties. AssessSimulatedPerformance produces a table suitable for analysis in R
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4637 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-08 21:01:33 +00:00
aaron
97db593efb
making my last commit message actually true
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4636 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-07 18:26:23 +00:00
aaron
be499fc986
making the reference optional (the GATK will set it on the first run if it's not included), and setting the seq index if they do supply it.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4635 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-07 18:15:31 +00:00
ebanks
e05af54f3e
Found the cause of 80% of our non-called FNs: an excess of filtered bases were causing us to choose the wrong alternate allele. More details to dev team.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4634 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-07 03:39:57 +00:00
aaron
2a8c97a4a7
better error catching, as well as allowing for default index naming, <filename>.idx
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4633 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-06 19:12:19 +00:00
aaron
cb2e26a004
by request, an indexer tool to create Tribble style indexes outside of the GATK
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4632 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-06 18:59:06 +00:00
chartl
c19f567424
Sometimes, inputs are really outputs in disguise.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4631 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-05 19:51:16 +00:00
depristo
bbb890dd6c
Bug fix for variants in VCF header fetching to avoid null pointer when a VariantContext tribble codec doesn't have a header
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4630 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-05 12:43:25 +00:00
ebanks
c9dbd8f80a
Bug fix for Tim: all point events must be treated equally
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4629 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-05 03:42:51 +00:00
chartl
0e40321a52
Brütall hack: make the bam list creator job wait for the interval creator job, so that there is an implicit dependency of UG on the interval list, by way of the bam list
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4628 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-04 20:43:11 +00:00
depristo
fc39377e6c
Simple pre-processing script for soapsnp files
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4627 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-04 20:34:43 +00:00
chartl
cb0b2f9811
My analysis script for private mutations. I'm committing it because it contains a number of specialized command line functions that could prove useful in the future. (For example: ConcatVCF and ExtractSample)
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4626 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-04 19:57:27 +00:00
rpoplin
913db5d1ab
Unfortunately when annotating sites with the UG the -G None option was wiping out the single annotations added by -A options
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4625 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-04 19:27:23 +00:00
ebanks
816c86776e
Walker description was wrong and it was bothering me
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4624 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-04 02:17:09 +00:00
ebanks
87f6738d4c
Deprecated
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4623 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-04 02:07:40 +00:00
chartl
42e9987e69
Bug fix to GenotypeConcordance. AC metrics get instantiated based on number of eval samples; if Comp has more samples, we can see AC indices outside the bounds of the array.
Bug fix to LiftoverVariants - no barfing at reference sites.
AlleleFrequencyComparison - local changes added to make sure parsing works properly
Added HammingDistance annotation. Mostly useless. But only mostly.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4622 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-03 19:23:03 +00:00
fromer
3d27defe93
Fixed output stats (percentage denominator)
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4621 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-03 18:47:06 +00:00
ebanks
4e109f58bf
In preparation for Ryan's jumping into SLOD: getting rid of bad hack to ensure P(AF=i) is calculated in the strand-specific cases. With Mark's recent changes this is no longer necessary and just makes the code slower.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4620 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-03 03:44:59 +00:00
fromer
22d64f77ff
Added hidden --outputMultipleBaseCountsFile option to detect cases where a single read has more than one base at the same position
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4619 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-03 03:22:48 +00:00
hanna
8ceb18eea9
Adding packaging system support for external directories.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4618 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-02 20:13:13 +00:00
hanna
8f9bf82aa7
Bamboo is correctly interpreting test fails. Reverting forced-fail test
code.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4617 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-02 19:32:34 +00:00
hanna
1df166b76e
Forcing a unit test fail to ensure that Bamboo is picking up on failed tests
as well as successes.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4616 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-02 19:03:12 +00:00
hanna
39c2247150
Capture failed test results in Bamboo as well as successful test results.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4615 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-02 16:57:15 +00:00
hanna
14e992ab90
Test: trying to make Bamboo aggregate all test results instead of only
most recently run suite.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4614 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-02 15:33:42 +00:00
fromer
a885ecf046
When merging MNPs, the phased flag and the phase quality (PQ) are determined simultaneously
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4613 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-02 14:44:26 +00:00
hanna
d496f2afde
Switch from JUnitReportReporter (sic) to JUnitXMLReporter for output.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4612 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-02 07:14:21 +00:00
hanna
af2313de45
Another TestNG fix: we were generating JUnit html reports to the wrong
directory.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4611 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-02 05:30:50 +00:00
hanna
ebc01648af
Update output directory for TestNG reports.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4610 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-01 23:11:33 +00:00