depristo
5487ab0ee6
Added several useful routines to MathUtils for summing and bounds checking of doubles
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1379 348d0f76-0448-11de-a6fe-93d51630548a
2009-08-05 00:41:31 +00:00
sjia
68309408e4
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1378 348d0f76-0448-11de-a6fe-93d51630548a
2009-08-04 21:23:01 +00:00
sjia
45ab212f22
Post-presentation update
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1377 348d0f76-0448-11de-a6fe-93d51630548a
2009-08-04 21:21:12 +00:00
hanna
21d1eba502
Cleaned division of responsibilities between arguments to map function. Reference has been changed
from an array of bases to an object (ReferenceContext), and LocusContext has been renamed to reflect
the fact that it contains contextual information only about the alignments, not the locus in general.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1376 348d0f76-0448-11de-a6fe-93d51630548a
2009-08-04 21:01:37 +00:00
kcibul
a5a7d7dab8
added "booster" metrics
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1375 348d0f76-0448-11de-a6fe-93d51630548a
2009-08-04 20:53:45 +00:00
ebanks
3a8d923785
minor output changes
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1374 348d0f76-0448-11de-a6fe-93d51630548a
2009-08-04 20:12:16 +00:00
mmelgar
939b19e715
Committing the first version of the homopolymer filter. Removes SNPs that occur at the edges of homopolymer runs and whose nonref allele matches the repeated base in the homopolymer.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1373 348d0f76-0448-11de-a6fe-93d51630548a
2009-08-04 14:35:51 +00:00
depristo
20ff603339
New hotness and old and Busted genotype likelihood objects are now in the code base as I work towards a bug-free SSG along with a cleaner interface to the genotype likelihood object
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1372 348d0f76-0448-11de-a6fe-93d51630548a
2009-08-03 23:07:53 +00:00
depristo
4986b2abd6
Fixing bug in SSG -- genotyping and discovery were mixed up by name
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1371 348d0f76-0448-11de-a6fe-93d51630548a
2009-08-03 22:13:35 +00:00
depristo
3485397483
Reorganization of the genotyping system
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1370 348d0f76-0448-11de-a6fe-93d51630548a
2009-08-03 20:55:31 +00:00
ebanks
9f1d3aed26
-Output single filtration stats file with input from all filters
-move out isHet test to GenotypeUtils so all can use it
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1369 348d0f76-0448-11de-a6fe-93d51630548a
2009-08-03 20:44:21 +00:00
depristo
880a01cb5d
Slight reorganization of genotype interface
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1367 348d0f76-0448-11de-a6fe-93d51630548a
2009-08-03 19:18:41 +00:00
depristo
d840a47b11
Slight reorganization of genotype interface
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1366 348d0f76-0448-11de-a6fe-93d51630548a
2009-08-03 19:17:15 +00:00
depristo
20986a03de
cleanup before moving files
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1365 348d0f76-0448-11de-a6fe-93d51630548a
2009-08-03 19:08:24 +00:00
ebanks
e3b08f245f
Pull out RMS calculation into MathUtils for all to use
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1364 348d0f76-0448-11de-a6fe-93d51630548a
2009-08-03 17:00:20 +00:00
ebanks
e495b836d3
- added mapping quality filter
- make the filters brainless in that they strictly have thresholds and filter based on them; require user to calculate and input these thresholds.
- update filters in preparation for migration to new output format
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1363 348d0f76-0448-11de-a6fe-93d51630548a
2009-08-03 16:46:51 +00:00
ebanks
ba07f057ac
finish the math for RMS
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1362 348d0f76-0448-11de-a6fe-93d51630548a
2009-08-03 16:18:09 +00:00
kiran
8bc925a216
Commit on behalf of Mark: cleaning up some old and busted code in GenotypeLikelihood and associated objects.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1361 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-31 21:18:30 +00:00
aaron
9dfee7a75c
the "-genotype" option now acts correctly as a discovery mode caller in SSG
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1359 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-31 18:31:45 +00:00
aaron
c2c80dd946
cleanup and moving some things around to more logical locations
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1358 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-31 16:28:39 +00:00
sjia
9dada95ec3
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1357 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-31 16:21:16 +00:00
aaron
9a0761cd8f
accidentally committed some debug code
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1356 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-31 15:25:22 +00:00
aaron
2f2c8576a5
GLF output is now well validated, and some changes for new Genotypes interface code
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1355 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-31 15:21:28 +00:00
andrewk
678c2533ca
Removed custom output stream for file and replaced with the standard out PrintStream
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1350 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-30 22:36:42 +00:00
aaron
2a7dfce9ae
fix the header string mismatch that Andrew found
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1349 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-30 22:26:34 +00:00
andrewk
44673b2dce
Removed a debugging println that was accidentally checked in
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1348 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-30 22:23:27 +00:00
andrewk
845488ff94
VariantEval now decides whether a variant is not confidently called using BestVsNextBest if genotypes are being evaluated and BestVsRef if not (variant discovery only). Also, the absolute value of the BestVsRef LOD (getVariantionConfidence) is used so that confident reference calls (if the GELI has output them) will show up in the final table as reference calls rather than no calls.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1347 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-30 21:54:06 +00:00
andrewk
fdc7cc555b
Removed extra column name from geliHeaderString that was mislabeling the 10 genotype likelihoods by shifting them over by one
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1345 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-30 21:42:02 +00:00
aaron
0087234ed7
small code cleanup, a couple of little changes to SSGGenotypeCall
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1343 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-30 19:47:37 +00:00
ebanks
fbc7d44bc7
don't allow users to input priors anymore; they should be using heterozygosity and having the SSG calculate priors.
Note that nothing was changed for dbSNP/hapmap priors (not sure what we want to do with these yet - any thoughts?)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1342 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-30 19:10:33 +00:00
ebanks
b282635b05
Complete reworking of Fisher's exact test for strand bias:
- fixed math bug (pValue needs to be initialized to pCutoff, not 0)
- perform factorial calculations in log space so that huge numbers don't explode
- cache factorial calculations so that each value needs to be computed just once for any given instance of the filter
I've tested it against R and it has held up so far...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1341 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-30 18:52:13 +00:00
aaron
4033c718d2
moving some code around for better organization, some fixes to the fields out of SSG
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1340 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-30 15:09:43 +00:00
ebanks
4366ce16e0
Made sure all RODs have a (good) toString() method - and use it in the Venn walker. (thanks, Mark)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1339 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-30 14:53:27 +00:00
aaron
9cd53d3273
some initial changes from the first review of the genotype redesign, more to come.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1338 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-30 07:04:05 +00:00
ebanks
feb7238f10
Wasn't always returning the correct alt base
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1337 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-30 03:08:04 +00:00
hanna
5429b4d4a8
A bit of reorganization to help with more flexible output streams. Pushed construction of data
sources and post-construction validation back into the GATKEngine, leaving the MicroScheduler
to just microschedule.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1336 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-29 23:00:15 +00:00
aaron
bca894ebce
Adding the initial changes for the new Genotyping interface. The bullet points are:
- SSG is much simpler now
- GeliText has been added as a GenotypeWriter
- AlleleFrequencyWalker will be deleted when I untangle the AlleleMetric's dependence on it
- GenotypeLikelihoods now implements GenotypeGenerator, but could still use cleanup
There is still a lot more work to do, but this is a good initial check-in.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1335 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-29 19:43:59 +00:00
kiran
c5c11d5d1c
First attempt at modifying the VFW interfaces to support direct emission of relevant training data per feature and exclusion criterion. This way, you could run the program once, get the training sets, and then feed that training set back to the filters and have them automatically choose the optimal thresholds for themselves. This current version is pretty ugly right now...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1334 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-29 19:29:03 +00:00
ebanks
3554897222
allow filters to specify whether they want to work with mapping quality zero reads; the VariantFiltrationWalker passes in the appropriate contextual reads
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1333 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-29 17:38:15 +00:00
hanna
7a13647c35
Support for specifying SAMFileReaders and SAMFileWriters as @Arguments directly. *Very*
rough initial implementation, but should provide enough support so that people can stop
creating SAMFileWriters in reduceInit.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1332 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-29 16:11:45 +00:00
depristo
56f769f2ce
Output improvements to GenotypeConcordance calculations
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1331 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-29 12:54:46 +00:00
ebanks
72dda0b85c
Fixed calculations for Mark
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1330 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-29 03:21:43 +00:00
ebanks
f0378db9b7
added accuracy numbers
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1329 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-29 01:38:33 +00:00
ebanks
a5a56f1315
At this point, we are convinced that the new priors are the way to go...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1328 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-28 17:25:25 +00:00
depristo
df4fd498c5
Improvements and bug fixes galore. (1) Now properly handles Q0 bases by filtering them out; you can disable this if you need to. (2) Support for three-state base probabilities (see email), which is disabled by default (still experimental) but appears to be more empowered to detect variants (see email too).
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1327 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-28 13:21:38 +00:00
depristo
46643d3724
Improvements and bug fixes galore. (1) Now properly handles Q0 bases by filtering them out; you can disable this if you need to. (2) Support for three-state base probabilities (see email), which is disabled by default (still experimental) but appears to be more empowered to detect variants (see email too).
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1326 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-28 13:21:27 +00:00
ebanks
3c4410f104
-add basic indel metrics to variant eval
-variants need a length method (can't assume it's a SNP)!
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1324 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-28 03:25:03 +00:00
kcibul
1d6d99ed9c
walk by reference
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1323 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-27 20:21:04 +00:00
ebanks
089ae85be7
1. output grep-able strings for genotype eval
2. free DB coverage from isSNP restriction
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1322 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-27 17:36:59 +00:00
kcibul
1bca9409a4
calculate freestanding intervals
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1321 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-27 16:40:27 +00:00