aaron
ce08f5f0c3
Removed some unused variables, fixed some javadoc. The usual.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1211 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-09 22:10:22 +00:00
aaron
9cfd89c54f
a small refactoring, and some documentation cleanup
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1210 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-09 22:03:45 +00:00
aaron
d86717db93
Refactoring of the traversal engine base class; I removed a lot of old code.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1209 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-09 21:57:00 +00:00
ebanks
3519323156
Output the correct geli text format
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1208 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-09 19:45:18 +00:00
ebanks
99631cdaa1
fix and then deprecate the rodGELI class (GELIs suck)
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1207 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-09 19:18:13 +00:00
hanna
60a86fb34a
Better handling of fasta files with non-standard extensions.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1206 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-09 18:18:48 +00:00
hanna
5e26770634
Hack the MicroScheduler to be tolerant of RefWalkers. We need to implement a longer-term solution to make it easier for datasources to report problems they've encountered along the way (GSA-103).
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1205 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-09 17:26:59 +00:00
kcibul
bc44e08225
refactored output logic
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1204 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-09 16:13:01 +00:00
ebanks
3fe7104963
Added walker to filter out clustered SNPs from a call set
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1203 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-09 03:16:27 +00:00
aaron
8ee5c7de8e
GLF reader and writer check in.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1202 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-08 23:06:37 +00:00
andrewk
c8fcecbc6f
Added ParseDCCSequenceData.py to repository and made changes that allow an analysis of quantity of sequence data by platform and project, moved table / record system to a new module called FlatFileTable.py and built that into ParseDCCSequenceData and CoverageEval.py; changed lod threshold in CoverageEvalWalker.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1201 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-08 22:04:26 +00:00
hanna
3f0304de5a
Get rid of unused iterator.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1200 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-08 20:39:16 +00:00
hanna
da4d26b1ea
Enum support for command-line argument system, and some cleanup for hacks to the CleanedReadInjector that were required because Enum support was missing.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1199 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-08 20:26:16 +00:00
ebanks
aacec3aeb0
rod for binary GELI files (still needs to be tested)
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1198 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-08 20:25:56 +00:00
aaron
e106cf73d8
A quick change to provide more verbose output.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1197 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-08 19:08:19 +00:00
hanna
433ad1f060
Cleanup...deprecate FastaSequenceFile2 in favor of IndexedFastaSequenceFile or ReferenceSequenceFile from Picard, depending on the application.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1196 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-08 18:49:08 +00:00
jmaguire
0a67386525
.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1195 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-08 16:59:36 +00:00
hanna
d8fbb2b62c
Refactoring; make a better home for the MalformedReadFilteringIterator.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1194 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-08 16:54:20 +00:00
kiran
c78a72e775
Applies Fisher's Exact Test to determine whether there's a strand bias and, if so, filters the call out.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1193 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-08 16:14:11 +00:00
kiran
b211f500a3
Applies secondary base feature to variants.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1192 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-08 16:13:29 +00:00
kiran
6e31057e6b
Some changes involving output of marginal calls to different, per-filter files.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1191 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-08 16:12:57 +00:00
ebanks
787c84d68b
only compare pair position for paired end reads
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1190 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-08 04:07:08 +00:00
andrewk
d3daecfc4d
Added unit tests for the function in ListUtils that randomly samples lists with replacement, updated AlleleFrequencyEstimate to provide a callType of HomRef, HetSNP, HomSNP, updated indices in CoverageEval.py, and made a lot of changes to CoverageWalker, the biggest one being that it directly calls SingleSampleGenotyper instead of implementing some parts of SSG itself.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1189 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-08 02:05:40 +00:00
hanna
4ba2194b5e
Filter reads whose alignment starts past the end of the contig to which they allegedly align.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1188 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-07 22:27:44 +00:00
hanna
194b75613b
Fix compile problem with unit tests.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1187 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-07 20:29:31 +00:00
jmaguire
1db15ee468
made some things protected so that I can inherit them in MultiSampleCallerAccuracyTest
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1185 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-07 15:50:28 +00:00
jmaguire
1fa71aa31d
Now outputs stats. Doesn't do the downsampling thing because I think I'll have enough counts.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1184 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-07 15:29:31 +00:00
hanna
5d7393d7cb
Temporary fix for Eric's problems with SOLiD reads: make sure the command-line argument system takes the --validation-strictness command-line argument into account when creating SAMFileReaders.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1183 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-07 15:18:05 +00:00
aaron
f6a273a537
other fixes for some broken unit tests
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1181 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-07 05:53:13 +00:00
aaron
033bafe7a1
fixed sam by reads test for the new filtering code
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1180 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-07 05:45:50 +00:00
aaron
2a86f2f833
an initial pass at the GLF reader, and some other genotype changes to phase out the LikelihoodObject I created.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1179 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-07 04:30:27 +00:00
hanna
5735c87581
Basic infrastructure for filtering malformed reads.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1178 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-06 22:50:22 +00:00
depristo
b9d533042e
Two-tailed HardyWeinberg test implemented. VariantEval now separates violations from summary outputs for clarity. Fixed problems with CovariateCounterTest and TabularRodTest.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1177 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-06 22:02:04 +00:00
hanna
31313481f6
Temporary patch to filter out bad alignments that aren't quite fully reported as bad.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1176 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-06 18:41:55 +00:00
mmelgar
6580211c2a
First version of depth of coverage filter. Right now it takes in a maximum coverage threshold given by the user.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1175 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-06 18:22:46 +00:00
ebanks
fac7ac5142
Don't print out 0 coverage (which is always 0)
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1174 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-06 17:44:32 +00:00
hanna
d19366eaad
Cleanup emergency fixes for out-of-bounds issues in reference retrieval. Fix spelling mistakes.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1173 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-06 15:41:30 +00:00
kcibul
000d92a545
added gc calculation
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1172 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-06 13:07:04 +00:00
ebanks
338cdbebad
deal with screwy solid reads in the cleaner (no cigar strings)
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1171 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-05 16:49:58 +00:00
jmaguire
8bcbf7f18a
First draft of multi sample caller accuracy test.
...
Doesn't do its job yet but the pieces are in place.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1170 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-05 16:29:13 +00:00
jmaguire
4019cd2bd7
Added ROD for parsing hapmap3 genotype files.
...
Tweak to TabularROD to allow HapMapGenotypeROD to work.
Added HapMapGenotypeROD to list of RODs in ReferenceOrderedData.java.
Modified MultiSampleCaller to return a single object with most of the relevant information.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1169 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-05 16:28:24 +00:00
ebanks
e5e249d4ac
temporary fix to deal with screwy SOLiD reads
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1168 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-05 03:25:57 +00:00
depristo
cf1854b339
Fix for monstrous problems with SOLiD data -- now can dynamically expand recalibration tables on the fly as reads declare additional read groups -- use assumeFaultyHeader flag
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1167 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-03 17:15:49 +00:00
depristo
bcda66d2db
Simple performance improvements
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1166 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-03 16:45:23 +00:00
hanna
0d00823332
Fix for performance bug in extending the read with X's in cases where the read is aligned off the end of the contig.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1165 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-03 16:17:38 +00:00
kcibul
be2f8478c0
added suppression of failure messages
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1164 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-03 15:19:37 +00:00
kcibul
25c30b12bb
added MAF-style output
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1163 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-03 15:10:19 +00:00
andrewk
dcb8892568
A lot of code for coverage evaluation tools, including the first version of a Python script to evaluate the downsampled SSG calls made and the Java code to make all the calls at Hapmap chip sites at various downsampling levels; ListUtils contains functions for randomly subsetting lists (with replacement), which are useful for subsetting the same elements in both the reads and the offsets lists of a LocusWalker
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1162 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-03 08:07:02 +00:00
asivache
d603145cb0
Meaning of input arguments has CHANGED: minFraction is now the minimum fraction of CONSENSUS indel observations, out of all reads covering the site, required to make the call. minConsensusFraction is still the minimum fraction of CONSENSUS indel observations out of all indel observations at the site.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1160 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-02 20:38:10 +00:00
hanna
62807139fc
Clean up pileup and depth of coverage in preparation for release. Add pileup, depth of coverage, and print reads to package for distribution.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1159 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-02 14:54:01 +00:00