Commit Graph

1082 Commits (c5c11d5d1c9fdb3574e3198bcd2b2a8be021fbcd)

Author SHA1 Message Date
ebanks 1d2b545608 add FLT toString method (to be used in PrintRODs) and add it to ROD list
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1279 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-20 02:47:50 +00:00
mmelgar 8da754eb4e First implementation of a primary base filter. Assumes the on/off bases are distributed according to a binomial.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1278 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-17 18:43:35 +00:00
ebanks 24ebfee604 don't print traversal stats
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1277 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-17 16:13:28 +00:00
ebanks 387316ebe1 added indel rod
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1276 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-17 16:05:51 +00:00
ebanks da4af3b620 print indels in the format required for 1KG submissions
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1275 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-17 15:59:18 +00:00
ebanks d45c90b166 ROD to represent simple output from IndelGenotyper
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1274 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-17 14:36:12 +00:00
ebanks f978b04633 A very simple walker to print out (using the ROD's toString method) all of
the RODs it sees.  This is the easiest solution to get around the (temporary)
bug of reads being seen multiple times by reads walkers when close intervals
are passed to them (i.e. process full contigs and then use a ref walker to
filter the ones within your intervals of choice)


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1273 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-17 14:03:34 +00:00
kcibul 129ad97ce5 performance improvement to GenomeLocParser -- moved regex pattern compile out of local field
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1272 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-17 02:56:25 +00:00
hanna df1c61e049 Re-add the plugin path.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1271 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-16 22:48:44 +00:00
hanna 7c30c30d26 Cleaned up some duplicate code in preparation for making plugin dir configurable.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1270 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-16 22:02:21 +00:00
depristo 31f3f466ca Improvements to support GLF generation -- now correctly handles GLF
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1269 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-16 21:10:39 +00:00
depristo 107f42a01e Hacks for getting GLFs support in the Rod system working
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1268 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-16 21:03:47 +00:00
depristo 0548026a2e Now understands GLFs for calculating genotyping results like callable bases, and avoids emitting stupid amounts of data when doing a genotype evaluation (i.e., ignores non-SNP() calls)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1267 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-16 21:03:26 +00:00
depristo c5f6ab3dd5 CoverageHistogram now sees 0 coverage sites
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1266 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-16 20:58:41 +00:00
ebanks 8bc0832215 Generate chip concordance table.
This should work, although I need to test it with some real GLFs


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1265 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-16 17:44:47 +00:00
ebanks 88ffb08af4 Need to return real values for some of the AllelicVariant methods
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1264 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-16 02:31:10 +00:00
kcibul e1055bcc4c moving to new external repository
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1261 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-15 20:46:08 +00:00
kcibul 4a730adfc1 committing latest changes before moving repositories
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1260 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-15 20:44:02 +00:00
ebanks 692b1e206f stop throwing an exception here: we don't always have allele counts
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1259 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-15 20:34:01 +00:00
ebanks a245ee32fa A walker to split 2 call sets into their intersection/union/disjoint (sub)sets.
Yes, the name is retarded, but I'm under pressure here...


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1258 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-15 20:20:47 +00:00
ebanks ba349e8d52 add FLT ROD
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1257 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-15 19:40:50 +00:00
ebanks 800f7e6360 make AllelicVariant extend ReferenceOrderedDatum (not Comparable) since ROD itself is Comparable. Then we can generalize RMD tags.
Blame Matt if this doesn't work - he said it wouldn't break anything.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1256 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-15 19:25:06 +00:00
kcibul 00d49976fb committing latest changes before moving repositories
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1255 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-15 18:41:52 +00:00
ebanks 5be5e1d45f added conversion from iupac format and new rod to deal with FLT file format
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1254 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-15 18:34:41 +00:00
aaron d36e232ed3 adding GLF rods to the module list
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1252 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-15 15:42:34 +00:00
aaron 9ecb3e0015 adding GLFRods with tests and some other code changes
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1251 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-15 15:30:19 +00:00
hanna c25f84a01c Regression: we lost our hack to work around BAM files with index problems (affects BAM files created before 23 Apr 2009 and traversed by interval). Added the hack back in, along with a much more explicit comment about why it's there.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1248 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-15 14:41:37 +00:00
depristo 1798aff01b VariantEval now understands the difference between a population-level analysis and a genotype analysis, and handles both. All analyses annotated as supporting one or the other or both. Preparation for genotype chip concordance calculations as well as called sites, etc analyses
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1247 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-15 14:07:13 +00:00
ebanks 513d43b5f3 now implements AllelicVariant
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1246 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-15 14:06:25 +00:00
ebanks d369136bda deprecate this ROD yet again
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1245 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-15 13:33:03 +00:00
ebanks efcbb16688 un-deprecate this ROD and make it implement Genotype
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1240 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-14 19:45:41 +00:00
depristo 84d407ff3f Fixing odd merge problem with VariantEval -- better cluster analysis (no cumsum), rodVariant is now an AllelicVariant
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1239 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-14 18:53:27 +00:00
hanna 76b09a879b Display a more intelligent error message if the user runs a locus traversal across an unmapped reads file.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1238 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-14 18:36:09 +00:00
aaron 99ddd8ab15 bug fix for transitioning between chromosomes in GLF output
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1237 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-14 17:58:04 +00:00
aaron 7d755a4c90 GenotypeLikelihoods doesn't emit metrics, they don't make sense
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1236 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-14 17:22:28 +00:00
aaron 01fc8da270 adding the GenotypeLikelihoodsWalker, which generates GLF genotype likelihoods that are pretty much identical to the samtools calls.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1235 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-14 16:57:18 +00:00
hanna 99f9cd84ed Warning for possibly mismatched reads / reference was very aggressive. Relax
the criteria a bit.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1234 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-14 16:21:22 +00:00
hanna 12b5d9c70c The number of loci can easily overflow an int. Change reduce type to a Long.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1233 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-14 16:07:00 +00:00
depristo 5bf7647498 0.2.3 -- now preserves Q0 bases throughout the reads
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1232 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-14 12:27:31 +00:00
aaron 36819ed908 Initial changes to the SSG to output GLF by default
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1231 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-14 08:46:04 +00:00
hanna 0f6bfaaf73 Skip validation in case of no reads aligning.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1230 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-14 02:03:36 +00:00
ebanks a1d33f8791 -Added walker to dump strand test results to file
-Refactored strand filter to handle calls from the walker


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1229 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-14 01:56:50 +00:00
hanna bfe90af5e2 Some quick and dirty fixes to support querying unmapped BAM files.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1228 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-14 01:25:20 +00:00
aaron e4152af387 added a big speed-up for interval list input processing. With large interval sets this was taking way too long...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1227 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-13 22:00:00 +00:00
hanna 9f0fb9f3aa Fix for GSA-90: GATK banner and error messages should point to the wiki website.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1226 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-13 21:56:41 +00:00
hanna b18caa2052 Fix for GSA-90: System isn't failing with an error when you use the wrong reference.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1225 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-13 20:42:12 +00:00
ebanks 52659d02d4 ignore unmapped reads in all the indel walkers (since they're giving me overhead issues)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1224 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-13 16:51:11 +00:00
hanna 5c321f9630 Oops! Accidentally deactivated the ArgumentFactory, needed by the CleanedReadInjector, while refactoring last night.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1223 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-13 16:41:55 +00:00
hanna b61f9af4d7 Cleaning up, preparing to incorporate a better fix for Eric's problems with validation stringency in BAM files opened directly from the walkers.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1222 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-13 01:42:13 +00:00
ebanks 4c02607297 genotyper also needs to have 454 reads filtered out
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1221 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-10 23:19:28 +00:00
ebanks dea72c576e use the filter to ignore 454 reads in the traversal to speed up cleaning
(since there's less area to actually clean against)


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1220 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-10 18:34:44 +00:00
ebanks 0070b8ea6a Until 454 goes far, far away, at least we can completely ignore it
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1219 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-10 18:31:53 +00:00
asivache 1401606344 move warning about strictly adjacent intervals in a contig from 'remap' to 'read', so it is issued only once
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1218 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-10 17:58:11 +00:00
hanna aa4f60d980 Make sure that only reads marked as 'mapped' are filtered based on validity of alignment.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1217 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-10 17:44:06 +00:00
asivache e01d37024a now updates mapping quality (to an arbitrarily chosen value of 37 if the resulting mapping is unique) and X0, X1 tags after remapping (in REDUCE mode)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1216 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-10 16:40:52 +00:00
asivache b08b121756 synchronizing; debug statements commented out, so nothing changed really
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1215 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-10 16:38:33 +00:00
asivache a1eb128377 few more detailed debug printouts conditioned on if (DEBUG), so no real changes...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1214 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-10 16:36:57 +00:00
hanna 03e1713988 Better support for specifying read filters to apply directly from the walkers.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1212 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-09 23:59:53 +00:00
aaron ce08f5f0c3 Removed some unused variables, fixed some javadoc. The usual.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1211 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-09 22:10:22 +00:00
aaron 9cfd89c54f a small refactoring, and some documentation cleanup
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1210 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-09 22:03:45 +00:00
aaron d86717db93 Refactoring of the traversal engine base class, I removed a lot of old code.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1209 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-09 21:57:00 +00:00
ebanks 3519323156 Output the correct geli text format
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1208 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-09 19:45:18 +00:00
ebanks 99631cdaa1 fix and then deprecate the rodGELI class (GELIs suck)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1207 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-09 19:18:13 +00:00
hanna 60a86fb34a Better handling of fasta files with non-standard extensions.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1206 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-09 18:18:48 +00:00
hanna 5e26770634 Hack the MicroScheduler to be tolerant of RefWalkers. We need to implement a longer-term solution to make it easier for datasources to report problems they've encountered along the way (GSA-103).
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1205 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-09 17:26:59 +00:00
kcibul bc44e08225 refactored output logic
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1204 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-09 16:13:01 +00:00
ebanks 3fe7104963 Added walker to filter out clustered SNPs from a call set
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1203 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-09 03:16:27 +00:00
aaron 8ee5c7de8e GLF reader and writer check in.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1202 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-08 23:06:37 +00:00
andrewk c8fcecbc6f Added ParseDCCSequenceData.py to repository and made changes that allow an analysis of quantity of sequence data by platform and project, moved table / record system to a new module called FlatFileTable.py and built that into ParseDCCSequenceData and CoverageEval.py; changed lod threshold in CoverageEvalWalker.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1201 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-08 22:04:26 +00:00
hanna 3f0304de5a Get rid of unused iterator.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1200 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-08 20:39:16 +00:00
hanna da4d26b1ea Enum support for command-line argument system, and some cleanup for hacks to the CleanedReadInjector that were required because Enum support was missing.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1199 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-08 20:26:16 +00:00
ebanks aacec3aeb0 rod for binary GELI files (still needs to be tested)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1198 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-08 20:25:56 +00:00
aaron e106cf73d8 A quick change to provide more verbose output.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1197 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-08 19:08:19 +00:00
hanna 433ad1f060 Cleanup...deprecate FastaSequenceFile2 in favor of IndexedFastaSequenceFile or ReferenceSequenceFile from Picard, depending on the application.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1196 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-08 18:49:08 +00:00
jmaguire 0a67386525 .
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1195 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-08 16:59:36 +00:00
hanna d8fbb2b62c Refactoring; make a better home for the MalformedReadFilteringIterator.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1194 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-08 16:54:20 +00:00
kiran c78a72e775 Applies Fisher's Exact Test to determine whether there's a strand bias and, if so, filters the call out.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1193 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-08 16:14:11 +00:00
kiran b211f500a3 Applies secondary base feature to variants.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1192 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-08 16:13:29 +00:00
kiran 6e31057e6b Some changes involving output of marginal calls to different, per-filter files.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1191 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-08 16:12:57 +00:00
ebanks 787c84d68b only compare pair position for paired end reads
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1190 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-08 04:07:08 +00:00
andrewk d3daecfc4d Added unit tests for function in ListUtils to randomly sample lists with replacement, updated AlleleFrequencyEstimate to provide a callType of HomRef, HetSNP, HomSNP, updated indices in CoverageEval.py, and made a lot of changes to CoverageWalker, the biggest one being that it directly calls SingleSampleGenotyper instead of implementing some parts of SSG itself.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1189 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-08 02:05:40 +00:00
hanna 4ba2194b5e Filter reads whose alignment starts past the end of the contig to which it allegedly aligns.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1188 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-07 22:27:44 +00:00
hanna 194b75613b Fix compile problem with unit tests.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1187 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-07 20:29:31 +00:00
jmaguire 1db15ee468 made some things protected so that I can inherit them in MultiSampleCallerAccuracyTest
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1185 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-07 15:50:28 +00:00
jmaguire 1fa71aa31d Now outputs stats. Doesn't do the downsampling thing because I think I'll have enough counts.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1184 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-07 15:29:31 +00:00
hanna 5d7393d7cb Temporary fix for Eric's problems with SOLiD reads: make sure the command-line argument system takes the --validation-strictness command-line argument into account when creating SAMFileReaders.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1183 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-07 15:18:05 +00:00
aaron f6a273a537 other fixes for some broken unit tests
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1181 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-07 05:53:13 +00:00
aaron 033bafe7a1 fixed sam by reads test for the new filtering code
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1180 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-07 05:45:50 +00:00
aaron 2a86f2f833 an initial pass at the GLF reader, and some other genotype changes to phase out the LikelihoodObject I created.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1179 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-07 04:30:27 +00:00
hanna 5735c87581 Basic infrastructure for filtering malformed reads.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1178 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-06 22:50:22 +00:00
depristo b9d533042e Two-tailed HardyWeinberg test implemented. VariantEval now separates violations from summary outputs for clarity; Fixing problems with CovariateCounterTest and TabularRodTest
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1177 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-06 22:02:04 +00:00
hanna 31313481f6 Temporary patch to filter out bad alignments that aren't quite fully reported as bad.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1176 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-06 18:41:55 +00:00
mmelgar 6580211c2a First version of depth of coverage filter. Right now it takes in a maximum coverage threshold given by the user.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1175 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-06 18:22:46 +00:00
ebanks fac7ac5142 Don't print out 0 coverage (which is always 0)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1174 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-06 17:44:32 +00:00
hanna d19366eaad Cleanup emergency fixes for out-of-bounds issues in reference retrieval. Fix spelling mistakes.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1173 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-06 15:41:30 +00:00
kcibul 000d92a545 added gc calculation
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1172 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-06 13:07:04 +00:00
ebanks 338cdbebad deal with screwy solid reads in the cleaner (no cigar strings)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1171 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-05 16:49:58 +00:00
jmaguire 8bcbf7f18a First draft of multi sample caller accuracy test.
Doesn't do its job yet but the pieces are in place.



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1170 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-05 16:29:13 +00:00
jmaguire 4019cd2bd7 Added ROD for parsing hapmap3 genotype files.
Tweak to TabularROD to allow HapMapGenotypeROD to work.
Added HapMapGenotypeROD to list of RODs in ReferenceOrderedData.java.
Modified MultiSampleCaller to return a single object with most of the relevant information.



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1169 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-05 16:28:24 +00:00
ebanks e5e249d4ac temporary fix to deal with screwy SOLiD reads
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1168 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-05 03:25:57 +00:00