Commit Graph

2810 Commits (4a7902bb8ea7345b90e33596e44ddd2978e2bdde)

Author SHA1 Message Date
ebanks 3b5673d967 1. Removed -all; by default all modules are used; use -none for no modules.
2. Don't make dbsnp track be a comp by default (to cut back on output). Please let me know if someone wants this back for some reason.
3. Cleaned up dbsnp module output to print the right numbers.



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3220 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-21 02:46:42 +00:00
aaron 4e18c54bb8 fixing a couple of commented out portions of the VCFReader test
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3219 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-20 22:20:35 +00:00
asivache 6fda78f93f Always return deleted bases in upper case
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3218 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-20 19:17:40 +00:00
asivache 52a570637d Always keep event bases in upper case
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3217 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-20 19:16:39 +00:00
aaron 80c4f88a72 removing the Variation interface.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3216 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-20 18:56:45 +00:00
asivache 7d952a34ae Fixing copyright note
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3215 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-20 18:28:57 +00:00
asivache cdc175f7e3 Synchronizing version to make sure everything compiles; this model is not operational yet
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3214 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-20 17:41:52 +00:00
asivache 4437456bb5 Pass array of ref bases to callExtendedLocus()
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3213 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-20 17:41:13 +00:00
asivache 5d2fab93f4 Method signature changed: for extended events, pass array of reference bases (to ensure we cover the full length of the indel event), not just reference base.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3212 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-20 17:40:30 +00:00
asivache 01e6492ba9 Updated to work correctly with extended pileups. Clogged and uses some dirty tricks; pileups/extended pileups need to be redesigned someday
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3211 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-20 17:38:09 +00:00
asivache 4723cad1be New method: getBasesAtLocus(int n); for the windowed reference context, this method extracts n bases starting at the current locus (NOT at the window start, so this method is an extension of getBase())
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3210 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-20 17:35:09 +00:00
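The getBasesAtLocus(int n) commit above describes reading n bases from the current locus rather than from the window start. A minimal sketch of that indexing follows, assuming a byte-array window; the class name WindowedReferenceSketch, its fields, and the clipping at the window end are illustrative assumptions, not the repository's actual reference context.

```java
import java.util.Arrays;

// Hypothetical stand-in for a windowed reference context; not the repository's class.
class WindowedReferenceSketch {
    private final byte[] windowBases; // reference bases covering the whole window
    private final int windowStart;    // genomic position of windowBases[0]
    private final int locus;          // genomic position of the current locus

    WindowedReferenceSketch(byte[] windowBases, int windowStart, int locus) {
        if (locus < windowStart || locus >= windowStart + windowBases.length) {
            throw new IllegalArgumentException("locus lies outside the window");
        }
        this.windowBases = windowBases;
        this.windowStart = windowStart;
        this.locus = locus;
    }

    /** The single reference base at the current locus. */
    byte getBase() {
        return windowBases[locus - windowStart];
    }

    /**
     * Up to n bases starting at the current locus, not at the window start.
     * Assumption: the result is clipped where the window ends.
     */
    byte[] getBasesAtLocus(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative");
        }
        int from = locus - windowStart;
        int length = Math.min(n, windowBases.length - from);
        return Arrays.copyOfRange(windowBases, from, from + length);
    }
}
```

With n = 1 the result holds exactly the base that getBase() returns, which is the sense in which the method extends getBase().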
asivache cac125b35c Fixed incorrect symbol printed into the output file (tag had 'R', should have had 'T')
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3209 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-20 16:37:28 +00:00
rpoplin f4977965b6 Removing debug statements
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3208 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-20 16:22:40 +00:00
rpoplin 124b7a2a58 Moved ApplyVariantClusters over to VariantContext
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3207 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-20 16:20:25 +00:00
asivache 200d3e2c47 added copyright note
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3205 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-20 15:44:26 +00:00
asivache 546dfb629e A draft (working) version of a tool that computes per-cycle base qualities averaged across the reads; the computed base qual profiles are stratified by lane/read end and separately by library. Come and shoot me if we already have such a tool somewhere in the repository :)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3204 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-20 15:38:16 +00:00
hanna c1e53d407d The copyright tag that I copied/pasted from a LaTeX document into IntelliJ had
unicode quote characters embedded in it.  These characters were invisible inside
IntelliJ but cause compile warnings for Ryan and Aaron, who for whatever reason
have a different default charset.  Fixed.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3203 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-20 15:26:32 +00:00
aaron b5f6f54968 Almost done removing any trace of the old Variation and Genotype interfaces.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3202 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-20 14:52:15 +00:00
hanna 818a95ea6e Test of new copyright message without unicode characters.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3200 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-20 14:14:54 +00:00
rpoplin 00feb3eee0 Moving over to VariantContext in CountCovariates. Removed references to class Variation.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3199 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-20 13:26:22 +00:00
hanna 1bc26f69e9 An attempt to cleanup the Utils directory. Email to follow.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3198 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-19 23:00:08 +00:00
hanna c08936d6f4 Added a reservoir downsampler which can sample elements in an iterator uniformly
from a stream (see Vitter 1985).  Thanks to Eric and Andrey for the pointer.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3197 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-19 20:48:14 +00:00
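The reservoir downsampler commit above samples uniformly from a stream of unknown length. A minimal sketch of the basic technique (the simple "Algorithm R" form of reservoir sampling) follows; the class and method names are hypothetical, and the committed downsampler may well use a different variant.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Random;

// Hypothetical illustration of reservoir sampling; not the committed downsampler.
class ReservoirSketch {
    /** Returns up to k elements drawn uniformly at random from the stream. */
    static <T> List<T> sample(Iterator<T> stream, int k, Random rng) {
        if (k < 0) {
            throw new IllegalArgumentException("k must be non-negative");
        }
        List<T> reservoir = new ArrayList<>();
        long seen = 0;
        while (stream.hasNext()) {
            T item = stream.next();
            seen++;
            if (reservoir.size() < k) {
                // The first k elements fill the reservoir directly.
                reservoir.add(item);
            } else {
                // Pick a slot uniformly in [0, seen); the item is kept only if
                // the slot falls inside the reservoir, i.e. with probability k/seen.
                long slot = (long) (rng.nextDouble() * seen);
                if (slot < k) {
                    reservoir.set((int) slot, item);
                }
            }
        }
        return reservoir;
    }
}
```

The stream is consumed once and only k elements are held in memory, which is the property that makes the approach attractive for iterators whose length is not known in advance.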
ebanks c44f63c846 Fixing the performance tests: we need to catch the RuntimeException (not samtools' RuntimeIOException). Also, CountCovariates doesn't need the catch.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3196 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-19 14:28:12 +00:00
ebanks abf48cee05 Moving over to VariantContext from Variation
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3195 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-19 06:56:29 +00:00
ebanks d73c63a99a Redoing the conversion to VariantContext: instead of walkers passing in a ref allele, they pass in the ref context and the adaptors create the allele. This is the right way of doing it.
Also, adding some more useful integration tests.



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3194 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-19 05:47:17 +00:00
aaron 131703d9db more clean-up: moving AlleleBalanceInspector to archive.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3192 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-16 20:53:33 +00:00
ebanks 534f24177a Move to VariantContext and improve performance (and ease of use) by transitioning to be a RODWalker.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3191 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-16 20:09:48 +00:00
ebanks 8c32bb8f0a Complete the move over to VariantContext so that we can remove dependence on Variation (in the VCF code)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3190 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-16 19:41:42 +00:00
aaron 821e8b1c5f more cleanup.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3189 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-16 19:16:16 +00:00
aaron e11ca74eb5 removing some outdated ROD classes (PooledEMSNPROD and SangerSNPROD), removing an out-of-date interface (VariantBackedByGenotype), and moving AnalyzeAnnotationWalker over to VariantContext.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3188 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-16 18:59:29 +00:00
ebanks d5e5589b8f No longer used
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3187 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-16 17:57:39 +00:00
aaron be7cbf948b adding a catch for the exception thrown by samtools when it attempts to close /dev/null in the performance tests.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3186 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-16 17:41:48 +00:00
aaron 4d75b26b7a Removing the code that made the ROD system case insensitive. Anyone using specific ROD names in their classes should take care in naming required tracks; all lowercase is the best practice.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3184 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-16 06:17:31 +00:00
asivache 6dc1275cfb Utility method added: getQualsInCycleOrder(read) - examines the read and returns its quals in the order the machine read them (i.e. always from cycle 1 to cycle N). Simply inverts quals if the read happens to be rc-aligned :)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3183 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-16 00:15:57 +00:00
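The getQualsInCycleOrder(read) commit above returns qualities in machine-cycle order by reversing them for reverse-complement alignments. A minimal sketch of that idea follows, assuming the qualities arrive as a byte array in alignment order together with a strand flag; the class name and signature are hypothetical and differ from the committed utility, which takes the read itself.

```java
// Hypothetical illustration; the committed utility takes a read object instead.
class CycleOrderSketch {
    /**
     * Returns the qualities ordered from cycle 1 to cycle N.
     * Reads aligned to the negative strand are stored reversed, so their
     * qualities are flipped back; forward-strand reads are copied unchanged.
     */
    static byte[] qualsInCycleOrder(byte[] alignedQuals, boolean negativeStrand) {
        byte[] out = alignedQuals.clone(); // never modify the caller's array
        if (negativeStrand) {
            for (int i = 0, j = out.length - 1; i < j; i++, j--) {
                byte tmp = out[i];
                out[i] = out[j];
                out[j] = tmp;
            }
        }
        return out;
    }
}
```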
ebanks f4673efd2f Moving to archive as it's no longer supported
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3182 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-15 22:10:42 +00:00
ebanks 02a6f4c401 Moving over to VariantContext
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3181 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-15 22:07:28 +00:00
ebanks 7adff5b81a Renaming for consistency
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3180 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-15 20:36:19 +00:00
ebanks e702bea99f Moving VE2 to core; calling it "VariantEval" (one more checkin coming)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3179 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-15 20:25:47 +00:00
chartl ac6f6363ce Execs() temporarily disabled after removal of bam file. New tests forthcoming.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3178 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-15 20:11:56 +00:00
ebanks ac9dc0b4b4 Removing VariantEval (v1); everyone should be using VE2 now. Docs coming ASAP.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3177 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-15 19:53:02 +00:00
ebanks 3330e254a9 Standardize the dbsnp track name in preparation for case-sensitivity
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3176 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-15 19:41:57 +00:00
ebanks 5f7564bf0a Better naming of output columns
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3175 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-15 18:08:07 +00:00
aaron e682460c1f add a fix so that XL arguments won't cancel out -BTI arguments, fixed a bug for Ben where the ROD -> interval list conversion was throwing an exception, and some old code removal.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3174 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-15 16:31:43 +00:00
aaron b54031fc86 adding an experimental format to VariantEval2, which when you source() from R, imports all VE2 output as individual tables with appropriate row and column names. More testing and feedback needed.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3172 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-15 06:09:27 +00:00
ebanks 04909fa6ad Removing arbitrary selects
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3169 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-14 17:46:39 +00:00
ebanks f1189bac5a Bug fix: final map call wasn't being triggered (because we returned when ref==null before applying update0)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3168 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-14 16:58:55 +00:00
weisburd b930dc52a5 Integration test for GenomicAnnotator
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3167 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-14 14:43:25 +00:00
weisburd c0f4695902 Improved handling of haplotypeReference and haplotypeAlternate columns. Added haplotypeStrand column. Improved handling of empty fields in data files.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3166 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-14 14:42:19 +00:00
weisburd 74ec72d1ac Added AnnotatorROD - the TabularROD format specific to GenomicAnnotator
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3164 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-14 14:39:50 +00:00
weisburd 77a6608784 Changed a variable name
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3163 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-14 14:38:18 +00:00
weisburd 7b8056099c Fixed 'N' reference-base handling, changed some comments, var names
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3162 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-14 14:37:25 +00:00
ebanks dde092fb61 Added the ability in VE2 to select which eval modules to run, so that you aren't forced to use all of them. You can use --list to list all of the possible modules to run.
Heads up everyone: by default, *no* modules are run.  Please add "-all" to your scripts to maintain the previous behavior.



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3161 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-13 22:15:58 +00:00
ebanks 0b575596f8 Fix for concordance: samples found only in truth no longer kill it.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3160 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-13 21:33:49 +00:00
hanna 8573b0bc6f Refactoring intervals, separating the process of parsing interval lists,
sorting and merging interval lists, and creating RODs from intervals.  This
gives Doug the ability to keep using our interval list parsing code when
sorting intervals on our behalf.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3159 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-13 15:50:38 +00:00
weisburd d0123956bc Modified comments.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3158 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-13 15:41:59 +00:00
chartl 7b05091c04 DoC now does not require a -o argument. (Change for Matt)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3157 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-13 13:58:17 +00:00
ebanks e413882302 Generalizing the SequenomValidationConverter to be able to take in any arbitrary rod type (provided it can be converted to VariantContext).
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3155 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-12 20:42:18 +00:00
hanna 14b8101d45 Error message fail. Failed to supply one of the valid interval file types.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3153 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-12 01:19:01 +00:00
hanna 60d54e69f3 Hackish fix to present a better error message if the file does not have the proper extension. Will work with Brett to come up with a better solution.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3152 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-12 01:11:27 +00:00
ebanks d06c7835d8 Adding performance tests for the indel realigner; should take ~3 hours.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3151 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-11 04:45:22 +00:00
ebanks 3434a61146 Don't trigger when ref=N (which can happen when a dbsnp track is provided)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3150 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-11 02:59:11 +00:00
ebanks 961ca05abc Removed outdated Sequenom rod and renamed HapMapGenotypeROD to HapMapROD.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3149 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-11 01:43:07 +00:00
ebanks fa01876255 UnifiedGenotyper performance tests (WG, WEx); currently takes just over an hour.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3148 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-09 19:42:29 +00:00
ebanks 0cc6d0fbbb One more quick memory improvement: reuse Alleles in a given context instead of creating new ones for each sample (duh).
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3147 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-09 18:48:36 +00:00
rpoplin c2a37e4b5c Variant Quality Score modules in VariantEval2 no longer create huge lists which hold all of the quality scores encountered and instead cast the quality score to an integer and use hash tables. Bug fix for files in which all the quality scores are set to -1.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3146 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-09 18:36:06 +00:00
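The Variant Quality Score commit above replaces a list of every score seen with integer-keyed hash tables. A minimal sketch of that trade follows, under the assumption that a plain truncating cast is used for binning; the class and method names are hypothetical and not taken from VariantEval2.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of binning quality scores into a hash table.
class QualityHistogramSketch {
    // One counter per integer score, so memory grows with the number of
    // distinct bins rather than with the number of variants seen.
    private final Map<Integer, Long> counts = new HashMap<>();

    /** Casts the score to an int (truncating toward zero) and counts it. */
    void add(double qual) {
        counts.merge((int) qual, 1L, Long::sum);
    }

    /** Number of scores that fell into the given integer bin. */
    long count(int bin) {
        return counts.getOrDefault(bin, 0L);
    }

    /** Number of distinct bins observed so far. */
    int distinctBins() {
        return counts.size();
    }
}
```

A file in which every score is -1 simply produces a single bin keyed by -1, so the all-missing case needs no special storage.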
ebanks 71f38a9199 Adding performance tests for the recalibrator (Whole Genome and Whole Exome tests).
Should take ~3 hours to run.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3145 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-09 18:30:59 +00:00
ebanks e73e6a4fb0 Significant memory improvements to plink code
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3144 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-09 16:12:38 +00:00
rpoplin f1b1e70612 Bug fix for multisample calls in ApplyVariantClusterWalker
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3142 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-09 12:01:15 +00:00
ebanks 3f2455e346 Better error message as suggested by James P
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3141 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-09 05:52:53 +00:00
ebanks fba48b515a Heads up everyone:
For consistency, these tools should be writing to the walker's output stream and no longer use the -vcf argument.



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3140 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-09 05:37:25 +00:00
ebanks e286623f6f Use byte[] instead of String in an attempt to cut down on memory usage
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3139 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-09 05:32:54 +00:00
chartl 7025f5b51d Added an auxiliary table to DepthOfCoverage, which is the cumulative equivalent of the locus table (got tired of doing the calculation by hand). Also took care of a trailing tab in the per-locus output table.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3138 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-08 19:37:17 +00:00
aaron 9f6377f7fb added a performance test build option (for the upcoming performance test suite), and added a sample performance test for VariantEval.
IMPORTANT: having both -Dsingle and -Dsingleintegration to run single unit tests and integration tests was redundant; now you can just use -Dsingle to run a single performance, unit, or integration test.

git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3136 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-08 15:37:15 +00:00
aaron 4014a8a674 A long overdue correction; all unit tests now end in 'UnitTest'. This was something we wanted to do for a while, and now with the performance tests coming, it was a good time to clean-up. Please label any new test appropriately: *UnitTest and *IntegrationTest are the two valid file name patterns for tests.
Thanks!



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3135 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-08 06:14:15 +00:00
aaron e148a3ac61 added the ability to create interval lists directly from a ROD, using the command line arg '-BTI' (long name '--rodToIntervalTrackName'). The parameter to this arg is the name of the ROD track, which must be a track name specified in the -B option.
Using this feature, sites covered by the target ROD will be iterated over.  The generated list of intervals is merged with any intervals from the -L and -XL args, and the Walker is run over the resulting merged list.

WARNING: for very large ROD's this can be costly.  Consider this experimental for now.

git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3134 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-08 05:14:41 +00:00
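The -BTI commit above merges ROD-derived sites with the intervals given on the command line. A minimal sketch of the merging step follows, assuming inclusive [start, stop] integer intervals on a single contig; the class name is hypothetical, and the exclusion semantics of -XL are deliberately left out because the commit message does not describe them.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration of sorting and merging intervals on one contig.
class IntervalMergeSketch {
    /** Merges overlapping or directly adjacent inclusive {start, stop} intervals. */
    static List<int[]> union(List<int[]> intervals) {
        List<int[]> sorted = new ArrayList<>(intervals);
        sorted.sort(Comparator.comparingInt((int[] iv) -> iv[0]));

        List<int[]> merged = new ArrayList<>();
        for (int[] iv : sorted) {
            int[] last = merged.isEmpty() ? null : merged.get(merged.size() - 1);
            if (last != null && iv[0] <= last[1] + 1) {
                // Overlaps or abuts the previous interval: extend it.
                last[1] = Math.max(last[1], iv[1]);
            } else {
                // Start a new interval; copy so the input is not modified.
                merged.add(new int[] {iv[0], iv[1]});
            }
        }
        return merged;
    }
}
```

Single-base ROD sites such as {5, 5} and {6, 6} collapse into one interval here, which keeps the merged list short when the track is dense.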
aaron 20cc2a85a4 removed the hashmap from Genotype Concordance, moved it into a table
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3133 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-07 21:24:48 +00:00
aaron e55f27b3b1 forgot a file
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3132 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-07 20:51:13 +00:00
aaron 9ca8e345fc bye-bye old junk.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3131 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-07 20:41:48 +00:00
aaron 8fd59c8823 Modified the report system based on Ryan's feedback: tables are now created independently to avoid the permutation problem when they were all compressed in rows, and removed our dependency on FreeMarker. The Grep format stays the same.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3130 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-07 20:39:55 +00:00
depristo 918b746798 More detailed validation output. Fixes for genotyping overflow -- these are temporary and need to be properly resolved
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3129 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-07 16:38:28 +00:00
ebanks e7dad728df Trivial output changes for consistency
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3128 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-07 14:47:43 +00:00
depristo 058e7d3d12 Bug fix for Gregory
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3127 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-07 00:21:35 +00:00
rpoplin 7b44e6bd55 ApplyVariantClusters now outputs interesting threshold points based on hitting the target novel TiTv
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3126 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-06 19:47:29 +00:00
rpoplin 60c227d67f Added new VE2 module to create a plot of titv ratio by variant quality score
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3125 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-06 15:19:27 +00:00
asivache 3530ef5a41 Explicit type cast fixed in order to work with new ROD implementation
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3124 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-06 15:02:56 +00:00
rpoplin 2d002c56c3 Added histogram of variant quality scores broken out by true positive and false positive calls to the GenotypeConcordance module of VariantEval2
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3123 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-06 13:48:31 +00:00
aaron 12e4f88ca7 a little bit more clean-up
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3122 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-05 20:49:06 +00:00
aaron df7e7921ce removing some unused code.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3121 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-05 19:30:08 +00:00
ebanks 56eb15f91f Error checking for bad input (thanks, Aaron).
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3120 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-04 03:17:01 +00:00
weisburd 705b28e90d First attempt at implementing record filtering based on special 'hap_ref', 'hap_alt' columns in the input files
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3118 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-02 21:52:26 +00:00
weisburd d78e7f6c0a Added documentation.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3117 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-02 21:51:28 +00:00
aaron 8017fb123f changed the depth of coverage walkers class name, and added a dependency in the packaging system so that RODs will all get imported.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3116 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-02 20:55:19 +00:00
weisburd 6b7b07f178 First checkin of GenomicAnnotator which annotates an input VCF file by pulling data in a generic way from an arbitrary set of TabularRODs.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3114 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-02 17:49:42 +00:00
rpoplin 642c969896 reverting optimizer changes
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3112 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-02 16:59:13 +00:00
chartl d7880ef7ad Forgot to uncomment the AlignerIntegrationTest before committing. And yes, Matt, commenting it out is, in fact, easier than just setting my classpath.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3110 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-01 17:17:16 +00:00
chartl f7d1b8f5de CoverageStatistics has now replaced DepthOfCoverage -- old DoC is in the archive.
Also, I can't be bothered to fix the spelling of "oldepthofcoverage" to contain the necessary number of D's. Be content that it does, however, contain the requisite number of O's.



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3109 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-01 16:27:23 +00:00
aaron 585cc880a2 changed jexl expressions to jexl names in the VariantEval2 output, fixed integration test, and fixed a problem where a line was getting dropped in CSV output
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3108 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-01 16:23:14 +00:00
hanna d00bde22db Reverting one of Brett's changes that should not have been committed. Will
address with Brett separately.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3107 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-01 16:10:46 +00:00
bthomas b4f6f54502 Reorganizing the way interval arguments are processed
Most of the changes occur in GenomeAnalysisEngine.java and GenomeLocParser.java: 
-- parseIntervalRegion and parseGenomeLocs combined into parseIntervalArguments
-- initializeIntervals modified
-- some helper functions deprecated for cleanliness
Includes new set of unit tests, GenomeAnalysisEngineTest.java

New restrictions: 
-- all interval arguments are now checked to be on the reference contig
-- all interval files must have one of the following extensions: .picard, .bed, .list, .intervals, .interval_list



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3106 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-01 12:47:48 +00:00
aaron c3c6e632d1 support for two new VCF header info field value-types, Flag (for fields that are just boolean truths), and Character (for single character info fields).
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3105 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-01 03:11:32 +00:00
aaron 3d3d19a6a7 the last-mile commit for Tribble integration. The system is now ready for Tribble to be turned on, as soon as we've removed any dependencies in the ROD code on interfaces that aren't in the Tribble library (i.e. the Variation or Genotype interface on RODs). All of the walkers should be up to date.
A caveat: for anyone asking for all of the RODs back from the RefMetaDataTracker (if you're not using the facilities to get the track by name), you'll now be getting back a collection of GATKFeature objects.  This object will contain the track name, and a method for getting the underlying object (getUnderlyingObject()), which will be the traditional RodVCF, rodDbSNP, etc.  This layer is needed so we can integrate Tribble tracks (which don't natively have names).  Calls that ask for RODs by name will still get back the traditional reference ordered data objects (RodVCF, rodDbSNP, etc).

Sorry for the inconvenience!  More changes to come, but this is by far the largest (and has the greatest effect on end users).


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3104 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-31 22:39:56 +00:00
hanna 4fcee248f9 For Kristian: functions which, given a read, can uniquely identify the BAM file storing that read.
Introducing this into the pile of code which peeks under the covers of the SAMDataSource in the hopes
that this function can help to replace the others and provide a single path for crosstalk.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3103 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-31 20:46:44 +00:00
rpoplin d58fe70708 Correctly ignore filtered calls and indel calls in the truth sets
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3101 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-31 14:33:01 +00:00
hanna b60197ae10 Another round of cleanup and simplification in Picard -- Picard's unit tests
are now passing for my branch.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3100 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-31 01:02:59 +00:00
depristo 40f8e7644c Better, multi-haplotype aware haplotype scores. Looking very good now, seems to be vastly better at dealing with incorrect calls in deep and low pass data. Almost ready for use
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3099 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-30 23:57:36 +00:00
depristo f992f51a3b Deleting incorrect sampling genotype likelihoods from the codebase
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3098 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-30 23:56:35 +00:00
kiran b9d3fc3fbb Now checks if the i-th element of the FiltrationContext[] is null before trying to access it. This seems to happen occasionally at the very end of a VCF file... the array will be 6 elements long, but the last element will actually be null.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3097 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-30 22:40:17 +00:00
hanna 400684542c Revisions to take into account finalization of Picard patch: naming changes, better definition
of public interfaces.  This won't be the last Picard patch, but it should be the last big one.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3096 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-30 19:28:14 +00:00
aaron b00d2bf2bc fixing an annotation that was breaking the error log output system.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3095 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-30 15:34:04 +00:00
aaron a6e8687d71 implementing a clean way to import the template files into the GATK jar (they should not always get bundled). All further resources should be added to the gatk.resources path id in the build script.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3094 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-30 04:20:19 +00:00
ebanks babb9fb825 snp cluster filter should ignore ref calls when determining the clusters
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3093 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-29 17:57:33 +00:00
chartl 24461a2503 Let's *not* import classes that no longer exist. How my own ant test compiled is beyond me.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3091 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-29 13:59:01 +00:00
chartl dc802aa26f Moved CoverageStatistics to core. This will be (soon) renamed DepthOfCoverage; so please use CoverageStatistics
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3090 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-29 13:32:00 +00:00
ebanks 1e8b3ca6ba Fare thee well, oh LocusWindowTraversal.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3089 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-29 13:17:26 +00:00
depristo 8ea98faf47 Deleting the pooled calculation model -- no longer supported.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3088 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-29 11:44:27 +00:00
hanna 85037ab13f Fix for Kiran's sharding issue (Invalid GZIP header). General cleanup of
Picard patch, including move of some of the Picard private classes we use to Picard public.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3087 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-29 03:21:27 +00:00
depristo a45ac220aa Removing unnecessary printing routines
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3086 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-28 22:34:54 +00:00
depristo b8ab74a6dc Minor useful changes to BaseUtils and MathUtils to support a new haplotype score annotation that determines the two most likely haplotypes over an interval and scores variants by their consistency with a diploid model. Appears to be useful.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3085 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-28 21:45:22 +00:00
kshakir e9e53f68ab Filter lists can now end with .list or .txt.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3084 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-27 17:41:24 +00:00
aaron 074ec77dcc First go of the new output system for VE2. There are three different report types supported right now (Table, Grep, CSV), which can be
specified with the reportType command line option in VE2.



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3083 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-27 03:59:32 +00:00
kiran 85f4f66180 Updated to use VariantContext. Output has been reformatted: variant and genotype concordance are emitted for every coverage level per variant. If the requested sampling level is higher than what's available, the maximum available coverage at that locus is used. This makes it much easier to make plots indicating the percentage of comparison callset recovered at a certain sampling depth.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3082 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-26 21:02:43 +00:00
kiran 391e5843e4 If the annotation engine has not been supplied, don't try to annotate anything.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3081 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-26 20:52:21 +00:00
kiran 8048b709a0 Selects a single sample on which to operate.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3080 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-26 20:50:58 +00:00
kshakir 20e3ba15ca Added an optional argument -rgbl --read_group_black_list to filter read groups.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3079 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-26 19:38:57 +00:00
ebanks 73a14a985b Moving VariantsToVCF to core.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3078 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-26 18:55:12 +00:00
ebanks 14bf6923a8 HapMap-to-VCF now works fine within Variants-to-VCF. Added integration test for it and removed old code.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3077 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-26 18:34:59 +00:00
hanna 78af6d5a40 New sharding system is going live again for on-the-fly merging.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3076 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-25 18:39:04 +00:00
hanna 46c14ec63f New, much less memory intensive implementation of BAM file sharding. Streams indices together with the expectation
that bins will be present in the bin sparse array, which avoids the problem of having to hold the sparse bin array
stored in every BAM file index in memory at the same time.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3075 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-25 17:41:22 +00:00
ebanks 4398a8b370 Updated. Now uses VariantContext and is truly "variants" to vcf (i.e. not just GELI to vcf).
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3074 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-25 04:53:31 +00:00
ebanks 2373a4618f bug caused by a misprint: context != contexts
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3073 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-25 03:08:24 +00:00
ebanks 3176715c74 1. Alignability mask returns null when not available.
2. --list now prints out the available classes/groups too.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3072 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-24 20:49:07 +00:00
rpoplin 06a212e612 Adding VariantConcordanceROCCurveWalker to create ROC curves comparing concordance between optimized call sets and validation truth sets in VCF format in order to evaluate performance of variant optimizer independently of achieving a particular novel ti/tv ratio. Added option to ignore only the specified filters in the input call sets via --ignore_filter <String>. Added option to provide a prior estimate of error for known snps via --known_prior <qual>. The het and hom calls are clustered independently. Infrastructure in place to use titv of known snps to inform p(true) of novel snps. Tweaked protection against overfitting based on suggestions from several people. Minor edits to AnalyzeAnnotations.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3071 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-24 19:43:10 +00:00
ebanks 47e30aba92 Rods for reads hooked up into the cleaner
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3070 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-24 18:17:56 +00:00
aaron 5079f35e40 better method names for read based reference ordered data access.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3069 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-24 16:13:31 +00:00
ebanks 49117819f5 For the cleaner to clean, it must beat the entropy produced by the aligner (and not just the raw reads).
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3068 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-24 15:21:58 +00:00
aaron 60dfba997b added some sample annotations to VariantEval2 analysis modules, and some changes to the report system.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3067 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-24 05:40:10 +00:00
hanna 1f451e17e5 Changing preloaded index to only "preload" reference sequences on demand.
Results in drastic lowering of startup cost when multiple BAM files are 
merged.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3066 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-23 22:02:28 +00:00
hanna 884a577013 Phase 2 of Picard patch refactoring: kill off SAMFileReader2/BAMFileReader2, merging the changes back into the base classes.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3065 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-23 16:48:11 +00:00
aaron 7462a0b2d1 clean-up of VariantContextAdapter tests, fixed the double comparisons in equals() in RodGeliText (nice MathUtils.compareDoubles Kiran)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3064 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-23 15:18:30 +00:00
aaron a69b8555dd Geli to variant context.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3063 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-23 06:45:29 +00:00
aaron eafdd047f7 GLF to variant context. Added some methods in GLF to aid testing; and added a test that reads GLF, converts to VC, writes GLF and reads back to compare.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3062 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-23 03:43:25 +00:00
hanna 3767adb0bb Processing intervals as they stream in means much lower memory usage and
quicker runtime.  Making change as minimal as possible to avoid conflicts
with BT's incoming patch.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3061 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-22 22:04:45 +00:00
ebanks 0097106938 VariantFiltration can now filter specific samples.
This is *NOT* an ideal implementation.  One day when we have lots of free time (or a greater desire), we will implement this correctly and sophisticatedly using all the power of JEXL.  For now, though, this will have to do.
Docs coming tonight.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3060 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-22 20:45:11 +00:00
asivache 543aefc3d7 Fixing the bug introduced with the earlier commit. When trimming locus to the current bases, we need to take into account expanded boundaries (for windowed reference traversals)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3059 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-22 19:20:34 +00:00
asivache ee1dc6092f Test updated. Now we do not throw an exception when locus interval is out of bounds, we just return silently a reference context trimmed to the current shard boundaries. New test checks for trimming.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3058 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-22 17:37:52 +00:00
asivache d2944461ef We also have to allow the window to be (partially) outside the bounds and trimming to the contig size is not enough (thanks to shards). Now we trim to the current bounds too (i.e. if the interval is not completely within current bounds, we create reference context that contains only bases from the overlap between the interval and the bounds).
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3057 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-22 17:36:29 +00:00
asivache 9053406798 LocusReferenceView: If the locus a view is requested for spans beyond the reference contig ends, create the actual window bounded by contig ends (so that the locus will not be fully contained in the window!!).
ReferenceContext: constructor does not throw an exception anymore when locus is not fully contained inside the window. So now we can have a reference context associated with a locus such that the window/actual bases do not cover the whole locus. Scary. I am not sure I like this...

git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3056 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-22 15:59:15 +00:00
aaron 439c34ed38 clean-up before annotating VariantEval2 for output.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3055 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-22 07:39:20 +00:00
depristo 076d21d394 Minor bug workaround in GenotypeConcordance module (see todo). General platform read filter. You can say -rl Platform illumina to remove all SLX reads
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3054 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-22 02:47:09 +00:00
hanna 6cd97b78ab An additional safety check to ensure that we only walk over coordinate-sorted
data when doing locus traversals.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3053 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-21 23:31:45 +00:00