Commit Graph

680 Commits (36fb6ca3c5a4b4305e6c273ccdc21c067da4a31f)

Author SHA1 Message Date
ebanks 36fb6ca3c5 Allow user to specify the compression to be used when writing out BAM files.
Updated most of the walkers to reflect this change.
Now it won't take forever to write BAMs!
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@909 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-05 08:48:34 +00:00
ebanks c1792de44f First pass at fixing the incorrect border-case behavior of the cleaner
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@908 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-05 07:55:06 +00:00
hanna 9da04fd9ac Cleaned up error warning in case no PL groups are present.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@907 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-05 03:14:17 +00:00
ebanks 45eeefbb80 Deal with randomly occurring unmapped reads
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@906 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-05 02:55:53 +00:00
hanna fdfc3abf80 Better handling for case where PL attribute is missing.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@905 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-05 02:52:30 +00:00
hanna 9689bb3331 Very early draft of script integrating the covariant counting / logistic regression. Deleted some unused code and spurious debug info.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@902 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-04 22:52:11 +00:00
aaron 109bef6c08 We're no longer in the read-dropping business.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@901 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-04 22:37:51 +00:00
ebanks 4d880477d6 Deal with ends of contigs
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@900 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-04 20:09:53 +00:00
hanna 40bc4ae39a The building blocks for segmenting covariate counting data by read group.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@899 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-04 19:55:24 +00:00
depristo 13be846c2a qualsAsInt argument for Pileup -- fixing stupid bug [again]
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@898 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-04 18:52:12 +00:00
depristo 97c8ff75dd qualsAsInt argument for Pileup -- fixing stupid bug
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@897 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-04 18:51:17 +00:00
depristo 9de3e58aa8 qualsAsInt argument for Pileup
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@896 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-04 18:37:39 +00:00
asivache 4d654f30d4 slightly improved error message printed upon failure to parse interval list file
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@895 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-04 18:24:43 +00:00
asivache bcc7bacba1 added List<Transcript> getTranscripts(); also more comments added
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@894 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-04 16:25:14 +00:00
depristo b492192838 Pairwise SNP distance metrics now enabled
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@892 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-04 00:11:29 +00:00
hanna 8672ae6019 Now seeing results from the training data. There are still some critical problems in the quality of the output, but we're at least getting training output.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@891 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-03 20:41:07 +00:00
ebanks 4e41646c88 print out stats for Andrey
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@890 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-03 17:45:35 +00:00
andrewk dfe464cd81 Updated CovariateCounterWalker to be read group aware
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@889 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-03 10:06:06 +00:00
aaron 40af4f085c Adding some utilities to test unmapped reads
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@887 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-03 07:40:34 +00:00
hanna fa93661133 Eric wins the prize for pointing out that doubles weren't valid command-line arguments. Made all primitive types parseable as command-line arguments.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@884 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-02 22:41:10 +00:00
aaron 107b5d73b5 The flagStatReadWalker generates the exact same statistical output as the samtools flagstat command, so the two outputs can be diff'ed.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@883 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-02 21:23:56 +00:00
kcibul a1218ef508 changed default value for failure output
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@880 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-02 19:32:29 +00:00
depristo 7e7c83ddca fixing insidious bugs
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@879 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-02 18:33:45 +00:00
hanna 6e60cddfed A fix for the 'rod blows up when it hits a GenomeLoc outside the reference' issue. Really a stopgap; error handling in the RODs needs to be addressed in a more comprehensive way. Right now, hasNext() isn't guaranteed to be correct.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@878 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-02 18:14:46 +00:00
kcibul ad5b057140 parameterized a bit more
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@877 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-02 17:58:26 +00:00
andrewk 587d07da00 Merged functionality of two python scripts into LogRegression.py, some clarity updates to covariate and regression java files.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@876 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-02 16:55:05 +00:00
aaron 82aa0533b8 added some more documentation to the GLF writer and its supporting classes, and some other fixes
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@875 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-02 14:53:58 +00:00
kcibul c4cb867d74 basic clustering of reads to reduce artifacts
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@873 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-02 02:54:21 +00:00
aaron e712d69382 GLF writing support
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@872 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-01 21:30:18 +00:00
jmaguire 417f5b145e Strand test and misc touch-ups
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@871 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-01 17:13:21 +00:00
aaron fc91e3e30e equals signs can be important
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@870 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-01 16:56:21 +00:00
aaron 4edb33788b added a fix for a bug Andrew found
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@869 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-01 16:53:56 +00:00
hanna b7defeae83 Fix bug in unit tests created by new filter in TraversalEngine.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@868 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-01 15:50:44 +00:00
hanna fc7320133c Cleaned up error when fasta index is missing. Code still throws an exception, but the message is more direct (no more 'error while micromanaging') and tells the user to run 'samtools faidx' to fix the issue.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@867 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-01 15:34:38 +00:00
depristo f19d7abba9 Added geli compatibility mode to SingleSampleGenotyper, to enable easy linking to the geli2popsnps.py script
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@866 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-01 14:32:12 +00:00
kcibul 4d6398cef9 a lot of people have been asking me for the equivalent of the old "PrintCoverage" command from Arachne. Even though I show them the pileup, and they agree that's more accurate/complete, they don't want to modify their scripts and/or write a translator. It was simple enough to write, so here it is.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@863 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-31 01:45:23 +00:00
hanna c04b67c969 Basic instrumentation support for the hierarchical microscheduler.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@862 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-29 22:19:27 +00:00
asivache c8347c3c94 set proper package name (...walkers.indels), remove couple of unused import statements
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@861 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-29 22:02:14 +00:00
asivache c549c34caa still in development and testing; kinda works
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@860 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-29 21:59:03 +00:00
asivache c252fec1bc synchronizing, no real changes
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@859 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-29 21:56:14 +00:00
asivache eafdba7300 more efficient implementation of line parsing, runs at least 1.5 times faster
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@858 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-29 21:09:06 +00:00
hanna 8761ab3aff Oops. IteratorPool was occasionally creating too many RODIterators in cases where some reference-ordered data was missing. Fixed by better tracking position of RODIterator.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@857 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-29 21:00:31 +00:00
asivache d601548d53 added reallocate(int[] orig_array, int new_size) and int[] indexOfAll(String s, int ch); the former is self-explanatory, while the latter returns an array of the indices of all occurrences of ch in the specified string
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@856 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-29 20:15:00 +00:00
hanna a1edb898ef Make criteria for determining whether to stop and merge inputs more sane.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@855 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-29 18:08:18 +00:00
asivache fe3b843b65 intercept NullPointerException and rethrow it with (marginally) comprehensible error message when an attempt to get class source code location fails
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@854 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-29 15:56:55 +00:00
depristo e0803eabd9 enabled underlying filtering of zero mapping quality reads, vastly improves system performance
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@853 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-29 14:51:08 +00:00
hanna 1f93545c70 Always opt to merge dictionaries when creating a SAMFileHeaderMerger.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@852 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-28 22:38:16 +00:00
hanna 0cf90b6f8a Tie into sequence merging code in the latest version of picard.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@851 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-28 21:48:35 +00:00
aaron b43deda6c9 iterative changes to GLF files; also a test of checking-in over sshfs.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@850 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-28 20:24:30 +00:00
hanna 5e8c08ee63 Update to latest version of picard. Change imports in all classes dependent on picard public from import edu.mit.broad.picard... to import net.sf.picard...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@849 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-28 20:13:01 +00:00