Commit Graph

66 Commits (433ad1f0607e37526201f5ee8fe26b36ea31de41)

Author SHA1 Message Date
hanna 433ad1f060 Cleanup...deprecate FastaSequenceFile2 in favor of IndexedFastaSequenceFile or ReferenceSequenceFile from Picard, depending on the application.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1196 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-08 18:49:08 +00:00
hanna 194b75613b Fix compile problem with unit tests.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1187 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-07 20:29:31 +00:00
aaron f6a273a537 other fixes for some broken unit tests
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1181 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-07 05:53:13 +00:00
hanna d19366eaad Cleanup emergency fixes for out-of-bounds issues in reference retrieval. Fix spelling mistakes.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1173 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-06 15:41:30 +00:00
aaron d4d3af20f2 made a fake fasta generator, so we can now generate a complete bam / fasta combo of made up data.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1150 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-01 21:35:34 +00:00
aaron f5cba5a6bb Fixed genome loc to be immutable; the only way to change its values now is through the GenomeLocParser.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1132 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-30 19:17:24 +00:00
aaron d7d4298917 Some files to support generic genotype outputting
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1112 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-26 15:43:41 +00:00
aaron 5b1c23a7f2 changes to fix and test the interval based traversals
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1095 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-25 17:54:15 +00:00
aaron bcb64d92e9 Aaron: 1, GenomeLoc: 0. I changed our GenomeLoc class, separating the creation of a genome loc (with the reference setup) to a parser class. GenomeLoc now just represents the actual genomic position. The constructors are now package-protected (to enforce using the parser), but we may want to expose some constructors in the future.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1069 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-22 14:39:41 +00:00
depristo 8ac40e8e2d Updated version of the recalibration tool
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1060 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-19 17:45:47 +00:00
aaron 6ee64c7e43 added changes to support alec toUnmappedRead seek. Huge improvements (orders of magnitude) in unmapped read performance.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1021 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-16 22:15:56 +00:00
hanna 71e3825fa1 First pass of a walker for Eric that searches through an input BAM file for unclean reads, injecting the cleaned reads in their place and outputting the composite result.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@989 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-11 20:18:13 +00:00
aaron 195b4ea7b4 a rename for consistency of Sam to SAM, creating a genotype utils dir, and moving the GLF code into it.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@984 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-11 17:46:06 +00:00
aaron 36c98b9d6c added tools to test read based traversals using the artificial in-memory SAM file tools, and testing of the PrintReadsWalker
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@957 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-10 01:52:25 +00:00
aaron eb962fe52a adding an artificial sam file writer, used to unit test some of the walkers (mainly the PrintReadsWalker)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@956 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-09 21:47:49 +00:00
kiran af0b03a257 Added tests for mostFrequentBaseFraction() and reverseComplementString()
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@944 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-09 00:53:45 +00:00
aaron 109bef6c08 We're no longer in the read-dropping business.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@901 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-04 22:37:51 +00:00
aaron 82aa0533b8 added some more documentation to the GLF writer and its supporting classes, and some other fixes
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@875 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-02 14:53:58 +00:00
aaron e712d69382 GLF writing support
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@872 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-01 21:30:18 +00:00
aaron b43deda6c9 iterative changes to GLF files; also a test of checking-in over sshfs.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@850 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-28 20:24:30 +00:00
hanna 5e8c08ee63 Update to latest version of picard. Change imports in all classes dependent on picard public from import edu.mit.broad.picard... to import net.sf.picard...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@849 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-28 20:13:01 +00:00
aaron d275c18e58 adding some objects we need for the GLF format.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@846 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-27 22:32:25 +00:00
aaron d994544c47 Added back end code support for Sharding based on genomic location for reads. Changed the sharding code to take GenomeLocSortedSet instead of a list<GenomeLoc>, and added a bunch of much simpler and cleaner test cases.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@816 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-26 20:57:46 +00:00
aaron d056f9f3e8 Changed the name to reflect the sorted nature of the set, added some fixes
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@810 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-22 22:34:24 +00:00
aaron 831d430025 Added a collection for storing GenomeLocs, that also has functions for removing by genomic region (that may span multiple GenomeLoc's in the collection), and adding regions, which are then merged with any overlapping regions.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@809 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-22 21:52:40 +00:00
depristo 7a979859a9 Intermediate checking for evaluation -- now supports transition / transversion evaluation
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@793 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-22 17:05:06 +00:00
kiran bdf772f017 Added test for determining the fraction of a sequence that's taken up by the most frequent base (quick-and-dirty homopolymer testing).
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@780 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-21 20:35:08 +00:00
kiran 324ef9cbd1 Test class for PathUtils.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@773 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-21 19:31:22 +00:00
hanna 7161b8f927 Disable support for short name values directly abutting their arguments.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@740 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-17 16:09:32 +00:00
hanna d14cab0be7 Added IterableLocusContextQueue and test. Cleaned up tests, adding BaseTest where it didn't exist. Enhanced test runner to run only classes ending in ...Test.java, so that utility classes can sit alongside the tests but won't be run by JUnit.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@693 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-13 21:32:05 +00:00
aaron 4ce3feba4d my move ended up being a copy, so this is to delete duplicate files.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@651 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-11 02:10:26 +00:00
aaron bae4256574 Started the process to make the GATK engine into a runnable object so we can call it from other processes. Step 1: make a configuration object that can serialize to and from an XML file. This way we can store the information everyone uses shell scripts for. Also we can now pull the list of params out of the GenomeAnalysisTK.java. More to come...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@636 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-08 01:25:26 +00:00
hanna 7f8850a8a2 Argument validation.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@631 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-07 20:28:56 +00:00
hanna d725c6cf1c Added unit tests for parsing failures that I encountered during integration testing.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@618 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-07 14:01:54 +00:00
hanna 4177560543 Mutually exclusive options.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@616 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-07 13:27:48 +00:00
hanna 98716138e9 Cleanup: add support for non-public fields. Track matches as state of parsing engine as well as definitions. Made fields of command-line argument system non-public by default.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@606 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-06 19:38:05 +00:00
hanna ef211f96b1 Remove old Apache CLI-based arg system.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@604 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-06 18:37:51 +00:00
hanna 521aa40baa Bring new command-line argument parsing system live.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@603 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-06 18:16:11 +00:00
hanna bfd6dfe36c Added real-world tests and tests for conditional validation.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@601 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-06 13:38:46 +00:00
hanna 2ee9374975 Check for proper error output in case of boolean args with parameter specified.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@599 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-05 23:08:48 +00:00
hanna b0cdba8bb3 Acting on Kiran's suggestion to make the doc tag in the @Argument annotation required.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@598 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-05 22:43:40 +00:00
hanna ec0261275b Lots of command line argument validation. Catches all common validation problems, including missing required arguments, invalid arguments, and several types of misplaced argument value errors.
Still pending:
- Help system.
- Mutually exclusive arguments.
- Design includes too many classes per file.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@597 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-05 22:08:00 +00:00
hanna 6550fe6f97 Another pass of command-line arguments. Revised parser supports all types of arguments that the existing parser supports, but does a poor job with validation.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@591 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-04 22:41:23 +00:00
hanna 4f2ccda56a Interface skeleton for a new command line argument parser. Nowhere near the point of being a drop-in replacement for apache cli yet.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@588 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-04 00:11:42 +00:00
depristo 7ed496b859 JUnit test for RefHanger
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@584 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-01 20:11:14 +00:00
kiran 9800d09608 A more thorough test for multinomialProbability.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@577 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-01 06:27:05 +00:00
aaron 3bf3c21ddd Changed the assert code in the genome loc to throw exceptions, and deleted a function no one seems to be using.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@569 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-30 13:54:51 +00:00
hanna ba9a0b5da8 Break out some of the weird inner classes out of the HierachicalMicroScheduler.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@566 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-29 21:07:07 +00:00
kiran eeb0b78cce Added another assert to testBinomialProbability() and added a test method for testMultinomialProbability().
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@544 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-27 14:59:11 +00:00
hanna e50ae97fe1 Introduce new index-based fasta reader. Clean up MicroManager code, pushing necessary code back into TraversalEngine.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@531 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-24 19:40:21 +00:00