aaron
ff1b92acc4
Switch over to the GenomeAnalysisEngine/CommandLineGATK system from the GenomeAnalysisTK code.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@655 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-11 22:05:58 +00:00
ebanks
009e71fcd9
We need to sort cleaned reads ourselves (instead of letting SAMFileWriter do it) because the SAM headers are often screwed up and claim to be "unsorted". While here, I broke off the module from the SortSamIterator in case someone else wants to use it.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@654 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-11 15:43:42 +00:00
aaron
c735e1f627
small javadoc cleanup.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@653 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-11 03:44:21 +00:00
aaron
e8b8ab5985
Added code to extend Matt's getReferenceBases out to the read walkers, so they can see the corresponding reference for each read.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@652 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-11 03:42:38 +00:00
aaron
4ce3feba4d
My move ended up being a copy, so this deletes the duplicate files.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@651 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-11 02:10:26 +00:00
aaron
898f65547e
Added code to split GenomeAnalysisTK.java into an object concerned with loading command-line args and one that runs the engines. This will allow us to run the GATK from other tools (like Matlab). Also some cleanup to separate out the legacy traversals and the new-style traversals. This is not live yet, and any modifications you need should be made to GenomeAnalysisTK.java for now.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@650 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-11 02:07:20 +00:00
aaron
8d43ec3d7e
A fix for a situation where a chromosome in the reference file contains no reads and doesn't align to the BAM file. This came up using reference 18, which has chromosomes like chr1_random that aren't in all BAM files.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@649 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-11 01:39:25 +00:00
aaron
ee02b61068
added support for the argument collections code
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@648 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-09 07:07:33 +00:00
aaron
742840017b
Added the argument collection annotation for situations where a command-line argument field has embedded fields that should themselves be checked for command-line args.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@647 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-09 06:59:17 +00:00
hanna
55c1b688bd
Fix mediocre javadoc.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@646 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-08 22:31:16 +00:00
hanna
522f8b58be
Added second method for getting large sequences of the reference for use in reads traversals.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@645 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-08 22:18:04 +00:00
aaron
517f27f331
Added sharding strategy code that picks the right kind of shard based on the traversal engine.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@644 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-08 21:55:10 +00:00
hanna
6e394490cb
Cleanup in preparation for ByLoci traversal. Also did some work minimizing unit tests.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@643 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-08 21:27:54 +00:00
hanna
ee777c89de
Change the default mechanism for adding ROD bindings to the new system. TODO: create a new object type for these triplets.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@642 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-08 18:43:00 +00:00
ebanks
3aabc144c6
Added functionality to allow for a contract between LocusWindowTraversalEngine and LocusWindowWalker which allows the Walker to act upon reads outside of the provided intervals.
...
(Really, all we want to do is spit out all reads, but this allows the Walker to do other things with the reads if it wants)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@641 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-08 17:28:16 +00:00
hanna
de1c282e62
Reference-ordered data relies on bugs in the old command-line argument system to work. Update the ROD system from -B track1 type1 file1 track2 type2 file2 to -B track1,type1,file1 -B track2,type2,file2.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@640 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-08 15:28:19 +00:00
hanna
483a58627b
More cleanup -- pushing shared functions down into the traversal engine.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@639 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-08 14:12:45 +00:00
hanna
7a9cfe1f75
Push reduceInit down a level so that the walker can call into it without weird casts.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@638 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-08 13:46:28 +00:00
hanna
a5154d99a3
Haven't heard any complaints, so I'm deleting the original implementation of TraverseByLociByReference. All TbyLbyR's will now go through the new sharding code.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@637 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-08 13:37:00 +00:00
aaron
bae4256574
Started the process to make the GATK engine into a runnable object so we can call it from other processes. Step 1: make a configuration object that can serialize to and from an XML file. This way we can store the information everyone uses shell scripts for. Also we can now pull the list of params out of the GenomeAnalysisTK.java. More to come...
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@636 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-08 01:25:26 +00:00
hanna
226edbdef6
Hyphen-style XML output. Much sexier.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@635 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-08 01:04:40 +00:00
hanna
4c269b8496
Cleanup LinearMicroScheduler in preparation for TraverseByLoci inclusion.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@634 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-08 00:58:37 +00:00
aaron
21536df308
Change the sample XML marshalling code over to Simple XML, and take out the Castor lines in ivy.xml.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@633 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-08 00:08:25 +00:00
hanna
7f8850a8a2
Argument validation.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@631 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-07 20:28:56 +00:00
hanna
a3d8febbf2
Error message cleanup.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@630 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-07 19:31:32 +00:00
hanna
c241d386a7
Beefed up command-line usage string.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@629 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-07 19:08:19 +00:00
depristo
5a6892900e
fixing oddities in duplicates
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@628 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-07 18:55:45 +00:00
depristo
4a26f35caa
new default syntax
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@627 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-07 18:16:53 +00:00
ebanks
283a4d1b54
Fix some special-case cleaner issues.
...
We now do the same as brute force in all examples to date.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@626 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-07 18:16:35 +00:00
depristo
93211c1cd8
template for windowmaker utility -- totally non-functional
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@625 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-07 18:13:03 +00:00
depristo
2204be43eb
System for traversing duplicate reads, along with a walker to compute quality scores among duplicates and a smarter method to combine quality scores across duplicates -- v1
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@624 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-07 18:06:02 +00:00
depristo
71e8f47a6c
boundQual function for capping qual values
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@623 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-07 18:04:18 +00:00
depristo
e848f34896
countOccurances of char in string and max of a list of bytes
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@622 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-07 18:03:49 +00:00
depristo
5a4bb76cc3
More capabilities for the pileup
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@621 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-07 18:03:13 +00:00
depristo
89a26a7078
Utilities for handling duplicates
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@620 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-07 18:02:24 +00:00
hanna
4f85062004
Cleanup parsing method to make it less generic.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@619 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-07 16:21:17 +00:00
hanna
d725c6cf1c
Added unit tests for parsing failures that I encountered during integration testing.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@618 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-07 14:01:54 +00:00
hanna
2f3ab53888
Oops. Arguments didn't load into applications with non-plugins (basically everything except the GATK).
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@617 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-07 13:37:19 +00:00
hanna
4177560543
Mutually exclusive options.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@616 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-07 13:27:48 +00:00
hanna
752928df94
Switch to better mechanism for supplying a default.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@615 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-07 01:22:01 +00:00
hanna
dc944ec69b
First stage of ROD plumbing for MicroScheduler.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@614 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-06 23:26:21 +00:00
aaron
5136724884
Added code to the schedulers, one step closer to turning on the new reads traversals
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@613 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-06 22:36:25 +00:00
hanna
9c0b81e946
Default flags to 'not required'.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@612 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-06 22:09:49 +00:00
asivache
072808858e
Added COUNT_CUTOFF arg: it is now possible to tell the code to try to realign all read piles over trains of nearby indels with at least one indel observed in COUNT_CUTOFF or more different alignments (set the arg to 1 to realign around all indels). Also added some diagnostic printouts to the output (time spent loading the reference, time spent scrolling through the input BAM file, counts of discarded reads).
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@611 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-06 21:59:33 +00:00
hanna
1fe8155111
Some critical fixes for cases where argument values directly abut argument names and for arguments with missing short names.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@610 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-06 21:47:34 +00:00
aaron
0aba688e6f
Added an interface that all our SAMRecord iterators should try to code to. This is in an effort to keep our code generic.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@609 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-06 21:40:41 +00:00
hanna
62e7e46754
Miscellaneous cleanup. Better display of help output. Better exception subtyping. More thought-out access routines.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@608 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-06 21:16:01 +00:00
ebanks
5be75e0ae6
First version of indel cleaner walker that works on intervals
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@607 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-06 20:20:48 +00:00
hanna
98716138e9
Cleanup: add support for non-public fields. Track matches as state of parsing engine as well as definitions.
...
Made fields of command-line argument system non-public by default.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@606 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-06 19:38:05 +00:00
aaron
f5eae98af2
Fixed a bug where we could ask for a read when there were none in the pool (that's a bad thing).
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@605 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-06 18:40:55 +00:00