Commit Graph

1005 Commits (11aa7156304f9fa8bd724ade6c9a356b430a2e2d)

Author SHA1 Message Date
ebanks 11aa715630 added capability for filtering by platform
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1011 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-15 19:19:50 +00:00
ebanks 8f4bc8cb6e Move filtering functionality into the PrintReadsWalker. More to come.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1010 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-15 16:38:08 +00:00
kiran 161c74716c Forgot to change some direct references to variables in SSG. Fixed.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1009 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-15 14:16:18 +00:00
kiran 9eeb5f79d4 Various refactoring to achieve hapmap and dbsnp awareness, the ability to set pop-gen and secondary base priors from the command-line, and general code cleanup.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1008 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-15 07:21:08 +00:00
kiran f2946fa3e8 Various refactoring to achieve hapmap and dbsnp awareness, the ability to set pop-gen and secondary base priors from the command-line, and general code cleanup.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1007 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-15 07:20:22 +00:00
ebanks f6af190b74 ignore clipped reads for realigning indel positions
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1006 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-15 01:01:27 +00:00
hanna 93dc2cdc70 Start of a 'package' format for xml files which should be distributed together.
Uses xslt scripts to transform packages into build scripts.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1005 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-15 00:52:48 +00:00
kiran 0583459839 Another formatting change to make Hapmap sites more clearly visible.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1004 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-12 19:53:21 +00:00
asivache 811f560efb add refseq annotations to single sample calls
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1003 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-12 19:43:30 +00:00
kiran e9be2a9c60 Changed a formatting issue.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1002 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-12 19:40:32 +00:00
asivache ca09a10b76 refseq annotation rod is now manually bound to tell coding indels from non-coding ones
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1001 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-12 19:27:37 +00:00
depristo 260fd0dc45 Trivial change
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1000 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-12 19:11:28 +00:00
hanna 5859948e80 Fixed bugs in CleanedReadInjector arising from integration testing.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@999 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-12 17:37:33 +00:00
depristo fb7ba47fff Now does real neighbor distance calculation, as well as true SNP cluster counting
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@998 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-12 16:29:26 +00:00
jmaguire dbf2cc037c don't have a null-pointer hissy fit when the reference is N.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@997 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-12 13:59:16 +00:00
depristo 1fb241a8b8 Now supports resume and dry running RecalQual.py
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@996 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-11 23:31:59 +00:00
asivache 4eda040e0f what used to be internal cutoff values are now exposed as cmdline parameters: minCoverage, minNormalCoverage, minFraction, minConsensusFraction
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@995 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-11 21:22:52 +00:00
kiran 41687d5237 Added accessors for the prior probabilities.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@994 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-11 21:16:10 +00:00
kiran 12dd18cdba Now aware of Hapmap and dbSNP sites. We *can* change the priors there, but we don't yet.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@993 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-11 21:15:34 +00:00
asivache d5cd883b99 fixed a bug where a read whose alignment ends exactly at the window boundary and whose last cigar element is an indel would cause an index-out-of-bounds exception
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@992 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-11 21:03:15 +00:00
kiran a12009e9e7 Added a new constructor in which priors for hom-ref, het, and hom-var can be specified. Otherwise, it uses the default values of 0.999, 1e-3, and 1e-5 respectively.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@991 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-11 20:33:45 +00:00
kiran 909fefa40a Argumentized priors for hom-ref, het, and hom-var.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@990 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-11 20:32:44 +00:00
hanna 71e3825fa1 First pass of a walker for Eric that searches through an input BAM file for unclean reads, injecting the cleaned reads in their place and outputting the composite result.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@989 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-11 20:18:13 +00:00
ebanks 032d0436e6 Added ROD for 1KG SNP calls
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@988 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-11 19:53:51 +00:00
ebanks ffffe3b2f6 -Support for 1KG SNP calls in RODs
-Minor bug fix
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@987 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-11 18:56:37 +00:00
hanna 5440dd13df Preparation for point release of read calibrator: no artificial heap size limit, no duplicate dbsnp records.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@986 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-11 18:39:33 +00:00
aaron 63b5c12cbd Changed dataSources to datasources, to be consistent with the rest of our package names. Also, this makes me champion in the largest check-in contest.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@985 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-11 18:13:22 +00:00
aaron 195b4ea7b4 a rename of Sam to SAM for consistency, creating a genotype utils dir, and moving the GLF code into it.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@984 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-11 17:46:06 +00:00
ebanks 599ceeddd8 Better method for downsampling deep regions
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@983 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-11 16:57:40 +00:00
ebanks 4d9a88153a Update inferred insert size of cleaned reads when they are paired
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@982 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-11 16:29:13 +00:00
ebanks 3796654069 Added walker to emit intervals of clustered SNP calls
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@981 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-11 00:57:14 +00:00
hanna 678ddd914f Stopgap fix for GFF and DbSNP being half-open rather than half-closed.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@980 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-10 21:38:57 +00:00
aaron 94b0e46d12 checked in a sample xml file used to store the defaults for the SomaticCoverage tool, and added it to the SomaticCoverage.jar in build.sml. Also added an inputStream marshalling method to the GATKArgumentCollection.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@979 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-10 20:46:16 +00:00
asivache 8d25f1a105 should be a little faster
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@978 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-10 20:33:45 +00:00
aaron 3a340ca887 adding the SomaticCoverage.jar to the list of generated jars, at least for now.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@977 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-10 20:05:54 +00:00
aaron 026f68fb41 a couple of quick name changes
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@976 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-10 20:02:52 +00:00
aaron 72a81f8f25 removed the requirement that a bam file list be present in the XML version of the command line arguments.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@975 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-10 20:01:13 +00:00
ebanks b1f90635c1 1. downsample when there are too many mismatching reads (needs perfecting)
2. allow user to specify that no reads be emitted
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@974 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-10 19:55:42 +00:00
asivache 39dcd4f11f an attempt to bail out when unmapped reads are reached at the end of the file(s). still testing...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@973 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-10 19:53:50 +00:00
asivache 030efc468f added naive ad-hoc cutoff for the pile size the cleaner will attempt to process; use --maxPileSize argument to force any pile larger than specified cutoff to be directly written to the output without cleaning
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@972 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-10 17:52:35 +00:00
ebanks f9be175f44 Be smart about trying alternate consensuses:
try prior indels first and only 1 instance of them
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@971 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-10 17:43:22 +00:00
aaron f304803811 initial check-in of an easy way to create command line tools based on the GATK
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@970 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-10 17:34:02 +00:00
kiran b0cc763eb5 Added some methods to format bases such that read bases on the forward strand are in uppercase, while those on the negative strand are lowercase. This does *not* affect the default functionality of the standard PileupWalker
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@969 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-10 17:31:00 +00:00
depristo 9ebcd6546d Convenience printing
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@968 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-10 17:07:38 +00:00
asivache 06e5a765f8 now has two modes: one sample - just call indel sites; two samples - call somatic-looking variants only. Still uses heuristic count-based cutoffs, cutoffs are hardcoded and are pretty conservative...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@967 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-10 16:41:38 +00:00
ebanks 5451bbfd5a -move final vars to command-line args
-Per Andrey: ignore indels from aligner when testing against alt consensus
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@966 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-10 16:39:00 +00:00
hanna ad80894afa Bumped picard to latest svn version.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@965 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-10 14:36:34 +00:00
aaron ec2f015447 fixed a bunch of comments and license headers.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@964 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-10 14:10:46 +00:00
kiran 6bb7f7e9d8 Commented some stuff out so that things compile.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@963 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-10 14:06:33 +00:00
hanna dc6a9ca196 Pooling resources to lower memory consumption.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@962 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-10 13:39:32 +00:00