7a921c908cCan now adjust the genotype likelihoods of a variant returned from the rod. This automatically causes the lodBtr, lodBtnb, and genotype to be recomputed.
kiran
2009-06-18 07:26:37 +0000
9a7cec7d2eDirectory to house variant calling and filtration tools.
kiran
2009-06-18 07:20:38 +0000
5992d88409skip N's in the reference (rather than crash. doh!)
jmaguire
2009-06-17 23:22:35 +0000
f45d5a73a5Package annotator for Alec.
hanna
2009-06-17 22:40:40 +0000
c4d9058f32Added module rodVariants.class to the list of allowable RODs.
kiran
2009-06-17 21:33:13 +0000
ab2a80f3eaA new ROD type that allows one to input a geli.calls file back into a walker.
kiran
2009-06-17 21:32:21 +0000
9ef391706cAdded outputting of genotype posteriors to geli.calls file.
kiran
2009-06-17 21:31:46 +0000
615572ea06output to out... not System.out...
kcibul
2009-06-17 20:43:10 +0000
b947fd586fFixed a nasty bug in GenomeLoc compareContigs; we were using '==' to compare Integer contig IDs. The surprising thing is that it actually works for Integers from -128 to 127 (they're cached by the JVM, so equal values in that range share the same object). Switched over GenomeLoc contigs to int based.
aaron
2009-06-17 20:19:47 +0000
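The boxed-Integer pitfall described in the entry above can be sketched as follows. This is a minimal illustration, not the actual GenomeLoc code; the class and method names (ContigCompare, sameContigBuggy, sameContig, compareContigs) are hypothetical.

```java
// Hypothetical illustration of the '==' on Integer pitfall; not the GenomeLoc source.
final class ContigCompare {
    // Buggy form: '==' on boxed Integers compares object references, not values.
    // It only appears to work for values the JVM caches (at least -128..127).
    static boolean sameContigBuggy(Integer a, Integer b) {
        return a == b;
    }

    // Fixed form: primitive ints are compared by value.
    static boolean sameContig(int a, int b) {
        return a == b;
    }

    // Ordering on primitive contig IDs, as a compareContigs-style helper might do.
    static int compareContigs(int a, int b) {
        return Integer.compare(a, b);
    }

    public static void main(String[] args) {
        // Inside the cached range both boxes refer to the same object.
        System.out.println(sameContigBuggy(Integer.valueOf(100), Integer.valueOf(100)));
        // Outside the cached range the result depends on JVM cache settings.
        System.out.println(sameContigBuggy(Integer.valueOf(1000), Integer.valueOf(1000)));
        // The int-based comparison is correct for every value.
        System.out.println(sameContig(1000, 1000));
    }
}
```

Using Integer.equals() would also fix the comparison; switching to int, as the commit does, additionally avoids boxing.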
ed7fac1c90Add bcel and cleanup.
hanna
2009-06-17 19:28:04 +0000
87d1c11ed7Delete lingering empty directory.
hanna
2009-06-17 18:33:03 +0000
cba9025983More package-level documentation.
hanna
2009-06-17 16:28:45 +0000
43a28750e0Package level documentation -- helps new users get acclimated to the codebase more quickly.
hanna
2009-06-17 16:27:48 +0000
7d281296a7Finishing checking for building
depristo
2009-06-17 14:12:40 +0000
d1e25bfe88Intermediate checkin for safety -- now compiles
depristo
2009-06-17 13:16:55 +0000
2250769a42Intermediate checkin for safety -- do not use
depristo
2009-06-17 13:07:19 +0000
86c8c08375Intermediate checkin for safety -- do not use
depristo
2009-06-17 13:06:24 +0000
e2ccea4883Cleanup. Move output of packaging to dist directory. Don't always create resources directory. Make jar take on the package name.
hanna
2009-06-16 22:47:23 +0000
78b7fb25c7allow contig names to have spaces in the fai. This is not yet supported by samtools fai generator (which truncates at the first space), but we might as well fix it on our side.
aaron
2009-06-16 22:23:12 +0000
6ee64c7e43added changes to support alec toUnmappedRead seek. Huge improvements (orders of magnitude) in unmapped read performance.
aaron
2009-06-16 22:15:56 +0000
647b8a1ab0Fix TabularROD printing and testing so Aaron stops nagging me.
ebanks
2009-06-16 15:49:26 +0000
a0a549557fadded a check of the sort ordering to the query methods, so that we detect if a file is unsorted much earlier. Also added some verbosity to the exception; it now contains information about the raw attribute we saw for 'SO', the sort order of the bam file.
aaron
2009-06-15 22:15:03 +0000
2259dc3a8fadded filtering out indels with large levels of noise (mismatches) remaining in close proximity; also a bug in recording deletion coordinates is fixed
asivache
2009-06-15 21:13:28 +0000
a6477df6d1Now optionally outputs whether "SNPs" are maintained/cleaned out/introduced by cleaning
ebanks
2009-06-15 20:02:02 +0000
29df74ae23Plumbed packaging support into build.xml and added package for GATK.
hanna
2009-06-15 19:41:16 +0000
11aa715630added capability for filtering by platform
ebanks
2009-06-15 19:19:50 +0000
8f4bc8cb6eMove filtering functionality into the PrintReadsWalker. More to come.
ebanks
2009-06-15 16:38:08 +0000
161c74716cForgot to change some direct references to variables in SSG. Fixed.
kiran
2009-06-15 14:16:18 +0000
9eeb5f79d4Various refactoring to achieve hapmap and dbsnp awareness, the ability to set pop-gen and secondary base priors from the command-line, and general code cleanup.
kiran
2009-06-15 07:21:08 +0000
f2946fa3e8Various refactoring to achieve hapmap and dbsnp awareness, the ability to set pop-gen and secondary base priors from the command-line, and general code cleanup.
kiran
2009-06-15 07:20:22 +0000
93dc2cdc70Start of a 'package' format for xml files which should be distributed together. Uses xslt scripts to transform packages into build scripts.
hanna
2009-06-15 00:52:48 +0000
0583459839Another formatting change to make Hapmap sites more clearly visible.
kiran
2009-06-12 19:53:21 +0000
811f560efbadd refseq annotations to single sample calls
asivache
2009-06-12 19:43:30 +0000
e9be2a9c60Changed a formatting issue.
kiran
2009-06-12 19:40:32 +0000
ca09a10b76refseq annotation rod is now manually bound to tell coding indels from non-coding ones
asivache
2009-06-12 19:27:37 +0000
5859948e80Fixed bugs in CleanedReadInjector arising from integration testing.
hanna
2009-06-12 17:37:33 +0000
fb7ba47fffNow does real neighbor distance calculation, as well as true snp cluster counting
depristo
2009-06-12 16:29:26 +0000
dbf2cc037cdon't have a null-pointer hissy fit when the reference is N.
jmaguire
2009-06-12 13:59:16 +0000
1fb241a8b8Now supports resume and dry runningRecalQual.py
depristo
2009-06-11 23:31:59 +0000
4eda040e0fwhat used to be internal cutoff values are now exposed as cmdline parameters: minCoverage, minNormalCoverage, minFraction, minConsensusFraction
asivache
2009-06-11 21:22:52 +0000
41687d5237Added accessors for the prior probabilities.
kiran
2009-06-11 21:16:10 +0000
12dd18cdbaNow aware of Hapmap and dbSNP sites. We *can* change the priors there, but we don't yet.
kiran
2009-06-11 21:15:34 +0000
d5cd883b99bug fixed when a read with alignment end exactly at the window boundary and with last cigar element being an indel would cause index-out-of-bounds exception
asivache
2009-06-11 21:03:15 +0000
a12009e9e7Added a new constructor in which priors for hom-ref, het, and hom-var can be specified. Otherwise, it uses the default values of 0.999, 1e-3, and 1e-5 respectively.
kiran
2009-06-11 20:33:45 +0000
909fefa40aArgumentized priors for hom-ref, het, and hom-var.
kiran
2009-06-11 20:32:44 +0000
71e3825fa1First pass of a walker for Eric that searches through an input BAM file for unclean reads, injecting the cleaned reads in their place and outputting the composite result.
hanna
2009-06-11 20:18:13 +0000
032d0436e6Added ROD for 1KG SNP calls
ebanks
2009-06-11 19:53:51 +0000
ffffe3b2f6-Support for 1KG SNP calls in RODs -Minor bug fix
ebanks
2009-06-11 18:56:37 +0000
5440dd13dfPreparation for point release of read calibrator: no artificial heap size limit, no duplicate dbsnp records.
hanna
2009-06-11 18:39:33 +0000
63b5c12cbdChanged dataSources to datasources, to be consistent with the rest of our package names. Also, this makes me champion in the largest check-in contest.
aaron
2009-06-11 18:13:22 +0000
195b4ea7b4a rename for consistency of Sam to SAM, creating a genotype utils dir, and moving the GLF code into it.
aaron
2009-06-11 17:46:06 +0000
599ceeddd8Better method for downsampling deep regions
ebanks
2009-06-11 16:57:40 +0000
4d9a88153aUpdate inferred insert size of cleaned reads when they are paired
ebanks
2009-06-11 16:29:13 +0000
3796654069Added walker to emit intervals of clustered SNP calls
ebanks
2009-06-11 00:57:14 +0000
678ddd914fStopgap fixes GFF, DbSNP being half-open rather than half-closed.
hanna
2009-06-10 21:38:57 +0000
94b0e46d12checked in a sample xml file used to store the defaults for the SomaticCoverage tool, and added it to the SomaticCoverage.jar in build.xml. Also added an inputStream marshalling method to the GATKArgumentCollection.
aaron
2009-06-10 20:46:16 +0000
8d25f1a105should be a little faster
asivache
2009-06-10 20:33:45 +0000
3a340ca887adding the SomaticCoverage.jar to the list of generated jars, at least for now.
aaron
2009-06-10 20:05:54 +0000
026f68fb41a couple of quick name changes
aaron
2009-06-10 20:02:52 +0000
72a81f8f25removed the requirement that a bam file list be present in the XML version of the command line arguments.
aaron
2009-06-10 20:01:13 +0000
b1f90635c11. downsample when there are too many mismatching reads (needs perfecting) 2. allow user to specify that no reads be emitted
ebanks
2009-06-10 19:55:42 +0000
39dcd4f11fan attempt to bail out when unmapped reads are reached at the end of the file(s). still testing...
asivache
2009-06-10 19:53:50 +0000
030efc468fadded naive ad-hoc cutoff for the pile size the cleaner will attempt to process; use --maxPileSize argument to force any pile larger than specified cutoff to be directly written to the output without cleaning
asivache
2009-06-10 17:52:35 +0000
f9be175f44Be smart about trying alternate consensuses: try prior indels first and only 1 instance of them
ebanks
2009-06-10 17:43:22 +0000
f304803811initial check-in of an easy way to create command line tools based on the GATK
aaron
2009-06-10 17:34:02 +0000
b0cc763eb5Added some methods to format bases such that read bases on the forward strand are in uppercase, while those on the negative strand are lowercase. This does *not* affect the default functionality of the standard PileupWalker
kiran
2009-06-10 17:31:00 +0000
06e5a765f8now has two modes: one sample - just call indel sites; two samples - call somatic-looking variants only. Still uses heuristic count-based cutoffs, cutoffs are hardcoded and are pretty conservative...
asivache
2009-06-10 16:41:38 +0000
5451bbfd5a-move final vars to command-line args -Per Andrey: ignore indels from aligner when testing against alt consensus
ebanks
2009-06-10 16:39:00 +0000
ad80894afaBumped picard to latest svn version.
hanna
2009-06-10 14:36:34 +0000
ec2f015447fixed a bunch of comments and license headers.
aaron
2009-06-10 14:10:46 +0000
6bb7f7e9d8Commented some stuff out so that things compile.
kiran
2009-06-10 14:06:33 +0000
dc6a9ca196Pooling resources to lower memory consumption.
hanna
2009-06-10 13:39:32 +0000
87ba8b3451Removed some useless code. Don't apply second-base test if the coverage is too high, since the binomial probs explode and return NaN or Infinite values.
kiran
2009-06-10 08:27:06 +0000
a12ed404ceChanged method name from applyFourBaseDistributionPrior to applySecondBaseDistributionPrior. 'Cause that's how I roll.
kiran
2009-06-10 08:21:22 +0000
3adb4239e4Same as regular Pileup, but also allows you to see flanking region around locus. This will be useful in determining that some SNPs are spurious due to being at the ends of homopolymer regions.
kiran
2009-06-10 08:19:31 +0000
2b0e7f612bHandles bam pileups where some of the reads have SQ tags and some don't.
kiran
2009-06-10 08:17:15 +0000
36c98b9d6cadded tools to test read based traversals using the artificial in-memory SAM file tools, and testing of the PrintReadsWalker
aaron
2009-06-10 01:52:25 +0000
eb962fe52aadding an artificial sam file writer, used to unit test some of the walkers (mainly the PrintReadsWalker)
aaron
2009-06-09 21:47:49 +0000
e77dfe9983Allow script to be easily modified to support different platforms.
hanna
2009-06-09 16:06:57 +0000
7fa84ea15710x speedup of recalibration walker
depristo
2009-06-09 15:39:40 +0000
a62bc6b05dfixed some documentation and attached a correct license
aaron
2009-06-09 14:44:27 +0000
bf6190b471cleaned up the PrintReadsWalker, and added a lot of documentation.
aaron
2009-06-09 14:28:32 +0000
b45b1d5f2bborder case bug fixes
ebanks
2009-06-09 04:33:15 +0000
fecba2cae5Disabled option to show secondary quals as the definition has changed to conform to the spec and thus this printout is nonsensical.
kiran
2009-06-09 03:21:14 +0000
5fa3f7ed3aAdded absolute path bug fix for Mark.
hanna
2009-06-09 02:25:17 +0000
e7f222108dMore accessors. Can compute the sum of the quality scores in the read (useful for sorting) and can return a subset of itself.
kiran
2009-06-09 01:02:48 +0000
6506504a60Updates after seeing a certain number of reads, not a certain number of bases.
kiran
2009-06-09 01:01:36 +0000
65d0675a4eSome changes regarding what to do when a cycle is completely busted.
kiran
2009-06-09 01:01:13 +0000
0bd78d72d7Some changes regarding what to do when a cycle is completely busted.
kiran
2009-06-09 01:00:33 +0000
af0b03a257Added tests for mostFrequentBaseFraction() and reverseComplementString()
kiran
2009-06-09 00:53:45 +0000
681e67c72cAdded some methods to generate random bases or random base indexes, optionally disallowing the generation of a specified base or base index.
kiran
2009-06-09 00:47:54 +0000
13eb868536helper class. array-like random access and fast shift. good for sliding windows (e.g. keeping coverage over last 100 bases while sliding along the reference)
asivache
2009-06-09 00:11:57 +0000
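One way to get the array-like random access and fast shift described in the entry above is a circular buffer. The sketch below makes that assumption and is not the committed helper class; the name SlidingWindow and its methods are hypothetical.

```java
import java.util.Arrays;

// Hypothetical circular-buffer sketch; the committed helper class may differ.
final class SlidingWindow {
    private final int[] data;
    private int start = 0; // physical index of logical position 0

    SlidingWindow(int size) {
        if (size <= 0) throw new IllegalArgumentException("size must be positive");
        data = new int[size];
    }

    int size() {
        return data.length;
    }

    // Random access by logical position, O(1).
    int get(int i) {
        return data[physical(i)];
    }

    void set(int i, int value) {
        data[physical(i)] = value;
    }

    // Slide the window forward by n positions: the first n values are dropped
    // and n zeroed slots appear at the end. Costs O(min(n, size)) and never
    // copies the surviving elements.
    void shift(int n) {
        if (n < 0) throw new IllegalArgumentException("n must be non-negative");
        if (n >= data.length) {
            Arrays.fill(data, 0);
            start = 0;
            return;
        }
        for (int k = 0; k < n; k++) {
            data[(start + k) % data.length] = 0; // clear the slots being recycled
        }
        start = (start + n) % data.length;
    }

    private int physical(int i) {
        if (i < 0 || i >= data.length) throw new IndexOutOfBoundsException("index " + i);
        return (start + i) % data.length;
    }
}
```

For the coverage use case mentioned in the commit, a walker could keep a window of size 100, increment the counts for the positions each read covers, and call shift(1) each time it advances one base along the reference.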