asivache
08ca2ce89b
fixing accidental incomplete commit
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@151 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-23 14:39:15 +00:00
asivache
2dd14d7c17
auxiliary class for SequencePile, just one column of the MSA
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@150 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-23 14:37:49 +00:00
asivache
29136ee892
Arachne's alignment pile, more or less. Can accept sequences with alignments (cigars) and generate nice alignment pile plot with indels
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@149 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-23 14:36:00 +00:00
asivache
0188379174
PrimitivePair.*: pair(s) based directly on primitive types. Hail generics.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@148 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-23 14:12:22 +00:00
asivache
1f60c70688
Missing STL. Added Pair<X,Y>
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@147 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-23 14:04:16 +00:00
asivache
835e85374e
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@146 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-23 05:46:09 +00:00
aaron
046cecb067
Switched our code over to the new command line style (gnu style args), added the initial logger code, and added apache commons CLI to the IVY script.
...
There will be a slow conversion of all the System.out and System.err in other files to the logger style output.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@145 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-22 21:06:22 +00:00
asivache
38f18c8679
added generic SortPermutation that returns sorting permutation for arbitrary List<T> as long as T is Comparable
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@144 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-22 20:40:26 +00:00
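
The SortPermutation commit above (38f18c8679) describes returning the sorting permutation of an arbitrary List<T> where T is Comparable. The log does not show the actual class, so the following is only a minimal sketch of that idea; the class name `SortPermutationSketch` and the method `sortPermutation` are hypothetical and are not taken from the repository.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not the SortPermutation class from the commit.
final class SortPermutationSketch {

    /**
     * Returns a list of indices p such that
     * data.get(p.get(0)) <= data.get(p.get(1)) <= ... ;
     * the input list itself is left unmodified.
     */
    static <T extends Comparable<? super T>> List<Integer> sortPermutation(final List<T> data) {
        List<Integer> perm = new ArrayList<>();
        for (int i = 0; i < data.size(); i++) {
            perm.add(i);
        }
        // List.sort is stable, so equal elements keep their original relative order.
        perm.sort((a, b) -> data.get(a).compareTo(data.get(b)));
        return perm;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("chr2", "chr10", "chr1");
        // Strings compare lexicographically: "chr1" < "chr10" < "chr2".
        System.out.println(sortPermutation(names)); // prints [2, 1, 0]
    }
}
```

Returning indices rather than a sorted copy lets a caller reorder several parallel lists with the same permutation.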
aaron
09d605bb37
Changed how the example walker gets run, I'm about to check in the GNU style command line args.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@143 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-22 20:10:17 +00:00
depristo
02556ce4a6
Moved to core
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@142 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-22 20:02:19 +00:00
depristo
1df23b0417
Added a test of the new fasta seeking system at the bottom of the file, which is definitely the wrong place for it -- it's not called, and it should probably be moved somewhere more appropriate.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@141 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-22 19:57:52 +00:00
depristo
611ab0bdb3
Uses the new FastaSequenceFile2 for high-performance seeks.
...
Added far superior error checking (and reporting!) messages for incorrect usage of the location string. Prevents users from seeing complex FunctionalJ error messages.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@140 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-22 19:56:54 +00:00
depristo
e77d735e08
New reference iterator that works with the new FastaSequenceFile seek operations. Greatly improves performance of jumping around in the genome.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@139 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-22 19:54:02 +00:00
depristo
c8d7207a8e
Fixed problem with GenomeLoc logic -- optimization was causing assertion failure.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@138 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-22 19:53:00 +00:00
depristo
52ad08298a
New FastaSequenceFile with support for poor-man's seek and querying the next contig name without loading the whole next contig into memory. Vastly speeds up the performance of jumping to distant parts of the genome with the location operator.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@137 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-22 19:43:56 +00:00
depristo
4888df97c7
Added averageDouble function. How can we write a generic average function?!
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@136 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-22 19:41:30 +00:00
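
The averageDouble commit above (4888df97c7) asks how a generic average function could be written. One possible answer, offered only as a sketch and not as what the repository ended up doing, is to accept any collection of `Number` and accumulate with `doubleValue()`. The class name `AverageSketch` and the method `average` are hypothetical.

```java
import java.util.Arrays;
import java.util.Collection;

// Hypothetical illustration; not code from the repository.
final class AverageSketch {

    /**
     * Averages any collection of boxed numeric values (Integer, Long, Double, ...).
     * Every value is converted with doubleValue(), so very large long or
     * BigDecimal inputs may lose precision.
     */
    static double average(Collection<? extends Number> values) {
        if (values.isEmpty()) {
            throw new IllegalArgumentException("cannot average an empty collection");
        }
        double sum = 0.0;
        for (Number v : values) {
            sum += v.doubleValue();
        }
        return sum / values.size();
    }

    public static void main(String[] args) {
        System.out.println(average(Arrays.asList(1, 2, 3, 4))); // prints 2.5
        System.out.println(average(Arrays.asList(0.5, 1.5)));   // prints 1.0
    }
}
```

The bounded wildcard `? extends Number` is what lets one method serve `List<Integer>` and `List<Double>` alike; primitive arrays would still need separate overloads.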
jmaguire
cf407168cf
keep track of the position you're called on.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@135 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-22 16:47:49 +00:00
jmaguire
096f0dbc68
don't run off the end of the list of loci.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@134 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-22 16:47:29 +00:00
jmaguire
4e0cd6ab84
Now works on single samples and computes metrics.
...
Here is an example metrics output from a very tiny region:
Allele Frequency Metrics (LOD >= 5)
-------------------------------------------------
Total loci : 14704
Total called with confidence : 10920 (74.27%)
Number of Variants : 16 (0.15%) (1/682)
Fraction of variant sites in dbSNP : 100.00%
Missing:
Microarray(hapmap) concordance, tp/fp.
Optional:
Histograms of depth of coverage, LOD, observed allele frequency, etc.
Still to implement:
Propagate command line argument N (number of chromosomes) into walker to enable pooled calling.
Take allele frequency priors as input.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@133 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-22 15:45:12 +00:00
jmaguire
f7ad17016d
some reformatting and logic cleanup in the comparison functions
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@132 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-22 15:36:56 +00:00
jmaguire
dfe50ce773
optionally check that the records are sorted.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@131 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-22 15:36:24 +00:00
jmaguire
149ac3d96c
Now iterate over a large set of tiny intervals efficiently.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@130 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-22 12:04:11 +00:00
asivache
df2a7039cb
Heinous bug fixed: only rookies forget that external conditions need to be re-checked after a loop ends on some other condition, duh! In addition, MSA piles are now seeded with a single read sequence each (if there are fewer than 4 reads it might be hard to seed with two pairs)
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@129 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-21 18:32:18 +00:00
kiran
411e5cf647
Added FourBaseCaller as a jar build target.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@128 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-21 17:59:13 +00:00
kiran
6e1fa7d61a
Java version of basecaller that estimates probability distribution over four-base hypothesis space via an internal-control-initialized Gaussian mixture model over base channel intensities.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@127 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-21 17:58:50 +00:00
kiran
3e350006e0
Added a directory to house some Illumina output parsers. Hopefully this will be merged back into Picard at some point.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@126 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-21 17:55:56 +00:00
asivache
497eea2e5c
minor changes and shuffling code around; also, now when realigned piles are printed they are sorted by start position
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@125 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-21 17:43:49 +00:00
jmaguire
0ea44a5805
1st draft of support for a file containing a list of intervals.
...
Appears to work, but inefficient:
At each reference location, the entire list of intervals is searched linearly.
Instead we need to have the intervals sorted, and simply seek forward from interval to interval.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@124 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-21 16:07:32 +00:00
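
The interval-list commit above (0ea44a5805) notes that the first draft searches the whole list at every reference location, and proposes sorting the intervals and seeking forward from one to the next. A minimal sketch of that proposal follows. It assumes closed intervals on a single contig and positions queried in non-decreasing order; the names `IntervalCursorSketch`, `Interval`, and `contains` are hypothetical and are not the repository's classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not code from the repository.
final class IntervalCursorSketch {

    /** Closed interval [start, stop] on a single contig (an assumption of this sketch). */
    static final class Interval {
        final long start;
        final long stop;

        Interval(long start, long stop) {
            this.start = start;
            this.stop = stop;
        }
    }

    private final List<Interval> sorted;
    private int cursor = 0; // index of the first interval that may still cover a future position

    IntervalCursorSketch(List<Interval> intervals) {
        sorted = new ArrayList<>(intervals);
        sorted.sort((a, b) -> Long.compare(a.start, b.start));
    }

    /**
     * Reports whether pos falls inside any interval.
     * Callers must pass positions in non-decreasing order; the cursor only moves forward.
     */
    boolean contains(long pos) {
        // Skip intervals that end before pos; they cannot cover this or any later position.
        while (cursor < sorted.size() && sorted.get(cursor).stop < pos) {
            cursor++;
        }
        return cursor < sorted.size() && sorted.get(cursor).start <= pos;
    }

    public static void main(String[] args) {
        IntervalCursorSketch cursor = new IntervalCursorSketch(
                Arrays.asList(new Interval(30, 30), new Interval(10, 20)));
        for (long pos : new long[] {5, 10, 20, 25, 30, 31}) {
            System.out.println(pos + " -> " + cursor.contains(pos));
        }
    }
}
```

Because the cursor never moves backwards, a full pass over the reference costs time proportional to the number of positions plus the number of intervals, instead of their product.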
hanna
1fcf4c0cbf
Update picard to work with new samtools.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@123 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-20 21:51:26 +00:00
jmaguire
5dca560c3c
A bunch of refactoring, and more on the way.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@122 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-20 21:31:07 +00:00
hanna
b806a9cf68
Updated for new version of samtools, which returns a sequence dictionary
...
rather than a simple list of sequences.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@121 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-20 20:38:24 +00:00
hanna
6e2d939905
Added subversion rev 180 of the sam library.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@120 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-20 20:17:51 +00:00
ebanks
c5433a3120
dumps out base qualities per position for use in making boxplots
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@119 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-20 17:01:18 +00:00
jmaguire
1161c261ac
made all data members public.
...
switched logOddsVarRef to LOD.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@118 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-20 16:44:17 +00:00
depristo
9b5e5e06f9
Now supports checking that the input files exist and are good
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@117 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-20 16:40:54 +00:00
ebanks
f3f1b47808
deal with reverse complemented reads
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@115 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-20 16:01:49 +00:00
asivache
9ec96414c7
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@114 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-20 15:54:29 +00:00
depristo
322f4b944f
Better stress test
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@113 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-20 15:52:54 +00:00
asivache
3565b50ff5
main class (argument processing and traversing the reference) and implementation of all the Receiver functionality for building read piles over indels
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@112 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-20 05:18:04 +00:00
asivache
4c3b92b860
comparator for interval objects
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@111 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-20 05:15:13 +00:00
asivache
f810412d75
equals(), hashCode() updated/added, also a few minor changes
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@110 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-20 05:13:07 +00:00
asivache
4badd54216
Indel also implements Interval interface but has its quirks
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@109 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-20 05:11:17 +00:00
asivache
501e92d441
an interface for an interval object and simple minimum implementation; note: in contrast to arachne, this is closed interval
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@108 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-20 05:09:56 +00:00
asivache
29d2d460f3
a trivial interface and even more trivial implementations that do nothing (ignore the data they receive)
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@107 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-20 05:08:15 +00:00
depristo
b83c8319c7
Crushed subtle and potentially insidious bug in seeking within the fasta; a beer for anyone who can tell me the situation where this might arise...
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@106 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-20 00:07:06 +00:00
depristo
34ee48fd82
Fixing output printing issues in the code, as well as adding more safety checks
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@105 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-19 23:02:49 +00:00
hanna
6fdd622160
Describe how GATK finds walkers. Change the example to avoid copying the class file into the walkers directory.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@104 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-19 22:41:12 +00:00
hanna
104e2811ec
Configure the plugin directory.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@103 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-19 22:12:25 +00:00
andrewk
6bcdac5c62
Restructured AlleleFrequency classes into 3 classes: AlleleFrequencyWalker, AlleleFrequencyMetricsWalker, AlleleFrequencyEstimate. AlleleFrequencyMetricsWalker class now calls mapper function of AlleleFrequencyWalker and works with the result. AlleleFrequencyEstimate is now a separate class instead of a subclass of AlleleFrequencyWalker.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@102 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-19 22:06:01 +00:00
hanna
41fec1565c
Hello, world! for GATK.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@101 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-19 21:46:22 +00:00