ebanks
8d3dc57c3d
Commit to emit in sorted order so we don't have to use /tmp
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1133 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-30 19:47:15 +00:00
aaron
f5cba5a6bb
Fixed genome loc to be immutable; the only way to change its values now is through the GenomeLocParser.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1132 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-30 19:17:24 +00:00
hanna
455275996f
Added contents to the wiki.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1131 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-30 18:29:46 +00:00
asivache
177d6d00b8
added setContigIndex(). NOTE: both setContig() and setContigIndex() are UNSAFE, as calling one does not automatically update the other, and there is also no validation
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1130 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-30 17:40:37 +00:00
depristo
9fca79ed62
Read groups are now sorted in the output data, for convenience
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1129 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-30 16:50:44 +00:00
hanna
fe421e5712
All IntelliJ best practices info is now on the wiki.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1128 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-30 16:45:52 +00:00
ebanks
08df4771c8
count X/N/etc. as mismatches for the NM attribute in the BAMs
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1127 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-30 16:08:55 +00:00
kiran
d412c5dc2f
Updated to use SecondaryBaseAnnotator class.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1126 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-30 16:08:43 +00:00
kiran
e3cdf7ef4b
A single class that can be handed reads for training and basecalling. When in training mode, we accumulate no more than 10000 reads and always replace the lowest-quality reads with superior quality reads. Thus, the training set always contains 10000 of the best reads available. After training is complete, the class can be interrogated to return the SQ tag for a given RawRead object.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1125 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-30 16:03:15 +00:00
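The bounded training set described in commit e3cdf7ef4b (keep at most a fixed number of reads, always evicting the lowest-quality one when a better read arrives) can be sketched with a min-heap. This is only an illustration under stated assumptions: the class name `BoundedBestSet`, the `offer` method, and the use of a single numeric score per read are hypothetical and are not taken from the committed Java code.

```python
import heapq
from itertools import count


class BoundedBestSet:
    """Keep at most `capacity` items, always retaining the highest-scoring ones.

    Hypothetical sketch; names and scoring are assumptions, not the project's API.
    """

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._heap = []            # min-heap of (score, insertion_order, item)
        self._order = count()      # tiebreaker so items are never compared directly

    def offer(self, score, item):
        """Try to add an item; return True if it was kept."""
        entry = (score, next(self._order), item)
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, entry)
            return True
        # Set is full: replace the current worst item only if the new one is better.
        if score > self._heap[0][0]:
            heapq.heapreplace(self._heap, entry)
            return True
        return False

    def items(self):
        """Return the retained items, best score first."""
        return [item for _, _, item in sorted(self._heap, reverse=True)]


if __name__ == "__main__":
    best = BoundedBestSet(capacity=3)
    for quality, name in [(10, "a"), (30, "b"), (20, "c"), (5, "d"), (40, "e")]:
        best.offer(quality, name)
    print(best.items())
```

With a heap, each offer costs O(log n) for a set of n reads, so the training set can be maintained while streaming through the input.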
hanna
74cc7136f7
All info from the user manual is now in the wiki. Deleting.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1124 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-30 15:29:59 +00:00
hanna
ddf4003536
Updates to picard public / private and sam.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1123 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-30 14:50:55 +00:00
ebanks
8aa3b65e7f
fix to guarantee emission in sorted order
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1122 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-30 13:48:41 +00:00
aaron
03f8177a53
When you get the reference string for a read that is mapped partially off the end of a contig, the string is masked with X's for base positions without corresponding reference positions.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1121 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-29 20:51:55 +00:00
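The masking behaviour described in commit 03f8177a53 can be illustrated as follows. This is a minimal sketch, assuming 0-based coordinates and an in-memory contig string; the function name `masked_reference` and its parameters are hypothetical and do not come from the committed code.

```python
def masked_reference(contig, start, length, mask="X"):
    """Return `length` reference bases beginning at 0-based `start`.

    Positions that fall outside the contig (before its start or past its end)
    are filled with `mask` instead of raising an error.
    Hypothetical helper for illustration only.
    """
    if length < 0:
        raise ValueError("length must be non-negative")
    bases = []
    for pos in range(start, start + length):
        if 0 <= pos < len(contig):
            bases.append(contig[pos])
        else:
            bases.append(mask)   # no reference base exists at this position
    return "".join(bases)


if __name__ == "__main__":
    # A 5-base read starting at position 4 of a 6-base contig hangs off the end by 3.
    print(masked_reference("ACGTAC", 4, 5))
```

Returning a string of the same length as the read keeps per-base comparisons aligned, at the cost of requiring callers to treat the mask character specially.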
aaron
1dcababad1
a fix to make the test run
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1120 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-29 20:24:32 +00:00
jmaguire
a17bf145f6
fix to respond to the change in IndelLikelihood constructor.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1119 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-29 19:05:33 +00:00
depristo
7ecc43e9a7
Fixed subtle null ptr exception discovered by Kiran. Now deals with the rare situation where you have only, say, Q28 bases at dbSNP sites, so you fail in the Table recalibration step with a null pointer error into the data structure indexed by quality score. Bases with a Q score above those seen before aren't modified in any way.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1118 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-29 18:57:42 +00:00
ebanks
95e2ae0171
Deal with reads whose ends are aligned off the end of a chromosome.
...
Includes update to ignore non-ATCG bases (not just 'N')
(Also, create a BWA dir for future work)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1117 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-29 16:50:05 +00:00
jmaguire
65a788f18a
Added a ROD (SangerSNP) for parsing the Sanger's chr20 pilot1 SNP calls.
...
Some doodling around with indel calling in an EM context.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1116 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-29 16:32:12 +00:00
asivache
ceeeec13b8
Computes a vector of numbers of reads falling into successive intervals of specified length (e.g. numbers of reads per every 1Mbase)
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1115 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-29 16:12:21 +00:00
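The per-interval read count described in commit ceeeec13b8 amounts to binning reads by position. The sketch below assumes each read is assigned to a window by its 0-based start position on a single contig; the function name `reads_per_window` and that assignment rule are assumptions, not details confirmed by the commit.

```python
def reads_per_window(starts, contig_length, window):
    """Count reads in successive fixed-length windows along one contig.

    `starts` holds 0-based read start positions; a read is counted in the
    window containing its start. Hypothetical sketch for illustration.
    """
    if window <= 0 or contig_length <= 0:
        raise ValueError("window and contig_length must be positive")
    n_windows = (contig_length + window - 1) // window   # ceiling division
    counts = [0] * n_windows
    for start in starts:
        if 0 <= start < contig_length:                   # ignore out-of-range reads
            counts[start // window] += 1
    return counts


if __name__ == "__main__":
    # Five reads on a 30-base contig, counted in windows of 10 bases.
    print(reads_per_window([0, 5, 10, 19, 25], contig_length=30, window=10))
```

For the example in the commit message, `window` would be 1,000,000 to obtain reads per 1 Mbase.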
ebanks
3bacb3db03
updated some defaults
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1114 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-26 19:28:05 +00:00
ebanks
eb74b16e39
updated what constitutes removing entropy
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1113 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-26 18:29:00 +00:00
aaron
d7d4298917
Some files to support generic genotype outputting
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1112 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-26 15:43:41 +00:00
asivache
1a97c86f95
don't crash when an unmapped read is encountered, just write it into the output file, it should be ok
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1111 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-26 15:33:59 +00:00
ebanks
da1f168a3e
updated docs
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1110 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-26 05:20:17 +00:00
hanna
491ed70b44
TraverseByLocusWindow -- asstd bug fixes.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1109 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-25 22:51:38 +00:00
depristo
5289230eb8
Version 0.2.1 (released) of the TableRecalibrator
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1108 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-25 22:50:55 +00:00
asivache
73caf5db15
This is, strictly speaking, NOT a GATK module. Standalone, picard-level executable, except that it uses a couple of gatk utils (GenomeLoc). Remaps alignments from a custom reference (such as transcriptome, hyb-sel etc) onto the 'master' reference
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1107 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-25 22:04:18 +00:00
kiran
ee2af3b423
I committed this too soon... reverting...
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1106 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-25 20:49:12 +00:00
hanna
ad3a3aa350
First pass at making lists of files / lists of interval arguments work. Note that the interval ROD system will throw up its hands and not deal with intervals at all if multiple interval files are passed in (see JIRA GSA-95).
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1105 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-25 20:44:23 +00:00
kiran
23680a9a16
Replaced an expensive sort with an inexpensive direct computation.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1104 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-25 20:25:12 +00:00
ebanks
83816fb801
Stop using the annoying refIterator (temp change until new traversal is green lighted)
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1103 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-25 20:05:39 +00:00
aaron
0c3aabd1c5
logger output should be less verbose by default. Also fixed a printout in my read validation walker
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1102 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-25 19:47:29 +00:00
kcibul
11d83ac7d0
pushing up to test on unix box
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1101 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-25 19:00:48 +00:00
ebanks
0d9041380d
remove printouts
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1100 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-25 18:54:14 +00:00
aaron
0a16519aa2
a couple of additions to the tests, plus a change to the artificial resource pool to support the queryContained flag
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1099 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-25 18:30:32 +00:00
jmaguire
2c97c5e873
Compute a simple histogram of depth of coverage.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1098 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-25 18:30:11 +00:00
hanna
102b38c055
Sketch of new version of TraverseByLocusWindow, and a flag to conditionally turn it on.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1097 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-25 18:20:56 +00:00
aaron
4e04370f14
forgot a file
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1096 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-25 17:56:17 +00:00
aaron
5b1c23a7f2
changes to fix and test the interval based traversals
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1095 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-25 17:54:15 +00:00
kcibul
3b24264c2b
incorporating skew check, further output of metrics
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1094 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-25 16:01:07 +00:00
ebanks
ea2426dcd0
one more change needed to commit
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1093 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-25 15:09:53 +00:00
ebanks
f6eeb36c93
updated doc
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1092 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-25 14:32:51 +00:00
ebanks
347608cfe0
remove hacked traversal in preparation for move to Matt's new one
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1091 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-25 14:32:05 +00:00
ebanks
940d75171a
Big cleaner changes:
...
1. Added a Walker to merge intervals before cleaning
2. (Almost) all Walkers can filter out 454 reads (and do by default)
3. Got rid of -all command and related pieces (time to switch to CleanedReadsInjector)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1090 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-25 14:31:24 +00:00
asivache
3cb6d7048e
don't freak out if two reference intervals a custom contig is built of are strictly adjacent; instead politely warn user that her data suck and proceed
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1089 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-24 19:08:10 +00:00
asivache
d4f3ca1a10
A utility class for keeping the mapping from 'custom' reference (e.g. transcriptome) onto the 'master' reference (e.g. whole genome), and for remapping SAM records from the former onto the latter. It's Arachne's BaitMultiMap, pretty much
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1088 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-24 18:16:15 +00:00
kiran
69dc502174
I forgot that this depends on BoundedScoringSet.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1087 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-24 17:18:53 +00:00
aaron
61ce4e5983
quick doc change
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1086 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-24 16:35:46 +00:00
asivache
a9c30c5fcc
added -nosort cmdline flag; if specified, the output writer does not attempt to sort reads on the fly (sorting involves use of sorting collection backed up by temporary disk storage and can lead to crashes if temp size is low and/or filesystem is not behaving). Output can be later sorted externally by samtools
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1085 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-24 15:58:00 +00:00
kiran
7b5d8d7604
Changed the intensities array order from cycle,channel to channel,cycle. This, I'm told, is a far more efficient allocation strategy.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1084 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-24 15:41:06 +00:00