Mark DePristo
b3a2371925
Merge branch 'master' into ped
2011-09-29 14:32:17 -04:00
Mark DePristo
68761a6e28
Removed sample from header
2011-09-29 14:13:05 -04:00
Mauricio Carneiro
a5e75cd14c
Outputting both consensus base qualities and counts
...
The base qualities of a consensus read are now the average quality of the bases forming the consensus base (the most common base), and the consensus quality tag now carries an array with the counts of each base in the consensus. This should increase file size but improve calling sensitivity/specificity.
2011-09-29 12:54:41 -04:00
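The consensus rule described in a5e75cd14c can be sketched roughly as follows. This is only an illustration under stated assumptions, not the committed code: the class `ConsensusSketch`, the fixed A/C/G/T order of the counts array, and the tie-breaking rule (first base in that order wins) are all hypothetical.

```java
// Hypothetical sketch: consensus base = most common base at a position,
// consensus quality = average quality of the bases that match it,
// plus an array with the count of every base.
final class ConsensusSketch {
    // Fixed base order used for the counts array (an assumption of this sketch).
    static final byte[] BASES = {'A', 'C', 'G', 'T'};

    static final class Consensus {
        final byte base;     // most common base
        final byte quality;  // average quality of the bases equal to 'base'
        final int[] counts;  // per-base counts, in BASES order

        Consensus(byte base, byte quality, int[] counts) {
            this.base = base;
            this.quality = quality;
            this.counts = counts;
        }
    }

    static int indexOf(byte b) {
        for (int i = 0; i < BASES.length; i++) {
            if (BASES[i] == b) return i;
        }
        return -1; // anything else (e.g. N) is ignored
    }

    static Consensus call(byte[] bases, byte[] quals) {
        int[] counts = new int[BASES.length];
        long[] qualSums = new long[BASES.length];
        for (int i = 0; i < bases.length; i++) {
            int idx = indexOf(bases[i]);
            if (idx < 0) continue;
            counts[idx]++;
            qualSums[idx] += quals[i];
        }
        // Pick the most common base; ties go to the first base in BASES order.
        int best = 0;
        for (int i = 1; i < counts.length; i++) {
            if (counts[i] > counts[best]) best = i;
        }
        byte avg = counts[best] == 0
                ? 0
                : (byte) Math.round((double) qualSums[best] / counts[best]);
        return new Consensus(BASES[best], avg, counts);
    }
}
```

Carrying the counts alongside the averaged quality is what lets a downstream caller see how much support the minority bases had, which is presumably the source of the sensitivity/specificity gain mentioned above.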
Mauricio Carneiro
d62f2f33bc
Added indel specific context size parameter
...
Parameter was added to the framework but implementing the functionality is pending.
2011-09-29 12:54:41 -04:00
Mark DePristo
505416b6c0
Merge branch 'master' into ped
2011-09-29 12:22:39 -04:00
Mauricio Carneiro
21c4abdd36
Disabling all SlidingReadUnitTests
2011-09-29 12:20:35 -04:00
Mauricio Carneiro
4086fa768f
Disabling all ReadClipperUnitTests
2011-09-29 12:20:35 -04:00
Mark DePristo
9536845e35
Cleaning up unused code in MV
2011-09-29 12:20:07 -04:00
Mark DePristo
5043d76c3d
Removing more bad uses of SampleDataSource creation
2011-09-29 12:16:34 -04:00
Mark DePristo
5c9227cf5e
Further cleanup of Sample database
...
-- Removing more and more unnecessary code
-- Partial removal of type safe Sample usage. On the road to SampleDB only
2011-09-29 11:50:05 -04:00
Khalid Shakir
6dec932ca9
Merge branch 'master' of ssh://gsa3.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-09-29 11:47:13 -04:00
Khalid Shakir
c08468eb9d
A couple of updates while trying to get desired R 2.13 compactPDF support.
...
preQC:
- For R 2.13, explicitly coercing the fingerprint text before parsing
- Added LOD geom_line() at +/-3 based on Tim's presentation at PM meeting (ppt to go to pipeline wiki asap)
- PF_INDEL_RATE of zero replaced with NA
- NA's are not "violations" auto filter samples since 0+NA = NA, and subset test only looks for 0 violations
- Restored plots for MEAN_READ_LENGTH, BAD_CYCLES, and MEDIAN_INSERT_SIZE by explicitly print()'ing the created plots
postQC:
- Fixed R 2.13 font scaling by moving size out of aes, except when using highlighting
- TODO: Don't know how to scale by aes for highlighting *and* use a smaller overall font size outside aes
2011-09-29 11:21:50 -04:00
Mark DePristo
2a0cd556d3
Further cleanup of Sample
...
-- Cleaned up interface functions in GAE
-- Added Walker.getSampleDB() function which is an easier option for tools to get the samples db
2011-09-29 10:34:51 -04:00
Mark DePristo
e76f381628
Moved sample package from DataSources to gatk, and renamed it samples
...
-- All associated changes to the codebase are just header updates
2011-09-29 09:57:15 -04:00
Mark DePristo
e197dcd1f3
Pre-cleanup commit of Sample and SampleDataSource
...
-- SampleDataSource has all reader functionality disabled
2011-09-29 09:44:18 -04:00
Mark DePristo
4d31673cc5
Dropping YAML file support allows us to delete 75% of the sample codebase
2011-09-29 09:43:31 -04:00
Mauricio Carneiro
fc86cd6fd8
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/carneiro/gatk/RR into rr
2011-09-29 00:12:15 -04:00
Roger Zurawicki
4fd5630f6a
Added ReadClipper Unit Test
...
* Includes tests for hard clipping by read coordinates and by reference coordinates.
* Changed ReadUtils.HardClipByReferenceCoordinates from private to protected to allow for testing
2011-09-28 23:13:50 -04:00
Mauricio Carneiro
f49a12de6b
Updating latest changes from the repository to reduce reads repo
2011-09-28 22:31:57 -04:00
Matt Hanna
9272ed03b5
Merged bug fix from Stable into Unstable
2011-09-28 21:26:43 -04:00
Matt Hanna
0acaf2df65
Fix an embarrassing issue where a specific configuration of minimal coverage
...
over small intervals could cause reads to be dropped from the pileup. Nothing
to see here...
2011-09-28 21:23:01 -04:00
Roger Zurawicki
07b0a75d96
Added SlidingRead Unit Test
...
Includes tests for clipStart and trimToVariableRegion
2011-09-28 21:22:57 -04:00
Khalid Shakir
c5f1a4325f
Updated preQC:
...
- full 8.5x11
- concatenating multiple initiatives / bait_sets
- Using NA instead of python None when WR dates are unavailable
- In new aggregations where the sample may have per library metrics, only using the sample level metrics, i.e. library is null
Updated postQC:
- Renamed some variables to assist with traceback()
- Fixed crashes on batches with two alleles or two samples such as Seminara_MC_1_09222011 or Engle_MC_2_09222011
- Added dependency tracking to PostCallingQC.scala so that the R script does not try to run before the evals are complete
Other minor cleanup.
Tried to use R 2.13 compactPDF but a few issues to work out with fingerprint boxplots in preQC and geom_text font size in postQC.
2011-09-28 20:23:30 -04:00
Mauricio Carneiro
edf852d47d
Adding lists to ReduceReads script
...
script can handle single file or list of files separately now. Always scatter/gathering.
2011-09-28 18:40:30 -04:00
Mauricio Carneiro
64e7b3000c
Fix reads whose deletion spans the entire interval
...
If the read has a deletion that spans the entire length of the interval, it should not be added to mapped reads.
2011-09-28 18:40:30 -04:00
Mauricio Carneiro
a93ece07e3
ScatterGatherable reduce reads script
...
Get your reduce read in a matter of seconds...
2011-09-28 18:40:30 -04:00
Guillermo del Angel
c8d3a720f9
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-09-28 18:17:34 -04:00
Guillermo del Angel
7e3cb45093
Further performance optimization in banded HMM; about 60% speed improvement over the current implementation
2011-09-28 16:27:28 -04:00
Ryan Poplin
1b1ca80df2
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-09-28 16:17:39 -04:00
Ryan Poplin
3b73dc89fe
Making several esoteric arguments in the BQSR @Hidden. Adding basic support for Complete Genomics machine cycle.
2011-09-28 16:17:31 -04:00
Mauricio Carneiro
ff2f4df043
Fixed hardclipping inside indel (right tail)
...
When hard clipping the right tail of a read and the clip point falls inside a deletion, clipping should fall back to the last base before the deletion, to follow the ReadClipper's contract.
2011-09-28 16:07:34 -04:00
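The fallback rule in ff2f4df043 can be illustrated with a small coordinate-mapping sketch. It is an assumption-laden illustration, not the ReadClipper code: the class `ClipPointSketch`, the parallel `ops`/`lens` arrays standing in for a CIGAR, and the restriction to M, I and D operators are all hypothetical, and the read is assumed not to start with a deletion.

```java
// Hypothetical sketch: map a reference coordinate to a 0-based read index.
// If the coordinate falls inside a deletion, fall back to the last read base
// before that deletion.
final class ClipPointSketch {
    static int readIndexFor(int alignmentStart, char[] ops, int[] lens, int refCoord) {
        if (refCoord < alignmentStart) {
            throw new IllegalArgumentException("coordinate is before the read");
        }
        int refPos = alignmentStart; // reference position of the next ref-consuming base
        int readPos = 0;             // read index of the next read-consuming base
        for (int i = 0; i < ops.length; i++) {
            switch (ops[i]) {
                case 'M': // consumes both read and reference
                    if (refCoord < refPos + lens[i]) {
                        return readPos + (refCoord - refPos);
                    }
                    refPos += lens[i];
                    readPos += lens[i];
                    break;
                case 'I': // consumes read only
                    readPos += lens[i];
                    break;
                case 'D': // consumes reference only
                    if (refCoord < refPos + lens[i]) {
                        return readPos - 1; // last read base before the deletion
                    }
                    refPos += lens[i];
                    break;
                default:
                    throw new IllegalArgumentException("unsupported operator: " + ops[i]);
            }
        }
        throw new IllegalArgumentException("coordinate is past the end of the read");
    }
}
```

For a read aligned at reference position 100 with the layout 5M3D5M, reference positions 105 to 107 lie inside the deletion, so all three map back to read index 4, the last base before the deletion.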
Mauricio Carneiro
3c7b7f74ef
Optimized interval iteration
...
Using a TreeSet to manipulate getToolkit.getIntervals() and being smart about which intervals to test makes interval clipping O(1) instead of O(n).
2011-09-28 16:07:34 -04:00
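A sorted-set lookup of the kind mentioned in 3c7b7f74ef might look like the sketch below. The commit does not show its code, so everything here is an assumption: the `Interval` class, the method names, and the requirement that intervals do not overlap each other. Note that a single `java.util.TreeSet` lookup is O(log n); the O(1) figure in the commit presumably also relies on tracking which intervals still need testing, which this sketch does not attempt.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

// Hypothetical sketch: find the intervals a read overlaps without scanning
// the whole interval list. Assumes the intervals do not overlap each other.
final class IntervalLookupSketch {
    static final class Interval {
        final int start; // inclusive
        final int stop;  // inclusive

        Interval(int start, int stop) {
            this.start = start;
            this.stop = stop;
        }
    }

    // Index the intervals by start position.
    static TreeSet<Interval> index(List<Interval> intervals) {
        TreeSet<Interval> set =
                new TreeSet<>(Comparator.comparingInt((Interval i) -> i.start));
        set.addAll(intervals);
        return set;
    }

    static List<Interval> overlapping(TreeSet<Interval> set, int readStart, int readStop) {
        List<Interval> hits = new ArrayList<>();
        Interval probe = new Interval(readStart, readStart);
        // The last interval starting at or before readStart may still cover it.
        Interval first = set.floor(probe);
        if (first != null && first.stop >= readStart) {
            hits.add(first);
        }
        // Then walk only the intervals that start inside the read's span.
        for (Interval i : set.tailSet(probe, false)) {
            if (i.start > readStop) break;
            hits.add(i);
        }
        return hits;
    }
}
```

The walk stops at the first interval that starts past the end of the read, so the cost depends on the number of intervals the read actually touches rather than on the size of the interval list.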
Mauricio Carneiro
5c9b659c02
clipping both ends of the reads was modifying the original read
...
This goes against the ReadClipper contract, and was affecting the second part of the read that spans over multiple intervals. Fixed.
2011-09-28 16:07:34 -04:00
Guillermo del Angel
fe23e4d10c
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-09-28 15:53:11 -04:00
Guillermo del Angel
e2b9030e93
First mostly functional implementation of banded pair HMM likelihood computation for the indel caller. More experimentation to follow, but right now it works on small data sets and at least doesn't break existing things. Disabled by default at this point.
2011-09-28 15:51:48 -04:00
Eric Banks
1b45f21774
Removing this command-line tool. Purposely not doing this in stable so that users who may still use it have time to find other options. But the docs are no longer on the wiki.
2011-09-28 13:18:32 -04:00
Eric Banks
1f0e354fae
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-09-28 13:13:21 -04:00
Eric Banks
bb619a9a3c
Fixing docs
2011-09-28 13:13:03 -04:00
Mark DePristo
5812004e06
Merge branch 'stable'
2011-09-28 11:36:40 -04:00
Mark DePristo
a88b7c1203
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-09-28 11:36:33 -04:00
Mark DePristo
a5006831d7
Shows "" not empty space when default string value is ""
2011-09-28 11:35:52 -04:00
Mark DePristo
1e32281a15
Fix to not show -null when missing short name argument
2011-09-28 11:31:20 -04:00
Mauricio Carneiro
89544c209c
Fixing contracts
...
changed return type to Pair, changing contracts accordingly.
2011-09-28 11:19:17 -04:00
Mark DePristo
2e2463633f
Queue script to find missing calls between full and reduced bams
2011-09-28 11:17:25 -04:00
Eric Banks
eacbee3fe5
Merged bug fix from Stable into Unstable
2011-09-27 20:35:18 -04:00
Eric Banks
43b0c98298
Fix docs
2011-09-27 20:34:46 -04:00
Eric Banks
232a6df11c
Add longhand form to the error message.
2011-09-27 20:29:31 -04:00
Eric Banks
1d6fcb6eb1
Revert "Add longhand form to the error message to prevent users from posting borderline dumb posts to GS."
...
This reverts commit 75b2600527cfce05ae683cb394290ff2a80e8552.
2011-09-27 20:27:00 -04:00
Eric Banks
269b9826b6
Add longhand form to the error message to prevent users from posting borderline dumb posts to GS.
2011-09-27 20:26:36 -04:00
Mauricio Carneiro
3b6e43b7c4
Use reads that span multiple intervals
...
* RR will now compress reads that span across multiple intervals correctly and output them in the correct order.
* Fixed bug in getReadCoordinateForReferenceCoordinate where, if the requested reference coordinate fell inside a deletion in the read, the read would be clipped up to one element past the deletion.
2011-09-27 18:39:06 -04:00