Mark DePristo
3226d5dc0d
Merge branch 'master' into ped
2011-10-05 15:03:09 -07:00
Mark DePristo
6a573437af
Detailed documentation of arguments for -ped
2011-10-05 15:00:58 -07:00
Mark DePristo
e7c80f7c45
Renaming quantitative trait to OtherPhenotype which is now a String not a double
...
-- we can now use PED file to represent population data or other arbitrary phenotype data, not just doubles
2011-10-05 12:26:33 -07:00
Mark DePristo
51ecc20867
getFamily() and associated methods implemented and tested
...
-- Sample no longer serializable
-- Sample now implements Comparable
2011-10-05 09:55:05 -07:00
Mark DePristo
f4bac58f14
Merged bug fix from Stable into Unstable
2011-10-04 21:00:34 -07:00
Mark DePristo
d1d39943d0
Updating MD5 for BAMs that I added a read group to, part 2
2011-10-04 21:00:15 -07:00
Mark DePristo
9bd3ba4c7e
Missed one MD5
2011-10-04 16:04:52 -07:00
Mark DePristo
ffdfdcde3f
Updating MD5s
...
-- Interval test now uses RG containing BAM
-- DoC sample name ordering has changed.
2011-10-04 15:54:45 -07:00
Mark DePristo
a45d985818
TODO method stubs
2011-10-04 15:54:09 -07:00
Mark DePristo
463eab7604
All MD5 mismatches for test are shown
...
-- Now for tests like DoC, with 20 output md5s, you see all of the differences before failing.
2011-10-04 15:53:52 -07:00
Mark DePristo
c642a080d4
Merged bug fix from Stable into Unstable
2011-10-04 14:08:41 -07:00
Mark DePristo
941317167e
Updating MD5 for BAMs that I added a read group to
2011-10-04 14:08:00 -07:00
Mark DePristo
e1d6c7a50a
Updating MD5 that have changed due to sample ordering differences
2011-10-04 09:33:23 -07:00
Mark DePristo
343a7b6b2f
Updating UG integration tests for arbitrary impact of sample order changes on downsampling
2011-10-04 08:14:00 -07:00
Mark DePristo
fee89e47ff
Only throws an error when there are no samples but there are reads
...
-- Handles the case when you are running a ROD traversal and yet the LIBS is still used to return null everywhere.
2011-10-04 06:50:54 -07:00
Mark DePristo
f552aede42
Only provide the sample names in the BAM file for efficiency
2011-10-04 06:50:12 -07:00
Mark DePristo
a27641e1fc
Cleaned up imports
2011-10-04 06:28:36 -07:00
Mark DePristo
b20689ff55
No longer supports extraProperties
...
-- the underlying data structure is still present, but until I decide what to do for the extensible system I've completely disabled the subsystem
-- Added code to merge Samples, so that a mostly full record can be merged with a consistent empty record. If the two records are inconsistent, an error is thrown
-- addSample() in Sample.class now invokes mergeSample() when appropriate
-- Validation types are now only STRICT or SILENT
-- Validation code implemented in SampleDBBuilder
-- Extensive unit tests for SampleDBBuilder
2011-10-03 19:20:33 -07:00
Mark DePristo
867a7476c1
Systematic unit tests for the sample object
2011-10-03 19:09:02 -07:00
Mauricio Carneiro
3837aa45b4
Fixing conflicts
...
Conflicts:
public/java/test/org/broadinstitute/sting/utils/clipreads/ReadClipperUnitTest.java
2011-10-03 19:07:59 -07:00
Mark DePristo
2e3dc52088
Minor function renaming
2011-10-03 14:41:13 -07:00
Mark DePristo
dd71884b0c
On path to SampleDB engine integration
...
-- PedReader tag parser
-- Separation of SampleDBBuilder from SampleDB (now immutable)
-- Removed old sample engine arguments
2011-10-03 12:08:07 -07:00
Eric Banks
c3eff7451a
Found a small inefficiency while profiling: we were still using String.split instead of ParsingUtils.split to break up array values in the INFO field. The change made a noticeable (albeit not big) difference when reading sites-only files.
2011-10-03 14:20:39 -04:00
Mark DePristo
8ee0f91904
Remove residual processing tracker arguments
2011-10-03 09:50:01 -07:00
Mark DePristo
89ac50e86e
SampleDataSource -> SampleDB
2011-10-03 09:33:30 -07:00
Mark DePristo
93fba06cb5
Support for whitespace only lines
2011-10-03 09:30:10 -07:00
Mark DePristo
0604ce55d1
PedReader support for ; separated lines, not only newline
2011-10-03 09:19:58 -07:00
Mark DePristo
52f670c8b8
100% version of PedReader
...
-- Passes all unit tests
-- Added unit tests for missing fields
2011-10-03 06:12:58 -07:00
Roger Zurawicki
bf6a3a6532
Added framework to do batch CigarClip Testing
...
*NOTE: This commit has not been compiled!
2011-10-02 22:33:46 -04:00
Mark DePristo
dd75ad9f49
95% PedReader
...
-- Passes significant unit tests
-- Implicit sample creation for mom / dad when you create single samples
-- Continuing cleanup of Sample and SampleDataSource
2011-09-30 18:03:34 -04:00
Andrey Sivachenko
c7898a9be7
inconsequential change in string constants printed into the vcf which no one uses anyway...
2011-09-30 16:40:21 -04:00
Mark DePristo
010899f886
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-09-30 15:51:09 -04:00
Mark DePristo
84160bd83f
Reorganization of Sample
...
-- Moved Gender and Afflication to separate public enums
-- PedReader 90% implemented
-- Improve interface cleanup to XReadLines and UserException
2011-09-30 15:50:54 -04:00
Mauricio Carneiro
05fba6f23a
Clipping ends inside deletion and before insertion
...
fixed.
2011-09-30 15:44:43 -04:00
Mark DePristo
c1cf6bc45a
PEDReader should be in samples
2011-09-30 14:22:19 -04:00
Mark DePristo
56f10b40a8
Fixing test bugs for WindowMaker that required empty sample list
2011-09-30 14:18:27 -04:00
Ryan Poplin
af6c053435
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-09-30 13:33:31 -04:00
Mark DePristo
810e8ad011
Removed getXByReaders() function from the engine
...
-- These could be simplified in their downstream uses
-- Or they could be replaced with a generic getSAMFileHeaders() function and then apply the getSamples(header) as desired downstream
2011-09-30 10:43:51 -04:00
Mark DePristo
178ba24c27
Move getSamplesForSamFile to SampleUtils
...
-- A nearly identical piece of code already lived in SampleUtils. Now there are two functions, one taking a regular header and another grabbing the merged header from the GATK engine itself. Much cleaner
2011-09-30 10:28:18 -04:00
Mark DePristo
30d23942b1
Renamed ReadBackedPileup getXSampleName() functions to getXSample
...
-- now that we don't have Sample objects floating around we don't have to have all of the Name extensions on our functions
2011-09-30 10:02:57 -04:00
Mark DePristo
3289a325fc
Removed final use of Sample in RBP
2011-09-30 09:57:39 -04:00
Mark DePristo
a69a4dda2f
SamplesDB no longer has null sample
...
-- Updated getSamples().size() == 2 test in CallableLociWalker that really ensured there was one sample in the system
2011-09-30 09:56:23 -04:00
Mark DePristo
e055a78f6e
LIBS now requires at least one sample be present
...
-- UnitTest provides a "null" sample for matching the reads without read groups
2011-09-30 09:49:35 -04:00
Mark DePristo
9860a2c989
Merge branch 'master' into ped
2011-09-30 09:28:18 -04:00
Mark DePristo
d901fed617
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-09-30 08:41:44 -04:00
Mauricio Carneiro
cabacf028d
Intermediate commit to fix interval skipping
...
may need additional testing.
2011-09-29 18:45:12 -04:00
Mark DePristo
b71b51751e
Bug fix for UnitTest
...
-- Provide the null sample to the LIBS, as this seems to be required for correctly passing this unit test
-- Will be fixed in a future update
2011-09-29 17:30:01 -04:00
Mark DePristo
1765fbeb6b
Merge branch 'master' into ped
2011-09-29 17:18:51 -04:00
Mark DePristo
98ecaf8aa0
Support for ReducedReads with reduced counts and average quals
...
-- ReadUtils and UnitTest updated to support new byte[] style
-- Removed unnecessary read transformer in PairHMM
2011-09-29 17:18:39 -04:00
Mauricio Carneiro
9508220157
fixed hard clipping both ends inside deletion
...
If both ends of the interval fall within a deletion in the read, then hardClipBothEnds would cut the right tail first, including the entire deletion, then fail to cut the left tail because there would not be any bases there anymore. Fixed.
2011-09-29 15:36:49 -04:00
Mark DePristo
9458f01409
Test cleanup of Sample object
2011-09-29 15:13:05 -04:00
Mark DePristo
625ffb6a07
LocusIteratorByState and ReadBackedPileups no longer use Sample
2011-09-29 14:52:11 -04:00
Mark DePristo
b3a2371925
Merge branch 'master' into ped
2011-09-29 14:32:17 -04:00
Mark DePristo
68761a6e28
Removed sample from header
2011-09-29 14:13:05 -04:00
Mauricio Carneiro
a5e75cd14c
Outputting both consensus base qualities and counts
...
The base qualities of a consensus read are now the average quality of the bases forming the consensus base (most common base), and the consensus quality tag now carries an array with the counts of each base in the consensus. This should increase file size but improve calling sensitivity/specificity.
2011-09-29 12:54:41 -04:00
Mark DePristo
505416b6c0
Merge branch 'master' into ped
2011-09-29 12:22:39 -04:00
Mauricio Carneiro
4086fa768f
Disabling all ReadClipperUnitTests
2011-09-29 12:20:35 -04:00
Mark DePristo
9536845e35
Cleaning up unused code in MV
2011-09-29 12:20:07 -04:00
Mark DePristo
5043d76c3d
Removing more bad uses of SampleDataSource creation
2011-09-29 12:16:34 -04:00
Mark DePristo
5c9227cf5e
Further cleanup of Sample database
...
-- Removing more and more unnecessary code
-- Partial removal of type safe Sample usage. On the road to SampleDB only
2011-09-29 11:50:05 -04:00
Mark DePristo
2a0cd556d3
Further cleanup of Sample
...
-- Cleaned up interface functions in GAE
-- Added Walker.getSampleDB() function which is an easier option for tools to get the samples db
2011-09-29 10:34:51 -04:00
Mark DePristo
e76f381628
Moved sample package from DataSources to gatk, and renamed it samples
...
-- All associated changes to the codebase are just header updates
2011-09-29 09:57:15 -04:00
Mark DePristo
e197dcd1f3
Pre-cleanup commit of Sample and SampleDataSource
...
-- SampleDataSource has all reader functionality disabled
2011-09-29 09:44:18 -04:00
Mark DePristo
4d31673cc5
No longer supporting YAML file allows us to delete 75% of the sample's codebase
2011-09-29 09:43:31 -04:00
Ryan Poplin
e366ee18bc
Adding ability to read in and make use of kmer quality tables during HMM evaluation
2011-09-29 07:46:19 -04:00
Mauricio Carneiro
fc86cd6fd8
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/carneiro/gatk/RR into rr
2011-09-29 00:12:15 -04:00
Roger Zurawicki
4fd5630f6a
Added ReadClipper Unit Test
...
* Includes tests that include HardClip to Read and Reference Coords.
* Changed ReadUtils.HardClipByReferenceCoordinates from private to protected to allow for testing
2011-09-28 23:13:50 -04:00
Matt Hanna
9272ed03b5
Merged bug fix from Stable into Unstable
2011-09-28 21:26:43 -04:00
Matt Hanna
0acaf2df65
Fix an embarrassing issue where a specific configuration of minimal coverage
...
over small intervals could cause reads to be dropped from the pileup. Nothing
to see here...
2011-09-28 21:23:01 -04:00
Guillermo del Angel
c8d3a720f9
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-09-28 18:17:34 -04:00
Guillermo del Angel
7e3cb45093
Further performance optimization in banded HMM, now about a 60% speed improvement over the current implementation
2011-09-28 16:27:28 -04:00
Ryan Poplin
1b1ca80df2
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-09-28 16:17:39 -04:00
Ryan Poplin
3b73dc89fe
Making several esoteric arguments in the BQSR @Hidden. Adding basic support for Complete Genomics machine cycle.
2011-09-28 16:17:31 -04:00
Mauricio Carneiro
ff2f4df043
Fixed hardclipping inside indel (right tail)
...
When hard clipping the right tail of a read falls inside a deletion, clipping should fall back to the last base before the deletion to follow the ReadClipper's contract.
2011-09-28 16:07:34 -04:00
Mauricio Carneiro
3c7b7f74ef
Optimized interval iteration
...
Using a TreeSet to manipulate getToolkit.getIntervals() and being smart about which intervals to test makes interval clipping O(1) instead of O(n).
2011-09-28 16:07:34 -04:00
Mauricio Carneiro
5c9b659c02
clipping both ends of the reads was modifying the original read
...
This goes against the ReadClipper contract, and was affecting the second part of the read that spans over multiple intervals. Fixed.
2011-09-28 16:07:34 -04:00
Guillermo del Angel
fe23e4d10c
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-09-28 15:53:11 -04:00
Guillermo del Angel
e2b9030e93
First mostly fully functional implementation of banded pair HMM likelihood computation for indel caller. More experimentation to follow, but right now it works on small data sets and at least it doesn't break existing things. Disabled by default at this point.
2011-09-28 15:51:48 -04:00
Eric Banks
1b45f21774
Removing this command-line tool. Purposely not doing this in stable so that users who may still use it have time to find other options. But the docs are no longer on the wiki.
2011-09-28 13:18:32 -04:00
Eric Banks
1f0e354fae
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-09-28 13:13:21 -04:00
Eric Banks
bb619a9a3c
Fixing docs
2011-09-28 13:13:03 -04:00
Mark DePristo
5812004e06
Merge branch 'stable'
2011-09-28 11:36:40 -04:00
Mark DePristo
a5006831d7
Shows "" not empty space when default string value is ""
2011-09-28 11:35:52 -04:00
Mark DePristo
1e32281a15
Fix to not show -null when missing short name argument
2011-09-28 11:31:20 -04:00
Mauricio Carneiro
89544c209c
Fixing contracts
...
changed return type to Pair, changing contracts accordingly.
2011-09-28 11:19:17 -04:00
Eric Banks
eacbee3fe5
Merged bug fix from Stable into Unstable
2011-09-27 20:35:18 -04:00
Eric Banks
43b0c98298
Fix docs
2011-09-27 20:34:46 -04:00
Eric Banks
232a6df11c
Add longhand form to the error message.
2011-09-27 20:29:31 -04:00
Eric Banks
1d6fcb6eb1
Revert "Add longhand form to the error message to prevent users from posting borderline dumb posts to GS."
...
This reverts commit 75b2600527cfce05ae683cb394290ff2a80e8552.
2011-09-27 20:27:00 -04:00
Eric Banks
269b9826b6
Add longhand form to the error message to prevent users from posting borderline dumb posts to GS.
2011-09-27 20:26:36 -04:00
Mauricio Carneiro
3b6e43b7c4
Use reads that span multiple intervals
...
* RR will now compress reads that span across multiple intervals correctly and output them in the correct order.
* Fixed bug in getReadCoordinateForReferenceCoordinate where, if the requested reference coordinate fell inside a deletion in the read, the read would be clipped up to one element past the deletion.
2011-09-27 18:39:06 -04:00
Khalid Shakir
84bd355690
Merged bug fix from Stable into Unstable
2011-09-27 14:34:39 -04:00
Khalid Shakir
b090751f62
Fixed Ant / PluginManager issue where reflections was picking up all class files under current working directory due to "." in jar manifest classpaths.
...
Updates to HybridSelectionPipeline:
- Added annotations back via snpEff
- Minor updates to VQSR paths and lowered memory
2011-09-27 14:33:57 -04:00
Eric Banks
26e71f6688
The Omni files have multiple records (with the same ALT) at a particular location, with one PASSing and the other(s) filtered. Chris, this is why using this file as both eval and comp leads to ref/no-call cells in the GenotypeConcordance table. However, this led to non-determinism in VE because the VCs were placed in a HashSet; we use a LinkedHashMap instead to bring back determinism.
2011-09-27 11:03:17 -04:00
Guillermo del Angel
ceffefa6a6
Intermediate version with banded pair HMM
2011-09-27 10:18:58 -04:00
Mark DePristo
e99ff3caae
Removed lots of old, and not to be used, HMM options
...
-- resulted in massive code cleanup
-- GdA will integrate his new banded algorithm here
-- Removed: DO_CONTEXT_DEPENDENT_PENALTIES, GET_GAP_PENALTIES_FROM_DATA, INDEL_RECAL_FILE, dovit, GSA_PRODUCTION_ONLY
2011-09-27 10:08:40 -04:00
Mark DePristo
fa0efbc4ca
Refactoring of PairHMM to support reduced reads
2011-09-26 13:28:56 -04:00
Mark DePristo
a6b65d6347
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-09-26 13:26:21 -04:00
Mark DePristo
4f09453470
Refactored reduced read utilities
...
-- UnitTests for key functions on reduced reads
-- PileupElement calls static functions in ReadUtils
-- Simple routine that takes a reduced read and fills in its quals with its reduced qual
2011-09-26 12:58:31 -04:00
Eric Banks
234b74dd05
Merged bug fix from Stable into Unstable
2011-09-26 11:47:23 -05:00