Mark DePristo
e216e85465
First working version of VariantContextBenchmark
2011-11-11 09:56:00 -05:00
Mark DePristo
ee40791776
Attributes are now Map<String,Object> not Map<String,?>
...
-- Allows us to avoid an unnecessary copy when creating InferredGeneticContext (whose name really needs to change).
2011-11-11 09:55:42 -05:00
Mark DePristo
153e52ffed
VariantEvalIntegrationTest for IntervalStratification
2011-11-10 14:10:39 -05:00
Mauricio Carneiro
d00b2c6599
Adding a synthetic read for filtered data
...
* Generalized the concept of a synthetic read to create both a running consensus and synthetic reads of filtered data.
* Synthetic reads can now have deletions (but not insertions)
* New reduced read tag for filtered data synthetic reads *(RF)*
* Sliding window header now keeps information of consensus and filtered data
* Synthetic reads are created simultaneously, new functionality is controlled internally by addToSyntheticReads
2011-11-09 20:16:22 -05:00
Eric Banks
02d5e3025e
Added integration test for intervals from bed file
2011-11-09 15:34:19 -05:00
Ryan Poplin
94dc447a70
Merged bug fix from Stable into Unstable
2011-11-07 15:26:35 -05:00
Ryan Poplin
0b181be61f
Bug fix in SelectVariants when using a discordance track but no sample specifications. Added integration test to test this.
2011-11-07 15:25:16 -05:00
Eric Banks
759f4fe6b8
Moving unclaimed walker with bad integration test to archive
2011-11-07 13:16:38 -05:00
Eric Banks
3517489a22
Better --sample selection integration test for VE. The previous one would return true even if --sample was not working at all.
2011-11-06 01:07:49 -04:00
Eric Banks
ad57bcd693
Adding integration test to cover using expressions with IDs (-E foo.ID)
2011-11-05 23:53:15 -04:00
Mauricio Carneiro
e89ff063fc
GATKSAMRecord refactor
...
The GATK engine will now provide a GATKSAMRecord to all tools. It incorporates the functionality the GATK adds on top of the BAM file (ReadGroups, Reduced Reads, ...).
* No tools should create SAMRecord anymore, use GATKSAMRecord instead *
2011-11-03 15:43:26 -04:00
Eric Banks
e8bceb1eaa
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-11-02 21:13:54 -04:00
Eric Banks
78a00d2ddc
Updating UG integration tests (needed updating only because the -mbq default is different from the old -mmq one).
2011-11-02 21:13:44 -04:00
Eric Banks
e1edd6bd12
Removing the min mapping quality argument since it wasn't being used in the normal processing of the pileups in UG - only for indel pileups. Instead, we apply the min base quality to the reads in the pileup for indels and define it to be the min 'confidence' of the base. Docs are updated but I didn't rename the argument as I don't want people to complain.
2011-11-02 20:32:58 -04:00
Mark DePristo
8a2929c1dd
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-11-02 16:21:00 -04:00
Eric Banks
4501dce58d
Fixing merge conflict
2011-11-02 12:50:32 -04:00
Eric Banks
54331b44e9
New way of looking at the size of a pileup: there's a physical number of elements in the data structure and there's a representative depth of coverage (since a reduced read represents depth >= 1). The size() method has been removed because its meaning is ambiguous. Updated several annotations and the UG engine to make use of the representative depths.
2011-11-02 12:47:30 -04:00
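The distinction drawn in 54331b44e9 between the physical number of elements and the representative depth can be illustrated with a minimal sketch. It is written in Python for brevity (the GATK itself is Java), and the class, field, and function names are hypothetical, not the GATK API; the only thing taken from the commit message is that a reduced read stands for a depth >= 1.

```python
from dataclasses import dataclass

@dataclass
class PileupElement:
    base: str
    # Hypothetical field: how many original reads this element stands for.
    # An ordinary read counts once; a reduced read may count many times.
    representative_count: int = 1

def number_of_elements(pileup):
    """Physical size: how many objects sit in the data structure."""
    return len(pileup)

def depth_of_coverage(pileup):
    """Representative depth: sum of the counts each element stands for."""
    return sum(element.representative_count for element in pileup)

pileup = [PileupElement("A"), PileupElement("A", 12), PileupElement("C")]
```

With one reduced read standing for 12 reads, the two measures diverge, which is why a single size() would be ambiguous.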
Mark DePristo
392e0aeace
Moved unit tests into master IntervalUtilsUnitTest
2011-11-02 10:52:00 -04:00
Mark DePristo
c2b97030a4
IntervalUtils for completely balanced locus-based scatter/gather
...
-- scatterLocusIntervals master utility
-- Moved around some general functionality from GenomeLocSortedSet to GenomeLoc
-- Util function for reversing a list (List<T> -> List<T>, unlike Collections version)
-- DoC is PartitionType.INTERVAL
-- Significant unit tests on new functionality (all passing)
-- Ready for real-world testing, as soon as I can get LocusScatterFunction.scala to actually work
2011-11-02 10:49:40 -04:00
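A rough sketch of what a "completely balanced locus-based scatter" (c2b97030a4) could look like, assuming it means splitting the interval list into N parts with near-equal numbers of loci, cutting intervals where needed. This is Python for brevity, not the actual scatterLocusIntervals code; the function name and the (contig, start, stop) tuple layout with inclusive coordinates are assumptions.

```python
def scatter_locus_intervals(intervals, num_parts):
    """Split inclusive (contig, start, stop) intervals into num_parts lists
    whose total locus counts differ by at most one."""
    total = sum(stop - start + 1 for _, start, stop in intervals)
    if num_parts < 1 or num_parts > total:
        raise ValueError("num_parts must be between 1 and the number of loci")

    # Spread the remainder over the first parts so sizes differ by at most 1.
    base, extra = divmod(total, num_parts)
    targets = [base + (1 if i < extra else 0) for i in range(num_parts)]

    parts, current = [], []
    part_index, remaining = 0, targets[0]
    for contig, start, stop in intervals:
        while start <= stop:
            take = min(stop - start + 1, remaining)
            current.append((contig, start, start + take - 1))
            start += take
            remaining -= take
            if remaining == 0:
                # This part is full: close it and start the next one.
                parts.append(current)
                current = []
                part_index += 1
                if part_index < num_parts:
                    remaining = targets[part_index]
    return parts
```

For example, two intervals covering 10 and 5 loci scatter into three parts of 5 loci each, with the first interval cut in half.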
Mauricio Carneiro
b004489c6d
Moving ReduceRead TAG to GATKSAMRecord
...
ReduceReads are now a feature of a GATKSAMRecord, so the tag and the special methods needed to use it will now be housed by the GATKSAMRecord.
2011-11-01 17:12:09 -04:00
Eric Banks
0ca7428e76
Allow processing of empty intervals, but warn user when this case is encountered.
2011-10-28 12:12:14 -04:00
Eric Banks
649dfe98f0
Add VCF header for any expressions that are requested
2011-10-28 10:22:19 -04:00
Eric Banks
8b1a62da27
Adding unit test to cover overlapping intervals from the same source with the intersection rule.
2011-10-28 09:59:43 -04:00
Eric Banks
6ba08a103d
Empty ROD files should generate an exception when used for creating intervals. Moved some now obsolete files to the archive as the realigner will now read all target intervals into memory.
2011-10-28 09:23:25 -04:00
Eric Banks
19e27d4568
Removing all instances of -BTI (in tests and in GATKdocs) and replacing them with the appropriate alternative.
2011-10-27 23:55:11 -04:00
Eric Banks
ccfd853b34
Added further integration tests for rod-based intervals that deal with more complex cases. Good call by Mark to test the empty VCF example because we were failing on it; fixed.
2011-10-27 20:43:50 -04:00
Khalid Shakir
b80d407dc7
No more hunting down R "resources". As a tradeoff Rscript cannot be specified on the commandline and will be found in the environment path.
...
Other minor cleanup.
2011-10-27 14:17:07 -04:00
Eric Banks
8c4dbce6d8
Don't serialize the GATKArgumentCollection for the GATKRunReports (which would have meant dealing with the new IntervalBindings). Also, forgot to remove a test that's no longer relevant to BED parsing.
2011-10-27 13:58:19 -04:00
Eric Banks
4a7e6fee3f
Remove support for BED file interval parsing in the GATK; it should all go through Tribble now. IndelRealigner no longer supports unordered interval input (which shouldn't have been used anyways). Temporarily commenting out serialization of arguments so that tests pass; this whole piece will be deleted soon anyways.
2011-10-27 13:38:08 -04:00
Eric Banks
44f905b5e5
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-10-26 23:31:11 -04:00
Mark DePristo
034a997d07
Generalized Reads -> Fragment calculation
...
-- Supports ReadBackedPileup -> FragmentCollection as before
-- Added support for List<SAMRecord> -> FragmentCollection for Ryan's haplotype caller
-- General cleanup, renaming, move to separate package, more extensive unit tests, etc.
-- Added toFragment() function to ReadBackedPileup interface
2011-10-26 15:54:38 -04:00
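The reads-to-fragments calculation in 034a997d07 can be pictured roughly as follows. This is a Python sketch under stated assumptions, not the FragmentUtils code: reads are plain dicts with hypothetical keys, mates are matched by read name, and a pair counts as one fragment only when the two alignments overlap.

```python
from collections import defaultdict

def overlaps(a, b):
    """True if two reads share at least one reference position (inclusive ends)."""
    return (a["contig"] == b["contig"]
            and a["start"] <= b["end"]
            and b["start"] <= a["end"])

def to_fragments(reads):
    """Group reads by name; return (singletons, overlapping_pairs)."""
    by_name = defaultdict(list)
    for read in reads:
        by_name[read["name"]].append(read)

    singletons, overlapping_pairs = [], []
    for mates in by_name.values():
        if len(mates) == 2 and overlaps(mates[0], mates[1]):
            overlapping_pairs.append(tuple(mates))
        else:
            # Unpaired reads and non-overlapping mates stay separate.
            singletons.extend(mates)
    return singletons, overlapping_pairs

reads = [
    {"name": "pairA", "contig": "chr1", "start": 100, "end": 150},
    {"name": "pairA", "contig": "chr1", "start": 140, "end": 190},
    {"name": "pairB", "contig": "chr1", "start": 100, "end": 150},
    {"name": "pairB", "contig": "chr1", "start": 300, "end": 350},
]
singletons, overlapping_pairs = to_fragments(reads)
```

Here pairA becomes one overlapping pair, while both pairB mates remain singletons.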
Eric Banks
b39fcb1bea
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-10-26 15:44:25 -04:00
Eric Banks
3273c20c98
Added integration tests for Tribble-based intervals and fixed up some of the other tests based on some method changes.
2011-10-26 15:29:18 -04:00
Mark DePristo
7fa943aef1
Renamed FragmentPileup to FragmentUtils
2011-10-26 14:01:45 -04:00
Mark DePristo
1b722c21cf
merge master
2011-10-25 16:08:39 -04:00
David Roazen
2794e5c1d4
Modified the VCFJarClassLoadingUnitTest to play nice with the packaged-jar test targets.
2011-10-25 14:47:15 -04:00
Khalid Shakir
fac9932938
Embedding gsalib source and queueJobReport R scripts in the dist and package jars.
...
Moved gsalib and queueJobReport.R to embeddable namespaced locations.
Updated packager dependencies/dir to add an @includes which filters the embedded fileset.
RScriptExecutor can now JIT compile the gsalib.
RScriptExecutor uses ProcessController and sends the Rscript output to java's stdout when run under -l DEBUG.
Refactored ProcessController and IOUtils from Queue to Sting Utils.
Added more unit tests to ProcessController along with a utility class to hard stop OutputStreams at a specified byte count.
Replaced uses of some IOUtils with Apache Commons IO.
ShellJobRunner refactored to use direct ProcessController and now kills jobs on shutdown.
Better QGraph responsiveness on shutdown by using Object.wait() instead of Thread.sleep().
2011-10-24 15:58:34 -04:00
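The "utility class to hard stop OutputStreams at a specified byte count" mentioned in fac9932938 might behave like the wrapper below. This is a hypothetical Python sketch of the idea only; the class name and the choice to silently discard the overflow (rather than raise) are assumptions.

```python
import io

class HardLimitWriter:
    """Wrap a binary stream and stop forwarding data after max_bytes."""

    def __init__(self, stream, max_bytes):
        self._stream = stream
        self._remaining = max_bytes

    def write(self, data):
        kept = data[:self._remaining]  # empty once the limit is reached
        self._stream.write(kept)
        self._remaining -= len(kept)
        # Report everything as consumed so the producer is never blocked.
        return len(data)

buffer = io.BytesIO()
writer = HardLimitWriter(buffer, 5)
writer.write(b"hello world")
writer.write(b"!!")
```

After both writes the underlying buffer holds only the first five bytes.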
Khalid Shakir
89a581a66f
Added ability to specify arguments in files via -args/--arg_file
...
Pushing back downsample and read filter args so they show up in getApproximateCommandLineArgs()
2011-10-24 15:58:34 -04:00
Mark DePristo
502592671d
Cleanup FragmentPileup before main repo commit
...
-- removed intermediate functions. Now only original version and best optimized new version remain
-- Moved general artificial read backed pileup creation code into ArtificialSamUtils
2011-10-24 14:40:05 -04:00
Mark DePristo
166174a551
Google caliper example execution script
...
-- FragmentPileup with final performance testing
2011-10-24 14:04:53 -04:00
Mark DePristo
42bf9adede
Initial version of "fast" FragmentPileup code
...
-- Uses mayOverlapRoutine in ReadUtils
-- Attempts to be smart when doing overlap calculation, to avoid unnecessary allocations
-- PileupElement now comparable (sorts on offset, then on start)
-- Caliper microbenchmark to assess performance
2011-10-22 21:36:37 -04:00
Guillermo del Angel
f4b409fa0d
CombineVariants bug fix: when merging records with disparate alleles we were leaving AC,AF fields intact. This had as a consequence that we could end up with a record with 3 alt alleles but only 2 values in AC,AF fields. Now, if alleles in combined vc are different from original, and if AC,AF fields can't be recomputed from genotypes, we remove attributes from vc map since they'll be invalid anyway. Integration test md5 changed since there were several badly merged records in result
2011-10-21 14:07:20 -04:00
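The rule described in f4b409fa0d (drop AC/AF when the merged alleles differ from the original and the counts cannot be recomputed from genotypes) can be sketched as below. This is Python for illustration, not the CombineVariants code; the function name, the dict-based attribute map, and the genotype representation (lists of allele strings with "." for no-call) are assumptions.

```python
def fix_allele_counts(original_alleles, merged_alleles, attributes, genotypes=None):
    """Return a copy of attributes whose AC/AF are consistent with merged_alleles."""
    attrs = dict(attributes)
    if list(merged_alleles) == list(original_alleles):
        return attrs  # alleles unchanged, so AC/AF are still valid

    called = [a for gt in (genotypes or []) for a in gt if a != "."]
    if called:
        # Recompute one AC/AF value per alternate allele, in allele order.
        alts = list(merged_alleles)[1:]
        attrs["AC"] = [called.count(alt) for alt in alts]
        attrs["AF"] = [count / len(called) for count in attrs["AC"]]
    else:
        # Nothing to recompute from: stale values would be invalid, so remove them.
        attrs.pop("AC", None)
        attrs.pop("AF", None)
    return attrs
```

This avoids the inconsistency from the commit message, a record with 3 alt alleles but only 2 values in AC and AF.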
Mark DePristo
b863390cb1
Moving reduced read functionality into GATKSAMRecord
...
-- More functions take / produce GATKSAMRecords instead of SAMRecord
2011-10-21 13:28:05 -04:00
Mark DePristo
110e13bc1e
Merge branch 'master' into SamRecordFactory
2011-10-21 09:43:52 -04:00
Mark DePristo
3227143a1c
Systematic test code for FragmentPileup
...
-- Creates all combinations of overlapping and non-overlapping read pair pileups in all orientations and first/second pairings to validate fragment detection.
2011-10-19 17:50:27 -04:00
Eric Banks
d8d73fe4f2
Treat ./X genotypes as MIXED so that isHet, isHom, etc. still return the expected and correct values. Added docs to these accessors with contracts explicitly mentioned. Fixed case where NPE could be thrown.
2011-10-19 15:11:13 -04:00
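A minimal sketch of how treating ./X genotypes as MIXED (d8d73fe4f2) keeps isHet and isHom meaningful. It is Python rather than the GATK's Java, and the type names other than MIXED, as well as the function names, are assumptions chosen for illustration.

```python
NO_CALL = "."

def genotype_type(alleles, ref):
    """Classify a genotype given as a list of allele strings."""
    called = [a for a in alleles if a != NO_CALL]
    if not called:
        return "NO_CALL"
    if len(called) < len(alleles):
        return "MIXED"  # partially called, e.g. ./A
    if len(set(called)) > 1:
        return "HET"
    return "HOM_REF" if called[0] == ref else "HOM_VAR"

def is_het(alleles, ref):
    return genotype_type(alleles, ref) == "HET"

def is_hom(alleles, ref):
    return genotype_type(alleles, ref) in ("HOM_REF", "HOM_VAR")
```

Under this scheme a ./A genotype is neither het nor hom, so neither accessor has to guess at the missing allele.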
Eric Banks
5a6468c11e
Allowing ./X genotypes and adding a unit test to ensure that this case is covered from now on (especially given that we may want to revert in the future). Reverting this change is really easy and entails uncommenting a few lines of code. But for now, despite Mark's objections, this case is allowed in the VCF spec and we are wrong not to allow it.
2011-10-19 11:52:05 -04:00
David Roazen
88d6b8bc1f
Merged bug fix from Stable into Unstable
2011-10-14 20:13:38 -04:00
David Roazen
bd8bb93811
Split RScriptExecutorUnitTest into public and private test classes.
...
We can't have a public test that depends on both public and private
code/data -- the new release system needs to do public-only tests,
and will catch this sort of thing.
2011-10-14 20:04:42 -04:00
David Roazen
4f01a742cb
Merged bug fix from Stable into Unstable
2011-10-13 21:39:52 -04:00
David Roazen
edfd6f8a06
Removing a public -> private dependency from the test suite.
...
The public integration test VariantContextIntegrationTest was dependent on the
private walker TestVariantContextWalker. Moved this walker to public/java/test
(NOT public/java/src, since this walker is only used by the test suite) to avoid
errors during public-only tests.
2011-10-13 21:32:52 -04:00
Mark DePristo
404ef741f1
Merged bug fix from Stable into Unstable
2011-10-13 18:02:06 -04:00
Mark DePristo
2ebdff074c
Update MD5s for SOLiD recalibration
...
-- MD5 db had spelling error; fixed
-- Bug in AlignmentUtils resulted in some bases not being color space corrected. The integration test caught the change, and it's clear that the new version is correct, as the prev. version was not considering the last N qualities for reads with a ND operation.
2011-10-13 18:01:51 -04:00
Eric Banks
9aecd50473
Adding ability to exclude annotations from the VA and UG lists. As described in the docs, this argument trumps all others (including -all) so that we can get around the SnpEff issue brought up by Menachem. Added integration test for it.
2011-10-12 15:44:54 -04:00
David Roazen
cfd0ac8410
Merged bug fix from Stable into Unstable
...
Conflicts:
public/java/test/org/broadinstitute/sting/gatk/walkers/genotyper/UnifiedGenotyperIntegrationTest.java
2011-10-11 12:03:51 -04:00
David Roazen
24b72334b3
UnifiedGenotyper now correctly initializes the VariantAnnotator engine.
...
This allows the annotation classes to perform any necessary initialization/validation.
For example, it allows the SnpEff annotator to (among other things) validate its rod binding.
This will prevent a NullPointerException when SnpEff annotation is requested but no rod binding
is present.
Added an integration test to cover this case so that it doesn't break again.
2011-10-11 12:02:05 -04:00
Mark DePristo
fb72bcf732
DiffObjects no longer prints out the file name in the status so MD5 are stable
2011-10-10 15:10:57 -04:00
Mark DePristo
e3ff4f4266
Failing MD5 because output now contains absolute path
2011-10-10 11:05:02 -04:00
Mark DePristo
3e6c16d961
CombineVariants preserves allele order
2011-10-10 11:04:38 -04:00
Mark DePristo
a4bb842958
RankSum tests have slightly different MD5 results based on allele order
...
-- UG GENOTYPE_GIVEN_ALLELES now uses the order of alleles in the VCF, so this changes the MD5
2011-10-10 11:04:07 -04:00
Mark DePristo
46e7370128
this.allele, getAlleles(), and getAltAlleles() now return List not set
...
-- Changes associated code throughout the codebase
-- Updated necessary (but minimal) UnitTests to reflect new behavior
-- Much better makealleles() function in VC.java that enforces a lot of key constraints in VC
2011-10-09 11:45:55 -07:00
Mark DePristo
822654b119
UnitTests for allele getting functions in VC in prep for move from set to list
2011-10-09 10:36:14 -07:00
Mark DePristo
c67f6c076b
simpleMerge now preserves allele order
...
-- UnitTests for dangerous PL merging cases in the multi-allelic case. The new behavior is correct
2011-10-08 17:39:53 -07:00
Mark DePristo
e94e6ba101
A UnitTest to ensure that the order of alleles is maintained
...
-> A, C, T and A, T, C are different and must be maintained. The constructors were doing this appropriately, so nothing needed to be changed
2011-10-08 08:47:58 -07:00
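Why the move from Set to List matters for the allele-order commits above (e94e6ba101 through 46e7370128) can be shown in a few lines. This is an illustrative Python sketch; make_alleles here is a hypothetical stand-in, and the one constraint it enforces (no duplicate alleles) is an assumption, since the log does not list the constraints checked by the real makealleles().

```python
def make_alleles(alleles):
    """Return the alleles as an order-preserving list, rejecting duplicates."""
    result = []
    for allele in alleles:
        if allele in result:
            raise ValueError(f"duplicate allele: {allele}")
        result.append(allele)
    return result

act = make_alleles(["A", "C", "T"])
atc = make_alleles(["A", "T", "C"])

# As sets the two are indistinguishable; as lists the order survives.
same_as_sets = set(act) == set(atc)
same_as_lists = act == atc
```

Per-allele fields such as AC are indexed by allele position, so losing the order would silently attach values to the wrong allele.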
Matt Hanna
6fbd41724a
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-10-07 11:20:00 -04:00
Matt Hanna
4514bc350f
More reliable way of finding the Tribble jar.
2011-10-07 11:19:29 -04:00
Eric Banks
181c76750e
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-10-06 22:38:55 -04:00
Eric Banks
ca9cd9b688
Minor fix for merging intervals which hadn't been necessary when only merging from the left to right. Added integration tests to cover the parallelization of RTC.
2011-10-06 22:38:44 -04:00
Khalid Shakir
f91b015e0e
Made the BaseTest.testDir absolute
2011-10-06 22:33:21 -04:00
Eric Banks
61a3dfae24
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-10-06 15:58:04 -04:00
Eric Banks
6eb87bf58a
RTC now caches all intervals as GenomeLocs (which is expected to take < 1Gb whole genome based on back of the envelope calculations with Matt) so that 1) we don't have to worry about emitting outside of the leaves in the hierarchical reductions and 2) we can emit the intervals in sorted order which is a big performance plus for the realigner. Integration tests change only because intervals whose start=stop are now printed as chr:start instead of chr:start-stop.
2011-10-06 15:57:49 -04:00
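The output change noted in 6eb87bf58a (intervals with start=stop printed as chr:start, and intervals emitted in sorted order) could be sketched like this. It is a Python illustration with hypothetical function names; sorting by position in a supplied contig list is an assumption standing in for reference order.

```python
def format_interval(contig, start, stop):
    """Single-base intervals print as chr:start, others as chr:start-stop."""
    if start == stop:
        return f"{contig}:{start}"
    return f"{contig}:{start}-{stop}"

def emit_sorted(intervals, contig_order):
    """Format cached (contig, start, stop) intervals in reference order."""
    rank = {contig: i for i, contig in enumerate(contig_order)}
    ordered = sorted(intervals, key=lambda iv: (rank[iv[0]], iv[1], iv[2]))
    return [format_interval(*iv) for iv in ordered]
```

Holding every interval in memory before emitting is what makes the global sort possible.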
Mark DePristo
6d9c210460
Updating MD5s for updated BAM with read groups
2011-10-06 12:15:48 -07:00
Matt Hanna
3961733590
Merged bug fix from Stable into Unstable
2011-10-06 12:54:52 -04:00
Matt Hanna
4fa5045e84
Abandoning classfileset/rootfileset approach due to difficulty managing
...
classloading of bcel*.jar/ant-apache-bcel*.jar. Switching instead to manually
specifying a minimal set of packages/classes to include in the vcf.jar via
build.xml, and adding a unit test which creates a limited classloader
only aware of vcf.jar and tribble.jar and tries to use it to load the core
classes in the vcf jar.
Hopefully third time's the charm.
2011-10-06 12:49:51 -04:00
Mark DePristo
4b5b9155a9
Fixed bad expected value in PedReaderUnitTest
2011-10-06 08:16:47 -07:00
Mark DePristo
3226d5dc0d
Merge branch 'master' into ped
2011-10-05 15:03:09 -07:00
Mark DePristo
e7c80f7c45
Renaming quantitative trait to OtherPhenotype which is now a String not a double
...
-- we can now use PED file to represent population data or other arbitrary phenotype data, not just doubles
2011-10-05 12:26:33 -07:00
Mark DePristo
51ecc20867
getFamily() and associated methods implemented and tested
...
-- Sample no longer serializable
-- Sample now implements Comparable
2011-10-05 09:55:05 -07:00
Mark DePristo
f4bac58f14
Merged bug fix from Stable into Unstable
2011-10-04 21:00:34 -07:00
Mark DePristo
d1d39943d0
Updating MD5 for BAMs that I added a read group to, part 2
2011-10-04 21:00:15 -07:00
Mark DePristo
9bd3ba4c7e
Missed one MD5
2011-10-04 16:04:52 -07:00
Mark DePristo
ffdfdcde3f
Updating MD5s
...
-- Interval test now uses RG containing BAM
-- DoC sample name ordering has changed.
2011-10-04 15:54:45 -07:00
Mark DePristo
463eab7604
All MD5 mismatches for test are shown
...
-- Now for tests like DoC, with 20 output md5s, you see all of the differences before failing.
2011-10-04 15:53:52 -07:00
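The behavior described in 463eab7604 (report every MD5 mismatch before failing, instead of stopping at the first) follows a collect-then-fail pattern. A minimal Python sketch, with hypothetical function names and in-memory byte strings standing in for the test's output files:

```python
import hashlib

def find_md5_mismatches(outputs, expected):
    """Return (name, expected_md5, actual_md5) for every output that differs."""
    mismatches = []
    for name in sorted(outputs):
        actual = hashlib.md5(outputs[name]).hexdigest()
        if actual != expected.get(name):
            mismatches.append((name, expected.get(name), actual))
    return mismatches

def assert_md5s(outputs, expected):
    """Fail once, after all outputs are checked, listing every difference."""
    mismatches = find_md5_mismatches(outputs, expected)
    if mismatches:
        lines = [f"{n}: expected {e}, got {a}" for n, e, a in mismatches]
        raise AssertionError("MD5 mismatches:\n" + "\n".join(lines))

outputs = {"a.txt": b"one", "b.txt": b"two", "c.txt": b"three"}
expected = {"a.txt": hashlib.md5(b"one").hexdigest(), "b.txt": "bad", "c.txt": "bad"}
```

For a test with 20 output MD5s this means one run shows all the differences, rather than 20 fix-and-rerun cycles.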
Mark DePristo
c642a080d4
Merged bug fix from Stable into Unstable
2011-10-04 14:08:41 -07:00
Mark DePristo
941317167e
Updating MD5 for BAMs that I added a read group to
2011-10-04 14:08:00 -07:00
Mark DePristo
e1d6c7a50a
Updating MD5 that have changed due to sample ordering differences
2011-10-04 09:33:23 -07:00
Mark DePristo
343a7b6b2f
Updating UG integration tests for arbitrary impact of sample order changes on downsampling
2011-10-04 08:14:00 -07:00
Mark DePristo
a27641e1fc
Cleaned up imports
2011-10-04 06:28:36 -07:00
Mark DePristo
b20689ff55
No longer supports extraProperties
...
-- the underlying data structure is still present, but until I decide what to do for the extensible system I've completely disabled the subsystem
-- Added code to merge Samples, so that a mostly full record can be merged with a consistent empty record. If the two records are inconsistent, an error is thrown
-- addSample() in Sample.class now invokes mergeSample() when appropriate
-- Validation types are now only STRICT or SILENT
-- Validation code implemented in SampleDBBuilder
-- Extensive unit tests for SampleDBBuilder
2011-10-03 19:20:33 -07:00
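The sample-merging rule from b20689ff55 (a mostly full record merges with a consistent empty record, and inconsistent records raise an error) might look like the following. This is a Python sketch with hypothetical names; samples are modeled as dicts in which None means "not set", which is an assumption about how emptiness is represented.

```python
def merge_samples(first, second):
    """Merge two records for the same sample, failing on conflicting values."""
    merged = dict(first)
    for key, value in second.items():
        if value is None:
            continue  # nothing to contribute for this field
        if merged.get(key) is None:
            merged[key] = value  # fill in a field the other record left empty
        elif merged[key] != value:
            raise ValueError(
                f"inconsistent values for {key}: {merged[key]!r} vs {value!r}")
    return merged
```

An implicitly created parent record (ID only) can therefore be completed later when the full record for that sample is read.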
Mark DePristo
867a7476c1
Systematic unit tests for the sample object
2011-10-03 19:09:02 -07:00
Mauricio Carneiro
3837aa45b4
Fixing conflicts
...
Conflicts:
public/java/test/org/broadinstitute/sting/utils/clipreads/ReadClipperUnitTest.java
2011-10-03 19:07:59 -07:00
Mark DePristo
2e3dc52088
Minor function renaming
2011-10-03 14:41:13 -07:00
Mark DePristo
dd71884b0c
On path to SampleDB engine integration
...
-- PedReader tag parser
-- Separation of SampleDBBuilder from SampleDB (now immutable)
-- Removed old sample engine arguments
2011-10-03 12:08:07 -07:00
Mark DePristo
89ac50e86e
SampleDataSource -> SampleDB
2011-10-03 09:33:30 -07:00
Mark DePristo
93fba06cb5
Support for whitespace only lines
2011-10-03 09:30:10 -07:00
Mark DePristo
0604ce55d1
PedReader support for ; separated lines, not only newline
2011-10-03 09:19:58 -07:00
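The two PedReader changes above (0604ce55d1 and 93fba06cb5: records separated by ";" as well as newlines, and whitespace-only lines skipped) can be sketched together. This is Python for illustration, not the PedReader code; it assumes the conventional six leading PED columns (family, individual, father, mother, sex, phenotype) and a hypothetical dict-based record.

```python
import re

PED_FIELDS = ("family", "individual", "father", "mother", "sex", "phenotype")

def parse_ped(text):
    """Parse PED records separated by newlines or semicolons."""
    records = []
    for line in re.split(r"[;\n]", text):
        if not line.strip():
            continue  # skip empty and whitespace-only lines
        fields = line.split()
        if len(fields) < len(PED_FIELDS):
            raise ValueError(f"expected {len(PED_FIELDS)} fields: {line!r}")
        records.append(dict(zip(PED_FIELDS, fields)))
    return records
```

Accepting ";" as a separator lets a short pedigree be passed as a single string, for example on a command line.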
Mark DePristo
52f670c8b8
100% version of PedReader
...
-- Passes all unit tests
-- Added unit tests for missing fields
2011-10-03 06:12:58 -07:00
Roger Zurawicki
bf6a3a6532
Added framework to do batch CigarClip Testing
...
*NOTE: This commit has not been compiled!
2011-10-02 22:33:46 -04:00
Mark DePristo
dd75ad9f49
95% PedReader
...
-- Passes significant unit tests
-- Implicit sample creation for mom / dad when you create single samples
-- Continuing cleanup of Sample and SampleDataSource
2011-09-30 18:03:34 -04:00
Mark DePristo
84160bd83f
Reorganization of Sample
...
-- Moved Gender and Affection to separate public enums
-- PedReader 90% implemented
-- Improve interface cleanup to XReadLines and UserException
2011-09-30 15:50:54 -04:00
Mark DePristo
56f10b40a8
Fixing test bugs for WindowMaker that required empty sample list
2011-09-30 14:18:27 -04:00
Mark DePristo
30d23942b1
Renamed ReadBackedPileup getXSampleName() functions to getXSample
...
-- now that we don't have Sample objects floating around we don't have to have all of the Name extensions on our functions
2011-09-30 10:02:57 -04:00
Mark DePristo
e055a78f6e
LIBS now requires at least one sample be present
...
-- UnitTest provides a "null" sample for matching the reads without read groups
2011-09-30 09:49:35 -04:00
Mark DePristo
b71b51751e
Bug fix for UnitTest
...
-- Provide the null sample to the LIBS, as this seems to be required for correctly passing this unit test
-- Will be fixed in a future update
2011-09-29 17:30:01 -04:00
Mark DePristo
1765fbeb6b
Merge branch 'master' into ped
2011-09-29 17:18:51 -04:00
Mark DePristo
98ecaf8aa0
Support for ReducedReads with reduced counts and average quals
...
-- ReadUtils and UnitTest updated to support new byte[] style
-- Removed unnecessary read transformer in PairHMM
2011-09-29 17:18:39 -04:00
Mark DePristo
9458f01409
Test cleanup of Sample object
2011-09-29 15:13:05 -04:00
Mark DePristo
625ffb6a07
LocusIteratorByState and ReadBackedPileups no longer use Sample
2011-09-29 14:52:11 -04:00
Mark DePristo
505416b6c0
Merge branch 'master' into ped
2011-09-29 12:22:39 -04:00
Mauricio Carneiro
4086fa768f
Disabling all ReadClipperUnitTests
2011-09-29 12:20:35 -04:00
Mark DePristo
5043d76c3d
Removing more bad uses of SampleDataSource creation
2011-09-29 12:16:34 -04:00
Mark DePristo
5c9227cf5e
Further cleanup of Sample database
...
-- Removing more and more unnecessary code
-- Partial removal of type safe Sample usage. On the road to SampleDB only
2011-09-29 11:50:05 -04:00
Mark DePristo
2a0cd556d3
Further cleanup of Sample
...
-- Cleaned up interface functions in GAE
-- Added Walker.getSampleDB() function which is an easier option for tools to get the samples db
2011-09-29 10:34:51 -04:00
Mark DePristo
e76f381628
Moved sample package from DataSources to gatk, and renamed it samples
...
-- All associated changes to the codebase are just header updates
2011-09-29 09:57:15 -04:00
Mauricio Carneiro
fc86cd6fd8
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/carneiro/gatk/RR into rr
2011-09-29 00:12:15 -04:00
Roger Zurawicki
4fd5630f6a
Added ReadClipper Unit Test
...
* Includes tests that include HardClip to Read and Reference Coords.
* Changed ReadUtils.HardClipByReferenceCoordinates from private to protected to allow for testing
2011-09-28 23:13:50 -04:00
Matt Hanna
9272ed03b5
Merged bug fix from Stable into Unstable
2011-09-28 21:26:43 -04:00
Matt Hanna
0acaf2df65
Fix an embarrassing issue where a specific configuration of minimal coverage
...
over small intervals could cause reads to be dropped from the pileup. Nothing
to see here...
2011-09-28 21:23:01 -04:00
Mark DePristo
4f09453470
Refactored reduced read utilities
...
-- UnitTests for key functions on reduced reads
-- PileupElement calls static functions in ReadUtils
-- Simple routine that takes a reduced read and fills in its quals with its reduced qual
2011-09-26 12:58:31 -04:00
Guillermo del Angel
3eef800889
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-09-24 21:20:11 -04:00
Guillermo del Angel
4707ab4a7d
Added unit tests to test genotype merges with PL's
2011-09-24 21:17:15 -04:00
Guillermo del Angel
203517fbb7
a) Cleanups/bug fixes to previous commit to CombineVariants.
...
b) Change md5 to reflect records that are now merged correctly.
c) Change unit merge alleles test to reflect the fact that a null non-variant vc object is not valid and not supported because there's no way to codify such an object in a vcf. The code correctly converts this to a non-variant single-base event with whatever the reference is at that location.
2011-09-24 19:08:00 -04:00
Guillermo del Angel
cd058dd10f
a) Fixed md5 for legit change in UG output that now also no-calls genotypes w/0,0,0 in PL's in SNP case.
...
b) First reimplementation of new vc merger of different types. Previous version did it in two steps, first merging all vc's per type and then trying to see if resulting vc's would be merged if alleles of one type were a subset of another, but this won't work when uniquifying genotypes since sample names would be messed up and GT sample names wouldn't match VC sample names. Now, it's actually simpler: when splitting vc's by type before merging, we check for alleles of one vc being a subset of alleles of vc of another type and if so we put them together in same list.
2011-09-24 13:40:11 -04:00
Mark DePristo
8d9e136bba
Merge branch 'stable'
2011-09-24 09:26:28 -04:00
Mark DePristo
f792353dcd
Framework for genotype unit test
2011-09-24 08:56:45 -04:00
Mark DePristo
c0bb0cb465
Make DiploidGenotype enum private to walkers.genotyper
2011-09-24 08:48:33 -04:00
Khalid Shakir
1803bd6ae2
Merged bug fix from Stable into Unstable
2011-09-23 21:05:00 -04:00
Khalid Shakir
8ceb93b8ac
Fixed an integration test which crashed on the out of date LSF DRMAA library when run against the obsolete LSF dotkit instead of .combined_LSF_SGE
2011-09-23 21:03:22 -04:00
David Roazen
40202c85e0
Merged bug fix from Stable into Unstable
2011-09-23 16:35:55 -04:00
David Roazen
e1cb5f6459
SnpEff annotator now assigns a functional class to each effect and distinguishes between actual effects and mere modifiers.
...
-We now assign a functional class (nonsense, missense, silent, or none) to each SnpEff effect, and add a
SNPEFF_FUNCTIONAL_CLASS annotation to the INFO field of the output VCF.
-Effects are now prioritized according to both biological impact and functional class, instead of impact only.
-Many of SnpEff's "low-impact" effects are now classified as "modifiers" with lower priority than every
other effect. This includes such "effects" as DOWNSTREAM, UPSTREAM, INTRON, GENE, EXON, and others that
really describe the location of the variant rather than its biological effect.
This code will be short-lived (likely 1.2-only), as the next version of SnpEff will include most of these
features directly.
Checking this change into Stable+Unstable instead of Unstable because the current functional class stratification
in VariantEval is basically broken and urgently needs to be fixed for production purposes.
2011-09-23 16:06:52 -04:00
Mark DePristo
106a26c42d
Minor file cleanup
2011-09-23 08:25:20 -04:00
Mark DePristo
a9f073fa68
Genotype merging unit tests for simpleMerge
...
-- Remaining TODOs are all for GdA
2011-09-23 08:24:49 -04:00
Eric Banks
a8e0fb26ea
Updating md5 because the file changed
2011-09-23 07:33:20 -04:00
Mark DePristo
c49cc623de
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-09-22 17:26:21 -04:00
Mark DePristo
dab7232e9a
simpleMerge UnitTest for not annotating and annotating to different info key
2011-09-22 17:26:11 -04:00
Mark DePristo
30ab3af0c8
A few more simpleMerge UnitTest tests for filtered vcs
2011-09-22 17:14:59 -04:00
Mark DePristo
5cf82f9236
simpleMerge UnitTest tests filtered VC merging
2011-09-22 17:05:12 -04:00
Mark DePristo
46ca33dc04
TestDataProvider now can be named
2011-09-22 17:04:32 -04:00
Mark DePristo
68da555932
UnitTest for simpleMerge for alleles
2011-09-22 15:16:37 -04:00
Eric Banks
80d7300de4
Unit test was passing in FORMAT as one of the sample names. There used to be a hack in the VCFHeader to check for this and remove it and I couldn't figure out why, but now I know. Hack was removed and now the unit test passes in only the sample names as per the contract.
2011-09-22 13:28:42 -04:00
Eric Banks
9c1728416c
Revert "Updating md5 for fixed file" because this was fixed properly in unstable (but will break SnpEff if put into Stable).
...
This reverts commit 6b4182c6ab3e214da4c73bc6f3687ac6d1c0b72c.
2011-09-22 13:16:42 -04:00
Eric Banks
888d8697b1
Merged bug fix from Stable into Unstable
2011-09-22 13:16:31 -04:00
Eric Banks
15a410b24b
Updating md5 for fixed file
2011-09-22 13:15:41 -04:00
Mark DePristo
ba5f83fee2
start of VariantContextUtils UnitTest
...
-- tests rsID merging
2011-09-22 12:10:39 -04:00
Mark DePristo
a05c959e5a
Empty unit tests for VariantContextUtils
...
-- will be expanded over the day
2011-09-22 11:20:07 -04:00
Mark DePristo
3fdee2b9ed
Merge from stable into unstable
2011-09-22 11:19:43 -04:00
Mark DePristo
c514df6d18
Merge of stable into unstable
2011-09-22 10:34:27 -04:00
Mark DePristo
f81a41b889
Updating MD5s for CombineVariants
...
-- Old version had broken RSIDs, new version is fixed. No longer see rs1234,. as it is now just rs1234
2011-09-22 10:30:25 -04:00
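The rsID fix in f81a41b889 (merged IDs no longer come out as "rs1234,." but as "rs1234") amounts to dropping the missing-value marker before joining. A small Python sketch with a hypothetical function name; the comma separator is taken from the commit message, and de-duplication is an assumption.

```python
MISSING = "."

def merge_ids(ids):
    """Join the distinct, non-missing IDs; return '.' if none remain."""
    kept = []
    for identifier in ids:
        if identifier != MISSING and identifier not in kept:
            kept.append(identifier)
    return ",".join(kept) if kept else MISSING
```

The missing marker only survives when every input record lacks an ID.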
Eric Banks
b8ea9ceb68
Adding integration test that uses the -V:dbsnp binding to make sure it won't fail later on if someone messes with Tribble.
2011-09-21 22:43:31 -04:00
Mark DePristo
6bcfce225f
Fix for dynamic type determination for bgzip files
...
-- GZipInputStream handles bgzip files under linux, but not mac
-- Added BlockCompressedInputStream test as well, which works properly on bgzip files
2011-09-21 15:39:19 -04:00
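One way to tell bgzip (BGZF) data from plain gzip, in the spirit of the dynamic type determination in 6bcfce225f, is to inspect the header bytes. The Python sketch below follows the BGZF header layout from the SAM specification and assumes the "BC" subfield comes first in the gzip extra field; it is not the check the GATK or its BlockCompressedInputStream actually performs.

```python
import gzip

def is_gzip(header):
    """Any gzip member starts with the magic bytes 0x1f 0x8b."""
    return header[:2] == b"\x1f\x8b"

def is_bgzf(header):
    """BGZF is gzip with deflate, FEXTRA set, and a 2-byte 'BC' subfield."""
    return (
        len(header) >= 16
        and is_gzip(header)
        and header[2] == 8              # CM: deflate
        and bool(header[3] & 4)         # FLG: FEXTRA bit set
        and header[12:14] == b"BC"      # SI1, SI2
        and header[14:16] == b"\x02\x00"  # SLEN = 2, little-endian
    )

# A hand-built BGZF block header (BSIZE value is arbitrary here).
bgzf_header = (b"\x1f\x8b\x08\x04" + b"\x00" * 4 + b"\x00\xff"
               + b"\x06\x00" + b"BC" + b"\x02\x00" + b"\x1b\x00")
plain_header = gzip.compress(b"x")[:16]
```

Because every BGZF file is also valid gzip, the BGZF test has to run first; a gzip-only check cannot distinguish the two.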