Mark DePristo
e3ff4f4266
Failing MD5 because output now contains absolute path
2011-10-10 11:05:02 -04:00
Mark DePristo
3e6c16d961
CombineVariants preserves allele order
2011-10-10 11:04:38 -04:00
Mark DePristo
a4bb842958
RankSum tests have lightly different MD5 results based on allele order
...
-- UG GENOTYPE_GIVEN_ALLELES now uses the order of alleles in the VCF, so this changes the MD5
2011-10-10 11:04:07 -04:00
Mark DePristo
46e7370128
this.allele, getAlleles(), and getAltAlleles() now return List not set
...
-- Changes associated code throughout the codebase
-- Updated necessary (but minimal) UnitTests to reflect new behavior
-- Much better makealleles() function in VC.java that enforces a lot of key constraints in VC
2011-10-09 11:45:55 -07:00
Mark DePristo
822654b119
UnitTests for allele getting functions in VC in prep for move from set to list
2011-10-09 10:36:14 -07:00
Mark DePristo
c67f6c076b
simpleMerge now preserves allele order
...
-- UnitTests for dangerous PL merging cases in the multi-allelic case. The new behavior is correct
2011-10-08 17:39:53 -07:00
Mark DePristo
e94e6ba101
A UnitTest to ensure that the order of alleles is maintained
...
-> A, C, T and A, T, C are different and must be maintained. The constructors were doing this appropriately, so nothing needed to be changed
2011-10-08 08:47:58 -07:00
Mark DePristo
ec14a4a606
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-10-07 08:38:50 -07:00
Matt Hanna
6fbd41724a
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-10-07 11:20:00 -04:00
Matt Hanna
4514bc350f
More reliable way of finding the Tribble jar.
2011-10-07 11:19:29 -04:00
Eric Banks
181c76750e
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-10-06 22:38:55 -04:00
Eric Banks
ca9cd9b688
Minor fix for merging intervals which hadn't been necessary when only merging from the left to right. Added integration tests to cover the parallelization of RTC.
2011-10-06 22:38:44 -04:00
Khalid Shakir
f91b015e0e
Made the BaseTest.testDir absolute
2011-10-06 22:33:21 -04:00
Mark DePristo
c7864c7256
Filter application order is now deterministic, in the order defined by the walker
...
-- For no apparent reason we were using a HashSet to store the ReadFilters, so the order of operations was really arbitrarily applied. The order now is
(1) the order of the walker intrinsic filters
(2) read group black list (if provided)
(3) command line filters (if provided)
2011-10-06 18:51:40 -07:00
Mark DePristo
0b88af4af9
Counts of records failing filters are displayed sorted
...
-- Stops random ordering of the output, as the counts are returned sorted by string name of the class
-- Deleted now unused sh*tty assessors in Utils
2011-10-06 18:42:26 -07:00
Mark DePristo
d1e70d6ec2
Removed Nx counting of reads in metrics with -nt > 1
2011-10-06 18:29:26 -07:00
Eric Banks
c61804a450
Rename the long version of the argument name to more accurately reflect its purpose.
2011-10-06 16:14:04 -04:00
Eric Banks
61a3dfae24
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-10-06 15:58:04 -04:00
Eric Banks
6eb87bf58a
RTC now caches all intervals as GenomeLocs (which is expected to take < 1Gb whole genome based on back of the envelope calculations with Matt) so that 1) we don't have to worry about emitting outside of the leaves in the hierarchical reductions and 2) we can emit the intervals in sorted order which is a big performance plus for the realigner. Integration tests change only because intervals whose start=stop are now printed as chr:start instead of chr:start-stop.
2011-10-06 15:57:49 -04:00
Mark DePristo
6d9c210460
Updating MD5s for updated BAM with read groups
2011-10-06 12:15:48 -07:00
Mark DePristo
ab357ef900
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-10-06 10:50:02 -07:00
Eric Banks
1b0735f0a3
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-10-06 13:41:45 -04:00
Eric Banks
c4dfc1fb8b
Temporary commit of parallelization support for RealignerTargetCreator. Tim begged us for this and I got assurances from Khalid/Matt that this would also be extremely helpful for the whole genome calling pipeline, so I spent a while working on this. Needs to be fixed up though because apparently only the leaves in the hierarchical reduce get their output aggregated. Worked out a better solution with Matt.
2011-10-06 13:41:36 -04:00
Matt Hanna
3961733590
Merged bug fix from Stable into Unstable
2011-10-06 12:54:52 -04:00
Matt Hanna
4fa5045e84
Abandoning classfileset/rootfileset approach due to difficulting managing
...
classloading of bcel*.jar/ant-apache-bcel*.jar. Switching instead to manually
specifying a minimal set of packages/classes to include in the vcf.jar via
build.xml, and adding a unit test which creates a limited classloader
only aware of vcf.jar and tribble.jar and tries to use it to load the core
classes in the vcf jar.
Hopefully third time's the charm.
2011-10-06 12:49:51 -04:00
Mark DePristo
73f9d1f217
GATK read group requirement iron hand
...
-- The GATK will now throw a user exception if it opens a SAM/BAM file that doesn't have at least one RG defined
-- LIBS again throws an error if the complete list of samples isn't provided
-- Updating ExmpleCountLociPipeline test to use the well-formated versions of the exampleBAM and exampleFASTA files in testdata, instead of the old broken ones in validation_data.
-- Convenience constructors for UserExceptions.MalformedBAM
2011-10-06 08:40:35 -07:00
Mark DePristo
23845ac798
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-10-06 08:17:08 -07:00
Mark DePristo
4b5b9155a9
Fixed bad expected value in PedReaderUnitTest
2011-10-06 08:16:47 -07:00
Mark DePristo
daa5999489
Fixed typo in argument description
2011-10-06 08:16:25 -07:00
Guillermo del Angel
8a474e38ff
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-10-06 10:08:39 -04:00
Guillermo del Angel
93f7e632bd
Minor fix/enhancement for VariantEval: if a vcf has symbolic alleles, program would crash ungracefully - now we'll just skip record without processing. This is a big issue since we can't process 1000G integration files with code as is.
2011-10-06 10:07:46 -04:00
Mark DePristo
190be4d0d1
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-10-05 21:27:11 -07:00
Mark DePristo
8e6845806a
Allowing empty samples list in LIBS
...
-- Right now we cannot process BAM files without read groups because we enforce the samples list to not be empty when there's a SAM record. Now if there are reads and there are no samples we add the "null" sample so that LIBS walks the reads properly
2011-10-05 21:26:21 -07:00
Matt Hanna
180c8f286f
Merged bug fix from Stable into Unstable
2011-10-05 20:37:43 -04:00
Matt Hanna
55b9f06527
Ensure that IndelRealigner n-way out option supports MD5 generation.
2011-10-05 20:36:28 -04:00
Mark DePristo
be2d29ce69
Final PED documentation
2011-10-05 15:17:41 -07:00
Mark DePristo
3226d5dc0d
Merge branch 'master' into ped
2011-10-05 15:03:09 -07:00
Mark DePristo
6a573437af
Details documentation arguments for -ped
2011-10-05 15:00:58 -07:00
Mark DePristo
e7c80f7c45
Renaming quantitative trait to OtherPhenotype which is now a String not a double
...
-- we can now use PED file to represent population data or other arbitrary phenotype data, not just doubles
2011-10-05 12:26:33 -07:00
Mark DePristo
51ecc20867
getFamily() and associated methods implemented and tested
...
-- Sample no longer serializable
-- Sample now implements Comparable
2011-10-05 09:55:05 -07:00
Mark DePristo
f4bac58f14
Merged bug fix from Stable into Unstable
2011-10-04 21:00:34 -07:00
Mark DePristo
d1d39943d0
Updating MD5 for BAMs that I added a read group to, part 2
2011-10-04 21:00:15 -07:00
Mark DePristo
9bd3ba4c7e
Missed one MD5
2011-10-04 16:04:52 -07:00
Mark DePristo
ffdfdcde3f
Updating MD5s
...
-- Interval test now uses RG containing BAM
-- DoC sample name ordering has changed.
2011-10-04 15:54:45 -07:00
Mark DePristo
a45d985818
TODO method stubs
2011-10-04 15:54:09 -07:00
Mark DePristo
463eab7604
All MD5 mismatches for test are shown
...
-- Now for tests like DoC, with 20 output md5s, you see all of the differences before failing.
2011-10-04 15:53:52 -07:00
Mark DePristo
c642a080d4
Merged bug fix from Stable into Unstable
2011-10-04 14:08:41 -07:00
Mark DePristo
941317167e
Updating MD5 for BAMs that I added a read group to
2011-10-04 14:08:00 -07:00
Mark DePristo
e1d6c7a50a
Updating MD5 that have changed due to sample ordering differences
2011-10-04 09:33:23 -07:00
Mark DePristo
343a7b6b2f
Updating UG integration tests for arbitrary impact of sample order changes on downsampling
2011-10-04 08:14:00 -07:00