gatk-3.8

Commit Graph

Author	SHA1	Message	Date
depristo	bd75a8d168	Unused code has been removed git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1599 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-12 19:12:23 +00:00
depristo	e8d544869d	Alignment context now supports the idea of skipped bases -- not currently in use git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1598 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-12 19:11:38 +00:00
depristo	3ad97e4ab4	Easier to print GenomeLoc compareTo() git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1597 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-12 19:10:35 +00:00
depristo	3949b4ac72	commented out version of next() and hasNext() that appear to be correct but are causing testing problems git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1596 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-12 19:09:21 +00:00
depristo	58105636c8	getBoundRods() convenience method git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1595 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-12 19:07:57 +00:00
depristo	4e1eded389	Fixed bad compareTo operator git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1594 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-12 19:07:10 +00:00
depristo	17ab1d8b25	General purpose merging iterator implementation git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1593 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-12 19:06:15 +00:00
hanna	275707f5f6	Data structure for counts, to isolate the user from wonky 'sometimes counts are cumulative, other times base-by-base' gotchas. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1592 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-11 20:53:24 +00:00
depristo	7c8b17b456	fix for SSG with pl name git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1591 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-11 20:39:34 +00:00
andrewk	5354c1876c	De Novo SNP caller as presented at 1KG meeting on 9/10/09 with min LOD 5 calls required from both parents and a LOD 5 call in the daughter gold standard concordant call set. All SNP calls must be present as bound RODs. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1590 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-11 19:30:23 +00:00
hanna	0f3049652a	Start to build BWT abstractions, so we can present a reasonable facsimile of the BWT to the user no matter how it's represented on disk. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1589 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-11 18:23:15 +00:00
chartl	c3f77acd5e	Alteration to CoverageAndPowerWalker. It can now be flagged with -uc which will cause it to print not only the coverage on each strand that exceeds the quality score threshold, but also the total coverage on each strand as well. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1588 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-11 17:55:44 +00:00
chartl	d6a0b65ac9	Changes: Rollback of Variant-related changes of r1585, additional PGC code git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1586 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-11 16:23:01 +00:00
chartl	0c54aba92a	Changes: @VariantEvalWalker - added a command line option to input a file path to a pooled call file for pooled genotype concordance checking. This string is to be passed to the PooledGenotypeConcordance object. @AllelicVariant - added a method isPooled() to distinguish pooled AllelicVariants from unpooled ones. @ all the rest - implemented isPooled(); for everything other than PooledEMSNProd it simply returns false, for PooledEMSNProd it returns true. Added: @PooledGenotypeConcordance - takes in a filepath to a pool file with the names of hapmap individuals for concordance checking with pooled calls and does said concordance checking over all pools. Commented out as all the methods are as yet unwritten. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1585 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-11 15:01:50 +00:00
ebanks	e24c8d00d5	So, the VCF spec allows for an optional meta field in the header representing the date. However, using this field means that integration tests run on the vcf file will fail the MD5 test (which is what happened to the VariantFiltration test this morning after working just fine yesterday). After consulting our resident expert (Aaron), we're going to (temporarily) remove the date from the vcf output until we can come up with a better solution. However, this shouldn't cause any short-term problems because the data truly is optional. VF test's MD5s are updated. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1580 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-10 14:28:43 +00:00
aaron	296878e8e3	adding a basic implementation of the Variation interface. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1578 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-10 04:41:13 +00:00
aaron	5a64a80ab5	changes to the variation class, updates to SSG, updated tests based on changes to the SSGenotypeCall, and added the ability to run a single integration test from using the build script. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1577 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-10 04:31:33 +00:00
depristo	c988205884	Notes for Aaron in SSG git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1576 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-10 03:18:51 +00:00
ebanks	1362a56227	Added fasta tests and small fix to cleaner test git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1575 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-10 03:13:11 +00:00
ebanks	8ca89279aa	Added a test for VariantFiltration and the VECs git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1574 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-10 02:21:14 +00:00
hanna	6de54dcd2a	Higher-level readers and writers for BWTs and suffix arrays. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1573 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-09 22:45:32 +00:00
depristo	0093482c62	N reference base fix for SSG git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1572 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-09 21:19:36 +00:00
hanna	bc9fe31cf5	Cleanup of int-packed file readers / writers. All primitive writers for BWTs and SAs are in place; time to move on to compound reader / writers. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1571 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-09 20:36:39 +00:00
asivache	d9f3e9493f	Does not return 0-length cigar elements anymore (used to do so when previous cigar element ended exactly at the segment boundary) git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1570 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-09 20:05:55 +00:00
ebanks	cb31d5a0ab	VariantFiltration now outputs VCF. Important changes: 1. VariantsToVCF can now be called statically to output VCF for a single ROD instance; this is temporary until we have a VCF ROD. 2. VariantFiltration now outputs only 2 files, both mandatory: all variants that pass filters in geli text, and all variants in VCF. If there are any problems, go find Aaron. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1569 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-09 20:04:32 +00:00
asivache	dd0085c428	1) now is tolerant to sloppy cigar strings with 0-length elements (at the price of extra recursive call) 2) when reads with deletions are requested, adds to the pile just those: reads with 'D' over the current reference base, but not 'N' 3) next() now implements a loop: recursive forward iteration calls to next() until ref. position with non-zero coverage is encountered were OK for (short) deletions, but with long stretches of N's they end up with stack overflow git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1568 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-09 20:04:04 +00:00
ebanks	542af6402e	output correct format for Sequenom SNPs git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1567 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-09 19:21:53 +00:00
hanna	43d1c6741c	Cleanup. Separate common packing functionality into utils class. Make base packing utility as generic as possible. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1566 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-09 17:54:12 +00:00
kiran	3b1e966b4c	Lowercases the sequencing platform so that a difference in case doesn't lead to the failure to look up an entry in the hash. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1565 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-09 17:35:45 +00:00
kiran	d82d6c0665	Excludes variants that fall below a certain LOD that changes as a function of depth. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1564 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-09 17:34:16 +00:00
kiran	06eae52292	Throws an exception if you attempt to use a filter that doesn't exist. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1563 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-09 17:33:27 +00:00
asivache	1060b36288	Bug fix: 'N' cigar elements now treated properly; for all practical intents and purposes, N is the same as D and should be treated as such, the difference is only in logical interpretation. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1562 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-09 17:08:35 +00:00
ebanks	bed646e4f6	Adding cleaner test git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1561 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-09 16:05:56 +00:00
chartl	9c7f456510	Changed the short name on the PoolSize cmd line argument git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1560 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-09 15:53:22 +00:00
chartl	9d69bd2c84	Modifications: @CoverageAndPowerWalker - removed a hanging colon that was being printed after the reference position @VariantEvalWalker - added a command line argument for pool size for eventual use in doing pooled caller evaluations. As now, the variable is unused. @AlignmentContext - altered the scope of class variables from private to protected in order that child objects might have access to them New Additions: Filtered Contexts Sometimes we want to filter or partition reads by some aspect (quality score, read direction, current base, whatever) and use only those reads as part of the alignment context. Prior to this I've been doing the split externally and creating a new AlignmentContext object. This new approach makes it a bit easier, as each of these objects are children of AlignmentContext, and can be instantiated from a "raw" AlignmentContext. @FilteredAlignmentContext is an abstract class that defines the behavior. The abstract method 'filter' is called on the input AlignmentContext, filtering those reads and offsets by whatever you can think of. The filtered reads/offsets are then maintained in the reads and offsets fields. These classes can be passed around as AlignmentContexts themselves. Writing a new kind of read-filtered alignment context boils down to implementing the filter method. @ReverseReadsContext - a FilteredAlignmentContext that takes only reads in the reverse direction @ForwardReadsContext - a FilteredAlignmentContext that takes only reads in the forward direction @QualityScoreThresholdContext - a FilteredAlignmentContext that takes only reads above a given quality score threshold (defaults to 22 if none provided). A unit test bamfile and associated unit tests for these are in the works. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1559 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-09 15:49:52 +00:00
depristo	d9588e6083	bug fixes to LIBS and LIBH following ultra-aggressive regression testing across 454, solid, and solexa git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1558 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-09 15:36:12 +00:00
asivache	0721c450c2	Bug fix: single unmapped read now keeps mapping qual 0 after remapping, not 37! git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1557 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-09 15:29:34 +00:00
asivache	df11618092	Set default value of useLocusIteratorByHanger to FALSE. Otherwise the -LIBH flag is useless and there'd be no way to "unset" the 'true' value. Old version was (always) using LocusIteratorByHanger. Now default iterator is indeed LocusIteratorByState, and -LIBH will switch back to the old one git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1556 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-09 15:09:09 +00:00
aaron	0df6a9da5c	-Separating out normal (unit) tests and integration tests. From now on if your tests are more of an integration test (i.e. you're testing a walker and all the subunits it relies on) please name the test "______IntegrationTest.java" instead of "______Test.java". -Bamboo will now run the integration tests once a day, and the normal unit tests on each check-in. -Also added a bunch of unit tests for VariantEval walker git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1555 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-09 15:01:40 +00:00
depristo	eeb9b6eb13	GenotypeLikelihoods now support a cache per subclass, avoiding genotyping clashes git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1554 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-09 10:39:14 +00:00
ebanks	0cc219c0df	-Added unit test for walkers dealing with intervals for cleaning -I also uncovered a corner case in the cleaner that for some reason was commented out but shouldn't have been. Hooray for unit tests! git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1553 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-09 02:35:17 +00:00
depristo	ec0f6f23c7	LocusIteratorByState is now the system default. Fixed Aaron's build problem git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1552 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-09 01:28:05 +00:00
aaron	ea6ffd3796	initial VariantEvalWalker test. More to be added soon... Also fixed the case where MD5 sums had leading zeros clipped off git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1551 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-09 01:02:04 +00:00
hanna	adce3bd536	My reference implementation is now generating a BWT which matches BWT-SW's. Note to self: never give project status in an svn log. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1550 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-08 22:11:03 +00:00
hanna	f22f590192	Successfully writing .sa files. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1549 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-08 17:34:34 +00:00
sjia	600c234643	Starting code on phasing git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1548 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-08 15:20:38 +00:00
aaron	3276e01e5f	fixing the build git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1546 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-08 13:13:55 +00:00
kiran	f963cfcb21	Made enum listing header fields public. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1545 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-08 06:12:59 +00:00
kiran	fd20f5c2e8	For a file or files backed by a ROD implementing AllelicVariant, outputs a VCF file summarizing the information. Metadata like Hapmap and dbSNP membership, genotype LOD, read depth, etc, are annotated appropriately. The results output by this program are equivalent to those given by Gelis2PopSNPs.py. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1544 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-08 06:12:18 +00:00
ebanks	4a95f2181d	print out the right variant git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1543 348d0f76-0448-11de-a6fe-93d51630548a	2009-09-08 01:37:35 +00:00

1312 Commits (bd75a8d1685af44478db13707103db984d1a88ab)