gatk-3.8

Commit Graph

Author	SHA1	Message	Date
droazen	ca42be9788	Merge pull request #78 from broadinstitute/dr_pdfgen_bamboo_script_GSA-794 Trivial shell script for bamboo to trigger the website pdfgen script	2013-02-28 11:48:41 -08:00
David Roazen	b050d16b22	Trivial shell script for bamboo to trigger the website pdfgen script	2013-02-28 14:45:25 -05:00
depristo	92d6a4f441	Merge pull request #75 from broadinstitute/eb_missing_rg_error_GSA-407 Added better error message for BAMs with bad read groups.	2013-02-28 05:20:39 -08:00
depristo	cac3f80c64	Merge pull request #73 from broadinstitute/eb_remove_nested_hashmap_GSA-732 Replace uses of NestedHashMap with NestedIntegerArray.	2013-02-28 05:19:56 -08:00
Eric Banks	12fc198b80	Added better error message for BAMs with bad read groups. * Split the cases into reads that don't have a RG at all vs. those with a RG that's not defined in the header. * Added integration tests to make sure that the correct error is thrown. * Resolved GSA-407.	2013-02-27 16:02:56 -05:00
Eric Banks	45fc0ed261	Merge pull request #74 from broadinstitute/eb_update_rtc_docs_GSA-716 Update docs for RTC.	2013-02-27 11:58:09 -08:00
Eric Banks	d2904cb636	Update docs for RTC.	2013-02-27 14:56:44 -05:00
MauricioCarneiro	97b332943b	Merge pull request #64 from broadinstitute/md_agbt	2013-02-27 11:41:04 -08:00
Eric Banks	69b8173535	Replace uses of NestedHashMap with NestedIntegerArray. * Removed from codebase NestedHashMap since it is unused and untested. * Integration tests change because the BQSR CSV is now sorted automatically. * Resolves GSA-732	2013-02-27 14:03:39 -05:00
Eric Banks	4b1071a815	Merge pull request #68 from broadinstitute/aw_reduce_reads_perf Eliminate 7-element arrays in BaseCounts and BaseAndQualsCount and repla...	2013-02-27 10:03:36 -08:00
Alec Wysoker	c8368ae2a5	Eliminate 7-element arrays in BaseCounts and BaseAndQualsCount and replace with in-line primitive attributes. This is ugly but reduces heap overhead, and changes are localized. When used in conjunction with Mauricio's FastUtil changes it saves an additional 9% or so of execution time.	2013-02-27 12:49:56 -05:00
Ryan Poplin	69f6d53494	Merge pull request #72 from broadinstitute/md_profile_hc_vs_ug_GSA-749 GATKPerformanceOverTime now includes a mode to compare HC & UG performance	2013-02-27 08:14:00 -08:00
Mark DePristo	b987df5d8d	GATKPerformanceOverTime now includes a mode to compare HC & UG performance -- Compares HC and UG performance on single deep genomes and 140 WEx bams over small intervals, as well as deep WGS reduced data and 1000G 4x data -- Added mode to GATKPerformanceOverTime to include lots of versions, so we can make beautiful graphs of the cost of tools over many versions as well -- Marginally better plots for multiple iterations in GATKPeformanceOverTime.R	2013-02-27 10:58:34 -05:00
David Roazen	752f4335a5	Merged bug fix from Stable into Unstable	2013-02-27 05:20:41 -05:00
David Roazen	2a7af43164	Fix improper dependencies in QScripts used by pipeline tests, and attempt to fix the flawed MisencodedBaseQualityUnitTest -Some QScripts used by public pipeline tests unnecessarily used the (now protected) UnifiedGenotyper. Changed them to use PrintReads instead. -Moved ExampleUnifiedGenotyperPipelineTest to protected -Attempt to fix the flawed and sporadically failing MisencodedBaseQualityUnitTest: After looking at this class a bit, I think the problem was the use of global arrays for the quals shared across all reads in all tests (BAMRecord class definitely does not make a separate copy for each read!). One test (testFixBadQuals) modifies the bad quals array, and if this happens to run before the testBadQualsThrowsError test the bad quals array will have been "fixed" and no exception will be thrown.	2013-02-27 04:45:53 -05:00
David Roazen	6466463d5a	Merged bug fix from Stable into Unstable	2013-02-26 21:54:54 -05:00
David Roazen	12a3d7ecad	Fix licenses on files modified in 2.4-1	2013-02-26 21:53:17 -05:00
David Roazen	a53b4a7521	Merged bug fix from Stable into Unstable	2013-02-26 21:41:13 -05:00
David Roazen	65d31ba4ad	Fix runtime public -> protected dependencies in the test suite -replace unnecessary uses of the UnifiedGenotyper by public integration tests with PrintReads -move NanoSchedulerIntegrationTest to protected, since it's completely dependent on the UnifiedGenotyper	2013-02-26 21:19:12 -05:00
droazen	dd338bebd0	Merge pull request #70 from broadinstitute/dr_nightly_build_script_adjustments Nightly build script improvements	2013-02-26 14:46:09 -08:00
David Roazen	d2f4626bdd	Nightly build script improvements -Include the word "nightly" in the version -Add a ".tar.bz2" extension to the symlinks for the current build	2013-02-26 17:43:19 -05:00
depristo	7c3f8d384b	Merge pull request #69 from broadinstitute/dr_nightly_build_script_GSATDG-78 Shell script to release GATK nightly builds	2013-02-26 13:58:39 -08:00
David Roazen	3680879926	Shell script to release GATK nightly builds -publishes GATK jar + accompanying GATKDocs archive to a new nightly build directory -nightly builds are versioned by date rather than tag	2013-02-26 16:53:42 -05:00
depristo	93205154b5	Merge pull request #63 from broadinstitute/eb_fix_pairhmm_unittest_GSA-776 Eb fix pairhmm unittest gsa 776	2013-02-26 11:56:58 -08:00
Eric Banks	734353e9df	Merge pull request #60 from broadinstitute/mc_fastutil_GSATDG-83 Brought all of ReduceReads to fastutils	2013-02-26 11:56:41 -08:00
Eric Banks	3ce0a32da7	Merge remote-tracking branch 'unstable/master'	2013-02-26 14:48:39 -05:00
Eric Banks	7a7adb79f1	Merge pull request #67 from broadinstitute/dr_release_script_disable_validation Temporarily disable paranoid validation in the release scripts	2013-02-26 11:25:01 -08:00
Eric Banks	2cf0dc9939	Merge pull request #66 from broadinstitute/mc_retire_coveragebysample_walker_GSATDG-90 Archiving CoverageBySample	2013-02-26 11:19:09 -08:00
David Roazen	2b13af042d	Temporarily disable paranoid validation in the release scripts These validation steps are not strictly necessary, and would fail with the protected repo right now, as it currently lacks a master branch.	2013-02-26 14:17:39 -05:00
Mauricio Carneiro	711cbd3b5a	Archiving CoverageBySample This walker was not updated since 2009, and users were getting wrong answers when running it with ReduceReads. I don't want to deal with this because DiagnoseTargets does everything this walker does.	2013-02-26 13:49:00 -05:00
Ryan Poplin	357a05683d	Merge pull request #65 from broadinstitute/dr_change_haplotypecaller_downsampling_settings_GSA-699 Change default downsampling coverage target for the HaplotypeCaller to 2...	2013-02-26 10:33:19 -08:00
David Roazen	8b29030467	Change default downsampling coverage target for the HaplotypeCaller to 250 -was previously set to 30, which seems far too aggressive given that with ActiveRegionWalkers, as with LocusWalkers, this limits the depth of any pileup returned by LIBS -250 is a more conservative default used by the UG -can adjust down/up later based on further experiments (GSA-699 will remain open) -verified with Ryan that all integration test differences are either innocent or represent an improvement GSA-699	2013-02-26 09:33:25 -05:00
depristo	51d618de97	Merge pull request #62 from broadinstitute/rp_increase_max_kmer_in_assembly The maximum kmer length is derived from the reads.	2013-02-26 05:37:02 -08:00
Mark DePristo	79d1050457	AGBT analysis scripts -- Simple scripts to realign BAMs around indels and run HC scatter gathered -- AGBT analysis R script	2013-02-25 16:01:46 -05:00
depristo	ed5aff3702	Merge pull request #55 from broadinstitute/dr_fix_sequence_dictionary_validation_GSA-768 Sequence dictionary validation: detect problematic contig indexing differences	2013-02-25 12:39:56 -08:00
Eric Banks	396b7e0933	Fixed the intermittent PairHMM unit test failure. The issue here is that the OptimizedLikelihoodTestProvider uses the same basic underlying class as the BasicLikelihoodTestProvider and we were using the BasicTestProvider functionality to pull out tests of that class; so if the optimized tests were run first we were unintentionally running those same tests again with the basic ones (but expecting different results).	2013-02-25 15:05:13 -05:00
Eric Banks	7519484a38	Refactored PairHMM.initialize to first take haplotype max length and then the read max length so that it is consistent with other PairHMM methods.	2013-02-25 15:04:23 -05:00
Ryan Poplin	89e2943dd1	The maximum kmer length is derived from the reads. -- This is done to take advantage of longer reads which can produce less ambiguous haplotypes -- Integration tests change for HC and BiasedDownsampling	2013-02-25 14:40:25 -05:00
MauricioCarneiro	bd9875aff5	Merge pull request #61 from broadinstitute/dr_update_release_scripts 1. removed all directives related to gatklite (we're getting rid of this distribution) 2. adapting scripts to the new gsa-protected repository	2013-02-25 10:37:59 -08:00
Mauricio Carneiro	0ff3343282	Addressing Eric's comments -- added @param docs to the new variables -- made all variables final -- switched to string builder instead of String for performance. GSATDG-83	2013-02-25 13:33:47 -05:00
David Roazen	3645ea9bb6	Sequence dictionary validation: detect problematic contig indexing differences The GATK engine does not behave correctly when contigs are indexed differently in the reads sequence dictionaries vs. the reference sequence dictionary, and the inconsistently-indexed contigs are included in the user's intervals. For example, given the dictionaries: Reference dictionary = { chrM, chr1, chr2, ... } BAM dictionary = { chr1, chr2, ... } and the interval "-L chr1", the engine would fail to correctly retrieve the reads from chr1, since chr1 has a different index in the two dictionaries. With this patch, we throw an exception if there are contig index differences between the dictionaries for reads and reference, AND the user's intervals include at least one of the mismatching contigs. The user can disable this exception via -U ALLOW_SEQ_DICT_INCOMPATIBILITY In all other cases, dictionary validation behaves as before. I also added comprehensive unit tests for the (previously-untested) SequenceDictionaryUtils class. GSA-768 #resolve	2013-02-25 11:14:22 -05:00
David Roazen	baa3b15207	Update release scripts in preparation for open-sourcing protected	2013-02-25 10:17:16 -05:00
Eric Banks	f62dd84869	Merge pull request #57 from broadinstitute/rp_bubble_traversal_merge_GSA-680 Rp bubble traversal merge gsa 680	2013-02-24 05:08:05 -08:00
Mauricio Carneiro	9e5a31b595	Brought all of ReduceReads to fastutils -- Added unit tests to ReduceReads name compression -- Updated reduce reads walker for unit testing GSATDG-83	2013-02-23 22:53:23 -05:00
Ryan Poplin	6a639c8ffc	Replace Smith-Waterman alignment with the bubble traversal. -- Instead of doing a full SW alignment against the reference we read off bubbles from the assembly graph. -- Smith-Waterman is run only on the base composition of the bubbles which drastically reduces runtime. -- Refactoring graph functions into a new DeBruijnAssemblyGraph class. -- Bug fix in path.getBases(). -- Adding validation code to the assembly engine. -- Renaming SimpleDeBruijnAssembler to match the naming of the new Assembly graph class. -- Adding bug fixes, docs and unit tests for DeBruijnAssemblyGraph and KBestPaths classes. -- Added ability to ignore bubbles that are too divergent from the reference -- Max kmer can't be bigger than the extension size. -- Reverse the order that we create the assembly graphs so that the bigger kmers are used first. -- New algorithm for determining unassembled insertions based on the bubble traversal instead of the full SW alignment. -- Don't need the full read span reference loc for anything any more now that we clip down to the extended loc for both assembly and likelihood evaluation. -- Updating HaplotypeCaller and BiasedDownsampling integration tests. -- Rebased everything into one commit as requested by Eric -- improvements to the bubble traversal are coming as a separate push	2013-02-22 15:42:16 -05:00
depristo	2ad559cf58	Merge pull request #59 from broadinstitute/mc_reving_testng_GSA-695 Updating TestNG to the latest version	2013-02-22 10:39:04 -08:00
depristo	50612ac981	Merge pull request #58 from broadinstitute/mc_callset_assesment_GSATDG-52 AGBT scripts, tool updates and misc	2013-02-22 07:23:59 -08:00
Mauricio Carneiro	3f901ff0e7	R scripts for coverage analysis of the genome (AGBT13) -- script that generates a scatterplot of the poorly covered regions versus PCR+ -- script that calculates the uncovered portion of the genome	2013-02-22 10:19:01 -05:00
Mauricio Carneiro	e3f01673e1	Implementation of the find and diagnose Queue script -- Added 'uncovered intervals' output for FindCoveredIntervals -- updated scala script to make use of it.	2013-02-22 10:19:01 -05:00
Mauricio Carneiro	15a8f6d82e	Coverage analysis by variant type R script (for AGBT13)	2013-02-22 10:19:01 -05:00
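
The check described in commit 3645ea9bb6 (sequence dictionary validation) can be illustrated with a minimal sketch. This is not the GATK implementation, which lives in the Java class SequenceDictionaryUtils; the function names `mismatched_contigs` and `check_contig_indexing` and the `allow_incompatibility` parameter are hypothetical, with the latter standing in for `-U ALLOW_SEQ_DICT_INCOMPATIBILITY`. Dictionaries are modelled as plain ordered lists of contig names, and only the index comparison from the commit message is covered.

```python
def mismatched_contigs(reference_contigs, reads_contigs):
    """Return contigs present in both dictionaries but at different indices."""
    reference_index = {name: i for i, name in enumerate(reference_contigs)}
    return {
        name
        for i, name in enumerate(reads_contigs)
        if name in reference_index and reference_index[name] != i
    }


def check_contig_indexing(reference_contigs, reads_contigs, interval_contigs,
                          allow_incompatibility=False):
    """Raise if a contig named in the intervals is indexed differently.

    Hypothetical helper: returns the sorted list of affected contigs
    when no error is raised (empty if the intervals are unaffected).
    """
    affected = sorted(
        mismatched_contigs(reference_contigs, reads_contigs)
        & set(interval_contigs)
    )
    if affected and not allow_incompatibility:
        raise ValueError(
            "Contigs indexed differently in reads and reference: "
            + ", ".join(affected)
        )
    return affected


# The example from the commit message: chr1 is index 1 in the reference
# dictionary but index 0 in the BAM dictionary.
reference = ["chrM", "chr1", "chr2"]
reads = ["chr1", "chr2"]
print(check_contig_indexing(reference, reads, ["chr1"], allow_incompatibility=True))
```

With `allow_incompatibility` left at its default, the same call with the interval on chr1 raises, while an interval on chrM alone passes because chrM is absent from the BAM dictionary and so has no conflicting index.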

11957 Commits (ca42be97888eb78406d01ef38564bc21e13e94a5)
