gatk-3.8

Commit Graph

Author	SHA1	Message	Date
Ryan Poplin	f3a67edc24	Merge pull request #402 from broadinstitute/gg_dcov_docs Improvements to gatkdocs related to downsampling	2013-09-27 07:07:21 -07:00
Ryan Poplin	cba9668641	Merge pull request #401 from broadinstitute/gg_vqsr_ignorefilter_doc Minor clarifications regarding ignoreFilter argument	2013-09-27 07:05:41 -07:00
kshakir	a29f1f84bf	Merge pull request #397 from lbergelson/lb_scala_2.10.2 Update scala from 2.9 to 2.10.2	2013-09-26 21:51:43 -07:00
Geraldine Van der Auwera	511948890a	Modify gatkdoc template to handle downsampling info better	2013-09-26 14:50:32 -04:00
Geraldine Van der Auwera	66d0235efc	Minor clarifications & formatting tweaks for dcov docs	2013-09-26 14:28:22 -04:00
Geraldine Van der Auwera	27808d336a	Minor clarifications regarding ignoreFilter argument	2013-09-26 13:13:53 -04:00
Geraldine Van der Auwera	a9fa5206ee	Merge pull request #399 from broadinstitute/eb_update_docs_for_DepthPerSampleHC Updated docs for DepthPerSampleHC to deliver PT #54237024.	2013-09-25 15:20:19 -07:00
Ryan Poplin	f362597f69	Merge pull request #400 from broadinstitute/mm_bugfix_combine_variants_implicit_casting Bug fix: annotation values are parsed as Doubles when they should be parsed as Integers due to implicit conversion.	2013-09-25 11:47:17 -07:00
Michael McCowan	5113e21437	Bug fix: annotation values are parsed as Doubles when they should be parsed as Integers due to implicit conversion. * Updated expected test data in which an integer annotation (MQ0) was formatted as a double.	2013-09-25 13:12:02 -04:00
David Roazen	41fef329b4	Update scripts for depristo -> gsa-engineering migration	2013-09-25 12:57:49 -04:00
Eric Banks	2783c84c6b	Updated docs for DepthPerSampleHC to deliver PT #54237024.	2013-09-24 22:32:19 -04:00
Eric Banks	66b51dbc0f	Merge pull request #398 from broadinstitute/eb_update_indel_model_arg_docs Updated docs to tell users not to use PCR indel error modeling for PCR free data.	2013-09-23 12:51:18 -07:00
Eric Banks	d6992d1263	Updated docs to tell users not to use PCR indel error modeling for PCR free data.	2013-09-23 15:48:47 -04:00
Louis Bergelson	c05208ecec	Resolving warnings --specifying exception types in cases where none was already specified ----mostly changed to catch Exception instead of Throwable ----EmailMessage has a point where it should only be expecting a RetryException but was catching everything --changing build.xml so that it prints scala feature warning details --added necessary imports needed to remove feature warnings --updating a newly deprecated enum declaration to match the new syntax	2013-09-23 12:42:22 -04:00
Louis Bergelson	b32ad99d3f	Changing from scala 2.9.2 to 2.10.2. --modified ivy dependencies --modified scala classpath in build.xml to include scala-reflect --changed imports to point to the new scala scala.reflect.internal.util --set the bootclasspath in QScriptManager as well as the classpath variable. --removing Set[File] <-> Set[String] conversions ----Set is invariant now and the conversions broke --removing unit tests for Set[File] <-> Set[String] conversions	2013-09-23 12:42:22 -04:00
Mauricio Carneiro	5bbad75402	Changing max coverage threshold Because Integer.maxValue is not unit testable	2013-09-20 18:54:40 -04:00
Geraldine Van der Auwera	175388de1d	Merge pull request #396 from broadinstitute/mc_dt_excessive_coverage_defaults Updating excessive coverage default parameter & docs+test for QualifyMissingIntervals	2013-09-20 15:12:16 -07:00
Mauricio Carneiro	5e2ffc74fc	Automated interpretation for QualifyMissingIntervals * add a new column to do what I have been doing manually for every project, understand why we got no usable coverage in that interval * add unit tests -- this tool is now public, we need tests. * slightly better docs -- in an effort to produce better docs for this tool	2013-09-20 16:47:12 -04:00
Mauricio Carneiro	74639463b9	Updating excessive coverage default parameter most people don't care about excessive coverage (unless you're very particular about your analysis). Therefore the best possible default value for this is Integer.maxValue so it doesn't get in the way. Itemized Changes: * change maximumCoverage threshold to Integer.maxValue [delivers #57353620]	2013-09-19 23:07:20 -04:00
droazen	713c988404	Merge pull request #395 from broadinstitute/dr_chapmanb_patch_add_api_close_methods Provide close methods to clean up resources used while creating AlignmentContexts from BAM file regions	2013-09-11 10:25:54 -07:00
chapmanb	2f5064dd1d	Provide close methods to clean up resources used while creating AlignmentContexts from BAM file regions. Allows utilization of CoveredLocusView via the API Signed-off-by: David Roazen <droazen@broadinstitute.org>	2013-09-10 15:32:54 -04:00
Ryan Poplin	5e539bb11b	Merge pull request #394 from broadinstitute/rp_single_sample_calling_pipeline_update Updates to the single sample calling pipeline to reflect latest experime...	2013-09-10 07:12:05 -07:00
Geraldine Van der Auwera	292426b504	Merge pull request #390 from broadinstitute/mc_update_clipreads Added REVERT SOFTCLIPPED bases to ClipReads	2013-09-09 16:43:03 -07:00
Geraldine Van der Auwera	8b829255e7	Clarified docs on using clipping options	2013-09-09 19:40:03 -04:00
Ryan Poplin	8971861fd0	Updates to the single sample calling pipeline to reflect latest experiments with full whole genome callsets.	2013-09-09 15:59:50 -04:00
MauricioCarneiro	014bc4269e	Merge pull request #361 from broadinstitute/bt_pairhmm_array_implementation Add Array Logless PairHMM	2013-09-08 20:16:53 -07:00
Ryan Poplin	08474a39fb	Merge pull request #389 from broadinstitute/rp_single_sample_calling_pipline_HC Created a single sample calling pipeline which leverages the reference m...	2013-09-06 14:35:51 -07:00
Ryan Poplin	3503050a39	Created a single sample calling pipeline which leverages the reference model calculation mode of the HaplotypeCaller -- Adding changes to CombineVariants to work with the Reference Model mode of the HaplotypeCaller. -- Added -combineAnnotations mode to CombineVariants to merge the info field annotations by taking the median -- Added new StrandBiasBySample genotype annotation for use in computing strand bias from single sample input vcfs -- Bug fixes to calcGenotypeLikelihoodsOfRefVsAny, used in isActive() as well as the reference model -- Added active region trimming capabilities to the reference model mode, not perfect yet, turn off with --dontTrimActiveRegions -- We only realign reads in the reference model if there are non-reference haplotypes, a big time savings -- We only realign reads in the reference model if the read is informative for a particular haplotype over another -- GVCF blocks will now track and output the minimum PLs over the block -- MD5 changes! -- HC tests: from bug fixes in calcGenotypeLikelihoodsOfRefVsAny -- GVCF tests: from HC changes above and adding in active region trimming	2013-09-06 16:56:34 -04:00
Mauricio Carneiro	b6c3ed0295	Added REVERT SOFTCLIPPED bases to ClipReads	2013-09-06 09:30:01 -04:00
Ryan Poplin	add17dc463	Merge pull request #388 from broadinstitute/eb_change_record_size_mismatch_to_user_error Changed the error for the record size mismatch in the genotyping engine ...	2013-08-30 10:29:54 -07:00
Eric Banks	ea0deb1bb2	Changed the error for the record size mismatch in the genotyping engine to be a user error since it is possible to reach this state with input VCFs that contain the same event multiple times (and it's not something we want to handle in the code).	2013-08-30 12:18:19 -04:00
Eric Banks	5d79a6cbe0	Merge pull request #387 from lbergelson/lb_add_ungenotyped_case_countvariants adding a check for the UNAVAILABLE case of GenotypeType in CountVariants	2013-08-30 06:14:08 -07:00
Louis Bergelson	4473b0065e	adding a check for the UNAVAILABLE case of GenotypeType in CountVariants	2013-08-29 17:27:00 -04:00
bradtaylor	0435bbe38f	Retrieved PairHMM benchmarks from archive and made improvements PairHMMSyntheticBenchmark and PairHMMEmpirical benchmark were written to test the banded pairHMM, and were archived along with it. I returned them to the test directory for use in benchmarking the ArrayLoglessPairHMM. I commented out references to the banded pairHMM (which was left in archive), rather than removing those references entirely. Renamed PairHMMEmpiricalBenchmark to PairHMMBandedEmpiricalBenchmark and returned it to the archive. It has a few problems for use as a general benchmark, including initializing the HMM too frequently and doing too much setup work in the 'time' method. However, since the size selection and debug printing are useful for testing the banded implementation, I decided to keep it as-is and archive it alongside with the other banded pairHMM classes. I did fix one bug that was causing the selectWorkingData function to return prematurely. As a result, the benchmark was only evaluating 4-40 pairHMM calls instead of the desired "maxRecords". I wrote a new PairHMMEmpiricalBenchmark that simply works through a list of data, with setup work and hmm-initialization moved to its own function. This involved writing a new data read-in function in PairHMMTestData. The original was not maintaining the input data in order, the end result of which would be an over-estimate of how much caching we are able to do. The new benchmark class more closely mirrors real-world operation over large data. It might be cleaner to fix some of the issues with the BandedEmpiricalBenchmark and use one read-in function. However, this would involve more extensive changes to: PairHMMBandedEmpiricalBenchmark PairHMMTestData BandedLoglessPairHMMUnitTest I decided against this as the banded benchmark and unit test are archived.	2013-08-28 17:23:35 -04:00
bradtaylor	86fe9fae76	Changes to Array PairHMM to address review comments Returned Logless Caching implementation to the default in all cases. Changing to the array version will await performance benchmarking Refactored many pieces of functionality in ArrayLoglessPairHMM into their own methods.	2013-08-28 17:23:29 -04:00
bradtaylor	3671e41b0c	Add Array Logless PairHMM A new PairHMM implementation for read/haplotype likelihood calculations. Output is the same as the LOGLESS_CACHING version. Instead of allocating an entire (read x haplotype) matrix for each HMM state, this version stores sub-computations in 1D arrays. It also accesses intersections of the (read x haplotype) alignment in a different order, proceeding over "diagonals" if we think of the alignment as a matrix. This implementation makes use of haplotype caching. Because arrays are overwritten, it has to explicitly store mid-process information. Knowing where to capture this info requires us to look ahead at the subsequent haplotype to be analyzed. This necessitated a signature change in the primary method for all pairHMM implementations. We also had to adjust the classes that employ the pairHMM: LikelihoodCalculationEngine (used by HaplotypeCaller) PairHMMIndelErrorModel (used by indel genotyping classes) Made the array version the default in the HaplotypeCaller and the UnifiedArgumentCollection. The latter affects classes: ErrorModel GeneralPloidyIndelGenotypeLikelihoodsCalculationModel IndelGenotypeLikelihoodsCalculationModel ... all of which use the pairHMM via PairHMMIndelErrorModel	2013-08-28 17:21:23 -04:00
Ryan Poplin	7479152977	Merged bug fix from Stable into Unstable	2013-08-28 12:40:25 -04:00
Ryan Poplin	6bda569666	One of the log10sumLog10s in the VQSR was missed in a previous bug fix. Thanks to Mike McCowan for spotting this one.	2013-08-28 12:40:08 -04:00
Eric Banks	983097cff2	Merge pull request #385 from broadinstitute/gg_vqsr_docfixes Fixed a few typos and clarified some doc points.	2013-08-26 17:42:47 -07:00
Geraldine Van der Auwera	ed465cd2a5	Fixed a few typos and clarified some doc points.	2013-08-26 17:33:17 -04:00
David Roazen	42d771f748	Remove org.apache.commons.collections.IteratorUtils dependency from the test suite -This was a dependency of the test suite, but not the GATK proper, which caused problems when running the test suite on the packaged GATK jar at release time -Use GATKVCFUtils.readVCF() instead	2013-08-21 19:44:02 -04:00
Eric Banks	4b00c81181	Merge remote-tracking branch 'unstable/master'	2013-08-21 17:12:26 -04:00
Eric Banks	38b80a5916	Merge pull request #384 from broadinstitute/eb_pbt_dropping_multiallelics Fixed bug in PhaseByTransmission where it was completely dropping multi-allelic records.	2013-08-21 14:09:32 -07:00
Eric Banks	9424008055	Merge pull request #383 from broadinstitute/dr_change_phone_home_aws_settings Update GATK AWS phone-home configuration	2013-08-21 14:08:21 -07:00
Eric Banks	d4dc5ba04a	Fixed bug in PhaseByTransmission where it was completely dropping multi-allelic records. Added test to make sure this is no longer happening.	2013-08-21 15:46:57 -04:00
David Roazen	9fbb4920d0	Update GATK AWS phone-home configuration -Switch to using new GSA AWS account for storage of phone home data -Use DNS-compliant bucket names, as per Amazon's best practices -Encrypt publicly-distributed version of credentials. Grant only PutObject permission, and only for the relevant buckets. -Store non-distributed credentials in private/GATKLogs/newAWSAccountCredentials for now -- need to integrate with existing python/shell scripts later to get the log downloading working with the new account	2013-08-21 14:31:46 -04:00
Eric Banks	cdfd07f9eb	Merge pull request #382 from broadinstitute/ami-FixBcfCoder-52571227 Ami fix bcf coder 52571227	2013-08-21 11:29:19 -07:00
Ami Levy-Moonshine	0f5bb706ff	- update picard, sam, variants and tribble after fixing bug in BCF2Utils.makeDictionary as reported in ticket 52571227 - update call for VCFSimpleHeaderLine constructor in GATKVCFUtils	2013-08-21 12:06:42 -04:00
Ami Levy-Moonshine	ec0c33890a	change the dbsnp version from 137 to 129 for variantEval	2013-08-19 23:54:43 -04:00
Eric Banks	e1174a582d	Merge pull request #379 from broadinstitute/mc_dpp_updates_part2 Including SplitByRG in the FullProcessingPipeline	2013-08-19 18:42:12 -07:00

12771 Commits (f3a67edc2424f2314cc394da8a0f5fc7eaf0ef32)
