gatk-3.8

Commit Graph

Author	SHA1	Message	Date
Mark DePristo	de7fe2e086	Merge pull request #308 from broadinstitute/rp_assessment_low_coverage Don't count no coverage sites as false negatives in the assessment again...	2013-06-27 06:46:23 -07:00
Eric Banks	9f08718636	Merge pull request #309 from jsilter/master Add "isComplexEvent" as attribute	2013-06-26 16:00:51 -07:00
Jacob Silterra	beb834e849	Add "isComplexEvent" as attribute to VariantContextBuilder for MongoVariantContext	2013-06-26 17:12:32 -04:00
Ryan Poplin	fe5348ea5d	Don't count no coverage sites as false negatives in the assessment against the knowledge base	2013-06-26 16:02:44 -04:00
Mark DePristo	a514dd0643	Merge pull request #307 from broadinstitute/eb_rr_off_by_one_error Proper fix for previous RR -cancer_mode fix.	2013-06-26 13:02:23 -07:00
Eric Banks	876e40466a	Proper fix for previous RR -cancer_mode fix. I "fixed" this once before but instead of testing with unit tests I used integration tests. Bad decision. The proper fix is in now, with a bona fide unit test included.	2013-06-26 14:48:09 -04:00
Eric Banks	95eab80f9b	Merge pull request #306 from broadinstitute/eb_make_assessreducedquals_hidden Make this walker @Hidden	2013-06-26 08:47:28 -07:00
Eric Banks	f242be12c0	Make this walker @Hidden	2013-06-26 11:45:21 -04:00
Mark DePristo	28d4c3debc	Merge pull request #305 from broadinstitute/dr_move_DownsampleReadsQC_to_private Move DownsampleReadsQC walker to private	2013-06-25 16:33:20 -07:00
David Roazen	94294ed6c4	Move DownsampleReadsQC walker to private	2013-06-25 15:48:44 -04:00
Mark DePristo	d13ed06e9d	Merge pull request #303 from broadinstitute/eb_update_kb_to_use_exome_intervals Various updates to have the KB use the expanded exome intervals (from D ...	2013-06-24 13:06:52 -07:00
Eric Banks	6dc816beee	Various updates to have the KB use the expanded exome intervals (from D MacArthur) in addition to chr20. 1. MergeIntervalLists should take the global interval padding into account when merging. 2. Update the name of the imported callsets in the setup script because of renaming for expanded intervals. 3. If there are too many intervals to process, MongoDB falls apart. Refactored the site selection code so that in such cases we pull out all records from the DB and the GATK itself does the interval filtering. 4. Add isComplex to callset summary for the consensus summarizer. 5. Remove the check for out of order records in the SiteIterator since records now do come out of order (since contigs are sorted lexicographically in MongoDB). Results: Iteration over the gencode intervals (90 MB) in AssessNA12878 now takes 90 seconds. I can't tell you how much time it took before because it kept crashing Mongo (but it was a long, long time).	2013-06-24 14:57:35 -04:00
Mark DePristo	ff76d0c877	Merge pull request #304 from broadinstitute/eb_rr_header_negative_fix_again Fixing the 'header is negative' problem in Reduce Reads... again.	2013-06-24 11:55:52 -07:00
Eric Banks	165b936fcd	Fixing the 'header is negative' problem in Reduce Reads... again. Previous fixes and tests only covered trailing soft-clips. Now that up front hard-clipping is working properly though, we were failing on those in the tool. Added a patch for this as well as a separate test independent of the soft-clips to make sure that it's working properly.	2013-06-24 14:06:21 -04:00
Valentin Ruano-Rubio	b97f9a487d	Merged bug fix from Stable into Unstable	2013-06-24 14:00:01 -04:00
Mark DePristo	521d9c1df5	Merge pull request #302 from broadinstitute/mc_processing_pipeline2 quick updates to the techdev processing pipeline scala script	2013-06-24 09:52:55 -07:00
Mauricio Carneiro	c38b8065d8	quick fixes to the scala script * Increase the memory limit for HTSLIB - Bam shuffling just eats up a ton of memory. * Concurrent HTSLIB processes need unique temp files the bam shuffling step was messing up with the temporary files and failing without returning zero. Fixed it by giving a unique name to each process.	2013-06-24 12:44:47 -04:00
Mark DePristo	191e4ca251	Merge pull request #300 from broadinstitute/mc_move_qualify_intervals_to_protected Few bug fixes to this tool now that it is in protected	2013-06-24 09:35:45 -07:00
Yossi Farjoun	d8ca4d3e6d	Merge pull request #299 from broadinstitute/eb_mate_fixer_confused_by_nonprimary_alignment Another fix for the Indel Realigner that arises because of secondary alignments.	2013-06-24 06:58:27 -07:00
Valentin Ruano-Rubio	3e5ff6095f	Added the pertinent DocumentedGATKFeature annotation to AnalyzeCovariates	2013-06-21 17:02:26 -04:00
Eric Banks	d976aae2b1	Another fix for the Indel Realigner that arises because of secondary alignments. This time we don't accidentally drop reads (phew), but this bug does cause us not to update the alignment start of the mate. Fixed and added unit test to cover it.	2013-06-21 16:59:22 -04:00
Mark DePristo	dee51c4189	Error out when NCT and BAMOUT are used with the HaplotypeCaller -- Currently we don't support writing a BAM file from the haplotype caller when nct is enabled. Check in initialize if this is the case, and throw a UserException	2013-06-21 09:25:57 -04:00
David Roazen	e03a5e9486	Update source release script in attempt to work around intermittent github issues Github was intermittently rejecting large pushes that were in fact fast-forward updates as being non-fast-forward. Try to prevent this by ensuring that all refs are up-to-date and properly checked out after branch filtering and before doing a source release.	2013-06-20 16:58:01 -04:00
David Roazen	0018af0c0a	Update README file for the 2.6 release	2013-06-20 13:08:29 -04:00
Eric Banks	6977d6e2a7	Merge remote-tracking branch 'unstable/master'	2013-06-20 12:14:33 -04:00
Eric Banks	9f979fdc81	Merge pull request #297 from broadinstitute/md_vcfversion2 Better GATK version and command line output	2013-06-20 09:11:36 -07:00
Mark DePristo	fdfe4e41d5	Better GATK version and command line output -- Previous version emitted command lines that look like: ##HaplotypeCaller="analysis_type=HaplotypeCaller input_file=[private/testdata/reduced.readNotFullySpanningDeletion.bam] ..." the new version provides additional information on when the GATK was run and the GATK version in a nicer format: ##GATKCommandLine=<ID=HaplotypeCaller,Version=2.5-206-gbc7be2b,Date="Thu Jun 20 11:09:01 EDT 2013",Epoch=1371740941197,CommandLineOptions="analysis_type=HaplotypeCaller input_file=[private/testdata/reduced.readNotFullySpanningDeletion.bam] read_buffer_size=null phone_home=AWS ..."> -- Additionally, the command line options are emitted sequentially in the file, so you can see a running record of how a VCF was produced, such as this example from the integration test: ##GATKCommandLine=<ID=HaplotypeCaller,Version=2.5-206-gbc7be2b,Date="Thu Jun 20 11:09:01 EDT 2013",Epoch=1371740941197,CommandLineOptions="lots of stuff"> ##GATKCommandLine=<ID=SelectVariants,Version=2.5-206-gbc7be2b,Date="Thu Jun 20 11:16:23 EDT 2013",Epoch=1371741383277,CommandLineOptions="lots of stuff"> -- Removed the ProtectedEngineFeaturesIntegrationTest -- Actual unit tests for these features!	2013-06-20 11:19:13 -04:00
Mark DePristo	701d70401f	Merge pull request #296 from broadinstitute/md_pubprotfix Fix public / protected dependency	2013-06-19 17:17:21 -07:00
Mark DePristo	0672ac5032	Fix public / protected dependency	2013-06-19 19:42:09 -04:00
Eric Banks	74415a6a2a	Merge pull request #292 from broadinstitute/vrr_analyzeCovariates Added the AnalyzeCovariates tool to generate BQSR quality assessment plots.	2013-06-19 13:26:59 -07:00
Valentin Ruano-Rubio	1f8282633b	Removed plots generation from the BaseRecalibration software Improved AnalyzeCovariates (AC) integration test. Renamed AC test files ending with .grp to .table Implementation: * Removed RECAL_PDF/CSV_FILE from RecalibrationArgumentCollection (RAC). Updated rest of the code accordingly. * Fixed BQSRIntegrationTest to work with new changes	2013-06-19 14:47:56 -04:00
Valentin Ruano-Rubio	08f92bb6f9	Added AnalyzeCovariates tool to generate BQSR assessment quality plots. Implementation details: * Added tool class .AnalyzeCovariates Added convenient addAll method to Utils to be able to add elements of an array. * Added parameter comparison methods to RecalibrationArgumentCollection class in order to verify that multiple input recalibration reports are compatible and comparable. * Modified the BQSR.R script to handle up to 3 different recalibration tables (-BQSR, -before and -after) and removed some irrelevant arguments (or argument values) from the output. * Added an integration test class.	2013-06-19 14:38:02 -04:00
Mark DePristo	fb114e34fe	Merge pull request #295 from broadinstitute/dr_remove_PrintReads_ds_argument PrintReads: remove -ds argument	2013-06-19 10:55:10 -07:00
droazen	573ecadecc	Merge pull request #294 from broadinstitute/dr_handle_zero_length_cigar_elements SAMDataSource: always consolidate cigar strings into canonical form	2013-06-19 10:32:22 -07:00
David Roazen	51ec5404d4	SAMDataSource: always consolidate cigar strings into canonical form -Collapses zero-length and repeated cigar elements, neither of which can necessarily be handled correctly by downstream code (like LIBS). -Consolidation is done before read filters, because not all read filters behave correctly with non-consolidated cigars. -Examined other uses of consolidateCigar() throughout the GATK, and found them to not be redundant with the new engine-level consolidation (they're all on artificially-created cigars in the HaplotypeCaller and SmithWaterman classes) -Improved comments in SAMDataSource.applyDecoratingIterators() -Updated MD5s; differences were examined and found to be innocuous -Two tests: -Unit test for ReadFormattingIterator -Integration test for correct handling of zero-length cigar elements by the GATK engine as a whole	2013-06-19 13:29:01 -04:00
David Roazen	23ee192d5e	PrintReads: remove -ds argument -This argument was completely redundant with the engine-level -dfrac argument. -Could produce unintended consequences if used in conjunction with engine-level downsampling arguments.	2013-06-19 13:22:44 -04:00
David Roazen	0be788f0f9	Fix typo in snpEff documentation	2013-06-19 13:15:24 -04:00
chartl	a3d6ad55f9	Merge pull request #271 from broadinstitute/chartl_extend_genotypeconcordance_documentation Extend Genotype Concordance Documentation	2013-06-19 09:03:05 -07:00
Chris Hartl	af275fdf10	Extend the documentation of GenotypeConcordance to include notes about Monomorphic and Filtered VCF records. Address Geraldine's comments - information on moltenization and explanation of fields Fix paren	2013-06-19 12:01:58 -04:00
amilev	28a8d74290	Merge pull request #293 from broadinstitute/md_catvariants CatVariants accepts reference files ending in any standard extension	2013-06-19 08:36:58 -07:00
Mark DePristo	15171c07a8	CatVariants accepts reference files ending in any standard extension -- [resolves #49339235] Make CatVariants accept reference files ending in .fa (not only .fasta)	2013-06-19 11:10:36 -04:00
MauricioCarneiro	6a5502c94a	Merge pull request #289 from broadinstitute/md_fix_bq Bugfix: defaultBaseQualities actually works now	2013-06-18 11:58:39 -07:00
delangel	1c400e8f8e	Merge pull request #291 from broadinstitute/gda_new_hmm_in_ug Swapping in logless Pair HMM for default usage with UG:	2013-06-18 07:07:57 -07:00
Guillermo del Angel	f176c854c6	Swapping in logless Pair HMM for default usage with UG: -- Changed default HMM model. -- Removed check. -- Changed md5's: PL's in the high 100s change by a point or two due to new implementation. -- Resulting performance improvement is about 30 to 50% less runtime when using -glm INDEL.	2013-06-18 10:06:27 -04:00
Mark DePristo	4c482eb0f0	Merge pull request #290 from broadinstitute/rp_pruning_priority_queue Adding new pruning parameter to ReadThreadingAssembler	2013-06-17 17:16:00 -07:00
Ryan Poplin	8511c4385c	Adding new pruning parameter to ReadThreadingAssembler -- numPruningSamples allows one to specify that the minPruning factor must be met by this many samples for a path to be considered good (e.g. seen twice in three samples). By default this is just one sample. -- adding unit test to test this new functionality	2013-06-17 16:46:40 -04:00
delangel	a6a58cbc78	Merge pull request #288 from broadinstitute/gda_more_ancient_dna_fixes Feature requested by Reich lab and Paavo lab in Leipzig for ancient DNA ...	2013-06-17 13:04:21 -07:00
Mark DePristo	cb5b1c3c34	Create README.md	2013-06-17 16:03:45 -03:00
Mark DePristo	7b22467148	Bugfix: defaultBaseQualities actually works now -- It was being applied in the wrong order (after the first call to the underlying MalformedReadFilter) so if your first read was malformed you'd blow up there instead of being fixed properly. Added integration tests to ensure this continues to work. -- [delivers #49538319]	2013-06-17 14:37:27 -04:00
Guillermo del Angel	f6025d25ae	Feature requested by Reich lab and Paavo lab in Leipzig for ancient DNA processing: -- When doing cross-species comparisons and studying population history and ancient DNA data, having SOME measure of confidence is needed at every single site that doesn't depend on the reference base, even in a naive per-site SNP mode. Old versions of GATK provided GQ and some PL values at reference sites, but these were wrong. This commit addresses this need by adding a new UG command line argument, -allSitePLs, that, if enabled, will: a) Emit all 3 ALT SNP alleles in the ALT column. b) Emit all corresponding 10 PL values. It's up to the user to process these PL values downstream to make sense of them. Note that, in order to follow VCF spec, the QUAL field in a reference call when there are non-null ALT alleles present will be zero, so QUAL will be useless and filtering will need to be done based on other fields. -- Tweaks and fixes to processing pipelines for Reich lab.	2013-06-17 13:21:09 -04:00
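
Commit f6025d25ae (`-allSitePLs`) mentions 3 ALT alleles and 10 PL values. The following is a minimal sketch of where the 10 comes from, assuming diploid genotypes and the genotype-likelihood ordering of the VCF specification (genotype j/k with j <= k sits at index k*(k+1)/2 + j). The function names are illustrative only and are not part of the GATK.

```python
# Sketch only: diploid PL layout per the VCF specification.
# Allele 0 is REF; alleles 1..3 are the three ALT SNP alleles.

def num_diploid_genotypes(num_alleles: int) -> int:
    """Number of unordered diploid genotypes over num_alleles alleles."""
    return num_alleles * (num_alleles + 1) // 2


def pl_index(j: int, k: int) -> int:
    """Position of genotype j/k in the PL array (VCF ordering)."""
    if j > k:
        j, k = k, j  # genotypes are unordered, so normalise to j <= k
    return k * (k + 1) // 2 + j


def genotype_order(num_alleles: int) -> list:
    """All genotypes (j, k) listed in the order their PLs appear."""
    return [(j, k) for k in range(num_alleles) for j in range(k + 1)]


if __name__ == "__main__":
    # REF + 3 ALT alleles = 4 alleles, hence 4 * 5 / 2 = 10 PL values.
    print(num_diploid_genotypes(4))
    print(genotype_order(4))
```

With REF plus three ALT alleles there are four alleles in total, so a diploid site carries 4 * 5 / 2 = 10 genotype likelihoods, which matches the count in the commit message.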
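
Commit 51ec5404d4 describes consolidating cigar strings by collapsing zero-length and repeated elements. The sketch below illustrates that idea in Python; it is not the GATK's Java `consolidateCigar()` implementation, and the representation of a cigar as a list of `(length, operator)` tuples is an assumption made for the example.

```python
# Sketch only: a cigar is modelled here as a list of (length, operator)
# tuples, e.g. [(10, "M"), (2, "I")]. This representation is hypothetical.

def consolidate_cigar(elements):
    """Drop zero-length elements and merge adjacent elements that share
    the same operator, returning a new list."""
    result = []
    for length, op in elements:
        if length == 0:
            continue  # zero-length elements carry no information
        if result and result[-1][1] == op:
            # same operator as the previous kept element: merge lengths
            result[-1] = (result[-1][0] + length, op)
        else:
            result.append((length, op))
    return result


if __name__ == "__main__":
    raw = [(10, "M"), (0, "D"), (5, "M"), (2, "I"), (3, "I")]
    print(consolidate_cigar(raw))
```

Skipping zero-length elements before comparing operators means that two runs separated only by an empty element (here `10M0D5M`) end up merged into one.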

12539 Commits (de7fe2e086f570ca48ca0cfc908e85decdc52fe6)