gatk-3.8

Commit Graph

Author	SHA1	Message	Date
Khalid Shakir	218fe3875a	Quoting -out parameter during resource bundle (StingText.properties) creation. Fixes case where directory has parenthesis in it, like "Dropbox (Broad Dropbox1)".	2014-04-15 17:06:49 +08:00
Ryan Poplin	4b140c9e48	Merge pull request #600 from broadinstitute/rp_random_forest_no_QUAL Improvements to the Random Forest pipeline based on Marathon results.	2014-04-11 13:41:05 -04:00
Ryan Poplin	04ddbac585	Improvements to the Random Forest pipeline based on Marathon results. -- We no longer use QUAL because it scales insidiously with AC. -- By default we exclude sites in which NA12878 is polymorphic to prevent overfitting to the knowledgebase. -- Tweaks to training parameters were required because of the QUAL change. -- We now test for model convergence instead of specifying the number of iterations at the command line.	2014-04-11 12:16:05 -04:00
kshakir	6d58e61f23	Merge pull request #603 from broadinstitute/ks_specify_columns_analyzerunreports Mapping fields to explicit column names in analyzeRunReports.py	2014-04-11 04:30:31 +08:00
Khalid Shakir	c84235c17c	Mapping fields to explicit column names in analyzeRunReports.py. Removed SQLSetupTable support.	2014-04-11 04:28:33 +08:00
Eric Banks	e38a295ebd	Merge pull request #601 from broadinstitute/ami-updateScalaScript update scala scripts to include more of the pipeline steps	2014-04-10 16:01:02 -04:00
droazen	1590f06322	Merge pull request #602 from broadinstitute/use_version_controlled_scripts_for_s3_dl Use version-controlled copies of scripts in GATKReports downloader	2014-04-10 15:40:37 -04:00
David Roazen	147bd88d26	Use version-controlled copies of scripts in GATKReports downloader	2014-04-10 15:39:06 -04:00
Ami Levy-Moonshine	40360ddb56	update scala scripts to include more of the pipeline steps Add a new script for evaluating the RNA-seq downsample results	2014-04-10 15:29:17 -04:00
jmthibault79	c275d76a3e	Merge pull request #599 from broadinstitute/jt_logging_test Integration test for logging to stderr	2014-04-09 15:31:51 -04:00
Joel Thibault	c84126205b	Test that stdout redirects and log files do not affect output	2014-04-09 13:52:42 -04:00
Joel Thibault	1103fd231a	Better exception message	2014-04-09 10:51:45 -04:00
Ryan Poplin	1001a75d0e	Merge pull request #598 from broadinstitute/rp_random_forest_fix_tranches Bug fix for correctly parsing the tranche tag in the RandomForestWalker.	2014-04-09 09:28:23 -04:00
kshakir	5b32b7b191	Merge pull request #595 from broadinstitute/ks_picard_matecigar_update After comments from @nh13, updated latest picard and setMateInfo call.	2014-04-09 10:30:22 +08:00
Ryan Poplin	edd15add7c	Bug fix for correctly parsing the tranche tag in the RandomForestWalker.	2014-04-08 15:39:17 -04:00
Khalid Shakir	a6b0754990	After comments from @nh13, updated latest picard and setMateInfo call.	2014-04-08 15:22:45 -04:00
kshakir	cc580ac75f	Merge pull request #593 from broadinstitute/ks_bqsrgatherer_missing_readgroups_68720468 BQSRGatherer handles missing read groups from some input files.	2014-04-09 03:17:53 +08:00
Khalid Shakir	3047d6ff32	BQSRGatherer handles missing read groups from some input files. [#68720468 ]	2014-04-08 23:58:54 +08:00
Eric Banks	b07c0a6b4c	Merge pull request #594 from broadinstitute/dr_vcf_sample_renaming Extend on-the-fly sample renaming feature to vcfs	2014-04-08 11:47:45 -04:00
David Roazen	af6a897479	Extend on-the-fly sample renaming feature to vcfs -Only works with single-sample vcfs -As with bams, the user must provide a file mapping the absolute path to each vcf whose samples are to be renamed to the new sample name for that vcf. The argument is the same as for bams: --sample_rename_mapping_file, and the mapping file may contain a mix of bam and vcf files should the user wish. -It's an error to attempt to remap the sample names of a multi-sample or sites-only vcf -Implemented at the codec level at the instant the vcf header is first read in to minimize the chances of downstream code examining vcf headers/records before renaming occurs. -Integration tests are in sting, unit tests are in picard -Rev picard et al. to 1.111.1902	2014-04-08 11:07:00 -04:00
Eric Banks	e40cad7b50	Merge pull request #597 from broadinstitute/eb_fix_b36_chainfile The contig is named MT, not M in b36. Delivers PT68890442.	2014-04-08 10:04:44 -04:00
Eric Banks	e690ed1a67	The contig is named MT not M in b36. Delivers PT68890442.	2014-04-08 10:03:47 -04:00
Eric Banks	85f68f610e	Merge pull request #596 from broadinstitute/eb_fix_na12878_roc_curve_maker Don't error out with ArithmeticException in ROC maker when using small sets	2014-04-08 09:56:07 -04:00
Eric Banks	ad336375dc	Merge pull request #590 from broadinstitute/vrr_validate_variants_unused_alleles_fix Addresses issue with strict validation on GVCF files.	2014-04-07 22:10:49 -04:00
Ryan Poplin	416ccef0c5	Merge pull request #592 from broadinstitute/rp_random_forest_improvements Balancing training classes between SNP/Indel and TP/FP.	2014-04-07 21:22:45 -04:00
Valentin Ruano-Rubio	5afcc8e05f	Change in the command line interface of ValidateVariants. Following reviewers' comments the command line interface has been simplified. All extra strict validations are performed by default (as before) and the user has to indicate which one he/she does not want to use with --validationTypeToExclude. Before he/she was able to indicate the only ones to apply with --validationType but that has been scrapped out. Stories: - https://www.pivotaltracker.com/story/show/68725164 Changes: - Removed validateType argument. - Improved documentation. - Added some warning log messages on suspicious argument combinations. Tests: - ValidateVariantsIntegrationTest#*	2014-04-07 16:27:11 -04:00
Ryan Poplin	7d11b4d5f1	Balancing training classes between SNP/Indel and TP/FP. -- This results in much more consistent distribution of LOD scores for SNPs and Indels. -- Removing genotype summary stats since they are now produced by default. -- Added functionality to specify certain subsets of the training data to be used in Tranche file generation, -good:tranche=true set.vcf	2014-04-07 15:23:53 -04:00
Eric Banks	de2a2442d9	Merge pull request #591 from broadinstitute/rp_add_genotype_summary_annotations Adding GenotypeSummaries as INFO field annotations.	2014-04-07 09:21:07 -04:00
Ryan Poplin	f058224b3e	Adding GenotypeSummaries as INFO field annotations. -- This is needed so the ref model pipeline can cut down to sites-only files without losing these useful statistics. -- Added new unit test to test this info field annotation. -- GenotypeGVCF integration tests change because new annotations are present in the output	2014-04-06 11:50:10 -04:00
Eric Banks	d5edb53906	Don't error out with ArithmeticException in ROC maker when using small sets	2014-04-05 23:34:40 -04:00
MauricioCarneiro	84861fa10a	Merge pull request #587 from broadinstitute/eb_actually_fail_on_reduced_bams Make sure to fail in all cases where the BAM being used was created by ReduceReads.	2014-04-04 17:27:57 -04:00
Eric Banks	267603f9a9	Merge pull request #589 from broadinstitute/ldg_SelVarXsampleFile Added check to make sure file passed in with sample IDs is valid (used i...	2014-04-04 15:56:16 -04:00
Laura Gauthier	ff25b656e1	Added check to make sure file passed in with sample IDs is valid (used in SelectVariants) -- throws UserException. Corresponding test checks for UserException.	2014-04-04 15:38:50 -04:00
Valentin Ruano-Rubio	18deeec6b0	Addresses issue with strict validation on GVCF files. More concretely Picard's strict VCF validation does not like that there are alternative alleles that are not participating in any genotype call across samples. This is an issue with GVCF in the single-sample pipeline where this is certainly expected with <NON_REF> and other relatively unlikely alleles. To solve this issue we allow the user to exclude some of the strict validations using a new argument --validationTypeToExclude. In order to avoid the validation issue with GVCF the user needs to add the following to the command line: '--validationTypeToExclude ALLELES' Story: https://www.pivotaltracker.com/story/show/68725164 Changes: - Added validateTypeToExclude argument to ValidateVariants walker. - Implemented the selective exclusion of validation types. - Added new info and improved existing documentation of the ValidateVariants walker. Tests: - ValidateVariantsIntegrationTest#testUnusedAlleleError - ValidateVariantsIntegrationTest#testUnusedAlleleFix	2014-04-04 14:37:10 -04:00
Laura Gauthier	06d78ba068	Expanded documentation to include description of which callsets are being compared in what order and more definitions	2014-04-04 10:35:53 -04:00
Eric Banks	9be07e0838	Merge pull request #588 from broadinstitute/eb_fix_ir_exception IndelRealigner throws a user error when it encounters reads with I opera...	2014-04-04 10:11:51 -04:00
Eric Banks	7174f8cfeb	IndelRealigner throws a user error when it encounters reads with I operators greater than the number of read bases. Added test to ensure it works.	2014-04-03 18:16:24 -04:00
Eric Banks	a3d55b3341	Make sure to fail in all cases where the BAM being used was created by ReduceReads. In some cases, the program records were being removed from the BAM headers by the GATK engine before we applied the check for reduced reads (so we did not fail appropriately). Pushed up the check to happen before the PG tags are modified and added a unit test to ensure it stays that way. It turns out that some UG tests still used reduced bams so I switched to use different ones. Based on reviewer feedback, made it more generic so that it's easy to add new unsupported tools.	2014-04-03 16:52:41 -04:00
Geraldine Van der Auwera	890f4e8873	Merge pull request #586 from broadinstitute/eb_allow_users_to_specify_iupac_sample Slightly modifying the way to use the IUPAC ambiguity codes in the Fasta...	2014-04-03 09:29:56 -04:00
Eric Banks	0b73573abc	Slightly modifying the way to use the IUPAC ambiguity codes in the FastaAlternateReferenceMaker. Previously it required you to create a single sample VCF and then to pass that in to the tool, but Geraldine convinced me that this was a pain for users (because they usually have multi-sample VCFs). Instead now you can pass in a multi-sample VCF and specify which sample's genotypes should be used for the IUPAC encoding. Therefore the argument changed from '--useIUPAC' to '--use_IUPAC_sample NA12878'.	2014-04-02 21:34:25 -04:00
Eric Banks	6bba8d7147	Merge pull request #585 from broadinstitute/ks_variantqc_patch Resuscitated from git and copy/pasted in old gsalib methods needed for the private script variantCallQC.R to run, for now.	2014-04-02 16:48:42 -04:00
Khalid Shakir	0647824e75	Resuscitated from git and copy/pasted in old gsalib methods needed for the private script variantCallQC.R to run, for now.	2014-04-03 04:22:11 +08:00
Valentin Ruano Rubio	45c192bb6d	Merge pull request #580 from broadinstitute/vrr_graphbase_infinite_likelihoods_reprise Fixed bug using GraphBased due to infinite likelihoods resulting from th...	2014-04-02 00:45:17 -04:00
Valentin Ruano-Rubio	84711b8e90	Fixed bug using GraphBased due to infinite likelihoods resulting from the calculation of alignment cost of very long insertions or deletions (done in linear scale) Stories: https://www.pivotaltracker.com/story/show/66263868 Bug: The problem was due to the way we were calculating the fix penalty of a large deletion or insertion. In this case we calculate the alignment likelihood of the portion or read or haplotype deletion as the penalty of that deletion/insertion without going through the full pair-hmm process. For large events this resulted in a 0 in linear scale computations that is transformed into an infinity in log scale. Changes: - Change to use log10 scale to calculate those penalties. - Minor addition of .gitignore to hide ./public/external-example/target which is generated by the building process.	2014-04-01 16:14:52 -04:00
droazen	c0286853b7	Merge pull request #584 from broadinstitute/dr_update_queue_test_script_for_naming_change Update queue test runner script for upcoming naming changes	2014-04-01 11:52:22 -04:00
David Roazen	ef8f91a5be	Update queue test runner script for upcoming naming changes Use both the old and new names for now, until the transition is complete.	2014-04-01 11:49:55 -04:00
jmthibault79	8703bd7ad4	Merge pull request #583 from broadinstitute/jt_tabix Create Tabix indices for block-compressed VCFs	2014-03-31 16:17:25 -04:00
Joel Thibault	70fe7f72f1	Return a TabixIndexCreator for appropriate file types [Fixes #68291082]	2014-03-31 16:15:34 -04:00
Joel Thibault	ab5634cbac	Test that a Tabix index is created for block-compressed output formats - Replace .idx and .tbi with appropriate constants	2014-03-31 14:36:48 -04:00
Joel Thibault	a2d40c84ba	Keep the list of zipped suffixes in sync with Variant	2014-03-31 14:36:41 -04:00

13353 Commits (218fe3875a1f84d712ca8120c67d5377ea8e0a26)
