gatk-3.8

Commit Graph

Author	SHA1	Message	Date
Ami Levy-Moonshine	0f5bb706ff	- update picard, sam, variants and tribble after fixing bug in BCF2Utils.makeDictionary as reported in ticket 52571227 - update call for VCFSimpleHeaderLine constructor in GATKVCFUtils	2013-08-21 12:06:42 -04:00
Ami Levy-Moonshine	ec0c33890a	change the dbsnp version from 137 to 129 for variantEval	2013-08-19 23:54:43 -04:00
Eric Banks	e1174a582d	Merge pull request #379 from broadinstitute/mc_dpp_updates_part2 Including SplitByRG in the FullProcessingPipeline	2013-08-19 18:42:12 -07:00
Eric Banks	6663d48ffe	Merge pull request #381 from broadinstitute/mm_rev_picard_to_get_tribble_updates Adaptations to accomodate Tribble API changes.	2013-08-19 18:31:02 -07:00
Michael McCowan	c3a933ce84	Adaptations to accomodate Tribble API changes, comprising mostly of the following. * Refactoring implementations of readHeader(LineReader) -> readActualHeader(LineIterator), including nullary implementations where applicable. * Galvanizing fo generic types. * Test fixups, mostly to pass around LineIterators instead of LineReaders. * New rev of tribble, which incorporates a fix that addresses a problem with TribbleIndexedFeatureReader reading a header twice in some instances. * New rev of sam, to make AbstractIterator visible (was moved from picard -> sam in Tribble API refactor).	2013-08-19 15:52:47 -04:00
Mauricio Carneiro	e991307eb5	Including SplitByRG in the FullProcessingPipeline Why wasn't it there before, you ask ---------------------------------- Before I was running it separately (by hand), but now it's integrated in the FullProcessingPipeline. Integration was a pain because of Queue's limitation of only allowing 1 @Output file. This forced me to write the ugliest piece of code of my life, but it's working and it's processing the YRI from scratch using that right now. So I'm happy... somewhat. Other changes to the pipeline ----------------------------- * Add --filter_bases_not_stored to the IndelRealigner step -- sometimes BAM files have reads with no bases stored in the unmapped section (no idea why) but this disrupts the pipeline. * Change adaptor marking parameter to "dual indexed" instead of "pair-ended" -- for PCR Free data.	2013-08-18 00:51:32 -04:00
droazen	ee5de8510d	Merge pull request #380 from broadinstitute/gg_gatkdocs_arglabels More detailed labels for arguments in the gakdocs	2013-08-16 15:34:56 -07:00
Geraldine Van der Auwera	80ed186971	More detailed labels for arguments in the gakdocs (requested by David)	2013-08-16 14:25:53 -04:00
Geraldine Van der Auwera	6cda283115	Merge pull request #378 from broadinstitute/gg_mutehelp Disabled the help system's printout of cmdline options when GATK errors ...	2013-08-16 10:11:28 -07:00
Geraldine Van der Auwera	9bb0aac7bf	Disabled the help system's printout of cmdline options when GATK errors out. Now the user has to explicitly ask for it using -h.	2013-08-16 13:09:52 -04:00
Geraldine Van der Auwera	3841635fcb	Changed 'depreciated' to the more correct 'deprecated'	2013-08-16 13:06:41 -04:00
Ryan Poplin	a21d5252c8	Merge pull request #377 from broadinstitute/eb_experiment_with_pcr_error_model Eb experiment with pcr error model	2013-08-16 08:38:55 -07:00
Eric Banks	08be871309	Removing unused code in VariantsToTable: GQ is not an INFO field and is taken care of by -GF and not -F.	2013-08-16 01:57:24 -04:00
Eric Banks	17eb7b49fe	Adding ability to use Ryan's PCR error modeling to the Haplotype Caller. There is now a command-line option to set the model to use in the HC. Incorporated Ryan's current (unmerged) branch in for most of these changes. Because small changes to the math can have drastic effects, I decided not to let users tweak the calculations themselves. Instead they can select either NONE, CONSERVATIVE (the default), or AGGRESSIVE. Note that any base insertion/deletion qualities from BQSR are still used. Also, note that the repeat unit x repeat length approach gave very poor results against the KB, so it is not included as an option here.	2013-08-16 01:53:04 -04:00
Eric Banks	e7152e10f7	Rev'ing picard, tribble, and variant jars.	2013-08-16 00:16:31 -04:00
Eric Banks	07f3f6f69d	Some minor updates to the KB: 1. PASSing records shouldn't make ImportReviews error out 2. Fix KB setup script so that non-NA12878 sites-only files are not assumed to be TPs	2013-08-15 10:02:55 -04:00
Eric Banks	1a5e4cc4e7	Merge pull request #375 from broadinstitute/rp_queue_jobreport_rscript Something changed with the ggtitle syntax in the latest version of ggplo...	2013-08-14 12:48:42 -07:00
Eric Banks	7f48f991cc	Merge pull request #376 from broadinstitute/gg_AR_gatkdocfix Made AR an Advanced argument...	2013-08-14 12:48:05 -07:00
Geraldine Van der Auwera	19a4bf9ff0	made AR an Advanced argument to discourage basic users from fiddling with it	2013-08-14 14:46:56 -04:00
Ryan Poplin	d4ac183580	Something changed with the ggtitle syntax in the latest version of ggplot2.	2013-08-14 14:40:03 -04:00
Eric Banks	928a9779db	Merge pull request #374 from broadinstitute/mc_dpp_updates Updated Full Processing Pipeline	2013-08-14 04:50:25 -07:00
Mauricio Carneiro	765f5450ac	Updated Full Processing Pipeline * add interleaved fastq option to sam2fastq * add optional adapter trimming path * add "skip_revert" option to skip reverting the bams (sometimes useful -- hidden parameter) * add a walker that reads in one bam file and outputs N bam files, one for each read group in the original bam. This is a very important step in any BAM reprocessing pipeline. I am using this new pipeline to process the CEU and YRI PCR Free WGS trios.	2013-08-13 23:35:32 -04:00
Eric Banks	69e78efeae	Merge pull request #366 from broadinstitute/gg_gatkdocfixes More gatkdoc fixes	2013-08-13 04:52:03 -07:00
Eric Banks	bcf9a1cda5	Merge pull request #370 from broadinstitute/rp_dont_output_filtered_variants_in_VQSR Adding mode to VQSR to not output variant records that are filtered out ...	2013-08-12 12:01:50 -07:00
Eric Banks	4aeb37e1e7	Merge pull request #373 from broadinstitute/rp_vqsr_numbad_docs Cleaning up help text for the -numBad argument.	2013-08-12 10:56:46 -07:00
Ryan Poplin	a45011d7e7	Adding mode to VQSR to not output variant records that are filtered out after applying the recalibration. Necessary for 1000G calling.	2013-08-12 11:22:59 -04:00
Ryan Poplin	59f56bef30	Cleaning up help text for the -numBad argument.	2013-08-12 09:51:56 -04:00
Eric Banks	163d796ae7	Merge pull request #372 from broadinstitute/eb_add_zook_reviews_to_script Forgot to add this in the previous commit to add J Zook's reviews	2013-08-12 06:51:31 -07:00
Eric Banks	b86742903b	Forgot to add this in the previous commit to add J Zook's reviews	2013-08-12 09:50:41 -04:00
Eric Banks	a2134a92fb	Merge pull request #371 from broadinstitute/eb_add_zook_reviews Adding Justin Zook's review file to the KB	2013-08-12 06:45:11 -07:00
Eric Banks	8071862af8	Adding Justin Zook's review file to the KB	2013-08-12 09:37:46 -04:00
Geraldine Van der Auwera	a09831489b	Disabled emission of doc URLs for external codecs to avoid broken links	2013-08-10 10:04:04 -07:00
Geraldine Van der Auwera	4d20c71e09	Improvements to various gatkdocs - Make -rod required - Document that contaminationFile is currently not functional with HC - Document liftover process more clearly - Document VariantEval combinations of ST and VE that are incompatible - Added a caveat about using MVLR from HC and UG. - Added caveat about not using -mte with -nt - Clarified masking options - Fixed docs based on Erics comments	2013-08-10 10:01:31 -07:00
Mark DePristo	8b5178bd59	Merge pull request #368 from broadinstitute/md_incremental_caller UG and script changes to make accessing the value of joint calling easier	2013-08-09 08:21:03 -07:00
Mark DePristo	415aee2be2	Two scripts to enable assessment of the value of joint calling and the effectiveness of the joint caller	2013-08-09 11:00:15 -04:00
Mark DePristo	b7d1096ced	Added onlyEmitSamples argument to UnifiedGenotyper -- When provided, this argument causes us to only emit the selected samples into the VCF. No INFO field annotations (AC for example) or other features are modified. It's current primary use is for efficiently evaluating joint calling. -- Add integration test for onlyEmitSamples	2013-08-09 11:00:15 -04:00
Ryan Poplin	cf5dd58303	Merge pull request #369 from broadinstitute/eb_small_updates_to_kb Various small improvements to the KB assessments.	2013-08-09 06:09:27 -07:00
Eric Banks	dd85646067	Various small improvements to the KB assessments. 1) TP reviews with 0/0 genotypes were killing those sites and making them appear as assessed FPs even when correctly called! Fixed this by changing the logic in the assessor to allow discordant genotypes through as FPs. Also, isMonomorphic() in the MongoGenotype needs to check whether the genotype is discordant. Added unit test for this case. 2) Minor code cleanup in the Assessor class. The most important being the renaming of isUsableCall() to isNotUsableCall() since that's what it is returning.	2013-08-08 23:37:45 -04:00
Mark DePristo	ccf0df0fea	Misc. debugging functionality to FS calculation (disabled by default)	2013-08-08 12:06:23 -04:00
kshakir	1f86cf13d1	Merge pull request #359 from lbergelson/lb_relax_add_parameter Trivial update to QScript.scala	2013-08-07 06:45:01 -07:00
Eric Banks	6d67795916	Merge pull request #365 from broadinstitute/md_kb_improvements Fix multiple critical bugs in NA12878 KB	2013-08-07 06:38:21 -07:00
Mark DePristo	66e1b75118	Critical bugfix for OneChunkIterator -- Previous version used overlaps on the full GenomeLoc of the variant in the KB, which meant that deletions that didn't start in an interval would be included in an interval, which isn't the behavior of the tribble and so caused a mismatch when assessing variants in the knowledgebase	2013-08-07 08:08:37 -04:00
Mark DePristo	7aba5a2f9f	Several improvements to AssessNA12878 and KB -- Bugfix for BAMs containing reads without real (M,I,D,N) operators. Simply needed to set validation stringency to SILENT in the read. Added a BadCigar filter to the SAMRecord stream anyway -- Add capture all sites mode to AssessNA12878: will write all sites to the badSites VCF, regardless of whether they are bad. It's useful if you essentially want to annotate a VCF with KB information for later analysis, such as computing ROC curves -- Add ignore filters mode to AssessNA12878: will as expected treat all sites in the input VCF calls as PASS, even if the site has a FILTER field setting -- Add minPNonRef argument to AssessNA12878: this will consider a site not called even if the NA12878 genotype is not 0/0 if the PLs are present and the PL for 0/0 isn't greater than this value. It allows us to easily differentiate low confidence non-ref sites obtained via multi-sample calling from highly confident non-ref calls that might be real TP or FPs	2013-08-07 08:08:37 -04:00
Mark DePristo	00f4d767e4	Merge pull request #364 from broadinstitute/md_vqsr_improvements Separate num Gaussians for + and - GMM in VQSR	2013-08-07 04:37:45 -07:00
Mark DePristo	c21402d4af	Separate num Gaussians for + and - GMM in VQSR -- The previous approach in VQSR was to build a GMM with the same max. number of Gaussians for the positive and negative models. However, we usually have many more positive sites than negative, so we'd prefer to use a more detailed GMM for the positive model and a less well defined model using few sites for the negative model. -- Now the maxGaussians argument only applies to the positive model -- This update builds a GMM for the negative model with a default 4 max gaussians (though this can be controlled via command line parameter) -- Removes the percentBadVariants argument. The only way to control how many variants are included in the negative model is with minNumBad -- Reduced the minNumBad argument default to 1000 from 2500 -- Update MD5s for VQSR. md5s changed significantly due to underlying changes in the default GMM model. Only sites with NEGATIVE_TRAINING_LABELs and the resulting VQSLOD are different, as expected. -- minNumBad is now numBad -- Plot all negative training points as well, since this significantly changes our view of the GMM PDF	2013-08-07 07:36:50 -04:00
Mark DePristo	44348a3761	Merge pull request #363 from broadinstitute/md_het_docs Better docs on the meaning of heterozygosity	2013-08-07 04:28:16 -07:00
Mark DePristo	318f7e74e4	Better docs on the meaning of heterozygosity -- [delivers #53522209]	2013-08-07 07:27:45 -04:00
Eric Banks	dd0e6409c6	Merge pull request #367 from broadinstitute/md_hc_ref_fix Bugfix for ReferenceConfidenceModel	2013-08-06 20:37:08 -07:00
Mark DePristo	40bc7d6a9c	Bugfix for ReferenceConfidenceModel -- In the case where there's some variation to assembly and evaluate but the resulting haplotypes don't result in any called variants, the reference model would exception out with "java.lang.IllegalArgumentException: calledHaplotypes must contain the refHaplotype". Now we detect this case and emit the standard no variation output. -- [delivers #54625060]	2013-08-06 16:00:32 -04:00
Ryan Poplin	6dfd17122f	Merge pull request #362 from broadinstitute/rp_single_sample_hc_pipeline Adding single sample HC qscript for Mauricio.	2013-08-06 12:11:50 -07:00

1 2 3 4 5 ...

12724 Commits (0f5bb706ffd3e02a2fec45f32317f29e4e45cb6d) All Branches Search

12724 Commits (0f5bb706ffd3e02a2fec45f32317f29e4e45cb6d)

All Branches