Commit Graph

  • 9f979fdc81 Merge pull request #297 from broadinstitute/md_vcfversion2 Eric Banks 2013-06-20 09:11:36 -0700
  • fdfe4e41d5 Better GATK version and command line output Mark DePristo 2013-06-20 11:19:13 -0400
  • 3db8908ae8 Remove debug print statement sathibault 2013-06-20 08:28:58 -0500
  • 701d70401f Merge pull request #296 from broadinstitute/md_pubprotfix Mark DePristo 2013-06-19 17:17:21 -0700
  • 0672ac5032 Fix public / protected dependency Mark DePristo 2013-06-19 19:42:09 -0400
  • 74415a6a2a Merge pull request #292 from broadinstitute/vrr_analyzeCovariates Eric Banks 2013-06-19 13:26:59 -0700
  • 1f8282633b Removed plot generation from the BaseRecalibration software. Improved AnalyzeCovariates (AC) integration test. Renamed AC test files ending with .grp to .table Valentin Ruano-Rubio 2013-06-19 11:44:18 -0400
  • 08f92bb6f9 Added AnalyzeCovariates tool to generate BQSR assessment quality plots. Valentin Ruano-Rubio 2013-06-13 18:38:11 -0400
  • fb114e34fe Merge pull request #295 from broadinstitute/dr_remove_PrintReads_ds_argument Mark DePristo 2013-06-19 10:55:10 -0700
  • 573ecadecc Merge pull request #294 from broadinstitute/dr_handle_zero_length_cigar_elements droazen 2013-06-19 10:32:22 -0700
  • 51ec5404d4 SAMDataSource: always consolidate cigar strings into canonical form David Roazen 2013-06-18 16:04:29 -0400
  • 23ee192d5e PrintReads: remove -ds argument David Roazen 2013-06-19 13:22:44 -0400
  • 0be788f0f9 Fix typo in snpEff documentation David Roazen 2013-06-19 13:15:24 -0400
  • a3d6ad55f9 Merge pull request #271 from broadinstitute/chartl_extend_genotypeconcordance_documentation chartl 2013-06-19 09:03:05 -0700
  • af275fdf10 Extend the documentation of GenotypeConcordance to include notes about Monomorphic and Filtered VCF records. Chris Hartl 2013-06-12 13:54:30 -0400
  • 28a8d74290 Merge pull request #293 from broadinstitute/md_catvariants amilev 2013-06-19 08:36:58 -0700
  • 15171c07a8 CatVariants accepts reference files ending in any standard extension Mark DePristo 2013-06-19 11:10:36 -0400
  • 6a5502c94a Merge pull request #289 from broadinstitute/md_fix_bq MauricioCarneiro 2013-06-18 11:58:39 -0700
  • 1c400e8f8e Merge pull request #291 from broadinstitute/gda_new_hmm_in_ug delangel 2013-06-18 07:07:57 -0700
  • f176c854c6 Swapping in logless Pair HMM for default usage with UG: -- Changed default HMM model. -- Removed check. -- Changed md5's: PL's in the high 100s change by a point or two due to new implementation. -- Resulting performance improvement is about 30 to 50% less runtime when using -glm INDEL. Guillermo del Angel 2013-06-13 13:27:06 -0400
  • 4c482eb0f0 Merge pull request #290 from broadinstitute/rp_pruning_priority_queue Mark DePristo 2013-06-17 17:16:00 -0700
  • 8511c4385c Adding new pruning parameter to ReadThreadingAssembler Ryan Poplin 2013-06-17 14:02:54 -0400
  • a6a58cbc78 Merge pull request #288 from broadinstitute/gda_more_ancient_dna_fixes delangel 2013-06-17 13:04:21 -0700
  • cb5b1c3c34 Create README.md Mark DePristo 2013-06-17 16:03:45 -0300
  • 7b22467148 Bugfix: defaultBaseQualities actually works now Mark DePristo 2013-06-17 13:35:04 -0400
  • f6025d25ae Feature requested by Reich lab and Paavo lab in Leipzig for ancient DNA processing: -- When doing cross-species comparisons and studying population history and ancient DNA data, SOME measure of confidence that doesn't depend on the reference base is needed at every single site, even in a naive per-site SNP mode. Old versions of GATK provided GQ and some PL values at reference sites, but these were wrong. This commit addresses this need by adding a new UG command line argument, -allSitePLs, that, if enabled, will: a) Emit all 3 ALT SNP alleles in the ALT column. b) Emit all 10 corresponding PL values. It's up to the user to process these PL values downstream to make sense of them. Note that, in order to follow the VCF spec, the QUAL field in a reference call when there are non-null ALT alleles present will be zero, so QUAL will be useless and filtering will need to be done based on other fields. -- Tweaks and fixes to processing pipelines for Reich lab. Guillermo del Angel 2013-05-16 10:04:11 -0400
  • fce448cc9e Merge pull request #287 from broadinstitute/md_gzip_vcf_nt Mark DePristo 2013-06-17 09:39:37 -0700
  • b69d210255 Bugfix: allow gzip VCF output in multi-threaded GATK output Mark DePristo 2013-06-17 10:50:07 -0400
  • 485ceb1e12 Merge pull request #283 from broadinstitute/md_beagleoutput delangel 2013-06-17 09:31:03 -0700
  • 5b1a472d2c Merge pull request #286 from broadinstitute/eb_add_tiers_to_KBconsensus Mark DePristo 2013-06-17 08:38:57 -0700
  • ee78927bdb Merge pull request #279 from broadinstitute/eb_make_rms_mq_work_with_rr Mark DePristo 2013-06-16 09:48:19 -0700
  • e48f754478 Fixes to several of the annotations for reduced reads (and other issues). Eric Banks 2013-06-13 19:29:08 -0400
  • 9ec71bba26 Added 2 new fields to the MongoVariantContext: confidence and isComplex. Eric Banks 2013-06-11 14:20:10 -0400
  • 4151753718 Merge pull request #285 from broadinstitute/dr_james_warren_fasta_suffix_bugfix droazen 2013-06-14 16:57:10 -0700
  • f46f7d9b23 deducing dictionary path should not use global find and replace James Warren 2013-06-14 14:25:16 -0700
  • 52677429a0 Merge pull request #284 from broadinstitute/dr_fewer_stranded_temp_files Mark DePristo 2013-06-14 13:06:28 -0700
  • 1677a0a458 Simpler FILTER and info field encoding for BeagleOutputToVCF Mark DePristo 2013-06-14 15:56:13 -0400
  • d167292688 Reduce number of leftover temp files in GATK runs David Roazen 2013-06-14 15:30:17 -0400
  • b72880cc94 Merge pull request #282 from broadinstitute/md_gatklogs_gitversions Mark DePristo 2013-06-14 12:39:54 -0700
  • 20bb4902a3 Use git hash to lookup versions when necessary in analyzeRunReports.py Mark DePristo 2013-06-14 15:31:25 -0400
  • 50ea098c11 Merge pull request #281 from broadinstitute/md_gatklogs Mark DePristo 2013-06-14 10:00:16 -0700
  • c4e508a71f Merge pull request #275 from broadinstitute/md_fragment_with_pcr Ryan Poplin 2013-06-14 09:32:26 -0700
  • a057f37331 Update utilities to get GATKRunReports Mark DePristo 2013-06-13 17:08:34 -0400
  • ac346a93ba Merge pull request #278 from broadinstitute/md_gatk_version_in_vcf droazen 2013-06-13 13:22:20 -0700
  • 908183aba7 Merge pull request #277 from broadinstitute/dr_fix_com_sun_dependency Mark DePristo 2013-06-13 13:12:45 -0700
  • f9c986be74 Remove com.sun.javadoc.* dependencies from the GATK proper, and isolate them for doclet use only David Roazen 2013-06-13 15:30:10 -0400
  • 74f311c973 Emit the GATK version number in the VCF header Mark DePristo 2013-06-13 15:46:16 -0400
  • d93bed5d61 Merge pull request #276 from broadinstitute/md_gatkreport_cleanup Mark DePristo 2013-06-13 12:40:57 -0700
  • 6232db3157 Remove STANDARD option from GATKRunReport Mark DePristo 2013-06-13 15:18:28 -0400
  • dd5674b3b8 Add genotyping accuracy assessment to AssessNA12878 Mark DePristo 2013-06-12 12:43:19 -0400
  • 33720b83eb No longer merge overlapping fragments from HaplotypeCaller Mark DePristo 2013-06-10 14:52:41 -0400
  • fb5143a590 Merge pull request #274 from broadinstitute/md_s3_only droazen 2013-06-13 11:32:31 -0700
  • dd6e252373 GATKRunReport no longer tries to use the Broad filesystem destination, rather it goes unconditionally to S3 Mark DePristo 2013-06-12 17:47:27 -0400
  • c837d67b2f Merge pull request #273 from broadinstitute/rp_readIsPoorlyModelled Mark DePristo 2013-06-13 08:40:24 -0700
  • 2833325d31 Merge pull request #272 from broadinstitute/rp_hc_bam_writer_uninformative_reads Mark DePristo 2013-06-13 08:08:45 -0700
  • f44efc27ae Relaxing the constraints on the readIsPoorlyModelled function. Ryan Poplin 2013-06-13 10:05:53 -0400
  • d5f0848bd5 HC bam writer now sets the read to MQ0 if it isn't informative Ryan Poplin 2013-06-13 09:59:16 -0400
  • 336050ab71 Merge branch 'master' into st_fpga_hmm sathibault 2013-06-13 07:28:24 -0500
  • 17d3ccb03b Merge pull request #270 from broadinstitute/rp_reference_haplotype_mismatch_bug Eric Banks 2013-06-12 11:03:48 -0700
  • d1f397c711 Fixing bug with dangling tails in which the tail connects all the way back to the reference source node. Ryan Poplin 2013-06-12 12:22:36 -0400
  • b2dc7095ab Merge pull request #267 from broadinstitute/dr_reducereads_downsampling_fix Mark DePristo 2013-06-11 13:52:28 -0700
  • 95b5f99feb Exclude reduced reads from elimination during downsampling David Roazen 2013-06-05 15:55:43 -0400
  • e1fd3dff9a Merge pull request #268 from broadinstitute/eb_calling_accuracy_improvements_to_HC Ryan Poplin 2013-06-11 11:18:51 -0700
  • b63cbd8cc9 Merge pull request #266 from broadinstitute/gda_read_error_correction_new Eric Banks 2013-06-11 10:42:06 -0700
  • 2c3c680eb7 Misc changes and cleanup from all previous commits in this push. Eric Banks 2013-06-05 12:22:14 -0400
  • dadcfe296d Reworking of the dangling tails merging code. Eric Banks 2013-06-05 14:26:23 -0400
  • 55d5f2194c Read Error Corrector for haplotype assembly. Principle is simple: when coverage is deep enough, any single-base read error will look like a rare k-mer, but correct sequence will be supported by many reads, so correct sequences will look like common k-mers. So, the algorithm has 3 main steps: 1. K-mer graph buildup. For each read in an active region, a map from k-mers to the number of times they have been seen is built. 2. Building correction map. All "rare" k-mers that are sparse (by default, seen only once) get mapped to k-mers that are good (by default, seen at least 20 times, but this is a CL argument) and that lie within a given Hamming distance (by default, =1). This map can be empty (i.e. k-mers can be uncorrectable). 3. Correction proposal. For each constituent k-mer of each read, if this k-mer is rare and maps to a good k-mer, get differing base positions in the k-mer and add these to a list of corrections for each base in each read. Then, correct the read at positions where the correction proposal is unanimous and non-empty. Guillermo del Angel 2013-06-04 14:25:26 -0400
  • c0030f3f2d We no longer subset down to the best N haplotypes for the GL calculation. Eric Banks 2013-06-04 09:26:50 -0400
  • c0e3874db0 Change the HC's phredScaledGlobalReadMismappingRate from 60 to 45, because Ryan and Mark told me to. Eric Banks 2013-06-03 14:34:29 -0400
  • 77868d034f Do not allow the use of Ns in reads for graph construction. Eric Banks 2013-05-30 14:00:43 -0400
  • e4e7d39e2c Fix FN problem stemming from sequence graphs that contain cycles. Eric Banks 2013-05-23 12:02:19 -0400
  • 210007cd09 Merge pull request #269 from broadinstitute/rp_minor_pruning_function_name Ryan Poplin 2013-06-11 08:24:56 -0700
  • 58e354176e Minor changes to docs in the graph pruning. Ryan Poplin 2013-06-11 10:33:22 -0400
  • c7836ec746 Merge pull request #264 from broadinstitute/md_dbsnp Mark DePristo 2013-06-10 13:38:20 -0700
  • 1c03ebc82d Implement ActiveRegionTraversal RefMetaDataTracker for map call; HaplotypeCaller now annotates ID from dbSNP Mark DePristo 2013-06-06 15:38:06 -0400
  • 0d593cff70 Refactor rsID and overlap detection in VariantOverlapAnnotator utility class Mark DePristo 2013-06-06 14:32:47 -0400
  • 1d67d07cf1 better docs for Qualify Missing Intervals Mauricio Carneiro 2013-06-10 15:17:40 -0400
  • c84f0deb1d Don't crash if cds file is not provided Mauricio Carneiro 2013-06-10 13:42:00 -0400
  • 3e979f30a9 Merge pull request #265 from broadinstitute/mc_move_qualify_intervals_to_protected Mark DePristo 2013-06-10 10:14:42 -0700
  • a95fbd48e5 Moving QualifyMissingIntervals to protected Mauricio Carneiro 2013-06-10 13:10:32 -0400
  • 2a935374f3 Merge pull request #242 from broadinstitute/vrr_N_cigar_error_and_override_option Eric Banks 2013-06-10 08:46:15 -0700
  • 96073c3058 This commit addresses JIRA issue GSA-948: Prevent users from doing the wrong thing with RNA-Seq data and the GATK. Valentin Ruano-Rubio 2013-05-23 20:39:32 -0400
  • cbb6c7ae92 Merge pull request #263 from broadinstitute/mccowan_reduce_reads_performance Eric Banks 2013-06-10 06:19:38 -0700
  • 00c06e9e52 Performance improvements: - Memoized MathUtil's cumulative binomial probability function. - Reduced the default size of the read name map in reduced reads and handle its resets more efficiently. Michael McCowan 2013-06-04 10:08:24 -0400
  • e7c69cb304 Merge pull request #261 from broadinstitute/md_ad_bugfix Eric Banks 2013-06-06 07:21:22 -0700
  • 209dd64268 HaplotypeCaller now emits per-sample DP Mark DePristo 2013-06-05 17:43:31 -0400
  • 34bdf20132 Bugfix for bad AD values in UG/HC Mark DePristo 2013-06-05 16:37:31 -0400
  • c8845a2b63 Merge pull request #260 from broadinstitute/md_nist_kb Mark DePristo 2013-06-05 13:09:41 -0700
  • df488dbd49 Merge pull request #259 from broadinstitute/md_discovar Mark DePristo 2013-06-05 13:09:18 -0700
  • 95376908e5 Converter script from discovar variants files to VCF Mark DePristo 2013-06-05 15:34:40 -0400
  • 5a54eac57d Bugfix for AssessNA12878 Mark DePristo 2013-06-05 15:34:23 -0400
  • 58a3d36076 Add NIST Genomes in a Bottle to NA12878 KB Mark DePristo 2013-06-04 15:11:07 -0400
  • eaebba5ba1 Merge pull request #257 from broadinstitute/md_unclip_reads_over_contig MauricioCarneiro 2013-06-04 08:01:30 -0700
  • e19c24f3ee Bugfix for HaplotypeCaller error: Only one of refStart or refStop must be < 0, not both Mark DePristo 2013-06-04 09:35:12 -0400
  • a0817b696b Merge pull request #256 from broadinstitute/md_hc_assembly_bug Mark DePristo 2013-06-03 13:58:41 -0700
  • c9f5b53efa Bugfix for HC can fail to assemble the correct reference sequence in some cases Mark DePristo 2013-06-03 14:36:54 -0400
  • a05c543728 Merge pull request #255 from broadinstitute/rp_gga_mode_kmer_function Mark DePristo 2013-06-03 11:30:34 -0700
  • ab40f4af43 Break out the GGA kmers and the read kmers into separate functions for the DeBruijn assembler. Ryan Poplin 2013-06-03 11:01:34 -0400
  • 21334e728d Merge pull request #252 from broadinstitute/md_bqsr_index_out_of_bounds Ryan Poplin 2013-06-03 07:13:00 -0700
  • de2a2a4cc7 Added command-line flag to disable FPGA. Completed integration with FPGA driver sathibault 2013-06-03 07:30:32 -0500
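
Commit 55d5f2194c above describes its read error corrector in three steps (k-mer counting, building a correction map, proposing and applying corrections). The following is a minimal Python sketch of that description, not the GATK implementation, which is written in Java. The function names, the parameter names (max_rare, min_good, max_dist) and the tie-breaking rule (prefer the most frequently seen good k-mer when several are within range) are assumptions made for illustration; only the default thresholds (seen once, seen at least 20 times, Hamming distance 1) come from the commit message.

```python
from collections import Counter, defaultdict


def count_kmers(reads, k):
    """Step 1: map each k-mer to the number of times it was seen."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def build_correction_map(counts, max_rare=1, min_good=20, max_dist=1):
    """Step 2: map each rare k-mer to a good k-mer within max_dist.

    Rare k-mers with no good neighbour are left out (uncorrectable).
    Assumption: if several good k-mers qualify, take the most frequent.
    """
    good = [kmer for kmer, n in counts.items() if n >= min_good]
    corrections = {}
    for kmer, n in counts.items():
        if n > max_rare:
            continue
        candidates = [g for g in good if hamming(kmer, g) <= max_dist]
        if candidates:
            corrections[kmer] = max(candidates, key=counts.get)
    return corrections


def correct_read(read, k, corrections):
    """Step 3: collect per-base proposals and apply the unanimous ones."""
    proposals = defaultdict(set)
    for i in range(len(read) - k + 1):
        kmer = read[i:i + k]
        target = corrections.get(kmer)
        if target is None:
            continue
        for j, (old, new) in enumerate(zip(kmer, target)):
            if old != new:
                proposals[i + j].add(new)
    bases = list(read)
    for pos, proposed in proposals.items():
        if len(proposed) == 1:  # unanimous and non-empty
            bases[pos] = next(iter(proposed))
    return "".join(bases)


# Toy data: 20 identical reads plus one read with a single-base error.
reads = ["ACGTACGT"] * 20 + ["ACGTTCGT"]
kmer_counts = count_kmers(reads, k=4)
correction_map = build_correction_map(kmer_counts)
print(correct_read("ACGTTCGT", 4, correction_map))  # prints ACGTACGT
```

In the toy data, each of the four rare k-mers covering the erroneous base maps to a good k-mer that differs only at that base, so the proposal at that position is unanimous and the read is restored. The pairwise Hamming scan in build_correction_map is quadratic in the number of distinct k-mers, which is acceptable for a sketch but would need indexing on real data.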