Commit Graph

13564 Commits (6d7201a7f8a15ef3368d3e72f8d5d2d30c00f1ef)

Author SHA1 Message Date
jmthibault79 6d7201a7f8 Merge pull request #698 from broadinstitute/pd_printreads_subset
Improvements to read-group filtering in PrintReads
2014-08-12 14:13:07 -04:00
Valentin Ruano Rubio 1558f9c49a Merge pull request #628 from broadinstitute/vrr_per_read_allele_likelihoods
ReadLikelihoods container using arrays rather than Maps of Maps of Maps
2014-08-11 20:32:37 -04:00
Phillip Dexheimer 7e77875c81 Improvements to read-group filtering in PrintReads
- Read groups that are excluded by sample_name, platform, or read_group arguments no longer appear in the header
- The performance penalty associated with filtering by read group has been essentially eliminated
- Partial fulfillment of PT 73075482
2014-08-11 20:08:16 -04:00
Valentin Ruano-Rubio b39508cd15 ReadLikelihoods class introduction final changes before merging
Stories:

        https://www.pivotaltracker.com/story/show/70222086
        https://www.pivotaltracker.com/story/show/67961652

Changes:

  Made some changes that I had missed to ensure that all PairHMM implementations use the same interface; because of the omission we were always running the standard PairHMM.
  Fixed some additional bugs detected when running on full WGS single-sample and exome multi-sample data sets.
  Updated some integration test md5s.
2014-08-11 17:47:25 -04:00
Valentin Ruano-Rubio 9a9a68409e ReadLikelihoods class introduction final changes before merging
Stories:

        https://www.pivotaltracker.com/story/show/70222086
        https://www.pivotaltracker.com/story/show/67961652

Changes:

  Made some changes that I had missed to ensure that all PairHMM implementations use the same interface; because of the omission we were always running the standard PairHMM.
  Fixed some additional bugs detected when running on full WGS single-sample and exome multi-sample data sets.
  Updated some integration test md5s.

Fixing GraphBased bugs with new master code
Fixed a difficult-to-spot bug in ReadLikelihoods.changeReads.
Changed PairHMM interface to fix a bug
Fixed missing changes for various PairHMM implementations to get them to use the new structure.
Fixed various bugs only detectable when running with full sample(s).
Believed to have fixed the lack of annotations in UG runs
Fixed integration test MD5s
Updating some md5s
Fixed yet another md5 probably left out by mistake
2014-08-11 17:46:28 -04:00
Valentin Ruano-Rubio 0b472f6bff Added new test to verify the functionality of ReadLikelihoods.java and its use in HC. Updated existing integration test md5s.
Stories:

    https://www.pivotaltracker.com/story/show/70222086
    https://www.pivotaltracker.com/story/show/67961652
2014-08-11 17:46:28 -04:00
Valentin Ruano-Rubio 2914ecb585 Replace the Map-of-maps-of-maps with an array-based implementation, ReadLikelihoods, to hold read likelihoods.
The array structure should be faster to populate and query (not properly benchmarked) and reduce memory footprint considerably.
    Nevertheless, with the PairHMM factor removed (using likelihoodEngine Random), it only achieves a speedup of 15% on an example WGS dataset,
    i.e. there are other, bigger bottlenecks in the system. Bamboo tests also seem to run significantly faster with this change.

    Stories:

      https://www.pivotaltracker.com/story/show/70222086
      https://www.pivotaltracker.com/story/show/67961652

    Changes:

       - ReadLikelihoods added to replace Map<String,PerSampleReadLikelihoods>
       - Operations that involve changes in full sets of ReadLikelihoods have been moved into that class.
       - Slightly simplified the code that handles the downsampling of reads based on contamination

    Caveats:

       - We still keep Map<String,PerReadAlleleLikelihoodsMap> around to pass to annotators; changing the interface of so many public classes was left out of this pull request.
2014-08-11 17:46:28 -04:00
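The idea behind the commit above, replacing nested maps with arrays, can be illustrated with a minimal sketch. It is not the actual ReadLikelihoods class: the class name ArrayLikelihoods, its methods, and the [sample][allele][read] layout are all assumptions made for illustration. Names are resolved to integer indices once, after which every lookup is plain array indexing.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not the GATK ReadLikelihoods API.
final class ArrayLikelihoods {
    private final Map<String, Integer> sampleIndex = new HashMap<>();
    private final Map<String, Integer> alleleIndex = new HashMap<>();
    // Assumed layout: values[sample][allele][read] holds one log-likelihood.
    private final double[][][] values;

    ArrayLikelihoods(String[] samples, String[] alleles, int[] readCounts) {
        values = new double[samples.length][alleles.length][];
        for (int s = 0; s < samples.length; s++) {
            sampleIndex.put(samples[s], s);
            for (int a = 0; a < alleles.length; a++) {
                // One slot per read of this sample, initialised to 0.0.
                values[s][a] = new double[readCounts[s]];
            }
        }
        for (int a = 0; a < alleles.length; a++) {
            alleleIndex.put(alleles[a], a);
        }
    }

    void set(String sample, String allele, int read, double logLikelihood) {
        values[sampleIndex.get(sample)][alleleIndex.get(allele)][read] = logLikelihood;
    }

    double get(String sample, String allele, int read) {
        return values[sampleIndex.get(sample)][alleleIndex.get(allele)][read];
    }

    // Index of the allele with the highest likelihood for one read
    // (the first such allele wins on ties).
    int bestAlleleIndex(String sample, int read) {
        double[][] perAllele = values[sampleIndex.get(sample)];
        int best = 0;
        for (int a = 1; a < perAllele.length; a++) {
            if (perAllele[a][read] > perAllele[best][read]) {
                best = a;
            }
        }
        return best;
    }
}
```

Compared with a Map of Maps of Maps, this layout stores primitive doubles contiguously and avoids boxing and per-entry map overhead, which is consistent with the memory reduction the commit message expects.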
Valentin Ruano-Rubio 09ac3779d6 Added ReadLikelihoods component to replace Map<String,PerReadAlleleLikelihoodMap>.
It uses a more efficient Java array-based implementation and encapsulates operations performed on such a
read-likelihood collection, such as marginalization, filtering by position or poor modeling, capping
worst likelihoods, and so forth.

Stories:

          https://www.pivotaltracker.com/story/show/70222086
          https://www.pivotaltracker.com/story/show/67961652
2014-08-11 17:46:28 -04:00
Ryan Poplin c56e493f98 Merge pull request #622 from broadinstitute/ldg_SORanalysis
Add StrandOddsRatio to default annotations produced by GenotypeGVCFs
2014-08-11 09:45:27 -04:00
Eric Banks abcaba4bc3 Merge pull request #699 from broadinstitute/tf_hc_block_ranges
Changed the default GVCF Q Bands from 5,20,60 to be 1..60 by 1s, 60...90...
2014-08-08 16:10:23 -04:00
Tim Fennell 5695f22da8 Changed the default GVCF Q Bands from 5,20,60 to be 1..60 by 1s, 60...90 by 10s and 99 in order to give finer resolution
for homref PLs and ADs at lower confidences and somewhat higher resolution at higher confidences.
2014-08-08 14:31:35 -04:00
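The new default band list described in the commit above (1..60 by 1s, 60..90 by 10s, and 99) can be enumerated with a short sketch. The class and method names are hypothetical and this is not the GATK code that defines the default.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; it only enumerates the band values named in the commit message.
final class GvcfBands {
    static List<Integer> defaultBands() {
        List<Integer> bands = new ArrayList<>();
        // 1..60 in steps of 1: fine resolution at lower confidences.
        for (int gq = 1; gq <= 60; gq++) {
            bands.add(gq);
        }
        // 70, 80, 90: steps of 10 above 60 (60 itself was added by the first loop).
        for (int gq = 70; gq <= 90; gq += 10) {
            bands.add(gq);
        }
        // Final band at 99.
        bands.add(99);
        return bands;
    }
}
```

Read this way, the list has 64 entries, compared with the three values (5, 20, 60) of the previous default.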
Laura Gauthier 35de598e4b Modify StrandOddsRatio calculation to take on lower values in cases where reference +/- reads are skewed but alt reads are not. Add SOR to default annotations produced by GenotypeGVCFs. Add jitter to minimum SOR values 2014-08-07 12:09:19 -04:00
ldgauthier 683baff375 Merge pull request #697 from broadinstitute/ldg_inbreedingCoeffForMultiallelics
Fix nullPointerException
2014-08-07 11:57:39 -04:00
Laura Gauthier f532f1f843 Fix nullPointerException 2014-08-07 10:13:17 -04:00
Eric Banks 0b72f7a16d Merge pull request #685 from broadinstitute/ldg_inbreedingCoeffForMultiallelics
Update inbreeding coefficient calculation to give a better estimate for ...
2014-08-07 09:00:09 -04:00
Laura Gauthier 74affcc077 Update inbreeding coefficient calculation to give a better estimate for multiallelic sites
Add unit test for compound het and for multiallelic hets
2014-08-07 08:12:47 -04:00
Eric Banks b9486f5b4d Merge pull request #693 from broadinstitute/ldg_SORfromHC
Allow SOR to be calculated from HC
2014-08-06 21:48:09 -04:00
Eric Banks 5f31e54d67 Merge pull request #696 from broadinstitute/pd_DoC_sorting
Fix sample sort order bug in DepthOfCoverage
2014-08-06 08:35:35 -04:00
Eric Banks 6fa4764fc1 Merge pull request #694 from broadinstitute/pd_check_missing_arg_values
Improved detection of missing argument values
2014-08-06 08:32:22 -04:00
Phillip Dexheimer b0c026e671 Fix sample sort order bug in DepthOfCoverage
Rare bug triggered by hash collision between sample names
 PT 66183936
2014-08-05 21:55:34 -04:00
Phillip Dexheimer 593663d9b6 Improved detection of missing argument values
In particular, it was possible to specify arguments for Files or Compound types without values
 Added a special "none" value for annotations, since a bare "-A" is no longer allowed
 Delivers PT 71792842 and 59360374
2014-08-05 20:31:31 -04:00
Eric Banks 03e7ee6e9c Merge pull request #695 from broadinstitute/pd_genotype_concordance_doc
Documentation fix (closed HTML tag)
2014-08-05 07:15:40 -04:00
Phillip Dexheimer 359fe150c9 Documentation fix (closed HTML tag) 2014-08-04 23:19:16 -04:00
Laura Gauthier 5533199402 Allow SOR to be calculated from HC
Refactor StrandBiasTest classes
2014-08-01 20:47:58 -04:00
Ryan Poplin e69b0d6316 Merge pull request #692 from broadinstitute/rp_typo_analyze_covariates
Fixing typos in AnalyzeCovariates
2014-07-31 10:39:31 -04:00
Ryan Poplin 63b3f7dfd3 Fixing typos in AnalyzeCovariates 2014-07-31 10:36:18 -04:00
Eric Banks 66ccc636b8 Merge pull request #691 from broadinstitute/ldg_update_htsjdk
Update GATK to work with latest htsjdk
2014-07-30 14:32:23 -04:00
Laura Gauthier 4373922ee6 Update GATK to work with latest htsjdk
ValidationStringency was moved from htsjdk.samtools.SAMFileReader to htsjdk.samtools
The samtools method for finding the BAM index file was also moved (and made public!)
2014-07-30 12:05:14 -04:00
Valentin Ruano Rubio d69af637bf Merge pull request #690 from broadinstitute/vrr_fix_non_diploid_unsupproted_message_in_haplotype_caller
Add diploid only support message to HaplotypeCaller
2014-07-29 22:50:00 -04:00
Valentin Ruano-Rubio 750eb4b5a6 Add diploid only support message to HaplotypeCaller
Story:

  https://www.pivotaltracker.com/story/show/73440292

Changes:

  - Just add the conditional in HaplotypeCaller#initialize

Testing:

  - Nothing added, checked locally, trivial change that would eventually be removed anyway.
2014-07-29 17:05:36 -04:00
Eric Banks 43bd6b4436 Merge pull request #689 from broadinstitute/eb_fix_samrecord_test_constructor
The copy constructor for a GATKSAMRecord (used for testing only) should ...
2014-07-23 23:03:30 -04:00
Eric Banks 84af1fc75f The copy constructor for a GATKSAMRecord (used for testing only) should use the actual read's contig index, not its mate's. 2014-07-23 15:31:03 -04:00
Valentin Ruano Rubio 881fc9cc35 Merge pull request #687 from broadinstitute/vrr_omniploidy_assesment_na12878
Further changes to AssessNA12878 for evaluating Omniploidy
2014-07-22 14:45:59 -04:00
Valentin Ruano-Rubio 8c14609477 Added -gtType argument to AssessNA12878 to stratify assessment based on true genotype call in NA12878 (or input call if the former is not in KB).
Story:

  https://www.pivotaltracker.com/story/show/75028590

Changes:

  Added the possibility of indicating the genotype type to consider (with argument [-gtType TYPE]*, where TYPE is HET, HOM_REF or HOM_VAR)
  Removed conditional Het evaluation based on input ploidy; now you need to use -gtType explicitly.

Tests:

  Added integration tests to check on new argument (-gtType) behaviour in AssessNA12878KnowledgeBaseTest
2014-07-22 11:02:36 -04:00
David Roazen 0798a4b768 Update pom versions to mark the start of GATK 3.3 development 2014-07-17 12:09:33 -04:00
David Roazen 323f22f852 Update pom versions for the 3.2 release 2014-07-17 12:06:22 -04:00
Eric Banks 98d88eb07e Fixed IndexOutOfBounds error associated with tail merging.
Don't expand out source nodes for tail merging, since that's a head merging action only.
This shows up as a bug only because we now allow merging tails against non-reference paths.
2014-07-17 12:04:22 -04:00
David Roazen 799071b520 Merge remote-tracking branch 'unstable/master' 2014-07-14 14:07:16 -04:00
Geraldine Van der Auwera 104d94a1ac Merge pull request #686 from broadinstitute/gg_gatkdocs_updates
Various documentation improvements
2014-07-14 12:10:36 -04:00
Geraldine Van der Auwera a6f632874b Various documentation improvements
- Edited intervals merging docs for correctness & clarity
- Edited VQSR arg docs and made mode required (+added -mode SNP to VQSR tests)
- Moved PaperGenotyper to Toy Walkers to declutter the actually useful docs
- Moved GenotypeGVCFs to Variant Discovery category and clarified a few points
- Clarified that the -resource argument depends on using the -V:tag format
- Clarified how the pcr indel model works
- Added caveat for -U ALLOW_N_CIGAR_READS
- Added MathJax support for displaying equations in GATKDocs
- Updated HC example commands and caveats
2014-07-14 12:03:03 -04:00
droazen db53d096c9 Merge pull request #684 from broadinstitute/ks_add_cofoja_to_gatk_packages
Added cofoja to the gatk packages for tests to pass.
2014-07-14 11:15:49 -04:00
Eric Banks 2138aa135c Merge pull request #683 from broadinstitute/eb_disable_complex_variant_merging_PT74816146
Disable the complex variant merging for now, as requested by ATGU
2014-07-13 08:57:13 -04:00
Eric Banks ecefcb383d Disable the complex variant merging for now, as requested by ATGU 2014-07-11 17:27:40 -04:00
Khalid Shakir c7e357eb59 Added cofoja to the gatk packages for tests to pass. 2014-07-11 23:19:42 +08:00
droazen b8751ad598 Merge pull request #680 from broadinstitute/ldg_VQSRscript
Update VQSR Rnd BQSR  script generation code for compatibility with late...
2014-07-11 10:16:37 -04:00
kshakir 4f4ff5c327 Merge pull request #682 from broadinstitute/ks_revert_md5_db_per_test
Reverting md5 db per test
2014-07-11 03:42:18 +08:00
Ryan Poplin 193e389b41 Merge pull request #679 from broadinstitute/eb_better_tail_merging_PT74222522
Improved tail merging: now tails can be merged to branches that are not ...
2014-07-10 13:54:33 -04:00
Valentin Ruano Rubio 598b481733 Merge pull request #671 from broadinstitute/vrr_omniploidy_assesment_na12878
Now AssessNA12878 can handle different input ploidy for omniploidy asses...

Story:

  https://www.pivotaltracker.com/story/show/74469346
2014-07-10 13:19:32 -04:00
Khalid Shakir 18f6d56b4c Revert "Using the base directory for each test run when outputting MD5DB mismatches."
This reverts commit f192f032a153755a84b1d682f6e652a7c6787fb9.
2014-07-11 01:11:25 +08:00
Khalid Shakir cc09ef9190 Revert "Appending to md5db in the gatkdir, with additional logging."
This reverts commit 0aa2884f7b006f5d48c325bf942b92c183e45074.
2014-07-11 01:11:20 +08:00