Commit Graph

4381 Commits (7e77875c810ac1bcb1a2ad1d6526333490498166)

Author SHA1 Message Date
Phillip Dexheimer 7e77875c81 Improvements to read-group filtering in PrintReads
- Read groups that are excluded by sample_name, platform, or read_group arguments no longer appear in the header
- The performance penalty associated with filtering by read group has been essentially eliminated
- Partial fulfillment of PT 73075482
2014-08-11 20:08:16 -04:00
Eric Banks 5f31e54d67 Merge pull request #696 from broadinstitute/pd_DoC_sorting
Fix sample sort order bug in DepthOfCoverage
2014-08-06 08:35:35 -04:00
Phillip Dexheimer b0c026e671 Fix sample sort order bug in DepthOfCoverage
Rare bug triggered by hash collision between sample names
PT 66183936
2014-08-05 21:55:34 -04:00
Phillip Dexheimer 593663d9b6 Improved detection of missing argument values
In particular, it was possible to specify arguments for Files or Compound types without values
Added a special "none" value for annotations, since a bare "-A" is no longer allowed
Delivers PT 71792842 and 59360374
2014-08-05 20:31:31 -04:00
Phillip Dexheimer 359fe150c9 Documentation fix (closed HTML tag) 2014-08-04 23:19:16 -04:00
Laura Gauthier 4373922ee6 Update GATK to work with latest htsjdk
ValidationStringency was moved from htsjdk.samtools.SAMFileReader to htsjdk.samtools
samtools find BAM index file method was also moved (and made public!)
2014-07-30 12:05:14 -04:00
Eric Banks 84af1fc75f The copy constructor for a GATKSAMRecord (used for testing only) should use the actual read's contig index, not its mate's. 2014-07-23 15:31:03 -04:00
David Roazen 0798a4b768 Update pom versions to mark the start of GATK 3.3 development 2014-07-17 12:09:33 -04:00
David Roazen 323f22f852 Update pom versions for the 3.2 release 2014-07-17 12:06:22 -04:00
Geraldine Van der Auwera a6f632874b Various documentation improvements
- Edited intervals merging docs for correctness & clarity
- Edited VQSR arg docs and made mode required (+added -mode SNP to VQSR tests)
- Moved PaperGenotyper to Toy Walkers to declutter the actually useful docs
- Moved GenotypeGVCFs to Variant Discovery category and clarified a few points
- Clarified that the -resource argument depends on using the -V:tag format
- Clarified how the pcr indel model works
- Added caveat for -U ALLOW_N_CIGAR_READS
- Added MathJax support for displaying equations in GATKDocs
- Updated HC example commands and caveats
2014-07-14 12:03:03 -04:00
Eric Banks ecefcb383d Disable the complex variant merging for now, as requested by ATGU 2014-07-11 17:27:40 -04:00
droazen b8751ad598 Merge pull request #680 from broadinstitute/ldg_VQSRscript
Update VQSR Rnd BQSR script generation code for compatibility with late...
2014-07-11 10:16:37 -04:00
Khalid Shakir 18f6d56b4c Revert "Using the base directory for each test run when outputting MD5DB mismatches."
This reverts commit f192f032a153755a84b1d682f6e652a7c6787fb9.
2014-07-11 01:11:25 +08:00
Khalid Shakir cc09ef9190 Revert "Appending to md5db in the gatkdir, with additional logging."
This reverts commit 0aa2884f7b006f5d48c325bf942b92c183e45074.
2014-07-11 01:11:20 +08:00
kshakir aecd34d274 Merge pull request #677 from broadinstitute/ks_md5_db_per_test_type
Appending to md5db in the gatkdir, with additional logging.
2014-07-10 17:53:24 +08:00
Khalid Shakir a7d1904c63 Appending to md5db in the gatkdir, with additional logging. 2014-07-10 03:58:47 +08:00
Laura Gauthier 99026eb51b Update VQSR Rnd BQSR script generation code for compatibility with latest ggplot version. Update queueJobReport.R and public/gsalib/src/R/R/gsa.variantqc.utils.R also 2014-07-09 15:36:58 -04:00
David Roazen 719e685759 Remove junit imports in the test suite 2014-07-09 12:09:27 -04:00
Khalid Shakir 2129aa05d8 Bug fix for poms missing package test artifacts. 2014-07-08 06:34:26 +08:00
Khalid Shakir e5be9c7073 Using the base directory for each test run when outputting MD5DB mismatches. 2014-07-08 06:34:25 +08:00
Eric Banks bad7865078 When converting a haplotype to a set of variants we now check for cases that are overly complex.
In these cases, where the alignment contains multiple indels, we output a single complex
variant instead of the multiple partial indels.

We also re-enable dangling tail recovery by default.
2014-07-01 14:18:59 -04:00
Ryan Poplin 0127799cba Reads are now realigned to the most likely haplotype before being used by the annotations.
-- AD,DP will now correspond directly to the reads that were used to construct the PLs
-- RankSumTests, etc. will use the bases from the realigned reads instead of the original alignments
-- There is now no additional runtime cost to realign the reads when using bamout or GVCF mode
-- bamout mode no longer sets the mapping quality to zero for uninformative reads, instead the read will not be given an HC tag
2014-06-30 10:35:50 -04:00
Khalid Shakir 7b5f88a49c Refactored DoC custom Queue wrappers to a non-package object.
Now, "mvn verify && mvn verify" should work again.
2014-06-26 00:59:18 +08:00
droazen b935ed0df1 Merge pull request #665 from broadinstitute/ks_force_delete_bad_symlinks
Executing a version of the delete_maven_links.sh
2014-06-25 00:13:05 -04:00
Phillip Dexheimer 06d619e9aa Removed redundant SelectVariantsIntegrationTest, merged its only test into protected version 2014-06-24 18:59:59 -04:00
Khalid Shakir 45d819a00e For now, executing the delete_maven_links.sh just ahead of creating the symbolic links during the process-test-resources phase.
Better than running it during the "clean" phase, since these users may not run "mvn clean" before attempting to build.
2014-06-25 02:32:15 +08:00
Phillip Dexheimer 65eeb4a7ab Recast the "Invalid JEXL expression detected" error in SelectVariants from a RuntimeException to a UserException
- PT 68931448
2014-06-20 00:05:23 -04:00
Phillip Dexheimer da5e567b73 Added functionality to CatVariants to process .list files with -V
- Pivotal 70305712
2014-06-19 21:46:13 -04:00
Ryan Poplin da1dab6c32 Merge pull request #661 from broadinstitute/jw_allele_balance_gvcf
Enable AB annotation in reference model pipeline. Incorporates patches f...
2014-06-19 13:10:41 -04:00
Eric Banks 1092dd6e25 From Carlos Barroto: switch outputRoot in SplitSamFile to an empty string instead of null. 2014-06-19 11:06:55 -04:00
Eric Banks 9212edba41 From Carlos Barroto: made 'level' in Picard's CalculateHsMetrics Scala Queue extension an argument. 2014-06-19 11:06:50 -04:00
Ryan Poplin 8b75428a90 Enable AB annotation in reference model pipeline. Incorporates patches from John Wallace to public github account 2014-06-19 09:35:04 -04:00
Nigel Delaney 7570666f2a Merge pull request #655 from broadinstitute/nfd_mathutil_opts
Optimization of function to calculate the logged sum of exponentiated values
2014-06-17 17:07:42 -04:00
Nigel Delaney 5e258bfeff Minor optimization to function to calculate the log of exponentials.
* Avoids calling Math.pow whenever possible (skips -Inf and 0 values),
leads to better performance.
2014-06-17 15:26:10 -04:00
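
The optimization described in commit 5e258bfeff can be illustrated with a minimal sketch. This is not the GATK MathUtils source: the class and method names below are hypothetical, and reading "0 values" as the zero offset of the maximum element from itself is an assumption. The sketch computes log10 of a sum of 10^x terms relative to the maximum, and only calls Math.pow for elements that can actually change the sum.

```java
// Hypothetical sketch; names and structure are assumptions, not the GATK code.
final class Log10SumSketch {
    /** Returns log10(sum over i of 10^values[i]) for log10-scaled inputs. */
    static double log10SumLog10(final double[] values) {
        if (values.length == 0) {
            return Double.NEGATIVE_INFINITY; // empty sum is 0, whose log10 is -Inf
        }
        // Locate the maximum so every exponent below is <= 0 (numerically stable).
        int maxIdx = 0;
        for (int i = 1; i < values.length; i++) {
            if (values[i] > values[maxIdx]) {
                maxIdx = i;
            }
        }
        final double max = values[maxIdx];
        if (max == Double.NEGATIVE_INFINITY) {
            return max; // every term is zero
        }
        double sum = 1.0; // the maximum contributes 10^0 = 1 without a pow call
        for (int i = 0; i < values.length; i++) {
            final double v = values[i];
            if (i == maxIdx || v == Double.NEGATIVE_INFINITY) {
                continue; // skip the max itself and terms equal to 10^-Inf = 0
            }
            sum += Math.pow(10.0, v - max);
        }
        return max + Math.log10(sum);
    }
}
```

For inputs where most entries are -Inf, the loop above degenerates to a scan with no exponentiation at all, which is presumably where the reported gain comes from.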
Chris Whelan ba1d23e535 Created a new tool, SiblingIBD, which finds Identical-By-Descent regions in two siblings.
- When parental genotypes are available, implements an HMM on genotype observations in the quartet.
- Outputs IBD regions as well as per-site posterior probabilities of being in each IBD state.
- Includes an experimental heuristic based mode for when parental genotypes are not available.
- Made a method in MendelianViolation public static to reuse code.
- Added the mockito library to private/gatk-tools-private/pom.xml
2014-06-13 09:41:37 -04:00
Menachem Fromer a1868e8b82 For XHMM and Depth-of-Coverage Qscripts, add ability for user to input sample renaming file at the GATK level using existing GATK flag (--sample_rename_mapping_file) and custom pre-processing code. For XHMM Qscript, add scatter-gather for Discovery and Genotype stages. 2014-06-09 23:49:54 -04:00
Phillip Dexheimer 4eb9858461 Ensure that output files are specified in a writeable location
-PT 69579780
2014-06-02 21:13:59 -04:00
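
A check of the kind commit 4eb9858461 describes might look like the following sketch. The class and method names are hypothetical, and the exact rules GATK applies are not stated in the commit message; the sketch assumes an output path is acceptable when it is an existing writable file, or when it does not exist yet but its parent directory exists and is writable.

```java
import java.io.File;

// Hypothetical sketch; the real validation rules in GATK may differ.
final class WritableCheckSketch {
    /** Returns true if the output path can plausibly be created or overwritten. */
    static boolean isWritableLocation(final File output) {
        final File abs = output.getAbsoluteFile();
        if (abs.exists()) {
            // An existing target must be a regular file we are allowed to write.
            return abs.isFile() && abs.canWrite();
        }
        // A new target needs an existing, writable parent directory.
        final File parent = abs.getParentFile();
        return parent != null && parent.isDirectory() && parent.canWrite();
    }
}
```

Running such a check while arguments are parsed lets a tool fail immediately with a user-facing error instead of failing after a long traversal.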
Valentin Ruano Rubio db96891d4b Merge pull request #638 from broadinstitute/vrr_createTempFile_testfix
Changed File.createTempFile to BaseTest.createTempFile calls Test
2014-05-29 10:15:05 -04:00
Valentin Ruano-Rubio 938172d7f0 Removed redundant override createTempFileFromBase (same code as super class) and added some finals to DepthOfCoverageB36IntegrationTest 2014-05-28 19:02:04 -04:00
Valentin Ruano-Rubio e0c221470c Changed File.createTempFile to BaseTest.createTempFile 2014-05-28 18:59:48 -04:00
EvolvedMicrobe ef7531d4a5 Merge pull request #640 from broadinstitute/IntegerSWImplementation
Change SmithWaterman to use integers instead of doubles.
2014-05-28 15:10:05 -04:00
Nigel Delaney cc45e62e8e Change SmithWaterman to use integers instead of doubles. 2014-05-28 13:13:14 -04:00
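
To illustrate what integer scoring in Smith-Waterman means in commit cc45e62e8e, here is a minimal sketch of the local-alignment recurrence using int scores only. It is not the GATK implementation: the class name, the linear gap penalty (chosen for brevity), and the scoring parameters are all assumptions. With integer scores, cell comparisons are exact, so ties are resolved deterministically rather than depending on floating-point rounding.

```java
// Hypothetical sketch with a linear gap penalty; not the GATK SmithWaterman class.
final class IntSmithWatermanSketch {
    /** Returns the best local alignment score between a and b using int arithmetic. */
    static int bestLocalScore(final String a, final String b,
                              final int match, final int mismatch, final int gap) {
        int[] prev = new int[b.length() + 1]; // previous row of the score matrix
        int[] curr = new int[b.length() + 1]; // row currently being filled
        int best = 0;
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = 0;
            for (int j = 1; j <= b.length(); j++) {
                final int step = a.charAt(i - 1) == b.charAt(j - 1) ? match : mismatch;
                final int diag = prev[j - 1] + step; // align a[i-1] with b[j-1]
                final int up = prev[j] + gap;        // gap in b
                final int left = curr[j - 1] + gap;  // gap in a
                // Local alignment: a cell never drops below zero.
                curr[j] = Math.max(0, Math.max(diag, Math.max(up, left)));
                best = Math.max(best, curr[j]);
            }
            final int[] tmp = prev; // reuse the two row buffers
            prev = curr;
            curr = tmp;
        }
        return best;
    }
}
```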
Eric Banks ff43b1f298 Merge pull request #636 from broadinstitute/pd_log10_refactor
Replaced the static, fixed MathUtils.log10Cache array with a dynamic Log...
2014-05-28 08:46:49 -04:00
Phillip Dexheimer 6122b2805d Legibility improvements to ProgressMeter
- Fields in the header are delimited with the pipe character
- Header is now split into two lines to improve spacing
- Field width in header and progress lines auto-adjusts to length of "processing units" label (sites, active regions, etc)
- Addresses PT 69725930
2014-05-27 23:52:42 -04:00
Phillip Dexheimer c15e6fcc0e Refactored the static lookup arrays in MathUtils (log10Cache, log10FactorialCache, jacobianLogTable)
- They are now only computed when necessary
- Log10Cache is dynamically resizable, either by calling get() on an out-of-range value or by calling ensureCacheContains
- Log10FactorialCache and JacobianLogTable are initialized to a fixed size on first access and are not resizable
- Addresses PT 69124396
2014-05-27 22:27:57 -04:00
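
The lazily computed, resizable cache described in commit c15e6fcc0e could be sketched as follows. Only the method names get() and ensureCacheContains come from the commit message; the class name, the doubling growth policy, and the synchronization are assumptions made for illustration.

```java
import java.util.Arrays;

// Hypothetical sketch of a lazily filled, growable log10 lookup table.
final class Log10CacheSketch {
    private double[] cache = null; // nothing is computed until first use

    /** Returns log10(n), growing the cache first if n is out of range. */
    synchronized double get(final int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative: " + n);
        }
        ensureCacheContains(n);
        return cache[n];
    }

    /** Guarantees that indices 0..n are present in the cache. */
    synchronized void ensureCacheContains(final int n) {
        final int oldSize = cache == null ? 0 : cache.length;
        if (n < oldSize) {
            return; // already covered
        }
        // Assumed policy: at least double the size to amortize repeated growth.
        final int newSize = Math.max(n + 1, 2 * oldSize);
        final double[] grown = cache == null
                ? new double[newSize]
                : Arrays.copyOf(cache, newSize);
        for (int i = oldSize; i < newSize; i++) {
            grown[i] = Math.log10(i); // log10(0) is -Infinity by definition
        }
        cache = grown;
    }

    /** Number of entries currently cached. */
    synchronized int size() {
        return cache == null ? 0 : cache.length;
    }
}
```

Compared with a fixed static array, this design avoids paying the initialization cost in runs that never touch the cache, and it removes the out-of-range failure for unusually large arguments.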
David Roazen 74b51c5c7a Improve test suite tmp file cleanup
-Make BaseTest.createTempFile() mark any possible corresponding index files for deletion on exit

-Make WalkerTest mark shadow BCF files and auxiliary for deletion on exit

-Make VariantRecalibrationWalkersIntegrationTest mark PDF files for deletion on exit
2014-05-27 13:41:44 -04:00
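
The cleanup approach in commit 74b51c5c7a can be sketched with the standard File.deleteOnExit() call. The helper below is hypothetical and is not BaseTest.createTempFile() itself; in particular, the list of index extensions is an assumption for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch; the companion extensions are assumed, not taken from GATK.
final class TempFileSketch {
    private static final String[] INDEX_EXTENSIONS = {".idx", ".bai"};

    /** Creates a temp file and marks it and any possible index files for deletion at JVM exit. */
    static File createTempFile(final String prefix, final String suffix) {
        try {
            final File file = File.createTempFile(prefix, suffix);
            file.deleteOnExit();
            // Index files may be written next to the temp file later by other code;
            // registering them now is harmless if they are never created.
            for (final String ext : INDEX_EXTENSIONS) {
                new File(file.getAbsolutePath() + ext).deleteOnExit();
            }
            return file;
        } catch (final IOException e) {
            throw new UncheckedIOException("Could not create temp file", e);
        }
    }
}
```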
Valentin Ruano-Rubio 7c8a1ae892 Fix for SW to make double comparisons with a tolerance
Stories:

  - https://www.pivotaltracker.com/story/show/69577868

Changes:

  - Added an epsilon difference tolerance in weight comparisons.

Tests:

  - Added HaplotypeCallerIntegrationTest#testDifferentIndelLocationsDueToSWExactDoubleComparisonsFix
  - Updated md5 due to minor likelihood changes.
  - Disabled a test for PathUtils.calculateCigar since it does not work and it is unclear what is causing the error (needs original author input)
2014-05-23 01:48:48 -04:00
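
The tolerance-based comparison mentioned in commit 7c8a1ae892 can be illustrated with a small sketch. The class name, method name, and the epsilon value are assumptions; the commit message does not state the tolerance GATK actually uses.

```java
// Hypothetical sketch; EPSILON is an assumed value, not the one used in GATK.
final class ToleranceSketch {
    static final double EPSILON = 1e-10;

    /** Returns 0 if a and b differ by at most EPSILON, otherwise -1 or 1. */
    static int compareWithTolerance(final double a, final double b) {
        final double diff = a - b;
        if (Math.abs(diff) <= EPSILON) {
            return 0; // treat rounding-level differences as a tie
        }
        return diff < 0 ? -1 : 1;
    }
}
```

With exact comparison, 0.1 + 0.2 and 0.3 are different doubles, so two paths with mathematically equal weights could be ranked differently depending on the order of arithmetic. Treating them as tied keeps the choice between such paths stable.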
Khalid Shakir b7e98bdae9 Fixed GATK docs artifact, moved protected ExampleUG tests. 2014-05-22 21:03:55 -04:00
Karthik Gururaj 972a82d386 Changed 'sting' to 'gatk' in the VectorLoglessPairHMM classes and the
C++ code
2014-05-19 17:36:41 -04:00
Khalid Shakir 3939971d78 After renaming the packages, instead of updating the JNI library used for testing bwa, moving the classes to the archive.
NOTE: The migrated README.md has been added that will allow others to possibly resurrect this code as needed.
2014-05-19 17:36:41 -04:00