Commit Graph

1359 Commits (19bbe45cbc3960bcb70b6bfb74bdb4e73e762851)

Author SHA1 Message Date
Geraldine Van der Auwera 7a75f4ae79 Merge pull request #974 from broadinstitute/jw_Var2BinPEDSwap
Correct errant array element swap in FAM file output.
2015-05-12 08:49:16 -04:00
Eric Banks 53a34cea4a Merge pull request #938 from broadinstitute/eb_fix_spanning_deletions_in_genotyping
Added a fix for genotyping positions over spanning deletions.
2015-05-11 23:11:47 -04:00
Joseph White abb6bc6f57 Correct errant array element swap in FAM file output.
dad and mom are swapped; paternal first, then maternal

updated MD5 chksums for test files

remove commented lines
2015-05-11 20:45:50 -04:00
Eric Banks 530e0e5ea6 Added a fix for combining/genotyping positions over spanning deletions.
Previously, if a SNP occurred in sample A at a position that was in the middle of a deletion for sample B,
sample B would be genotyped as homozygous reference there (but it's NOT reference - there's a deletion).
Now, sample B is genotyped as having a symbolic DEL allele.

Minor cleanup added.  Note that I also removed Laura's previous fix for this problem.

Existing integration tests change because I've added a new header line to the VCF being output.
I also added several tests for the new functionality showing:
1. genotyping from separate and already combined gvcfs give the same output
2. genotyping over multiple spanning deletions works
3. combining works too

Existing unit tests also cover this case.
2015-05-11 15:11:16 -04:00
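The spanning-deletion behaviour described in the commit above can be illustrated with a minimal sketch. This is not the GATK implementation: the function names, the tuple layout for deletion records, and the "<DEL>" string used for the symbolic allele are all assumptions made for illustration, and deletions are assumed to be left-anchored VCF records (REF longer than ALT, sharing the leading base).

```python
SPANNING_DEL = "<DEL>"  # placeholder name for the symbolic deletion allele (assumption)


def deleted_span(pos, ref, alt):
    """Return the inclusive (first, last) deleted positions of a left-anchored
    VCF deletion record, or None if the record is not a deletion."""
    if len(ref) <= len(alt):
        return None
    return (pos + len(alt), pos + len(ref) - 1)


def allele_for_site(site, sample_deletions):
    """Decide what a sample should report at `site` when another sample has a SNP there.

    sample_deletions is a list of (pos, ref, alt) tuples for this sample.
    If any deletion spans the site, the sample is not reference there.
    """
    for pos, ref, alt in sample_deletions:
        span = deleted_span(pos, ref, alt)
        if span is not None and span[0] <= site <= span[1]:
            return SPANNING_DEL
    return "REF"


# Sample B has a deletion at position 100 removing bases 101-103.
sample_b = [(100, "ACGT", "A")]
print(allele_for_site(102, sample_b))
print(allele_for_site(104, sample_b))
```

A site inside the deleted bases reports the symbolic allele, while the anchor base and positions past the deletion remain reference.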
Joseph White 5be8bc5dfc Deprecate --mergeVariantsViaLD in HC
New unit test for deprecated mergeVariantsViaLD
Update HaplotypeCallerIntegrationTest.java
Delete duplicate testHaplotypeCallerMergeVariantsViaLDException test.
2015-05-08 17:50:25 -04:00
Geraldine Van der Auwera 5d8b9a7c20 Moved MQ0 out of HC exclusion and into StandardUGAnnotation 2015-05-03 01:04:49 +02:00
Geraldine Van der Auwera 071d82d1bf Un-exclude SD and TRA from HC annotators; resolves #966
Exclude MQ0BySample
Move SD and TRA to new StandardUGAnnotation interface
There is now an annotation interface (StandardUGAnnotation) holding annotations that are standard in UG but shouldn't be used as they are now with HC. This allows us to not have to exclude these annotations explicitly in HC, but still be able to use them for development purposes.
2015-05-03 00:45:53 +02:00
Geraldine Van der Auwera e49f6dfd0f Merge pull request #970 from broadinstitute/gg_minor_docfixes
Fairly minor if plentiful fixes to various gatkdocs. Merging this without formal review since all tests pass, the gatkdocs build, and no one really wants to review corrections to grammar, typos and layout for 120+ documents. Review will be done by users in production ;-)
2015-05-03 00:36:12 +02:00
Geraldine Van der Auwera 919c3eaa2e Numerous doc fixes; mostly formatting and clarifications 2015-05-03 00:28:46 +02:00
Ron Levine 9ff827c83a More allele trimming for VariantAnnotator 2015-04-29 21:11:49 -04:00
Laura Gauthier 97caf94807 Fix implementation of allowNonUniqueKmersInRef so that it applies to all kmer sizes 2015-04-23 13:01:47 -04:00
Ron Levine d5f98e99f0 Bypass reads with a bad CIGAR length 2015-04-21 11:55:56 -04:00
Kristian Cibulskis 45610a142c initial refactoring of arguments into individual argument collections
fix blasted license blurbs

updates based on PR comments (abstractify HaplotypeCallerArgumentCollection into AssemblyBasedCallerArgumentCollection)

comments on comments from PR review
2015-04-07 16:55:32 -04:00
Geraldine Van der Auwera 2053afe52a Merge pull request #914 from broadinstitute/ldg_fixDitheringRandomness
Initialize annotations so that --disableDithering actually works
2015-04-06 15:40:30 -04:00
Yossi Farjoun d30a6258bc added the missing file to the error message 2015-04-06 08:21:55 -04:00
Laura Gauthier 9c842df3a3 Initialize annotations so that --disableDithering actually works 2015-04-02 17:34:46 -04:00
Geraldine Van der Auwera d7f7022dce Merge pull request #904 from broadinstitute/pd_orig_dp
Added keepOriginalDP argument to SelectVariants
2015-03-30 09:01:33 -04:00
Laura Gauthier 5a10758e2e Annotation changes for M2:
Build a ReferenceContext in ActiveRegionWalkers to pass in to annotation engine so we can call the TandemRepeatAnnotator from M2
Make TandemRepeatAnnotator default annotation for M2.
Setup (but don't use yet) HC-style contamination downsampling.
New HC integration test with TandemRepeatAnnotator
2015-03-27 18:25:23 -04:00
Ron Levine aef0a83c52 Automatically choose indexing strategy by file extension 2015-03-27 11:10:35 -04:00
Phillip Dexheimer c97c253ec8 Added keepOriginalDP argument to SelectVariants
Fixes #830
2015-03-25 22:45:31 -04:00
Phillip Dexheimer 9e63696315 Remove indel-length normalization of QD for GGVCFs
* Fixes #848
* length normalization is now only applied if the annotation is calculated in UG
2015-03-24 08:22:19 -04:00
Geraldine Van der Auwera 0a45b2d79d Merge pull request #883 from broadinstitute/rhl_hc_mq0
Exclude MappingQualityZero from default annotations
2015-03-23 12:59:08 -04:00
Ami Levy-Moonshine c5fc5c4f8c create 2 new tools:
- ASEReadCounter (public tool) replaces Tuuli's script to produce the input to Manny's tool.
   It counts the number of reads that support the ref allele and the alt allele, filtering out low-quality reads and bases and keeping only properly paired reads
- ASECaller (private tool) takes both RNA and DNA, and produces contingency tables ** still under development **

minor changes in other tools:
- update RNA HC variant calling scala script
- expose FS method pValueForContingencyTable to be able to call it from ASEcaller

In ASEReadCounter:
- allow different options to deal with overlapping reads from the same fragment
- add option to ignore or include indels in the pileups
- add option to disable DuplicateRead

add ASEReadCounterIntegrationTest.java and files for the test
2015-03-21 16:56:00 -04:00
Ron Levine 46668d469a Exclude MappingQualityZero from default annotations 2015-03-17 21:46:18 -04:00
Kristian Cibulskis ab1053e83c It compiles, and produces results!
fixed NPE when normal contains no reads

first integration test (micro) and unit tests, also rename of MuTectHC -> M2

adding in standard GATK license terms

incorporated HOSTILE mode to PCR Error Correction

removed tumor and normal name parameters and cleaned up internal name handling

changes to allow for calling without a matched normal (technically, not true 'tumor-only' calling).  Used for panel-of-normals creation

additional regression tests, based on DREAM data.  Removed accidental addition of TandemRepeatAnnotator to default annotations

updated MD5 based on run from GSA4 to fix bamboo issue

reverted unneeded visibility changes
2015-03-13 18:28:01 -04:00
Geraldine Van der Auwera 39a972f348 Merge pull request #872 from broadinstitute/eb_create_rgq_format_field
Added the RGQ format annotation to monomorphic sites in the VCF output of GenotypeGVCFs. Fixes #870
2015-03-13 13:59:53 -04:00
Eric Banks 1ff9463285 Added the RGQ format annotation to monomorphic sites in the VCF output of GenotypeGVCFs.
Now, instead of stripping out the GQs for mono sites, we transfer them to the RGQ.
This is extremely useful for people who want to know how confident the hom ref genotype calls are.
Perhaps this is just what CRSP needs for pertinent negatives.

Note that I also changed the tool to no longer use the GenotypeSummaries annotation by default since
it was adding some seemingly unnecessary annotations (like mean GQ now that we keep the GQ around and
number of no-calls).  Let me know if this was a mistake (although Laura gave me a thumbs up).
2015-03-13 10:27:20 -04:00
Phillip Dexheimer 6ffa295963 Regression: The new 'includeUnmapped' PartitionBy annotation was incorrectly set for HC
Fixes #828
2015-03-13 00:24:57 -04:00
Eric Banks ea8a1edeb6 Adding option to CombineGVCFs to have it break blocks at every N sites.
Using --breakBandsAtMultiplesOf N will ensure that no reference blocks span across
genomic positions that are multiples of N.  This is especially important in the
case of scatter-gather where you don't want your scatter intervals to start in the
middle of blocks (because of a limitation in the way -L works in the GATK for VCF
records with the END tag).

For example, running with --breakBandsAtMultiplesOf 5 on this record:
1       69491   .       G       <NON_REF>       .       .       END=69523       GT:DP:GQ:MIN_DP:MIN_GQ:PL       ./.:94:99:82:99:0,120,1800

Will produce the following records:
1       69491   .       G       <NON_REF>       .       .       END=69494       GT:DP:GQ:MIN_DP:MIN_GQ:PL       ./.:94:99:82:99:0,120,1800
1       69495   .       C       <NON_REF>       .       .       END=69499       GT:DP:GQ:MIN_DP:MIN_GQ:PL       ./.:94:99:82:99:0,120,1800
1       69500   .       T       <NON_REF>       .       .       END=69504       GT:DP:GQ:MIN_DP:MIN_GQ:PL       ./.:94:99:82:99:0,120,1800
etc.

Added docs and a new test.
2015-03-12 14:42:10 -04:00
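The band-breaking rule described in the commit above can be sketched as follows. This is an illustrative sketch rather than the CombineGVCFs code: the function name is hypothetical, and it only computes the start and END coordinates of each piece (it does not look up the reference base for each new record).

```python
def break_band(start, end, n):
    """Split the inclusive interval [start, end] so that a new piece begins
    at every position that is a multiple of n."""
    pieces = []
    piece_start = start
    while piece_start <= end:
        # First multiple of n strictly greater than piece_start.
        next_break = (piece_start // n + 1) * n
        piece_end = min(end, next_break - 1)
        pieces.append((piece_start, piece_end))
        piece_start = piece_end + 1
    return pieces


# The record from the commit message: POS=69491, END=69523, N=5.
for piece_start, piece_end in break_band(69491, 69523, 5):
    print(piece_start, piece_end)
```

For the record in the commit message, the first three pieces are 69491-69494, 69495-69499 and 69500-69504, which agrees with the records listed there.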
Valentin Ruano Rubio f8f2680142 Merge pull request #812 from broadinstitute/ldg_combineData_submit
New walker to combine WGS and WES data
2015-03-02 15:12:31 -05:00
Laura Gauthier aaf952469e Change UG @PartitionBy to fix Queue tests 2015-03-01 14:42:43 -05:00
Laura Gauthier 6ebcba5234 New walker to combine data for different formats of same sample that were called and VQSRed together; has functionality to combine only specified samples, omitting others (e.g. combine the uniquified NA12878s with -usn NA12878.variant51 -usn NA12878.variant102)
GenotypeGVCFs now has the ability to unique-ify samples so I can genotype together two different datasets containing the same sample
Modify InbreedingCoeff so that it works when genotyping uniquified samples
2015-03-01 12:44:32 -05:00
ldgauthier 8efaa97d84 Merge pull request #815 from broadinstitute/ldg_updateMulitallelicVAtestData
Update test data so it better reflects the multiallelic AC/AF annotation...
2015-03-01 12:10:25 -05:00
Ron Levine 44e5965a4b Change GC Content value type from Integer to Float 2015-02-25 13:56:42 -05:00
Laura Gauthier 4a493a7900 Update test data so it better reflects the multiallelic AC/AF annotation use case 2015-02-20 19:02:42 -05:00
Ron Levine 2cbaef2fb2 Throw exception for -dcov argument given to ActiveRegionWalkers 2015-02-19 08:24:39 -05:00
Ron Levine c3ff6df252 StrandAlleleCountsBySample can only be called from HaplotypeCaller 2015-02-12 13:43:48 -05:00
Phillip Dexheimer 92c7c103c1 GenotypeConcordance: monomorphic sites in truth are no longer called "Mismatching Alleles" when the comp genotype has an alternate allele
* PT 84700606
2015-02-07 15:54:38 -05:00
rpoplin b8b23b931e Merge pull request #807 from broadinstitute/rhl_handle_cigar
Process X and = CIGAR operators
2015-02-01 11:09:52 -05:00
Phillip Dexheimer 3354c07b1c Added optional element "includeUnmapped" to the PartitionBy annotation
* The value of this element (default true) determines whether Queue will explicitly run this walker over unmapped reads
 * This patch fixes a runtime error when FindCoveredIntervals was used with Queue
 * PT 81777160
2015-01-31 15:47:57 -05:00
Ron Levine 9d4b876ccd Process X and = CIGAR operators
Add simple BaseRecalibrator integration test for CIGAR = and X operators
2015-01-29 17:00:00 -05:00
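As a rough illustration of what handling the = and X operators involves: in the SAM format, = (sequence match) and X (sequence mismatch) consume both read and reference bases, exactly like M. The sketch below is not GATK code; the function name is hypothetical, and it only computes the reference span of a CIGAR string.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
# Operators that consume reference bases; '=' and 'X' behave like 'M' here.
REF_CONSUMING = set("MDN=X")


def reference_length(cigar):
    """Return the number of reference bases covered by a CIGAR string."""
    ops = CIGAR_RE.findall(cigar)
    # Reject strings containing anything the pattern did not account for.
    if "".join(count + op for count, op in ops) != cigar:
        raise ValueError("malformed CIGAR: %r" % cigar)
    return sum(int(count) for count, op in ops if op in REF_CONSUMING)


# "5=1X4=" describes the same alignment span as "10M".
print(reference_length("5=1X4="), reference_length("10M"))
```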
Khalid Shakir 1808c90d2a Added introductory CRAM support.
Replaced usage of GATKSamRecordFactory with calls to wrapper GATKSAMRecord extending SAMRecord.
Minor other updates for test changes.
Added exampleCRAM.cram generated by GATK, with .bai and .crai indexes generated by CRAMTools.
CRAM-to-CRAM test disabled due to https://github.com/samtools/htsjdk/issues/148
Using exampleBAM.bam input, outputs of GATK's generated CRAM match CRAMTools generated CRAM, but not samtools/PrintReads SAM output, as things like insert sizes are different.
If required for other tools, CRAM indexes must be generated via CRAMTools until we can generate them via CRAMFileWriter.

Generation of exampleCRAM.cram:
* java -jar target/executable/GenomeAnalysisTK.jar -T PrintReads -R public/gatk-utils/src/test/resources/exampleFASTA.fasta -I public/gatk-utils/src/test/resources/exampleBAM.bam -o public/gatk-utils/src/test/resources/exampleCRAM.cram
* java -jar cramtools-2.1.jar index -I public/gatk-utils/src/test/resources/exampleCRAM.cram
* java -jar cramtools-2.1.jar index -I public/gatk-utils/src/test/resources/exampleCRAM.cram --bam-style-index

CRAM generation by existing tools:
* samtools view -C -T public/gatk-utils/src/test/resources/exampleFASTA.fasta -o testSamtools.cram public/gatk-utils/src/test/resources/exampleBAM.bam
* java -jar cramtools-2.1.jar cram --ignore-md5-mismatch --capture-all-tags -Q -n -R public/gatk-utils/src/test/resources/exampleFASTA.fasta -I public/gatk-utils/src/test/resources/exampleBAM.bam -O testCRAMTools.cram
* java -jar target/executable/GenomeAnalysisTK.jar -T PrintReads -R public/gatk-utils/src/test/resources/exampleFASTA.fasta -I public/gatk-utils/src/test/resources/exampleBAM.bam -o testGATK.cram

CRAMTools view of the above:
* java -jar cramtools-2.1.jar bam --skip-md5-check -R public/gatk-utils/src/test/resources/exampleFASTA.fasta -I public/gatk-utils/src/test/resources/exampleCRAM.cram | tail -n 1
* java -jar cramtools-2.1.jar bam --skip-md5-check -R public/gatk-utils/src/test/resources/exampleFASTA.fasta -I testSamtools.cram | tail -n 1
* java -jar cramtools-2.1.jar bam --skip-md5-check -R public/gatk-utils/src/test/resources/exampleFASTA.fasta -I testCRAMTools.cram | tail -n 1
* java -jar cramtools-2.1.jar bam --skip-md5-check -R public/gatk-utils/src/test/resources/exampleFASTA.fasta -I testGATK.cram | tail -n 1
2015-01-26 14:47:39 -03:00
Phillip Dexheimer 72f76add71 Added -trimAlternates argument to SelectVariants
* PT 84021222
 * -trimAlternates removes all unused alternate alleles from variants.  Note that this is pretty aggressive for monomorphic sites
2015-01-21 21:33:35 -05:00
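A minimal sketch of what trimming unused alternates means, assuming genotypes are stored as tuples of allele indices (0 for REF, None for a no-call). The function name and data layout are hypothetical and are not taken from SelectVariants.

```python
def trim_unused_alternates(alleles, genotypes):
    """Drop alternate alleles that no genotype refers to and remap allele indices.

    alleles:   list with the reference allele first, e.g. ["A", "C", "G"]
    genotypes: list of tuples of allele indices, None meaning no-call
    """
    used = {i for gt in genotypes for i in gt if i is not None}
    used.add(0)  # the reference allele is always kept
    kept = sorted(used)
    remap = {old: new for new, old in enumerate(kept)}
    new_alleles = [alleles[i] for i in kept]
    new_genotypes = [
        tuple(None if i is None else remap[i] for i in gt) for gt in genotypes
    ]
    return new_alleles, new_genotypes


print(trim_unused_alternates(["A", "C", "G", "T"], [(0, 2), (2, 2), (None, None)]))
# At a monomorphic site every alternate is unused, so all of them are removed.
print(trim_unused_alternates(["A", "C"], [(0, 0)]))
```

The second call shows why the commit calls this aggressive for monomorphic sites: only the reference allele survives.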
Ron Levine 804b2a36b7 Fix SplitNCigar reads exception by making the list of RNAReadTransformer non-abstract, add test for -fixNDN
Includes documentation changes for -fixNDN argument and the read transformer documentation.

Documentation changes to CombineVariants
2015-01-14 22:22:05 -05:00
rpoplin 0292d49842 Merge pull request #801 from broadinstitute/pd_gatkvcfconstants
Collected VCF IDs and header lines into one place
2015-01-14 09:43:48 -05:00
Phillip Dexheimer 6190d660e0 Edits to work with the latest htsjdk release:
* TextCigarCodec.decode() is now static, and the getSingleton() method is gone
 * MergingSamRecordIterator now wants a Collection<SamReader> rather than Collection<SAMFileReader> in the constructor
 * SeekableBufferedStream now correctly reads the requested number of bytes, removed workaround in GATKBAMIndex
2015-01-13 21:32:10 -05:00
Phillip Dexheimer b73e9d506a Added GATKVCFConstants and GATKVCFHeaderLines to consolidate the GATK-specific VCF annotations
* Removed unused annotations (CCC and HWP)
 * Renamed one of the two GC annotations to "IGC" (for Interval GC)
 * Revved picard & htsjdk (GATK constants are now removed from htsjdk)
 * PT 82046038
2015-01-13 21:32:09 -05:00
Laura Gauthier 6b2bd5ed09 Address user-reported bug featuring "trio" family with two children, one parent
Add test to cover case with family of one parent, two children
2015-01-13 18:35:44 -05:00
Ryan Poplin 2e5f9db758 Raising per-sample limits on the number of reads in ART and HC.
-- Active Region Traversal was using per sample limits on the number of reads that were too low, especially now that we are running one sample at a time. This caused issues with high confidence variants being dropped in high coverage data.
-- HaplotypeCallerGVCFIntegrationTest PL/annotation changes due to using more reads in those tests
-- Removed a CountReadsInActiveRegionsIntegrationTest test for excessive coverage because the read coverage no longer goes over the limits in ART
2015-01-09 11:21:42 -05:00
rpoplin 03203e249e Merge pull request #792 from broadinstitute/rhl_pairhmm_log_stderr
Rhl pairhmm log stderr
2015-01-07 12:41:10 -05:00
Valentin Ruano-Rubio aae04b6122 Fixes explicit limitation of the maximum ploidy of the reference-confidence model
Story:
=====

 - https://www.pivotaltracker.com/story/show/83803796

Changes:
=======

  - From a fixed-maximum-ploidy indel RCM likelihood cache to a
    dynamically resizable one.
  - Used the occasion to remove an unused and deprecated method from ReferenceConfidenceModel

Testing:
=======

  - Added integration test to check on ploidies larger than the previous limit of 20.
2015-01-07 10:43:22 -05:00
Ron Levine b4fda38922 Use logging system instead of stderr 2015-01-05 14:04:10 -05:00
Laura Gauthier 88b6f3aa50 Change []-type arrays to lists so argument parsing works in VCF header commandline output 2015-01-05 10:21:06 -05:00
rpoplin 3240b3538a Merge pull request #794 from broadinstitute/rhl_read_backed_phasing
Rhl read backed phasing
2015-01-05 09:47:25 -05:00
Ron Levine c6840124fe clean up, add final 2015-01-04 23:01:24 -05:00
Ron Levine 85dc703461 Add TestMergeIntoMNP() and TestReallyMergeIntoMNP() 2015-01-01 09:51:20 -05:00
Ron Levine bb94833750 Add more tests 2014-12-30 22:45:44 -05:00
Ron Levine 714d575e3b correct reference file name 2014-12-25 14:00:39 -05:00
Ron Levine a7fba5c209 restructure and add more tests 2014-12-25 13:57:54 -05:00
Ron Levine 64375f6341 Messages that were going to stdout now going to stderr
Make PairHMM outputs go to stderr instead of stdout

Change output from stdout to stderr in close()

Updated lib with output going to stderr
2014-12-23 11:03:29 -05:00
Ron Levine 069398ad46 Added more tests and documentation 2014-12-19 12:57:43 -05:00
Laura Gauthier a9694951d2 Add error handling for genotypes that are called but have no PLs 2014-12-18 15:03:20 -05:00
Geraldine Van der Auwera b0e615251b Updated VQSR tool docs 2014-12-18 12:59:37 -05:00
rpoplin 4a2ac38308 Merge pull request #790 from broadinstitute/rp_nsubtil_fix-snp-detection
BQSR bug fix from @nsubtil
2014-12-18 09:19:53 -05:00
Ron Levine 08790e1dab Fix multiallelic info field annotation for VariantAnnotator
Add multi-allele test for info field annotations

Fix to process all types of INFO annotations

roll back to previous version, removes INFO and FORMAT

Correct @return for VariantAnnotatorEngine.getNonReferenceAlleles()

Enhance comments and clean up multi-allelic logic, handle header info number = R

only parse counts of A & R

Add INFO for AC

update MD5

Performance enhancement, only parse multiallelic with a count A or R

Make argument final in getNonReferenceAlleles()

Code cleanup, add exceptions for bad expression/allele size mismatch and missing header info for an expression

Change exception to warning for expression value/number of alleles check

remove advertised exceptions
2014-12-17 22:21:00 -05:00
Ron Levine ba949389c5 matchHaplotypeAlleles() no longer calls alleleSegregationIsKnown(), added a TODO to investigate 2014-12-17 14:02:24 -05:00
Ryan Poplin d84970ff75 BQSR bug fix from @nsubtil
-- Ignore SNP matches that lie outside the clipped read window
-- This fixes an issue where GATK would skip the entire read if a SNP is entirely
contained within a sequencing adapter.
2014-12-17 10:04:37 -05:00
Ron Levine 56f8e4f9cf Add comments, alleleSegregationIsKnown() check is added to matchHaplotypeAlleles() 2014-12-17 03:25:26 -05:00
Laura Gauthier 011843c569 Fixed huge bug from 9895005a (CombineGVCFs used to stop after the first contig) 2014-12-16 12:43:32 -05:00
rpoplin bcc6b73e9b Merge pull request #786 from broadinstitute/pd_variantstotable_sma
Fix VariantsToTable output of FORMAT record lists when -SMA is specified
2014-12-16 10:37:22 -05:00
Valentin Ruano-Rubio 736a857e82 Fixing CombineGVCFs that writes out the wrong REF allele
Story:
=====

  - https://www.pivotaltracker.com/story/show/83259038

Changes:
=======

  - Made minimal changes to make the fix after an arduous attempt to understand
    CombineGVCFs code.

Test:
====

  - Added an integration test to explicitly test for the bug.

  - Updated an md5, as the bug was actually affecting one of the existing
    integration tests.
2014-12-13 22:38:24 -05:00
Phillip Dexheimer 71bdfbe465 Fix VariantsToTable output of FORMAT record lists when -SMA is specified
* PT 84242218
 * Note that FORMAT fields behave the same as INFO fields - if the annotation has a count of A (one entry per Alt Allele), it is split across the multiple output lines.  Otherwise, the entire list is output with each field
2014-12-10 21:41:15 -05:00
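The splitting rule in the note above can be sketched as below. This is an illustration only: the function name, the dictionary layout, and the use of header count strings such as "A" and "R" are assumptions, not the VariantsToTable implementation.

```python
def split_multiallelic(alts, fields, counts):
    """Produce one output row per alternate allele.

    alts:   list of alternate alleles
    fields: mapping of annotation name to its list of values
    counts: mapping of annotation name to its header count ("A", "R", "1", ...)
    """
    rows = []
    for idx, alt in enumerate(alts):
        row = {"ALT": alt}
        for name, values in fields.items():
            if counts.get(name) == "A":
                # One entry per alt allele: each row gets only its own value.
                row[name] = str(values[idx])
            else:
                # Any other count: the whole list is repeated on every row.
                row[name] = ",".join(str(v) for v in values)
        rows.append(row)
    return rows


rows = split_multiallelic(
    ["C", "T"],
    {"AF": [0.1, 0.2], "AD": [10, 3, 4]},
    {"AF": "A", "AD": "R"},
)
for row in rows:
    print(row)
```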
rpoplin bf2911d62c Merge pull request #783 from broadinstitute/pd_splitsamfile
Fix NPE in SplitSamFile
2014-12-08 09:39:03 -05:00
Valentin Ruano-Rubio 385186e11b Makes GQ of Hom-Ref Blocks in GVCF output to be consistent with PLs
Story:
-----

  - https://www.pivotaltracker.com/story/show/83800586

Changes:
-------

  - In GVCFWriter, GQ is now recalculated from the final PL array for the block.

Testing:
-------

  - Updated affected integration test md5s
2014-12-07 16:45:32 -05:00
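A sketch of recalculating GQ from a PL array, under an assumption the commit message does not spell out: that GQ is the difference between the second-lowest and the lowest PL, capped at 99. The function name and the cap constant are hypothetical.

```python
MAX_GQ = 99  # assumed cap on reported genotype quality


def gq_from_pls(pls):
    """Derive GQ from phred-scaled genotype likelihoods (assumed convention:
    second-lowest PL minus lowest PL, capped at MAX_GQ)."""
    if len(pls) < 2:
        raise ValueError("need at least two PLs to compute a GQ")
    best, second = sorted(pls)[:2]
    return min(MAX_GQ, second - best)


# PLs of a typical hom-ref block record, e.g. PL=0,120,1800.
print(gq_from_pls([0, 120, 1800]))
```

Deriving GQ this way keeps the reported GQ and PL fields of a block consistent by construction, since one is computed from the other.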
Phillip Dexheimer a5dee8a42e Fix NPE in SplitSamFile
* PT 82892316
  * Added integration test
  * Fixed similar error in debug output of HC
2014-12-07 10:37:30 -05:00
Ron Levine c9175eeee8 Renamed PhasingUtilitiesUnitTest to PhasingUtilsUnitTest 2014-12-02 18:20:12 -05:00
Ron Levine b8f0f3fdd2 Add argument for loading the vector HMM library once 2014-12-02 10:13:56 -05:00
Ron Levine 386aeda022 Add HaplotypeCaller argument so integration tests can specify the hardware dependent PairHMM sub-implementation 2014-11-25 21:53:53 -05:00
Ron Levine 34241a62f6 Use a publicly accessible sequence file 2014-11-24 11:18:21 -05:00
Ron Levine 6ff698c556 Added HP and non-HP tests for matchHaplotypeAlleles(), added a nominal test for mergeIntoMNPvalidationCheck() 2014-11-24 11:08:04 -05:00
Ron Levine 61e1a3ecd1 Added the framework for testing the PhasingUtils methods matchHaplotypeAlleles() and reallyMergeIntoMNP() 2014-11-22 22:01:39 -05:00
Menachem Fromer 9b73c8a841 Fix MNP merging bugs 2014-11-21 06:42:51 -05:00
rpoplin 00027e1555 Merge pull request #774 from broadinstitute/ldg_makeSelectVariantsTrimAlleles
Add -trim argument to SelectVariants to trim alleles to minimal represen...
2014-11-13 13:58:13 -05:00
Ron Levine 67656bab23 Resolved conflict during rebasing
Add more logging to annotators, change loggers from info to warn

Add comments to testStrandBiasBySample()

Clarify comments in testStrandBiasBySample

remove logic for not processing an indel if strand bias (SB) was not computed

remove per variant warnings in annotate()

Log warnings if using the wrong annotator or missing a pedigree file

Log test failures once in annotate(), because HaplotypeCaller does not call initialize(). Avoid using exceptions

Fix so we only log once in annotate(); Hardy-Weinberg does not require pedigree files; fix test MD5s so they pass

Check if founderIds == null

Update MD5s from HaplotypeCaller integration tests and clean up code

Change logic so SnpEff does not throw exceptions, change engine to utils in imports

Update test MD5s, return immediately if cannot annotate in SnpEff.initialization()

Post peer review, add more logging warnings

Update MD5 for testHaplotypeCallerMultiSampleComplex1, return null if PossibleDeNovo.annotate() is not called by VariantAnnotator
2014-11-12 02:45:49 -05:00
Laura Gauthier 783a4fd651 Change default behavior of SelectVariants to trim remaining alleles when samples are subset. -noTrim argument preserves original alleles. Add test for trimming. 2014-11-11 16:32:25 -05:00
Valentin Ruano-Rubio c5977e5c8f Correct wrong left-alignment of reads in HC bamout
Story:
-----

  https://www.pivotaltracker.com/story/show/80684230

Changes:
-------

  - Corrected the bug: AlignmentUtils#createReadAlignedToRef was
    not realigning against the reference but the best haplotype for
    the read.

Test:
----

  - Added integration test in HaplotypeCallerIntegrationTest to check
    that the bug has been fixed.
  - Fixed md5s modified by this change; these are caused by small
    changes in the state of the random-number generator and in read vs
    variant site overlap.
2014-11-10 10:09:58 -05:00
Laura Gauthier c09667a20d Fix bug in CombineGVCFs so now sample 2 variants occurring within sample 1 deletions get merged properly.
CombineGVCFs now outputs ref conf for the duration of deletions so that SNPs occurring in other samples aligned with those deletions will be genotyped correctly
2014-11-05 09:11:47 -05:00
Khalid Shakir 0092a0b9eb Faster builds, with updates to documentation generation.
Reading the multiple GATKText files as a single stream, especially with new top level target executable jar files pointing to a lib folder.
Don't dirty the build with a new GATKText.properties if input files are unmodified.
Stop warning on undocumented abstract classes.
Fixed ClassNotFoundException/NoClassDefFoundError by fixing ResourceBundleExtractorDoclet artifact.
Excluding Exceptions from documentation.
Removed custom log4j dependency from ResourceBundleExtractorDoclet.
Stop generating the dependency reduced pom during shade.
Stop regenerating gsalib when the files are already up to date.
Disabled mvn site generation from external-example.
2014-11-05 00:32:23 +08:00
Khalid Shakir 1cb4b99548 Added faster built executable, non-packaged jars.
Moved top level target symlinks to package jar files to under target/package.
Executable jar files are placed under target/executable with the new target[/lib] directories.
Under top level target, symlinks to *either* the package *or* the executable jars replace what was a symlink to the package jar path.
Allow disabling of the shade package.
ant-bridge.sh by default only builds executable jars, and doesn't package by default, as did the old ant build.xml.
Added a new package_path.sh utility script for other scripts to use instead of anything in the target folder.
2014-11-05 00:30:46 +08:00
Phillip Dexheimer 10f99cbe04 Added StrandAlleleCountsBySample annotation
This annotation outputs the number of reads supporting each allele, stratified by sample and read strand.
Addresses PT 76958712
2014-11-03 21:35:58 -05:00
Khalid Shakir 8b81031bf8 Disabling tests for Lsf706 specific functionality. 2014-11-04 01:31:18 +08:00
Phillip Dexheimer bcfd9ce19a Moved platform flow information into NGSPlatform
* Explicitly added a type for rarely used platforms
 * PT 81767718
2014-10-31 22:27:34 -04:00
rpoplin c84805c402 Merge pull request #768 from broadinstitute/pd_bcf_failures
Fix BCF writing when FORMAT annotations contain arrays
2014-10-31 15:30:56 -04:00
rpoplin eecb56e0ae Merge pull request #766 from broadinstitute/ldg_StrandBiasForMultiallelics
Calculate StrandBiasBySample using all alternate alleles as ref vs. any ...
2014-10-31 15:26:07 -04:00
Phillip Dexheimer fc67e50faa Revved Picard/htsjdk
Removed inefficient array->List conversion in AlleleCountBySample
2014-10-30 21:16:25 -04:00
Laura Gauthier bc7202fff7 Calculate StrandBiasBySample using all alternate alleles as ref vs. any alt 2014-10-30 11:52:06 -04:00
Khalid Shakir 5c9fe1a06d Split all imports of tools|engine from utils, and all tools from engine.
Second of two commits, modifying actual files.
2014-10-24 20:59:46 +08:00
Khalid Shakir bb7151192a Split all imports of tools|engine from utils, and all tools from engine.
First of two commits, renaming files only.
2014-10-24 20:59:45 +08:00
Geraldine Van der Auwera b69b256003 Update pom versions to mark the start of GATK 3.4 development 2014-10-23 22:31:44 -04:00
Geraldine Van der Auwera eee94ec81f Update pom versions for the 3.3 release 2014-10-23 22:25:17 -04:00