13483 Commits (c191103326d2f515a4ec08033eb4d0463affafdb)

Author SHA1 Message Date
Eric Banks 09d2415bea Merge pull request #541 from broadinstitute/eb_HC_sensitivity
Added code to retrieve dangling heads from the read threading graph (pre...
2014-03-03 23:56:06 -05:00
Eric Banks b99bf85ec8 Fixed bug where dangling tail merging occasionally created a cycle in the graph.
Added unit tests to cover this case.  Delivers PT#66690470.
2014-03-03 22:42:56 -05:00
Eric Banks 4d69af189e Minor change: make the --dontUseSoftClippedBases @Advanced instead of @Hidden 2014-03-03 15:59:32 -05:00
Eric Banks fa65716fe9 Added code to retrieve dangling heads from the read threading graph (previously we were rescuing just the tails).
The purpose of this is to be able to call SNPs that fall at the beginning of a capture region (or exon).
Before, the read threading code would only start threading from the first kmer that matched the reference.  But
that means that, in the case of a SNP at the beginning of an exon, it wouldn't start threading the read until
after the SNP position - so we'd lose the SNP.

For now, this is still very experimental.  It works well for RNAseq data, but does introduce FPs in normal exomes.
I know why this is and how to fix it, but it requires a much larger fix to the HC: the HC needs to pass all reads
and bases to the annotation engine (like UG does) instead of just the high quality ones.  So for now, the head
merging is disabled by default.

As per reviewer comments, I moved the head and tail merging code out into their own class.
2014-03-03 15:59:26 -05:00
amilev cecdd2f2c5 Merge pull request #539 from broadinstitute/eb_hard_clip_exon_overhangs_for_ami
Add the capability to the N-cigar splitter to also hard-clip off overhan...
2014-03-03 12:23:11 -05:00
Karthik Gururaj a893765ae2 Added license to Makefile 2014-03-03 09:11:02 -08:00
Karthik Gururaj 7cd23543a1 Added public license text to all C++ files 2014-03-03 09:04:00 -08:00
MauricioCarneiro f7d10b9781 Merge pull request #544 from broadinstitute/eb_archive_reduce_reads
Moving Reduce Reads to the archive.
2014-03-03 11:29:12 +09:00
Eric Banks 6c872308d8 Add the capability to the N-cigar splitter to also hard-clip off overhangs based on observed split positions.
We use a "manager" to keep track of observed splits and previous reads.  This can be extended/modified in the
future to try to salvage those overhangs instead of hard-clipping them and/or try other possible strategies.

Added unit tests and more integration tests.
2014-03-02 21:10:34 -05:00
Eric Banks 22ad18b919 Moving Reduce Reads to the archive.
The GATK now fails with a user error if you try to run with a reduced bam.
(I added a unit test for that; everything else here is just the removal of all traces of RR)
2014-03-02 02:03:14 -05:00
Eric Banks 293234a8dc Merge pull request #540 from broadinstitute/eb_add_ability_to_ignore_individual_filters
Add an option to AssessNA12878 to be able to ignore one or more specific...
2014-03-01 22:27:10 -05:00
Eric Banks db85dc6fc0 Add an option to AssessNA12878 to be able to ignore one or more specific filters (instead of either all or none).
Useful in conjunction with ROCCurveNA12878 in determining a good VQSR cut.
2014-03-01 22:25:46 -05:00
kshakir e16996d881 Merge pull request #543 from broadinstitute/ks_mvn_gc_config
Attempting to limit GC during Maven tests
2014-03-01 23:09:21 +07:00
Khalid Shakir 387188e5bb Attempting to limit gc during Maven tests, using defaults found in JavaCommandLineFunction 2014-03-01 15:24:45 +08:00
Karthik Gururaj 1b395a871a 1. Changed logger.info to logger.warn in PairHMMLikelihoodCalculationEngine.java
2. Committing the right set of files after rebase
2014-02-28 16:08:28 -08:00
Karthik Gururaj 37526dfad5 1. Added a catch for UnsatisfiedLinkError in
PairHMMLikelihoodCalculationEngine.java to fall back to LOGLESS_CACHING
in case the native library could not be loaded. Made
VECTOR_LOGLESS_CACHING the default implementation.
2. Updated the README with Mauricio's comments
3. baseline.cc is used within the library - if the machine supports
neither AVX nor SSE4.1, the native library falls back to un-vectorized
C++ in baseline.cc.
4. pairhmm-1-base.cc: This is not part of the library, but is being
heavily used for debugging/profiling. Can I request that we keep it
there for now? In the next release, we can delete it from the
repository.
5. I agree with Mauricio about the ifdefs. I am sure you already know,
but just to reassure you the debug code is not compiled into the library
(because of the ifdefs) and will not affect performance.
2014-02-28 08:59:55 -08:00
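
The fallback described in item 1 of the commit above might look roughly like the sketch below. This is not the code from PairHMMLikelihoodCalculationEngine.java: the class name, the `choose` method, and the library name passed to `System.loadLibrary` are all hypothetical, and only the two implementation names and the use of UnsatisfiedLinkError come from the commit message.

```java
// Hypothetical sketch: fall back to a pure-Java implementation when the
// native library cannot be loaded. Names other than the two enum constants
// and UnsatisfiedLinkError are assumptions made for illustration.
final class PairHmmFallbackSketch {

    enum Implementation { VECTOR_LOGLESS_CACHING, LOGLESS_CACHING }

    // Tries to load the native code for the vectorized implementation.
    // If loading fails with UnsatisfiedLinkError, warns and falls back.
    static Implementation choose(Implementation requested, Runnable nativeLoader) {
        if (requested != Implementation.VECTOR_LOGLESS_CACHING) {
            return requested; // nothing native to load
        }
        try {
            nativeLoader.run();
            return Implementation.VECTOR_LOGLESS_CACHING;
        } catch (UnsatisfiedLinkError e) {
            System.err.println("WARN: native library not loaded (" + e.getMessage()
                    + "); falling back to LOGLESS_CACHING");
            return Implementation.LOGLESS_CACHING;
        }
    }

    public static void main(String[] args) {
        // "hypothetical_vector_pairhmm" is a made-up library name.
        Implementation chosen = choose(
                Implementation.VECTOR_LOGLESS_CACHING,
                () -> System.loadLibrary("hypothetical_vector_pairhmm"));
        System.out.println("Using " + chosen);
    }
}
```

Passing the loader in as a Runnable keeps the fallback logic testable without a real native library.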
cwhelan 523eeecc15 Merge pull request #537 from broadinstitute/cw_duplicatevcfcheck_66084436
Added command line checks for duplicate files in ROD lists
2014-02-27 13:39:09 -05:00
Chris Whelan e61ba8b340 Added command line checks for duplicate files in ROD lists
-- Keep a list of processed files in ArgumentTypeDescriptor.getRodBindingsCollection
  -- Throw user exception if a file name duplicates one that was previously parsed
  -- Throw user exception if the ROD list is empty
  -- Added two unit tests to RodBindingCollectionUnitTest
2014-02-27 13:32:18 -05:00
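
The checks listed in the commit above could be sketched as follows. This is an illustration only, not the body of ArgumentTypeDescriptor.getRodBindingsCollection: the class and method names are hypothetical, and IllegalArgumentException stands in for whatever user exception type the GATK actually throws.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the two checks: reject an empty ROD list and
// reject a file name that duplicates one already processed.
final class RodListCheckSketch {

    static List<String> validate(List<String> fileNames) {
        if (fileNames.isEmpty()) {
            // Stand-in for a user exception.
            throw new IllegalArgumentException("The ROD list is empty");
        }
        // Keeps the files processed so far, in input order.
        Set<String> processed = new LinkedHashSet<>();
        for (String name : fileNames) {
            if (!processed.add(name)) {
                throw new IllegalArgumentException("Duplicate file in ROD list: " + name);
            }
        }
        return new ArrayList<>(processed);
    }

    public static void main(String[] args) {
        System.out.println(validate(List.of("a.vcf", "b.vcf")));
        try {
            validate(List.of("a.vcf", "b.vcf", "a.vcf"));
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

The sketch compares names as plain strings; whether the real check normalizes paths first is not stated in the commit message.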
Karthik Gururaj 2d0ce45bb0 Moved JNI_README 2014-02-27 10:12:23 -08:00
Eric Banks 4395d25726 Merge pull request #538 from broadinstitute/ks_integration_test_fix
Fixes test counts, and full paths of diff commands
2014-02-26 21:04:06 -05:00
Karthik Gururaj c645725fc3 1. Renamed directory structure from public/c++/VectorPairHMM to
public/VectorPairHMM/src/main/c++ as per Khalid's suggestion
2. Use java.home in public/VectorPairHMM/pom.xml to pass environment
variable JRE_HOME to the make process. This is needed because the
Makefile needs to compile JNI code with the flag -I<JRE_HOME>/../include (among
others). Assuming that the Maven build process uses a JDK (and not just
a JRE), the variable java.home points to the JRE inside maven.
3. Dropped all pretense at cross-platform compatibility. Removed Mac
profile from pom.xml for VectorPairHMM
2014-02-26 15:17:15 -08:00
Karthik Gururaj bd71ba35e5 Moved pom.xml to VectorPairHMM and updated artifactId 2014-02-26 14:01:46 -08:00
Khalid Shakir da587d48ed Using absolute paths in generated diff commands, to ease running them from any directory. 2014-02-27 04:43:39 +08:00
Khalid Shakir c163e6d0d2 Separate failsafe directories for each of the integration test types [#66515572] 2014-02-27 04:43:39 +08:00
Karthik Gururaj da23f2020a Merge branch 'intel_pairhmm' of /data/broad/gsa-unstable into intel_pairhmm 2014-02-26 11:48:27 -08:00
Karthik Gururaj b81e2c2948 Native library part of git repo 2014-02-26 11:47:42 -08:00
Karthik Gururaj 0fe843bfd9 Followed Khalid's suggestion for packing libVectorLoglessCaching into
the jar file with Maven
2014-02-26 11:47:42 -08:00
Karthik Gururaj 15fe244e4b Now has PAPI values 2014-02-26 11:47:42 -08:00
Intel Repocontact e32e9e6af6 Merge branch 'master' of github.com:broadinstitute/gsa-unstable 2014-02-26 11:47:01 -08:00
Karthik Gururaj 53d5bc93b2 Native library part of git repo 2014-02-26 11:44:28 -08:00
Karthik Gururaj 0d5627c2f1 Followed Khalid's suggestion for packing libVectorLoglessCaching into
the jar file with Maven
2014-02-26 10:53:51 -08:00
Eric Banks 84d8b0e9a1 Merge pull request #535 from broadinstitute/ks_pd_queuelogdir_gatherbam_patches
Ks pd queuelogdir gatherbam patches
2014-02-26 08:55:10 -05:00
Karthik Gururaj ac1cefce29 Merge branch 'intel_pairhmm' of /data/broad/gsa-unstable into intel_pairhmm
After rebase into local repo

Conflicts:
	protected/java/src/org/broadinstitute/sting/gatk/walkers/haplotypecaller/HaplotypeCaller.java
2014-02-25 21:51:57 -08:00
Karthik Gururaj a058e96c34 Now has PAPI values 2014-02-25 21:44:20 -08:00
Intel Repocontact ff2a972ab5 Merge branch 'master' of github.com:broadinstitute/gsa-unstable
Conflicts:
	.gitignore
2014-02-25 20:56:28 -08:00
Khalid Shakir f02ce6eca7 Added tests for cleaning up scattered .bai files, and using the log directory.
Re-added import java.io.File for BamGatherFunction.
Other cleanup to resolve scala syntax warnings from intellij.
Moved Example UG script from protected to public.
2014-02-26 02:11:28 +08:00
pdexheimer 0405afeab2 Inherit BamGatherFunction from MergeSamFiles rather than PicardBamFunction
- This change means that BamGatherFunction will now have an @Output field for the BAM index, which will allow the bai to be deleted for intermediate functions

Signed-off-by: Khalid Shakir <kshakir@broadinstitute.org>
2014-02-26 02:11:28 +08:00
pdexheimer 504c125c26 Ensure .out files are saved into logDirectory
Signed-off-by: Khalid Shakir <kshakir@broadinstitute.org>
2014-02-26 02:11:28 +08:00
pdexheimer 51dcd364a5 Added logDirectory argument
Signed-off-by: Khalid Shakir <kshakir@broadinstitute.org>
2014-02-26 02:11:28 +08:00
kshakir e340b6237a Merge pull request #534 from broadinstitute/ks_queue_private_link_qscripts
Added missing private qscripts symbolic links to queue-private module.
2014-02-25 06:16:50 -05:00
Khalid Shakir a90745bbe5 Added missing private qscripts symbolic links to queue-private module. 2014-02-25 17:46:47 +08:00
Eric Banks b1885d449b Merge pull request #533 from broadinstitute/eb_normalize_FS_contingency_table
Stopgap procedure to rescue Fisher Strand for cases where there's lots of data.
2014-02-25 02:18:01 -05:00
Eric Banks 0f30df0356 Stopgap procedure to rescue Fisher Strand for cases where there's lots of data.
This commit consists of 2 main changes:
1. When the strand table gets too large, we normalize it down to values that are more reasonable.
2. We don't include a particular sample's contribution unless the total ref and alt counts are at least 2 each;
this is a heuristic method for dealing only with hets.

MD5s change as expected.
Hopefully we'll have a more robust implementation for GATK 3.1.
2014-02-25 01:04:27 -05:00
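
A minimal sketch of the two changes in the commit above, under stated assumptions. The commit message does not say how large "too large" is or how the table is scaled, so the target size parameter, the "more than twice the target" trigger, and the integer-truncating scaling rule are all guesses made for illustration. Only the minimum count of 2 for ref and alt comes from the message. The table layout (row 0 = ref, row 1 = alt, columns = forward/reverse strand) is also an assumption.

```java
// Hypothetical sketch of the Fisher Strand stopgap; not the GATK code.
final class StrandTableSketch {

    // From the commit message: ref and alt totals must be at least 2 each.
    static final int MIN_COUNT = 2;

    // True if this sample's 2x2 table should contribute to the pooled table.
    static boolean sampleQualifies(int[][] table) {
        int ref = table[0][0] + table[0][1];
        int alt = table[1][0] + table[1][1];
        return ref >= MIN_COUNT && alt >= MIN_COUNT;
    }

    // Scales a large table down so its entries sum to roughly targetSize.
    // Tables whose sum is at most 2 * targetSize are copied unchanged
    // (assumed trigger; the commit does not specify one).
    static int[][] normalize(int[][] table, int targetSize) {
        long sum = 0;
        for (int[] row : table) {
            for (int v : row) {
                sum += v;
            }
        }
        int[][] out = new int[2][2];
        double factor = sum <= 2L * targetSize ? 1.0 : (double) sum / targetSize;
        for (int i = 0; i < 2; i++) {
            for (int j = 0; j < 2; j++) {
                out[i][j] = (int) (table[i][j] / factor); // truncates toward zero
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[][] big = {{1000, 1000}, {1000, 1000}};
        int[][] scaled = normalize(big, 200);
        System.out.println(java.util.Arrays.deepToString(scaled));
        System.out.println(sampleQualifies(new int[][] {{10, 0}, {1, 0}}));
    }
}
```

Scaling preserves the proportions of the table while keeping the counts small enough for the exact test to stay tractable, which appears to be the intent of "values that are more reasonable".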
droazen e8ea9f58d3 Merge pull request #531 from broadinstitute/ks_build_patches
Build patches
2014-02-24 15:13:16 -05:00
Valentin Ruano Rubio 1c7eac50fc Merge pull request #532 from broadinstitute/vrr_graphbased_there_is_no_such_edge_fix
Fix for a bug in (Assembly Graph) Routes.
2014-02-24 12:08:47 -05:00
Valentin Ruano-Rubio 0b3a70b8c1 Fix for a bug in (Assembly Graph) Routes.
The slicePrefix method functionality was broken.

Story:

https://www.pivotaltracker.com/story/show/64595624

Changes:

1. Fixed the bug.
2. Added unit test to check on the method functionality.
3. Added an integration test to verify that the bug has been fixed in a reproducible case with empirical data.
2014-02-24 10:54:39 -05:00
Khalid Shakir 7e516b294f Replaced local drmaa and Jama artifacts with versions from maven central.
Removed unused caliper binary from local repo.
2014-02-22 01:21:35 +08:00
Khalid Shakir 9b7fc37b14 Moved private/scala/test to private/queue-private/src/test/scala
Added junction/symbolic links so that queue-private tests will run.
2014-02-22 01:21:35 +08:00
Khalid Shakir a75043b207 When git describe fails use "exported" instead of "unknown". 2014-02-22 01:21:35 +08:00
Khalid Shakir 4670c87313 Fixed mvn run for packagetests over external-example. 2014-02-22 01:21:34 +08:00