Guillermo del Angel
a2d90a3590
Bug fix: reverted logic so that default behavior skips over sample lookup
2011-07-20 10:23:10 -04:00
Guillermo del Angel
e8409c80fa
Further protection vs null pointers in PrintReadsWalker
2011-07-19 21:59:24 -04:00
Guillermo del Angel
fb2d475c22
Bug fix to prevent null pointer
2011-07-19 20:13:56 -04:00
Guillermo del Angel
6181d1e4cb
Fixed integration test for VariantsToTable: now the * in REF column is not output
2011-07-19 14:42:11 -04:00
Guillermo del Angel
e6d306458c
Merge bug fixes
2011-07-19 14:36:20 -04:00
Guillermo del Angel
989dd17f95
a) Add ability in PrintReads to specify a sample file to easily subset samples, useful for IGV visualization, b) VariantsToTable is more R-friendly with Indels when printing ref/alt columns, c) Changes to SelectVariants ability to specify a mask to randomly sample from a given AF distribution
2011-07-19 14:29:07 -04:00
Matt Hanna
005adf377f
Derive MEDIAN_INSERT_SIZE plot from base plot with additional faceting.
2011-07-19 10:48:45 -04:00
Matt Hanna
9a1394d7e7
Clean up MEDIAN_INSERT_SIZE plot for consistency with other plots.
2011-07-19 10:34:50 -04:00
Matt Hanna
5d3112c665
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-19 09:32:01 -04:00
Matt Hanna
0cec2c6759
When sorting samples by date, only use filtered samples to avoid discontinuities in the plot. Add brief documentation for running the R script.
2011-07-19 09:28:51 -04:00
Mauricio Carneiro
9ad5c7dfa4
Resolving simple conflicts in the data processing pipeline.
Conflicts:
public/scala/qscript/org/broadinstitute/sting/queue/qscripts/DataProcessingPipeline.scala
2011-07-19 08:05:11 -04:00
Mauricio Carneiro
7688bda1a6
better progress report for the DPP
2011-07-18 23:39:47 -04:00
Mauricio Carneiro
2b465ab43b
* added optional 'no validation' for the Data Processing pipeline.
* some simplifications on the picard classes
2011-07-18 23:30:31 -04:00
Mauricio Carneiro
4cf7a2af23
Removed Broad-specific default paths so people from outside the Broad can use it.
2011-07-18 23:25:21 -04:00
Khalid Shakir
9b446020f9
Using picard implementations for accessing aggregation directories.
Added more utilities to PicardPrivate.
Revved picard.
2011-07-18 21:49:03 -04:00
Matt Hanna
0ef37979cc
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-18 21:30:51 -04:00
Matt Hanna
d5d107856c
Subselect based on bait set.
2011-07-18 18:42:21 -04:00
Mauricio Carneiro
1837da37f6
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-18 17:59:26 -04:00
Mauricio Carneiro
916c0c9489
some quick & dirty debug info for the replication validation walker.
2011-07-18 17:57:12 -04:00
Matt Hanna
044f5faa4d
Support for numeric columns.
2011-07-18 17:44:49 -04:00
Matt Hanna
9729d61e2d
Use geom_text() instead of geom_point() when outputting data for new project only.
2011-07-18 17:29:00 -04:00
Mauricio Carneiro
f1e3c3356b
Merge branch 'rbam'
2011-07-18 17:26:07 -04:00
Mauricio Carneiro
c618a5b54c
commented out wrong MD5s
2011-07-18 17:25:45 -04:00
Mauricio Carneiro
a9f956c80c
Fixed several bugs in the pooled caller. Creating a good dataset to test its accuracy now.
2011-07-18 16:04:11 -04:00
Mark DePristo
4e78f0b064
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-18 15:45:23 -04:00
Mark DePristo
8f0badc52b
Updating md5s, as the diffobjects walker now emits the summary in reverse order.
2011-07-18 15:44:21 -04:00
Mark DePristo
c05451047c
Support for multiple records at the same site. The first record gets chr:start, and subsequent records get chr:start_2, chr:start_3, etc.
2011-07-18 15:43:52 -04:00
Mark DePristo
782a05e9b5
Support for sorting the diff output in reverse order.
2011-07-18 15:43:01 -04:00
Mark DePristo
45702d3084
Now supports a mode where the primary key isn't sorted. In this case the records are displayed in the order in which they are added to the table.
2011-07-18 15:40:15 -04:00
Matt Hanna
15b44ac2c3
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-18 14:56:41 -04:00
Matt Hanna
e5e7523f8b
Modify to support either bam list format files or tsv formatted files. The latter provide a major advantage when dealing with samples with spaces in the names.
2011-07-18 14:56:00 -04:00
Matt Hanna
adce37774a
Add functionality for tsv output.
2011-07-18 14:12:01 -04:00
Eric Banks
6d5e87da10
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-18 13:59:10 -04:00
Eric Banks
83ba2c066a
Making it deterministic
2011-07-18 13:59:02 -04:00
Eric Banks
92fa410450
Check that it's a valid bam file before parsing or bad things can happen
2011-07-18 13:43:34 -04:00
Eric Banks
80b5c5261a
CombineVariants no longer combines records of different types. So now when combining SNP and indel callsets, overlapping calls get their own records. Useful for Khalid in the pipeline. For those interested, it turns out the previous behavior was doing the wrong thing occasionally (and this was even captured in the integration tests).
2011-07-18 13:42:45 -04:00
Menachem Fromer
4adead3099
Fixed import conflict
2011-07-18 13:23:20 -04:00
Menachem Fromer
d8ba4ab835
Only maintain an unbroken haplotype chain if the current is phased relative to previous (by RBP), or both previous and current are parentally phased
2011-07-18 13:14:39 -04:00
Eric Banks
bc8b5da698
Added docs while I was reading through the code to understand it
2011-07-18 12:25:54 -04:00
Mauricio Carneiro
5493a4dd99
Added annotations to filter out:
* unmapped reads
* failed vendor quality reads
* duplicate reads
* not primary alignment reads
2011-07-18 12:06:08 -04:00
Matt Hanna
d8517a000a
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-18 11:07:18 -04:00
Matt Hanna
f15357c2e1
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-18 10:52:31 -04:00
Matt Hanna
95c776bf59
Updated documentation.
2011-07-18 10:52:06 -04:00
Matt Hanna
cb9bef6847
Updated documentation.
2011-07-18 10:51:22 -04:00
Mark DePristo
51b0dd01c3
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-18 10:47:29 -04:00
Mark DePristo
449bf1b539
Testdata for diffObjects.
PipelineTest updated to point to MD5DB.java
2011-07-18 10:47:03 -04:00
Mark DePristo
d6e2e89f99
Walker test system refactoring. All MD5DB related functions are now in MD5DB.java.
The system has the concept of a local and a global MD5 db. The local one operates as it did previously. The global one lives in /humgen/gsa-hpprojects/GATK/data/integrationtests. If the system can find this directory then MD5s will also be read / written to this location. This means that gsabamboo will print differences as appropriate. And all users will in effect have access to a complete history of MD5 file results.
A few minor code reshuffles changed VariantRecalibration and VCFHeader test files.
2011-07-18 10:46:01 -04:00
Mark DePristo
6f26c07b85
Removed the SpecificDifference class. Now Difference classes always have the option to remember specific master and test values. This means that all summarized differences carry with them specific examples of their differences. Consequently, now even summarized differences give at least one example of the specific difference, even when the count of the difference is > 1. Unit tests updated. Added DiffObjects integrationtest. VCFDiffableReader now specifically reads the first line of the VCF file to capture the version number.
2011-07-18 10:42:35 -04:00
Matt Hanna
1f538d2add
Place the preQC database in /humgen/gsa-scr1/GATK_Data.
Rework the way data outside the center 95% is trimmed out.
Cleanup some documentation.
2011-07-18 10:33:57 -04:00
Mark DePristo
837a91b85d
No more ls to stdout unless verbose is true [manageGATKS3Logs.py]
Fully qualified paths now work properly. Moved script into git [downloadGATKReportsFromS3.csh]
Correct path to files in runGATKReport.csh
2011-07-18 08:31:08 -04:00