droazen
3f974c62e6
Reorganized init() to check for RODs (reference / truth)
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6050 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:54:48 +00:00
droazen
6f5a08ddc6
Simple walker to look at SNPs near indels. Didn't need to make this a walker and commit it, but used it as an opportunity to play with GIT in unstable.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6049 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:54:44 +00:00
droazen
29a0e08aa2
Testing bug fix process #3 (changes are irrelevant)
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6048 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:54:40 +00:00
droazen
e148a75c32
Testing the 'bug fix' process #2 (changes are irrelevant)
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6047 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:54:37 +00:00
droazen
2e3d6754cd
First implementation of the Error Model.
...
Added stratification by lane to ReadBackedPileup.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6046 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:54:33 +00:00
droazen
27b1418b84
PSP2 output much better. Good masking of repetitive regions. Flagging of invalid amplicons rather than omission of them, reasons properly given. Kiran doesn't like the trailing comma, but the trailing comma also doesn't like Kiran.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6045 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:54:27 +00:00
droazen
f7fa373643
Incorporate lists of fingerprint data rather than summaries.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6044 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:54:23 +00:00
droazen
9a00d81d57
Is git commit -a different than git commit?
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6043 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:54:19 +00:00
droazen
84dd72e6cb
Adding in some read filters, updating MathUtils.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6042 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:54:15 +00:00
droazen
e0d203434f
Add a column summing the fingerprint LOD scores.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6041 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:54:09 +00:00
droazen
4f7a64a798
Fixing broken walker as per GS; adding integration test to cover it.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6040 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:54:04 +00:00
droazen
0e057276ae
Changing the default behavior of the IndelRealigner to run without Smith-Waterman. Changed around the integration tests accordingly.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6039 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:53:58 +00:00
droazen
751aa8bfa6
Partial rewrite of the summary metrics aggregator to accumulate all metrics from sample-level summaries, rather than only specific metrics. Continues to manually handle fingerprinting.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6038 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:53:53 +00:00
droazen
4288ca1c24
Fix doc bug.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6037 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:53:49 +00:00
droazen
cc1f94310d
A prototype script and library dependencies to extract a BAM list from a reasonably well-formed PM's xls{x}-format spreadsheet or tsv file.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6036 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:53:45 +00:00
droazen
df71d5b965
bye bye
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6035 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:53:42 +00:00
droazen
9b90e9385d
Putting new association files, some qscripts, and the new pick sequenom probes file under local version control. I notice some dumb emacs backup files, I'll kill those off momentarily. Also minor changes to GenomeLoc (getStartLoc() and getEndLoc() convenience methods)
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6034 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:53:37 +00:00
droazen
95614ce3d6
Updated the tribble jar to revision 345
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6033 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:53:25 +00:00
droazen
53c089949e
Added integration test for -n parameter
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6032 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:53:22 +00:00
droazen
32a991c4d3
Updated the tribble jar to revision 343
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6031 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:53:17 +00:00
ebanks
745935ffc2
No longer used - instead see the ConstrainedMateFixingManager class
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6030 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 19:38:17 +00:00
kshakir
69f5f16711
Added conditional checking for median_insert_size.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6029 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 17:59:54 +00:00
ebanks
b35df9a0f7
Removing unnecessary String.format calls
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6028 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 15:30:52 +00:00
delangel
b7a1beff3c
Bug fixes and rewrite of logic of several parts in SelectVariants:
...
a) If we were selecting SNPs or Indels and there was one of each at the same location, only whichever one was pulled first was processed.
b) Fixed logic error when selecting Mendelian Violations: if that option was used it wasn't possible to combine with other options.
c) Fixed logic error when using -disc option: you shouldn't parse genotypes to check whether a site is present or not because a vcf can be sites-only and this is slow.
d) Made -disc option work in the same way as other options: variants are now selected from "variant" track all the time, and variants which are not in disc track are kept. Inverse logic (keep disc variants not present in "variant") is confusing and prevents users from combining with different options.
With these changes it is now possible to ask for example "Give me all indels which are Mendelian Violations, not in dbsnp and present in these samples" which was not possible before.
Integration tests covering the above are forthcoming.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6027 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 14:05:00 +00:00
depristo
9b54239f37
Removed nDeletions and nInsertions as independent plots -- just not useful given nIndels and the insertion/deletion ratio. Fixed jitter problems with rug plots by removing the rugs entirely. Recovered, and improved, the comparative features lost by removing the rug plots by getting viewports to work (trivial), so now per-sample metrics and distributions are displayed on the same page!
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6026 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 13:59:46 +00:00
kshakir
a1f8aa90c0
Added an integration test showing how to use LSF C API to get LSF parameters.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6025 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-21 22:54:55 +00:00
ebanks
8e149cc52f
Fixing a silly bug of mine: when a realignment target begins at position 1 of a contig, it was possible to have some reads get emitted out of order (triggering an exception in the SAMFileWriter). This is fixed by moving around some parentheses. Tim, if you are reading this: feel free to take this fix in whenever it's convenient. I.e. it's not critical as the only user who has been hit by it has a reference with over 130K short contigs. Committing in SVN so that it gets incorporated immediately (and I can respond on GS now).
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6024 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-21 21:42:38 +00:00
asivache
6dd41c8489
Nway writer takes another argument: whether to create index on the fly. Realigner in NWayOut mode currently will ALWAYS create index on the fly as there seems to be no clean way to extract the requested value from argument collection in the presence of a different @Output stream.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6023 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-21 17:26:04 +00:00
asivache
78461bac1e
Default logic (and name) has changed. Somatic mode is now the default. To run in single-sample (unpaired) mode, one has to use the (hidden) --unpaired option.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6022 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-21 17:08:41 +00:00
chartl
c5de06a641
Fixing up the RefSeqCodec so a bad entry in RefSeq (some transcripts are odd and have a negative length, which may signify something special (?)) doesn't cause failure, but issues a warning instead. Integration tests pass.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6021 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-21 14:07:58 +00:00
asivache
7c322780d3
Nway out fixed: in this mode a special nwayout sam writer is instantiated and passed to the constrained manager. All the dispatching of the reads into separate output sam streams is taken care of by that writer, so no other special processing is needed at the realigner/manager level.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6020 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-20 19:35:26 +00:00
ebanks
600a6a43a6
Reverting previous commit, as promised.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6019 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-20 16:30:19 +00:00
ebanks
ee18c9b0c2
Temporary commit to please those in 320: re-support the -knownsOnly argument (@Hidden). This will be reverted in a sec.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6018 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-20 16:28:58 +00:00
depristo
a37e9bdbd4
Now produces an expanded table at the start, as well
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6017 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-20 13:45:08 +00:00
rpoplin
e8738f95c5
This warning message scares too many people.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6016 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-20 13:43:16 +00:00
depristo
4c6d0e6143
Added stratification by discrete allele count, just like AF, but requiring genotypes so it can be exact. Added docs on wiki, and an integration test using Kiran's very nice fundamental VCF.
...
VariantEvalWalker now passes a pointer to itself to the Stratefication setVariantEvalWalker (and assoc. get method) so that stratefications can look at VEWalker variables to obtain information necessary for their calculations, like the list of eval samples. This is a better interface, in my opinion, than the current approach of extending the base abstract Stratefication to include an initialize function that has all arguments necessary for any Strat.
JEXL expressions now provide access to the VariantContext vc object itself, so you can write JEXL's that directly use VariantContext and associated functions from the command line.
ExomePostQC Queue script now creates a byAC eval using the new strat, and no longer produces a byAF file (as this was not exact, and led to strange punctile behavior when actual AF quantization was out of sync with the fixed quantization of the AF strat).
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6015 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-19 03:11:00 +00:00
depristo
907018768c
Now uses AC directly from eval, not via AF, internally for AC vs. X plotting. Requires at least 1 SNP to include a site in TiTv plotting or snp/indel ratio. Uses .byAC not .byAF eval file now
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6014 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-19 03:05:17 +00:00
asivache
64196b6c7a
Writer implementation that can dispatch reads to multiple underlying bam files
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6013 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-17 20:44:17 +00:00
depristo
1afd24c831
SliceBams now properly handles the case where multiple read groups clash in the input BAM files
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6012 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-17 20:19:19 +00:00
fromer
03a0185566
Control unscattered output file location
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6011 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-17 15:53:25 +00:00
depristo
285da580f3
Now with dbSNP rate
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6010 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-17 13:21:40 +00:00
ebanks
dd1d9cd76f
Forgot to deprecate the old args
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6009 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-16 17:54:44 +00:00
ebanks
4e85416af1
[Foiled yet again when trying to do this in git] Slight modifications in the argument structure for the IndelRealigner. Instead of boolean flags -knownsOnly and -doNotUseSW, we now have an enum --consensusDeterminationModel which lets you specify knowns only, also use indels in reads, or also use SW. Please note that the default behavior of IR has not changed at all (and won't for a few more days) - that'll be done in GIT (fingers crossed).
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6008 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-16 17:35:37 +00:00
depristo
4304fc4862
Fixed up md5s
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6007 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-16 16:20:41 +00:00
depristo
27d4b317fc
Simple program that calls indels in CEU trio exomes and WGS and compares the results. Overall the indel calls really look good to me, given reasonably good input BAM files.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6006 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-16 12:56:04 +00:00
depristo
43fdd31e20
Significant performance optimization for reduced reads due to better algorithm for including reads in the variable regions. Fixed a critical bug that actually produced multiple copies of the same read in the variable regions with this optimization as well. Scala exploration script updated as well.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6005 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-16 12:54:59 +00:00
depristo
38d7733989
Now accepts any number of VCFs to evaluate. Runs the standard (now three) variant eval commands and invokes the exomeQC R script. Has some annoying assumptions about paths encoded right now. Example usage below:
...
setenv DATA ~/Desktop/broadLocal/localData/
java -Djava.io.tmpdir=tmp -jar ../dist/Queue.jar -S ../scala/qscript/oneoffs/depristo/ExomePostQCEval.scala --gatkjarfile ../dist/GenomeAnalysisTK.jar -R $DATA/human_g1k_v37.fasta $* -eval $DATA/ESPGO_Gabriel_NHLBI_eomi_june_2011_batch1.vcf -intervals ~/Desktop/broadLocal/localData/whole_exome_agilent_1.1_refseq_plus_3_boosters.Homo_sapiens_assembly19.targets.interval_list -dbSNP ~/Desktop/broadLocal/localData/dbsnp_132_b37.vcf -eval $DATA/ESPGO_Gabriel_NHLBI_eomi_june_2011_batch2.vcf
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6004 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-16 12:49:54 +00:00
depristo
9254faa27e
Added density plots by sample for each metric. New command line argument ordering. No longer requires the per-sample.tsv suppl. data -- will conditionally load if available
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6003 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-16 12:46:29 +00:00
fromer
b4c30bf124
Added option of minMappingQuality
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6002 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-16 00:02:26 +00:00
depristo
ce4e8d2093
A few comments / todos.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6001 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-15 13:09:09 +00:00