Commit Graph

6028 Commits (d92055d1f989b9df8b4f31b8450fd84b3271ee28)

Author SHA1 Message Date
droazen d92055d1f9 Checkpointing some bugfixes with zero-length version directories and missing
Picard metrics files before the push back into svn.

git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6069 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:56:01 +00:00
droazen 171e20a111 Updated the tribble jar to revision 351
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6068 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:55:57 +00:00
droazen 3d27e5eb98 Default operating parameters in addition to the parameterized Rscript version.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6067 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:55:53 +00:00
droazen 0d07c979e9 added comments on how to use this very useful script!
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6066 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:55:50 +00:00
droazen ab1de3bfda Updated the tribble jar to revision 350
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6065 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:55:46 +00:00
droazen c8124496d0 now with the new 'consensus model' parameter to the cleaner.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6064 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:55:42 +00:00
droazen c956e154a0 Kill silly plots.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6063 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:55:39 +00:00
droazen 772291c38f Error model is now built by lane and each pool is called separately.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6062 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:55:36 +00:00
droazen 28d8b28bdf Density plots.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6061 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:55:33 +00:00
droazen d323ef0461 As promised, VariantFiltration can now mask out sites within a user-specified window around the provided mask rod. By default the window is 0, but you can now use the --maskExtension argument to increase that value. Added integration tests to cover this new functionality.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6060 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:55:29 +00:00
droazen ea47ccf032 Implemented HET case with binomial distribution. Separated events from normal events and for now skip all extended events.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6059 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:55:24 +00:00
droazen 26d837f59e Factorial and log Factorial utilities avoiding overflow using the gamma function. Lots of unit tests. Everything is working great.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6058 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:55:20 +00:00
droazen 8d5b4af8ca Binomial and Multinomial interfaces for probability and coefficients in log and real space. Passed all unit tests.
BinomialCumulativeProbability was reformatted to follow the now standard parameter order.

git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6057 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:55:15 +00:00
droazen 4abb7c424b implementation of the Gamma function and log10 Binomial / Multinomial coefficients. Unit tests for gamma and binomial passed with honors.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6056 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:55:09 +00:00
droazen 3392c67e0f Support for command-line arguments.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6055 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:55:05 +00:00
droazen d9973d3da7 Adding in a template for many other plots based on Mark's initial list of metrics.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6054 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:55:02 +00:00
droazen 237f73c1b1 Initial fingerprint boxplot for exome PreQC metrics.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6053 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:54:59 +00:00
droazen ff6386c29b binomial coefficient was in log2, changed to log10.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6052 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:54:55 +00:00
droazen 082abfd84f implementation of the truth allele, different cases for REF, HOMVAR, FILTERED and HET.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6051 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:54:51 +00:00
droazen 3f974c62e6 Reorganized init() to check for RODs (reference / truth)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6050 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:54:48 +00:00
droazen 6f5a08ddc6 Simple walker to look at SNPs near indels. Didn't need to make this a walker and commit it, but used it as an opportunity to play with GIT in unstable.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6049 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:54:44 +00:00
droazen 29a0e08aa2 Testing bug fix process #3 (changes are irrelevant)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6048 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:54:40 +00:00
droazen e148a75c32 Testing the 'bug fix' process #2 (changes are irrelevant)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6047 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:54:37 +00:00
droazen 2e3d6754cd First implementation of the Error Model.
Added stratification by lane to ReadBackedPileup.

git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6046 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:54:33 +00:00
droazen 27b1418b84 PSP2 output much better. Good masking of repetitive regions. Flagging of invalid amplicons rather than omission of them, reasons properly given. Kiran doesn't like the trailing comma, but the trailing comma also doesn't like Kiran.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6045 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:54:27 +00:00
droazen f7fa373643 Incorporate lists of fingerprint data rather than summaries.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6044 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:54:23 +00:00
droazen 9a00d81d57 Is git commit -a different than git commit?
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6043 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:54:19 +00:00
droazen 84dd72e6cb Adding in some read filters, updating MathUtils.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6042 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:54:15 +00:00
droazen e0d203434f Add a column summing the fingerprint LOD scores.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6041 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:54:09 +00:00
droazen 4f7a64a798 Fixing broken walker as per GS; adding integration test to cover it.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6040 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:54:04 +00:00
droazen 0e057276ae Changing the default behavior of the IndelRealigner to run without Smith-Waterman. Changed around the integration tests accordingly.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6039 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:53:58 +00:00
droazen 751aa8bfa6 Partial rewrite of the summary metrics aggregator to accumulate all metrics
from sample-level summaries, rather than only specific metrics.  Continues to
manually handle fingerprinting.

git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6038 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:53:53 +00:00
droazen 4288ca1c24 Fix doc bug.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6037 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:53:49 +00:00
droazen cc1f94310d A prototype script and library dependencies to extract a BAM list from a
reasonably well-formed PM's xls{x}-format spreadsheet or tsv file.

git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6036 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:53:45 +00:00
droazen df71d5b965 bye bye
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6035 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:53:42 +00:00
droazen 9b90e9385d Putting new association files, some qscripts, and the new pick sequenom probes file under local version control. I notice some dumb emacs backup files, I'll kill those off momentarily. Also minor changes to GenomeLoc (getStartLoc() and getEndLoc() convenience methods)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6034 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:53:37 +00:00
droazen 95614ce3d6 Updated the tribble jar to revision 345
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6033 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:53:25 +00:00
droazen 53c089949e Added integration test for -n parameter
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6032 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:53:22 +00:00
droazen 32a991c4d3 Updated the tribble jar to revision 343
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6031 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 22:53:17 +00:00
ebanks 745935ffc2 No longer used - instead see the ConstrainedMateFixingManager class
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6030 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 19:38:17 +00:00
kshakir 69f5f16711 Added conditional checking for median_insert_size.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6029 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 17:59:54 +00:00
ebanks b35df9a0f7 Removing unnecessary String.format calls
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6028 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 15:30:52 +00:00
delangel b7a1beff3c Bug fixes and rewrite of logic of several parts in SelectVariants:
a) If we were selecting SNPs or Indels and there was one of each at the same location, only whichever one was pulled first was processed.
b) Fixed logic error when selecting Mendelian Violations: if that option was used it wasn't possible to combine with other options.
c) Fixed logic error when using -disc option: you shouldn't parse genotypes to check whether a site is present or not because a vcf can be sites-only and this is slow.
d) Made -disc option work in the same way as other options: variants are now selected from "variant" track all the time, and variants which are not in disc track are kept. Inverse logic (keep disc variants not present in "variant") is confusing and prevents users from combining with different options.

With these changes it is now possible to ask for example "Give me all indels which are Mendelian Violations, not in dbsnp and present in these samples" which was not possible before.

Integration tests covering the above are forthcoming.

git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6027 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 14:05:00 +00:00
depristo 9b54239f37 Removed nDeletions and nInsertions as independent plots -- just not useful given nIndels and the insertion/deletion ratio. Fixed jitter problems with rug plots by removing the rugs entirely. Recovered, and improved, comparative features lost by removing the rug plots by getting viewports to work (trivial), so now per-sample metrics and distributions are displayed on the same page!
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6026 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-22 13:59:46 +00:00
kshakir a1f8aa90c0 Added an integration test showing how to use LSF C API to get LSF parameters.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6025 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-21 22:54:55 +00:00
ebanks 8e149cc52f Fixing a silly bug of mine: when a realignment target begins at position 1 of a contig, it was possible to have some reads get emitted out of order (triggering an exception in the SAMFileWriter). This is fixed by moving around some parentheses. Tim, if you are reading this: feel free to take this fix in whenever it's convenient. I.e. it's not critical as the only user who has been hit by it has a reference with over 130K short contigs. Committing in SVN so that it gets incorporated immediately (and I can respond on GS now).
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6024 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-21 21:42:38 +00:00
asivache 6dd41c8489 Nway writer takes another argument: whether to create an index on the fly. Realigner in NWayOut mode currently will ALWAYS create an index on the fly, as there seems to be no clean way to extract the requested value from the argument collection in the presence of a different @Output stream.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6023 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-21 17:26:04 +00:00
asivache 78461bac1e Default logic (and name) has changed. Somatic mode is now the default. To run in single-sample (unpaired) mode, one has to use the (hidden) --unpaired option.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6022 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-21 17:08:41 +00:00
chartl c5de06a641 Fixing up the RefSeqCodec so a bad entry in RefSeq (some transcripts are odd and have a negative length, which may signify something special (?)) doesn't cause failure, but issues a warning instead. Integration tests pass.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6021 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-21 14:07:58 +00:00
asivache 7c322780d3 Nway out fixed: in this mode a special nwayout sam writer is instantiated and passed to the constrained manager. All the dispatching of the reads into separate output sam streams is taken care of by that writer, so no other special processing is needed at the realigner/manager level.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6020 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-20 19:35:26 +00:00