Commit Graph

192 Commits (eae2d019cf6f9b5b359a3bbc0b0066f490464f44)

Author SHA1 Message Date
kshakir 9fcf71c031 Updated google reflections due to stale slf4j version conflicting with other projects also trying to use Queue as a component.
Added targets to build.xml to effectively 'mvn install' packaged GATK/Queue from ant.
TODO: Versions during 'mvn install' are hardcoded at 0.0.1 until a better versioning scheme that works with maven dependencies has been identified.
2012-10-16 02:22:30 -04:00
Mark DePristo 3362584014 Updating cofoja to the latest version 2012-08-09 16:36:18 -04:00
Roger Zurawicki 5b74763096 Removed Categories.
We will use DocumentedGATKFeatures to create categories in our documentation. I guess Eric will be in charge of this. We need to remove walkers and think about how to categorize everything.

Tools can be hidden from GATKdocs with the @Hidden annotation

Signed-off-by: Mauricio Carneiro <carneiro@broadinstitute.org>
2012-07-25 13:46:24 -04:00
Roger Zurawicki f3c504769b Added the ability to update the Forum
GATKDocs looks for a key on gsa4, and updates the forum with new walker if it exists.
More changes were made to the GATKDocs. Works nicely with bootstrap on and offline.
Cleaned up the code as well

Signed-off-by: Mauricio Carneiro <carneiro@broadinstitute.org>
2012-07-23 17:17:33 -04:00
Mauricio Carneiro 116885a450 Removed the "Walker" suffix from all walkers that had it.
* Did not touch archived walkers... those can be named whatever.
* Kept abstract classes that end in Walker untouched (e.g. LocusWalker, ReadWalker, ...)
* Renamed a few inner classes due to conflict when stripping off Walker from their outer classes: ContigStats, FlagStats and FastaStats.
2012-07-20 17:27:11 -04:00
Mark DePristo 61f0c46423 Rev tribble to 110. Log is:
Optimization for PositionalBufferedStream with specialized read(byte[], int, int) method

-- For binary codecs having an efficient reader of lots of bytes that doesn't fall back into read() itself vastly improves performance. The old version was 10x slower than InputStream, while the new version is +30%.
-- Generalize PositionalBufferedStream main() method for performance testing, now accepts cmdline arguments for the file to read, how many iterations, etc

Generalize AsciiLineReader main() method for performance testing
-- Now accepts cmdline arguments for the file to read, how many iterations, etc

AsciiLineReaderTest and PositionBufferedStreamTest were in src not test/src
2012-06-26 15:28:32 -04:00
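The bulk-read optimization in the commit above can be illustrated with a minimal sketch. This is not Tribble's PositionalBufferedStream: the class name PositionTrackingStream and its getPosition() method are hypothetical, and the only point shown is that read(byte[], int, int) hands the whole request to the underlying stream in one call instead of looping over single-byte read() calls, while still tracking the position.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical illustration; not the Tribble implementation.
final class PositionTrackingStream extends FilterInputStream {
    // Number of bytes consumed from the underlying stream so far.
    private long position = 0;

    PositionTrackingStream(InputStream in) {
        super(in);
    }

    // Single-byte read: advances the position by one unless at end of stream.
    @Override
    public int read() throws IOException {
        int b = in.read();
        if (b >= 0) {
            position++;
        }
        return b;
    }

    // Bulk read: delegates the whole request to the underlying stream in one
    // call rather than falling back to read() once per byte.
    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = in.read(buf, off, len);
        if (n > 0) {
            position += n;
        }
        return n;
    }

    // Skipped bytes also count toward the position.
    @Override
    public long skip(long n) throws IOException {
        long skipped = in.skip(n);
        position += skipped;
        return skipped;
    }

    // mark/reset would invalidate the simple counter, so report them as unsupported.
    @Override
    public boolean markSupported() {
        return false;
    }

    long getPosition() {
        return position;
    }
}
```

Wrapping a BufferedInputStream in such a class would let a binary codec request many bytes at once and still ask where in the file it is.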
Mark DePristo 373ae39e86 Testing of BCF codec
-- Rev'd tribble
-- Minor code cleanup
-- BCF2 encoder / decoder use Double not Float internally everywhere
-- Generalized VC testing framework
2012-05-24 10:57:01 -04:00
Mark DePristo a90482c772 Rev. tribble to v101 with another putative open file leak fix
Scalability bugfixes; can now issue tens of thousands of queries to a reader
without opening too many files

-- Fixed missing close() statement in TribbleIndexedFeatureReader
-- Fixed NPE in TabixIteratorLineReader
-- Added scalability test that confirms .query() failure and subsequent fix

Note this actually fixes a tested and reproducible scalability issue. Might not be the only one but I believe it should do the trick. Sorry everyone for the inconvenience. Note that we now have a test in Tribble to ensure this doesn't happen again.
2012-05-04 15:40:41 -04:00
Mark DePristo fa84d50a2b Rev. tribble for putative bugfixes for not closing streams 2012-05-04 10:20:46 -04:00
Mark DePristo 0f4cc1884d Rev to tribble 99, optimized AsciiFeatureCodec
-- Removed tmp. GeneralizedFeatureCodec
-- BCF2 Reader update to use new style, but this entire class can be deleted now
-- Rev. tribble to r99
2012-05-03 07:31:48 -04:00
Mark DePristo 43d97c2e00 Rev Tribble to r97, adding binary feature support
From tribble logs:

Binary feature support in tribble

-- Massive refactoring and cleanup
-- Many bug fixes throughout
-- FeatureCodec is now general, with decode etc. taking a PositionalBufferedStream
as an argument, not a String
-- See ExampleBinaryCodec for an example binary codec
-- AbstractAsciiFeatureCodec provides its subclasses with the same String decode and
readHeader functionality as before. Old ASCII codecs should inherit from this base
class, and will work without additional modifications
-- Split AsciiLineReader into a position tracking stream
(PositionalBufferedStream).  The new AsciiLineReader takes as an argument a
PositionalBufferedStream and provides the readLine() functionality of before.
Could potentially use optimizations (it's a TODO in the code)
-- The Positional interface includes some more functionality that's now
necessary to support the more general decoding of binary features
-- FeatureReaders now work using the general FeatureCodec interface, so they can
index binary features
-- Bugfix for a LinearIndexCreator off-by-one error in setting the end block
position
-- Deleted VariantType, since this wasn't used anywhere and it's a particularly
clean way of thinking about the problem
-- Moved DiploidGenotype, which is specific to Gelitext, to the gelitext package
-- TabixReader requires an AsciiFeatureCodec as it's currently only implemented
to handle line oriented records
-- Renamed AsciiFeatureReader to TribbleIndexedFeatureReader now that it handles
Ascii and binary features
-- Removed unused functions here and there as encountered
-- Fixed build.xml to be truly headless
-- FeatureCodec readHeader returns a FeatureCodecHeader object that contains a
value and the position in the file where the header ends (not inclusive).
TribbleReaders now skip the header if the position is set, so it's no longer
necessary to see header lines in the decode functions if one implements the
general readHeader(PositionalBufferedStream) version. Necessary for binary
codecs but a nice side benefit for ascii codecs as well
-- Cleaned up the IndexFactory interface so there's a truly general createIndex
function that takes the enumerated index type.  Added a writeIndex() function
that writes an index to disk.
-- Vastly expanded the index unit tests and reader tests to really test linear,
interval, and tabix indexed files.  Updated test.bed, and created a tabix
version of it as well.
-- Significant BinaryFeaturesTest suite.
-- Some test files have indent changes
2012-05-03 07:31:48 -04:00
Mark DePristo 58c470a6c5 Rev'ing Tribble from 53 to 94
-- Other tribble contributors did major refactoring / simplification of tribble, which required some changes to GATK code
-- Integration tests pass without modification, though some very old index files (callable loci beds) were apparently corrupt and no longer tolerated by the newer tribble codebase
2012-05-03 07:31:47 -04:00
Khalid Shakir b8b7f28aa9 Revving Picard to pick up new SamFileHeaderMerger.
Updated ReadFilter abstract class to implement (via UnsupportedOperationException) the new SamRecordFilter.filterOut().
In IndelRealignerIntegrationTest updates for Picard fixes to SAMRecord.getInferredInsertSize() in svn r1115 & r1124.
- Ran FixMates to create new input BAM, since running IR with variable maxReadsInMemory means not all reads were realigned, leading to different outputs.
- Updated md5s to match new expectations after looking at TLEN diff engine output.
2012-05-02 16:47:28 -04:00
Mark DePristo b0560f9440 Rev. tribble to fix BED codec bug in tribble 51 2012-01-17 16:40:26 -05:00
Mark DePristo f2b0575dee Detect unreasonably large allele strings (>2^16) and throw an error
-- samtools can emit alleles where the ref is 42M Ns and this caused the GATK (via tribble) to hang in several places.
-- Tribble was updated so we actually could read the line properly (rev. to 51 here).
-- Still the parsing algorithms in the GATK aren't happy with such a long allele.  Instead of optimizing the code around an improper use case I put in a limit of 2^16 bp for any allele, and throw a meaningful exception when encountered.
2012-01-17 16:40:26 -05:00
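The allele length limit described in the commit above could look roughly like the following sketch. The class and method names are hypothetical and not taken from the GATK source, and the exception type is an assumption; only the 2^16 bp threshold and the idea of failing with a meaningful message come from the commit message.

```java
// Hypothetical illustration; names and exception type are not from the GATK source.
final class AlleleLengthCheck {
    // Limit stated in the commit message: 2^16 bases for any allele.
    static final int MAX_ALLELE_LENGTH = 1 << 16;

    private AlleleLengthCheck() {
    }

    // Returns the allele unchanged if it is within the limit; otherwise throws
    // an exception whose message states the observed and allowed lengths.
    static String requireReasonableLength(String allele) {
        if (allele.length() > MAX_ALLELE_LENGTH) {
            throw new IllegalArgumentException(
                    "Allele of length " + allele.length()
                    + " bp exceeds the maximum supported length of "
                    + MAX_ALLELE_LENGTH + " bp");
        }
        return allele;
    }
}
```

Checking the length once, at the point where the allele is first parsed, avoids passing a multi-megabase string into downstream algorithms that were never designed for it.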
Matt Hanna e923a2e512 Revving Picard to incorporate final version of ReadWalker performance improvements. 2012-01-10 12:12:33 -05:00
David Roazen ea6e718cb8 SnpEff 2.0.5 support. Re-enabled SnpEff in the HybridSelectionPipeline.
For now, we recommend only running with the GRCh37.64 database.
2012-01-03 15:18:36 -05:00
Matt Hanna e6e80e8d3f Update Picard to fix a bug Mauricio found in Picard where Picard unnecessarily depends on Snappy during some usages of SortingCollection. 2011-12-29 14:35:02 -05:00
Khalid Shakir b4b7ae1bd9 Revved Picard to incorporate tfennell's AsyncSAMFileWriter.
Removed DbSnpFileGenerator and related files as they were removed from PPP r2063 by ktibbett.
2011-12-06 10:37:42 -05:00
Matt Hanna c9eae32f6e Revving Tribble to actually close file handles when close() is called. 2011-11-30 22:42:21 -05:00
Eric Banks d7d8b8e380 Tribble v42 changes the Codec.canDecode method to take in a String instead of a File; this is something that Jim was adamant about (because Tribble can handle streams other than files). I didn't want the next person who needed to rev Tribble to deal with this change additionally, so I took care of updating the GATK now. 2011-11-28 14:18:28 -05:00
David Roazen 68b2a0968c Updating the HybridSelectionPipeline for SnpEff 2.0.4 RC3
This will have to be done again when the 2.0.4 release becomes official,
but it's necessary to do now in order to re-enable the pipeline tests.
2011-11-17 14:46:12 -05:00
Eric Banks d64f8a89a9 Instead of the SelfScopingFeatureCodec interface, pushed this functionality into Tribble itself. Now we can e.g. determine that a file can be parsed by the BedCodec on the fly. 2011-11-09 15:24:29 -05:00
Eric Banks 6297561326 Adding the new jar 2011-11-07 15:08:19 -05:00
Eric Banks aa0c8c3600 Revving Tribble jar to v40. Our last jar was busted. 2011-11-07 11:30:08 -05:00
Matt Hanna 9afe6fc7ac Picard upgrade to 1.55. 2011-10-24 17:02:27 -04:00
Khalid Shakir 84bd355690 Merged bug fix from Stable into Unstable 2011-09-27 14:34:39 -04:00
Khalid Shakir b090751f62 Fixed Ant / PluginManager issue where reflections was picking up all class files under current working directory due to "." in jar manifest classpaths.
Updates to HybridSelectionPipeline:
- Added annotations back via snpEff
- Minor updates to VQSR paths and lowered memory
2011-09-27 14:33:57 -04:00
Mark DePristo 34f435565c Accidentally committed unclean tribble jar to repo 2011-09-21 10:16:17 -04:00
Mark DePristo 827c942c80 Rev tribble 2011-09-20 14:01:14 -04:00
Eric Banks da9c8ab386 Revving the Tribble jar where the DbsnpCodec class was renamed to OldDbsnpCodec. Updating GATK code accordingly. 2011-09-06 20:39:42 -04:00
Mark DePristo 0b794b5491 Revving Tribble to 23 2011-09-01 10:43:03 -04:00
Matt Hanna dd89755e74 Merged bug fix from Stable into Unstable 2011-08-31 17:28:44 -04:00
Matt Hanna 65a9159ac6 Point ivy to the maven repo instead of the default ibiblio repo. Drastically
simplify ivy config by completely cutting out module specifications.
2011-08-31 17:27:25 -04:00
Mark DePristo d604019362 Finished my broken tribble code. Updated to rev 22 2011-08-30 16:56:48 -04:00
Mark DePristo bdf04b8057 Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-30 11:08:23 -04:00
Mark DePristo 173ca1e215 Reverting tribble temporarily while I fix my subtle problems 2011-08-30 11:08:13 -04:00
Khalid Shakir 2125ba1f23 Merged bug fix from Stable into Unstable
Conflicts:
	private/java/src/org/broadinstitute/sting/pipeline/ReferenceData.java
2011-08-29 19:36:43 -04:00
Khalid Shakir 20ac24464d Rev'ved picard to read new analysis_files.txt with a blank line after header and no reference sequence.
Updated error messages and unit tests.
2011-08-29 19:33:04 -04:00
Mark DePristo 427c643ce7 The missing tribble jar 2011-08-29 18:46:40 -04:00
Mark DePristo 5defaf5fac Continuing to improve Tribble
-- ProfileRodSystem now has a just load index mode, allowing us to optimize the profiler
-- assessFarmNodes R script for making nice plots of performance of jobs on the farm
-- Rev. tribble to use new, optimized index loading (performance win when loading many many indices)
2011-08-29 17:02:57 -04:00
Mark DePristo f39d0008bc Build.xml -- contracts not built by default. Slightly simpler CSS for dl. 2011-08-19 15:07:26 -04:00
Mark DePristo f2a05af356 Fixed layout problem with advanced arguments 2011-08-18 21:59:44 -04:00
Mark DePristo cca0930517 Better formatting for GATKDocs 2011-08-18 21:24:23 -04:00
Mark DePristo ce009bd4a4 Works without related data 2011-08-18 14:05:09 -04:00
Mauricio Carneiro a9df365364 GenotypeAndValidate walker updated
Updated the walker to comply with the new RodBinding system and the new GATKDocs. Will move it to public after writing integration tests.
2011-08-17 21:55:17 -04:00
David Roazen bd5cdb8a43 The tribble dependency is now handled through ivy. Revved tribble to r18 and removed obsolete build targets in build.xml 2011-08-11 16:38:29 -04:00
Mark DePristo b984b7aa9b Missing default values are now NA 2011-08-10 22:22:57 -04:00
Mauricio Carneiro 321afac4e8 Updates to the help layout.
* New style.css, new template for the walker auto-generated html. Short description is no longer repeated in the long description of the walker.

* Updated DiffObjectsWalker and ContigStatsWalker as "reference" documented walkers.
2011-07-26 19:29:25 -04:00
Mark DePristo 2039ce6102 Default values now displayed in arguments
DiffEngine fixed so that newInstance() would work. Pretty quickly encountered a situation where newInstance() failed. Debug output is now written to the log when this occurs.
Logger now used instead of standard out, with INFO the default level.
2011-07-24 22:56:55 -04:00