Commit Graph

165 Commits (ad5b057140597e47057331b7b581787aba5c7cae)

Author SHA1 Message Date
aaron 82aa0533b8 added some more documentation to the GLF writer and its supporting classes, and some other fixes
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@875 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-02 14:53:58 +00:00
aaron e712d69382 GLF writing support
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@872 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-01 21:30:18 +00:00
hanna fc7320133c Cleaned up error when fasta index is missing. Code still throws an exception, but the message is more direct (no more 'error while micromanaging') and tells the user to run 'samtools faidx' to fix the issue.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@867 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-01 15:34:38 +00:00
asivache d601548d53 added reallocate(int[] orig_array, int new_size) and int[] indexOfAll(String s, int ch); the former is self-explanatory, while the latter returns an array of the indices of all occurrences of ch in the specified string
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@856 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-29 20:15:00 +00:00
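The two helpers named in commit d601548d53 could look roughly like the sketch below. Only the signatures come from the commit message; the class name `ArrayUtilsSketch` and the truncate-or-zero-pad behaviour of `reallocate` are assumptions, not the committed code.

```java
import java.util.Arrays;

// Hypothetical holder class; only the two method signatures appear in the commit message.
final class ArrayUtilsSketch {

    /** Copies orig_array into a new array of length new_size (assumed: truncates or zero-pads). */
    static int[] reallocate(int[] orig_array, int new_size) {
        return Arrays.copyOf(orig_array, new_size);
    }

    /** Returns the indices of every occurrence of ch in s, in ascending order. */
    static int[] indexOfAll(String s, int ch) {
        int[] hits = new int[s.length()];   // upper bound on the number of matches
        int count = 0;
        for (int i = s.indexOf(ch); i >= 0; i = s.indexOf(ch, i + 1)) {
            hits[count++] = i;
        }
        return Arrays.copyOf(hits, count);  // trim to the matches actually found
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(indexOfAll("chr1:100-200", ':')));
        System.out.println(Arrays.toString(reallocate(new int[] {1, 2, 3}, 5)));
    }
}
```

If no occurrence is found, this version returns an empty array rather than null, which keeps callers free of null checks.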
asivache fe3b843b65 intercept NullPointerException and rethrow it with a (marginally) comprehensible error message when an attempt to get the class source code location fails
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@854 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-29 15:56:55 +00:00
aaron b43deda6c9 iterative changes to GLF files; also a test of checking-in over sshfs.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@850 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-28 20:24:30 +00:00
hanna 5e8c08ee63 Update to latest version of picard. Change imports in all classes dependent on picard public from import edu.mit.broad.picard... to import net.sf.picard...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@849 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-28 20:13:01 +00:00
hanna aa17c4a468 Farewell, functionalj. You promised much, but you could not deliver.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@847 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-28 01:35:49 +00:00
aaron d275c18e58 adding some objects we need for the GLF format.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@846 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-27 22:32:25 +00:00
aaron 6fab1a64fa Started work on GLF input / output basics. Do not use.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@827 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-26 22:49:59 +00:00
hanna a488d2dbb2 Lazy creation of output streams. Only create output streams when absolutely necessary.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@824 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-26 21:56:57 +00:00
asivache 9ef1a21112 minor changes
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@817 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-26 21:03:06 +00:00
aaron d994544c47 Added back-end code support for Sharding based on genomic location for reads. Changed the sharding
code to take GenomeLocSortedSet instead of a list<GenomeLoc>, and added a bunch of much simpler
and cleaner test cases.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@816 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-26 20:57:46 +00:00
aaron d056f9f3e8 Changed the name to reflect the sorted nature of the set, added some fixes
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@810 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-22 22:34:24 +00:00
aaron 831d430025 Added a collection for storing GenomeLocs that also has functions for removing by genomic region (which may span multiple GenomeLocs in the collection) and for adding regions, which are then merged with any overlapping regions.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@809 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-22 21:52:40 +00:00
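The add-and-merge / remove-by-region behaviour described in commit 831d430025 can be illustrated with a minimal sketch. It assumes closed integer intervals on a single contig; the class `IntervalSetSketch` and its methods are hypothetical stand-ins, not the actual GenomeLoc collection.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in: sorted, non-overlapping closed intervals [start, stop] on one contig.
final class IntervalSetSketch {
    private final List<int[]> intervals = new ArrayList<>();

    /** Adds [start, stop], merging it with every interval it overlaps. */
    void add(int start, int stop) {
        List<int[]> merged = new ArrayList<>();
        boolean placed = false;
        for (int[] iv : intervals) {
            if (iv[1] < start) {
                merged.add(iv);                      // lies entirely before the new region
            } else if (iv[0] > stop) {
                if (!placed) {                       // first interval past the new region
                    merged.add(new int[] {start, stop});
                    placed = true;
                }
                merged.add(iv);
            } else {                                 // overlap: absorb into the new region
                start = Math.min(start, iv[0]);
                stop = Math.max(stop, iv[1]);
            }
        }
        if (!placed) merged.add(new int[] {start, stop});
        intervals.clear();
        intervals.addAll(merged);
    }

    /** Removes [start, stop], trimming or splitting any interval it touches. */
    void remove(int start, int stop) {
        List<int[]> kept = new ArrayList<>();
        for (int[] iv : intervals) {
            if (iv[1] < start || iv[0] > stop) {
                kept.add(iv);                        // untouched by the removed region
                continue;
            }
            if (iv[0] < start) kept.add(new int[] {iv[0], start - 1});  // left remainder
            if (iv[1] > stop) kept.add(new int[] {stop + 1, iv[1]});    // right remainder
        }
        intervals.clear();
        intervals.addAll(kept);
    }

    /** Returns a copy of the current intervals in sorted order. */
    List<int[]> asList() {
        return new ArrayList<>(intervals);
    }
}
```

A single removal can span several stored intervals, as the commit message notes: each one is dropped, trimmed, or split independently in one pass.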
kiran 454a6d1df7 Fixed an egregious error in simpleReverseComplement wherein the RC'd string would be composed entirely of the last base.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@804 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-22 19:32:20 +00:00
asivache 02fc4f145f refactoring: a couple of general purpose (hopefully useful?) methods/classes extracted into a standalone utils class
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@802 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-22 18:54:40 +00:00
depristo 7a979859a9 Intermediate checking for evaluation -- now supports transition / transversion evaluation
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@793 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-22 17:05:06 +00:00
depristo dc17a5661d Better accessors for dealing with second base prob pileups
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@785 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-21 22:25:16 +00:00
depristo d261459c48 Useful function to create a string with N copies of the same char
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@784 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-21 22:23:52 +00:00
kiran 83e1454a11 Added a method to determine the fraction of a sequence that's taken up by the most frequent base.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@781 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-21 20:35:31 +00:00
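One way to compute the quantity described in commit 83e1454a11 is sketched below. The method name, the restriction to A/C/G/T, and the use of the full sequence length as the denominator are all assumptions; the commit message does not specify them.

```java
// Hypothetical sketch: fraction of a sequence occupied by its most frequent base.
final class BaseFractionSketch {

    /** Returns max(count of A, C, G, T) / sequence length; 0.0 for an empty sequence. */
    static double mostFrequentBaseFraction(String sequence) {
        if (sequence.isEmpty()) {
            return 0.0;
        }
        int[] counts = new int[4];                  // A, C, G, T
        for (int i = 0; i < sequence.length(); i++) {
            switch (Character.toUpperCase(sequence.charAt(i))) {
                case 'A': counts[0]++; break;
                case 'C': counts[1]++; break;
                case 'G': counts[2]++; break;
                case 'T': counts[3]++; break;
                default: break;                     // assumed: other symbols are not counted
            }
        }
        int max = 0;
        for (int c : counts) {
            max = Math.max(max, c);
        }
        return (double) max / sequence.length();
    }

    public static void main(String[] args) {
        System.out.println(mostFrequentBaseFraction("AAAC"));
    }
}
```

A value near 1.0 flags homopolymer-like reads, which is the kind of low-complexity sequence such a check is typically used to filter.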
kiran 1a9d5cea29 Added a method to reverse-complement a String object, preserving 'N' and '.' bases.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@776 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-21 19:39:39 +00:00
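A reverse complement that leaves 'N' and '.' untouched, as commit 1a9d5cea29 describes, might be written as follows. The method name `simpleReverseComplement` is taken from the later fix in commit 454a6d1df7; the class name, the `complement` helper, and the pass-through handling of other symbols are assumptions.

```java
// Hypothetical sketch of a String reverse complement that preserves 'N' and '.'.
final class ReverseComplementSketch {

    /** Complements a single base; 'N', '.', and any unrecognized symbol pass through unchanged. */
    static char complement(char base) {
        switch (base) {
            case 'A': return 'T';
            case 'C': return 'G';
            case 'G': return 'C';
            case 'T': return 'A';
            case 'a': return 't';
            case 'c': return 'g';
            case 'g': return 'c';
            case 't': return 'a';
            default:  return base;
        }
    }

    /** Walks the input from its last base to its first, appending each base's complement. */
    static String simpleReverseComplement(String bases) {
        StringBuilder rc = new StringBuilder(bases.length());
        for (int i = bases.length() - 1; i >= 0; i--) {
            rc.append(complement(bases.charAt(i)));   // index i changes on every iteration
        }
        return rc.toString();
    }

    public static void main(String[] args) {
        System.out.println(simpleReverseComplement("ACGTN."));
    }
}
```

Indexing with the loop variable on each iteration avoids the failure mode mentioned in the later fix, where every output position ended up holding the last base.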
kiran a687c6bc03 Added a method to refresh an NFS mount point (necessary to prevent NFS flakiness when running on the LSF farm).
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@774 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-21 19:31:54 +00:00
aaron 8515247575 Adding some functions I keep reinventing, especially for testing purposes.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@772 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-21 19:30:44 +00:00
andrewk 0219d33e10 QualityUtils: added reverse function to reverse an array of bytes (and not complement it), BaseUtils: split qualToProb into itself and qualToErrProb, CovariateCounterWalker and LogisticRecalibrationWalker: several changes, including properly accounting (only partly complete) for reversing AND complementing bases that are on the negative strand, PrintReadsWalker: created option to output reads to a BAM file rather than just to the screen (useful for creating a downsampled BAM file)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@770 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-21 18:30:45 +00:00
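The qualToProb / qualToErrProb split and the byte-array reverse named in commit 0219d33e10 can be sketched under the standard Phred convention, where a quality Q corresponds to an error probability of 10^(-Q/10). The class name and exact signatures are assumptions; the commit message gives only the method names.

```java
// Hypothetical sketch; only the method names come from the commit message.
final class QualSketch {

    /** Phred-scaled quality to error probability: 10^(-Q/10). */
    static double qualToErrProb(byte qual) {
        return Math.pow(10.0, -qual / 10.0);
    }

    /** Probability that the base call is correct: 1 - error probability. */
    static double qualToProb(byte qual) {
        return 1.0 - qualToErrProb(qual);
    }

    /** Returns a reversed copy of the array; values are not complemented or otherwise altered. */
    static byte[] reverse(byte[] values) {
        byte[] reversed = new byte[values.length];
        for (int i = 0; i < values.length; i++) {
            reversed[i] = values[values.length - 1 - i];
        }
        return reversed;
    }

    public static void main(String[] args) {
        System.out.println(qualToErrProb((byte) 20));
        System.out.println(qualToProb((byte) 20));
    }
}
```

Reversing qualities without complementing them matters for negative-strand reads: the bases need a reverse complement, but the per-base quality scores only need to be put back in matching order.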
hanna dc748d9c9c Integrate more feedback on command-line argument system. Focus on help
formatter: separate required from optional but otherwise keep ordering
the same, reorder GATK arguments by usage.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@764 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-20 19:01:25 +00:00
hanna 01a3cb27c7 @Required / @Allows flags for main arguments.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@751 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-19 23:26:17 +00:00
kiran 40dbc21df7 Moved ParseException to its own file and made it public.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@750 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-19 14:42:44 +00:00
hanna e6ce80c8e3 Fix for GSA-44...don't throw exception when user specifies -h.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@742 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-18 00:42:00 +00:00
hanna d35e20ce21 Better error checking for missing .dict file.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@741 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-17 21:57:12 +00:00
hanna 7161b8f927 Disable support for short name values directly abutting their arguments.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@740 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-17 16:09:32 +00:00
hanna d152c2b911 New GATKArgumentCollection caused a subtle bug with argument grouping and the help system. Fixed.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@738 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-17 14:54:25 +00:00
depristo 8e9e2f4502 Revised ROD system. Split the system into a basic type and an interface. Enabled more control over ROD access, including an initialize() function to fetch headers and other options from the file. Added a general tabular ROD, which has named columns and supports a map<String,String> interface. Comes with a shiny new JUnit system for RODs. Also added a simple Python script for accessing picard data.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@716 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-14 21:06:28 +00:00
hanna 67293168e7 Support periods in sequence names.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@715 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-14 20:17:57 +00:00
kiran 68c9455c0f Moved the base complement method to BaseUtils.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@711 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-14 18:57:48 +00:00
kiran 64c65c7751 New methods to generate compressed SQ quality elements in line with the SAM spec.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@699 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-14 16:50:31 +00:00
hanna 12ae3a22b6 Break locus context data access providers into modular components in preparation for traverse by loci.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@689 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-13 18:51:16 +00:00
jmaguire 11723fbcc2 added method indelPileup. Generates a pileup of indel alleles given reads and offsets (as from a locus walker).
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@663 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-12 15:08:24 +00:00
hanna 32696b13f5 Fixed method override issue with old-style traversals.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@660 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-12 01:22:18 +00:00
hanna 23e9e29964 Changed reads traversals from providing a LocusContext from which the reference sequence
could be extracted to a char[] containing the reference bases.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@657 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-11 22:45:11 +00:00
ebanks 009e71fcd9 We need to sort cleaned reads ourselves (instead of letting SAMFileWriter
do it) because the SAM headers are often screwed up and claim to be
"unsorted".  While here, I broke off the module from the SortSamIterator
in case someone else wants to use it.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@654 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-11 15:43:42 +00:00
aaron 4ce3feba4d my move ended up being a copy, so this is to delete duplicate files.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@651 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-11 02:10:26 +00:00
aaron 898f65547e Added code to split GenomeAnalysisTK.java into an object concerned with loading command line args, and one that runs the engines. This will allow us to run the GATK from other tools (like Matlab). Also some cleanup to separate out the legacy traversals and the new style traversals. This is not live yet, and any modifications you need should be made to GenomeAnalysisTK.java for now.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@650 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-11 02:07:20 +00:00
aaron ee02b61068 added support for the argument collections code
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@648 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-09 07:07:33 +00:00
aaron 742840017b added the argument collection annotation for situations where fields in a command line args have embedded fields that should be checked for command line args
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@647 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-09 06:59:17 +00:00
aaron bae4256574 Started the process to make the GATK engine into a runnable object so we can call it from other processes. Step 1: make a configuration object that can serialize to and from an XML file. This way we can store the information everyone uses shell scripts for. Also we can now pull the list of params out of the GenomeAnalysisTK.java. More to come...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@636 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-08 01:25:26 +00:00
hanna 7f8850a8a2 Argument validation.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@631 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-07 20:28:56 +00:00
hanna a3d8febbf2 Error message cleanup.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@630 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-07 19:31:32 +00:00
hanna c241d386a7 Beefed up command-line usage string.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@629 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-07 19:08:19 +00:00
depristo 5a6892900e fixing oddities in duplicates
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@628 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-07 18:55:45 +00:00