Put this puppy through its paces. Eliminated the sorting and header-handling stuff; that isn't the purvey of this script and should be handled downstream or by a script wrapper.

I also secretly handled another pesky overlow exception. Occasionally Syzygy could report lods of like -1000; e.g. posterior probabilities of one in one (((googol) googol) googol) googol which of course makes python blow up. Now we safely output an accurate posterior.



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1971 348d0f76-0448-11de-a6fe-93d51630548a
This commit is contained in:
chartl 2009-11-04 06:05:45 +00:00
parent 3d9195f8b6
commit fc17e75759
1 changed files with 4 additions and 15 deletions

View File

@ -23,7 +23,10 @@ def dictFromCombinedErrorCoverageFile(f):
return dict(dictList)
def qualFromLod(L):
X = math.exp(-L)
try:
X = math.exp(-L)
except OverflowError:
return 0
try:
return math.floor(-10*math.log10(X/1+X))
except OverflowError:
@ -175,17 +178,3 @@ pooledCallsFile.close()
for i in range(len(poolNames)):
pooledOutputFiles[i].close()
## sort the files -- system commands ##
for pool in poolNames:
cmd1 = "cat "+directory+"/"+pool+"_calls.vcf | sort -n -k2,2 | perl "+sortByRefPath+" - /seq/references/Homo_sapiens_assembly18/v0/Homo_sapiens_assembly18.fasta.fai > "+directory+"/tmp.vcf"
cmd2 = "cp "+directory+"/tmp.vcf "+directory+"/"+pool+"_calls.vcf"
os.system(cmd1)
os.system(cmd2)
cmd = "cat "+directory+"/"+proj+"_combined_calls.vcf | sort -n -k2,2 | perl "+sortByRefPath+" - /seq/references/Homo_sapiens_assembly18/v0/Homo_sapiens_assembly18.fasta.fai > "+directory+"/tmp.vcf"
os.system(cmd)
cmd = "cp "+directory+"/tmp.vcf "+directory+"/"+proj+"_combined_calls.vcf"
os.system(cmd)
cmd = "rm "+directory+"/tmp.vcf"
os.system(cmd)