-- The previous graph creation algorithm worked as follows:

    for each kmer1 -> kmer2 in each read
        add kmers 1 and 2 to the graph
        add edge kmer1 -> kmer2 in the graph, if it's not present (does check)
        update edge count by 1 if kmer1 -> kmer2 already existed in the graph

-- This algorithm cost O(reads * kmers / read * (getEdge cost + addEdge cost)). This is actually pretty expensive because getting and adding edges is expensive in jgrapht.

-- The new approach uses the following algorithm:

    for each kmer1 -> kmer2 in each read
        add kmers 1 and 2 to a kmer counter, that counts kmer1+kmer2 in a fast hashmap

    for each kmer pair 1 and 2 in the hash counter
        add edge kmer1 -> kmer2 in the graph, if it's not present (does check) with multiplicity count from map
        update edge count by count from map if kmer1 -> kmer2 already existed in the graph

-- This algorithm ensures that we add far fewer edges, because each distinct kmer pair touches the graph once instead of once per occurrence.

-- Additionally, created a fast kmer class that lets us create kmers from larger byte[]s of bases without cutting up the byte[] itself.

-- Overall runtimes are greatly reduced using this algorithm.
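
The two-pass approach and the no-copy kmer described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `KmerView`, `KmerCountingSketch`, `countPairs`, and `addEdges` are hypothetical names, and a plain `Map<String, Integer>` of edge weights stands in for the jgrapht graph so the example stays self-contained.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical kmer that views a slice of a shared byte[] instead of copying it. */
final class KmerView {
    private final byte[] bases; // shared with the read, never copied
    private final int start;
    private final int length;
    private final int hash;     // cached, since kmers are used as hash keys

    KmerView(byte[] bases, int start, int length) {
        if (start < 0 || length < 0 || start + length > bases.length) {
            throw new IllegalArgumentException("kmer out of bounds");
        }
        this.bases = bases;
        this.start = start;
        this.length = length;
        int h = 1;
        for (int i = 0; i < length; i++) {
            h = 31 * h + bases[start + i];
        }
        this.hash = h;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof KmerView)) return false;
        KmerView other = (KmerView) o;
        if (length != other.length || hash != other.hash) return false;
        for (int i = 0; i < length; i++) {
            if (bases[start + i] != other.bases[other.start + i]) return false;
        }
        return true;
    }

    @Override
    public int hashCode() {
        return hash;
    }

    @Override
    public String toString() {
        return new String(bases, start, length, StandardCharsets.US_ASCII);
    }
}

final class KmerCountingSketch {
    /** Pass 1: count every adjacent kmer1 -> kmer2 pair across all reads. */
    static Map<KmerView, Map<KmerView, Integer>> countPairs(List<byte[]> reads, int k) {
        Map<KmerView, Map<KmerView, Integer>> counts = new HashMap<>();
        for (byte[] read : reads) {
            // kmer2 starts at i + 1, so it must end within the read
            for (int i = 0; i + k < read.length; i++) {
                KmerView kmer1 = new KmerView(read, i, k);
                KmerView kmer2 = new KmerView(read, i + 1, k);
                counts.computeIfAbsent(kmer1, x -> new HashMap<>())
                      .merge(kmer2, 1, Integer::sum);
            }
        }
        return counts;
    }

    /**
     * Pass 2: touch the graph once per distinct pair, adding the whole count.
     * The graph here is a stand-in map from "kmer1->kmer2" to edge multiplicity.
     * Returns the number of graph updates performed.
     */
    static int addEdges(Map<KmerView, Map<KmerView, Integer>> counts,
                        Map<String, Integer> graph) {
        int updates = 0;
        for (Map.Entry<KmerView, Map<KmerView, Integer>> from : counts.entrySet()) {
            for (Map.Entry<KmerView, Integer> to : from.getValue().entrySet()) {
                String edge = from.getKey() + "->" + to.getKey();
                graph.merge(edge, to.getValue(), Integer::sum);
                updates++;
            }
        }
        return updates;
    }
}
```

With two copies of the read `ACGTACGT` and k = 3, the first pass sees 10 pair occurrences but the second pass performs only 4 graph updates, one per distinct pair. Because each `KmerView` shares the read's array, the reads must not be modified while the views are in use.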
||
|---|---|---|
| licensing | ||
| protected | ||
| public | ||
| settings | ||
| .gitignore | ||
| build.xml | ||
| intellij_example.tar.bz2 | ||
| ivy.xml | ||