
Search parity

Mankind would be nowhere without respect for the fundamentals. And search might be the most fundamental process of all. Faster ways to find information might sound boring, but we literally use those advances every second. Remember the phone book? (Shudder) We remember the phone book. For you Millennials, it was this actual paperback book with all the phone numbers for your city. What’s that, you don’t know what paper is? Oh, never mind . . .

Hashing, SQL, relational databases, query structure, graph theory, probabilistic inference, AI, parallelization . . . your eyes are glazing over. Who let the coder geeks in?! Well, you need to buy them lunch. They made it possible for you to use Google, BLAST, Siri, etc. They put the world at your fingertips. Well, most of the world anyway. Anything that can be transformed into a string of characters, that is. But what happens when there are billions or trillions of strings to search? Not even current computing power can keep up. Google solved this problem with graphs that found shorter paths to quality search hits by evaluating the connections between the hits. BLAST did something similar by grouping and filtering sequences. But no one has really solved this for chemical structures.

Until now. Our friendly hackers at Molsoft figured out a way to run a chemical structure similarity search across billions of chemicals in just minutes. As recently as last year this was assumed to be out of reach, since a systematic search means the computer compares a string or signature representing the query structure against every other chemical structure in the database, a billion times over. That would take hours or days on even the most powerful computer. Instead, Molsoft’s engineers adopted a two-tier filtering strategy: a single character or bit of the structure is first used to quickly rule out an enormous fraction of the database, and the systematic search then proceeds only on the small subset of likely matches that remains.
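For the coder geeks in the room, here is a rough sketch of what a two-tier filter can look like. To be clear, this is not Molsoft’s actual method, which isn’t described in detail here. It swaps in a well-known stand-in: structures are assumed to be stored as bit fingerprints (plain Python integers), the cheap first tier uses only the count of set bits to rule out structures that cannot possibly reach the similarity threshold, and the full Tanimoto comparison runs only on the survivors. The function names, the toy fingerprints, and the threshold are all made up for illustration.

```python
def popcount(x: int) -> int:
    """Number of set bits in a fingerprint."""
    return bin(x).count("1")


def tanimoto(a: int, b: int) -> float:
    """Tanimoto similarity: shared bits divided by bits set in either."""
    union = popcount(a | b)
    return popcount(a & b) / union if union else 1.0


def two_tier_search(query: int, database: dict, threshold: float) -> list:
    """Return (name, score) pairs with similarity >= threshold, best first.

    Tier 1 (cheap): Tanimoto can never exceed min(bits) / max(bits),
    so any structure whose bit count is too far from the query's is skipped.
    Tier 2 (expensive): full comparison on whatever survives tier 1.
    """
    q_bits = popcount(query)
    hits = []
    for name, fp in database.items():
        n_bits = popcount(fp)
        lo, hi = min(q_bits, n_bits), max(q_bits, n_bits)
        if hi and lo / hi < threshold:
            continue  # tier 1: ruled out without a full comparison
        score = tanimoto(query, fp)  # tier 2: the systematic comparison
        if score >= threshold:
            hits.append((name, score))
    return sorted(hits, key=lambda hit: -hit[1])


# Toy fingerprints (hypothetical), 8 bits each.
toy_db = {
    "a": 0b11110000,
    "b": 0b11100000,
    "c": 0b00000001,  # too few bits: skipped in tier 1
    "d": 0b00001111,  # passes tier 1, rejected in tier 2
}
results = two_tier_search(0b11110000, toy_db, threshold=0.7)
```

The point of the design is that the first tier costs almost nothing per structure, so the expensive comparison is paid only for the small fraction of the database that could plausibly match.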

Molsoft has been at it before with these creative search strategies, having made its bones with the clever Biased Probability Monte Carlo search of peptide conformations, also an NP-hard problem, for which its approach essentially remains the leading solution. It’s creativity like this, applied to practical fundamentals, that really moves mankind forward with little fanfare, and we at GeneCentrix are doing our part.
