Artificial intelligence (AI) conjures up images of Terminators and human enslavement to machines, but, in fact, AI is already a huge and welcome part of our iPhone-driven lives, guiding us in everything from finding the nearest burrito to keeping our mailboxes free of spam. Still, demonstrations by Google and IBM that an AI program could learn, in a short time, to play a complex board game or quiz show better than any human can't help but give the average person the willies. If it can beat me at Jeopardy, what if it decides it needs to beat me to the nearest electrical grid?
However, AI still has its limitations, and one of them is in biochemistry and drug discovery. Yes, Nature is still too much for the machines. AI has never discovered an FDA-approved drug. It hasn't even designed a decent new molecule.
Nevertheless, we keep trying. One of the holy grail challenges of molecular science is to make sense of the seemingly infinite landscape of protein sequences and of how those sequences dictate three-dimensional protein structure. Now a team led by celebrity Harvard scientist George Church reports the application of AI to protein design.
Machine learning, the kind of AI at work here, essentially uses large bodies of data to find models that unify the data. There are two basic approaches, structured and unstructured. The structured approach needs less data: features that correlate with whatever is being sought are cleaned up and handed to the machine, which is told what to look for. The more exciting unstructured approach provides little direction, and the program works out for itself how to learn the model on the fly. Its main requirement is an enormous amount of diverse data, on which the program can learn by trial and error. The most common type of program is loosely modeled on the way the human brain learns and is called a neural network. Stacking such networks in many layers has enriched the approach to the point where it is now called deep learning.
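For readers who like to see the idea in action, here is a toy sketch of learning by trial and error. It trains a single artificial neuron, the building block that deep learning stacks in layers, to discover the logical AND rule from four examples. Everything in it (the data, the learning rate, and the names `predict` and `train`) is an illustrative assumption; it is far simpler than, and not taken from, anything the Church group used.

```python
# Toy example: one artificial neuron learns the logical AND rule from examples.
# The data, learning rate, and function names are illustrative assumptions.

def predict(weights, bias, inputs):
    """Output 1 if the weighted sum of the inputs exceeds zero, else 0."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0


def train(examples, learning_rate=1.0, max_passes=100):
    """Perceptron rule: nudge the weights after every wrong guess."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(max_passes):
        mistakes = 0
        for inputs, target in examples:
            error = target - predict(weights, bias, inputs)
            if error != 0:
                mistakes += 1
                # Move each weight toward the answer that would have been right.
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, inputs)]
                bias += learning_rate * error
        if mistakes == 0:
            # A full pass with no wrong guesses: the rule has been learned.
            break
    return weights, bias


# Each example pairs two inputs with the answer the neuron should give.
examples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train(examples)
predictions = [predict(weights, bias, inputs) for inputs, _ in examples]
print(predictions)  # prints [0, 0, 0, 1]
```

The neuron is never told the rule. It only guesses, gets corrected, and adjusts, and after a few passes over the examples its guesses match the answers. Real deep learning systems apply the same loop to millions of adjustable weights and vastly more data.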
Deep learning is what the Church group used to try to learn the elusive rules of how protein sequences dictate protein structure. The publication outlines a series of validation tests that, on the surface, seem to indicate that the program learned key aspects of amino acids and protein structure, but the outputs were somewhat of the straw man variety. Nevertheless, the speed at which the rules were learned was impressive. These and other investigations suggest that improvements to reduce drug attrition, the mission of GeneCentrix, may not be far behind.