
Discovering New Chemicals

May 10, 2013

Trial-and-error, or "shotgun," methods can be useful when you have very little understanding of a material or a process. One trial-and-error technique is the industrial experiment, formalized in a methodology called design of experiments (DOE, not to be confused with the Department of Energy). I wrote about the design of experiments methodology in a previous article (Tea Party Technologists, November 18, 2011).

I learned two important things when I attended a DOE course many years ago. The first was that I had been using the approach in my own work without knowing that it was a formal technique. The second was that the technique was being over-sold to the students and their employers.

A fundamental tenet of DOE is the idea that it's theory-free, which is to say that you need no experience in, say, superalloys to invent a new superalloy. If you're an expert in a field, you're instructed to ignore all prior beliefs and perform all the experiments demanded in the DOE analysis, even those that seem absurd.

At the end of the course I attended, each student was asked to design experiments to solve a problem of his choosing. To make a point, the object of my DOE was winning the New Jersey State Lottery. The only variables in the lottery process under my control were the day of the week on which I bought my ticket and where I bought it, and neither has any effect on a random drawing. So much for a DOE's ability to solve any problem.
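
As a rough illustration of what such a design looks like on paper, here is a minimal Python sketch of a full-factorial run list for the two lottery variables. The store names and the `full_factorial` helper are hypothetical placeholders of my own, not anything from the course.

```python
from itertools import product

# The two controllable factors named above. The store names are
# hypothetical placeholders, not real retailers.
DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
STORES = ["store_A", "store_B", "store_C"]


def full_factorial(*factors):
    """Return every combination of factor levels, one tuple per run."""
    return list(product(*factors))


runs = full_factorial(DAYS, STORES)
print(len(runs))  # 7 days x 3 stores = 21 runs
print(runs[0])    # ('Mon', 'store_A')
```

A full-factorial design tries every combination of levels, so the run count is the product of the level counts. That is manageable for two factors and quickly becomes impractical for many.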


The odds are so far against you, it doesn't make sense to play the lottery.

Although there's no good reason to buy a ticket, one engineer with whom I worked had a good argument for buying a second ticket. It's the least expensive way that you can double your odds of winning.

(Girl at the Lottery, Mädchen vor dem Lotteriegewölbe, an 1829 oil on canvas painting by Peter Fendi, 1796–1842, now at the Austrian Gallery Belvedere, via Wikimedia Commons.)

Progress in computing, robotics and automated analytical techniques has enabled the application of the trial-and-error technique to chemistry, a method called combinatorial chemistry. In this process, chemical compounds in huge numbers are synthesized from combinations of precursor elements or molecules. These are then tested automatically for a selected property.

In my own field of materials science, combinatorial chemistry has been used in the discovery of new catalysts and high-performance luminescent materials. Although combinatorial chemistry has been used for quite a while in drug development, very few drug discoveries have come from it. Theory is still an important part of science.

One reason for the inability of combinatorial chemistry to deliver new drugs might be that the chemists are using a limited subset of chemicals as their precursors. Unless they think outside their conventional compound libraries, nothing new will be forthcoming. Recent research at Duke University and the University of Pittsburgh has followed up on that idea by examining the large region of compound "space" that's still unexplored.[1-3] An article on their analysis has been published in a recent issue of the Journal of the American Chemical Society.[1]

The research team studied what they called the small molecule universe, molecules with mass less than or equal to 500 daltons. Molecules of this size can bind to cells and cross cell walls.[3] This limits the number of candidate molecules, but the number is still huge, estimated to be more than 10^60.[1-2] To date, only about a hundred million of these have been synthesized by chemists.[2]
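
To put those two numbers side by side, here is a short Python calculation. It takes the quoted lower bound of 10^60 at face value and rounds "about a hundred million" to 10^8, so the result is only an order-of-magnitude figure.

```python
from fractions import Fraction

ESTIMATED_SPACE = 10**60  # lower-bound estimate quoted in the text
SYNTHESIZED = 10**8       # "about a hundred million", rounded

# Exact ratio of molecules made so far to molecules that could exist.
fraction_explored = Fraction(SYNTHESIZED, ESTIMATED_SPACE)
print(fraction_explored == Fraction(1, 10**52))  # True
```

In other words, at most about one part in 10^52 of the small molecule universe has been made.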

As can be expected, the study was done using a computer algorithm to assess the feasibility of existence of randomly generated small molecules. The research team has named this algorithm ACSESS (Algorithm for Chemical Space Exploration with Stochastic Search).[1] The algorithm uses a variational technique to transform known compounds into potentially useful compounds. Study coauthor Aaron Virshup, of Duke University, explains:
"The idea was to start with a simple molecule and make random changes, so you add a carbon, change a double bond to a single bond, add a nitrogen. By doing that over and over again, you can get to any molecule you can think of."[2]

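The random-change idea in that quotation can be sketched with a toy Python model. This is not the ACSESS code. It assumes a molecule is just a linear chain of carbon and nitrogen atoms joined by single or double bonds, fills the remaining valences with hydrogen, and rejects any change that breaks a valence limit or exceeds the 500 dalton cutoff. All of the function names are my own.

```python
import random

MASS = {"C": 12.011, "N": 14.007, "H": 1.008}  # standard atomic weights
VALENCE = {"C": 4, "N": 3}
MAX_MASS = 500.0  # dalton cutoff from the text


def bond_sums(atoms, bonds):
    """Total bond order at each atom of a linear chain."""
    sums = [0] * len(atoms)
    for i, order in enumerate(bonds):
        sums[i] += order
        sums[i + 1] += order
    return sums


def is_valid(atoms, bonds):
    """True if no atom exceeds its normal valence."""
    return all(s <= VALENCE[a] for a, s in zip(atoms, bond_sums(atoms, bonds)))


def mass(atoms, bonds):
    """Molecular mass, with unused valences filled by hydrogen."""
    total = 0.0
    for a, s in zip(atoms, bond_sums(atoms, bonds)):
        total += MASS[a] + (VALENCE[a] - s) * MASS["H"]
    return total


def mutate(atoms, bonds, rng):
    """Apply one random change: add C, add N, or flip a bond order."""
    atoms, bonds = list(atoms), list(bonds)
    move = rng.choice(["add_C", "add_N", "flip_bond"])
    if move == "flip_bond":
        if bonds:
            i = rng.randrange(len(bonds))
            bonds[i] = 1 if bonds[i] == 2 else 2
    else:
        atoms.append("C" if move == "add_C" else "N")
        bonds.append(1)
    return atoms, bonds


def random_walk(steps, seed=0):
    """Start from methane and keep only changes that stay valid."""
    rng = random.Random(seed)
    atoms, bonds = ["C"], []
    for _ in range(steps):
        new_atoms, new_bonds = mutate(atoms, bonds, rng)
        if is_valid(new_atoms, new_bonds) and mass(new_atoms, new_bonds) <= MAX_MASS:
            atoms, bonds = new_atoms, new_bonds
    return atoms, bonds


def to_string(atoms, bonds):
    """SMILES-like text for the chain, with '=' marking double bonds."""
    out = atoms[0]
    for a, order in zip(atoms[1:], bonds):
        out += ("=" if order == 2 else "") + a
    return out


atoms, bonds = random_walk(200, seed=1)
print(to_string(atoms, bonds), round(mass(atoms, bonds), 3))
```

Real molecules have rings, branches and many more elements, so this toy covers only a sliver of chemical space. It does show the accept-or-reject loop that lets a stream of small random edits wander away from a simple starting molecule.
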
The algorithm was improved by having synthetic chemists review the potential molecules to determine which were feasible, and which were not. Their feedback was distilled into rules that were incorporated into an improved algorithm.[2] After ten iterations, the computer program gave a set of nine million molecules, and a map was created to show unexplored regions of the molecular space (see figure).[2]
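
One plausible way to fold such feedback into a program is as a list of rule functions that every candidate must pass. The Python sketch below works on SMILES-like strings for simple chains. The two rules are hypothetical examples that I made up for illustration, not the rules the chemists supplied for the study.

```python
def no_nitrogen_chain(smiles):
    """Hypothetical rule: reject three or more nitrogens in a row."""
    stripped = smiles.replace("=", "")
    return "NNN" not in stripped


def no_cumulated_double_bonds(smiles):
    """Hypothetical rule: reject an atom with double bonds on both sides."""
    return "=C=" not in smiles and "=N=" not in smiles


RULES = [no_nitrogen_chain, no_cumulated_double_bonds]


def passes_rules(smiles, rules=RULES):
    """A candidate is kept only if every rule accepts it."""
    return all(rule(smiles) for rule in rules)


candidates = ["CCN", "C=C=C", "CN=NN", "CC=CN"]
kept = [s for s in candidates if passes_rules(s)]
print(kept)  # ['CCN', 'CC=CN']
```

Keeping each rule as a separate function means a new piece of expert feedback becomes one more entry in the list, which fits the iterate-and-review loop described above.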


A map representing the areas of the small molecule universe still unexplored by synthetic chemists.

(Duke University image by Virshup et al. Used with permission.)[2]

One advantage of the map is that it highlights potential drugs that haven't been studied. Says Virshup, "If you're in the blank spaces on our small molecule map, you're guaranteed to make something that isn't patented yet."[2] This research was supported by the National Institutes of Health, and the source code for the program is available online.[2]


  1. Aaron M. Virshup, Julia Contreras-Garcia, Peter Wipf, Weitao Yang and David N. Beratan, "Stochastic voyages into uncharted chemical space produce a representative library of all possible drug-like compounds," J. Am. Chem. Soc., Epub ahead of print, April 2, 2013, DOI: 10.1021/ja401184g.
  2. Ashley Yeager, "Scientists Map All Possible Drug-like Chemical Compounds," Duke University Press Release, April 22, 2013.
  3. Rebecca Boyle, "A Library Of Every Drug That Could Ever Exist - Drug researchers go along on a stochastic voyage," Popular Science, April 24, 2013.
  4. Philip J. Hajduk, Warren R. J. D. Galloway and David R. Spring, "Drug discovery: A question of library design," Nature, vol. 470, no. 7332 (February 3, 2011), pp. 42-43.
