Reference: Karp, P. D. Hypothesis Formation and Qualitative Reasoning in Molecular Biology. Knowledge Systems Laboratory, June, 1989.
Abstract: This dissertation investigates scientific reasoning from a computational perspective. The investigation focuses on a program of research in molecular biology that culminated in the discovery of a new mechanism of gene regulation in bacteria, called attenuation. The dissertation concentrates on a particular type of reasoning called hypothesis formation. Hypothesis-formation problems occur when the outcome of an experiment predicted by a scientific theory does not match that observed by a scientist. I present methods for solving hypothesis-formation problems that have been implemented in a computer program called HYPGENE. This work is also concerned with how to represent theories of molecular biology in a computer, and with how to use such theories to predict experimental outcomes; I present a framework for performing these tasks that is implemented in a program called GENSIM. I tested both HYPGENE and GENSIM on sample problems that biologists solved during their research on attenuation. The dissertation includes a historical study of the attenuation research. This study is novel because it examines a large, complex, and modern program of scientific research. The thesis treats hypothesis formation as a design problem, and uses design methods to solve hypothesis-formation problems. The HYPGENE program reasons backward from an error in a GENSIM prediction of an experimental outcome. It uses hypothesis-design operators to design modifications to the initial conditions of the experiment, and to the biological theory, such that the new predicted outcome of the experiment computed by GENSIM will match the observation. This approach is largely domain-independent because most design operators include no domain concepts. The design operators are complete in that they can synthesize any hypothesis that can be represented within the GENSIM framework. HYPGENE uses a planner to satisfy its design goals. 
This method of hypothesis formation is efficient because it is goal directed (in contrast to previous generate-and-test approaches), and because reference experiments can be used both to guide the generation of hypotheses, and to filter hypotheses. A reference experiment has initial conditions that are similar to those of the anomalous experiment, but has an outcome that is correctly predicted by the theory.
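Illustration (not part of the original abstract): the filtering role of a reference experiment can be sketched in a few lines of Python. This is a toy under stated assumptions, not HYPGENE or GENSIM. The names `Experiment`, `filter_hypotheses`, and the three toy theories are hypothetical, a "theory" is reduced to a function from initial conditions to a predicted outcome, and the candidate hypotheses are simply listed by hand rather than designed backward from the prediction error as the abstract describes.

```python
from typing import Callable, Dict, List, NamedTuple, Tuple

Conditions = Dict[str, bool]
# Assumption: a theory is modeled as a function from initial conditions
# to a predicted outcome. GENSIM's actual representation is far richer.
Theory = Callable[[Conditions], str]


class Experiment(NamedTuple):
    conditions: Conditions
    observed: str


def filter_hypotheses(
    candidates: List[Tuple[str, Theory]],
    anomalous: Experiment,
    references: List[Experiment],
) -> List[str]:
    """Keep candidate theories that explain the anomalous observation
    and still predict every reference experiment correctly."""
    kept = []
    for name, theory in candidates:
        explains = theory(anomalous.conditions) == anomalous.observed
        consistent = all(theory(r.conditions) == r.observed for r in references)
        if explains and consistent:
            kept.append(name)
    return kept


# Hypothetical toy theories of whether a gene is expressed.
def original(c: Conditions) -> str:
    return "expressed" if c["inducer"] else "silent"


def always_on(c: Conditions) -> str:
    return "expressed"


def mutation_matters(c: Conditions) -> str:
    return "expressed" if c["inducer"] or c["repressor_mutated"] else "silent"


# The anomalous experiment: the original theory predicts "silent".
anomalous = Experiment({"inducer": False, "repressor_mutated": True}, "expressed")
# A reference experiment: similar conditions, outcome correctly predicted.
reference = Experiment({"inducer": False, "repressor_mutated": False}, "silent")

candidates = [
    ("original", original),
    ("always_on", always_on),
    ("mutation_matters", mutation_matters),
]
result = filter_hypotheses(candidates, anomalous, [reference])
print(result)
```

In this toy, the overly general `always_on` candidate matches the anomalous observation but is rejected because it breaks the prediction for the reference experiment. Only the candidate that accounts for the difference between the two experiments survives.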