Research

Genome analysis views DNA as a linear string of the letters A, C, G, and T, but proteins recognize DNA as a three-dimensional object (Figure). Our main interest is to better understand how transcription factors (TFs) recognize nuances in intrinsic DNA structure, and to identify TF families for which the readout of local DNA shape contributes to binding specificity and explains the distinct functions of closely related TFs. Until recently, our research focused mainly on the analysis of TF binding sites (TFBSs) for which structural information was available. However, advances in high-throughput sequencing technologies have made sequence information available for whole genomes in recent years, whereas structural information on that scale is not available. It is still unknown why certain TFs bind to similar DNA sequences but execute different in vivo functions or, conversely, bind to diverse sequences. Our scientific contributions suggest that direct chemical contacts with base pairs cannot sufficiently explain binding specificity and that DNA shape is a crucial specificity determinant:

1. We discovered that Drosophila Hox proteins achieve binding specificity through readout of minor groove geometry and that base-specific hydrogen bonds in the major groove are not sufficient for in vivo function (Joshi et al., Cell 2007).

2. Based on the analysis of all available crystal structures of protein-DNA complexes, we generalized our finding that many TF families use arginine residues to recognize minor groove shape and electrostatic potential and that similar shape-dependent interactions with histones contribute to the stabilization of nucleosomes (Rohs et al., Nature 2009).

3. For binding sites of the tumor suppressor p53, we found a different mechanism for altering minor groove shape due to a flip of several base pairs from Watson-Crick to Hoogsteen geometry (Kitayner et al., Nat. Struct. Mol. Biol. 2010).

4. The discovery of minor groove shape recognition has led to a new classification of protein-DNA readout modes into base readout and shape readout (Rohs et al., Annu. Rev. Biochem. 2010), which has already been adopted in a textbook.

5. We developed a high-throughput method for minor groove geometry prediction. In a proof-of-principle study, we published the first application of this approach, based on the shape analysis of several hundred thousand TFBSs (Slattery et al., Cell 2011). We predicted the minor groove width of Drosophila Hox binding sites derived from SELEX-seq experiments and discovered that Hox TFs, although they bind to target sites that are similar in sequence, prefer distinct minor groove topographies. Hox TFs responsible for the development of anterior regions of the fly select one shape class, while Hox proteins involved in posterior development prefer a different shape class.

6. We are currently expanding our high-throughput method to predict all essential structural features of TFBSs at single-nucleotide resolution. This approach is based on data mining of thousands of Monte Carlo trajectories, which we validated against all available crystal structures. Using a sliding pentamer window, we derive the average conformation at the center of each unique pentamer to predict structural features. Our high-throughput method is very fast compared with molecular simulations and predicts shape features of, for instance, the entire yeast genome in about one minute on a single processor. This advance makes DNA shape analysis practical on a genome-wide scale for the first time.
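
The sliding pentamer window in item 6 can be illustrated with a minimal Python sketch. It is not our production code. It assumes a lookup table that maps each unique pentamer to one averaged value of a shape feature at its central position (minor groove width is used as the example). It also assumes that a pentamer and its reverse complement share that value, so one table entry serves both strands. The function names and the two table values are placeholders for illustration only, not results mined from the Monte Carlo trajectories.

```python
from typing import Dict, List, Optional

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical(pentamer: str) -> str:
    """Pick one representative for a pentamer and its reverse complement.

    Assumption: the feature at the central base pair is the same when the
    pentamer is read from either strand.
    """
    return min(pentamer, reverse_complement(pentamer))


def predict_shape(seq: str, table: Dict[str, float]) -> List[Optional[float]]:
    """Slide a 5-bp window along seq and look up the central-position value.

    The two positions at each end have no complete pentamer and stay None,
    as does any window missing from the table (for example one containing N).
    """
    seq = seq.upper()
    profile: List[Optional[float]] = [None] * len(seq)
    for center in range(2, len(seq) - 2):
        window = seq[center - 2:center + 3]
        profile[center] = table.get(canonical(window))
    return profile


if __name__ == "__main__":
    # Hypothetical table entries; the numbers are placeholders, not data.
    toy_table = {"ACGTA": 5.0, "CGTAC": 4.5}
    print(predict_shape("ACGTAC", toy_table))
```

Because each position costs a single dictionary lookup, the run time of such a scheme grows linearly with sequence length, which is what makes a table-based approach feasible for whole genomes.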

Building on this high-throughput method for DNA shape prediction, our lab will apply DNA shape analysis to a variety of biological questions that we believe will benefit from integrating studies of DNA sequence and shape. Our immediate research plans focus on analyzing, on a genome-wide basis, the role of various intrinsic DNA shape features in achieving the DNA binding specificity of closely related TFs. Based on our preliminary results, we expect that genome-wide DNA shape analysis will become an important aspect of interpreting high-throughput sequencing data and will provide a better understanding of the genome and its diverse functions.



January 30, 2017
Our newest NAR paper with the Tullius lab addresses the role of intrinsic versus protein-induced DNA shape.

January 11, 2018
We published a new study in Genome Research revealing a protein family-specific relationship between TF binding and histone modifications. Congrats, Beibei!

November 20, 2017
We expanded our high-throughput prediction method to 13 DNA shape features with a new publication in NAR. Congrats, Jinsen!

November 20, 2017
Our recent Yang et al. Mol. Syst. Biol. paper won a RECOMB/ISCB Top-10 Paper Award in regulatory and systems genomics for 2016/17.

October 11, 2017
We published a new method to derive minor-groove electrostatic potential on a genomic scale in NAR. Congrats, Tsu-Pei!

August 16, 2017
Remo started the Quantitative Biology (QBIO) major at the interface of biology and computer science.

April 26, 2017
Remo accepted reappointment as Vice Chair of USC's Department of Biological Sciences through August 2019. Fight on!

April 20, 2017
We published our interactive tool for structural analysis of protein-DNA complexes in NAR. Congrats, Jared!

March 20, 2017
Tsu-Pei was awarded the prestigious Manning Endowed Fellowship. Congrats, Tsu-Pei!

March 20, 2017
Beibei was awarded a competitive Research Enhancement Fellowship. Congrats, Beibei!

February 6, 2017
Our new Mol. Syst. Biol. paper provides a systematic analysis of DNA shape readout for many protein families. Congrats, Lin!

Recent news

September 28, 2017
Faculty of Biological Sciences Seminar, Pontificia Universidad Católica de Chile, Santiago, Chile

September 23-26, 2017
Molecular Biosystems Conference on Eukaryotic Gene Regulation & Functional Genomics, Puerto Varas, Chile

August 20-24, 2017
Symposium on Molecular Recognition, 254th American Chemical Society Meeting, Washington, DC

May 24, 2017
Workshop “Mathematical Oncology: Modeling Clinical Data for Maximum Patient Benefit”, University of Southern California, Los Angeles, CA

April 28, 2017
Department of Bioinformatics and Genomics, University of North Carolina, NC

April 13, 2017
Department of Chemistry, University of Utah, Salt Lake City, UT

March 22, 2017
Biochemistry, Molecular Biology and Biophysics, College of Biological Sciences, University of Minnesota Twin Cities, Minneapolis, MN

March 9, 2017
Program in Quantitative and Computational Biology, Lewis Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ

Recent presentations

Program Director, Bachelor of Science (B.S.) in Quantitative Biology (QBIO) 

QBIO 105
Introduction to Quantitative Biology
Remo co-teaches with Professor Michael Waterman