Computational Analysis Tools



DNAshapeR is a software package implemented in the statistical programming language R that predicts DNA shape features in an ultra-fast, high-throughput manner from genomic sequencing data. The package takes either nucleotide sequences or genomic coordinates as input and generates various graphical representations for visualization and further analysis. DNAshapeR also encodes user-defined combinations of k-mer and DNA shape features as feature vectors. The resulting feature matrices can be used directly as input to various machine learning software packages for further modeling studies.

Chiu et al. DNAshapeR: an R/Bioconductor package for DNA shape prediction and feature encoding.
Bioinformatics 32, 1211-1213 (2016)
Link to DNAshapeR software package
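
The feature encoding described for DNAshapeR can be illustrated with a minimal sketch. This is not the DNAshapeR API (the package itself is written in R); the function names kmer_one_hot and encode_features are hypothetical, and the minor groove width values in the usage lines are made-up placeholders rather than predictions. The sketch only shows the general idea of one-hot encoding overlapping k-mers and appending per-position shape values to form one row of a feature matrix.

```python
from itertools import product

ALPHABET = "ACGT"


def kmer_one_hot(seq, k=1):
    """One-hot encode every overlapping k-mer of seq into a flat 0/1 list."""
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    vector = []
    for start in range(len(seq) - k + 1):
        block = [0] * len(kmers)
        block[index[seq[start:start + k].upper()]] = 1
        vector.extend(block)
    return vector


def encode_features(seq, shape_features=None, k=1):
    """Concatenate k-mer one-hot features with optional shape values.

    shape_features is a hypothetical dict mapping a feature name to a list
    of per-position values supplied by the caller (for example, the output
    of a shape prediction tool). Features are appended in sorted name order.
    """
    row = kmer_one_hot(seq, k)
    for name in sorted(shape_features or {}):
        row.extend(shape_features[name])
    return row


# Usage: the MGW numbers below are placeholders, not real predictions.
row = encode_features("AC", {"MGW": [5.0, 4.5]}, k=1)
```

Stacking one such row per sequence gives a matrix that a regression or classification library can consume. Sequences must have equal length for the rows to align.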


GBshape provides DNA shape annotations of entire genomes. The database currently contains annotations for minor groove width, roll, propeller twist, helix twist and hydroxyl radical cleavage for 98 different organisms. Additional genomes can easily be added in the provided framework. GBshape contains two major tools, a genome browser and a table browser. The genome browser provides a graphical representation of DNA shape annotations alongside standard genome browser annotations.

Chiu et al. GBshape: a genome browser database for DNA shape annotations.
Nucleic Acids Res. 43, D103-109 (2015)
Link to GBshape database


Our new TFBSshape database disentangles the complex relationships between DNA sequence, its 3D structure, and protein-DNA binding specificity. The TFBSshape database augments nucleotide sequence motifs with heat maps and quantitative predictions of DNA shape features for 739 TF datasets from 23 different species.

Yang et al. TFBSshape: a motif database for DNA shape features of transcription factor binding sites.
Nucleic Acids Res. 42, D148-155 (2014)
Link to TFBSshape database


We developed a new method for predicting DNA shape in a high-throughput manner on a genome-wide scale. This approach predicts structural features (several helical parameters and minor groove width) for the entire yeast genome in less than one minute on a regular laptop. The prediction can be visualized as genome browser tracks and compared to other properties of the genome such as sequence conservation.

Zhou et al. DNAshape: a method for the high-throughput prediction of DNA structural features on a genomic scale.
Nucleic Acids Res. 41, W56-62 (2013)
Link to DNAshape web server


Additional Data



Download page

L. Yang et al. Transcription factor family-specific DNA shape readout revealed by quantitative specificity models.
Mol. Syst. Biol. in press (2017)
Supplementary Information


Download page

T. Zhou et al. Quantitative modeling of transcription factor binding specificities using DNA shape.
Proc. Natl. Acad. Sci. USA 112, 4654-4659 (2015)
Supplementary Information


Download page

N. Abe et al. Deconvolving the recognition of DNA sequence from shape.
Cell 161, 307-318 (2015)
Supplementary Information

Recent news

April 26, 2017
Remo accepted reappointment as Vice Chair of USC's Department of Biological Sciences through August 2019. Fight on!

April 20, 2017
We published our interactive tool for structural analysis of protein-DNA complexes in NAR. Congrats, Jared!

March 20, 2017
Tsu-Pei was awarded the prestigious Manning Endowed Fellowship. Congrats, Tsu-Pei!

March 20, 2017
Beibei was awarded a competitive Research Enhancement Fellowship. Congrats, Beibei!

February 6, 2017
Our new Mol. Syst. Biol. paper provides a systematic analysis of DNA shape readout for many protein families. Congrats, Lin!

Recent presentations

September 28, 2017
Faculty of Biological Sciences Seminar, Pontificia Universidad Católica de Chile, Santiago, Chile

September 23-26, 2017
Molecular Biosystems Conference on Eukaryotic Gene Regulation & Functional Genomics, Puerto Varas, Chile

August 20-24, 2017
Symposium on Molecular Recognition, 254th American Chemical Society Meeting, Washington, DC

May 24, 2017
Workshop “Mathematical Oncology: Modeling Clinical Data for Maximum Patient Benefit”, University of Southern California, Los Angeles, CA

April 28, 2017
Department of Bioinformatics and Genomics, University of North Carolina, NC

April 13, 2017
Department of Chemistry, University of Utah, Salt Lake City, UT

March 22, 2017
Biochemistry, Molecular Biology and Biophysics, College of Biological Sciences, University of Minnesota Twin Cities, Minneapolis, MN

March 9, 2017
Program in Quantitative and Computational Biology, Lewis Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ