L. Yang et al.
Transcription factor family-specific DNA shape readout revealed by quantitative specificity models.
Mol. Syst. Biol. in press (2017)
Supplementary Information
DESCRIPTION
This page provides download links for all trained models described in the paper, together with the preprocessed M-word scores and encoded features needed to reproduce the results. Each MLR model is the average of the 10 models learned during 10-fold cross-validation. PSSMs are information-content based and were calculated from the top 200 M-words for each TF.
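The averaging step can be illustrated with a minimal sketch. It assumes that "average" means the element-wise mean of the per-fold coefficient vectors; the function name `average_models` and the example numbers are hypothetical and are not taken from the downloadable files.

```python
def average_models(fold_weights):
    """Return the element-wise mean of equally long coefficient vectors.

    Assumption: each cross-validation fold yields one coefficient vector,
    and the published model is their plain arithmetic mean.
    """
    n_models = len(fold_weights)
    if n_models == 0:
        raise ValueError("need at least one model")
    length = len(fold_weights[0])
    if any(len(w) != length for w in fold_weights):
        raise ValueError("all coefficient vectors must have the same length")
    # zip(*...) walks the vectors column by column, one coefficient at a time.
    return [sum(column) / n_models for column in zip(*fold_weights)]


# Ten made-up per-fold vectors (an intercept followed by two features).
folds = [[0.5 + 0.01 * k, -1.0, 2.0] for k in range(10)]
averaged = average_models(folds)
```

Averaging coefficients is only meaningful when every fold uses the same feature encoding and column order, which the sketch enforces with the length check.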
DOWNLOAD
Models
Model | Link |
1mer | Download |
1mer+shape | Download |
1mer+shape+3merE2 | Download |
1mer+2mer+3mer | Download |
1mer+2merNoE2+3merNoE2 | Download |
3mer | Download |
1mer+shapei | Download |
shape (first&second order) | Download |
shape-shapei | Download |
shape (first order) | Download |
PSSM (sequence) | Download |
PSSM (shape) | Download |
M-word scores: Download
Each file contains three columns: the M-words, the relative affinity scores for the corresponding M-words, and the M-word counts.
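A minimal sketch of reading such a file follows. It assumes whitespace-delimited columns, no header row, and integer counts; none of these details is stated on this page, and the function name `read_mword_scores` and the sample rows are made up for illustration.

```python
import io


def read_mword_scores(handle):
    """Parse rows of: M-word, relative affinity score, M-word count."""
    records = []
    for line in handle:
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        mword, affinity, count = fields
        records.append((mword, float(affinity), int(count)))
    return records


# Two invented rows standing in for a real M-word score file.
example = io.StringIO("ACGTACGTAC 1.0 532\nACGTACGTAA 0.42 118\n")
records = read_mword_scores(example)
```

For a real file, pass an open file handle instead of the `io.StringIO` stand-in.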
Encoded features: Download
In each feature file, the first column is log2(M-word score) and the second column is the constant 1. The remaining columns are the encoded features. The features are named as follows:
1mer features: *.10000000000
First-order MGW features: *.00010000000
First-order Roll features: *.00001000000
First-order ProT features: *.00000100000
First-order HelT features: *.00000010000
Second-order MGW features: *.00000001000
Second-order Roll features: *.00000000100
Second-order ProT features: *.00000000010
Second-order HelT features: *.00000000001
1mer+shape features: *.10011111111
Standard deviations used for normalizing the shape features: *.scale
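The naming scheme above reads like an 11-character string of 0/1 flags, one per feature group. The sketch below decodes such a suffix and splits one feature row into the response and the predictors. It assumes whitespace-delimited rows; the names `FLAG_NAMES`, `decode_feature_suffix`, and `split_feature_row` are hypothetical. The second and third flag positions are not described on this page, so the sketch leaves them unnamed.

```python
# Flag position (0-based) -> feature group, as listed on this page.
FLAG_NAMES = {
    0: "1mer",
    3: "first-order MGW",
    4: "first-order Roll",
    5: "first-order ProT",
    6: "first-order HelT",
    7: "second-order MGW",
    8: "second-order Roll",
    9: "second-order ProT",
    10: "second-order HelT",
}


def decode_feature_suffix(suffix):
    """List the feature groups switched on in an 11-character 0/1 suffix."""
    if len(suffix) != 11 or set(suffix) - {"0", "1"}:
        raise ValueError("expected 11 characters, each '0' or '1'")
    return [
        FLAG_NAMES.get(i, f"undocumented flag {i + 1}")
        for i, bit in enumerate(suffix)
        if bit == "1"
    ]


def split_feature_row(line):
    """Split a row into (log2 M-word score, [constant 1, features...])."""
    values = [float(v) for v in line.split()]
    return values[0], values[1:]


groups = decode_feature_suffix("10011111111")  # the 1mer+shape suffix
y, x = split_feature_row("-1.25 1 0 1 0 0")    # invented example row
```

The constant-1 column is kept with the predictors here on the assumption that it serves as the intercept term of the regression.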