We recently described a computational model that takes as input only the sequence preferences of DNA- and RNA-binding proteins and accurately predicts known gene structures, as well as expression from randomly generated sequences (de Boer et al., Genome Research 2014).
We are systematically decoding protein-DNA and protein-RNA sequence preferences across the eukaryotes (Weirauch, Yang et al., Cell 2014; Ray, Kazan, Cook, Weirauch, Najafabadi et al., Nature 2013) using a variety of approaches, including new experimental and computational techniques that we develop.
Many DNA- and RNA-binding proteins display deep evolutionary conservation, but most lineages also contain families in which divergence is common. Striking cases include the ~700 human C2H2 zinc finger proteins, whose expansion appears to be driven by retroelements, which are highly enriched among genomic sequences bound in vivo (Najafabadi, Mnaimneh, Schmitges et al., Nature Biotechnology 2015).