Sunday, 18 September 2011

Customising Ensembl Displays







Ensembl's display customisation has become both far more detailed and far easier to access. I'm going to talk you through how to customise the display and what my favourite view is.

The default Ensembl display is something like this. Nice, pretty, but very transcript/gene centric, and with a strong emphasis on splitting up the strands. I am now more of a regulatory person, and genes really form the context of what I want to see, but most of the time I don't want to see the details of the transcripts.

So - on each track title to the left you can click to get an immediate configuration of that track. I click on this to set the genes to "Collapsed with Labels". The order of the tracks can also be changed by dragging the left hand yellow/green bars (the colour indicates the strand). I push the contig to the top, then I have the forward (green) genes and the reverse (yellow) genes. I also put the ncRNA track into "Collapsed, no label" mode (as the gene names for ncRNAs are often pretty uninformative). At the end of all of this, I end up with a display like:








But now we need to add things to the display. The "Configure" button, on the left hand side under the menus, is your friend (it's important to realise that the "Configure" button is page specific, so as you move around the pages in Ensembl it will configure things specific to the page you are on).








Click on this and you will get a rather bewildering set of options for tracks (over 300). Notice you can search across these tracks on the top left, and that the tracks are grouped into different sections.



I like to put in Comparative genomics - the 35 way GERP conservation scores, and the conserved elements. Then I like to switch on regulatory information. Choose the regulatory information section from the Configuration panel. This gives a matrix of cell lines (across the top) and factors (on the left). On the final display you will see these organised first by cell line and then by factor, as colours inside that track. Be careful of the multi cell track as this is the union over all cell lines - if you switch that on you can use up a lot of screen space! I like GM12878, K562 and H1esc on, and go for the "peaks and signal" style. Then I usually have DNaseI, CTCF, cFos, and P300 on (cFos just to get a sense of a "relatively normal" factor).

There are some "gotchas" in this matrix. For example, if you use the "Select all" on the left hand side for a factor, this will switch on that factor in every cell line where it is present - the logic being that it is better to show the user something that they can then get rid of - but I often wanted it on only in the already selected cell lines. Once you get used to this you can either go back to the top and switch things off, or switch things on and off in the individual cells.
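If the "Select all" gotcha is hard to picture, here is a tiny Python sketch of the two behaviours. This is only my toy model of the matrix, not Ensembl code: the function names, the "OtherLine" cell line and the table of which factor has data in which cell line are all made up for illustration.

```python
# Toy model of the cell line x factor matrix. NOT Ensembl code.
# ASSUMPTION: this availability table is invented for illustration only;
# "OtherLine" stands for any cell line you have not switched on.
AVAILABLE = {
    "GM12878": {"DNaseI", "CTCF", "cFos", "P300"},
    "K562": {"DNaseI", "CTCF", "cFos"},
    "H1esc": {"DNaseI", "CTCF", "P300"},
    "OtherLine": {"DNaseI", "CTCF"},
}


def select_all_for_factor(selected, available, factor):
    """The behaviour described above: switch the factor on in every
    cell line that has data for it, selected already or not."""
    result = {cell: set(factors) for cell, factors in selected.items()}
    for cell, factors in available.items():
        if factor in factors:
            result.setdefault(cell, set()).add(factor)
    return result


def select_in_active_lines(selected, available, factor):
    """What I often wanted instead: switch the factor on only in cell
    lines that already have something selected."""
    result = {cell: set(factors) for cell, factors in selected.items()}
    for cell, factors in result.items():
        if factors and factor in available.get(cell, set()):
            factors.add(factor)
    return result


if __name__ == "__main__":
    # Start with DNaseI on in two cell lines, then "select all" CTCF.
    current = {"GM12878": {"DNaseI"}, "K562": {"DNaseI"}}
    print(sorted(select_all_for_factor(current, AVAILABLE, "CTCF")))
    print(sorted(select_in_active_lines(current, AVAILABLE, "CTCF")))
```

The first call drags in every cell line that has CTCF data; the second keeps to the two lines that were already on, which is the state you have to get back to by hand in the real matrix.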


Notice that you can also set up histone modifications and polymerase tracks.


Once all this is done, I get something like:




Before you start browsing around even more, do make sure you set these tracks as "favourites". On the configure page, click on "Active Tracks" (the things which are on now) and then hit the greyed out stars. This moves these tracks into your "Favourites", which means that when you want to turn things on and off, you just need to go to favourites rather than searching for them again.

Go play on:



If you are not human or mouse obsessed, the Ensembl Genomes project has the same sortable and configurable tracks for worm, fly and others, but we have not (yet) put regulatory information at the same level into these species. We hope to, of course.



(Ugh. I've just discovered the mismatch between the Blogger "preview" and how this post and its images really appear. For the last screen shot I've now added in a higher resolution capture - if anyone knows a better way to get images to work in Blogger in the place you'd like them to go, let me know :)).