Team:Heidelberg/Software/RISE

RISE
Registry Intelligent Search Engine

Overview

The iGEM Registry is an important tool for every synthetic biologist planning experiments. It contains all available BioBricks and is growing constantly. For many teams, including ours, the iGEM Registry has provided a solid foundation for their research project.

Nevertheless, we believe that the functionality of the iGEM Registry has not kept pace with the growing number of parts and the demands of the iGEM teams. To name a few shortcomings:

  1. Searching the iGEM Registry is restricted to a part’s name, although it would be valuable to search, for instance, for certain keywords in a part’s description.
  2. Some parts lack a comprehensive description or sequence, so filtering for the uses of a part and for refined documentation by another team can be beneficial.
  3. The overview of a part’s information is very crowded and its layout changes throughout the Registry, so a standardized display of all information connected with one BioBrick is needed.

When we talked to other iGEM teams during our collaboration and education projects and at several online meetups, the Registry was always a controversial topic. But neither coming to terms with the status quo nor demonizing everything was our approach. We wanted to make more use of this valuable resource. This is why we present RISE (iGEM Registry Intelligent Search Engine), an easy-to-use application to search all available BioBricks.

Fork us on GitHub!

Functions of RISE

We use the point-in-time database dump provided by the iGEM Foundation (I). The script csv_creator.py parses this .xml file and creates a .csv file. To help you decide whether RISE is useful for you, we provide the first 10,000 rows in this repository for you to play with.
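The conversion step can be illustrated with a minimal sketch. It is not the actual csv_creator.py: it assumes a mysqldump-style layout in which every part is a <row> element with <field name="..."> children, and the column names, the helper names xml_rows and xml_to_csv, and the part ID in the demo are all hypothetical. The real dump may be structured differently.

```python
import csv
import io
import itertools
import xml.etree.ElementTree as ET

# Hypothetical column names; adjust them to the fields in the real dump.
COLUMNS = ["part_name", "short_desc", "description", "sequence"]


def xml_rows(source, columns=COLUMNS):
    """Yield one dict per <row> element (assumed layout), parsing incrementally."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "row":
            continue
        # Collect <field name="...">text</field> children; empty fields become "".
        fields = {f.get("name"): (f.text or "") for f in elem.iter("field")}
        yield {col: fields.get(col, "") for col in columns}
        elem.clear()  # drop the children of the finished row to limit memory use


def xml_to_csv(source, target, columns=COLUMNS, limit=None):
    """Write up to `limit` rows from `source` to the text file object `target`."""
    writer = csv.DictWriter(target, fieldnames=columns)
    writer.writeheader()
    count = 0
    for row in itertools.islice(xml_rows(source, columns), limit):
        writer.writerow(row)
        count += 1
    return count


# Tiny in-memory demo with a made-up part entry.
sample = (
    b'<table_data name="parts"><row>'
    b'<field name="part_name">BBa_X0001</field>'
    b'<field name="sequence">atgc</field>'
    b"</row></table_data>"
)
buffer = io.StringIO()
xml_to_csv(io.BytesIO(sample), buffer)
print(buffer.getvalue())
```

Passing limit=10000 would produce a reduced file comparable to the sample registry.csv mentioned above. Incremental parsing is used here because the full dump is large.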

RISE.ipynb reads this .csv file and creates a searchable table. Once you have found the entries you need, you can export the corresponding rows as a new .csv file. Additionally, you can export the part name and sequence as a .fasta file for use in other applications, such as 3DOC, a service that concatenates the protein domains of BioBricks and generates a 3D structure. On top of that, we added a GenBank export.
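To illustrate what searching beyond the part name means, here is a small sketch that filters CSV rows by keywords in the description. It uses only the Python standard library and is not the code of the notebook, which works with an interactive table. The function search_rows, the column names and the example entries are assumptions made for this illustration.

```python
import csv
import io


def search_rows(rows, keywords, columns=("part_name", "description")):
    """Return the rows in which every keyword occurs, ignoring case, in the given columns."""
    needles = [keyword.lower() for keyword in keywords]
    hits = []
    for row in rows:
        # Join the searchable columns into one lowercase string per row.
        haystack = " ".join(row.get(col) or "" for col in columns).lower()
        if all(needle in haystack for needle in needles):
            hits.append(row)
    return hits


# Made-up example data in the shape of a small registry.csv.
example_csv = (
    "part_name,description,sequence\n"
    "BBa_X0001,Constitutive promoter for E. coli,atgc\n"
    "BBa_X0002,GFP reporter,ttaa\n"
    "BBa_X0003,Inducible promoter with GFP,ggcc\n"
)
rows = list(csv.DictReader(io.StringIO(example_csv)))
for hit in search_rows(rows, ["promoter", "gfp"]):
    print(hit["part_name"], "-", hit["description"])
```

Requiring all keywords to match narrows the result quickly. Replacing all(...) with any(...) would give a broader search.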

Instructions

First, either run the csv_creator.py script to parse the aforementioned .xml file, or use the registry.csv file in our repository (only the first 10,000 rows) for testing purposes. After that, head to RISE.ipynb:
The first thing you should do is run all cells. As the registry is quite big, this can take a while, so please be patient. Once the cells have finished, you will be greeted by the clear and clean layout of RISE.
Figure 1: Run all
After opening the RISE.ipynb file you should click on Cell > Run All to run all cells in the notebook.
Figure 2: Overview of RISE
This is how RISE should look after running all cells.
Now you can start investigating the iGEM registry. You can use the little funnel symbols on the column names to filter the corresponding column.
Figure 3: Using Filters
Filtering is very intuitive. Simply click on the funnel symbol of the column you want to filter and start searching.
After you have applied all the filters you want, you should see a table significantly reduced in size. The next step is to select the rows you need. The controls work similarly to selecting files in your operating system.
Figure 4: Choose the rows you need
After filtering, you can narrow down your search further by selecting the rows you want.
Once you have selected the rows, add them to the export dataframe by clicking the button. You can then apply different filters and add more rows to your export dataframe by clicking the button again.
Figure 5: Add the selected rows to the export dataframe
After you have made your selection, add the rows to the export dataframe by clicking on the corresponding button.
Changed your mind? Click on the reset button to clear your dataframe.
Satisfied with your selection? Great! Choose a name for the files that are about to be created. You now have three export options. You can export your selection as a .csv file for use in other applications such as Excel. You can also export it as a .fasta file, which contains the part name and the sequence. The third option is a .gb file (GenBank). This format contains additional annotations and is especially useful for further work in software such as Geneious Prime or SnapGene.
Figure 6: Export
After you have chosen a name, you can export your selection in one of several formats.
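For readers who want to reproduce the .fasta export outside the notebook, the following sketch writes the part name and sequence of each selected row as a FASTA record. It is only an illustration under assumptions: the function write_fasta, the dictionary keys part_name and sequence, and the example parts are hypothetical and not taken from RISE.ipynb.

```python
import io


def write_fasta(rows, target, width=60, name_key="part_name", seq_key="sequence"):
    """Write one FASTA record per row that has a sequence; return the number written."""
    written = 0
    for row in rows:
        # Remove spaces and line breaks that may be stored inside the sequence.
        seq = "".join((row.get(seq_key) or "").split())
        if not seq:
            continue  # parts without a sequence are skipped
        target.write(f">{row[name_key]}\n")
        # Wrap the sequence so that no line exceeds `width` characters.
        for start in range(0, len(seq), width):
            target.write(seq[start:start + width] + "\n")
        written += 1
    return written


# Made-up selection; in practice the rows would come from the exported .csv file.
selection = [
    {"part_name": "BBa_X0001", "sequence": "atgc" * 20},
    {"part_name": "BBa_X0002", "sequence": ""},
]
buffer = io.StringIO()
write_fasta(selection, buffer)
print(buffer.getvalue())
```

To write to disk, pass a file opened with open("my_selection.fasta", "w") instead of the in-memory buffer. The GenBank export is not sketched here because it needs annotations that a plain name and sequence pair does not carry.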

RISE is a tool for all iGEM teams.

The iGEM team of TU Delft previously set up a part query service for the iGEM Registry. Unfortunately, the service is no longer available and its documentation cannot be accessed. This is why we decided to implement a Python program, so that other teams do not need to rely on an online service, and to publish the .csv file in the iGEM GitHub repository and on this website to preserve it for future generations. With RISE we hope to make a small contribution to the success of future iGEM teams’ projects.