Poster: Imperial_College

SOAP Lab: Accessible Automation of BASIC, MoClo and BioBrick Assembly for Biologists

Presented by Team Imperial_College 2020

Olivia Gallupová¹, Emma Albertini¹, Benedict Carling¹, Gabrielle Johnston¹, Raghav Khanna¹, Raymond Miles¹, Gabriel Swallow¹, Maria Torra I Benach¹, James Bayne², Hia Ming², Scott Stacey², Georg Wachter², Professor Geoff Baldwin³,⁴

¹ iGEM Student team member
² iGEM Advisor
³ iGEM Primary PI
⁴ Department of Life Sciences and IC-Centre for Synthetic Biology, Imperial College London, London, UK

Abstract
DNA assembly is a vital first step in most synthetic biology projects. As genetic design spaces become larger with more complex genetic circuits and greater diversity of parts, the ability to construct sizable genetic libraries with high accuracy in a cost and time-efficient manner is imperative. Automating this process using affordable, open-source liquid handlers presents an attractive solution for small-scale labs, but requires programming robotic workflows, a technical challenge for many wet-lab scientists. To make automated workflows a more practical reality, we developed SOAP Lab, a web UI that infers genetic circuit designs from SBOL files and customizes an assembly plan based on the user’s specifications. SOAP Lab then generates ready-to-run scripts for the liquid handlers, along with instructions for set-up and meta-information for traceability and debugging. Because it uses the SBOL standard, SOAP Lab can be integrated into software pipelines that use the same standard, making automation of DNA assembly more accessible. As a proof of concept, we validated the software pipeline by expressing fluorescent reporters in E. coli.

Introduction

SOAP Lab is a software tool designed to make your life in the lab easier. Our eponymous acronym SOAP stands for: Synthetic Biology Open Language (SBOL) to Opentrons Automated Pipeline. Through the use of the SBOL standard, we aim to deliver the power of Opentrons laboratory automation to new users and improve the experience of existing users. Our product is a comprehensive software pipeline that seamlessly takes users from designing their constructs in silico to assembling them in the lab, all through our intuitive website.

We offer a suite of features to support your experiments, such as automated planning of high-throughput combinatorial libraries and generation of Opentrons 2 (OT2) Python scripts tailored to your needs, along with extensive meta-information about your experimental set-up.

Inspiration

When our team first assembled, we were interested in the optimization of L-Tryptophan (Trp) production in yeast. Trp has an annual demand of over 50,000 tons and is a precursor to many pharmaceutical products, a topic we were originally investigating [1, 2]. Optimization of Trp production in E. coli was achieved as early as 2000, but little progress has been made in optimising Trp production in other organisms, including yeast [2, 3]. Our interest in performing this optimization in a yeast chassis specifically stems from our interest in using CRISPR as a regulatory tool. In interviews, experts in metabolic engineering advised us that CRISPR works well in yeast, and that many of the secondary metabolites of pharmacological interest requiring Trp as a precursor are difficult to produce in E. coli at high yields.

We realised through our research that, given the number of test variables and genetic pathways we wanted to modify, the massive size of the combinatorial library we would have to assemble would make our wet lab work a nightmare to perform by hand. We sought a way to not only reduce hands-on time in DNA assembly for complex, combinatorial designs, but also to keep track of the whole experimental process - the inspiration for our software tool. Building a software tool was also especially attractive given the current challenges, with COVID-19 restricting our lab access and making planning, simplification of workflows, and the reduction of hands-on time all the more important.

References:
[1] Liu, L., Bilal, M., Luo, H., Zhao, Y., and Iqbal, H., 2019. Metabolic Engineering and Fermentation Process Strategies for L-Tryptophan Production by Escherichia coli. Processes, 7(4).
[2] Lim, Y. H., Foo, H. L., Loh, T. C., Mohamad, R., and Abdul Rahim, R. (2020). Rapid Evaluation and Optimization of Medium Components Governing Tryptophan Production by Pediococcus acidilactici TP-6 Isolated from Malaysian Food via Statistical Approaches. Molecules, 25(4).
[3] Marín‐Sanguino, A. and Torres, N.V. (2000), Optimization of Tryptophan Production in Bacteria. Design of a Strategy for Genetic Manipulation of the Tryptophan Operon for Tryptophan Flux Maximization. Biotechnol Progress, 16.

Problem

DNA assembly is a key bottleneck step in any synthetic biology experiment. To compound the problem, there is a limit to human accuracy, and human mistakes are often difficult to trace. The challenge of DNA assembly is further magnified by the growing need to assemble large combinatorial libraries of constructs for complex synthetic biology experiments, such as those in metabolic engineering. This translates to a pressing need to reduce hands-on time and to increase the accuracy and scalability of DNA assembly. We began developing a software tool that can ameliorate these issues by implementing combinatorial design and automated assembly, and realised that many academic labs have yet to adopt software and automation into their experimental workflows.

To encourage the acceptance of our tool, we would have to address the current problems with automation in synthetic biology:

Liquid handlers, used for DNA assembly, have high initial capital costs [1].
Many automated protocols lack the flexibility that manually specifying protocols can provide [1].
There is little integration throughout the experimental process - both with machines and with software, and many open source tools available are standalone and incompatible with each other [2, 3].
Researchers lack confidence in the reliability of laboratory automation, with one study indicating that almost 90% of researchers “cannot trust the quality of robot’s work”, despite not knowing its actual success rate [4].
The learning curve and lack of accessibility of lab automation to wet-lab researchers pose a significant barrier [1]. Many researchers do not have the programming experience that synthetic biology automation tools require from users.
There is a lack of widespread standardisation in the sharing and encoding of parts and protocols, which hinders the scalability of the design build test learn (DBTL) cycle [1, 2, 5, 6].

References:
[1] Jessop-Fabre, M. & Sonnenschein, N. (2019) Improving Reproducibility in Synthetic Biology. Frontiers in Bioengineering and Biotechnology. 7 18.
[2] Appleton, E., Madsen, C., Roehner, N. & Densmore, D. (2017) Design Automation in Synthetic Biology. Cold Spring Harbor Perspectives in Biology. 9 (4).
[3] Chao, R., Mishra, S., Si, T. & Zhao, H. (2017) Engineering biological systems using automated biofoundries. Metabolic Engineering. 42 98-108.
[4] Nagata, M. (2017) User-centered automation process in synthetic biology research. Massachusetts Institute of Technology.
[5] Decoene, T., De Paepe, B., Maertens, J., Coussement, P., Peters, G., De Maeseneire, S. L. & De Mey, M. (2018) Standardization in synthetic biology: an engineering discipline coming of age. Critical Reviews in Biotechnology. 38 (5), 647-656.
[6] Carbonell, P., Radivojevic, T. & García Martín, H. (2019) Opportunities at the Intersection of Synthetic Biology, Machine Learning, and Automation. ACS Synthetic Biology. 8 (7), 1474-1477.

Solution

We developed SOAP Lab, a comprehensive software application which streamlines the design and build aspects of the Design-Build-Test-Learn cycle using computer-aided design tools and automation. SOAP Lab is open-source software that infers genetic circuit designs from SBOL data files and produces custom automated DNA assembly protocols based on a user’s specifications. It is accessible through a user-friendly interface hosted on the web.

This gives meaning to our name:
SBOL: Synthetic Biology Open Language, an open-source data standard for describing in silico biological designs
Opentrons: An open-source liquid handler chosen for its affordability and modularity.
Automated: We aim to bring synthetic biology automation to new users, and improve automation for more experienced users.
Pipeline: Our product is more than just a single tool, it’s a comprehensive workflow that seamlessly takes users from designing their constructs to producing them in the lab, all through an easy-to-use website.

Software Pipeline

Overview

Figure 1: Overview of software pipeline

SOAP Lab is an open-source software application that parses Synthetic Biology Open Language (SBOL) files for the purposes of automating the generation of Opentrons 2 (OT2) Python scripts for automated DNA assembly. The software comes with an interactive user interface hosted on web servers (soaplab.io) that guides and facilitates the design of multi-part genetic assembly constructs in SBOL Designer as well as the customization of automated assembly runs on the Opentrons liquid handler. The SBOL file created and user-specified customizations are sent to a backend which runs the SBOL Parser and Script Generation packages and subsequently produces downloadable OT2 Python scripts along with more detailed information about the experimental set-up as .CSV and .TXT files.

Software Architecture

Figure 2: Software architecture

The main interface with the SOAP Lab application is the web frontend (hosted at soaplab.io), which is written in the React web framework. SBOL Designer, a Java applet created by the Myers Research Group, is integrated into the frontend using Webswing. The backend was built using the Python Django framework and comprises the following components: the SBOL Parser API and the Script Generation API (including packages for generating automated assemblies for BASIC, MoClo, and BioBrick assemblies). The frontend is interfaced with the backend using GraphQL, implemented using the Graphene Python library in the Django backend.

Frontend

The process of generating OT2 Python scripts on the user end is divided into 3 main stages: creation of SBOL files, upload of SBOL files, and customization of experimental specifications.

Creation of SBOL Files

Figure 3: Creating SBOL design files in SOAP Lab

The user will be prompted to specify the DNA assembly method to be executed. The currently available options are: BASIC assembly, MoClo (Golden Gate) assembly, and BioBrick assembly.
An additional downloadable file "basic_linkers_standard.xml" is provided. This file contains the standard set of linkers for BASIC assembly in SBOL format for the user to build their designs easily.
The user is able to design their multi-part assembly construct in SBOL Designer. If there are multiple variants of a part (e.g. multiple promoters), this is easily supported by specifying a Combinatorial Derivation. An example of a valid design is shown in Figure 4.
After creating the SBOL design, the user will be prompted to save the SBOL file.

Figure 4: Example of a valid SBOL design for multi-part BASIC assembly

Uploading and Validating SBOL Files

Figure 5: Uploading and validating SBOL design files

The user will be prompted to upload their SBOL design file and is given the option to validate the file using an SBOL Validator implemented using the SBOL Validator API.

The purpose of validation is to check the integrity of the SBOL file against established SBOL data standards to prevent issues in downstream parsing.

Example SBOL files for different assembly types are also provided for reference.

Customizing Experimental Specifications

Figure 6: Customizing experimental specifications

The user is able to fill in the appropriate fields to customize the DNA assembly experiment, and subsequently submit the file for script generation.
Shortly after submitting the file, the user will be prompted to fill in the specifications (concentration, plate and well positions) for the DNA parts that will be used in the assembly.
Thereafter, the OT2 Python scripts will be made available for download.

SBOL Parser API

The SBOL Parser API is a Python package which implements the pySBOL2 and Plateo libraries for parsing SBOL files and preparing the appropriate .CSV inputs for downstream script generation.

Figure 7: Overview of CSV generation process using SBOL Parser API

An overview of the CSV generation process is as follows:

Get list of assembly constructs from the SBOL Document (enumerating Combinatorial Derivations if necessary)
(Optional) Filtering to remove assembly constructs with repeated parts
Take a random sample of assembly constructs if the size of the list of constructs is greater than the desired number of constructs to be assembled
Distribute assembly constructs and DNA parts into respective Plateo Plate objects
Create CSVs from Plateo Plate objects
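
The steps above can be illustrated with a minimal, standard-library-only sketch. It is not the SBOL Parser API itself: it does not use pySBOL2 or Plateo, the function names and CSV column headers are hypothetical, and the design is given directly as lists of part names rather than read from an SBOL Document. Filling wells column by column is also an assumption.

```python
import csv
import io
import itertools
import random
import string


def enumerate_constructs(variants_per_position):
    """Step 1: expand a combinatorial design into every concrete construct."""
    return [list(combo) for combo in itertools.product(*variants_per_position)]


def drop_repeated_parts(constructs):
    """Step 2 (optional): discard constructs that use the same part twice."""
    return [c for c in constructs if len(set(c)) == len(c)]


def sample_constructs(constructs, max_constructs, seed=None):
    """Step 3: randomly sample only if the list exceeds the requested size."""
    if len(constructs) <= max_constructs:
        return list(constructs)
    return random.Random(seed).sample(constructs, max_constructs)


def assign_wells(constructs, rows=8, columns=12):
    """Step 4: place constructs in wells, filling column by column (A1, B1, ...)."""
    wells = [f"{row}{col}"
             for col in range(1, columns + 1)
             for row in string.ascii_uppercase[:rows]]
    if len(constructs) > len(wells):
        raise ValueError("more constructs than wells on one plate")
    return dict(zip(wells, constructs))


def plate_to_csv(well_map):
    """Step 5: write one row per well, with one column per part."""
    n_parts = max((len(parts) for parts in well_map.values()), default=0)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Well"] + [f"Part {i}" for i in range(1, n_parts + 1)])
    for well, parts in well_map.items():
        writer.writerow([well] + list(parts))
    return buffer.getvalue()


# Two promoter variants and two reporter variants give four constructs.
design = [["J23101", "J23108"], ["sfGFP", "mCherry"]]
constructs = sample_constructs(
    drop_repeated_parts(enumerate_constructs(design)), max_constructs=96, seed=0)
print(plate_to_csv(assign_wells(constructs)))
```

Passing a fixed seed to the sampling step keeps the selected subset reproducible, which helps when the same run needs to be traced or repeated.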

Script Generation

Figure 8: Inputs of the script generation function

The automated script generation component of the backend comprises 3 separate packages, one for each assembly method. The entire script generation procedure for an assembly method is run by calling a single script generation function from the relevant package. The script generation function takes the constructs as a file of comma-separated values (a CSV), and one or more additional CSVs of parts, or of parts and linkers in the case of BASIC assembly. Additional input parameters include the folder to save the scripts in, and the labware, pipettes, and modules the user intends to use. Outputs of the script generation functions include the OT2 Python scripts along with the associated meta-information on the experimental run.
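
To make the idea of a single script generation function concrete, here is a heavily simplified sketch, not the SOAP Lab packages themselves. The function names, the CSV column headers ("Well", "Part", "Part 1", ...), the deck slots, the fixed transfer volume and the API level are all assumptions made for illustration; the default labware and pipette identifiers are standard Opentrons load names chosen as examples. The sketch only builds the text of an OT2 protocol, so it runs without the opentrons package installed.

```python
import csv
import io

# Template for the generated OT2 protocol (Opentrons Python Protocol API v2).
SCRIPT_TEMPLATE = '''from opentrons import protocol_api

metadata = {{"apiLevel": "2.13", "protocolName": "{name}"}}

# (volume in uL, source well, destination well)
TRANSFERS = {transfers!r}


def run(protocol: protocol_api.ProtocolContext):
    tips = protocol.load_labware("{tiprack}", 1)
    source = protocol.load_labware("{plate}", 2)
    dest = protocol.load_labware("{plate}", 3)
    pipette = protocol.load_instrument("{pipette}", "{mount}", tip_racks=[tips])
    for volume, src_well, dst_well in TRANSFERS:
        pipette.transfer(volume, source[src_well], dest[dst_well])
'''


def plan_transfers(constructs_csv, parts_csv, volume_ul=1.0):
    """Map every part of every construct to a (volume, source, destination) step."""
    part_wells = {row["Part"]: row["Well"]
                  for row in csv.DictReader(io.StringIO(parts_csv))}
    transfers = []
    for row in csv.DictReader(io.StringIO(constructs_csv)):
        for column, part in row.items():
            if column == "Well" or not part:
                continue  # skip the destination column and empty cells
            transfers.append((volume_ul, part_wells[part], row["Well"]))
    return transfers


def generate_script(constructs_csv, parts_csv, name="assembly",
                    pipette="p20_single_gen2", mount="right",
                    tiprack="opentrons_96_tiprack_20ul",
                    plate="nest_96_wellplate_100ul_pcr_full_skirt",
                    volume_ul=1.0):
    """Hypothetical single entry point: CSV text in, OT2 script text out."""
    transfers = plan_transfers(constructs_csv, parts_csv, volume_ul)
    return SCRIPT_TEMPLATE.format(name=name, transfers=transfers, tiprack=tiprack,
                                  plate=plate, pipette=pipette, mount=mount)


PARTS = "Part,Well\nJ23101,A1\nJ23108,B1\nsfGFP,C1\nmCherry,D1\n"
CONSTRUCTS = "Well,Part 1,Part 2\nA1,J23101,sfGFP\nB1,J23108,mCherry\n"
print(generate_script(CONSTRUCTS, PARTS))
```

Embedding the transfer list in the generated file keeps each script self-contained, so it can be uploaded to the robot without the original CSVs.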

Proof of concept: Design & Results

We aimed to show that our automated DNA assembly pipeline could go from in silico computer-aided design to actual physical execution by assembling GFP and RFP transcriptional units in E. coli, using BioBricks, BASIC and MoClo. Our goal was to generate single and multiple gene fluorescent assemblies.

Construct Design

BASIC

Name	Linker 1	Part 1	Linker 2	Part 2	Linker 3	Part 3
A1	LMS	BASIC_SEVA_36_CmR_p15A1	LMP	BASIC_L3S2P21_J23101_RiboJ1	L1RBS3	BASIC_mCherry_ORF1
B1	LMS	BASIC_SEVA_36_CmR_p15A1	LMP	BASIC_L3S2P21_J23101_RiboJ1	L1RBS3	BASIC_sfGFP_ORF1
C1	LMS	BASIC_SEVA_36_CmR_p15A1	LMP	BASIC_L3S2P21_J23108_RiboJ1	L1RBS3	BASIC_sfGFP_ORF1
D1	LMS	BASIC_SEVA_36_CmR_p15A1	LMP	BASIC_L3S2P21_J23108_RiboJ1	L1RBS3	BASIC_mCherry_ORF1

Figure 1: Illustration of the two BASIC constructs assembled

MoClo (GoldenGate)

Name	Backbone	Promoter	CDS	Terminator
construct1	pICH47742	Bba_J23119	pC0_009	pC0_069
construct2	pICH47732	Bba_J23102	pC0_009	pC0_062
construct3	pICH47732	Bba_J23100	pC0_009	pC0_062
construct4	pICH47732	Bba_J23114	pC0_009	pC0_062
construct5	pICH47732	Bba_J23106	pC0_009	pC0_062
construct6	pICH47742	Bba_J23100	pC0_009	pC0_069
construct7	pICH47742	Bba_J23106	pC0_009	pC0_069
construct8	pICH47742	Bba_J23114	pC0_009	pC0_069
construct9	pICH47732	Bba_J23119	pC0_009	pC0_062
construct10	pICH47742	Bba_J23102	pC0_009	pC0_069

Figure 2: Illustration of the MoClo constructs assembled

We tested a different Anderson promoter in each construct; these correspond to the BioBrick parts BBa_J23119, BBa_J23100, BBa_J23102, BBa_J23106 and BBa_J23114. Each promoter part contains an embedded ribosome binding site (RBS).

BioBrick

Table of Level 1 Assembly Constructs

Name	Upstream	Downstream	Plasmid
pro_rbs_1_Var_BBa_J23119	BBa_J23119	BBa_B0030	ColE1_AmpR
pro_rbs_1_Var_BBa_J23108	BBa_J23108	BBa_B0030	ColE1_AmpR
cds_ter_1_Var_BBa_I746916	BBa_I746916	BBa_B0015	ColE1_AmpR
cds_ter_1_Var_BBa_E1010	BBa_E1010	BBa_B0015	ColE1_AmpR

Table of Level 2 Assembly Constructs

Name	Upstream	Downstream	Plasmid
assembly_1_Var_BBa_J23108_Var_BBa_I746916	pro_rbs_1_Var_BBa_J23108	cds_ter_1_Var_BBa_I746916	ColE1_CamR
assembly_1_Var_BBa_J23119_Var_BBa_I746916	pro_rbs_1_Var_BBa_J23119	cds_ter_1_Var_BBa_I746916	ColE1_CamR
assembly_1_Var_BBa_J23108_Var_BBa_E1010	pro_rbs_1_Var_BBa_J23108	cds_ter_1_Var_BBa_E1010	ColE1_CamR
assembly_1_Var_BBa_J23119_Var_BBa_E1010	pro_rbs_1_Var_BBa_J23119	cds_ter_1_Var_BBa_E1010	ColE1_CamR

Figure 3: Illustration of the BioBrick constructs built using 3A Assembly.

The receiving backbone needs a different antibiotic resistance than the assembled parts.
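
This constraint can be checked programmatically before a run. The sketch below is an illustration rather than part of SOAP Lab: it uses the Level 1 and Level 2 tables above as data and assumes that the resistance marker is the last underscore-separated token of the plasmid name (for example "AmpR" in "ColE1_AmpR"). The function names are hypothetical.

```python
def resistance_marker(plasmid_name):
    """Return the trailing token of the name, e.g. 'AmpR' from 'ColE1_AmpR'.

    Assumes the naming convention used in the tables above.
    """
    return plasmid_name.rsplit("_", 1)[-1]


# Level 1 constructs and the plasmid each one is carried on.
LEVEL_1 = {
    "pro_rbs_1_Var_BBa_J23119": "ColE1_AmpR",
    "pro_rbs_1_Var_BBa_J23108": "ColE1_AmpR",
    "cds_ter_1_Var_BBa_I746916": "ColE1_AmpR",
    "cds_ter_1_Var_BBa_E1010": "ColE1_AmpR",
}

# Level 2 constructs: (name, upstream, downstream, receiving backbone).
LEVEL_2 = [
    ("assembly_1_Var_BBa_J23108_Var_BBa_I746916",
     "pro_rbs_1_Var_BBa_J23108", "cds_ter_1_Var_BBa_I746916", "ColE1_CamR"),
    ("assembly_1_Var_BBa_J23119_Var_BBa_I746916",
     "pro_rbs_1_Var_BBa_J23119", "cds_ter_1_Var_BBa_I746916", "ColE1_CamR"),
    ("assembly_1_Var_BBa_J23108_Var_BBa_E1010",
     "pro_rbs_1_Var_BBa_J23108", "cds_ter_1_Var_BBa_E1010", "ColE1_CamR"),
    ("assembly_1_Var_BBa_J23119_Var_BBa_E1010",
     "pro_rbs_1_Var_BBa_J23119", "cds_ter_1_Var_BBa_E1010", "ColE1_CamR"),
]


def find_resistance_clashes(level_1, level_2):
    """List (construct, part) pairs whose backbone shares the part's marker."""
    clashes = []
    for name, upstream, downstream, backbone in level_2:
        for part in (upstream, downstream):
            if resistance_marker(level_1[part]) == resistance_marker(backbone):
                clashes.append((name, part))
    return clashes


print(find_resistance_clashes(LEVEL_1, LEVEL_2))
```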

Results

Figure 4: Successful assembly and transformation of a BASIC assembly construct B1

Following automated BASIC assembly, a single colony expressed green fluorescence on the plate containing the BASIC_SEVA_36_J23101_GFP construct, suggesting correct assembly. The automated MoClo and BioBrick assemblies did not show any fluorescence. Unfortunately, due to insufficient quantities of the YFP part, it was not possible to repeat the MoClo assembly. The lack of fluorescence for the BioBrick assemblies is likely due to unsuccessful round 2 assembly transformations. These issues also appeared during manual construct assembly.

Software Application: Tryptophan Optimisation and BASIC SEVA Library

Once we had demonstrated that SOAP Lab worked as intended in our proof-of-concept work, we next wanted to demonstrate its utility for real synthetic biology projects. We focused on two obvious applications of combinatorial design – metabolic engineering and part library construction.

Tryptophan Optimisation

We decided to optimise tryptophan biosynthesis in yeast because tryptophan is an important precursor for many high-value products. Using parsimonious flux balance analysis, we selected a number of targets for overexpression or down-regulation that were predicted to increase tryptophan yield. Using these targets, a genetic design was developed combining overexpressions and CRISPR-mediated knockdowns at varying strengths. This created a large design space of 270 possible designs that could only be tackled efficiently using automation for both Design and Build, a perfect demonstration of the utility of SOAP Lab. Unfortunately, due to COVID-19-related limitations, we have been unable to carry out the build at the current time. In addition to demonstrating the utility of SOAP Lab, we also developed a range of different BASIC yeast parts, expanding BASIC to work in yeast, which we hope will be used in many future projects.

Figure 1: Design of Trp optimisation plasmid

BASIC SEVA Backbone Library

Figure 2: Structure of BASIC SEVA plasmid

As a further demonstration of the utility of SOAP Lab, we aimed to extend the foundations of the BASIC standard by constructing a large library of Standardised European Vector Architecture (SEVA) backbones that are compatible with BASIC. Here, 6 different antibiotic resistance cassettes and 8 different origins of replication combine to make a design space of 48. Whilst this is a manageable design space, carrying this out in an automated fashion using SOAP Lab could save many person-hours in the lab.
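
The arithmetic behind these design spaces is simple to reproduce. The short sketch below multiplies the number of variants at each position and estimates how many plates a library would occupy; the helper names are hypothetical, and the 96-well plate with one construct per well is an assumption rather than a stated detail of our runs.

```python
import math


def design_space_size(variants_per_position):
    """Number of distinct constructs: the product of the variant counts."""
    return math.prod(variants_per_position)


def plates_needed(n_constructs, wells_per_plate=96):
    """Plates required at one construct per well (assumed layout)."""
    return math.ceil(n_constructs / wells_per_plate)


# 6 antibiotic resistance cassettes x 8 origins of replication
seva_library = design_space_size([6, 8])
print(seva_library, plates_needed(seva_library))

# The 270-design tryptophan optimisation space, taken as given
print(plates_needed(270))
```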

Human practices

Human practices is an essential element in any project. For us, human practices is the infrastructure upon which our project develops. It played an integral role from the outset, informing the project design at every stage. It acts as a means to ensure our project remains relevant, valuable, and trustworthy, taking into consideration the interests of all, not just our own.

In order to achieve this, we developed a set of core values to abide by:

Communication

We sought to set a precedent for a strong line of communication between our team and end-users throughout the design process, and for all aspects of our project.

Accessibility

We sought to ensure the accessibility of our project, both so that our end-users can use our product regardless of their experience, and so that future teams can build on our progress.

Validation

We sought to provide proof for the validity and therefore trustworthiness of all aspects of our project, seeking input and feedback from external sources to remove bias.

Collaborations

We beta tested our software tool with three teams, including our partner team Hamburg, at various points in its development. Given that part of our target market is iGEM teams, their feedback as potential end-users was invaluable in the tool's development.
We also mentored five teams on their mathematical model, including Hamburg. We provided a general introduction in our video calls, as well as bespoke advice on how to model their systems and, perhaps more importantly, how their model can have an impact on their project.

Introduction to Mathematical Modelling Package:

Mathematical modelling is challenging, and educational resources for it are limited. It can be powerful and impactful, yet it is highly inaccessible to many iGEM teams. To tackle this, we sought to develop a thorough educational package introducing mathematical modelling in synthetic biology, consisting of a handbook, video tutorials and coding challenges.

The content was determined organically based on our meetings with our collaborators. They then trialed all aspects, providing feedback on its approachability. Their input was invaluable in the development of the package.

We have included the original LaTeX files in our wiki so future iGEM teams can expand upon our work. We hope it will continue to develop until it is a truly comprehensive guide.

Webinars and Science Slam

We also wanted to involve the wider scientific community and the general public in the conversation. The ethical implications of automation are an important topic of discussion. We therefore wanted to encourage discussion around the topic by presenting our talk titled "The Ethics of Automation" at Heidelberg's Science Slam. We also sought to involve experts in the field, so we hosted a webinar under the same title. We held a second webinar focused on how tech companies and academics communicate with their end-users. This essential element of many projects is often lacking, and we wanted to understand why, as well as how it can be done effectively.

Partnership

Hamburg

Hamburg tested our SBOL design features, gave us their opinions on the use of SBOL as a standard, and helped us make the tool more intuitive.
The team was able to test our software on their Opentrons, providing us with invaluable feedback and helping us identify sources of contamination.

Not only did this present immense value to our software tool itself, but proving that another iGEM team could implement our pipeline in an afternoon made an incredible case for the trustworthiness, usability, and user-friendliness of our software. After our own validations, this was the first true step to bringing our software into the real world and open-source synthetic biology community.

Korea_HS

Korea_HS provided us with invaluable continuous input to our Introduction to Modelling package, which helped us develop it from a brief introduction to a fully-fledged package. They, in turn, benefitted from our extension of the modelling tutorials to applied, bespoke simulations that helped them untangle crucial elements of their design. We therefore feel this collaboration satisfies the partnership requirements as well.

Conclusion

Our frustrations about the lack of software tools to aid the synthetic biology workflow galvanised our resolve to make things better for our future selves. By integrating feedback from end users and from the creators of our underlying technology, together with our own needs for scalable synthetic biology experiments, we developed our software to automate DNA assembly in academic labs. We addressed the problem of scaling up synthetic biology by building our pipeline around an easily comprehensible and shareable synthetic biology data standard, so that changes to the design in silico could translate to immediate changes in the workflow. To bridge the disconnect between wet lab and software, we hosted workshops and talks on developing pipelines for synthetic biology and automation, and guided and educated novice bioengineers in making the most of software tools. With easy access to our software through an intuitive web interface, we open our software up to new users in synthetic biology while offering powerful, open-source and well-documented tools for experts to seamlessly scale up their designs and builds.

Achievements

Our project melded the dry lab and wet lab, bringing the capabilities of automation, combinatorial design, and standardisation to an accessible platform. Our outreach and continuous adaptation of feedback from other teams, scientists, and the public ensures that our work can have a lasting, beneficial impact on the synthetic biology community.

Software Achievements

Created a comprehensive software pipeline, including part design, SBOL parsing, automated assembly, and automated transformation
Provided an accessible tool with combinatorial design
Validated our product through user testing
Identified issues with and improved supporting software, such as pySBOL2

Wet-lab Achievements

Performed dry runs which were critical in debugging and optimising our script generation
Assembled a BASIC construct successfully using our automated scripts
Designed and modelled tryptophan optimisation in yeast
Added 23 new parts to the iGEM registry

Collaboration and Partnership

Partnered with Hamburg, assisting Hamburg in designing their model, while receiving crucial feedback on our software and how it translated to the wet lab
Partnered with Korea_HS, mentoring them on mathematical modelling and receiving feedback on our Introduction to Modelling package
Collaborated with five other teams, mentoring three in mathematical modelling while exchanging feedback with two others

Human Practices

Held two webinars to further investigate the ethics of automation and wet-lab and software collaboration
Participated in the Heidelberg Science Slam, educating the public on the ethics of automation and synthetic biology
Integrated feedback throughout our project, both from other teams and experts in the field

Education

Created a comprehensive mathematical modelling package, including a handbook, videos, and coding tutorials

Future work

In future, we hope to integrate our project into a larger pipeline such as Galaxy SynBioCAD (https://galaxyproject.org/use/synbiocad/), which incorporates the entire design-build-test-learn cycle, and we are currently in discussions with its developers. This would allow our software to reach many more users and greatly increase its usefulness. With this possibility in mind, we aim to continue to improve and expand our tool based on user feedback. We would like to increase our users’ confidence by adding real-time error tracking and prediction to our Opentrons protocols. Furthermore, we are interested in integrating Plateo, a tool by the Edinburgh Genome Foundry that allows tracking of plates and wells (https://github.com/Edinburgh-Genome-Foundry/Plateo).