Software
Abstract
We developed a software tool in collaboration with the “TU Darmstadt” iGEM Team that enables the user to run a molecular dynamics simulation of biofilm growth. The program is implemented in Python and documented in detail on our Wiki and in our GitHub repository. The software tool provides not only the numerical model but also various functions to visualize the results and access the data. The software uses multiprocessing to make use of all CPU cores of the local machine. All parameters and formulas used are documented. In our example notebook, we explain in detail how to adapt the model to your own purposes. We implemented a dedicated class that lets the user switch easily between simulation constants and even between different biofilm compositions. Furthermore, we followed a comprehensive programming approach to encourage future iGEM teams to contribute to the project.
Why did we do this?
The iGEM project of our team is the development of an “InToSens” (Inflammatory Toxin Sensor) made of biological components, which helps to diagnose the adherence of toxic biofilms to implant surfaces. By detecting the attachment of a biofilm at an early stage, we aim to maximize the chance of treatment success. Therefore, our team is greatly interested in the mechanisms and parameters that cause and influence the formation of biofilms.
Because of the current pandemic, we are not able to research the biofilm in a wet lab. Hence, we decided to build a computational model to research the mechanisms of biofilm formation on implant surfaces.
The model takes into account a diverse set of biological and physical parameters and enables the user to quantitatively investigate the influence of different parameters on early biofilm formation.
We teamed up with the iGEM Team TU Darmstadt to discuss our model throughout its development. The project of the Darmstadt team aims to use biofilms to detoxify wastewater. Because of the lack of laboratory time, that team is also interested in computational modelling of biofilm growth over time.
As both teams are interested in different biofilm compositions and influencing parameters, we made a great effort to keep the model as flexible as possible.
In the following, we give a short overview and demonstrate how to use the software. If you are interested in details of the implementation and methods, check out our Model wiki page or our GitHub repository.
What do we provide?
The software tool includes the implementation of three classes and utility functions to store, access, and visualize the data generated in the simulation.
The main part of the code is the implementation of the three classes. The Bacteria class is our object-oriented representation of the bacteria in the biofilm. Instances of the Bacteria class have parameters describing their movement and their interaction with other bacteria and the surrounding medium. These are complemented by a set of biologically inspired parameters. Throughout the simulation, the parameters of each bacterium are exported to a dictionary and saved in the convenient JSON format.
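The export step described above can be illustrated with a minimal sketch. It is not the actual Bacteria class: the class BacteriumSketch, the helper functions record_time_step and save_history, and the layout of the JSON file are assumptions made for illustration. Only the method name get_bacteria_dict is taken from the tool itself.

```python
import json


class BacteriumSketch:
    """Hypothetical stand-in for the Bacteria class (illustration only)."""

    def __init__(self, index, position, velocity, length):
        self.index = index
        self.position = position
        self.velocity = velocity
        self.length = length

    def get_bacteria_dict(self):
        # Return plain Python types so that json can serialize them.
        return {
            "position": list(self.position),
            "velocity": list(self.velocity),
            "length": self.length,
        }


def record_time_step(history, bacteria, time_step):
    """Append the current state of every bacterium to the history dict."""
    for bacterium in bacteria:
        entry = {"time_step": time_step, **bacterium.get_bacteria_dict()}
        history.setdefault(f"bacteria_{bacterium.index}", []).append(entry)


def save_history(history, path):
    """Write the collected history to a JSON file (assumed layout)."""
    with open(path, "w") as file_handle:
        json.dump(history, file_handle, indent=2)
```

Keeping the export as a dictionary of plain lists and numbers means the output can be read back with json.load in any later analysis script.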
The simulation takes place in the Biofilm class. An initial configuration of bacteria can be spawned and is stored in a list as a parameter of the Biofilm instance. A Biofilm class method iterates over this list, updates the parameters of each bacterium, and calculates interaction forces using a biophysical potential and a drag force. Because the computation quickly becomes expensive, we use multiprocessing to use all CPU cores of the user's local machine.
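To make the structure of such an update step concrete, here is a minimal serial sketch in two dimensions. It is not the Biofilm implementation: the soft repulsion in pair_force stands in for the biophysical potential, the linear drag term and all constants (k, r0, gamma, mass, dt) are assumptions, and the names Cell, pair_force, and step are hypothetical. The real tool additionally distributes this work across CPU cores.

```python
import math
from dataclasses import dataclass


@dataclass
class Cell:
    """Hypothetical point-like bacterium with position and velocity."""
    x: float
    y: float
    vx: float = 0.0
    vy: float = 0.0


def pair_force(a, b, k=1.0, r0=1.0):
    """Assumed soft repulsion on a due to b; zero beyond the distance r0."""
    dx, dy = a.x - b.x, a.y - b.y
    dist = math.hypot(dx, dy)
    if dist == 0.0 or dist >= r0:
        return 0.0, 0.0
    magnitude = k * (r0 - dist)
    return magnitude * dx / dist, magnitude * dy / dist


def step(cells, dt=0.1, gamma=0.5, mass=1.0):
    """Advance all cells by one time step (explicit Euler)."""
    forces = []
    for a in cells:
        # Linear drag opposes the current velocity.
        fx, fy = -gamma * a.vx, -gamma * a.vy
        for b in cells:
            if b is not a:
                px, py = pair_force(a, b)
                fx += px
                fy += py
        forces.append((fx, fy))
    # Apply all forces only after they have been computed for every cell.
    for cell, (fx, fy) in zip(cells, forces):
        cell.vx += fx / mass * dt
        cell.vy += fy / mass * dt
        cell.x += cell.vx * dt
        cell.y += cell.vy * dt
```

The nested loop scales quadratically with the number of bacteria, which is why spreading the force calculation over several processes pays off for larger biofilms.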
The Constants class is the interface through which the user specifies simulation constants such as duration, step size, and output paths. Furthermore, constants like the number of initial bacteria and the bacterial strain of which the biofilm consists can be set with methods of this class. Selecting a bacterial strain results in different properties of the bacteria in the simulated biofilm. We provide the constants of two bacterial strains, Escherichia coli and Bacillus subtilis, between which the user can choose. Each strain comes with a dictionary of constants describing its biological properties. All constants are researched and documented on our wiki as well as in the code itself. Implementing additional strains is straightforward, which makes the software tool easily extendable.
The user can run a simulation and set the parameters on their own local machine by writing a few lines of Python code, or even fewer when using the default values.
The Python code is mainly built on the NumPy (Numerical Python), SciPy (Scientific Python), and pandas packages. These packages are widely used in applications ranging from machine learning to numerical modelling and data analysis.
We applied our own tool to study the growth of biofilms consisting of Escherichia coli, while the iGEM Team TU Darmstadt used the software tool to study the growth of biofilms consisting of Bacillus subtilis. While running the software tool, both teams critically evaluated its features and accessibility. We validated the results of the simulation through extensive literature research and with the biological expertise of the TU Darmstadt iGEM Team. This resulted in many improvements (see Model and Collaboration). The history of the software development can be tracked in our GitHub repository.
Application
Prerequisites
Because the software tool is implemented in Python, there are a few prerequisites before we can dive into the code. First of all, the user needs an installation of Python on their machine. We recommend installing the open-source distribution “Anaconda”. With it, the user can make use of all the features included in our tool. The Anaconda installation includes Python itself, the integrated development environment (IDE) Spyder, and Jupyter, a web-based tool for interactive programming. Once you have installed Anaconda, our software tool has to be copied to the local machine.
Open a console and navigate to the folder into which you want to clone the repository. Then type
git clone https://github.com/igemsoftware2020/Hannover
As mentioned above, our model uses a number of open-source Python packages, which you will need to install. Open an Anaconda console and navigate to the env folder of our repository. By typing
conda env create -f iGEM-biofilm-model.yml
you will get all of the required packages. This file stores the settings of our so-called “Python environment”, which includes all the tools needed for running our software.
Activate the above environment by typing
conda activate iGEM-biofilm-model
and install the BiofilmSimulation software tool via pip:
pip install BiofilmSimulation
Now you have everything you need to start our BiofilmSimulation tool on your computer. Note that the terminal commands can differ slightly depending on the operating system installed on your local machine. If you have trouble installing the module, open an issue in our GitHub repository.
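A quick way to confirm that the installation worked is to ask Python whether it can locate the module in the active environment. The following is a minimal sketch using only the standard library. It assumes that the importable module name is identical to the pip package name BiofilmSimulation, and the helper name is_installed is hypothetical.

```python
import importlib.util


def is_installed(module_name):
    """Return True if Python can locate the module in the active environment."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Assumption: the module is imported under the same name as the pip package.
    name = "BiofilmSimulation"
    if is_installed(name):
        print(f"{name} was found in this environment.")
    else:
        print(f"{name} was not found. Is the conda environment activated?")
```

If the module is reported as missing, the most likely cause is that the commands were run outside the activated iGEM-biofilm-model environment.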
How to run it
To start directly with the simulation, type
jupyter-notebook
This will open a web browser running Jupyter. Navigate to the folder into which you just cloned the repository and jump into the example by clicking on the "example.ipynb" file. If the browser does not open automatically, open it manually and connect to
http://localhost:8888/
In the Jupyter notebook, we explain how to use our software tool in the intended way. We included explanations of the functions and briefly analyze the results. For a more detailed explanation of the outputs, check out our Model.
How to modify the source code?
Our code is easily extendable. Here we present some approaches to modifying the source code. The BiofilmSimulation module provides constants of two bacterial strains. If you want to add more strains, you can do so by researching the needed constants and storing them in a Python dictionary. The dictionary has to be in the following format:
ecoli_dic = {
    "LENGTH": np.abs(np.random.normal(loc=1, scale=0.14)),  # [um]
    "WIDTH": 0.5,  # [um]
    "MASS": 10 ** (-12),  # [kg]
    "MORTALITY_RATE": 0.01,  # [% / 100]
    "CRITICAL_LENGTH": 3.5,  # [um]
    "FREE_MEAN_SPEED": 50,  # [um / s]
    "DOUBLING_TIME": 1200,  # [s]
    "GROWTH_RATE": 1 / 1200,  # [um / s]
    "MOTION_ACTIVATION_PROBABILITY": 0.005,  # [% / 100]
    "MOTION_DEACTIVATION_PROBABILITY": 0.01  # [% / 100]
}
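Before wiring a new dictionary into the code, it can help to check that no key is missing. The following is a minimal sketch, assuming that the keys of the E. coli dictionary above form the complete required set. The helper missing_strain_keys and the dictionary my_strain_dic are hypothetical and not part of BiofilmSimulation, and all numeric values are placeholders rather than researched constants.

```python
# Keys taken from the E. coli dictionary shown above (assumed to be complete).
REQUIRED_KEYS = {
    "LENGTH",
    "WIDTH",
    "MASS",
    "MORTALITY_RATE",
    "CRITICAL_LENGTH",
    "FREE_MEAN_SPEED",
    "DOUBLING_TIME",
    "GROWTH_RATE",
    "MOTION_ACTIVATION_PROBABILITY",
    "MOTION_DEACTIVATION_PROBABILITY",
}


def missing_strain_keys(strain_dic):
    """Return the required keys that are absent from strain_dic, sorted."""
    return sorted(REQUIRED_KEYS - set(strain_dic))


# PLACEHOLDER values only: replace every number with a researched constant.
my_strain_dic = {
    "LENGTH": 1.0,                            # [um]
    "WIDTH": 0.5,                             # [um]
    "MASS": 10 ** (-12),                      # [kg]
    "MORTALITY_RATE": 0.01,                   # [% / 100]
    "CRITICAL_LENGTH": 3.0,                   # [um]
    "FREE_MEAN_SPEED": 10,                    # [um / s]
    "DOUBLING_TIME": 3600,                    # [s]
    "GROWTH_RATE": 1 / 3600,                  # [um / s]
    "MOTION_ACTIVATION_PROBABILITY": 0.005,   # [% / 100]
    "MOTION_DEACTIVATION_PROBABILITY": 0.01,  # [% / 100]
}

if __name__ == "__main__":
    print("Missing keys:", missing_strain_keys(my_strain_dic))
```

An empty list means the dictionary has the same shape as the built-in ones. It says nothing about whether the values themselves are biologically sensible.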
Open the source code in constants.py in an editor of your choice. If you installed Anaconda, you can use the Spyder IDE, which comes with the Anaconda installation.
By copying the get_ecoli_constants(key: str = None) function, replacing the dictionary, and choosing a unique function name, you have already done half of the work.
Now you have to add your function to the set_bacteria_constants method. Add an elif statement at the end of the function in the following way:
def set_bacteria_constants(self, default=True):
    """ set constants according to selected bacteria strain """
    if self.bac_type == "B.Sub." and default:
        self.bac_constants = Constants.get_bsub_constants()
    elif self.bac_type == "E.Coli." and default:
        self.bac_constants = Constants.get_ecoli_constants()
    elif self.bac_type == "Your_bacteria_strain_name" and default:
        self.bac_constants = Constants.your_unique_function_name()
And you are done.
Now you can run the simulation with your bacterial strain by specifying
constants = Constants(bac_type="Your_bacteria_strain_name")
in your custom script or in the "example_usage.py" script.
A neat feature of the code is that the Bacteria class can easily be extended with additional parameters. When you add a new parameter like self.some_protein_concentration to the __init__ function and update the return value of the get_bacteria_dict function accordingly, the new parameter is automatically stored in the output data and can be accessed with our data handling function. This makes it easy to extend the bacteria properties in the simulation.
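The two changes can be sketched on a reduced stand-in class. This is not the real Bacteria class: BacteriaSketch, its constructor signature, and the other dictionary entries are assumptions, and only the names some_protein_concentration and get_bacteria_dict come from the description above.

```python
class BacteriaSketch:
    """Hypothetical, heavily reduced stand-in for the Bacteria class."""

    def __init__(self, position, velocity, some_protein_concentration=0.0):
        self.position = position
        self.velocity = velocity
        # Step 1: store the new parameter on the instance.
        self.some_protein_concentration = some_protein_concentration

    def get_bacteria_dict(self):
        """Return the parameters that are written to the output data."""
        return {
            "position": list(self.position),
            "velocity": list(self.velocity),
            # Step 2: include the new parameter in the exported dictionary.
            "some_protein_concentration": self.some_protein_concentration,
        }
```

Giving the new parameter a default value in the constructor keeps existing code that creates bacteria without it working unchanged.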
Further extensions, e.g. adding new forces, are also possible but require a little more effort. For this, we suggest that you look at the implementation of the update_acting_forces method of the Bacteria class and the formulas in the "formulas.py" script.
Another possible contribution is the implementation of new visualization methods. Almost all of our visualization functions make use of the get_data_to_parameter function in the "data_handling.py" script. This function enables you to pass the parameter you are interested in, e.g. ‘velocity’, and get a pandas array with the respective data for each bacterium at every time step.
We are excited to see which contributions you will add to the open-source project!