JUNE
Week 1 (15th of June - 21st of June)
Tuesday
- June 16, 2020
- Discussed commonly asked questions about iGEM and organized future meetings.
Wednesday
- June 17, 2020
- Introduced Dr. Perli, assigned groups and roles, reviewed deadlines and medal criteria, finalized the project idea, and reviewed basic CRISPR knowledge
Week 2 (22nd of June - 28th of June)
Wednesday
- June 24, 2020
- Reviewed the goals and purpose of the project with Dr. Perli, and were assigned to review the original algorithm’s paper, “Compact and highly active next-generation libraries for CRISPR-mediated gene repression and activation”
Week 3 (29th of June - 5th of July)
Tuesday
- June 30, 2020
- Summarized and reviewed the previous CRISPR research paper that we will be building upon. Reviewed which variables the original algorithm takes in.
Wednesday
- July 1, 2020
- Assigned the biology-focused team to research and analyze genes.
JULY
Week 4 (6th of July - 12th of July)
Wednesday
- July 8, 2020
- Reviewed the research on genes and their secondary structures and discussed logo designs
Week 5 (13th of July - 19th of July)
Tuesday
- July 14, 2020
- Discussed updates and progress on research into the specified genes, and assigned team members specific tasks such as logo design and compsci- and bio-specific work.
Week 6 (20th of July - 26th of July)
Tuesday
- July 21, 2020
- Reviewed possible logo designs, discussed iGEM questions with our iGEM mentor Craig M. Trester, introduced Benchling, and discussed the code.
Wednesday
- July 22, 2020
- Explored Benchling and discussed ViennaRNA
Week 7 (27th of July - 2nd of August)
Wednesday
- July 29, 2020
- Discussed and reviewed how our project involves PCR with Dr. Perli
AUGUST
Week 8 (3rd of August - 9th of August)
Wednesday
- August 5, 2020
- Discussed wiki building and fundraising. Dr. Perli presented slides on qRT-PCR, its importance in this project, and how we can use it to quantify the success of CRISPRi transcriptional repression
Week 9 (10th of August - 16th of August)
Wednesday
- August 12, 2020
- Mentor Craig Trester and Dr. Perli answered questions about judging rubric/form, sponsorship, global meetups, and discussed the tasks the project involves in greater detail.
- Held a separate meeting with the compsci group to look over the programs and data files that need to be installed for the program to work, and reviewed concepts and ideas for machine learning.
Saturday
- August 15, 2020
- Worked on compiling the program and obtained compatible files. Dr. Perli explained how to move forward with the code. We had difficulties with the GitHub repository because it did not contain everything we needed.
Week 10 (17th of August - 23rd of August)
- No update
Week 11 (24th of August - 30th of August)
- No update
SEPTEMBER
Week 12 (31st of August - 6th of September)
Monday
- Started the project using, as a base, the sgRNA machine learning Python scripts used for the Weissman lab’s next-generation CRISPRi and CRISPRa library designs. These pre-developed scripts are hosted on GitHub (Ref: https://github.com/mhorlbeck/CRISPRiaDesign)
- Spent time reviewing and debugging Weissman code
Tuesday
- Continued reviewing the code base and flow
- Started setting up the environment in AWS
Wednesday
- Continued understanding the Weissman code base
- Set up GitHub
Thursday
- Started setting up Ubuntu environment on AWS
- Started setting up Python environments for two versions: v3.6 and v3.8
Friday
- Completed setting up Ubuntu and Python
- Set up a secured Jupyter Notebook server and exposed the whole setup over DNS
- Ran the Weissman code on Python v3.6; it did not work
Saturday
- Environment setup is complete and is hosted successfully on AWS. This contains code and all the data files required
- Started porting Weissman code base from Python v3.6 to v3.8
Sunday
- Continued porting the code to Python v3.8
- Started fixing the issues to make the code functional on v3.8
Week 13 (7th of September - 13th of September)
Monday
- Completed porting the code to v3.8
- Continued debugging issues to make the code functional
Tuesday - Sunday
- Fixed the screen-processing experiment module so that it generates scores from the experiment data
- Started working on the algorithm to generate Bowtie index files for the human genome based on hg19
- Spent the rest of the time debugging the Weissman code to make it functional on Python v3.8
- Discussed progress and improvements on the Weissman code with Siva. Planned future implementations of the code, outlined the bio team's tasks, and reviewed how the code interacts with the hg38 TSS database. Rearranged the schedule and meetings.
Week 14 (14th of September - 20th of September)
Monday
- Finally made the Weissman code functional on v3.8. All the issues were resolved and the code ran successfully
- Also, genome data from the lab was provided by Dr. Perli
Tuesday - Friday
- Transformed the qRT_pcr_data scores obtained by Dr. Perli from Excel and normalized the data
- Started experimenting with data normalization and different features.
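The notebook does not record which normalization was used for the lab scores. As a minimal sketch, assuming a simple min-max rescaling to the range 0 to 1, the step could look like the following. The function name and the sample values are hypothetical and are not lab data.

```python
def min_max_normalize(values):
    """Rescale numbers linearly so the smallest maps to 0.0 and the largest to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical readings for four guides (illustrative only, not real measurements).
raw_scores = [0.2, 0.5, 0.8, 1.4]
normalized = min_max_normalize(raw_scores)
print(normalized)
```

Other choices, such as z-scores or normalizing each gene separately, would fit the same interface; which one suits the qRT-PCR data depends on how the scores were measured.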
Saturday - Sunday
- Started using the normalized experimental data and applying machine learning to tune the model
Week 15 (21st of September - 27th of September)
Monday - Tuesday
- The original model, previously created and trained using phenotype-based analysis and scores, did not give good results when the guides suggested by the library were used in the lab.
- Discussed wiki goals, explored the Weissman code and brainstormed names for our project.
- Reviewed the iGEM CRISPR AWS Installation Instructions to make it easier for the compsci group to do the installation.
Friday - Sunday
- Continued training and fine-tuning the original model using good-quality data generated and measured by Dr. Perli in his lab
- Tested guide scores and used the scores to predict guides for genes with Dr. Perli
Week 16 (28th of September - 4th of October)
Monday - Thursday
- Loaded scripts and empirical data to train the model and predict sgRNA activity
- Drew box and scatter plots to analyze the correlation between the features and the scores evaluated in the lab
- Tuned the features, added the square of the guide's distance from the primary/secondary TSS as an additional feature, and generated TSS annotations using the FANTOM dataset
- Calculated the parameters for empirical sgRNAs
Friday - Sunday
- Continued working on training and tuning machine learning models to predict sgRNA activity
- Other features used in training the model: length of guide, distance from TSS, square of distance from TSS, homopolymers, chromatin accessibility (DNase, FAIRE, MNase), nucleotide dimers at each position, secondary structures, and strand
- Trained and created different models with different combinations of the above features
- Started evaluating the results of the above models and continued tuning them
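Some of the sequence-derived features listed above can be sketched in a few lines. This is an illustration only, not the project's code: the function names, the coordinates, and the example sequence are hypothetical, and chromatin accessibility and secondary structure are left out because they need external data and tools.

```python
from itertools import groupby


def longest_homopolymer(seq):
    """Length of the longest run of a single repeated nucleotide."""
    return max((len(list(run)) for _, run in groupby(seq)), default=0)


def guide_features(seq, guide_pos, tss_pos, strand):
    """Build a flat feature dict for one guide.

    seq is the guide sequence, guide_pos and tss_pos are genomic
    coordinates, and strand is '+' or '-'. All names are illustrative.
    """
    distance = guide_pos - tss_pos
    features = {
        "length": len(seq),
        "distance": distance,
        "distance_sq": distance ** 2,
        "longest_homopolymer": longest_homopolymer(seq),
        "strand_plus": 1 if strand == "+" else 0,
    }
    # One indicator per (position, dinucleotide) pair present in the guide.
    for i in range(len(seq) - 1):
        features[f"dimer_{i}_{seq[i:i + 2]}"] = 1
    return features


# Hypothetical short sequence and coordinates, used only to show the output shape.
example = guide_features("GAAAACGT", guide_pos=1050, tss_pos=1000, strand="+")
print(example)
```

Keeping each feature as a named key makes it easy to train models on different feature combinations by selecting subsets of keys.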
OCTOBER
Week 17 (5th of October - 11th of October)
Monday - Wednesday
- Continued training the models with the new changes and evaluated them again
- Linearized distances using SVR; other parameters are used as binned values
- Used ElasticNetCV as the regressor
- Trained and tested multiple random sets with the same set of features and observed AUC-ROC and R2 values to determine the best model
- Constructed the sgRNA libraries/models and picked the top sgRNAs for a library, given predicted activity scores and off-target filtering
- Finally, selected the model with the best statistics as the final model
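The two statistics used above to compare models can be computed by hand as follows. This is a sketch in plain Python rather than the project's evaluation code. The measured and predicted values are hypothetical, and the 0.5 cutoff that turns a measured score into an "active" label for AUC-ROC is an assumption.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for lab, s in zip(labels, scores) if lab]
    neg = [s for lab, s in zip(labels, scores) if not lab]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical measured and predicted activity scores for four guides.
measured = [0.9, 0.7, 0.4, 0.1]
predicted = [0.8, 0.6, 0.5, 0.2]
# Assumed cutoff: a guide counts as active when its measured score is at least 0.5.
active = [m >= 0.5 for m in measured]

r2 = r_squared(measured, predicted)
auc = roc_auc(active, predicted)
print(r2, auc)
```

Repeating this over several random train/test splits and comparing the resulting values is one way to pick between candidate models. Both functions assume the inputs vary: `r_squared` needs at least two distinct measured values and `roc_auc` needs at least one guide in each class.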
Thursday - Sunday
- Worked on drawing multiple graphs and tuning the model
- Work is also in progress to compare scores predicted by our model vs. the Weissman library vs. the lab
- Changes done offline
- Compsci group completed the AWS installation, worked through the Jupyter guide, and downloaded PuTTY
Week 18 (12th of October - 18th of October)
Monday - Wednesday
- All compsci students worked on setting up their private AWS installations
- Worked with the code and PuTTY
Thursday - Sunday
- Worked on drawing multiple graphs and tuning the model
- Work is also in progress to compare scores predicted by our model vs. the Weissman library vs. the lab
- Created the Git repo and put all activity in it
- Updated the README.md file, including the Python version it lists, and .gitignore; created an in-depth list of the tasks that need to be finished and assigned
- Created iGEM_CRISPRi_Library_Design.ipynb, merged branch ‘dev’ on GitHub, updated sgRNA_learning.py, normalized lab scores, made gene exclusion configurable, added boxplots for normalized TSS data, and validated the model on a different gene
- Updated the model to linearize the relationship between the distance and the target score; edited GeneFoldList
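Making gene exclusion configurable can be pictured as holding out every guide for a chosen gene so that the model is validated on a gene it never saw in training. The sketch below assumes that reading. The record layout, function name, and gene names are hypothetical placeholders, not taken from the notebook or the repository.

```python
def split_by_gene(records, excluded_genes):
    """Split guide records into a training set and a held-out set by gene name."""
    excluded = set(excluded_genes)
    train = [r for r in records if r["gene"] not in excluded]
    held_out = [r for r in records if r["gene"] in excluded]
    return train, held_out


# Hypothetical records; gene names and scores are placeholders, not lab data.
records = [
    {"gene": "GENE_A", "guide": "g1", "score": 0.8},
    {"gene": "GENE_A", "guide": "g2", "score": 0.3},
    {"gene": "GENE_B", "guide": "g3", "score": 0.6},
    {"gene": "GENE_C", "guide": "g4", "score": 0.5},
]

# The configurable part: change this list to hold out a different gene.
EXCLUDED_GENES = ["GENE_B"]
train, held_out = split_by_gene(records, EXCLUDED_GENES)
print(len(train), len(held_out))
```

Splitting by gene rather than by individual guide keeps guides for the same gene from appearing on both sides of the split.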
Week 19 (19th of October - 25th of October)
Wednesday - Friday
- Enhanced the model with additional parameters, selected the best training data set for the model, added scatterplots, created the Bowtie output directory, and worked on printing the features identified by the regressor along with their coefficients
- Changed the excluded gene to EIF4G1, displayed comparison data in terms of ranks, changed the target from log(1/scores) to 1-scores, and compared Weissman and iGEM scores using mean squared error, Poisson, and gamma
- Updated the code to support off-target stringency
- Added support for finding guides for a given gene: a new cell takes a gene name as a variable, finds the guides, and scores them. Cleaned up the code.
- Debugged and fixed off-target stringency, and removed Poisson and gamma because the target scores can be negative
- Created a gene selector that finds guides for a given gene, scores them using the model, and calculates off-target stringency. Created a score comparison study of Weissman lab scores and iGEM algorithm scores against Dr. Perli's measured scores. Updated iGEM_CRISPRi_Library_Design.ipynb and deleted unnecessary files
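The gene selector described above can be sketched as a filter followed by a sort. This is a simplified illustration, not the notebook's implementation: it assumes each candidate guide already carries a predicted score and an off-target count, and the field names, gene names, and values are hypothetical.

```python
def select_guides(candidates, gene, max_off_targets=0, top_n=3):
    """Return the top_n highest-scoring guides for a gene that pass the off-target cutoff."""
    passing = [
        c for c in candidates
        if c["gene"] == gene and c["off_targets"] <= max_off_targets
    ]
    return sorted(passing, key=lambda c: c["score"], reverse=True)[:top_n]


# Hypothetical candidates; scores and off-target counts are made up for illustration.
candidates = [
    {"gene": "GENE_A", "guide": "g1", "score": 0.91, "off_targets": 0},
    {"gene": "GENE_A", "guide": "g2", "score": 0.95, "off_targets": 2},
    {"gene": "GENE_A", "guide": "g3", "score": 0.40, "off_targets": 0},
    {"gene": "GENE_B", "guide": "g4", "score": 0.99, "off_targets": 0},
]

picked = select_guides(candidates, "GENE_A", max_off_targets=0, top_n=2)
print([c["guide"] for c in picked])
```

Raising `max_off_targets` loosens the stringency: with a cutoff of 2, the higher-scoring guide g2 would be admitted and ranked first.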
Week 20 (25th of October - 26th of October)
Monday
- We worked on wiki design with guidance from Lauren Turetsky
- Discussed and worked on the final touches/updates