Wiki Study
- We carefully investigated what makes a good iGEM website
- We collaborated with Team Heidelberg to decide on variables of interest, and the Heidelberg team collected the data through web scraping
- We made predictions about the influence of each variable and tested these hypotheses using a binomial logistic regression
- Number of internal links, number of external links and (to a lesser extent) the number of titles showed a clear correlation with iGEM Wiki success, indicating that larger, more tightly integrated projects produce more successful Wiki pages
- Surprisingly, the number of pictures and the complexity of the language (mean characters per sentence and mean words per sentence) did not show the expected correlation
Why did we do a Wiki study?
Websites have become increasingly important for companies and entities since the creation of the World Wide Web. They are one of the main platforms to present a product, communicate ideas and advertise companies and their products. Competition for the attention of Internet users is fierce, and consequently it is essential to build websites in the most effective and memorable way possible.
The team Wikis are the most important and long-lasting part of every iGEM project. They are the place where the entire project is presented and where judges should be able to access all of the information in order to award medals and prizes. Consequently, we began our Wiki making process by researching what makes a good website; during this research we became interested in effective web design and began to question what specifically makes a good iGEM Wiki.
After researching general website etiquette and the characteristics of a successful website, we realised that an iGEM Wiki is a very specific type of website with its own criteria. We therefore decided to investigate iGEM Wiki structure to see if we could find any recurring themes underlying successful communication.
We then realised that it would be very interesting to compare past years' Wikis to see whether any trends arise and how these trends match up with some of the “best practices” we found when researching website design.
Is there a set of components that when combined make a “winner” Wiki?
What is the Wiki study?
After some consideration we selected several variables of interest which we felt could influence a Wiki’s success. These were then put forward to the Heidelberg iGEM 2020 team, who collaborated with us by applying their programming expertise to data collection through web scraping. They gave us feedback on which variables could plausibly be investigated with this method, and we narrowed the selection down to the variables below. Together with Team Heidelberg, we determined the following for each team:
- Number of Titles;
- Number of Subtitles;
- Number of Sub-subtitles;
- Number of Pictures;
- Number of PDFs;
- Number of Videos;
- Mean Characters per Sentence;
- Mean Words per Sentence;
- Number of internal links;
- Number of external links;
- Team Size.
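As an illustration of how counts like these could be gathered from a page's HTML, here is a minimal sketch using only Python's standard library. It is not the scraper Team Heidelberg wrote, which is not described here. The mapping of `h1`/`h2`/`h3` tags to titles, subtitles and sub-subtitles, the rule for what counts as an internal link, and the names `WikiCounter`, `count_features` and `wiki_host` are all assumptions made for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class WikiCounter(HTMLParser):
    """Counts a few structural features of one HTML page (illustrative only)."""

    # Assumed mapping from heading tags to the variables in the list above.
    HEADINGS = {"h1": "titles", "h2": "subtitles", "h3": "sub_subtitles"}

    def __init__(self, wiki_host):
        super().__init__()
        self.wiki_host = wiki_host  # hypothetical: host treated as "internal"
        self.counts = {
            "titles": 0, "subtitles": 0, "sub_subtitles": 0,
            "pictures": 0, "pdfs": 0, "videos": 0,
            "internal_links": 0, "external_links": 0,
        }

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self.counts[self.HEADINGS[tag]] += 1
        elif tag == "img":
            self.counts["pictures"] += 1
        elif tag == "video":
            self.counts["videos"] += 1
        elif tag == "a":
            href = dict(attrs).get("href") or ""
            if not href or href.startswith("#"):
                return  # assumption: ignore empty links and in-page anchors
            parsed = urlparse(href)
            if parsed.path.lower().endswith(".pdf"):
                # assumption: a PDF link is counted as a PDF, not as a link
                self.counts["pdfs"] += 1
            elif parsed.netloc in ("", self.wiki_host):
                # relative URLs and same-host URLs are treated as internal
                self.counts["internal_links"] += 1
            else:
                self.counts["external_links"] += 1


def count_features(html_text, wiki_host):
    """Parse one page and return its feature counts as a dict."""
    counter = WikiCounter(wiki_host)
    counter.feed(html_text)
    counter.close()
    return counter.counts


# Tiny hand-written page, not real Wiki content.
sample = (
    '<h1>Project</h1><h2>Design</h2><img src="fig.png">'
    '<a href="/Team:Example/Model">Model</a>'
    '<a href="https://example.org/">Source</a>'
)
print(count_features(sample, "2020.igem.org"))
```

A real scraper would also need to download every page of a team's Wiki and sum the counts across pages. The text-based variables (mean characters and words per sentence) would need a separate sentence-splitting step.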
We also differentiated the teams that won the Wiki prize from those that didn't, counting runners-up as “winners” in order to have a slightly more balanced number of winners and losers. Success with the judges is thus our main indicator of what makes a “winner” Wiki.
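With a binary winner/non-winner outcome like this, a binomial logistic regression models the probability of being a “winner” as a function of the measured variables. The sketch below shows the general idea in plain Python, fitted by simple gradient ascent. It is a generic illustration, not the software or model specification used in our analysis. The single standardised predictor and the toy values are invented for the example and are not the study's data.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(weights, x):
    """Probability of the 'winner' class; weights[0] is the intercept."""
    return sigmoid(weights[0] + sum(w * v for w, v in zip(weights[1:], x)))


def fit_logistic(rows, labels, lr=0.1, steps=5000):
    """Fit logistic regression weights by gradient ascent on the log-likelihood."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            err = y - predict_proba(weights, x)  # observed minus predicted
            grad[0] += err
            for j, v in enumerate(x):
                grad[j + 1] += err * v
        weights = [w + lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights


# Invented toy data: one standardised predictor (think "number of internal
# links") and a 0/1 outcome (1 = winner or runner-up). Not real results.
toy_rows = [[-1.5], [-1.0], [-0.5], [0.0], [0.5], [1.0], [1.5]]
toy_labels = [0, 0, 1, 0, 1, 1, 1]

toy_weights = fit_logistic(toy_rows, toy_labels)
print("intercept and slope:", toy_weights)
print("P(winner) at +1.5:", predict_proba(toy_weights, [1.5]))
```

A positive fitted slope means that higher values of the predictor go with a higher estimated probability of being a winner. In practice a statistics package would normally be used, since it also reports standard errors and p-values for each variable.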
In addition to the variables listed above, we considered mean word length. This was not found through web scraping; instead it is a combined metric derived from mean characters per sentence and mean words per sentence. We divided the mean characters per sentence by the mean words per sentence to give a rough indication of the average length of the words used. This seemed an interesting way to gain some insight into the complexity of the language used.
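The derived metric is a single division, sketched below. The function name and the example values are hypothetical and chosen only to show the arithmetic.

```python
def mean_word_length(mean_chars_per_sentence, mean_words_per_sentence):
    """Rough average word length: characters per sentence / words per sentence."""
    if mean_words_per_sentence <= 0:
        raise ValueError("mean_words_per_sentence must be positive")
    return mean_chars_per_sentence / mean_words_per_sentence


# Hypothetical values: 120 characters and 20 words per sentence on average.
print(mean_word_length(120, 20))  # 6.0
```

If the character count includes spaces and punctuation, the ratio overestimates the true word length, which is one reason the metric is only a rough indication.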
Due to the two-phase design of our project, we are carrying out this study in two parts. This year we collaborated with Team Heidelberg to do preliminary web scraping and analysed the results to identify some trends. Next year's team can test our model on next year's Wiki data and expand on our conclusions or adjust the analysis. Furthermore, we can recruit team members with more programming expertise who can expand on the web scraping and perhaps use more advanced methods of data collection.
Hypotheses
- Number of Titles: We predict that title number may be positively correlated with being a winner Wiki as it may indicate more components to the project.
- Number of Subtitles: We predict that although there may be some positive correlation with success, it likely will not be as notable as that of the Number of Titles.
- Number of Sub-subtitles: As with the Number of Subtitles, we predict that if there is any correlation it will be even smaller.
- Number of Pictures: We were unable to differentiate between images and figures, so we think there may be only a tenuous link between the number of pictures and Wiki success.
- Number of PDFs: PDFs would usually be used to hold detailed information that is summarized on the wiki pages. This was a potentially interesting variable as we were unsure if Wiki judges prefer for everything to be on the Wiki or if summarized outlines accompanied by more detailed PDFs were preferred.
- Number of Videos: We predicted that more videos would potentially be correlated with winner Wikis because they are a more complicated way to present content and may be well received by judges. Furthermore, this is an interesting variable for Phase 2 next year to investigate as the iGEM deliverables have changed this year to include two videos.
- Mean Characters per Sentence: Higher mean numbers may indicate more complex writing; we predicted that there might be a correlation that levels off, as overly complicated writing becomes unclear and would likely be detrimental to the success of a Wiki.
- Mean Words per Sentence: Higher mean numbers may indicate more complex writing; we predicted that there might be a correlation that levels off, as overly complicated writing becomes unclear and would likely be detrimental to the success of a Wiki.
- Number of internal links: We predicted that this variable would have the biggest correlation with winning the Wiki prize, as iGEM encourages interlinking project components and showing how one section may inform other parts of the project.
- Number of external links: Although this may have some correlation with winning, we did not expect it to be of much consequence.
- Team Size: We predicted that a larger team may result in a larger collective skillset, as well as having more time, which might result in a generally better Wiki and thus a correlation to winning the Wiki prize.