Alrighty, super-citizen-scientists and data nerds, we want to harness the power of your analytical and creative brains to help us make sense of the Belly Button Biodiversity data.
You may recall that our earlier attempts to predict the species that take up residence in an individual’s belly button have fallen flat.
So we want YOU to take a look: find patterns we have failed to see. Approach the data in new and creative ways. It’s free and available for you to PLAY.
Our only requirement is that you share what you learn via the comment box below or by dropping us an email (email@example.com). Our goal is to create a gallery of data analyses; we want to show off and discuss your results. Look for some of those features and innovative tools coming soon.
This data set includes the data from our PLOS ONE paper as well as information from an additional 93 people (with promises of more to come soon, as lab work wraps up).
The MS Excel Workbook “Belly Button Biodiversity Data” contains two sheets of data:
Sheet #1 – Belly Button Data Matrix
Belly button sample numbers (unique to each participant, n=153) are on the top row, and the lowest taxonomic level of bacteria and archaea identified (‘phylotype’ or OTU, operational taxonomic unit) is in the second column. (The OTU ID # is listed in the first column for ease of reference.) Taxonomies reflect the lowest taxonomic level to which samples could be given a name. If samples were so different from existing samples that they could not readily be classified, they are simply labeled “bacteria”, reflecting that they were different from everything we know. Taxonomies presented follow the descending taxonomic rank you learned about in Intro Biology: Kingdom (or Domain in the case of Archaea); Phylum; Subphylum; Order; Family; Genus. See Hulcr et al. 2012 for specific methods. Numbers in the table indicate sequencing read numbers.
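If you'd rather poke at Sheet #1 with a script than in Excel, here is a minimal sketch in Python. It assumes you have exported the sheet to CSV with the layout described above (OTU ID in the first column, taxonomy in the second, one column of read counts per sample). The toy values, the sample IDs (S1, S2, S3), and the function names are made up for illustration and are not from the real workbook.

```python
import csv
import io

# Toy stand-in for Sheet #1 exported as CSV. Sample IDs and read counts are
# invented for illustration; they are NOT values from the real data set.
TOY_MATRIX_CSV = """OTU_ID,Taxonomy,S1,S2,S3
1,Taxon A,120,0,35
2,Taxon B,80,15,0
3,bacteria,0,5,5
"""


def read_otu_matrix(handle):
    """Return (sample_ids, taxonomy_by_otu, counts_by_otu) from a CSV handle."""
    reader = csv.reader(handle)
    header = next(reader)
    sample_ids = header[2:]  # assumed: columns 3 onward are belly button samples
    taxonomy = {}
    counts = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        otu_id = row[0]
        taxonomy[otu_id] = row[1]
        # Treat empty cells as zero reads (an assumption about the export).
        counts[otu_id] = [int(float(v)) if v else 0 for v in row[2:]]
    return sample_ids, taxonomy, counts


def per_sample_summary(sample_ids, counts):
    """Total reads and number of OTUs with at least one read, per sample."""
    summary = {}
    for i, sample in enumerate(sample_ids):
        column = [reads[i] for reads in counts.values()]
        summary[sample] = {
            "total_reads": sum(column),
            "richness": sum(1 for n in column if n > 0),
        }
    return summary


sample_ids, taxonomy, counts = read_otu_matrix(io.StringIO(TOY_MATRIX_CSV))
summary = per_sample_summary(sample_ids, counts)
for sample, stats in summary.items():
    print(sample, stats)
```

To try this on the real data, you would swap the `io.StringIO(...)` handle for `open("your_export.csv", newline="")`, where the file name is whatever you chose when exporting. Total reads per sample is worth checking early, since sequencing depth can differ a lot between samples.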
Sheet #2 – Belly Button Meta Data Matrix
Again, belly button sample numbers (unique to each participant, n=153) are on the top row. The first column contains meta-data collected for each participant. Row 2 represents the sample collection event. Rows 3-14 are information self-reported by participants. Rows 15-22 are landscape-level environmental factors determined from publicly available data layers including the National Land Cover Database, WorldClim, and MODIS. Values correspond to the pixel located at the center of each zip code reported.
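Because Sheet #2 has samples across the top and variables down the side, many analysis tools will want it flipped so that each participant becomes one record. Here is a rough sketch of that flip in Python, again assuming a CSV export of the sheet. The variable names, sample IDs, and values below are hypothetical placeholders, not the real row labels.

```python
import csv
import io

# Toy stand-in for Sheet #2 exported as CSV. Variable names and values are
# hypothetical placeholders, NOT the actual rows of the workbook.
TOY_META_CSV = """Variable,S1,S2,S3
Collection event,A,A,B
Age,34,27,
Mean annual temperature,15.2,9.8,21.0
"""


def read_metadata(handle):
    """Return {sample_id: {variable: value}}, with empty cells as None."""
    reader = csv.reader(handle)
    header = next(reader)
    sample_ids = header[1:]  # assumed: first column holds the variable names
    records = {sample: {} for sample in sample_ids}
    for row in reader:
        if not row:
            continue  # skip blank lines
        variable, values = row[0], row[1:]
        for sample, value in zip(sample_ids, values):
            # Values stay as strings; convert per variable once you know its type.
            records[sample][variable] = value if value != "" else None
    return records


records = read_metadata(io.StringIO(TOY_META_CSV))
for sample, fields in records.items():
    print(sample, fields)
```

Keying the records by sample number makes it easy to line the meta-data up with the read counts in Sheet #1, since both sheets share the same sample numbers on the top row.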
Ready, set, ANALYZE! And of course, please do report back your findings!
Belly Button Biodiversity Data