Data mining, inference, and predictive analytics for the built environment with images, text, and WiFi data
Abstract: What can campus WiFi data tell us about life at MIT? What can thousands of images tell us about the way people see and occupy buildings in real time? What can we learn about the buildings that millions of people take pictures of and text about over time? Crowdsourcing has triggered a dramatic shift in the traditional forms of producing content. The growing number of people contributing to the Internet has created big data with the potential to 1) enhance the traditional forms of spatial information that the design and engineering fields are accustomed to, and 2) yield further insights about a place or building by discovering relationships between datasets. In this research, I explore how the Architecture, Engineering, and Construction (AEC) industry can exploit crowdsourced and non-traditional datasets. I describe the possible roles of such data for three constituents: the historian, the designer or city administrator, and the facilities manager. These roles engage with a building's information in the past, present, and future, each with different goals. As part of this research, I have developed a complete software pipeline for mining, analyzing, and visualizing large volumes of crowdsourced, unstructured content about MIT and other locations. The content comes from images, campus WiFi access points, and text, and is processed in batch or in real time using computer vision, machine learning, and statistical modeling techniques. The pipeline is used to explore meaningful statistical patterns in the processed data.
Images of the MIT Stata Center taken by the public since 2004 and scraped from the internet, shown with their original camera positions (rotation, translation, and position in space).
Public images of the MIT Stata Center, scraped from the internet and assembled into a digital 3D model.
400,000 WiFi access points and the population density of mobile devices connected to WiFi APs by time of day.
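
As a rough illustration of the kind of aggregation behind a density-by-time-of-day view, the sketch below counts distinct devices per access point per hour. It is not the pipeline described in the abstract: the record layout `(ap_id, device_id, hour)`, the function name `devices_per_ap_by_hour`, and the sample values are all hypothetical and chosen only for this example.

```python
from collections import defaultdict


def devices_per_ap_by_hour(records):
    """Count distinct devices seen at each access point in each hour.

    `records` is an iterable of (ap_id, device_id, hour) tuples, where
    `hour` is an integer from 0 to 23. This layout is an assumption made
    for illustration, not the format of the actual campus data.
    """
    seen = defaultdict(set)
    for ap_id, device_id, hour in records:
        # A set ignores repeated sightings of the same device.
        seen[(ap_id, hour)].add(device_id)
    return {key: len(devices) for key, devices in seen.items()}


# Hypothetical sample records, not real measurements.
sample = [
    ("ap-1", "dev-a", 9),
    ("ap-1", "dev-b", 9),
    ("ap-1", "dev-a", 9),  # repeated sighting, counted once
    ("ap-2", "dev-a", 14),
]
counts = devices_per_ap_by_hour(sample)
```

Counting distinct devices rather than raw connection events keeps a single phone that reconnects many times from inflating the density for that hour.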