Data, Big And Small: How The Datasets That Power AI Are Made
AI as a field, and as a buzzword, has exploded in the past few years. But to many people, AI seems like something conjured out of the ether: a computerized intelligence created with nothing but code. This is not the case. The explosion in AI is directly tied to an explosion in the data available for training: these algorithms are fed massive datasets and learn to imitate the human behaviors that went into creating them. In this workshop, we'll trace a brief history of the datasets behind AI and the human labor that made them, covering the advent of web scraping, ImageNet, Mechanical Turk, and more. After that, we'll discuss the ethical considerations of dataset creation, then move into a hands-on portion where we create and label our own image and text datasets, giving participants firsthand experience of the messy, human process of making them.