Extracting data from the Internet with Scrapy

While exposing data to developers through APIs is becoming more common, most of the data found on the Internet is still available only as raw HTML, often buried in seemingly chaotic tags. This talk is a quick introduction for data scientists on how to politely extract data from a website and store it in a structured database with the Python library Scrapy, and on how to extend Scrapy to fit their specific needs.
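As a rough illustration of what "politely extract and store" can look like, here is a minimal sketch; it is not taken from the talk. It assumes Scrapy's long-standing item pipeline hooks (`open_spider`, `process_item`, `close_spider`) and a handful of real Scrapy setting names. The setting values, the `pages` table, the `url`/`title` fields, and the `SQLitePipeline` class name are all hypothetical choices made for this example.

```python
import sqlite3

# Politeness-related Scrapy settings. The setting names exist in Scrapy;
# the values below are illustrative assumptions, not recommendations.
POLITE_SETTINGS = {
    "ROBOTSTXT_OBEY": True,               # respect the site's robots.txt
    "DOWNLOAD_DELAY": 1.0,                # wait between requests to one site
    "AUTOTHROTTLE_ENABLED": True,         # adapt the delay to server latency
    "CONCURRENT_REQUESTS_PER_DOMAIN": 2,  # limit parallel requests per site
    "USER_AGENT": "example-research-bot (+https://example.com/contact)",
}


class SQLitePipeline:
    """Hypothetical item pipeline that stores scraped items in SQLite.

    A Scrapy pipeline is a plain Python class, so this needs no Scrapy import.
    Items are assumed to be dict-like with "url" and "title" keys.
    """

    def __init__(self, db_path="items.db"):
        self.db_path = db_path
        self.conn = None

    def open_spider(self, spider):
        # Called when the spider starts: open the database, create the table.
        self.conn = sqlite3.connect(self.db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS pages (url TEXT PRIMARY KEY, title TEXT)"
        )

    def process_item(self, item, spider):
        # Called for every scraped item: insert it, or overwrite the row
        # already stored for the same URL.
        self.conn.execute(
            "INSERT OR REPLACE INTO pages (url, title) VALUES (?, ?)",
            (item["url"], item["title"]),
        )
        self.conn.commit()
        return item  # hand the item on to any later pipeline

    def close_spider(self, spider):
        # Called when the spider finishes: release the connection.
        self.conn.close()
```

In a Scrapy project, a pipeline like this would typically be enabled through the `ITEM_PIPELINES` setting, which maps the class's import path (which depends on the project layout) to an integer priority.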

Israël Hallé

Flare Systems Inc.

Israël earned a B.Eng. from ETS in 2016. He has since developed sought-after expertise in computer security and software engineering. Israël has been involved in the computer security ecosystem as a conference speaker, workshop host, bug bounty hunter, and open-source developer. He has been building the technology that powers Flare Systems from the beginning and now serves as its Chief Architect, working to solve industry problems through innovation.

