Data Infrastructures

September 14, 2017

How THRON leverages cloud architectures to excel

Dario De Agostini, CTO @ THRON

This talk will briefly explain how THRON uses Amazon Web Services (AWS) and will give a few examples of how cloud architectures brought successful results, especially in managing data analytics to provide real-time insights. Dario will describe the Lambda architecture, the data pipeline and various AWS services before sharing lessons learnt. He will also share his vision of the future of developers.

Slides

Crawling and processing the Italian corporate web

Alessio Guerrieri, Data Scientist @ SpazioDati

SpazioDati collects public information about all Italian companies from many different sources, the most challenging being the World Wide Web. Our Internet Data Gathering project crawls and processes data from the entire Italian web, using distributed frameworks such as Hadoop, Nutch, Elasticsearch and Spark. This talk will give an overview of the extraction pipeline and present some of the issues we tackled during and after development.

Slides