
It was a slightly different business case from our core area of application development: our resource was staffed at Sunlife to help build data warehouse automation software that automates ETL for many data warehouse platforms.
His role in the development process was more that of a Product Engineer, and he also wore a sales hat for Thinking Minds. He managed the roadmap and decided the content of each release.
We also provided three developers for the project: one specialised in databases and SQL, one in our code framework, and one in UI. All developers were cross-skilled, but each was the go-to guy for his specialisation and owned the decisions in that space.
We used a Kimball methodology, so all of us spent a LOT of time going over and over Kimball resources to make sure we were doing the right things. Our database guy then spent at least 50% of his time on performance-related issues, finding the fastest loading and transformation techniques for each DBMS.
Our programming language was Python 3.4, and we used Nuitka to compile our code for distribution. We would have liked to use Python 3.5 or later, but there were too many deployment issues on platforms that already bundled 3.4. The application was WSGI-based and supported stand-alone, CGI and FastCGI deployment methods.
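To give a rough idea of what "WSGI-based with several deployment methods" can look like, here is a minimal sketch using only the standard library's wsgiref module. The names application, serve_standalone and serve_cgi are made up for illustration and are not taken from the actual product. FastCGI is left out because it needs a third-party adapter.

```python
from wsgiref.handlers import CGIHandler
from wsgiref.simple_server import make_server


def application(environ, start_response):
    """Hypothetical WSGI entry point: echoes the requested path as plain text."""
    path = environ.get("PATH_INFO", "/")
    body = "warehouse automation: {}".format(path).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def serve_standalone(host="127.0.0.1", port=8000):
    """Stand-alone deployment: run the app on wsgiref's built-in HTTP server."""
    httpd = make_server(host, port, application)
    httpd.serve_forever()  # blocks until the process is interrupted


def serve_cgi():
    """CGI deployment: the web server starts this script once per request."""
    CGIHandler().run(application)
```

The same application callable is handed to each server wrapper unchanged, which is the main attraction of WSGI here: calling serve_standalone() gives a local server, while a CGI script only needs to call serve_cgi().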
The application generated DDL in SQL, along with ELT scripts that used Python as a glue language around SQL statements specific to each DBMS.
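As a rough illustration of that idea (and not the product's actual generator), the sketch below builds a CREATE TABLE statement from a column list and then uses Python purely as glue to run SQL statements in order. The type mappings, table names and helper names are all hypothetical, and an in-memory SQLite database stands in for a real warehouse target so the example runs anywhere.

```python
import sqlite3

# Hypothetical per-DBMS type mappings; the real product's mappings are not
# described in this post.
TYPE_MAP = {
    "postgresql": {"int": "INTEGER", "text": "TEXT", "money": "NUMERIC(18,2)"},
    "sqlite": {"int": "INTEGER", "text": "TEXT", "money": "REAL"},
}


def generate_ddl(table, columns, dialect):
    """Render a CREATE TABLE statement for the given dialect."""
    types = TYPE_MAP[dialect]
    cols = ",\n    ".join(
        "{} {}".format(name, types[kind]) for name, kind in columns
    )
    return "CREATE TABLE {} (\n    {}\n);".format(table, cols)


def run_elt(conn, statements):
    """Python as glue: execute the SQL statements in order, then commit."""
    cur = conn.cursor()
    for sql in statements:
        cur.execute(sql)
    conn.commit()


def demo():
    """Load a staging table, then transform it into a dimension with SQL."""
    conn = sqlite3.connect(":memory:")  # stand-in for a warehouse target
    columns = [("customer_id", "int"), ("customer_name", "text")]
    run_elt(conn, [
        generate_ddl("stg_customer", columns, "sqlite"),
        generate_ddl("dim_customer", columns, "sqlite"),
        "INSERT INTO stg_customer VALUES (1, 'Ada'), (1, 'Ada'), (2, 'Grace')",
        "INSERT INTO dim_customer "
        "SELECT DISTINCT customer_id, customer_name FROM stg_customer",
    ])
    rows = conn.execute(
        "SELECT customer_id, customer_name FROM dim_customer "
        "ORDER BY customer_id"
    ).fetchall()
    conn.close()
    return rows
```

All of the transformation work happens inside the database (the "T" after the "L" in ELT); Python only decides which statements to run and in what order, so swapping the dialect key is where DBMS-specific SQL would come in.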
Our databases were SQLite3 for metadata storage internal to our product and PostgreSQL as our primary development database; we supported PostgreSQL, Redshift, Snowflake, Azure SQL Data Warehouse and Greenplum as our targets. The guys were also working at the time on two new data warehouse platforms, which were rolled out in the summer.
We planned our work using Trello. Each developer used Dropbox to keep local storage in sync between their own machines, and as a team we used Mercurial for source control. We were all in separate locations, so our must-have tools were email, Skype and JIRA.
I hope that gives you a little insight into a different aspect of ETL / ELT 🙂