Data Developer

Join our Development team in Toronto

Application deadline: 31st Oct 2021

About Egress

Egress Software Technologies provides industry and government-certified encryption services that give you choice and control over how you securely share sensitive information. Whether sending information via email or large file transfer, or collaborating on confidential documents, the Egress portfolio of products offers flexibility over deployment and user federation, choice over how users send and receive information, and control over how recipients handle the data shared with them.

We work with large email and file-sharing datasets to provide risk assessments to users in real time by leveraging Machine Learning approaches. Data Engineering and Data Science are an increasingly important part of what we do, and we are now seeking to build out our dedicated Data Operations team to support these efforts and further develop our Egress Data Platform based on Azure Databricks.

We’re happy to offer remote working, with 2 days a month in the office once it has reopened.

About the role

We are seeking a developer with a strong interest in data and machine learning, and associated technologies such as Apache Spark and Azure SQL. This is a cross-functional and collaborative role, working closely with our Development and Science & Innovation teams, as well as our Data Architect and Data Protection Officer. You will help to develop, maintain and deliver production-quality data pipelines and SQL databases supporting our Egress Analytics reporting product, as well as exciting new AI/ML functionality to help customers share information safely. You will bring strong software development skills, along with experience and expertise in contemporary cloud-based data engineering approaches, to Egress. You will be pivotal in developing the Egress Data Platform, which will provide solid foundations for future ML functionality in Egress’s products.

About you

  • Software development experience associated with data manipulation & processing (Python, Scala).
  • Solid SQL knowledge with experience interrogating data from large relational databases/data lakes, designing SQL schemata and optimising performance.
  • Solid understanding of the Microsoft database and BI stack (Azure Synapse, Azure SQL, MS SQL).
  • Experience with Apache Spark and related technologies (Hive, Azure Databricks).
  • Experience in cloud-based deployment practices, e.g. Azure DevOps.
  • Experience with streaming platforms such as Azure Event Hubs and Kafka.
  • Familiarity with pipeline orchestration engines such as Airflow and Azure Data Factory.
  • Extensive experience in data architectures and concepts.
  • Microsoft Azure data integration experience with Azure SQL Data Warehouse, Azure Data Lake, Azure Data Factory and Polybase, or AWS equivalents.
  • Solid understanding of software development lifecycles.
  • Experience working with data scientists and software engineers.