CAD 146 - Big Data Solutions Architect
Toronto, Montreal or Remote Canada
NorthBay Solutions is an AWS Advanced Consulting Partner certified in the big data, public sector, mobile, machine learning, DevOps and education competencies. We design big data, mobile and web solutions for premier brands, working with some of the most progressive companies in the world to create profoundly impactful solutions.
NorthBay is seeking technically savvy, hands-on Big Data Solutions Architects to implement solutions for our customers together with our offshore engineering team. In this role, you will collaborate with NorthBay customers (sometimes onsite), understand their requirements and needs, translate them into specifications, drive work with offshore engineering teams, and deliver solutions and results to the customer. This includes assessing customer needs, re-engineering business intelligence processes, designing and developing data models, and sharing your expertise throughout the deployment process.
- Interface with client project sponsors to gather, assess and interpret client needs and requirements
- Develop a data model and Data Lake design around stated use cases to capture the client’s KPIs and data transformations
- Identify the relevant AWS services (especially Amazon EMR, Redshift, Athena, Glue and Lambda) and an architecture that can support client workloads and use cases; evaluate the pros and cons of the identified options before arriving at a recommended solution optimal for the client’s needs.
- Be a hands-on SA: architect, code and facilitate full-scale development.
- Be able to explain to the client the tradeoffs among the various AWS options, and why the recommended solution(s) and architecture were chosen as optimal for the client’s needs.
- Work closely with the client and broader NorthBay Delivery team to implement in Agile fashion the architecture and chosen AWS services using AWS Best Practices and principles from the AWS Well-Architected Framework
- Assess, document and translate goals, objectives, problem statements, etc. to our offshore team and onshore management
- Advise on database performance, alter the ETL process, provide SQL transformations, discuss API integration, and derive business and technical KPIs
- Help transition the implemented solution into the hands of the client, including providing documentation the client can use to operate and maintain the solution.
- Help NorthBay Solutions with its Continuous Improvement processes to learn from each customer project, including doing project retrospectives and writing up “Lessons Learned”.
- Be able to travel up to 40% of the time
Qualifications (Must Haves):
- Strong Design / Development Experience on Amazon EMR or AWS Glue, preferably with Spark (PySpark or Scala)
- Strong troubleshooting / admin experience with EMR or AWS Glue, specific to infrastructure (CloudFormation) code, deployment via AWS CLI, and bootstrap actions.
- Ability to implement transient infrastructure (e.g. transient EMR clusters) that leverages decoupled storage (S3) and compute. Implement these using reproducible automated mechanisms like AWS CLI scripts, CloudFormation templates, and custom code leveraging AWS SDKs.
- Strong experience in data lake design patterns factoring in key features like ACID transactions, updates and deletes, data versioning, schema evolution, schema enforcement, audit history and metadata handling.
- Strong experience on one or more MPP Data Warehouse/Database Platforms, preferably Amazon Redshift, PostgreSQL, Teradata, Oracle, Aurora PostgreSQL/MySQL or similar.
- Possess in-depth working knowledge and hands-on development experience in building Distributed Big Data Solutions including ingestion, caching, processing, consumption, logging & monitoring
- Strong Development Experience on one or more event-driven streaming platforms, preferably Kinesis, Firehose, Kafka, Spark Streaming, or Apache Flink
- Strong Data Orchestration experience using one or more of these tools: AWS Step Functions, Lambda, AWS Data Pipeline, AWS Glue orchestration, Apache Airflow, Luigi or related
- Strong understanding and experience with Cloud Storage infrastructure, and operationalizing AWS-based storage services & solutions preferably S3 or related
- Strong technical communication skills and the ability to engage a variety of business and technical audiences, explaining features and metrics of Big Data technologies based on experience with previous solutions
- Strong Understanding of one or more Cluster Managers (YARN, Hive, Kubernetes, Pig, etc.)
Qualifications (Nice to Haves):
- Strong Data Cataloging experience, preferably using AWS Glue or a similar tool
- Strong Development Experience on at least one NoSQL or document database
- Experience with one or more ingestion/integration tools such as Apache NiFi, StreamSets or similar
- Strong Development Experience on at least one Caching Tool like Amazon ElastiCache (with Redis or Memcached) or Lucene
- Strong Understanding and experience in Big Data Audit Logging and Monitoring solutions like AWS CloudTrail and CloudWatch.
- 5+ years of AWS Solutions implementation and professional services experience, preferably in the Data Analytics space.
- A passion for exploring data and extracting valuable insights.
- Proven analytical, problem solving, and troubleshooting expertise.
- Proficiency in SQL, preferably across a number of dialects (we commonly write MySQL, PostgreSQL, Redshift, SQL Server, and Oracle).
- Exposure to developer tools/workflow (e.g., git/GitHub, *nix, SSH)
- Experience optimizing database/query performance.
- Experience with AWS ecosystem (EC2, S3, RDS, Redshift).
- Experience with business intelligence tools with a physical model (e.g., MicroStrategy, Business Objects, Cognos).
- Experience with data warehousing.
- Exposure to NoSQL-based, SQL-like technologies (e.g., Hive, Pig, Spark SQL/Shark, Impala, BigQuery, Mongo)
- Excellent verbal and written communication skills
Education and Experience:
- Bachelor’s Degree in Computer Science or Equivalent
- Minimum five years of experience in Big Data engineering on AWS
- AWS Solutions Architect certification
- AWS Big Data Specialty certification
- Or any other data-centric certification