Systems Analyst, Data Platform

Ottawa, Ontario
Lightspeed System Development - Landing Stations and User Terminals – LEO Systems Operations Software /
Full Time Hire - (FTE with Benefits) /
Hybrid
Telesat (NASDAQ and TSX: TSAT) is a leading global satellite operator, providing reliable and secure satellite-delivered communications solutions worldwide to broadcast, telecommunications, corporate and government customers for over 50 years.  Backed by a legacy of engineering excellence, reliability and industry-leading customer service, Telesat has grown to be one of the largest and most successful global satellite operators.
 
Telesat Lightspeed, our revolutionary Low Earth Orbit (LEO) satellite network, scheduled to begin service in 2027, will transform global broadband connectivity for enterprise users by delivering a combination of high capacity, security, resiliency and affordability with ultra-low latency and fiber-like speeds. Telesat is headquartered in Ottawa, Canada, and has offices and facilities around the world.
 
The company’s state-of-the-art fleet consists of 14 GEO satellites, the Canadian payload on ViaSat-1 and one LEO 3 demonstration satellite. For more information, follow Telesat on X and LinkedIn or visit www.telesat.com.
 


We are seeking a highly skilled Kafka Expert with deep expertise in Apache Kafka, Linux systems (Red Hat and Debian), and Kubernetes to join our data platform team. This is a hands-on engineering role focused on designing, deploying, and optimizing Kafka-based data streaming solutions that are scalable, secure, and production-ready. You will work closely with the infrastructure and development teams to align Kafka and stream-processing architectures with platform standards and to build robust use cases for real-time data streaming, observability, and microservices. Your contributions will be critical to ensuring high availability, performance, and operational excellence across distributed systems.



Key Responsibilities

    • Design, deploy, and manage Apache Kafka clusters in development/testing/production environments.
    • Deploy and manage Apache Spark and Apache Flink in production environments.
    • Optimize Kafka performance, reliability, and scalability for high-throughput data pipelines.
    • Ensure seamless integration of Kafka with other systems and services.
    • Manage and troubleshoot Linux-based systems (Ubuntu) supporting Kafka infrastructure.
    • Deploy, fine-tune, and operate Kafka on Kubernetes clusters using Helm, Operators, or custom manifests.
    • Collaborate with cross-functional teams to identify and implement Kafka use cases.
    • Contribute to automation and Infrastructure as Code (IaC) practices through CI/CD pipelines with GitLab.
    • Monitor system health, implement alerting, and ensure high availability.
    • Participate in incident response and root cause analysis for Kafka and related systems.
    • Evaluate and recommend Kafka ecosystem tools such as Kafka Connect, Schema Registry, MirrorMaker, and Kafka Streams.
    • Build automation and observability tools for Kafka using Prometheus, Grafana, Fluent Bit, etc.
    • Apply a deep understanding of streaming and batch processing architectures.
    • Develop with Spark Structured Streaming and the Flink DataStream API.
    • Work with teams to build end-to-end Kafka-based pipelines for various applications (data integration, event-driven microservices, logging, monitoring).
    • Run Spark and Flink on Kubernetes, YARN, or standalone clusters.
    • Configure resource allocation, job scheduling, and cluster scaling.
    • Implement checkpointing, state management, and fault-tolerance mechanisms.
    • Tune Spark and Flink jobs for low latency, high throughput, and resource efficiency.
    • Apply memory management, shuffle tuning, and parallelism settings.
    • Use the Spark UI and Flink Dashboard, and integrate them with Prometheus/Grafana.
    • Implement metrics collection, log aggregation, and alerting for job health and performance.
    • Apply TLS encryption, Kerberos, and RBAC in distributed environments.
    • Integrate with OAuth or other identity providers.
    • Work with time-series databases.

Required Qualifications

    • 5+ years of experience administering and supporting Apache Kafka in production environments. 
    • Strong expertise in Linux system administration (Red Hat and Debian).
    • Solid experience with Kubernetes (CNCF distributions, OpenShift, Rancher, or upstream K8s).
    • Proficiency in scripting (Bash, Python) and automation tools (Ansible, Terraform).
    • Experience with Kafka security, monitoring (Prometheus, Grafana), service mesh (Istio), and schema management.
    • Familiarity with CI/CD pipelines and DevOps practices.
    • Comfortable with Helm, YAML, Kustomize, and GitOps principles (GitLab).
    • 4+ years of experience in Apache Spark development, including building scalable data pipelines and optimizing distributed processing.
At Telesat, we take pride in being an equal opportunity employer that values equality in the workplace. We are committed to providing the best candidate experience possible, including any required accommodations, at every stage of our interview process. Qualified applicants selected for an interview who require accommodations are advised to inform the Telesat Talent team accordingly; we will work with you to meet your needs. All accommodation information provided will be treated as confidential.