REMOTE-CDM Data Engineer
Tampa, FL
The position is REMOTE
MelkoTech is currently seeking a motivated, career- and team-oriented Data Engineer with SQL and SSIS development skills to provide unparalleled support to multiple federal agencies through the Department of Homeland Security (DHS) Continuous Diagnostics and Mitigation (CDM) Program. The CDM Program is a high-profile, high-visibility cybersecurity modernization and risk management program where you can contribute innovative solutions to enhance federal agencies' Information Assurance (IA) continuous monitoring capabilities.
This CDM role requires the ability to obtain a suitability clearance and involves working with relational and nonrelational data. The candidate will work with multiple data formats (CSV, JSON, XML) and with data in the Axonius cybersecurity asset management tool; create and update data transformation pipelines and storage (ETL/ELT) in Azure Cloud Data Services (Azure Blob Storage, Azure Data Factory, Azure SQL Database); work with data in on-premises SQL databases; and use SQL query language and logic to insert, delete, review, stage, and verify the quality of data.
Note from the hiring manager: "We need a candidate with some programming experience who is able to connect to APIs, transform the data and store it in SQL, write stored procedures, and query to look for data quality issues/trends, etc."
Responsibilities include, but are not limited to:
• Connect to APIs or ingest data from CSV flat files using Azure services, landing the data in Azure Blob Storage for staging
• Create data mappings and transformation pipelines in Azure Data Factory that load the staged JSON data into an Azure SQL Database. Support either relational or nonrelational/NoSQL structures
• Build the data pipeline and its SQL, and continuously evaluate their effectiveness and efficiency to improve overall pipeline performance. This may require scripting to supplement SQL and tool processes
• Design stable, reliable, and secure SQL databases that perform optimally at scale. This includes developing complex SQL for analysis, aggregations, complex joins, stored procedures, views, etc. It also includes troubleshooting errors in data outputs and identifying when data issues originate in a downstream sensor tool itself
• Support releases with updates to SQL schema, API data mappings, and ETL pipelines
• Document all work including version control of all source code in the pipeline. This includes data requirements, relationships, data flows, and lineage. This also includes working with the test team on documenting test cases
• Collaborate with other project stakeholders and provide subject matter expertise on all questions about data in the pipeline and the ETL processes
• Actively participate in Agile ceremonies
• Perform other duties as assigned
• BS in Computer Science or related field
• Minimum of 5 years of experience in data with:
o Experience in building and optimizing data architectures and pipelines with emphasis on Azure Data Factory or SQL Server Integration Services (SSIS)
o Strong experience with data ETL/ELT tools, processes, and data validation
o Complex SQL development in creating and maintaining relational database schemas
o Data analysis to find trends, issues, and quality problems across multiple data sets by querying and joining data
o Experience with adding metadata, such as tags, and enriching data from other sources
o Experience with SQL Database or other relational database design, security, performance tuning, SQL queries, stored procedures, and views
o Experience with dashboard and report generation, data analytics logic including defining queries or filters and building visualizations
o Experience with parsing and storing nonrelational file formats (e.g., JSON, XML, and CSV)
o Experience in data lifecycle management
o Awareness of API types and the calls and responses used to retrieve data
o Experience with database authentication and access and securing data in transit and at rest
o Experience with release management and version control tools (e.g., Git, SVN)
• Strong communication, interpersonal, and collaboration skills in a team-oriented environment
• Ability to research, test, and implement independently
• Strong analytical, logical, and problem-solving skills
• Ability to adhere to defined processes and procedures, and suggest improvements
• Ability to effectively prioritize and handle multiple tasks simultaneously
• Knowledge of connecting to and implementing solutions on cloud resources
• Strong technical documentation skills
• Experience in adhering to schedules for software version delivery and release management in an Agile/DevOps environment (stories, bugs, and issue management)
• Experience with JIRA and Agile development and release practices
• Experience with databases, data stores, authorization and access, and security in the cloud
• Experience architecting and implementing data ETL/ELT pipelines using Azure Data Factory, SQL Server Integration Services (SSIS), AWS Glue, Google Cloud Data Fusion, Informatica, DataStage, etc.
• Experience with nonrelational, JSON-based databases
• Overall strong background in data engineering and in architecting enterprise integration solutions using multiple data sources
• Understanding of technical, operational, and management issues related to data design, development, and deployment within multiple distributed systems
• Scripting or programming skills (PowerShell, Python, Java, data science frameworks)
• Building visualizations with Microsoft Power BI
• Certification(s) in SQL a plus
Security Clearance Requirements:
• Candidate is required to be a US citizen (non-dual citizenship) with the ability to obtain DHS Suitability
• Office work, typically sedentary with some movement around the office.