Data Engineer - TS/SCI or TS
Delivery – Data Science
U.Group is an Advanced Technology and Creative Design company launched in 2019 after combining the data and domain expertise of ByteCubed with the world-class creative of CHIEF. Together under a new brand and market position, U.Group’s team of 280+ changemakers is using data science, engineering, and ingenuity to develop a deep understanding of our clients’ most pressing challenges, and powerful solutions for confronting them. Working collaboratively out of offices in Arlington, VA, Washington, DC, Portland, OR, and in the heart of the action at client sites, U.Group is committed to using customer-centric innovation to create new opportunity in the public and private sectors. To learn more, visit www.U.Group
U.Group is hiring a cleared (TS/SCI or TS) Data Engineer to join our Data team. You will work at our Corporate Headquarters in Arlington, VA, and at our client site.
About this role
This role is primarily in service of “data science”. People use the words “data science” to mean a lot of different things. At U.Group, data science mostly focuses on data wrangling, natural language processing, machine learning, and visualization. We work with a lot of different kinds of data: unstructured text from the wild web, relational datasets from commercial providers, semi-structured data from public APIs. On a practical level, whether we’re working with social media data or corporate entity data, we’re generally looking for ways to help our clients do their work (look up data, draw connections, anticipate shifts and changes, make hard business decisions) faster, more easily, more intelligently, and at greater scale. We leverage libraries from Python and R, tools like Spark and Hadoop, and a range of different SQL and NoSQL databases to do that, though we’re fairly open-minded in terms of stack (well, we do have a strong preference for open source).
Who we’re looking for
We are looking for a savvy Data Engineer to join our growing team of data professionals. You will be responsible for expanding and optimizing our data and data pipeline architecture, as well as optimizing data flow and collection for cross-functional teams. You will also be responsible for building ingestion pipelines for new, and possibly challenging, data sources, and integrating them.
You are an experienced data wrangler and data pipeline builder who enjoys optimizing data systems and building them from the ground up. You will support our data scientists, data visualization experts, and database architects on data initiatives, and you will ensure that an optimal data delivery architecture is applied consistently across ongoing projects.
You must be self-directed and comfortable supporting the data needs of multiple teams, systems and products.
You should be excited by the prospect of building new products, optimizing existing ones or even re-designing current data science products to support our next generation of data products, services, and initiatives.
- Assemble large, complex data sets that meet functional / non-functional business requirements.
- Create and maintain optimal data pipeline architecture.
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS.
- Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.
- Work with stakeholders including the Data, Product, Program, and Executive teams to assist with data-related technical issues and support their data infrastructure needs.
- Keep our data separated and secure across national boundaries through multiple data centers and AWS regions.
- Create data tools for analysts and data scientists that assist them in building and optimizing our product into an innovative industry leader.
- Employ good data governance practices to ensure data awareness, high data quality, and data freshness.
- Work with subject matter experts to strive for greater functionality in our data systems.
Qualifications (not every skill is required, but you should have at least a majority of the ones below)
- You must have an active TS/SCI or TS clearance.
- You have extensive experience working with data, and you’re comfortable with structured and unstructured data (HTML, XML, JSON, CSV, etc.).
- Advanced working SQL knowledge and experience working with relational databases, query authoring (SQL) as well as working familiarity with a variety of databases. You don’t have a dog in the SQL/NoSQL fight -- you believe in picking the best tool for the job, and you’ve used enough of both to have good instincts about what will work best to solve the problem at hand.
- Experience building and optimizing data pipelines, architectures and data sets.
- Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
- Strong analytic skills related to working with unstructured datasets.
- Experience building processes that support data transformation, data structures, metadata, dependency and workload management.
- A successful history of manipulating, processing and extracting value from large disconnected datasets.
- Working knowledge of message queuing, stream processing, and highly scalable ‘big data’ data stores.
- Strong project management and organizational skills.
- Experience supporting and working with cross-functional teams in a dynamic environment.
- We are looking for a candidate with 3+ years of experience in a Data Engineer role.
- You should also have experience using the following software/tools:
- Experience with Hadoop, Spark, Kafka, etc.
- Experience with relational SQL and NoSQL databases, including Postgres, MongoDB, and Cassandra.
- Experience with data pipeline and workflow management tools: Azkaban, Luigi, Airflow, etc.
- Experience with Continuous Integration / Continuous Delivery tools.
- Experience with stream-processing systems: Storm, Spark-Streaming, etc.
- Experience with object-oriented, functional, declarative, and scripting languages: Python, Java, C++, Scala, R, etc.
- You have experience with agile software development, know about sprints, scrum, and all that stuff, and you aren’t squeamish about code reviews or rubber ducking.
- You understand why version control is important and are willing to use Git branching conventions in a routine, methodical, communicative, and participatory way.
Nice to haves
- A grasp of, or desire to learn, machine learning and natural language processing.
- AWS experience or, if you don’t have it yet, a desire to learn AWS, which we will gladly teach.
- You are game to help us find smart ways to implement/integrate data science features (e.g. machine learning models) within a Java- and Angular-based software application.
- You have a working knowledge of data visualization technologies.
Life at U.Group
We encourage our colleagues to lead balanced lives. Our comprehensive benefits include medical, dental, vision, disability, wellness programs, flexible spending, 20 days paid time off, and paid holidays. We also offer:
- Flex Time and Remote Work Options
- Education reimbursement program
- Transit/parking subsidy program
- Parental Leave Policy
- Professional Development Program
- 401(k) with Company Match
EQUAL EMPLOYMENT OPPORTUNITY – Our policy is to provide equal employment opportunity (EEO) to all persons regardless of age, color, national origin, citizenship status, physical or mental disability, race, religion, creed, gender, sex, sexual orientation, gender identity and/or expression, genetic information, marital status, status with regard to public assistance, veteran status, or any other characteristic protected by federal, state or local law. In addition, we will provide reasonable accommodations for qualified individuals with disabilities.
U.Group shall abide by the requirements of 41 CFR §§ 60-300.5(a) and 60-741.5(a). These regulations prohibit discrimination against qualified individuals on the basis of protected veteran status or disability, and require affirmative action by covered prime contractors and subcontractors to employ and advance in employment qualified protected veterans and individuals with disabilities.