Collaborate & contribute to the architecture, design, development, and maintenance of large-scale data & analytics platforms, system integrations, data pipelines, data models & API integrations.
Create transformation path for data to migrate from on-prem pipelines and sources to AWS.
Provide input and insights, in conjunction with Data Architects, to the client.
Coordinate with data engineers, providing feedback and structure to the team.
Ensure that data are optimally standardized and analysis-ready.
Prototype emerging business use cases to validate technology approaches and propose potential solutions.
Collaborate and ensure data integrity after large scale migrations.
Deliver high quality data assets to be used by the business to transform business processes and to enable leaders to complete data-driven analyses.
Continuously improve data solutions to increase quality, speed of delivery, and trust in the data engineering team's deliverables, enabling business outcomes.
Reduce total cost of ownership of solutions by developing shared components and implementing best practices and coding standards.
Collaborate with team to re-platform and reengineer data pipelines from on-prem to AWS cloud.
Work together with team members to ensure data quality and integrity during migrations.
Lead by example and pitch in to enable successful and seamless client delivery.
Location: DC area preferred, but the role is remote.
Requirements
8+ years of experience in data engineering.
AWS Cloud certification.
Minimum of 3 years of experience in the following:
Leading engineering teams, including task management and personnel management.
Working in or managing data-centric teams in government or other highly regulated environments.
Strong understanding of data lake, data lakehouse, and data warehousing architectures in a cloud-based environment.
Proficiency in Python for data manipulation, scripting, and automation.
In-depth knowledge of AWS services relevant to data engineering (e.g., S3, EC2, DMS).
Understanding of data integration patterns and technologies.
Proficiency designing and building flexible and scalable ETL processes and data pipelines using Python and/or PySpark and SQL.
Proficiency in data pipeline automation and workflow management tools like Apache Airflow or AWS Step Functions.
Knowledge of data quality management and data governance principles.
Strong problem-solving and troubleshooting skills related to data management challenges.
Experience managing code in GitHub or other similar tools.
Minimum of 2 years of experience in the following:
Hands-on experience with Databricks including data ingestion, transformation, analysis and optimization.
Experience designing, deploying, securing, sustaining and maintaining applications and services in a cloud environment (e.g., AWS, Azure) using infrastructure as code (e.g., Terraform, CloudFormation, Boto3).
Experience with database administration, optimization, and data extraction.
Experience using containerization technology such as Kubernetes or Mesos.
Minimum of 1 year of experience in the following:
Hands-on experience migrating from on-premises data platform(s) to a modern cloud environment (e.g., AWS, Azure, GCP).
Linux/RHEL server and bash/shell scripting experience in on-prem or cloud environments.
Preferred Experience:
Bachelor's Degree in related field.
Previous experience with large-scale data migrations and cloud-based data platform implementations.
Prior experience with Databricks Unity Metastore/Catalog.
Familiarity with advanced SQL techniques for performance optimization and data analysis.
Knowledge of data streaming and real-time data processing frameworks such as Spark Structured Streaming.
Experience with data lakes and big data technologies (e.g., Apache Spark, Citus).
Familiarity with serverless computing and event-driven architectures in AWS.
Certifications in AWS, Databricks, or related technologies.
Experience working in Agile or DevSecOps environments and using related tools for collaboration and version control.
Extensive knowledge of software and data engineering best practices.
Strong communication and collaboration skills with internal and external stakeholders.
Experience establishing, implementing and documenting best practices, standard operating procedures, etc.
Clearance requirements:
Must be able to obtain and maintain a Public Trust clearance.