Sr Data Engineer (NO C2C CANDIDATES) at Sharp Decisions in Marysville, Ohio

Posted in Other 2 days ago.

Type: full-time





Job Description:

Title: Sr Data Engineer (NO C2C CANDIDATES)

Location: Marysville, OH (Locals only)

Contract duration: 8-9 month contract (with possible extension)

Hybrid role

The candidate will be working on generative AI solutioning, specifically in AWS and/or Azure.

The Data and Platform Engineer plays a crucial role in designing, developing, and maintaining scalable and reliable data platforms to support the organization's data needs. With a propensity for action over analysis, they are responsible for ensuring efficient data ingestion, storage, processing, and retrieval, as well as providing data integration solutions to enable effective data analysis and reporting. This role requires a strong understanding of data engineering concepts, data management, and programming languages to deliver innovative data solutions while respecting governance principles like responsible AI and data privacy.

Daily Tasks Performed

Data Integration and Transformation:
  • Develop and implement data integration solutions to enable seamless data movement across various systems and platforms.
  • Implement efficient data workflows, data pipelines, and ETL processes to accommodate structured and unstructured data from various sources to ensure the timely delivery of high-quality data.
  • Define data models and build data hierarchy structures to support AI/ML model integrations that are reliable and scalable.
  • Transform and cleanse data to ensure accuracy, consistency, and integrity.
  • Collaborate with data analysts and data scientists to understand data requirements and deliver tailored solutions.
  • Troubleshoot and resolve data integration issues in a timely manner.

Data Platform Development and Maintenance:
  • Design, develop, and maintain scalable data platforms that support data ingestion, storage, processing, and retrieval.
  • Collaborate with cross-functional teams to ensure data platforms meet the organization's evolving data requirements.
  • Regularly monitor the data platform's performance, identifying and resolving any issues or bottlenecks.

Data Quality, Governance and Security:
  • Implement and enforce data quality and governance assurance policies, ensuring compliance with relevant data protection regulations and industry best practices.
  • Develop and maintain data security measures, including access controls, encryption, and data anonymization techniques.
  • Monitor data usage and access patterns, proactively identifying and mitigating potential security risks.
  • Collaborate with the IT and cybersecurity teams to address data-related vulnerabilities and incidents.
  • Perform data profiling, data validation, and data cleansing activities to ensure data accuracy and completeness.
  • Collaborate with stakeholders to identify and resolve data quality issues.
  • Define and monitor data quality metrics to measure and improve data quality over time.
  • Conduct regular audits and reviews to ensure adherence to data quality standards.
  • Ensure data governance and compliance standards, including responsible AI principles and data privacy, are adhered to during data integration and transformation processes.

Performance Optimization:
  • Identify and implement performance optimization strategies for data platforms and processes.
  • Optimize database design, data structures, and query performance to enhance data retrieval speed.
  • Monitor and analyze data processing and query performance metrics, taking proactive actions to optimize their performance.
  • Collaborate with infrastructure and network teams to ensure optimal data platform performance.
  • Conduct regular performance testing and tuning activities and optimize data platforms for performance, reliability, and security.

Documentation and Knowledge Sharing:
  • Document data platform architecture, data models, data flows, and technical specifications.
  • Create and maintain comprehensive documentation of data engineering processes and workflows.
  • Share knowledge and best practices with team members and stakeholders.
  • Provide training and support to users on data engineering tools and technologies.
  • Contribute to the development and enhancement of data engineering standards and guidelines.
  • Continuously research, evaluate and implement emerging technologies and best practices in data engineering to drive innovation.

Position Success Criteria (Desired) - 'WANTS'
  • BS in a technical discipline such as Computer Science, Information Systems, Computer Engineering, or a related field. Proven experience as a Data Engineer or Database Developer, or other relevant experience and certifications, is welcome in lieu of a degree.
  • 3-5 years working in cloud-based environments.
  • Strong understanding of data engineering principles, data management, and data modeling concepts.
  • Proficient in programming languages such as Python, Java, or Scala, with experience in database query languages (e.g., SQL).
  • Experience with cloud-based data platforms (e.g., AWS, Azure, GCP) and associated services (e.g., S3, Redshift, BigQuery).
  • Familiarity with data integration techniques, ETL frameworks (e.g., Apache Spark), and workflow management tools (e.g., Airflow).
  • Experience with data streaming and real-time data processing frameworks (e.g., Kafka, Apache Flink, AWS Kinesis).
  • Familiarity with machine learning and AI techniques for data analysis and prediction.
  • Understanding of data security, encryption, privacy, and compliance requirements.
  • Excellent problem-solving and analytical skills, with the ability to optimize data processing pipelines for performance and efficiency.
  • Strong communication skills, with the ability to effectively collaborate with cross-functional teams and explain complex technical concepts to non-technical stakeholders.

Other job-specific skills:
  • Experience with data engineering tools and frameworks such as Apache Airflow, Apache NiFi, Talend, etc.
  • Experience with data science tools such as Open Data Hub (Seldon, Prometheus, Dataiku, IBM Watson Studio, etc.).
  • Deep learning: machine learning that uses neural networks with three or more layers to learn from large amounts of data.
  • Cloud/big data tools (e.g., blob storage, Redshift, Kafka, Hadoop, Spark, Hive).
  • Experience with containerization technologies such as Docker or Kubernetes.
