The ideal candidate is a rare hybrid: an engineer with the programming skills to scrape, combine, and manage data from a variety of sources, and a statistician who knows how to derive insights from that data. They will combine the skills to build new prototypes with the creativity and thoroughness to ask, and answer, the deepest questions about the data. Qualified candidates will have a strong academic background in mathematics, statistics, or data engineering and a passion for data science and machine learning.
The Data Engineer is an integral part of the Data & Analytics Keying & Linking team and works closely across all phases of Search, Match, and Entity Resolution prototype development.
Equifax has a hybrid work schedule that allows for 2 days of remote work (Monday and Friday), with 3 days onsite (Tuesday, Wednesday, Thursday) every week.
This role will work the required onsite days at our Equifax office in Alpharetta, GA or Reston, VA.
Visa sponsorship/support is not available for this position, now or in the future.
This is a direct-hire role and is not open to C2C arrangements or vendors.
What you'll do
With moderate supervision, manage project progress and the collection, development, and management of metadata.
Investigate internal and external stakeholder queries with high-level direction from the Team Leader.
Analyze problems, identify root causes, formulate findings and observations, suggest resolutions, and communicate them to internal and external stakeholders with moderate guidance from the Team Leader.
Maintain current knowledge of industry regulatory requirements, including reporting mandates, concepts and procedures, compliance requirements, and regulatory frameworks and structures.
Support internal and external queries on data standards.
Enter and maintain information in the documentation repository.
Follow established security protocols; identify and report potential vulnerabilities.
Perform intermediate-level data quality checks, following established procedures.
What experience you need
BS degree in a STEM discipline or equivalent; a Master's degree is strongly preferred.
2-5 years of experience as a data engineer or in a related role.
Intermediate proficiency in Python, SQL, NoSQL (or related technologies), or other scripting languages.
Basic understanding of and experience with Google Cloud Platform, plus an overall understanding of cloud computing concepts.
Experience building and maintaining simple data pipelines: following established guidelines, transforming data, and loading it into the pipeline so the content can be consumed and reused in future projects.
Experience supporting the design and implementation of basic data models.
Proficient with Git and experienced contributing to team repositories.
What could set you apart
Some exposure to Entity Resolution, either academic or professional.
Experience with Agile development, including Scrum and other lean techniques.
A Cloud certification is strongly preferred.
Experience working in a cross-functional, matrix organization, at times under ambiguous circumstances.
Experience performing analysis using Google BigQuery and other Google Cloud Platform technologies such as Google Dataflow, as well as Scala + Spark or PySpark.
Excellent problem-solving skills, with the ability to design algorithms that may involve data cleaning, data mining, data clustering, and pattern recognition methodologies.