The data science team at AF Group is seeking an intern who's excited to get hands-on experience using statistics, machine learning and natural language processing to solve complex business problems and help deliver valuable insights to our claims and pricing business partners.
Check out the Internships at Emergent Holdings video to learn more.
JOB DESCRIPTION:
Perform exploratory data analysis and feature engineering.
Build predictive models using supervised and unsupervised learning approaches on structured and unstructured data.
Train natural language processing models to extract information from notes and scanned documents.
Conduct hyperparameter tuning and feature selection.
Evaluate model performance and create exhibits such as lift charts.
Research new machine learning and statistics algorithms/concepts/frameworks.
Communicate results via presentations, reports, and dashboards to both technical and nontechnical stakeholders.
Assist with peer review of code.
Learn about risk, pricing, and claims handling, and gain an in-depth understanding of property and casualty insurance.
Attend sprint planning and daily standups and present work completed during biweekly sprint reviews.
EMPLOYMENT QUALIFICATIONS:
Be either a senior undergraduate or a graduate student (MS/MA) as of the end of the spring term.
Hold a cumulative grade point average of 3.0 or better as of the most recent grading period.
Be able to work full-time during normal business hours for this summer.
Be available to begin employment between mid-May and mid-June.
EDUCATION OR EQUIVALENT EXPERIENCE: Should hold or be pursuing a bachelor's or master's degree in Data Science, Statistics, Computer Science, Mathematics, Operations Research, Engineering, Physics, Actuarial Science, or similar analytical STEM field.
EXPERIENCE: With the proper education and projects (either personal or scholastic), no prior work experience is necessary.
SKILLS/KNOWLEDGE/ABILITIES (SKA) REQUIRED:
Strong knowledge of statistical modeling and machine learning algorithms, including:
Gradient boosted trees, e.g. XGBoost and LightGBM.
Artificial neural networks, including deep architectures.
Generalized linear models (GLMs), e.g. linear and logistic regression.
Dimensionality reduction, e.g. principal component analysis (PCA).
K-means clustering.
Lasso and Ridge regularization.
Intermediate Python programming skills, including packages such as pandas, matplotlib, NumPy, and scikit-learn.
Working knowledge of SQL and relational databases.
Knowledge of natural language processing concepts, including at least three of the following: TF-IDF, topic modeling, sentiment analysis, word embeddings (e.g. Word2Vec), BERT, GPT.
Some experience with Git and cloud resources such as Azure or AWS, along with good coding habits.
Ability to present information and ideas clearly and concisely, both in writing and orally.
Ability to establish workflows, manage multiple projects, and meet deadlines while maintaining composure under stressful workloads.