Lead Python Data Engineer
Kloudhunt LLC
7 months 3 weeks ago
Job Details
Job Description
- We are currently seeking an experienced Python Data Engineer to join the Big Data and Advanced Analytics department.
- As part of the Data Engineering team, the Lead Python Data Engineer will work closely with Business domain experts and Data Scientists to solve real-world oil and gas midstream problems using advanced analytics, machine learning, and artificial intelligence.
- This individual will provide analytical and technical leadership to the team to advance the data engineering practice within the organization.
Responsibilities include:
- Work directly with Business domain experts and Data Scientists to develop high-quality, reliable, scalable machine learning systems.
- Design and implement frameworks and tools to streamline the machine learning process.
- Automate manual data collection and processing tasks to improve efficiency.
- Leverage software architecture and design patterns to develop fault-tolerant microservices.
- Convert research-based machine learning models into production-ready software.
- Implement processes to ensure coding standards, code quality, documentation, and test coverage.
Qualifications
- 7+ years of programming experience in Python
- Expertise in developing and maintaining data pipelines.
- Experience in testing, packaging, and deploying machine learning models.
- Experience in software engineering practices such as Design Principles and Patterns, Unit Testing, Refactoring, CI/CD, and version control.
- Expertise in Object-Oriented Design Principles and Functional Programming Principles
- Experience with common Python Data Engineering packages including pandas, NumPy, PyArrow, pytest, scikit-learn, and Boto3
- Experience in storage technologies including SQL relational databases and Object Storage such as AWS S3
- Experience in implementing distributed computing systems.
- Experience in designing modular, reusable software components.
- Experience in developing API endpoints and microservices.
- Knowledge of MLOps Principles
- Knowledge of ML platform technologies including Apache Airflow, Kubernetes, Dask, Ray, and MLflow