
Data Engineer - ETL, Python, Spark, Hadoop

InfoGravity LLC.
1 month ago

Job Details

Data Engineer - ETL with Python, Spark & Hadoop

Location: Pittsburgh only (Hybrid 2-3 days a week).

Work authorization: Citizens or L2S

Responsibilities:

  • Translate business needs into ETL/ELT logical models and design data structures for the flexibility to support scalable business solutions.
  • Design and implement data pipelines using Spark and Python.
  • Define and deliver reusable components for the ETL/ELT framework.
  • Define optimal data flow for system integration and data migration.
  • Integrate new data management technologies and software engineering tools into existing structures.
  • Design, build, and maintain CI/CD pipelines in multiple integration and test environments.
  • Install, configure, and manage automated testing tools in the environment.

Qualifications:

  • Experienced in the design, development, and implementation of large-scale projects in the financial industry using data warehousing ETL tools (Spark)
  • Experience creating ETL transformations and jobs using PySpark and automating workflows with orchestration tools such as Airflow and Control-M
  • Strong knowledge and experience in SQL, Python, and Spark
  • Experience with Big Data/distributed frameworks such as Spark, Kubernetes, Hadoop, and Hive
  • Ability to design ETL/ELT solutions based on user reporting and archival requirements
  • Strong sense of customer service to consistently and effectively address client needs
  • Self-motivated; comfortable working independently under general direction
  • Hands-on experience in building and managing CI/CD pipelines
  • Basic knowledge of Azure cloud components
