Lead Data Engineer – Python / SQL Expert - Web3/Infrastructure
As the Lead Data Engineer for this infrastructure project, you will play a pivotal role in developing and maintaining distributed systems, data pipelines, and APIs for Rated. You will focus primarily on building the indexing, database, and processing systems that will enable the project to expand to multiple blockchain networks. This role is critical in shaping the project's technology infrastructure. The ideal candidate will have experience in crypto data indexing, be mission-driven and intellectually curious, thrive in early-stage environments, and be an avid open-source contributor.
Explorer Hub in the Blockchain Ecosystem
This project includes an explorer hub, a valuable resource for the blockchain community, with a specific focus on networks that use the Proof of Stake (PoS) consensus mechanism. Participants such as validator operators, relay operators, developers, researchers, and wallet providers rely on the explorer's insights to navigate current challenges and anticipate future possibilities. The explorer's documentation section provides in-depth background information on its components, along with clear definitions of the variables and methodologies behind them.
Responsibilities of a Lead Data Engineer
- Design and Develop Distributed Systems: Lead the design and development of scalable and fault-tolerant distributed systems tailored for blockchain data indexing and processing. Ensure these systems are capable of handling large volumes of data efficiently.
- Data Pipeline Development: Architect, build, and maintain robust data pipelines to collect, process, and store blockchain data from multiple networks. Implement mechanisms for data quality assurance and monitoring to ensure accuracy and reliability.
- API Development: Design and implement APIs to provide access to the indexed blockchain data, catering to the needs of various stakeholders within the blockchain ecosystem. Ensure APIs are well-documented, efficient, and secure.
Requirements for a Lead Data Engineer
- Expertise in SQL and Python: Demonstrate mastery of SQL for efficient data querying and manipulation, and of Python for developing robust data processing solutions and automation scripts.
- Proficiency in Real-Time Data Processing: Possess strong skills in building real-time data pipelines and indexers using technologies such as Flink, Bytewax, Apache Beam, Spark Streaming, or similar frameworks. Experience in handling streaming data efficiently is essential for this role.
- In-Depth Knowledge of Database Technologies: Have a deep understanding of columnar and/or time series databases such as TimescaleDB, ClickHouse, StarRocks, or similar solutions. Proficiency in designing, optimizing, and querying these databases is necessary to ensure the efficient storage and retrieval of blockchain data.
- Comprehensive Understanding of Software Development Lifecycles: Exhibit a thorough understanding of software development lifecycles, from conducting code reviews to implementing continuous integration and continuous deployment (CI/CD) pipelines. Experience in ensuring code quality, scalability, and maintainability throughout the development process is crucial.
- Familiarity with Infrastructure as Code (IaC) Systems: Possess some experience with IaC tools such as Ansible, Terraform, or Pulumi. Understanding how to automate infrastructure provisioning and management through code is beneficial for maintaining and scaling the project's infrastructure efficiently.