Python Data Acquisition Engineer

Allen Institute for AI (AI2)
8 months ago

This is a contract role for ~3-9 months.

Who You Are:

  • You are a software engineer who is comfortable writing data crawlers and processing pipelines in Python.

Who We Are:

  • We’re a cross-functional team of researchers and engineers working together to build and release the next generation of our 3-trillion-token Dolma dataset: https://huggingface.co/datasets/allenai/dolma

Your Next Challenge:

  • The essential functions include, but are not limited to, the following:
  • Develop efficient and scalable data crawlers
  • Implement extraction & processing services that clean and transform data after acquisition
  • Write unit & integration tests as appropriate

What You’ll Need:

  • Experience writing Python code
  • Knowledge of libraries such as Scrapy or BeautifulSoup
  • Experience deploying services and infrastructure in AWS

Physical Demands and Work Environment

  • The physical demands described here are representative of those that must be met by a team member to successfully perform the essential functions of this position.
  • Reasonable accommodations may be made to enable individuals with disabilities to perform the functions.
  • Must be able to remain in a stationary position for long periods of time.
  • The ability to communicate information and ideas so others will understand.
  • Must be able to exchange accurate information in these situations.
  • The ability to observe details at close range.
  • Can work under deadlines.

A Little More About AI2

  • The Allen Institute for Artificial Intelligence is a non-profit research institute in Seattle founded by Paul Allen.
  • The core mission of AI2 is to contribute to humanity through high-impact research in artificial intelligence.

In addition to AI2’s core mission, we also aim to:

  • Contribute to humanity through our treatment of each member of the AI2 Team.

Highlights include:

  • We are a learning organization.
  • We value diversity.
  • We value inclusion.
  • We emphasize a healthy work/life balance.
  • We are collaborative and transparent.
  • We are in Seattle.

We are proud to be an Equal Opportunity employer.

  • We do not discriminate based upon various characteristics.

This employer participates in E-Verify and will provide the federal government with your Form I-9 information to confirm that you are authorized to work in the U.S. If E-Verify cannot confirm that you are authorized to work, this employer is required to give you written instructions and an opportunity to contact the Department of Homeland Security (DHS) or Social Security Administration (SSA) so you can begin to resolve the issue before the employer can take any action against you, including terminating your employment. Employers can only use E-Verify once you have accepted a job offer and completed the Form I-9.

We are committed to providing reasonable accommodations to employees and applicants with disabilities to the full extent required by the Americans with Disabilities Act (ADA). If you feel you need a reasonable accommodation pursuant to the ADA, you are encouraged to contact us at [email protected].
