Founding Data Scientist

This is a unique opportunity to be the first full-time hire at a well-funded, Y Combinator-backed startup. The candidate will work side by side with the CTO to design and build the early major releases of the company’s products and help set the technical direction and culture of the company.

About the company

HiGeorge helps companies, no matter how big or small, better leverage the world’s data to create business value. We do this by providing a no-code service where businesses can access the world’s public data and visualize it. Think Tableau with all the world’s public data already attached.

Today, HiGeorge enables media companies like the Chicago Tribune to easily create best-in-class data visualizations for their readers at a fraction of the cost and time of an in-house team. We do this with our proprietary data pipeline engine and front-end libraries, which allow us to configure new auto-updating data feeds without writing new code.

HiGeorge launched 9 months ago and has grown 4x since December. We are backed by a mix of Silicon Valley and media institutions such as Y Combinator, Bertelsmann Digital Media Investments (BDMI) and Garage Technology Ventures. Our ambition is to make data accessible to everyone and build the next multi-billion dollar tech company along the way.

Job Description

We are seeking an experienced, talented Data Scientist to join the product and engineering team at HiGeorge in Los Angeles or remotely from anywhere in the world. You’ll bring your skills and expertise to design schemas, build ML models, and develop and deploy ETL pipelines that make external data accessible to many businesses.

Responsibilities

Independently design, build and launch new ETL pipelines in production
Collaborate on improving the company's data pipeline engine
Design schema for master records from multiple external sources
Design and build data integrity and quality controls and processes
Design and build ML models to classify incoming data
Mentor other team members

Requirements

Previous experience working with large datasets
Advanced knowledge of SQL and query optimization
Experience with dimensional data modeling & schema design
Experience with ETL design, implementation and maintenance
Extensive experience with Python and pandas