Founding Data Scientist
This is a unique opportunity to be the first full-time hire at a well-funded, Y Combinator-backed startup. You will work side by side with the CTO to design and build the early major releases of the company’s products and help set the company’s technical direction and culture.
About the company
HiGeorge helps companies, no matter how big or small, better leverage the world’s data to create business value. We do this by providing a no-code service where businesses can access the world’s public data and visualize it. Think Tableau with all the world’s public data already attached.
Today, HiGeorge enables media companies like the Chicago Tribune to easily create best-in-class data visualizations for their readers at a fraction of the cost and time of an in-house team. We do this with our proprietary data pipeline engine and front-end libraries, which allow us to configure new auto-updating data feeds without writing new code.
HiGeorge launched 9 months ago and has grown 4x since December. We are backed by a mix of Silicon Valley and media institutions such as Y Combinator, Bertelsmann Digital Media Investments (BDMI) and Garage Technology Ventures. Our ambition is to make data accessible to everyone and build the next multibillion-dollar tech company along the way.
We are seeking an experienced, talented Data Scientist to join the product and engineering team at HiGeorge, either in Los Angeles or remotely from anywhere in the world. You’ll bring your skills and expertise to designing schemas, building ML models, and developing and deploying ETL pipelines that make external data accessible to many businesses.
Responsibilities
- Independently design, build and launch new ETL pipelines in production
- Collaborate on improving the company's data pipeline engine
- Design schemas for master records built from multiple external sources
- Design and build data integrity and quality controls and processes
- Design and build ML models to classify incoming data
- Mentor other team members
Requirements
- Previous experience working with large datasets
- Advanced knowledge of SQL and query optimization
- Experience with dimensional data modeling & schema design
- Experience with ETL design, implementation and maintenance
- Extensive experience with Python and pandas