Data Scientist
DESCRIPTION
At PDF Solutions, we’re transforming the semiconductor and electronics manufacturing industry with our AI platform, which improves yield and lowers manufacturing costs at some of the largest chip makers in the world. Not just machine learning, but AI. We’re seeking an experienced Data Scientist to join our team and develop the model pipelines that enable and drive production of the world’s most advanced chips. We look for people who are self-motivated and passionate about the transformative potential of AI. You’ll be able to hone your skills while working side by side with industry experts who have decades of experience. The candidate must be an organized, highly motivated team player with strong initiative and communication skills and the drive to deliver quality results on time in a complex, intensive, and highly productive environment.
RESPONSIBILITIES
●     Help design, implement, and validate ML pipelines in collaboration with other data scientists.
●     Coordinate and collaborate with other Software Development groups so that the ML pipeline fits well with the rest of PDF Solutions’ software applications.
●     Balance adding new features with the need for stability and performance.
●     Grow development capabilities to align with the pace of business needs.
QUALIFICATIONS AND SKILLS
●     Master's degree or higher in Computer Science, Computer Engineering, Electrical Engineering, or a similar discipline, with industry experience in software development
●     3+ years of experience with Python coding
●     3+ years of recent experience working as a Data Scientist in industry
●     Experience with developing production-grade code, preferably in Python
●     Experience with data science and machine learning, including Python libraries such as NumPy, SciPy, Pandas and Scikit-learn
●     Strong professional written and verbal communication skills
●     Ability to pass a Data Science skills-based test
●     Experience with relational or NoSQL databases such as Oracle, Cassandra, Redis, or similar
●     Ability to create model-ready data from raw data, at scale
●     Ability to translate business problems into data science pipelines
●     Sufficient comfort with ML theory to recommend solutions beyond the standard libraries
●     Ability to work independently and as part of a diverse, interdisciplinary, international team
●     Ability to communicate clearly with technical and non-technical audiences
●     Empathy with customer business challenges
●     Ability to map business problems to software and data science techniques
●     Understanding of the fundamental data science and machine learning pipeline, including data cleansing, feature engineering, imputation, model tuning, and model prediction
●     Basic understanding of the pros and cons of different machine learning algorithms and of the different types of open-source ML frameworks
●     Understanding of hypervisors/containers, especially Docker
DESIRED SKILLS
●     Proficiency in both agile and waterfall development methodologies
●     Familiarity with Spark and TIBCO Spotfire
●     Knowledge of or experience with backend (and ideally frontend) development, frameworks, and libraries
●     Experience with Cassandra or another NoSQL database is a plus