
Bioinformatics Data Scientist



Role: Bioinformatics Data Scientist
Location: Remote
Duration: Full Time

Position Summary

We are actively seeking a Bioinformatics Data Scientist to join our team. The successful candidate will be responsible for pre-processing clinical and genomic data and, more importantly, for whole-genome sequencing data workflows. The candidate may work on secondary genomics analysis tasks, including read alignment and variant detection, and will work on tertiary genomics analysis tasks, including annotation, filtering, prioritization, variant classification, and case interpretation, followed by variant confirmation, segregation analysis, and reporting. An ideal candidate has knowledge of and experience in polygenic/genomic risk score analysis, especially in risk prediction and stratification for cancer patients.

The candidate may also work on different types of machine-learning-based predictions, such as survival risk prediction and therapy matching. The candidate will be responsible for designing all required machine learning algorithms, implementing them in Python or R, and supporting the API developers.

Qualifications

• PhD or Master's degree, preferably in bioinformatics, or in computer science, statistics, or a similar engineering discipline, with experience in clinical and genomic research related to oncology, particularly polygenic/genomic risk analysis for cancer patients

Duties and Responsibilities

• Discuss assigned use cases with teammates and carry out research as needed to identify the factors and requirements influencing each case
• Understand and pre-process clinical/genomic data, and apply genomic data quality control such as standard GWAS QC
• Select proper ML models and evaluate their performance on training, validation, and test datasets, and tune the models accordingly
• Collaborate with various internal teammates including data scientists, software engineers, API developers, and project leaders
• Communicate effectively, in writing and verbally, with project team members

Knowledge

• Deep knowledge in polygenic/genomic risk analysis and stratification
• Deep knowledge in theories of machine learning and deep learning
• Knowledgeable in microbiology and genomics, genome sequencing, and oncology
• Familiar with genome sequencing databases, repositories, and catalogues, such as GWAS resources

Experience

• At least 4 years of team project experience for applicants with a master's degree, or at least 2 years for applicants with a PhD
• Practical experience of applying machine learning methodology across multiple domains and with a variety of business objectives
• Practical experience in data analysis and ML implementation in clinical and genomic studies, especially polygenic/genomic risk analysis
• Strong record of self-management and teamwork, demonstrated with previous employers

Software/Skillset

• Skilled with PLINK, R libraries such as PRSice and LDpred, or Python libraries such as Hail, as used for polygenic risk score analysis
• Proficiency in Microsoft Office (Word, PowerPoint, Excel)
• Familiarity with Azure and Google Cloud services
• Fluent in spoken and written English
• Organized, self-motivated, independent, flexible and able to work with minimum supervision