1687 - Data Scientist (Scientific Data Analyst)

US Citizenship


  • Perform data management, advanced data quality control, and analysis of public-health-related and electronic medical record (EMR) data (e.g., large relational databases) using R or Python
  • Develop advanced data analysis and data visualization tools and applications in R or Python
  • Prepare technical reports and routine data analysis reports on large-scale data sets using R or Python
  • Work in a Linux-based high-performance computing (HPC) environment
  • Identify data gaps and inconsistencies (i.e., data validation and missingness patterns)
  • Assist in interpreting data analysis findings and offer solutions for issues identified
  • Communicate and collaborate with other scientists on aspects of study analysis and interpretation
  • May apply statistical and machine learning methods


  • Master’s Degree or 3 years of work experience in Statistics, Biostatistics, Engineering, Computer Science, Mathematics, or similar field
  • 3 years of experience with data management, statistical analysis and programming in R and/or Python
  • 3 years of experience with quantitative analysis and data interpretation
  • Experience working in a Linux-based high-performance computing (HPC) or cloud computing environment
  • Experience working with large real-world databases and identifying data gaps and inconsistencies (i.e., data validation and missingness patterns)
  • Experience with cleaning data for analysis (i.e., data munging)
  • Excellent organizational skills, commitment to generating accurate data, ability to meet short deadlines, and demonstrated experience in multi-tasking
  • Adherence to reproducible research practices
  • Strong interpersonal and teamwork skills
  • Strong oral and written communication skills; experience with both written and oral presentation of data and the ability to interact with senior research staff and external stakeholders


  • 1 year of experience using Python to develop machine learning models and algorithms, including the following Python libraries: NumPy, SciPy, scikit-learn, Theano, TensorFlow, Keras, PyTorch, pandas, and Matplotlib
  • 1 year of advanced programming in R and experience using machine learning packages (e.g., caret, randomForest, nnet, neuralnet, e1071, hiernet, tree, xgboost, SMOTE)
  • Experience developing user-friendly interactive tools (e.g., R Shiny applications)