Data Scientist

Washington, DC
May 26, 2024


Igor Volkov, PhD 240-***-**** 1238 D St. NE, Washington, DC, 20002 US Citizen


Data Scientist with a comprehensive background in mathematics, physics, and programming, and over 20 years of experience solving complex, data-driven problems. My academic career has been highlighted by the publication of high-impact research in leading scientific journals, including work ranked in the top 1% of the most-cited scientific papers. Having begun my transition from academia to industry, I am keen to further apply my analytical and problem-solving skills in a results-driven environment.


Python; C; SQL; AWS; JavaScript; HTML; Databricks; CUDA; data science; machine learning; natural language processing; data visualization; data mining; high-performance computing; stochastic and Monte Carlo simulations; Markov chains; mathematical modeling; statistical analysis


Data Scientist

2023 - 2024 Whitespace LLC, Washington, DC

Led a project in support of a U.S. Citizenship and Immigration Services (USCIS) initiative to enhance user experience, executing technical lead responsibilities, creating and refining predictive models.

•Identified essential data and developed Python scripts and SQL queries to collect data from Databricks, reducing querying time from about 5 hours to a few minutes. Cleaned and selected high-quality data for training and validating data science models.

•Developed and implemented novel algorithms to assign 1.5 million existing records to 50,000 customers using Python's data science packages.

•Applied machine learning techniques to develop predictive models for ongoing data, raising accuracy to over 99%.

•Executed the role of technical lead, guiding and delegating tasks to the team (developers, data analysts, business analysts), reviewing their outputs, and confirming completion. Set strategic short and long-term goals to align team efforts with project objectives.

•Equipped the team with data science tools and organized workflows for effective verification of model predictions. Implemented an automated process for task comparisons.

•Developed a web application using Python, Flask, HTML, and JavaScript, leveraging AI/ML models, that significantly outperformed existing client services. The application also automated previously manual client tasks, enhancing efficiency and user experience.

•Managed the deployment of the machine learning model and web application into the client's Java-based systems, overseeing technical integration and operational deployment.

•Engaged with stakeholders to collect feedback and showcase product features, which resulted in extremely positive responses and markedly increased client satisfaction.

•Identified data patterns that required clarification/change of existing policies.

•Presented and demonstrated findings and insights at the team level and at regular meetings with USCIS leadership.

Research Scientist

2013 - 2022 The George Washington University, Washington, DC

Led multiple projects on analysis and visualization of astronomical data with a focus on machine learning applications (classification and clustering) to imperfect data, theoretical and computational modeling, statistical inference, and time series analysis and forecasting.

•Performed synthesis and trend analysis of astrophysical data from space telescopes.

•Performed a search and characterization of fast transients in stellar clusters.

•Developed a framework for machine-learning classification of cosmic sources and wrote (using Python, Bokeh, HTML, and JavaScript) a web-based, client-side interactive GUI to visualize the training data and the classification results.

•Analyzed hundreds of Chandra X-ray Observatory images using advanced clustering methods and created an interactive online application to visualize the results.

•Developed analytical methods and software enabling fast searches of periodicity in long and sparse time series.

•Supervised and mentored three graduate and three undergraduate students.

•Communicated results to diverse audiences as part of a GWU public outreach initiative.

•Authored and co-authored 12 refereed articles in The Astrophysical Journal and multiple other publications (e.g., in Research Notes of the American Astronomical Society).

Postdoctoral Researcher

2012 - 2017 The University of Maryland, College Park, MD

2005 - 2010 The Pennsylvania State University, University Park, PA

•Modeled ecological systems using methods of statistical physics and thermodynamics.

•Designed and executed large-scale Monte Carlo and molecular dynamics computer simulations, alongside Bayesian inference to assess model accuracy.

•Used data from forests across the globe to develop an analytic and statistical theory of metabolic scaling; performed spatiotemporal analysis of the data.

•Worked on interdisciplinary projects in physics, ecology, and biology developing computational models for explanation and prediction of data patterns in collaboration with the National Institutes of Health and the Smithsonian Institution.

•Developed statistical models for evolution of the influenza virus and its interaction using genomic and surveillance data.

•Analyzed nucleotide sequences of flu virus and microarray data obtained from cell culture.

•Developed a pipeline to simulate an epidemiological model of viral and immune co-evolution.

•Modeled biodiversity in communities of coral reefs and soil bacteria using multiple datasets from various sources.

•Published extensively, including a review article in Reviews of Modern Physics, articles in Nature, Proceedings of the National Academy of Sciences and other peer-reviewed papers.


PhD, Physics

2005 The Pennsylvania State University, University Park, PA

Thesis: “Statistical physics and Ecology”

BSc, Theoretical Physics

1996 Belarusian State University, Minsk, Belarus


Co-authored over 50 articles, accumulating more than 2500 citations, including lead authorship of several high-impact studies published in leading scientific journals such as Nature, Science, Proceedings of the National Academy of Sciences, and Reviews of Modern Physics.

•Y Lin, et al., Multiwavelength catalog of 10,000 4XMM-DR13 sources with known classifications. Research Notes of the AAS (2024, in press).

•H Yang, et al., Classifying unidentified X-ray sources in the Chandra source catalog using a multiwavelength machine learning approach. The Astrophysical Journal 941, 104 (2022).

•I Volkov, et al., NuSTAR observation of LS 5039. The Astrophysical Journal 915, 61 (2021).

•O Kargaltsev, I Volkov, Automated search for extended sources in archival Chandra X-ray Observatory data. AAS/High Energy Astrophysics Division 17, 112 (2019).

•S Azaele, et al., Statistical mechanics of ecological systems: Neutral theory and beyond. Reviews of Modern Physics 88, 035003 (2016).

•I Volkov, et al., Synthesizing within-host and population-level selective pressures on viral populations: the impact of adaptive immunity on viral immune escape. Journal of The Royal Society Interface 7, 1311 (2010).

•I Volkov, et al., Inferring species interactions in tropical forests. Proceedings of the National Academy of Sciences 106, 13854 (2009).

•I Volkov, et al., Patterns of relative species abundance in rainforests and coral reefs. Nature 450, 45 (2007).

•I Volkov, et al., A novel ensemble in statistical physics. Journal of Statistical Physics 123, 167 (2006).

•M Nelson, et al., Stochastic processes are key determinants of short-term evolution in influenza A virus. PLoS Pathogens 2, 125 (2006).

•I Volkov, et al., Comment on “Computational improvements reveal great bacterial diversity and high metal toxicity in soil”. Science 313, 918 (2006).

•I Volkov, et al., Density dependence explains tree species abundance and diversity in tropical forests. Nature 438, 658 (2005).

•I Volkov, et al., The stability of forest biodiversity. Nature 427, 696 (2004).

•I Volkov, et al., Neutral theory and relative species abundance in ecology. Nature 424, 1035 (2003).

•I Volkov, et al., Molecular dynamics simulations of crystallization of hard spheres. Physical Review E 66, 061401 (2002).


•CIDview – Interactive visualization of multidimensional data.

•ChaSES – Search for extended sources in Chandra X-ray Observatory ACIS images.

•MUWCLASS – Multiwavelength Machine Learning Classification pipeline and the classification results of the Chandra Source Catalog v2.

•XDBS – Catalog of X-ray Detected Stars.
