Promoting precision medicine using data science of large datasets

An interview with Dr. Rajat Mukherjee conducted by Alina Shrourou, BSc

Please give an overview of what exactly data science is, and why it’s important to promote precision medicine.

I feel that data science is a marriage between statistical science and informatics, using statistical principles of math and logic on huge volumes of data.

You have to rely on informatics to store, read, and then apply these complex statistical algorithms to make sense of huge volumes of information.

A good use of data science can lead to major breakthroughs in medical research in areas like diagnostics, precision medicine, and real-world evidence. By taking large datasets, we can study whether different approaches have markedly different effects in different patient populations.

What are the benefits of big data analysis compared to traditional collection and analysis?

That’s a good question. In traditional collection and analysis, or randomized clinical trials, the data are often collected in very controlled environments. Things like environmental factors may be very well controlled. On the other hand, factors like genomics may not be accounted for.

Nowadays, real-world data studies or even randomized clinical trials are being designed differently to accommodate the variability and heterogeneity that environmental factors and genetic factors can bring. Data science provides a platform for systematically studying how environmental and genetic factors interact and how they influence therapeutic effects.

Please outline how you and your research team use data to inform diagnostics. What other biomedical applications of data science are there?

We study biomedical signals and images to develop statistical classifiers that can be used for diagnostics. We also work in the area of precision medicine, researching genetics, related areas, and biomarkers that can help enrich populations for targeted therapeutics. This is becoming more and more popular and important, with applications in oncology, rare diseases, and difficult-to-study diseases like Alzheimer’s and Parkinson’s disease.

Another area where data science is useful is in monitoring the patient population for both minor and harmful side effects of therapeutics that are on the market, as many side effects may only become known in the long term, which can be hard to capture in short-term clinical trials.

How can biomedical signals be used as a source of data for diagnosis? Please describe how signal data is processed and transformed to help with data analytics.

Biomedical signals are high-dimensional data, and they need a lot of what we call pre-processing. Essentially, it is the process of filtering out the noise and extracting the valuable information from these signals.

The next step would be feature extraction. We use these methods because you cannot use each and every component of the high-dimensional data. You must extract features from these signals and images that are informative of disease status.

Next follows feature selection, i.e., selecting the extracted features or combinations of features that have the highest association with disease status. A diagnostic classifier can then be developed and validated using an independent test or validation set. The validation set must be independent of the data set used to develop the classifier. In general, diagnostic development and validation using signals and images are done in two separate clinical trials. However, our team works on different seamless options that may lead to much more efficient but still statistically valid designs.
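The steps described above (pre-processing, feature extraction, feature selection, and validation on an independent set) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not the team's actual method: the synthetic signals, the moving-average filter, the three summary features, the separation score, and the one-feature threshold classifier are all hypothetical stand-ins.

```python
import math
import random
import statistics

def moving_average(signal, window=5):
    """Pre-processing: smooth out high-frequency noise with a moving average."""
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        chunk = signal[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

def extract_features(signal):
    """Feature extraction: reduce a long signal to a few summary numbers."""
    return {
        "mean": statistics.fmean(signal),
        "std": statistics.pstdev(signal),
        "range": max(signal) - min(signal),
    }

def separation_score(rows, labels, name):
    """Feature selection score: standardized gap between the class means."""
    pos = [r[name] for r, y in zip(rows, labels) if y == 1]
    neg = [r[name] for r, y in zip(rows, labels) if y == 0]
    spread = statistics.pstdev(pos + neg) or 1.0
    return abs(statistics.fmean(pos) - statistics.fmean(neg)) / spread

def fit_threshold_classifier(rows, labels, name):
    """Classifier: cut halfway between the two class means of one feature."""
    pos_mean = statistics.fmean([r[name] for r, y in zip(rows, labels) if y == 1])
    neg_mean = statistics.fmean([r[name] for r, y in zip(rows, labels) if y == 0])
    cut = (pos_mean + neg_mean) / 2
    positive_is_high = pos_mean > neg_mean

    def predict(row):
        return int((row[name] > cut) == positive_is_high)

    return predict

def simulate_signal(rng, diseased, length=200):
    """Synthetic data (assumption): disease raises the oscillation amplitude."""
    amplitude = 1.5 if diseased else 1.0
    return [
        amplitude * math.sin(2 * math.pi * t / 50) + rng.gauss(0, 0.5)
        for t in range(length)
    ]

rng = random.Random(0)
labels = [i % 2 for i in range(120)]
rows = [extract_features(moving_average(simulate_signal(rng, y))) for y in labels]

# Keep the validation set independent of the development set.
dev_rows, dev_labels = rows[:80], labels[:80]
val_rows, val_labels = rows[80:], labels[80:]

# Select the feature using development data only, then fit and validate.
best = max(dev_rows[0], key=lambda name: separation_score(dev_rows, dev_labels, name))
predict = fit_threshold_classifier(dev_rows, dev_labels, best)
accuracy = sum(predict(r) == y for r, y in zip(val_rows, val_labels)) / len(val_rows)
print(f"selected feature: {best}, validation accuracy: {accuracy:.2f}")
```

The point of the split is that feature selection and the classifier cut-off are fixed on the development data before the validation signals are touched, which mirrors the independence requirement stated above.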

We have been involved in a few diagnostics projects where we have taken biomedical signals and images and transformed them into classifiers. In one of the biggest projects, we now have a classifier and the pivotal validation studies are ongoing. The design of the pivotal validation trial is an operationally seamless, threshold-optimization, group-sequential adaptive design, which has been accepted by the FDA’s Center for Devices and Radiological Health (CDRH).

How can data science be used to inform biomarker identification and selection? What work is Cytel doing in this area?

Biomarkers are another interesting example, as they play a key role in providing precision medicine strategies. Biomarkers can be diagnostic, prognostic or predictive. Predictive biomarkers help in enrichment strategies for therapies that may only work for a particular sub-population, which may be classified as the biomarker-positive sub-population.

Biomarker development relies heavily on data science techniques such as filtering and reduction of data, and on machine learning techniques to classify patients as biomarker-positive or biomarker-negative.
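To see why enrichment matters, consider a small worked example in Python. The patient counts, the biomarker levels, and the cut-off `CUTOFF` are all hypothetical, and a fixed cut-off stands in for whatever classifier would be used in practice. The sketch compares the treatment effect in the whole population with the effect in the biomarker-positive and biomarker-negative sub-populations.

```python
def response_rate(patients, arm):
    """Share of responders among the patients in the given arm."""
    group = [p for p in patients if p["arm"] == arm]
    return sum(p["responded"] for p in group) / len(group)

def treatment_effect(patients):
    """Difference in response rate: treated minus control."""
    return response_rate(patients, "treated") - response_rate(patients, "control")

def make_patients(level, arm, responders, total):
    """Build hypothetical patient records from summary counts."""
    return [
        {"level": level, "arm": arm, "responded": i < responders}
        for i in range(total)
    ]

CUTOFF = 5.0  # hypothetical biomarker cut-off

patients = (
    make_patients(8.0, "treated", 30, 50)    # biomarker-positive, treated
    + make_patients(8.0, "control", 10, 50)  # biomarker-positive, control
    + make_patients(2.0, "treated", 10, 50)  # biomarker-negative, treated
    + make_patients(2.0, "control", 10, 50)  # biomarker-negative, control
)

positive = [p for p in patients if p["level"] >= CUTOFF]
negative = [p for p in patients if p["level"] < CUTOFF]

overall_effect = treatment_effect(patients)
positive_effect = treatment_effect(positive)
negative_effect = treatment_effect(negative)

print(f"overall: {overall_effect:.2f}")
print(f"biomarker-positive: {positive_effect:.2f}")
print(f"biomarker-negative: {negative_effect:.2f}")
```

With these made-up counts, the effect in the whole population is diluted by the biomarker-negative patients, who do not benefit, so restricting the study to biomarker-positive patients shows a larger effect.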

Please can you also describe data mining work that you are conducting, and how it can inform decision making?

We have yet to use data mining on big data, but we will in the future. However, we have used data mining to support go/no-go decisions. For example, if there have been multiple early-phase studies on a particular area or therapeutic, we can pool all of this early-phase data and mine it to reach a go/no-go decision, either to advance the pipeline or to bring it to an end.

Another new area of interest for us where we have used data mining techniques is pharmacovigilance, where large amounts of post-marketing data are used to generate signals for adverse events.
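The interview does not say which signal-generation method is used. One widely used disproportionality measure is the proportional reporting ratio (PRR), sketched below in Python as an illustration only. The report counts are hypothetical, and the flagging rule (at least three cases and a 95% confidence interval lying above 1) is one common convention chosen here as an assumption.

```python
import math

def proportional_reporting_ratio(a, b, c, d):
    """PRR with an approximate 95% confidence interval.

    a: reports of the event of interest for the drug of interest
    b: reports of all other events for the drug of interest
    c: reports of the event of interest for all other drugs
    d: reports of all other events for all other drugs
    """
    prr = (a / (a + b)) / (c / (c + d))
    # Standard error of log(PRR), as for a ratio of two proportions.
    se_log = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = math.exp(math.log(prr) - 1.96 * se_log)
    upper = math.exp(math.log(prr) + 1.96 * se_log)
    return prr, lower, upper

def is_signal(a, b, c, d, min_cases=3):
    """Illustrative flagging rule (assumption): enough cases and CI above 1."""
    _, lower, _ = proportional_reporting_ratio(a, b, c, d)
    return a >= min_cases and lower > 1.0

# Hypothetical post-marketing report counts.
prr, lower, upper = proportional_reporting_ratio(20, 980, 100, 19900)
print(f"PRR = {prr:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
print("flagged:", is_signal(20, 980, 100, 19900))
print("flagged:", is_signal(5, 995, 100, 19900))
```

A flagged drug-event pair is only a hypothesis for further review. Spontaneous reports carry reporting biases, so a high PRR does not by itself establish that the drug causes the event.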

How important are Bayesian models within data science?

As a statistician, when I hear about real-world data, I automatically think of Bayesian methods, which are ideally suited to accumulating data and to updating the information of interest as new data arrive.
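A minimal Python sketch of this kind of updating is the textbook Beta-Binomial model for a response rate. The uniform Beta(1, 1) prior and the batch counts below are hypothetical, chosen only to show how each new batch of data revises the estimate.

```python
def update_beta(alpha, beta, responders, patients):
    """Conjugate update: add responders to alpha and non-responders to beta."""
    return alpha + responders, beta + (patients - responders)

def posterior_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)

# Start from a uniform Beta(1, 1) prior on the response rate (assumption).
alpha, beta = 1.0, 1.0

# Hypothetical accumulating data: (responders, patients) per batch.
batches = [(3, 10), (6, 10), (5, 10)]

history = []
for responders, patients in batches:
    # The posterior after one batch becomes the prior for the next.
    alpha, beta = update_beta(alpha, beta, responders, patients)
    history.append(posterior_mean(alpha, beta))
    print(f"after {responders}/{patients}: posterior mean = {history[-1]:.3f}")
```

Because the posterior from one batch serves as the prior for the next, the result after all batches is the same as if the pooled data had been analyzed at once, which is what makes the approach convenient for data that arrive over time.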

Bayesian methods can also be used for automatic feature extraction and selection. These areas suffer from the absence of a uniform methodology, and Bayesian methods can fill that gap.

Do you think data science will change the way we manage large volumes of data? What does the future hold for data science and the healthcare and drug discovery industry?

Managing large volumes of data is part of data science, so yes, data science has an integral role to play in the way we manage large volumes of data. Data science will mean having specialized people to take care of big data in the right way, so that it can be applied in real time. Data science is going to change the nature of how big volumes of data are to be stored, accessed and applied.

I think incorporating data science in a drug-development team opens doors to having effective multidisciplinary teams attacking a common problem from different directions.  

Where can readers find more information?

About Rajat Mukherjee

Rajat Mukherjee has 15 years of professional experience as an industry and academic statistician, and brings a range of expert knowledge to Cytel’s customers. This includes work in pattern recognition problems for devices and biomarker discovery, Bayesian clinical trials, adaptive designs, and design and analysis of complex epidemiological studies.

His experience and expertise also include statistical computing, survival analysis, longitudinal analysis, nonparametric and semiparametric inference, as well as statistical classification and high-dimensional data. Rajat has a strong background and interest in the development and application of statistical methodology to real-life medical problems.
