Special Issue: VOL.-11, Issue-1, January 2025
1. A Data-Driven Analysis for Predicting Polycystic Ovary Syndrome (PCOS) Using Clinical and Hormonal Indicators
Authors: Gottimukkula Vinayasri
Keywords: Polycystic Ovary Syndrome (PCOS) Prediction, Clinical and Hormonal Feature Analysis, Machine Learning in Healthcare, Antral Follicle Count and Androgen Levels, Predictive Modeling for Early Diagnosis
Page No: 01-05
Abstract
Polycystic Ovary Syndrome (PCOS) is a prevalent endocrine disorder among women of reproductive age, characterized by hormonal imbalance and irregular menstrual cycles. Early diagnosis is crucial to prevent long-term complications such as infertility, diabetes, and cardiovascular issues. This study analyzes a clinical dataset of 1,000 women to develop predictive models for PCOS based on features such as BMI, testosterone levels, menstrual irregularity, and antral follicle count. Machine learning classifiers are implemented to identify the most influential predictors. Results indicate that antral follicle count and testosterone levels are the most critical features, while the models achieve over 90% classification accuracy, supporting the viability of automated diagnostic tools in clinical practice.
2. An Exploratory Analysis of Sleep Health and Lifestyle Factors
Authors: Kalavakunta Jeevan Kumar
Keywords: Sleep Health Analysis, Lifestyle and Stress Factors, Regression Modeling, Unsupervised Clustering, Public Health Data Analytics
Page No: 06-08
Abstract
Sleep is a crucial determinant of overall health and well-being. This paper investigates the associations between sleep disorders and lifestyle indicators such as stress levels, physical activity, BMI, and occupation using a dataset of 374 individuals. We employ descriptive statistics, regression modeling, and unsupervised clustering to uncover patterns affecting sleep health. Our analysis reveals significant correlations between stress levels and sleep quality, while physical activity and occupation also appear to influence sleep outcomes. The study underscores the multifactorial nature of sleep health and highlights the importance of holistic lifestyle interventions.
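As a minimal sketch of the kind of correlation check described here, the snippet below computes a Pearson coefficient between a stress score and a sleep-quality score. The function name `pearson_r` and the sample values are illustrative assumptions, not the study's code or data.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Hypothetical per-person scores (placeholders, not the 374-person dataset).
stress_level = [3, 5, 6, 7, 8]
sleep_quality = [9, 8, 6, 6, 4]
r = pearson_r(stress_level, sleep_quality)  # negative if higher stress goes with poorer sleep
```

A coefficient near -1 or +1 indicates a strong linear association; it does not by itself establish that one factor causes the other.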
3. Analyzing Global Sugar Consumption Patterns and Health Implications Using Machine Learning Techniques
Authors: Sivagiri Jeeva
Keywords: Sugar Consumption Trends, Public Health Analytics, Machine Learning Applications, Obesity and Diabetes Correlation, Data-Driven Policy Analysis
Page No: 09-12
Abstract
The global increase in sugar consumption has profound health and economic implications, including rising rates of diabetes and obesity. This study explores a comprehensive dataset containing sugar consumption indicators across countries from 1960 to 2020, examining correlations with population health metrics and economic indicators. We leverage Python-based analytics to identify trends, regional disparities, and the influence of government interventions like taxation and education. The findings reveal critical associations between processed food consumption, average sugar intake, and public health risks, supporting data-driven policy recommendations for mitigating sugar-related health challenges globally.
4. Data-Driven Analysis of Thyroid Cancer Risk Using Clinical and Demographic Indicators
Authors: Pagadala Lavanya
Keywords: Thyroid Cancer Risk Prediction, Clinical and Demographic Indicators, Thyroid Hormone Analysis (TSH, T3, T4), Data-Driven Risk Stratification, Healthcare Data Analytics
Page No: 13-15
Abstract
Thyroid cancer is among the fastest-growing endocrine malignancies globally, with multiple risk factors including genetics, environmental exposure, and hormonal imbalances. This study analyzes a dataset of over 212,000 patients containing demographic, clinical, and biochemical features to evaluate patterns associated with thyroid cancer risk. Using Python, we apply descriptive statistics and visualization to explore correlations between thyroid hormone levels (TSH, T3, T4), lifestyle factors, and cancer diagnosis. Our findings suggest that elevated TSH levels and large nodule size are critical markers, and that risk stratification can be improved with data-centric methodologies.
5. Data-Driven Insights into Popular Anime Trends Using Exploratory Analysis
Authors: Areti Hemanth Kumar
Keywords: Anime Trend Analysis, Exploratory Data Analysis (EDA), Popularity and Rating Patterns, Genre Distribution Analysis, Recommendation System Insights
Page No: 16-19
Abstract
The anime industry has grown rapidly in recent years, with a global audience that spans millions. This study analyzes a comprehensive dataset of the top 15,000 anime titles to extract insights on trends, scoring patterns, and audience preferences. Through Python-based data processing and visualization, we investigate relationships between score, popularity, genre distribution, and other features. The study contributes to understanding the driving factors behind highly rated anime and provides a foundation for recommendation systems and content strategies.
6. Data-Driven Prioritization of Medical Symptoms Based on Severity Scores
Authors: Battula Bhargavi
Keywords: Symptom Severity Scoring, Clinical Decision Support Systems, Medical Triage Optimization, Healthcare Data Analysis, Severity-Based Symptom Prioritization
Page No: 20-22
Abstract
In clinical decision-making, the accurate prioritization of symptoms based on severity is essential for diagnosis, triage, and treatment planning. This study analyzes a structured dataset consisting of 133 medical symptoms, each annotated with a severity weight ranging from 1 to 5. We explore statistical distributions and symptom severity trends using Python. The findings underscore the variability of symptom intensities and suggest utility in symptom severity indexing for use in medical diagnosis systems and health triage applications.
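As a minimal sketch of how a severity index of this kind might feed a triage application, the snippet below tallies the distribution of weights and scores a patient's reported symptoms. The symptom names, their weights, and the additive scoring rule are illustrative assumptions, not entries or methods taken from the study.

```python
from collections import Counter

# Hypothetical symptom -> severity weight (1 = mild, 5 = severe).
# Placeholder entries only; the study's dataset has 133 symptoms.
SEVERITY = {
    "itching": 1,
    "fatigue": 2,
    "vomiting": 3,
    "high_fever": 4,
    "chest_pain": 5,
}

def weight_distribution(severity):
    """Count how many symptoms carry each severity weight."""
    return Counter(severity.values())

def triage_score(symptoms, severity):
    """Sum the weights of the reported symptoms; unknown symptoms are skipped."""
    return sum(severity[s] for s in symptoms if s in severity)

def rank_symptoms(symptoms, severity):
    """Order the known reported symptoms from most to least severe."""
    known = [s for s in symptoms if s in severity]
    return sorted(known, key=lambda s: severity[s], reverse=True)
```

Ranking puts the highest-weight symptom first, which is one simple way a triage tool could decide what to surface to a clinician.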
7. ECG Time Series Analysis and Prediction: A Data-Driven Approach
Authors: C S Chandra Sekhar Azad
Keywords: Electrocardiogram (ECG) Analysis, Time Series Modeling, Cardiovascular Signal Processing, Anomaly Detection in Biomedical Signals, Predictive Analytics in Cardiology
Page No: 23-25
Abstract
Electrocardiography (ECG) serves as a vital diagnostic tool in cardiology, capturing the electrical activity of the heart over time. In this study, we analyze a univariate ECG time-series dataset with over 17,000 data points. By exploring patterns in the data using descriptive analytics and visualizations, we aim to identify trends, anomalies, and insights for potential predictive modeling. This paper elaborates on the methodology, data analysis, results, and implications for cardiovascular monitoring systems.
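One generic way to flag anomalies in a univariate series is a rolling z-score, sketched below. This is an assumed illustration rather than the paper's method: the function name, window size, and threshold are hypothetical, and a real ECG pipeline would also need to treat the normal QRS spikes separately from true outliers.

```python
from statistics import mean, stdev

def rolling_zscore_anomalies(signal, window=50, threshold=3.0):
    """Return indices whose value deviates from the preceding `window`
    samples by more than `threshold` standard deviations."""
    anomalies = []
    for i in range(window, len(signal)):
        past = signal[i - window:i]
        mu = mean(past)
        sigma = stdev(past)
        if sigma == 0:
            continue  # flat window: z-score is undefined, skip this point
        if abs(signal[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies
```

Each point is compared only with samples that precede it, so the same check could in principle run on a live monitoring stream.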
8. Evaluating the Effectiveness of Two Experimental Medications in Preventing Viral Infections in Mice
Authors: Kuppam Madhusudhan Reddy
Keywords: Antiviral Drug Evaluation, Logistic Regression Analysis, Pre-Clinical Experimental Study, Dose–Response Modeling, Viral Infection Prediction
Page No: 26-28
Abstract
The emergence of novel viral infections has renewed focus on experimental antiviral treatments. This study investigates the effect of two medications on preventing viral infections in mice. Using a dataset of 400 observations measuring dosages of two medications (Med_1 and Med_2) and the presence of virus, we apply exploratory data analysis, logistic regression, and visualization to assess correlations and predictive capabilities. The analysis reveals that higher doses of Med_1 significantly reduce the likelihood of infection, while Med_2 shows a more complex, nonlinear effect. These insights can support further pre-clinical trials and drug formulation studies.
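The modeling step described here can be sketched as a logistic regression in which Med_1 enters linearly and Med_2 also receives a squared term to allow for a nonlinear effect. Everything below is an assumption for illustration: the data are synthetic stand-ins rather than the study's observations, the squared term is one possible way to capture nonlinearity, and the plain gradient-descent fit replaces whatever library the author used.

```python
import math
import random

def sigmoid(z):
    # Clamp the input so math.exp cannot overflow on extreme values.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=500):
    """Batch gradient descent on the mean log-loss; returns [bias, w1, w2, ...]."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

def make_features(med_1, med_2):
    """Med_1 enters linearly; Med_2 gets an extra squared term."""
    return [med_1, med_2, med_2 ** 2]

# Synthetic stand-in data (NOT the study's dataset): doses scaled to [0, 1].
rng = random.Random(0)
doses = [(rng.uniform(0, 1), rng.uniform(0, 1)) for _ in range(400)]
# Assumed generating process: infection probability falls as Med_1 rises.
virus_present = [1 if rng.random() < sigmoid(2.0 - 5.0 * m1) else 0 for m1, _ in doses]

X = [make_features(m1, m2) for m1, m2 in doses]
weights = fit_logistic(X, virus_present)  # [bias, Med_1, Med_2, Med_2 squared]
```

A negative fitted weight on Med_1 corresponds to lower infection odds at higher doses; the sign and size of the squared Med_2 weight indicate whether its effect bends rather than following a straight line on the log-odds scale.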
9. Evaluating Treatment Effects in a Synthetic Clinical Trial Dataset
Authors: Bathula Krishnaveni
Keywords: Clinical Trial Data Analysis, Treatment Effect Evaluation, Synthetic Healthcare Dataset, Comparative Statistical Analysis, Outcome-Based Health Assessment
Page No: 29-31
Abstract
Clinical trials are foundational to modern medicine, offering empirical evaluations of treatment safety and efficacy. This study analyzes a synthetic dataset of 1,000 participants across different treatment groups—Drug A, Drug B, and Placebo—to evaluate health outcomes including blood pressure, cholesterol levels, and adverse events. Using Python, we perform statistical and visual analyses to identify trends, group differences, and potential treatment benefits. The approach highlights how simulated clinical data can be used for educational, analytical, and methodological testing purposes.
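A group comparison of this kind can be sketched with per-arm means and Welch's t statistic, which does not assume equal variances between arms. The readings below are made-up placeholders, and the choice of Welch's test is an assumption; the abstract does not name the specific test used.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic for two independent samples with unequal variances."""
    se_a = variance(a) / len(a)
    se_b = variance(b) / len(b)
    return (mean(a) - mean(b)) / math.sqrt(se_a + se_b)

def group_means(groups):
    """Mean outcome per treatment arm."""
    return {name: mean(values) for name, values in groups.items()}

# Hypothetical systolic blood pressure readings (mmHg) per arm.
systolic = {
    "Drug A": [128, 131, 125, 129, 127],
    "Drug B": [133, 130, 134, 132, 131],
    "Placebo": [138, 136, 139, 137, 140],
}
means = group_means(systolic)
t_a_vs_placebo = welch_t(systolic["Drug A"], systolic["Placebo"])
```

A negative statistic here means the first arm's mean is lower than the second's; turning it into a p-value additionally requires the Welch-Satterthwaite degrees of freedom.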
10. Exploratory Data Analysis and Insights on Enzyme Inhibitors Dataset
Authors: C Guna Sekhar
Keywords: Enzyme Inhibitor Analysis, Exploratory Data Analysis (EDA), Structure–Activity Relationship (SAR), Drug Discovery Analytics, Molecular Property Correlation
Page No: 32-36
Abstract
This research paper presents an exploratory data analysis (EDA) of a dataset containing information on enzyme inhibitors. The dataset is composed of various chemical and biological properties of inhibitors, aimed at providing insights into the structure-activity relationships (SAR) of these compounds. Using Python and libraries such as Pandas, Matplotlib, and Seaborn, we analyze the distribution of inhibitor types, their molecular properties, and potential correlations between these properties and their efficacy. The paper highlights key trends in inhibitor characteristics, contributing to the understanding of their potential applications in drug discovery and enzyme regulation. Statistical analysis and visualizations reveal significant findings about the inhibitors' behavior and structure. Our study lays the foundation for further research into the pharmaceutical and biotechnological domains.
11. Global Blood Type Distribution Analysis: A Comparative Study
Authors: Nagaraju Latha Devi
Keywords: Global Blood Type Distribution, ABO and Rh Factor Analysis, Comparative Demographic Study, Exploratory Data Analysis (EDA), Population Health Patterns
Page No: 37-39
Abstract
This study presents a comprehensive analysis of global blood type distribution across 126 countries, with an emphasis on identifying patterns and anomalies in blood group prevalence. Utilizing a dataset consisting of ABO and Rh factor percentages by country, we conduct exploratory data analysis using Python to uncover demographic insights and potential medical implications. The findings provide valuable references for healthcare planning, blood bank logistics, and genetic studies on population diversity.
12. Global HIV/AIDS Prevalence: An Analytical Review of 2024-2025 Estimates
Authors: Shaik Yasin
Keywords: HIV/AIDS Prevalence Analysis, Global Health Statistics (2024–2025), Adult Prevalence Rates, Epidemiological Data Review, Public Health Policy Insights
Page No: 40-43
Abstract
The global burden of HIV/AIDS continues to pose significant challenges to healthcare systems. This study analyzes recent estimates of HIV/AIDS prevalence across 193 countries, focusing on adult prevalence rates, number of affected individuals, and associated annual deaths. Using descriptive statistics and data visualizations, the study identifies regional patterns and high-burden areas, offering insights for public health policy and future interventions.
13. Machine Learning for Hepatitis C Diagnosis: Predictive Modeling and Analysis of Clinical Biomarkers
Authors: R Kavya
Keywords: Hepatitis C Diagnosis, Clinical Biomarker Analysis, Supervised Machine Learning, Random Forest Classification, Predictive Healthcare Modeling
Page No: 44-46
Abstract
Hepatitis C, a liver disease caused by the Hepatitis C virus (HCV), affects millions globally, often progressing asymptomatically until severe liver damage occurs. This study utilizes machine learning algorithms to classify patients based on various clinical and biochemical parameters available in the Hepatitis C dataset. We apply Decision Trees, Logistic Regression, and Random Forest classifiers to predict the liver condition category of patients. Our Random Forest model achieves an accuracy of 93%, revealing the strong potential of supervised learning in early and efficient hepatitis diagnosis.
14. Mortality Prediction in Heart Failure Patients Using Machine Learning Techniques
Authors: Potluru Abhinaya
Keywords: Heart Failure Mortality Prediction, Clinical Records Dataset Analysis, Supervised Machine Learning, Random Forest Classifier, Predictive Clinical Decision Support
Page No: 47-50
Abstract
Heart failure is a critical cardiovascular condition affecting millions globally. Early prediction of mortality risk among heart failure patients can help guide treatment strategies and potentially save lives. In this study, we use the Heart Failure Clinical Records Dataset to develop predictive models using machine learning algorithms. We applied Logistic Regression, Random Forest, and Support Vector Machine (SVM) classifiers to predict the likelihood of patient death. Our results indicate that the Random Forest model achieved the highest accuracy (91%), suggesting its effectiveness in handling medical datasets with mixed feature types. The study supports the use of predictive analytics to aid clinical decision-making.
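For reference, an accuracy figure like the one reported here is one summary of a confusion matrix. The sketch below shows how accuracy, sensitivity, and specificity would be computed from true and predicted labels. The function names and the convention that 1 means "died" are assumptions for illustration, not the study's code.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 is the positive class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity (recall on positives), specificity (recall on negatives)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }
```

When deaths are the minority class, sensitivity shows how many of them the model actually catches, which accuracy alone does not reveal.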
15. Obesity Level Prediction using Machine Learning Techniques on Lifestyle and Health Indicators
Authors: Penabadi Prasanna Kumar
Keywords: Obesity Level Classification, Lifestyle and Health Indicators, Supervised Machine Learning, Random Forest Model, Predictive Health Analytics
Page No: 51-54
Abstract
Obesity is a growing global health concern, associated with numerous comorbidities such as diabetes, cardiovascular diseases, and cancer. In this study, we analyze lifestyle and demographic data from the "ObesityDataSet_raw_and_data_sinthetic.csv" to build predictive models that classify individuals into various obesity categories. We apply machine learning algorithms such as Logistic Regression, Random Forest, and Support Vector Machine (SVM) to classify obesity levels. The results indicate that Random Forest outperforms other models with an accuracy of 95.3%. This study showcases the potential of data-driven approaches for early identification and prevention of obesity.
16. Predicting the Onset of Diabetes using Clinical and Demographic Features
Authors: Sandagiri Gayathri
Keywords: Diabetes Onset Prediction, Clinical and Demographic Features, Logistic Regression Model, Pima Indian Dataset Analysis, Early Disease Detection Analytics
Page No: 55-57
Abstract
The rising global prevalence of diabetes necessitates the development of effective diagnostic tools. This study explores the prediction of diabetes onset using clinical and demographic variables including glucose level, BMI, age, and family history. Utilizing a well-established dataset of 768 women from the Pima Indian population, we perform exploratory analysis and build a logistic regression model to assess the probability of diabetes presence. The model shows promising accuracy, with glucose level and BMI emerging as strong predictors. These findings emphasize the potential of machine learning in enhancing early diabetes detection and prevention strategies.
17. Predictive Analysis of Cancer Presence Using Gene Expression Profiling
Authors: K Gopika
Keywords: Gene Expression Profiling, Cancer Presence Prediction, Biomarker Identification, Supervised Learning Classification, Genomic Data Analytics
Page No: 58-61
Abstract
Gene expression profiling has emerged as a powerful approach in identifying biomarkers for cancer diagnosis and prognosis. This study explores the relationship between the expression levels of two specific genes and the presence of cancer, utilizing a dataset containing 3000 samples. Through data visualization and supervised learning, we aim to develop a classification model that can predict cancer presence with high accuracy. Our findings suggest a strong correlation between gene expression levels and cancer diagnosis, underscoring the potential of genetic data in early detection efforts.
18. Predictive Analysis of Oral Cancer Using Lifestyle and Clinical Indicators
Authors: Mallepogu Sinduri
Keywords: Oral Cancer Prediction, Lifestyle and Clinical Risk Factors, Machine Learning Classification, Early Screening Analytics, Population Health Data Analysis
Page No: 62-65
Abstract
Oral cancer remains a significant global health burden, especially in developing countries. Early detection through predictive analytics can dramatically improve outcomes. This study utilizes a comprehensive dataset of 84,922 records, incorporating demographic, lifestyle, and clinical data to predict oral cancer diagnoses. Using machine learning techniques, we analyze the relationship between various risk factors and oral cancer. The results highlight the importance of lifestyle and early screening in disease prediction and prevention.
19. Predictive Analysis of Thyroid Cancer Recurrence using Clinical and Pathological Factors
Authors: Cherivi Rajitha
Keywords: Thyroid Cancer Recurrence Prediction, Clinical and Pathological Risk Factors, Logistic Regression Modeling, Random Forest Classification, Oncology Predictive Analytics
Page No: 66-69
Abstract
Thyroid cancer is one of the most common endocrine malignancies worldwide, and its recurrence remains a significant clinical challenge. This study aims to develop predictive models to identify patients at risk of thyroid cancer recurrence based on demographic, clinical, and pathological variables. Utilizing a dataset of 383 thyroid cancer patients, we applied data preprocessing, exploratory analysis, and machine learning models, including logistic regression and random forest classification. Our findings demonstrate that pathology type, tumor staging, focality, and response to initial treatment are among the most predictive features. These insights can support early intervention strategies and improved patient outcomes.
20. Predictive Modeling of Esophageal Cancer Risk using the Sobar-72 Dataset
Authors: Kate Pogu Kumar
Keywords: Esophageal Cancer Risk Prediction, Sobar-72 Dataset Analysis, Lifestyle and Clinical Risk Factors, Random Forest Classification, Clinical Risk Stratification Analytics
Page No: 70-73
Abstract
Esophageal cancer is a deadly disease that is often diagnosed at a late stage and carries a poor prognosis. Early identification of high-risk individuals is critical for prevention and intervention. This study utilizes the Sobar-72 dataset, which contains clinical and lifestyle features, to develop machine learning models that can predict the risk of esophageal cancer. By applying data preprocessing, exploratory data analysis, and classification algorithms (including logistic regression and random forest), we identify the most influential factors and evaluate model performance. Results show that Random Forest achieved the highest accuracy (91.6%) and identified features such as age, alcohol use, and tobacco use as significant predictors. This work emphasizes the potential of predictive analytics in clinical risk stratification.
21. Predictive Modeling of Heart Attack Risk in China using Lifestyle, Clinical, and Socioeconomic Indicators
Authors: Edara Annegrace
Keywords: Heart Attack Risk Prediction, Cardiovascular Disease Modeling, Supervised Machine Learning, Population Health Analytics, Public Health Risk Assessment
Page No: 74-76
Abstract
Cardiovascular disease remains the leading cause of death globally, with heart attacks representing a significant proportion. This study explores the predictive modeling of heart attack risks in the Chinese population using a large-scale dataset of over 239,000 individuals. Key variables span lifestyle, clinical, environmental, and socioeconomic dimensions. By applying supervised machine learning techniques, the study aims to identify critical factors contributing to heart attack incidence and demonstrate the potential for data-driven public health intervention.
22. Predictive Modeling of Liver Disease using Machine Learning on Indian Patient Data
Authors: Venkatagirichandu Sai
Keywords: Liver Disease Prediction, Indian Liver Patient Dataset Analysis, Random Forest Classification, Clinical Decision Support Systems, Supervised Machine Learning in Healthcare
Page No: 77-82
Abstract
Liver diseases are a growing concern in developing countries like India, where diagnosis is often delayed due to limited access to specialized healthcare. This study utilizes the Indian Liver Patient dataset to build predictive models for early detection of liver disease. We employ logistic regression, decision trees, and random forest classifiers to predict whether a patient has a liver condition. Our results show that the random forest classifier achieves the highest accuracy of 79.4%, highlighting the potential of machine learning in clinical decision support systems.
23. Smartwatch Health Analytics: A Data-Driven Study of Physiological and Behavioral Metrics
Authors: Vanpuri Jagadeesh
Keywords: Wearable Health Data Analytics, Smartwatch Physiological Monitoring, Exploratory Data Analysis (EDA), Machine Learning for Health Trends, Population-Level Health Insights
Page No: 83-86
Abstract
With the proliferation of wearable devices, massive volumes of health-related data are generated daily. This paper presents an analytical exploration of physiological metrics obtained from smartwatches, including heart rate, oxygen saturation, sleep duration, and stress levels. Using a cleaned dataset of anonymized users, we evaluate patterns, identify correlations, and compare behaviors across activity levels. Through exploratory data analysis and machine learning techniques, we demonstrate the potential of smartwatches in monitoring individual and population-level health trends.
24. Survey-Based Analysis of Mastocytosis Patient Experiences: Symptoms, Triggers, and Quality of Life Impact
Authors: S Kartheek
Keywords:
Page No: 87-89
Abstract
Mastocytosis is a rare disease involving the accumulation of mast cells in various organs. This study presents a comprehensive analysis of patient-reported experiences based on a survey of 50 individuals worldwide. Using statistical and exploratory data analysis techniques, we investigate symptom prevalence, diagnosis status, treatment patterns, and the perceived impact on quality of life. Results indicate daily symptom frequency among the majority, strong associations between certain symptoms and triggers, and a need for increased specialist access and support systems. This work provides insights into the real-world burden of mastocytosis and underlines the importance of patient-centered care.