Proceedings of the International Workshop on Advances in Deep Learning for Image Analysis and Computer Vision (IWADIC 2025)
102 articles
Proceedings Article
Peer-Review Statements
Kannimuthu Subramaniyam
All of the articles in this proceedings volume have been presented at IWADIC during December 26-28, 2025 in Kuala Lumpur, Malaysia. These articles have been peer reviewed by the members of the Scientific Committee and approved by the Editor-in-Chief, who affirms that this document is a truthful...
Proceedings Article
Industrial Robot Control Based on Deep Learning
Yuebai Wang
The concept of Industry 4.0 originated in Germany, with its core being a highly digitalized factory interior that enables an efficient manufacturing system and even autonomous control of production. In the era of Industry 4.0, with the application of neural network learning, industrial robots can complete...
Proceedings Article
The Application of Vision-Based Navigation Techniques on Drones
Xiaohong Gui
With the development of unmanned aerial vehicles (UAVs), more and more of them are being deployed in real working conditions, and they need a powerful navigation system to operate smoothly. However, the commonly used navigation systems, such as the global positioning system (GPS), are not always reliable...
Proceedings Article
YOLOv8-CAACA: A Context-Aware Adaptive Confidence Adjustment and Target Fusion Algorithm for Pavement Crack Recognition in Complex Scenarios
Muhan Bai
To address the issues in current crack detection from road images—such as inaccurate identification of long cracks, misdetection and missed detection of cracks in complex environments, and splitting of a single crack into multiple segments—this paper proposes an improved YOLOv8 algorithm (YOLOv8s-CAACA)...
Proceedings Article
Application of Robotics Technology in Extreme Environments
Kexin Li, Tinghe Na, Kaiming Zhang
Extreme environment robots can replace humans working in dangerous scenarios such as the deep sea, space, and the nuclear industry, but their applications face many challenges. Technically, they need extreme environmental adaptability, stable communication, and autonomous decision-making ability. Key...
Proceedings Article
Technology and Application of Computer Vision in Smart Driving
Yiheng Zhou
Intelligent driving is a systematic project that involves the realization of multiple links in a collaborative manner. Environmental perception is an important link in an intelligent driving system. Machine vision can acquire rich and accurate data through cameras, and through the processing of this...
Proceedings Article
Prediction of Pre-Diabetes Based on Random Forest
Qihan Li
Diabetes, as one of the most severe chronic diseases globally, has seen an increase in prevalence in recent years rather than a decrease. Early screening for diabetes relies on traditional biochemical indicators, resulting in a high rate of missed diagnoses and insufficient resources at the grassroots...
Proceedings Article
Short-term Subway Passenger Flow Prediction on Holiday Based on LightGBM and LSTM
Yuhan Xia
Traditional predictive models struggle to accurately forecast short-term holiday passenger flow in urban rail transit systems. Intense fluctuations and nonlinear patterns often cause infrastructure strain, overcrowding, delays, and safety risks. Addressing this gap is vital for effective transit management,...
Proceedings Article
Short-term Traffic Flow Prediction for Expressway based on ARIMA: Compared with LSTM
Jinghan Zou
Predicting short-term traffic flow holds significant value for the operation and management of highway traffic. As a result, it’s crucial to develop a feasible short-term traffic flow forecast model and use traffic flow data for prediction in an efficient manner. In this study, for predicting traffic...
Proceedings Article
Battery Health Prediction of New Energy Vehicles Based on LightGBM
Haolong Li
In the era of rapid development of new energy vehicles, the state of health (SOH) of lithium-ion batteries directly affects the endurance and safety of these vehicles, so predicting it is very important. This study uses LightGBM as a data prediction...
Proceedings Article
Application of Biometric Technology in Academic Examinations
Qiman Huang
Educational examinations are critical for national talent selection. In recent years, the number of examinees has surged due to growing demands for academic advancement, professional title evaluations, and occupational qualifications. This expansion brings new management challenges, as conventional cheating—such...
Proceedings Article
A CNN-RF-SVM Hybrid Method for Traffic Congestion Monitoring
Jiaxu Zhu
With the development of the automotive industry and increasing demand for personal mobility, the growing number of vehicles on roads has led to traffic congestion and environmental pollution. Accurate traffic congestion monitoring has become a prominent research focus. The integration of surveillance...
Proceedings Article
Corporate Bankruptcy Prediction in the U.S. Using Random Forest and XGBoost Algorithms
Zihao Lan
Corporate bankruptcy prediction is crucial to investors, financial institutions and regulators, because it supports early risk warning, enhances capital allocation efficiency, and contributes to both financial market resilience and enterprise sustainability. Traditional statistical methods have limited...
Proceedings Article
Bird Species Identification Using YOLO Neural Network
Yawei Li
Ecological conservation efforts increasingly rely on biodiversity indicators, with bird populations serving as critical sentinels of ecosystem health. Bird conservation is fundamental to maintaining biodiversity and health. Thus, bird species monitoring constitutes a crucial part of conservation efforts....
Proceedings Article
Short-Term Passenger Flow Prediction of Suzhou Metro Using Random Forest and LSTM
Kai Zhang, Lixun Zhuang
In the operation of urban public transportation, the volatility of subway passenger flow stands as a key issue affecting operational efficiency and passenger experience. Congestion during peak passenger flow periods, wasted transport capacity during off-peak hours, and sudden changes in passenger flow...
Proceedings Article
Multi-Person Pose Estimation: Method Classification and Cross-Dataset Performance Analysis
Zikun Li
The main challenge of multi-person pose estimation is finding the important features of each human body in an image and accurately assigning those features to each individual. Multi-person pose estimation can support multiple downstream tasks and overcome the limitation of single...
Proceedings Article
Path Planning Algorithm for Intelligent Robot
Jiayi Xu
The task of intelligent robot path planning is to automatically generate the optimal motion path from the starting point to the target point on the basis of achieving safety, efficiency, and energy balance. The development of related technologies has improved the adaptability and efficiency of path planning,...
Proceedings Article
Autonomous Driving Vehicle Detection Methods under Different Low-Light Scenes
Jiacheng Fan
Vehicle detection is a core task in environmental perception for autonomous driving, and its performance directly affects driving safety. However, low-light scenes, due to issues such as low light levels, atmospheric scattering, and sensor occlusion, result in image distortion, difficulty in edge detection,...
Proceedings Article
Intelligent Detection of Crop Pests and Diseases Based on Deep Learning: Rice Pests and Tomato Leaf Diseases
Zexi Li
At the critical stage of the development of smart agriculture, integrating machine vision and deep learning technologies to achieve precise monitoring of tomato leaf diseases and rapid identification of rice pests is of great significance for enhancing agricultural production efficiency and ensuring...
Proceedings Article
A Survey on Deep Learning-Based Integrated Perception-Cognition-Control Systems for Autonomous Mobile Robots
Wenqi Han
The mobile robots currently widely adopted in industrial and logistics sectors integrate various technologies, with deep learning emerging as a new driving force and breakthrough for enhancing robotic efficiency and precision. Integrating its technology into each component module of the robotic system’s...
Proceedings Article
Analysis of Visual Navigation Systems in Autonomous Driving
Yizhan Zhang
As a key technology for environmental perception and precise positioning in autonomous driving, the performance of visual navigation systems directly impacts vehicle safety and efficiency in complex environments. This system primarily relies on visual cameras to capture rich road scene information, utilizing...
Proceedings Article
Breakthroughs and Technological Innovations in Homo Sapiens-Based Artificial Intelligence: Development Overview of the Machine Homo Sapiens Field
Peiliang Yan
This paper examines the impact of breakthroughs and technological innovations in Homo sapiens-based artificial intelligence on the development of the machine Homo sapiens field, aiming to lower the dependency on traditional AI models and enhance autonomous capabilities. The article first introduces key...
Proceedings Article
Comprehensive Analysis of Intelligent Welding Technology Based on Machine Vision
Zhuangzhuang Liu
With the increasing demands for welding efficiency and accuracy, intelligent welding technology has developed rapidly, and machine vision plays a significant role in it. In terms of image preprocessing, a system of grayscale processing, filtering and denoising, and image enhancement is formed to reduce...
Proceedings Article
Analysis of Autonomous Driving Control Strategies Based on Deep Learning
Yumin Cai
As a core component of intelligent transportation systems, the development of autonomous driving technology is of milestone significance for enhancing traffic safety, optimizing traffic efficiency, and improving travel experience. Control strategies, as a key execution link in the “perception-decision-control”...
Proceedings Article
Applications of GAN-Based Extension Methods in Image Processing
Yang Ding
In the era of rapid AI advancement, Generative Adversarial Networks (GANs) stand as one of the most disruptive innovations in deep learning, deeply integrated into every facet of human society. This paper focuses on the optimization pathways within the GAN technology system, systematically reviewing...
Proceedings Article
Cross-Domain Applications of Multi-Dimensional Data Alignment Technology
Fengshuo Kou
Against the backdrop of the rapid development of intelligent technology, cross-modal data fusion is driving innovations across various fields. As a key supporting technology, 2D and 3D data alignment technology is becoming increasingly prominent in value. 2D data (e.g., camera images) is rich in semantic...
Proceedings Article
PGeoCLIP: Acceleration on Image geo-localization Using Precomputed Features
Chengwuzhou Wu
Image-based geo-localization refers to predicting the geographic location from an image. A novel image-to-GPS retrieval approach, GeoCLIP, has demonstrated outstanding performance on distance-threshold metrics, but its substantial training time and computational overhead present considerable challenges....
Proceedings Article
Object Detection Based on the DETR Method
Yiru Wang
One of the computer vision research hotspots is object detection. Its aim is to accurately and quickly identify objects in images and locate their positions, converting visual information into understandable and actionable intelligence. With the success of the Transformer architecture in the field of...
Proceedings Article
Character Image Editing via Segment-Anything Model and In-Context Edit Integration
Zhuoran Jia
In recent years, the development of computer vision tasks for image segmentation has become relatively mature. Image editing tasks based on instructions can achieve powerful image modification by natural language prompts. However, existing instruction-based image editing driven by natural language or...
Proceedings Article
Construction and Deduction of a Multimodal Traffic Prediction System: From Baseline Models to Fusion Innovation
Zijun Chen, Zhengyuan Zhou
With the rapid development of intelligent transportation systems, urban traffic flow prediction has become a core issue in urban management and traffic optimization. Traditional traffic prediction methods often rely on single-modal data, such as historical speed or flow, which cannot fully utilize the...
Proceedings Article
Advances in Intelligent Control and Collaborative Technologies for Industrial Robots
Yixin Luo
This article provides a systematic review of the current research status and development trends of intelligent control and collaboration technologies for industrial robots. As the demand for flexibility and personalization in manufacturing continues to rise, artificial intelligence is deeply integrating...
Proceedings Article
Tooth Detection Technology for Oral Disease Diagnosis
Jie Li
With the constant development of digital medical technology, oral disease diagnosis is gradually shifting from traditional manual examination to automated detection based on image analysis. Tooth detection technology, as an important part of intelligent dental imaging, plays a crucial role in improving early disease...
Proceedings Article
Bionic Mechanical Structures of Rescue Robots in Complex Disaster Environments
Wenxuan Yu
Rescue robots are crucial equipment for responding to complex disaster environments, and their bionic mechanical structure design directly determines their operational effectiveness. With the frequent occurrence of extreme natural disasters worldwide, the adaptability of rescue equipment to complex...
Proceedings Article
The Development of Face Recognition Technology Based on Deep Learning
Xiaoxun Huang
With the continued progress of society, the development of facial recognition technology, and growing public safety awareness, researchers have begun to study facial recognition technology in greater depth. Continuous research on deep learning technology can...
Proceedings Article
Review on Autonomous Robot Mobility Based on Visual Deep Learning
Xun He, Tianyi Xu, Lo San Yuan
Dynamic autonomous navigation in complex environments has become an increasingly popular topic in robotics research. Automatic obstacle avoidance and path planning are crucial for mobile robots. Traditional methods, such as simultaneous localization and mapping (SLAM) technology,...
Proceedings Article
Intelligent Question Answering System Based on Multimodal Fusion
Mengcong Zhang
This article explores the application of multimodal learning in intelligent question answering, emphasizing the importance of integrating data from multiple modalities to fully understand complex scenarios. The article reviews the development of intelligent question answering systems, from the early...
Proceedings Article
Comprehensive to the Textual Hallucination in Generative AI
Yiyang Li
Generative AI has performed strongly in many areas in recent years, especially large language models, which do very well at writing articles, answering questions, and assisting with learning. However, these models sometimes make mistakes, such as making up factual content, or giving...
Proceedings Article
Multimodal Question Answering: Method Evolution, Challenges and Prospects
Haopeng Li
With the breakthroughs in cross-modal technology of artificial intelligence, multi-modal question answering (MMQA), as a key research direction connecting image, text and voice information, has increasingly significant application value in fields such as barrier-free services and education. This paper...
Proceedings Article
The Principles and Functions of Various Models for Text Style Transformation
Haolun He
Text style transfer is a significant research task in natural language processing, with its core objective being to transform the style of a text while maintaining the original semantic content. This paper reviews and summarizes existing research from the perspectives of supervised learning, semi-supervised...
Proceedings Article
Analysis, Comparison and Application Scenarios of Nighttime Infrared Image Technology
Zijun Jin
As a breakthrough technology that transcends human visual limitations, nighttime infrared imaging demonstrates immense potential and application value across military, civilian, and industrial sectors. This paper systematically reviews the research progress of nighttime infrared imaging technology, focusing...
Proceedings Article
Progress in Welding Defect Detection Based on Visual Technology
Xucheng Feng
Welding defect detection is a crucial step in ensuring the safety of industrial manufacturing. The traditional manual visual detection and radiographic detection methods have low efficiency, high costs, and poor adaptability, which make it challenging to meet the needs of large-scale manufacturing. With...
Proceedings Article
Research on Industrial Robot Grasping Based on Visual Technology
Changye Du
Under the backdrop of the rapid development of industrial automation, industrial robots have become the core force for the transformation and upgrading of manufacturing. Integrating visual technology into the grasping of industrial robots changes the traditional operation mode, endowing robots with the...
Proceedings Article
Unmanned Aerial Vehicle Target Tracking Systems in Complex Environments Based on Visual Enhancement Technology
Tianlin Guo
Unmanned aerial vehicle target tracking is one of the key areas of study in the field of computer vision and intelligent control, widely used in aerospace, intelligent driving, security surveillance, smart agriculture, and search and rescue, which can also carry out aerial multi-view multi-platform long-distance...
Proceedings Article
Machine Learning-Based Motion Planning for Robots
Yumin Shi
Motion planning for robots is a crucial technology that enables autonomous navigation and the performance of complex tasks. Traditional methods have problems like low efficiency and poor adaptability in high-dimensional spaces and dynamic environments. In recent years, machine learning methods, including...
Proceedings Article
From Credit Scoring to Artificial Intelligence-Driven Loan Default Prediction
Zaiyu Zhang
Loan default prediction is important to both credit allocation and portfolio risk management. The conventional scorecard is based on predefined handcrafted features and linear assumptions, thus ignoring the nonlinear and temporal characteristics of borrowers' behavior. This article introduces a novel sandwiched...
Proceedings Article
Food Safety Monitoring Based on Machine Vision
Pengyu Xie
Food safety has become a global challenge due to the inefficiency and subjectivity of traditional inspection methods. These approaches often rely on destructive sampling and manual visual analysis, which lead to high latency, low scalability, and limited accuracy. Machine vision provides a non-destructive,...
Proceedings Article
Analysis of Intelligent Welding Strategies Based on Machine Vision
Junying Tong
With the rapid development of industrial automation, welding has become one of the core production processes, and its quality and efficiency affect the product quality and production efficiency of enterprises. Non-contact inspection, real-time information acquisition, and high-precision recognition...
Proceedings Article
Retrieval-Augmented Generation: Advances, Applications, and Future Directions in Knowledge-Grounded Language Modeling
Hancheng Yu
Retrieval-Augmented Generation (RAG) has emerged as a pivotal innovation in natural language processing (NLP), integrating information retrieval with generative modeling to overcome the limitations of static, parameter-based large language models. By dynamically retrieving relevant external knowledge...
Proceedings Article
The Current Development of Large Language Models in Medical Questions and Answers--Center on the Models of Med-PaLM and Med-PaLM2
Yueqi Zhu
Large language models (LLMs) have brought significant technological changes to the field of medical question answering. They can help medical personnel improve diagnostic efficiency and also help the public more easily understand their own physical health status. This article focuses on the Google...
Proceedings Article
Comparative Research on the Unbalanced Processing of Traffic Accident Data and Multi-model Performance
Tao Hu
Traffic accident prediction is of great significance in the construction of smart cities; however, traffic accident data poses an imbalance challenge. In response to this problem, three public datasets, Addis Ababa Sub-city Accident, US Accident, and Barcelona Accident, were selected....
Proceedings Article
Research Analysis of Optimization of Temperature Prediction Model Based on Random Forest
Tianhua Jiang
Temperature prediction plays a crucial role in meteorological analysis, scientific planning of agricultural production, precise allocation of energy management and so on. But traditional temperature prediction methods often fall short in terms of prediction accuracy and fail to meet expectations. To...
Proceedings Article
Threshold-Aware Machine Learning for Heart Disease Prediction
Yanzhou Qian
The decision threshold in clinical heart disease risk assessment is critical for real-world utility but frequently overlooked. This study investigates threshold-aware prediction using a clinical dataset of 919 records (14 features), with heart disease presence (num > 0) as the target. Preprocessing...
Proceedings Article
An Improved CNN–LSTM Model for Daily Gold Price Prediction
Yi Lin
Gold prices play a pivotal role in the global financial system, but they often experience short-term volatility, long-term trends, and frequent regime switching. To address this, this paper proposes a new architecture, building on the CNN architecture and combining ordinary convolution with dilated convolution...
Proceedings Article
Prediction of Cardiovascular Diseases Based on Mainstream Machine Learning Algorithms
Peiyuan Liu
Cardiovascular diseases (CVDs) are a major focus of current medical research due to their status as the leading cause of mortality worldwide. Studies have made some progress in developing early detection tools. However, there is still a research gap in accurate, efficient and widely...
Proceedings Article
UAV Autonomous Flight Obstacle Avoidance Technology
Shengyao Duan
With the maturity of drone technology and its increasing application in agriculture, logistics, and even medical fields, the issue of its flight safety in low-altitude complex environments has become increasingly critical. Autonomous flight obstacle avoidance is the core technology to ensure the reliability...
Proceedings Article
Research and Analysis on Chain of Thought (CoT) Reasoning and Interpretability in Large Language Models
Pengyu Liao
As an important technique for Large Language Models (LLMs), Chain-of-Thought (CoT) has made breakthroughs in logical consistency, interpretability, and task accuracy by guiding the model to generate answers through stepwise reasoning. This paper systematically reviews the research progress of CoT reasoning in the...
Proceedings Article
Computational Approaches to Stock Price Prediction
Jiachen Liu
Stock price prediction is one of the most significant and difficult problems of computational finance. As the volume of data and computing capabilities have rapidly increased, forecasting models have shifted toward a data-driven forecasting method based on machine learning and deep learning. In this...
Proceedings Article
Progress in Diagnosis and Prediction of Common Cancers: Multi-Cancer Characteristics, Technical Applications, and AI Model Practices
Bowen Zhang
Cancer remains a major global health burden, and its early diagnosis and accurate prediction are crucial for improving patient prognosis. This paper reviews three common types of cancers—gastric cancer, skin cancer, and brain tumors—focusing on their pathological mechanisms, pathogenic factors, clinical...
Proceedings Article
Multi Asset Price Time Series Prediction Based on LSTM, GRU, and MLP
Yaqi Liu
Financial asset prices have nonlinear, stochastic, and multidimensional characteristics, making them difficult to forecast with linear models. Therefore, this paper uses three neural networks, namely Multilayer Perceptron (MLP), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU), to achieve...
Proceedings Article
Pneumonia X-ray Image Classification Methods Based on Deep Learning
Xinyu Zhou
Pneumonia is a common respiratory disease, and timely and accurate diagnosis is crucial for clinical treatment. The traditional chest X-ray diagnostic method relies mainly on the professional knowledge of radiologists....
Proceedings Article
Machine Learning Applications in Stock Index Prediction
Xuanmeng Huang
Stock index forecasting has been an essential indicator of market research since the emergence of financial analysis. Its concept encompasses many dimensions, such as investment decision-making, risk control, and policy evaluation. In recent years, with the rapid development of artificial intelligence...
Proceedings Article
Enhancing the Efficiency of Cell Classification in ScRNA-seq Data by Weakly Supervised Learning
Xinyi Zhou
Single-cell RNA sequencing (scRNA-seq) is a genomics technology that enables research of cellular diversity and operation through assessing gene expression. However, no widely available automated classification technology currently exists, and distinguishing cells within scRNA-seq datasets still relies...
Proceedings Article
Evaluation of Galaxy Morphology Classification with Machine Learning and Deep Learning
Yuhan You
Through the development of technologies, astronomical imaging surveys have increased significantly, so that more galaxy images are taken than ever, which makes the traditional manual galaxy classification infeasible. For this reason, adopting automated machine learning techniques is important in replacing...
Proceedings Article
Fine-Grained Deep Learning for Gleason Grading in Prostate Histopathology
Kaixin Chen
Prostate cancer is one of the most frequent cancers in male patients, and its diagnosis and prognosis involve histopathological image analysis, especially Gleason grading. It is a delicate task that involves identifying subtle architectural variations in tissue patterns which are however disadvantaged...
Proceedings Article
GANs in Image Generation and Denoising: From Infrastructure to Real-World Applications
Haoyang Sun
Generative Adversarial Networks (GANs) are currently the mainstream generative models, widely used in image, audio, and other data processing. Over the course of ten years, many GAN models have emerged that are adapted to different use cases. With the constant improvement of GAN model structure, they...
Proceedings Article
Skin Cancer Image Generation Using WGAN-GP Based on HAM10000 Dataset
Hongye Hao
Skin cancer is a highly prevalent disease, and its early diagnosis relies on the recognition of skin lesion images. However, the acquisition cost of labeled medical image samples is extremely high, which makes it easy for model training to suffer from data imbalance. To address this issue, this paper...
Proceedings Article
The Summary of Price Prediction Methods of Gold-related Financial Products
Hengxin Hua
After the dismantling of the Bretton Woods System, the financial attributes of gold gradually emerged, and in recent years more and more gold-related financial products, such as gold futures, have appeared in the market. The price of gold-related financial products has reflected and influenced economic...
Proceedings Article
Intelligent Classification and Identification of Similar Respiratory System Diseases
Zhuoyang Liu
This study focuses on symptom classification models for the four most common respiratory diseases (COVID, FLU, COLD, and ALLERGY). The aim is to address the challenge of distinguishing between similar and troublesome respiratory illnesses while maintaining accuracy and minimizing unnecessary time wastage....
Proceedings Article
Heart Disease Prediction Using Machine Learning Models
Daoyi Cheng
This study is aimed at the binary classification task of heart diseases, using 918 subjects and 11 clinical and diagnostic features from public datasets. Before training, the data underwent missing and anomaly checks, numerical features were standardized, category-specific features were encoded, and...
Proceedings Article
Applications of Partial Differential Equations
Ziran Zhang
Partial differential equations (PDEs) are descriptions of continuous changes. Using methods/algorithms, PDE-based models are applied in physical calculations, image processing, machine learning, etc. This paper discusses the applications of PDEs, especially in image denoising and inpainting. When denoising,...
Proceedings Article
Using Machine Learning Methods to Predict Mobile Phone Prices
Wenbo Lu
This study analyzes the price range and configurations of mobile phones on the market to help emerging mobile phone companies accurately position their product prices and better compete in the mobile phone market. The study uses machine learning techniques to predict mobile phone prices, employing three...
Proceedings Article
Machine Learning, Ensembles, and Knowledge Graphs for Diabetes Prediction
Jiayi Jiang
This article reviews the progress of machine learning in early prediction and risk identification of diabetes, focusing on three methods: traditional models (such as Logistic Regression, SVM, RF, etc.), ensemble learning (such as Bagging, Boosting, Stacking and weighted voting) and reasoning based on...
Proceedings Article
Research and Analysis of Large Language Model Synthetic Data Generation and Bias and Illusion Problems
Bolin Zhang
This paper examines the relevance of large language models (LLMs) in enhancing time-series data applications, such as climate forecasting, traffic control, and finance, in which high-quality data is critically important for prediction and decision-making. The importance of overcoming the limitations...
Proceedings Article
Research and Analysis of Traffic Accident Severity Prediction Based on Data Augmentation and Feature Interpretation
Yuhan Chen
Predicting the severity of traffic accidents is crucial to traffic safety management and accident prevention. However, in the actual data, the number of minor accidents far exceeds that of serious accidents, resulting in deviations in most types of predictions. In order to alleviate the category imbalance...
Proceedings Article
Paradigm Evolution of Industrial Surface Defect Detection: The Underlying Logic and Fundamental Challenges from Supervised Classification to Unsupervised Anomaly Localization
Zihan Zhang
Industrial appearance defect detection is a key aspect of quality control in smart manufacturing. The core challenge lies in how to effectively apply models that perform well in laboratory environments to complex, dynamic, and unpredictable real-world industrial scenarios. This challenge is specifically...
Proceedings Article
Diagnosis, Optimization, and Verification of Localized Failures in Traffic Flow Prediction Models
Yunjian Tang
In real traffic control scenarios, multi-intersection prediction models often exhibit “local failure” due to distribution differences and concept drift between intersections, which degrades overall prediction performance and affects the reliability of scheduling. Improving...
Proceedings Article
Application Status of Deep Reinforcement Learning in Optimal Power Flow (OPF) Problem in Renewable Energy Power System
Shuai Yuan
This paper analyzes in depth the challenges that a high proportion of renewable energy integration poses for optimal power flow (OPF) in power systems, namely increased uncertainty and stricter real-time requirements. Since the traditional iterative optimization method only...
Proceedings Article
Overview of Traffic Flow Prediction Research Based on Graph Neural Networks
Yi Shao
The acceleration of global urbanization and the continuous increase in the number of vehicles have made traffic jams a severe challenge faced by cities around the world. Predicting traffic flow accurately and in real time has become a core element in building intelligent transportation systems. To better...
Proceedings Article
Pure Vision and Multi-modal Perception in Autonomous Driving: Performance, Challenges and Architectural Insights
Yaode Han, Kaiyang Li
Environmental perception is a core technical aspect of autonomous driving systems, and its architecture design directly determines how well the vehicle understands its surroundings and how safely it drives. With the development of artificial intelligence and sensor technology, pure visual...
Proceedings Article
A Review of Recent Methods for Traffic Prediction with Few Data
Zhaowei Huang
Predicting traffic accurately is very important for building smart cities. But the deep learning models that achieve state-of-the-art performance today—especially graph neural networks (GNNs)—require massive amounts of past data. This is a problem in new urban areas or on roads that have recently been...
Proceedings Article
Research and Analysis of Stock Prediction Based on Deep Learning
Hanyu Zhang
In recent years, driven by rapid advances in artificial intelligence, deep learning has come to be applied in almost all fields, including but not limited to health care, scientific research, and financial analytics. Among these applications, predicting and assessing stock market trends using deep...
Proceedings Article
The Influence of the Shape Symbol Paired Grouping Gradient Mean Iterative Method (SSPGG-IM) on the Medical Image Classification Problem in the Context of Small Samples
Zhengyang Li, Jiayi Zhang
This study focuses on the analysis of small sample medical images. It first reviews the research value and application status of small sample learning in this field, as well as the limitations of traditional methods. The research is based on three types of core small-sample learning methods as the technical...
Proceedings Article
Evaluating the Impact of Self-Attention in Pix2Pix for Image-to-Image Translation
Zheng Liao
In this work, a self-attention module is incorporated into the generator of the Pix2Pix model, and its effects on image-to-image translation with the Facades dataset are assessed. The proposed architecture adds self-attention at the bottleneck of the U-Net generator to capture global context while retaining...
Proceedings Article
Multi-Scale Patch Discriminator for Cycle-Consistent Unpaired Image Translation
Yichen Liu
Unpaired image-to-image translation often relies on a single-scale PatchGAN discriminator that emphasizes local textures while providing limited guidance on global structure, which can lead to shape distortion and unstable optimization at higher resolutions. This paper investigates a drop-in Multi-Scale...
Proceedings Article
Data Cleaning and Visualization Analysis Based on Pandas and Matplotlib: A Case Study of the Titanic Dataset
Kaiwen Zuo
In the era of big data, data cleaning and visualization are essential for extracting valuable information from raw data. The main challenge of such work is knowing which techniques work, for whom, and how. This calls for systematic practice in data preprocessing and exploratory visualization. The classic Titanic...
Proceedings Article
Generative Adversarial Networks in Medical Imaging: Applications, Challenges, and Emerging Trends (2020–2025)
Taoyu Chen
Generative Adversarial Networks (GANs) have developed into an important class of generative models in medical imaging, offering distinct advantages for problems made difficult by limited data and costly annotation. Between 2020 and 2025, GANs also shifted from basic augmentation to more advanced...
Proceedings Article
Research and Analysis on Image Style Transfer Technologies
Runxin Yang
Image style transfer technologies seek to imbue original images with a desired artistic style while preserving their content structure. Early image style transfer algorithms mostly employed non-parametric methods to achieve style transfer. Recently, there have been a series of breakthroughs...
Proceedings Article
Research and Analysis of Core Data in the Closed-loop of Autonomous Driving Data
Tielin Wang
Since 2020, the global autonomous driving (AD) market has entered a phase of mass-market penetration of L2+ Advanced Driver Assistance Systems (ADAS), and the adoption rate of these systems in China is expected to exceed 65 percent by 2025. A vehicle with AD capabilities produces 4–20 terabytes...
Proceedings Article
Framework Design and Performance Comparative Analysis of Large Language Models
Mingxuan Deng
With the emergence of the Transformer architecture, large language models (LLMs) have made breakthroughs in the fields of language understanding, reasoning, code generation, and multimodal interaction. This research systematically traces the technical evolution path of mainstream LLMs in the past five years,...
Proceedings Article
Research and Analysis of Multi object Tracking Combined with Transformer Method
Junhao Yu
Multi-object tracking (MOT) is widely used in intelligent transportation, public safety, and autonomous driving. Traditional MOT algorithms, relying on local feature modeling and heuristic association rules, have reached their performance limits. This leads to significant performance degradation in complex...
Proceedings Article
Precision Fine-Tuning: Leveraging LoRA for Text-Only Adaptation in Multi-Modal Medical Models
Wenru Lu
Large Multimodal Models (LMMs), which can process both visual and textual information, have great potential in medicine and other professional fields. However, adapting these complex models to specific subdomains or tasks faces many challenges. Due to the high demand for computing resources and...
Proceedings Article
Adapter-Fusion: A Practical, Parameter-Efficient Framework for Composable Control in Text-to-Image Diffusion
Yunzhong Zheng
The surge of text-to-image diffusion models is an innovative step in the development of generative artificial intelligence. However, when such models are applied in production, the lack of precise control is a critical constraint. Existing methods introduce single control modalities. The...
Proceedings Article
Evaluating LoRA, QLoRA, and Full Fine-Tuning on Compact Language Models Under Limited GPU Resources
Congbo Ni
Fine-tuning language models can be surprisingly resource-intensive, even when the models themselves are small. This paper examines three approaches to adapting compact models to a simple classification problem: updating all model parameters, adding...
Proceedings Article
Research and Analysis of DCGAN in Different Application Fields
Chongyue Liu
The Deep Convolutional Generative Adversarial Network (DCGAN) is an important type of generative adversarial network. It integrates convolutional neural networks (CNNs) into the adversarial framework and performs well in image generation and data augmentation. This paper reviews and summarizes the basic technical...
Proceedings Article
Research and Analysis of Generative Adversarial Networks in the Field of Computer Vision
Longxuan Li
The integration of Generative Adversarial Networks (GANs) with various domains in computer vision has become one of the key topics in current research. Researchers have found that GANs have outperformed existing models in vertical applications, significantly enhancing and expanding model training performance...
Proceedings Article
Research and Analysis of VAE in Image and Data Analysis
Xuan Sheng
Deep generative models are widely used in image and data analysis to learn intricate patterns and to generate novel samples. Variational autoencoders (VAEs) are valued for their clear formulation and stable training, and are used for reconstruction, feature learning, and data augmentation....
Proceedings Article
A Novel Image-to-Image Model: MSF-CycleGAN
Yujing Wang
Unpaired image-to-image translation is often affected by structural distortions and semantic inconsistencies because it relies on pixel-level cycle-consistency constraints. To address these limitations, the paper proposes a multi-scale feature consistency cyclic generative adversarial network,...
Proceedings Article
Evaluating High-Resolution Vessel Mask-to-Fundus Translation under Non-Monotonic GAN Dynamics
Jiaqiang Yang
Clinically viable retinal fundus synthesis from vessel masks requires photorealistic appearance while maintaining anatomical agreement with the input structure. The task is treated as a structure-sensitive, high-resolution (512 × 512) paired translation problem, with a Pix2Pix-style model as a baseline....
Proceedings Article
Evaluating Sparse and Transformer-based Representations for Chinese Weibo Sentiment Analysis Across Data Scales and Noise Conditions
Yuying Zhao
This study presents a systematic comparison between traditional sparse representations and contextualised Transformer models for Chinese Weibo sentiment classification. Using a publicly available dataset of 10,500 annotated microblog posts, the analysis examines four key dimensions: text representation,...
Proceedings Article
Attention-Enhanced CycleGAN for Unpaired Image-to-Image Translation
Wenqi Zheng
Cycle-consistent GANs are widely used for unpaired image-to-image translation, but they often over-translate textures in regions that should remain largely unchanged (e.g., background grass or sky in horse ↔ zebra). This failure mode is encouraged by discriminators that score the full image uniformly,...