Proceedings of the International Workshop on Advances in Deep Learning for Image Analysis and Computer Vision (IWADIC 2025)

102 articles
Proceedings Article

Peer-Review Statements

Kannimuthu Subramaniyam
All of the articles in this proceedings volume were presented at IWADIC on December 26-28, 2025 in Kuala Lumpur, Malaysia. These articles have been peer reviewed by the members of the Scientific Committee and approved by the Editor-in-Chief, who affirms that this document is a truthful...
Proceedings Article

Industrial Robot Control Based on Deep Learning

Yuebai Wang
The concept of Industry 4.0 originated in Germany, with its core being a highly digitalized factory interior that enables an efficient manufacturing system and even autonomous control of production. In the era of Industry 4.0, with the application of neural network learning, industrial robots can complete...
Proceedings Article

The Application of Vision-Based Navigation Techniques on Drones

Xiaohong Gui
With the development of unmanned aerial vehicles (UAVs), more and more of them are being deployed in real working conditions, and they need a powerful navigation system to operate smoothly. However, the commonly used navigation systems, such as the global positioning system (GPS), are not always reliable...
Proceedings Article

YOLOv8-CAACA: A Context-Aware Adaptive Confidence Adjustment and Target Fusion Algorithm for Pavement Crack Recognition in Complex Scenarios

Muhan Bai
To address the issues in current crack detection from road images—such as inaccurate identification of long cracks, misdetection and missed detection of cracks in complex environments, and splitting of a single crack into multiple segments—this paper proposes an improved YOLOv8 algorithm (YOLOv8s-CAACA)...
Proceedings Article

Application of Robotics Technology in Extreme Environments

Kexin Li, Tinghe Na, Kaiming Zhang
Extreme environment robots can replace humans working in dangerous scenarios such as the deep sea, space and the nuclear industry, but their applications face many challenges. Technically, they need extreme environmental adaptability, stable communication and autonomous decision-making ability. Key...
Proceedings Article

Technology and Application of Computer Vision in Smart Driving

Yiheng Zhou
Intelligent driving is a systematic project that involves the realization of multiple links in a collaborative manner. Environmental perception is an important link in an intelligent driving system. Machine vision can acquire rich and accurate data through cameras, and through the processing of this...
Proceedings Article

Prediction of Pre-Diabetes Based on Random Forest

Qihan Li
Diabetes, as one of the most severe chronic diseases globally, has seen an increase in prevalence in recent years rather than a decrease. Early screening for diabetes relies on traditional biochemical indicators, resulting in a high rate of missed diagnoses and insufficient resources at the grassroots...
Proceedings Article

Short-term Subway Passenger Flow Prediction on Holiday Based on LightGBM and LSTM

Yuhan Xia
Traditional predictive models struggle to accurately forecast short-term holiday passenger flow in urban rail transit systems. Intense fluctuations and nonlinear patterns often cause infrastructure strain, overcrowding, delays, and safety risks. Addressing this gap is vital for effective transit management,...
Proceedings Article

Short-term Traffic Flow Prediction for Expressway based on ARIMA: Compared with LSTM

Jinghan Zou
Predicting short-term traffic flow holds significant value for the operation and management of highway traffic. As a result, it’s crucial to develop a feasible short-term traffic flow forecast model and use traffic flow data for prediction in an efficient manner. In this study, for predicting traffic...
Proceedings Article

Battery Health Prediction of New Energy Vehicles Based on LightGBM

Haolong Li
In the era of rapid development of new energy vehicles, the state of health (SOH) of lithium-ion batteries has a direct impact on the endurance and safety of new energy vehicles, so predicting it is very important. This study uses LightGBM as a data prediction...
Proceedings Article

Application of Biometric Technology in Academic Examinations

Qiman Huang
Educational examinations are critical for national talent selection. In recent years, the number of examinees has surged due to growing demands for academic advancement, professional title evaluations, and occupational qualifications. This expansion brings new management challenges, as conventional cheating—such...
Proceedings Article

A CNN-RF-SVM Hybrid Method for Traffic Congestion Monitoring

Jiaxu Zhu
With the development of the automotive industry and increasing demand for personal mobility, the growing number of vehicles on roads has led to traffic congestion and environmental pollution. Accurate traffic congestion monitoring has become a prominent research focus. The integration of surveillance...
Proceedings Article

Corporate Bankruptcy Prediction in the U.S. Using Random Forest and XGBoost Algorithms

Zihao Lan
Corporate bankruptcy prediction is crucial to investors, financial institutions and regulators, because it supports early risk warning, enhances capital allocation efficiency, and contributes to both financial market resilience and enterprise sustainability. Traditional statistical methods have limited...
Proceedings Article

Bird Species Identification Using YOLO Neural Network

Yawei Li
Ecological conservation efforts increasingly rely on biodiversity indicators, with bird populations serving as critical sentinels of ecosystem health. Bird conservation is fundamental to maintaining biodiversity and health. Thus, bird species monitoring constitutes a crucial part of conservation efforts....
Proceedings Article

Short-Term Passenger Flow Prediction of Suzhou Metro Using Random Forest and LSTM

Kai Zhang, Lixun Zhuang
In the operation of urban public transportation, the volatility of subway passenger flow stands as a key issue affecting operational efficiency and passenger experience. Congestion during peak passenger flow periods, wasted transport capacity during off-peak hours, and sudden changes in passenger flow...
Proceedings Article

Multi-Person Pose Estimation: Method Classification and Cross-Dataset Performance Analysis

Zikun Li
Finding the important features of each human body in the picture and accurately allocating those features to each individual is the main challenge of multi-person pose estimation. Multi-person pose estimation can provide support for multiple downstream tasks and overcome the limitation of single...
Proceedings Article

Path Planning Algorithm for Intelligent Robot

Jiayi Xu
The task of intelligent robot path planning is to automatically generate the optimal motion path from the starting point to the target point on the basis of achieving safety, efficiency, and energy balance. The development of related technologies has improved the adaptability and efficiency of path planning,...
Proceedings Article

Autonomous Driving Vehicle Detection Methods under Different Low-Light Scenes

Jiacheng Fan
Vehicle detection is a core task in environmental perception for autonomous driving, and its performance directly affects driving safety. However, low-light scenes, due to issues such as low light levels, atmospheric scattering, and sensor occlusion, result in image distortion, difficulty in edge detection,...
Proceedings Article

Intelligent Detection of Crop Pests and Diseases Based on Deep Learning: Rice Pests and Tomato Leaf Diseases

Zexi Li
At the critical stage of the development of smart agriculture, integrating machine vision and deep learning technologies to achieve precise monitoring of tomato leaf diseases and rapid identification of rice pests is of great significance for enhancing agricultural production efficiency and ensuring...
Proceedings Article

A Survey on Deep Learning-Based Integrated Perception-Cognition-Control Systems for Autonomous Mobile Robots

Wenqi Han
The mobile robots currently widely adopted in industrial and logistics sectors integrate various technologies, with deep learning emerging as a new driving force and breakthrough for enhancing robotic efficiency and precision. Integrating its technology into each component module of the robotic system’s...
Proceedings Article

Analysis of Visual Navigation Systems in Autonomous Driving

Yizhan Zhang
As a key technology for environmental perception and precise positioning in autonomous driving, the performance of visual navigation systems directly impacts vehicle safety and efficiency in complex environments. This system primarily relies on visual cameras to capture rich road scene information, utilizing...
Proceedings Article

Breakthroughs and Technological Innovations in Homo Sapiens-Based Artificial Intelligence: Development Overview of the Machine Homo Sapiens Field

Peiliang Yan
This paper examines the impact of breakthroughs and technological innovations in Homo sapiens-based artificial intelligence on the development of the machine Homo sapiens field, aiming to lower the dependency on traditional AI models and enhance autonomous capabilities. The article first introduces key...
Proceedings Article

Comprehensive Analysis of Intelligent Welding Technology Based on Machine Vision

Zhuangzhuang Liu
With the increasing demands for welding efficiency and accuracy, intelligent welding technology has developed rapidly, and machine vision plays a significant role in it. In terms of image preprocessing, a system of grayscale processing, filtering and denoising, and image enhancement is formed to reduce...
Proceedings Article

Analysis of Autonomous Driving Control Strategies Based on Deep Learning

Yumin Cai
As a core component of intelligent transportation systems, the development of autonomous driving technology is of milestone significance for enhancing traffic safety, optimizing traffic efficiency, and improving travel experience. Control strategies, as a key execution link in the “perception-decision-control”...
Proceedings Article

Applications of GAN-Based Extension Methods in Image Processing

Yang Ding
In the era of rapid AI advancement, Generative Adversarial Networks (GANs) stand as one of the most disruptive innovations in deep learning, deeply integrated into every facet of human society. This paper focuses on the optimization pathways within the GAN technology system, systematically reviewing...
Proceedings Article

Cross-Domain Applications of Multi-Dimensional Data Alignment Technology

Fengshuo Kou
Against the backdrop of the rapid development of intelligent technology, cross-modal data fusion is driving innovations across various fields. As a key supporting technology, 2D and 3D data alignment technology is becoming increasingly prominent in value. 2D data (e.g., camera images) is rich in semantic...
Proceedings Article

PGeoCLIP: Acceleration on Image geo-localization Using Precomputed Features

Chengwuzhou Wu
Image-based geo-localization refers to predicting the geographic location from an image. A novel image-to-GPS retrieval approach, GeoCLIP, demonstrated outstanding performance on distance-threshold metrics, but its substantial training time and computational overhead present considerable challenges....
Proceedings Article

Object Detection Based on the DETR Method

Yiru Wang
One of the computer vision research hotspots is object detection. Its aim is to accurately and quickly identify objects in images and locate their positions, converting visual information into understandable and actionable intelligence. With the success of the Transformer architecture in the field of...
Proceedings Article

Character Image Editing via Segment-Anything Model and In-Context Edit Integration

Zhuoran Jia
In recent years, the development of computer vision tasks for image segmentation has become relatively mature. Image editing tasks based on instructions can achieve powerful image modification by natural language prompts. However, existing instruction-based image editing driven by natural language or...
Proceedings Article

Construction and Deduction of a Multimodal Traffic Prediction System: From Baseline Models to Fusion Innovation

Zijun Chen, Zhengyuan Zhou
With the rapid development of intelligent transportation systems, urban traffic flow prediction has become a core issue in urban management and traffic optimization. Traditional traffic prediction methods often rely on single-modal data, such as historical speed or flow, which cannot fully utilize the...
Proceedings Article

Advances in Intelligent Control and Collaborative Technologies for Industrial Robots

Yixin Luo
This article provides a systematic review of the current research status and development trends of intelligent control and collaboration technologies for industrial robots. As the demand for flexibility and personalization in manufacturing continues to rise, artificial intelligence is deeply integrating...
Proceedings Article

Tooth Detection Technology for Oral Disease Diagnosis

Jie Li
With the constant development of digital medical technology, oral disease diagnosis is gradually shifting from traditional manual examination to automated detection based on image analysis. Tooth detection technology, as an important part of intelligent dental imaging, plays a crucial role in improving early disease...
Proceedings Article

Bionic Mechanical Structures of Rescue Robots in Complex Disaster Environments

Wenxuan Yu
Rescue robots are crucial equipment for responding to complex disaster environments, and their bionic mechanical structure design directly determines their operational effectiveness. With the frequent occurrence of extreme natural disasters worldwide, the adaptability of rescue equipment to complex...
Proceedings Article

The Development of Face Recognition Technology Based on Deep Learning

Xiaoxun Huang
With the continuous progress of society, the development of facial recognition technology, and the improvement of people’s safety awareness, researchers have begun to study facial recognition technology in greater depth. Continuous research on deep learning technology can...
Proceedings Article

Review on Autonomous Robot Mobility Based on Visual Deep Learning

Xun He, Tianyi Xu, Lo San Yuan
Dynamic autonomous navigation in complex environments has become an increasingly popular topic in robotics research. Automatic obstacle avoidance and path planning are crucial for mobile robots. Traditional methods, such as simultaneous localization and mapping (SLAM) technology,...
Proceedings Article

Intelligent Question Answering System Based on Multimodal Fusion

Mengcong Zhang
This article explores the application of multimodal learning in intelligent question answering, emphasizing the importance of integrating data from multiple modalities to fully understand complex scenarios. The article reviews the development of intelligent question answering systems, from the early...
Proceedings Article

Comprehensive to the Textual Hallucination in Generative AI

Yiyang Li
Generative AI has performed strongly in many areas in recent years, especially large language models, which have done very well at writing articles, answering questions, and supporting learning. However, these models sometimes make mistakes, such as making up factual content, or giving...
Proceedings Article

Multimodal Question Answering: Method Evolution, Challenges and Prospects

Haopeng Li
With the breakthroughs in cross-modal technology of artificial intelligence, multi-modal question answering (MMQA), as a key research direction connecting image, text and voice information, has increasingly significant application value in fields such as barrier-free services and education. This paper...
Proceedings Article

The Principles and Functions of Various Models for Text Style Transformation

Haolun He
Text style transfer is a significant research task in natural language processing, with its core objective being to transform the style of a text while maintaining the original semantic content. This paper reviews and summarizes existing research from the perspectives of supervised learning, semi-supervised...
Proceedings Article

Analysis, Comparison and Application Scenarios of Nighttime Infrared Image Technology

Zijun Jin
As a breakthrough technology that transcends human visual limitations, nighttime infrared imaging demonstrates immense potential and application value across military, civilian, and industrial sectors. This paper systematically reviews the research progress of nighttime infrared imaging technology, focusing...
Proceedings Article

Progress in Welding Defect Detection Based on Visual Technology

Xucheng Feng
Welding defect detection is a crucial step in ensuring the safety of industrial manufacturing. The traditional manual visual detection and radiographic detection methods have low efficiency, high costs, and poor adaptability, which make it challenging to meet the needs of large-scale manufacturing. With...
Proceedings Article

Research on Industrial Robot Grasping Based on Visual Technology

Changye Du
Against the backdrop of the rapid development of industrial automation, industrial robots have become the core force for the transformation and upgrading of manufacturing. Integrating visual technology into the grasping of industrial robots changes the traditional operation mode, endowing robots with the...
Proceedings Article

Unmanned Aerial Vehicle Target Tracking Systems in Complex Environments Based on Visual Enhancement Technology

Tianlin Guo
Unmanned aerial vehicle target tracking is one of the key areas of study in the field of computer vision and intelligent control, widely used in aerospace, intelligent driving, security surveillance, smart agriculture, and search and rescue, which can also carry out aerial multi-view multi-platform long-distance...
Proceedings Article

Machine Learning-Based Motion Planning for Robots

Yumin Shi
Motion planning for robots is a crucial technology that enables autonomous navigation and the performance of complex tasks. Traditional methods have problems like low efficiency and poor adaptability in high-dimensional spaces and dynamic environments. In recent years, machine learning methods, including...
Proceedings Article

From Credit Scoring to Artificial Intelligence-Driven Loan Default Prediction

Zaiyu Zhang
Loan default prediction is important to both credit allocation and portfolio risk management. The conventional scorecard is based on predefined handcrafted features and linear assumptions, thus ignoring nonlinear and temporal characteristics of borrowers’ behaviors. This article introduces a novel sandwiched...
Proceedings Article

Food Safety Monitoring Based on Machine Vision

Pengyu Xie
Food safety has become a global challenge due to the inefficiency and subjectivity of traditional inspection methods. These approaches often rely on destructive sampling and manual visual analysis, which lead to high latency, low scalability, and limited accuracy. Machine vision provides a non-destructive,...
Proceedings Article

Analysis of Intelligent Welding Strategies Based on Machine Vision

Junying Tong
With the fast development of industrial automation, welding is one of the core production processes, and its quality and efficiency affect the product quality and production efficiency of enterprises. Non-contact inspection, real-time information acquisition, and high-precision recognition...
Proceedings Article

Retrieval-Augmented Generation: Advances, Applications, and Future Directions in Knowledge-Grounded Language Modeling

Hancheng Yu
Retrieval-Augmented Generation (RAG) has emerged as a pivotal innovation in natural language processing (NLP), integrating information retrieval with generative modeling to overcome the limitations of static, parameter-based large language models. By dynamically retrieving relevant external knowledge...
Proceedings Article

The Current Development of Large Language Models in Medical Questions and Answers--Center on the Models of Med-PaLM and Med-PaLM2

Yueqi Zhu
Large language models (LLMs) have brought significant technological change to the field of medical question answering. They can help medical personnel improve diagnostic efficiency and also help the public more easily understand their own physical health status. This article focuses on the Google...
Proceedings Article

Comparative Research on the Unbalanced Processing of Traffic Accident Data and Multi-model Performance

Tao Hu
Traffic accident prediction is of great significance in the construction of smart cities; however, traffic accident data are imbalanced. In response to this problem, three public data sets, Addis Ababa Sub-city Accident, US Accident and Barcelona Accident, were selected....
Proceedings Article

Research Analysis of Optimization of Temperature Prediction Model Based on Random Forest

Tianhua Jiang
Temperature prediction plays a crucial role in meteorological analysis, scientific planning of agricultural production, precise allocation of energy management and so on. But traditional temperature prediction methods often fall short in terms of prediction accuracy and fail to meet expectations. To...
Proceedings Article

Threshold-Aware Machine Learning for Heart Disease Prediction

Yanzhou Qian
The decision threshold in clinical heart disease risk assessment is critical for real-world utility but frequently overlooked. This study investigates threshold-aware prediction using a clinical dataset of 919 records (14 features), with heart disease presence (num > 0) as the target. Preprocessing...
Proceedings Article

An Improved CNN–LSTM Model for Daily Gold Price Prediction

Yi Lin
Gold prices play a pivotal role in the global financial system, but they often experience short-term volatility, long-term trends, and frequent regime switching. To address this, this paper proposes a new architecture, building on the CNN architecture and combining ordinary convolution with dilated convolution...
Proceedings Article

Prediction of Cardiovascular Diseases Based on Mainstream Machine Learning Algorithms

Peiyuan Liu
Cardiovascular diseases (CVDs) are one of the most active topics in current medical research due to their status as the leading cause of mortality worldwide. Studies have made some progress in developing early detection tools. However, there is still a research gap in accurate, efficient and widely...
Proceedings Article

UAV Autonomous Flight Obstacle Avoidance Technology

Shengyao Duan
With the maturity of drone technology and its increasing application in agriculture, logistics, and even medical fields, the issue of its flight safety in low-altitude complex environments has become increasingly critical. Autonomous flight obstacle avoidance is the core technology to ensure the reliability...
Proceedings Article

Research and Analysis on Chain of Thought (CoT) Reasoning and Interpretability in Large Language Models

Pengyu Liao
As an important reasoning technique for Large Language Models (LLMs), Chain-of-Thought (CoT) has made breakthroughs in logical consistency, interpretability, and task accuracy by guiding the model to generate answers through stepwise reasoning. This paper systematically reviews the research progress of CoT reasoning in the...
Proceedings Article

Computational Approaches to Stock Price Prediction

Jiachen Liu
Stock price prediction is one of the most significant and difficult problems of computational finance. As the volume of data and computing capabilities have rapidly increased, forecasting models have shifted toward a data-driven forecasting method based on machine learning and deep learning. In this...
Proceedings Article

Progress in Diagnosis and Prediction of Common Cancers: Multi-Cancer Characteristics, Technical Applications, and AI Model Practices

Bowen Zhang
Cancer remains a major global health burden, and its early diagnosis and accurate prediction are crucial for improving patient prognosis. This paper reviews three common types of cancers—gastric cancer, skin cancer, and brain tumors—focusing on their pathological mechanisms, pathogenic factors, clinical...
Proceedings Article

Multi Asset Price Time Series Prediction Based on LSTM, GRU, and MLP

Yaqi Liu
Financial asset prices have nonlinear, stochastic and multidimensional characteristics, making them difficult to capture with linear forecasting models. Therefore, this paper uses three neural networks, Multilayer Perceptron (MLP), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU), to achieve...
Proceedings Article

Pneumonia X-ray Image Classification Methods Based on Deep Learning

Xinyu Zhou
Pneumonia is a common respiratory disease, and timely and accurate diagnosis is crucial for clinical treatment. The traditional chest X-ray diagnostic method relies mainly on the professional knowledge of radiologists to carry out the diagnostic work....
Proceedings Article

Machine Learning Applications in Stock Index Prediction

Xuanmeng Huang
Stock index forecasting has been an essential indicator of market research since the emergence of financial analysis. Its concept encompasses many dimensions, such as investment decision-making, risk control, and policy evaluation. In recent years, with the rapid development of artificial intelligence...
Proceedings Article

Enhancing the Efficiency of Cell Classification in ScRNA-seq Data by Weakly Supervised Learning

Xinyi Zhou
Single-cell RNA sequencing (scRNA-seq) is a genomics technology that enables research of cellular diversity and operation through assessing gene expression. However, no widely available automated classification technology currently exists, and distinguishing cells within scRNA-seq datasets still relies...
Proceedings Article

Evaluation of Galaxy Morphology Classification with Machine Learning and Deep Learning

Yuhan You
Through the development of technologies, astronomical imaging surveys have increased significantly, so that more galaxy images are taken than ever, which makes the traditional manual galaxy classification infeasible. For this reason, adopting automated machine learning techniques is important in replacing...
Proceedings Article

Fine-Grained Deep Learning for Gleason Grading in Prostate Histopathology

Kaixin Chen
Prostate cancer is one of the most frequent cancers in male patients, and its diagnosis and prognosis involve histopathological image analysis, especially Gleason grading. It is a delicate activity that involves identifying subtle architectural variations in tissue patterns which are however disadvantaged...
Proceedings Article

GANs in Image Generation and Denoising: From Infrastructure to Real-World Applications

Haoyang Sun
Generative Adversarial Networks (GANs) are currently the mainstream generative models, widely used in image, audio, and other data processing. Over the course of ten years, many GAN models have emerged that are adapted to different use cases. With the constant improvement of GAN model structure, they...
Proceedings Article

Skin Cancer Image Generation Using WGAN-GP Based on HAM10000 Dataset

Hongye Hao
Skin cancer is a highly prevalent disease, and its early diagnosis relies on the recognition of skin lesion images. However, the acquisition cost of labeled medical image samples is extremely high, which makes it easy for model training to suffer from data imbalance. To address this issue, this paper...
Proceedings Article

The Summary of Price Prediction Methods of Gold-related Financial Products

Hengxin Hua
After the dismantling of the Bretton Woods System, the financial attributes of gold gradually emerged, and more and more gold-related financial products, such as gold futures, have appeared on the market in recent years. The price of gold-related financial products has reflected and influenced economic...
Proceedings Article

Intelligent Classification and Identification of Similar Respiratory System Diseases

Zhuoyang Liu
This study focuses on symptom classification models for the four most common respiratory diseases (COVID, FLU, COLD, and ALLERGY). The aim is to address the challenge of distinguishing between similar and troublesome respiratory illnesses while maintaining accuracy and minimizing wasted time....
Proceedings Article

Heart Disease Prediction Using Machine Learning Models

Daoyi Cheng
This study is aimed at the binary classification task of heart diseases, using 918 subjects and 11 clinical and diagnostic features from public datasets. Before training, the data underwent missing and anomaly checks, numerical features were standardized, category-specific features were encoded, and...
Proceedings Article

Applications of Partial Differential Equations

Ziran Zhang
Partial differential equations (PDEs) are descriptions of continuous changes. Using methods/algorithms, PDE-based models are applied in physical calculations, image processing, machine learning, etc. This paper discusses the applications of PDEs, especially in image denoising and inpainting. When denoising,...
Proceedings Article

Using Machine Learning Methods to Predict Mobile Phone Prices

Wenbo Lu
This study analyzes the price range and configurations of mobile phones on the market to help emerging mobile phone companies accurately position their product prices and better compete in the mobile phone market. The study uses machine learning techniques to predict mobile phone prices, employing three...
Proceedings Article

Machine Learning, Ensembles, and Knowledge Graphs for Diabetes Prediction

Jiayi Jiang
This article reviews the progress of machine learning in early prediction and risk identification of diabetes, focusing on three methods: traditional models (such as Logistic Regression, SVM, RF, etc.), ensemble learning (such as Bagging, Boosting, Stacking and weighted voting) and reasoning based on...
Proceedings Article

Research and Analysis of Large Language Model Synthetic Data Generation and Bias and Illusion Problems

Bolin Zhang
This dissertation examines the relevance of large language models (LLMs) in enhancing time-series data applications, e.g., climate forecasting, traffic control, and finance, in which quality data is critically important for prediction and decision-making. The importance of overcoming the limitations...
Proceedings Article

Research and Analysis of Traffic Accident Severity Prediction Based on Data Augmentation and Feature Interpretation

Yuhan Chen
Predicting the severity of traffic accidents is crucial to traffic safety management and accident prevention. However, in the actual data, the number of minor accidents far exceeds that of serious accidents, resulting in deviations in most types of predictions. In order to alleviate the category imbalance...
Proceedings Article

Paradigm Evolution of Industrial Surface Defect Detection: The Underlying Logic and Fundamental Challenges from Supervised Classification to Unsupervised Anomaly Localization

Zihan Zhang
Industrial appearance defect detection is a key aspect of quality control in smart manufacturing. The core challenge lies in how to effectively apply models that perform well in laboratory environments to complex, dynamic, and unpredictable real-world industrial scenarios. This challenge is specifically...
Proceedings Article

Diagnosis, Optimization, and Verification of Localized Failures in Traffic Flow Prediction Models

Yunjian Tang
In real traffic control scenarios, multi-intersection prediction models often exhibit “local failure” due to distribution differences and concept drift between intersections, which degrades overall prediction performance and affects the reliability of scheduling. Improving...
Proceedings Article

Application Status of Deep Reinforcement Learning in Optimal Power Flow (OPF) Problem in Renewable Energy Power System

Shuai Yuan
This paper presents an in-depth analysis of the challenges facing optimal power flow (OPF) in power systems with a high proportion of renewable energy, which increases uncertainty and tightens real-time requirements. Since the traditional iterative optimization method only...
Proceedings Article

Overview of Traffic Flow Prediction Research Based on Graph Neural Networks

Yi Shao
The acceleration of global urbanization and the continuous increase in the number of vehicles have made traffic jams a severe challenge for cities around the world. Predicting traffic flow accurately and in real time has become a core element in building intelligent transportation systems. To better...
Proceedings Article

Pure Vision and Multi-modal Perception in Autonomous Driving: Performance, Challenges and Architectural Insights

Yaode Han, Kaiyang Li
Environmental perception is a core technical component of autonomous driving systems, and its architecture design directly determines the vehicle’s ability to understand its surroundings and, with it, driving safety. With the development of artificial intelligence and sensor technology, pure visual...
Proceedings Article

A Review of Recent Methods for Traffic Prediction with Few Data

Zhaowei Huang
Predicting traffic accurately is very important for building smart cities. But the deep learning models that achieve state-of-the-art performance today—especially graph neural networks (GNNs)—require massive amounts of past data. This is a problem in new urban areas or on roads that have recently been...
Proceedings Article

Research and Analysis of Stock Prediction Based on Deep Learning

Hanyu Zhang
In recent years, with relentless strides in artificial intelligence, deep learning can now be applied to almost all fields, including health care, scientific research, and financial analytics. Among these applications, predicting and assessing stock market trends using deep...
Proceedings Article

The Influence of the Shape Symbol Paired Grouping Gradient Mean Iterative Method (SSPGG-IM) on the Medical Image Classification Problem in the Context of Small Samples

Zhengyang Li, Jiayi Zhang
This study focuses on the analysis of small sample medical images. It first reviews the research value and application status of small sample learning in this field, as well as the limitations of traditional methods. The research is based on three types of core small-sample learning methods as the technical...
Proceedings Article

Evaluating the Impact of Self-Attention in Pix2Pix for Image-to-Image Translation

Zheng Liao
In this work, a self-attention module is incorporated into the generator of the Pix2Pix model, and its effects on image-to-image translation with the Facades dataset are assessed. The proposed architecture adds self-attention at the bottleneck of the U-Net generator to capture global context while retaining...
Proceedings Article

Multi-Scale Patch Discriminator for Cycle-Consistent Unpaired Image Translation

Yichen Liu
Unpaired image-to-image translation often relies on a single-scale PatchGAN discriminator that emphasizes local textures while providing limited guidance on global structure, which can lead to shape distortion and unstable optimization at higher resolutions. This paper investigates a drop-in Multi-Scale...
Proceedings Article

Data Cleaning and Visualization Analysis Based on Pandas and Matplotlib: A Case Study of the Titanic Dataset

Kaiwen Zuo
In the era of big data, data cleaning and visualization are essential for extracting valuable information from raw data. The central challenge of such work is knowing what works for whom and how. It requires systematic practice in data preprocessing and exploratory visualization. The classic Titanic...
Proceedings Article

Generative Adversarial Networks in Medical Imaging: Applications, Challenges, and Emerging Trends (2020–2025)

Taoyu Chen
Generative Adversarial Networks (GANs) have developed into an important class of generative models in medical imaging, providing distinct properties for problems that are difficult because of limited data and costly annotation. In 2020–2025, GANs have also shifted from basic augmentation to more advanced...
Proceedings Article

Research and Analysis on Image Style Transfer Technologies

Runxin Yang
Image style transfer technologies seek to imbue original images with a desired artistic style while preserving their content structure. Early image style transfer algorithms mostly employed non-parametric methods to achieve style transfer. Recently, there have been a series of breakthroughs...
Proceedings Article

Research and Analysis of Core Data in the Closed-loop of Autonomous Driving Data

Tielin Wang
Since 2020, the global autonomous driving (AD) market has entered the mass-market phase of L2+ Advanced Driver Assistance System (ADAS) penetration, and the adoption rate of these systems in China is believed to reach more than 65 percent by 2025. A vehicle with AD capabilities produces 4–20 terabytes...
Proceedings Article

Framework Design and Performance Comparative Analysis of Large Language Models

Mingxuan Deng
With the emergence of the Transformer architecture, large language models (LLMs) have made breakthroughs in the fields of language understanding, reasoning, code generation and multimodal interaction. This research systematically sorts out the technical evolution path of mainstream LLMs in the past five years,...
Proceedings Article

Research and Analysis of Multi-Object Tracking Combined with Transformer Method

Junhao Yu
Multi-object tracking (MOT) is widely used in intelligent transportation, public safety, and autonomous driving. Traditional MOT algorithms, relying on local feature modeling and heuristic association rules, have reached their performance limits. This leads to significant performance degradation in complex...
Proceedings Article

Precision Fine-Tuning: Leveraging LoRA for Text-Only Adaptation in Multi-Modal Medical Models

Wenru Lu
Large multimodal models (LMMs), which can process both visual and textual information, have great potential in medicine and other professional fields. However, adapting these complex models to specific subdomains or tasks faces many challenges. Due to the high demand for computing resources and...
Proceedings Article

Adapter-Fusion: A Practical, Parameter-Efficient Framework for Composable Control in Text-to-Image Diffusion

Yunzhong Zheng
The surge of text-to-image diffusion models is an innovative step in the development of generative artificial intelligence. However, when such models are applied in production, the lack of precise control is a critical constraint. Existing methods introduce single control modalities. The...
Proceedings Article

Evaluating LoRA, QLoRA, and Full Fine-Tuning on Compact Language Models Under Limited GPU Resources

Congbo Ni
Fine-tuning language models can be surprisingly resource-intensive, even when the models themselves are small. This paper applies three approaches to adapting compact models to a simple classification problem: updating all model parameters, adding...
Proceedings Article

Research and Analysis of DCGAN in Different Application Fields

Chongyue Liu
The Deep Convolutional Generative Adversarial Network (DCGAN) is an important type of generative adversarial network. It integrates convolutional neural networks (CNNs) into the adversarial framework and performs well in image generation and data augmentation. This paper reviews and summarizes the basic technical...
Proceedings Article

Research and Analysis of Generative Adversarial Networks in the Field of Computer Vision

Longxuan Li
The integration of Generative Adversarial Networks (GANs) with various domains in computer vision has become one of the key topics in current research. Researchers have found that GANs have outperformed existing models in vertical applications, significantly enhancing and expanding model training performance...
Proceedings Article

Research and Analysis of VAE in Image and Data Analysis

Xuan Sheng
Deep generative models are widely used in image and data analysis to learn intricate patterns and to generate novel samples. Variational autoencoders (VAEs) are valued for their clear formulation and stable training, and are used in reconstruction, feature learning, and data augmentation....
Proceedings Article

A Novel Image-to-Image Model: MSF-CycleGAN

Yujing Wang
Unpaired image-to-image translation is often affected by structural distortions and semantic inconsistencies because it relies on pixel-level cycle-consistency constraints. To address these limitations, the paper proposes a multi-scale feature consistency cyclic generative adversarial network,...
Proceedings Article

Evaluating High-Resolution Vessel Mask-to-Fundus Translation under Non-Monotonic GAN Dynamics

Jiaqiang Yang
Clinically viable retinal fundus synthesis from vessel masks requires photorealistic appearance while maintaining anatomical agreement with the input structure. The task is treated as a structure-sensitive, high-resolution (512 × 512) paired translation problem, with a Pix2Pix-style model as a baseline....
Proceedings Article

Evaluating Sparse and Transformer-based Representations for Chinese Weibo Sentiment Analysis Across Data Scales and Noise Conditions

Yuying Zhao
This study presents a systematic comparison between traditional sparse representations and contextualised Transformer models for Chinese Weibo sentiment classification. Using a publicly available dataset of 10,500 annotated microblog posts, the analysis examines four key dimensions: text representation,...
Proceedings Article

Attention-Enhanced CycleGAN for Unpaired Image-to-Image Translation

Wenqi Zheng
Cycle-consistent GANs are widely used for unpaired image-to-image translation, but they often over-translate textures in regions that should remain largely unchanged (e.g., background grass or sky in horse ↔ zebra). This failure mode is encouraged by discriminators that score the full image uniformly,...