Expertise
Research & Projects
Code Review AI
GitHub
Built a multi-agent CLI tool that automates code review using NVIDIA Nemotron, reducing review turnaround from hours to minutes while maintaining senior-engineer-level thoroughness.
Manual code review is the single biggest bottleneck in development pipelines. Automating it with multi-agent LLM orchestration cuts cycle time, enforces consistency across teams, and frees senior engineers to focus on architecture instead of line-by-line review.
// applications
Automated pull request review for distributed engineering teams
Continuous code quality enforcement in CI/CD pipelines
Security vulnerability scanning before production deployment
ML Serving Optimization in Edge-Cloud
GitHub
Pioneered resource optimization techniques for ML model serving on heterogeneous edge devices, achieving a 25-30% efficiency improvement in multi-device deployments.
Edge ML inference is bottlenecked by heterogeneous hardware and limited resources. Optimizing serving across diverse edge nodes unlocks real-time AI at the network edge where latency matters most.
// applications
Autonomous vehicle perception on embedded GPUs
Smart retail real-time inventory tracking
Industrial quality inspection on edge cameras
Federated Learning Marketplace
Designed a quality-aware, cost-conscious federated ML marketplace framework enabling ML-as-a-Service across distributed edge nodes.
Data privacy regulations prevent centralized training in many domains. A federated marketplace enables collaborative model improvement while keeping sensitive data local to each participant.
// applications
Cross-hospital medical AI without sharing patient data
Financial fraud detection across banking networks
Distributed smart grid energy optimization
Real-time Manufacturing Process Optimization
Website
Designed and deployed a production ML pipeline integrating IoT sensor data with predictive analytics to reduce manufacturing operation costs.
Manufacturing downtime can cost millions per hour. ML-driven predictive maintenance and process optimization catch failures before they happen and continuously tune operations for peak efficiency.
// applications
Predictive maintenance on CNC machines
Real-time defect detection on assembly lines
Energy consumption optimization in factories
Explainable AI for Digital Twin Systems
GitHub
Developed a security-aware ML serving architecture that integrates explainability for real-time decision auditing in critical infrastructure digital twins.
Critical infrastructure decisions cannot be black boxes. Explainable AI provides transparent reasoning for automated decisions, enabling human oversight and regulatory compliance in high-stakes environments.
// applications
Power grid anomaly detection with audit trails
Smart building security threat analysis
Water treatment plant process monitoring
Opportunistic Data Operations for HPC
GitHub
Built a framework enabling developers and scientists to execute data operations during idle periods of long-running HPC applications, improving resource utilization in data-intensive computational workflows.
Long-running HPC jobs leave some resources idle during parts of their execution. Opportunistic scheduling reclaims these gaps for data tasks like analysis and ML, maximizing expensive supercomputer utilization without interfering with primary workloads.
// applications
In-situ data analysis during large-scale simulations
Opportunistic ML training on idle HPC nodes
Automated data preprocessing in scientific computing pipelines
Chainer-XP: Deep Learning on Intel Xeon Phi
Website
Developed a flexible framework enabling neural network training and inference on Intel Xeon Phi coprocessors, bridging the gap between deep learning workloads and many-core architectures.
Deep learning frameworks were not optimized for many-core coprocessors like Intel Xeon Phi. Adapting ANN workloads to these architectures unlocks cost-effective parallel computation for research labs without GPU clusters.
// applications
Neural network training on HPC clusters with Xeon Phi nodes
Cost-effective deep learning for research institutions
Parallel ANN inference on many-core architectures
pyMIC-DL: Deep Learning Library for Xeon Phi
Website
Built a Python library that enables existing deep learning frameworks to offload computation to Intel Xeon Phi coprocessors, providing a seamless bridge between high-level DL APIs and many-core hardware acceleration.
Deep learning libraries lacked native support for Intel Xeon Phi offloading. A transparent library layer lets researchers leverage many-core coprocessors without rewriting models, lowering the barrier to HPC-accelerated training.
// applications
Offloading DL training to Xeon Phi coprocessors in hybrid clusters
Accelerating scientific deep learning on non-GPU HPC hardware
Benchmarking neural network performance across compute architectures
Quality-driven Inference Orchestration for Multi-modality Machine Learning Systems at the Edge
T Nguyen, AD Nguyen, L Truong
Under Submission
On Optimizing Resources for Real-Time End-to-End Machine Learning in Heterogeneous Edges
MT Nguyen, HL Truong
Software: Practice and Experience 55(3), 541-558
EADRAN: An Edge Marketplace for Federated Learning
TD Cao, HT Nguyen, MT Nguyen, T Truong-Huu, HL Truong
Future Generation Computer Systems, 108046
Security Orchestration with Explainability for Digital Twins-based Smart Systems
MT Nguyen, AN Lam, P Nguyen, HL Truong
IEEE 48th Annual Computers, Software, and Applications Conference (COMPSAC)
Novel Contract-based Runtime Explainability Framework for End-to-End Ensemble Machine Learning Serving
MT Nguyen, HL Truong, T Truong-Huu
IEEE/ACM 3rd International Conference on AI Engineering (CAIN)
Optimizing Multiple Consumer-specific Objectives in End-to-End Ensemble Machine Learning Serving
MT Nguyen, HL Truong, P Arcaini, F Ishikawa
IEEE/ACM 17th International Conference on Utility and Cloud Computing (UCC)
Supporting Opportunistic Data Operations for Data-intensive Computational Applications
MT Nguyen, AD Nguyen, J Rantaharju, T Puro, M Rheinhardt, et al.
IEEE International Conference on Big Data (BigData), 3735-3744
QoA4ML - A Framework for Supporting Contracts in Machine Learning Services
L Truong, T Nguyen
IEEE International Conference on Web Services, 465-475
Demonstration Paper: Monitoring Machine Learning Contracts with QoA4ML
MT Nguyen, HL Truong
ACM/SPEC International Conference on Performance Engineering (ICPE)
Chainer-XP: A Flexible Framework for ANNs Run on the Intel Xeon Phi
TD Diep, MT Nguyen, NY Nguyen-Huynh, MT Chung, MT Nguyen, et al.
Modeling, Simulation and Optimization of Complex Processes HPSC 2018
Attention-based Neural Network: A Novel Approach for Predicting the Popularity of Online Content
MT Nguyen, DH Le, T Nakajima, M Yoshimi, N Thoai
IEEE 21st International Conference on High Performance Computing and Communications (HPCC)
Analyzing and Predicting the Popularity of Online Contents
MT Nguyen, T Nakajima, M Yoshimi, N Thoai
21st International Conference on Information Integration and Web Intelligence (iiWAS)
Optimizing Color-based Cooperative Caching in Telco-CDNs by Using Real Datasets
ATN Tran, MT Nguyen, TD Diep, T Nakajima, N Thoai
International Conference on Ubiquitous Information Management and Communication (IMCOM)
Analyzing and Visualizing Web Server Access Log File
MT Nguyen, TD Diep, T Hoang Vinh, T Nakajima, N Thoai
International Conference on Future Data and Security Engineering, 349-367
A Performance Study of Color-based Cooperative Caching in Telco-CDNs by Using Real Datasets
ATN Tran, MT Nguyen, TD Diep, T Nakajima, N Thoai
9th International Symposium on Information and Communication Technology (SoICT)
pyMIC-DL: A Library for Deep Learning Frameworks Run on the Intel Xeon Phi Coprocessor
ATN Tran, HP Nguyen, MT Nguyen, TD Diep, N Quang-Hung, N Thoai
IEEE 20th International Conference on High Performance Computing and Communications (HPCC)