Japan–France Joint Workshop

Workshop on LLM-Driven Code Generation and Automation for HPC and Scientific Computing

Monday, July 6, 2026  ·  13:30–18:35
LIP6, Sorbonne University, Paris

Registration

Attendance is free of charge, but registration is required.

The deadline for registering for the social dinner is June 12th.

Access

LIP6 Laboratory, room 25-26 105, 1st floor
Sorbonne University, Campus Pierre et Marie Curie
4 place Jussieu, Paris, France
https://www.lip6.fr/informations/comment.php?LANG=en

Generative AI based on large language models (LLMs) is rapidly transforming high-performance computing (HPC) and scientific research workflows. This workshop brings together researchers from France and Japan to explore three interconnected frontiers: LLM-driven HPC code generation and autonomous research systems that operate across compiled-language and job-scheduler environments; hardware design automation spanning semiconductor development, HDL generation, and programming for emerging AI processors; and LLM-assisted numerical computation including time-series analysis, floating-point reliability, and linear solvers. Through research presentations and open discussion, participants aim to identify new directions and build opportunities for future international collaborative research.

13:30–13:35
Opening
13:35–14:50
Session 1: Code Generation and Research Automation Using LLM
13:35–14:00
HPC-GENIE project - Generative AI for HPC code development
Daichi Mukunoki — Information Technology Center, Nagoya University
Rapid advances in coding AI are revolutionizing software development. This is equally true in the HPC field, although HPC presents challenges distinct from those of general code development. For example, there are various considerations beyond functional correctness, including architecture-specific performance optimization, support for GPUs and Fortran, selection of algorithms suited to the target environment, and control of numerical accuracy. At the Information Technology Center of Nagoya University, we are promoting the "HPC-GENIE" project, which focuses on applying generative AI to HPC code development. Our primary interest lies in developing AI agents and technologies that run every component, including the LLMs, within the user's local environment. In this talk, we will present our case studies to date and discuss our outlook for HPC code development in the era of generative AI.
This work was supported by JSPS KAKENHI JP25K24387, the Joint Usage/Research Center for Interdisciplinary Large-scale Information Infrastructures (JHPCN), and the JST Research and Development Program for Next-generation Edge AI Semiconductors (Grant Number JPMJES2511).
14:00–14:25
VibeCodeHPC: A Multi-Agent LLM Framework for Autonomous HPC Code Auto-Tuning — Toward an Agent-Driven Foundation for Semiconductor Design Workloads
Shun-ichiro Hayashi — Graduate School of Informatics, Nagoya University
VibeCodeHPC is a multi-agent LLM system in which multiple CLI-based agents coordinate to autonomously auto-tune workloads such as numerical kernels. Given a benchmark together with tuning requirements, the system explores optimization strategies, builds, executes, and iteratively improves performance. On representative HPC benchmarks, the multi-agent configuration consistently outperforms a single-agent baseline while exploring a wider variety of strategies. The framework is CLI-backend-agnostic and also supports local LLMs. The architecture itself is not kernel-specific: with appropriately authored requirement definitions (prompts), it can in principle drive a broader range of auto-optimization tasks — for example, semiconductor design workflows — positioning it as a foundation for agent-driven design and optimization workloads beyond HPC code generation.
This work was supported by the Joint Usage/Research Center for Interdisciplinary Large-scale Information Infrastructures (JHPCN) and the High-Performance Computing Infrastructure (HPCI) under Project ID jh250015. It was also partially supported by JSPS KAKENHI Grant Numbers JP23K11126 and JP24K02945. In addition, this work was supported by the JST Research and Development Program for Next-generation Edge AI Semiconductors (Grant Number JPMJES2511).
14:25–14:50
HPC-AutoResearch: Adapting Autonomous Research Systems for HPC through Split-Phase Execution
Takanori Kotama — Graduate School of Informatics, Nagoya University / RIKEN R-CCS
LLM-based autonomous research systems target only Python ML, leaving HPC—reliant on compiled languages and SLURM—untouched. We present HPC-AutoResearch, the first autonomous research system for HPC. It combines a Split-Phase Execution Model decomposing the pipeline into five phases (Planning, Setup, Coding, Compilation, Execution) inside Singularity containers with iterative repair, and a Compressed Inference Memory extending MemGPT with Core/Recall/Archival tiers. On a Himeno Benchmark task on an AMD EPYC 9554 SLURM cluster, HPC-AutoResearch alone scored 3/4 on all seven NeurIPS-style criteria (Accept, 7/10), outperforming Claude Code, Codex CLI, and Gemini CLI; ablating memory drops node success from 47.0% to 40.5%. Extensions to MPI and CFD/MD/first-principles workflows are discussed.
This study was supported by the JST Next-Generation Edge AI Semiconductor Research and Development Project JPMJES2511 and the Joint Usage/Research Center for Interdisciplinary Large-scale Information Infrastructures (JHPCN) (#jh250015).
14:50–15:10
Coffee Break
15:10–16:25
Session 2: Hardware Design
15:10–15:35
From Vibe Coding to Silicon: Local LLM Agents for AI-Assisted Semiconductor Design
Takahiro Katagiri — Information Technology Center, Nagoya University
Semiconductor design is increasingly limited not only by device technology, but also by the productivity of design description, verification, and iterative refinement. This talk presents Nagoya University’s HPC-GENIE activities, focusing on VibeCodeHPC, a multi-LLM-agent framework originally developed for autonomous code generation, execution, verification, and tuning of HPC software. In VibeCodeHPC, role-specialized agents such as the Project Manager, System Engineer, Programmer, and Continuous Deliverer collaborate through shared prompts, inter-agent communication, monitoring, and dynamic deployment. This architecture enables natural-language-driven “vibe coding” while keeping agents focused on requirements, implementation, validation, and deliverable management. Building on this concept, the talk discusses its extension from HPC programs to semiconductor design codes, including SystemC transaction-level models, Verilog/VHDL RTL modules, HLS descriptions, EDA scripts, testbenches, assertions, and design-space exploration workflows. A key direction is to combine local LLMs with domain-specific design rules, coding guidelines, simulation feedback, lint/formal checks, and version-controlled refinement loops. The goal is not to replace hardware designers, but to create secure local AI design partners that accelerate specification-to-code translation, improve verification productivity, and make complex silicon design workflows more interactive, reproducible, and explainable.
This work was supported by the JST Research and Development Program for Next-generation Edge AI Semiconductors (Grant Number JPMJES2511).
15:35–16:00
Parametrized HDL Code Generation For Activation Functions
Aurélien Delmotte — Sorbonne University
Implementing activation functions in hardware by hand is slow, error-prone, and hardware-dependent. This is especially true for the more exotic number formats that are becoming increasingly popular, where every new scenario requires a new implementation. In this talk, we will explore design challenges and solutions for implementing an efficient fixed-point softmax with custom precision and automatic pipelining.
16:00–16:25
AI Coding for Emerging AI Hardware
Daichi Mukunoki — Information Technology Center, Nagoya University
Processors designed specifically for AI computing are emerging. Programming such hardware is challenging because both the hardware and its dedicated programming language are unique and specialized, which makes them costly and difficult to learn. This talk will introduce an example of code development for Tenstorrent hardware (AI processors based on RISC-V) using coding agents such as Claude Code. The example focuses on using the AI processor for scientific computing workloads.
This work was supported by JSPS KAKENHI JP25K24387 and the JST Next-Generation Edge AI Semiconductor Research and Development Project JPMJES2511.
16:25–16:45
Coffee Break
16:45–18:00
Session 3: Numerical Computation
16:45–17:10
From Signals to Symbols: Enabling LLM-based Automation for Time Series Analysis
Xinye Chen — LIP6, Sorbonne University
Large language models have shown increasing promise in code generation and research automation, but their ability to assist with scientific workflows depends critically on whether domain data can be represented in a form they can understand. Time series data pose a particular challenge: they are continuous, noisy, high-dimensional, and not naturally aligned with the token-based interface of LLMs. In this talk, I will introduce LLM-ABBA, a framework that converts time series into compact symbolic sequences. This representation benefits downstream time-series tasks by making the data easier to interpret and work with. Although LLM-ABBA is not primarily built for code generation, its symbolic format can serve as an intermediate step for LLM-guided automation. For example, symbolic time series can help create code for feature extraction, motif discovery, anomaly detection, and other experimental tasks. I will explain how this focus on representation connects time series analysis with LLM-based research automation, and how it could help future systems both understand scientific signals and generate code to analyze them.
17:10–17:35
Benchmarking Large Language Models on Floating-Point Error Classification
Lisa Taldir — University of Perpignan
The study examines how Large Language Models (LLMs) can detect and classify floating-point arithmetic errors in software, which are subtle but potentially catastrophic. To evaluate this, the authors introduce InterFLOPBench, a benchmark comprising 90 C code instances and 1,130 test samples covering six error categories (cancellation, overflow, underflow, NaN, division by zero, and comparison errors), validated using FPChecker and Herbgrind. A dozen LLMs (including Gemini 2.5 Flash, GPT-4o, DeepSeek-R1, and Phi 4 reasoning) are tested. The evaluation framework treats floating-point error detection as a multi-label classification problem and employs the F1-score metric to measure performance. The results show that LLMs exhibit strong numerical reasoning, with the best models achieving micro F1 scores above 0.90. LLMs detect explicit errors such as comparison and division by zero well but struggle with subtler issues like underflow. LLMs are not yet meant to replace formal verification or dynamic analysis, but they are already effective as a semantic filter to locate suspicious code, as classification engines to categorize errors, and as explanation engines to detail why an operation is numerically unsafe. Combined with their ability to suggest more stable reformulations, they offer a promising complement to existing floating-point debugging workflows.
17:35–18:00
Can LLMs Help with Rounding Error Analysis for Matrix Computation Algorithms?
Takeshi Fukaya — Hokkaido University
Rounding error analysis plays an important role in numerical linear algebra, particularly matrix computation, because it provides a theoretical basis for assessing the reliability and validity of numerical algorithms. However, such analysis is often difficult, especially for researchers who are not familiar with numerical error analysis. It also requires a skill set that is substantially different from that needed for designing or implementing algorithms. In addition to fundamental knowledge of floating-point arithmetic and matrix analysis, it is necessary to combine known results appropriately and to bound error terms carefully, which are generally nontrivial tasks. In this talk, we report an initial exploration of whether and how LLMs can help with rounding error analysis. As a case study, we consider a tall-skinny QR factorization algorithm that we have recently developed and examine the process of carrying out rounding error analysis with support from a cloud-based LLM service. Rather than presenting a completed methodology or a definitive evaluation, this talk focuses on our observations from this preliminary attempt, including what kinds of assistance may be possible, where difficulties arise, and what precautions are needed when applying LLMs to theoretical analysis in scientific computing.
18:00–18:30
Discussion
The Future of LLM-Driven HPC and Scientific Computing
18:30–18:35
Closing
19:00 –
Social Dinner

Acknowledgements

Organizers

Contact

Daichi Mukunoki, Nagoya University
mukunoki <at> cc.nagoya-u.ac.jp