Sitemap

A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.

Pages

Posts

Future Blog Post

less than 1 minute read

Published:

This post will show up by default. To disable scheduling of future posts, edit _config.yml and set future: false.

Blog Post number 4

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 3

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 2

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 1

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

portfolio

projects

Reversed Reversi(Alpha Zero)

Alpha Zero, Reinforcement Learning(RL), Evolutionary Algorithm(EA), Monte Carlo Tree Search(MCTS)

2022

Designed an AI for Reversed Reversi, using AlphaZero to build the model. Also applied an evolutionary algorithm to optimize the weights of the reinforcement learning network. The model placed 1st in its grade at SUSTech with 402 wins, 1 loss, and 1 draw, leading significantly in win rate.

GroceryHelper

Firebase, Computer Vision(CV), Frontend, Backend

2023

GroceryHelper is a mobile app that enables users to identify the quantity and expiration dates of groceries by simply uploading photos. The app intelligently recommends recipes based on the shelf life of the groceries and sends email reminders before the items expire. This helps users manage their pantry efficiently and reduce food waste by suggesting timely culinary uses for the ingredients they have on hand.

SUSTech Library Webpage

Data Mining, Database, Frontend, Backend

2023

The SUSTech Library Webpage utilizes data mining and analysis techniques to compile and analyze borrowing and access data from faculty and students across the entire school. This results in the production of reports that help the school optimize the borrowing experience. Additionally, personalized user profiles are provided for each user.

EvoXbench: All-In-One Neural Architecture Search Framework

Neural Architecture Search(NAS), Benchmark, Evolutionary Algorithm(EA)

2023

EvoXbench is an open-source library that serves as a comprehensive framework for the development of Neural Architecture Search (NAS) algorithms. It integrates all essential technologies and provides user-friendly Python and MATLAB interfaces, enabling the easy testing and development of algorithms. Key tasks included the collection, extraction, and curation of extensive NASBench datasets using Django’s ORM framework, and training an MLP surrogate model using PyTorch while overseeing the experimental processes.

Heterogeneous Bert

Bert, Neural Architecture Search(NAS), Knowledge Distillation, Evolutionary Algorithm(EA), Super-Network

2023

We have implemented a neural architecture search and a super-network training framework for heterogeneous BERT models. Given the search space and a teacher model, the super-network is automatically trained and the network structures are evaluated using balanced Pareto sampling. Compared to traditional neural architecture search frameworks, our approach achieves higher accuracy, faster convergence for sub-models, and superior performance under the same structural configurations.

Domain Specific Language for Online Judger

Domain Specific Language(DSL), Security, Remote procedure call(RPC), K8S, Docker, Sandbox, Rust, Compiler

2023

The Domain Specific Language for Online Judger is a DSL implemented using macros in Rust. It allows users without any Rust or programming knowledge to define evaluation tasks using simple statements. This DSL supports a variety of functions, including defining file systems, setting up runtime environments (such as nsjail, Docker, VMs), and specifying tasks. The compiler translates this DSL into Rust code for execution, ensuring high operational efficiency.

Multi Person Conference, Remote Control System

Network, Electron, Frontend, Backend, Multi-Platform, P2P

2023

Designed and implemented a cross-platform system for remote multi-person conferencing, chat, and remote control. I was responsible for all parts: I used the QUIC protocol to encapsulate packets, Electron for the multi-platform frontend, and P2P for decentralized permission verification. This project won the highest grade in the class!

SUSTech Online Judge (OJ) System: JCoder

Security, Remote procedure call(RPC), K8S, Docker, Sandbox, Database, Frontend, Backend

2023

JCoder is a scalable online judge system that evaluates code correctness across multiple programming languages, including C/C++, Java, JUnit, SQL, Python, MIPS, and Verilog. The system passed third-party penetration testing and is now officially used by the Computer Science Department at Southern University of Science and Technology (SUSTech), serving over 3,500 students in 13 courses.

SageCube

Large Language Model(LLM), Retrieval-Augmented Generation(RAG), Agent, Tool Call, Electron, Agent Platform

2023

SageCube is an AI assistant designed for desktop use and is currently being tested on Steam. Powered by an LLM, SageCube offers a visually appealing user interface. Users can interact with interactive Live2D and 3D virtual avatars using voice or text. SageCube supports various voice models for text-to-speech functionality, enabling users to interact primarily through voice. Additionally, through Steam’s Workshop, users can upload and install various tools and acquire agents, enhancing the functionality and customization of SageCube.

GentBench

Large Language Model(LLM), LLM Evaluation, Benchmark

2024

Benchmark and evaluation for Gentopia. A good ALM benchmark should target problems and tasks that are hard or unsolvable for LLMs alone (otherwise there is no point in paying the tool tax). Tasks in GentBench will be half public and half private. We open-source a demonstrative public benchmark to encourage agent tuning, but will use a private benchmark (of similar distribution) for fair evaluation.

Gentopia.AI

Large Language Model(LLM), Retrieval-Augmented Generation(RAG), Agent, Tool Call

2024

Gentopia is a lightweight and extensible framework for LLM-driven Agents and ALM research. It provides essential components to build, test and evaluate agents. At its core, Gentopia aims to assemble an agent with a single config, thus minimizing your effort in building, tuning, and sharing agents.

Merry Query

Large Language Model(LLM), Retrieval-Augmented Generation(RAG), Agent, Frontend, Backend

2024

MerryQuery is an AI-powered educational assistant that uses retrieval-augmented generation (RAG) to provide students with tailored responses based on course materials. This tool is designed to support both teachers and students. Teachers can input course materials and define data controls to prevent undesirable content generation. Students receive responses, with references, tailored to their previous interactions and course materials. You can visit our website for more information: https://exploremq.benyamintabarsi.com/

Magics.AI

Machine Learning(ML) System, Fine-Tuning, Large Language Model(LLM), K8S, Docker, Frontend, Backend

2024

Magics.AI is an open-source platform designed for the academic community, providing tools for fine-tuning and inference of large language models (LLMs) with reduced costs and latency. It supports distributed resource integration across institutions, features a Python SDK and user-friendly interface, and lowers the technical barrier for model fine-tuning. Additionally, Magics.AI supports Embodied AI, enabling LLMs to control robotic systems and interact with virtual environments, bridging language understanding and physical actions for interdisciplinary research.

publications

Gentopia.AI: A Collaborative Platform for Tool-Augmented LLMs

Binfeng Xu, Xukun Liu, et al., 2023

Gentopia is a lightweight and extensible framework for LLM-driven Agents and ALM research. It provides essential components to build, test and evaluate agents. At its core, Gentopia aims to assemble an agent with a single config, thus minimizing your effort in building, tuning, and sharing agents.

ToolNet: Connecting Large Language Models With Massive Tools

Xukun Liu, Zhiyuan Peng, DK Xu, 2024

We introduce ToolNet, a plug-and-play method that assists LLMs in handling massive tools. ToolNet organizes tools in a weighted directed graph (nodes represent tools and edges denote tool transitions) based on the tool-use trajectories produced by LLMs. An LLM navigates the graph by iteratively choosing the next tool from the current tool's successors until the task is resolved. Graphs are updated online, enabling adjustment to accommodate frequent tool updates and new tasks.
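The graph idea in this abstract can be illustrated with a minimal sketch. This is not the ToolNet implementation: the `ToolGraph` class, its method names, the additive weight update, and the `choose` and `is_done` callbacks (stand-ins for the LLM's decisions) are all assumptions made for illustration.

```python
from collections import defaultdict


class ToolGraph:
    """Hypothetical weighted directed graph of tool-to-tool transitions."""

    def __init__(self):
        # weights[src][dst] accumulates evidence for the transition src -> dst
        self.weights = defaultdict(dict)

    def add_trajectory(self, tools, reward=1.0):
        """Online update: add `reward` to every consecutive transition in a trajectory."""
        for src, dst in zip(tools, tools[1:]):
            self.weights[src][dst] = self.weights[src].get(dst, 0.0) + reward

    def successors(self, tool):
        """Return (tool, weight) pairs, heaviest first, ties broken by name."""
        edges = self.weights.get(tool, {})
        return sorted(edges.items(), key=lambda kv: (-kv[1], kv[0]))

    def navigate(self, start, choose, is_done, max_steps=10):
        """Walk the graph, letting `choose` pick the next tool among the successors."""
        path = [start]
        while not is_done(path) and len(path) <= max_steps:
            candidates = self.successors(path[-1])
            if not candidates:
                break  # dead end: no known transition from this tool
            path.append(choose(path, candidates))
        return path


graph = ToolGraph()
graph.add_trajectory(["search", "fetch", "summarize"])
graph.add_trajectory(["search", "fetch", "translate"])
graph.add_trajectory(["search", "fetch", "summarize"])

# A greedy chooser stands in for the LLM, which would see the candidates in its prompt.
greedy = lambda path, candidates: candidates[0][0]
finished = lambda path: path[-1] == "summarize"
route = graph.navigate("search", greedy, finished)
```

Restricting the choice to a tool's successors keeps the candidate list short at each step, which is the motivation the abstract gives for organizing tools as a graph.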

LawLLM: Law Large Language Model for the US Legal System

Dong Shu, Haoran Zhao, Xukun Liu, David Demeter, Mengnan Du, Yongfeng Zhang, 2024

We introduce LawLLM, a multi-task model designed for the US legal domain, excelling in Similar Case Retrieval (SCR), Precedent Case Recommendation (PCR), and Legal Judgment Prediction (LJP). LawLLM distinguishes between similar and precedent cases, providing clarity for future research. We apply customized data preprocessing, in-context learning, and advanced information retrieval methods to transform raw legal data into a trainable format. LawLLM outperforms existing baselines in zero-shot and few-shot scenarios, addressing key challenges in legal analytics.

Adaptive Draft-Verification for Efficient Large Language Model Decoding

Xukun Liu, Bowen Lei, Ruqi Zhang, Dongkuan Xu, 2024

We introduce an LLM decoding acceleration method that requires no fine-tuning. Our approach involves an adaptive draft-verification process that evolves over time to improve efficiency. We utilize a tri-gram matrix-based LLM representation to dynamically approximate the output distribution of the LLM, allowing the model to adjust to changing token probabilities during the decoding process.
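As a rough illustration of the draft-then-verify idea, the sketch below drafts tokens from tri-gram counts and keeps only the prefix the target model agrees with. It is a simplified toy, not the paper's algorithm: `TrigramDrafter`, `verify`, and the `target_next` callback (a stand-in for the LLM's greedy next-token choice) are hypothetical names, and a real system would check all drafted tokens in one batched forward pass rather than one call per token.

```python
from collections import Counter, defaultdict


class TrigramDrafter:
    """Hypothetical drafter: counts which token follows each pair of tokens."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def update(self, tokens):
        """Adapt over time by counting every tri-gram in newly accepted text."""
        for a, b, c in zip(tokens, tokens[1:], tokens[2:]):
            self.counts[(a, b)][c] += 1

    def draft(self, context, k):
        """Greedily propose up to k tokens from the tri-gram counts."""
        window = list(context[-2:])
        drafted = []
        for _ in range(k):
            followers = self.counts.get(tuple(window[-2:]))
            if not followers:
                break  # unseen pair: nothing to propose
            token = followers.most_common(1)[0][0]
            drafted.append(token)
            window.append(token)
        return drafted


def verify(context, drafted, target_next):
    """Accept drafted tokens while they match the target; on a mismatch, use the target's token."""
    accepted = []
    for token in drafted:
        expected = target_next(list(context) + accepted)
        if token != expected:
            accepted.append(expected)
            break
        accepted.append(token)
    return accepted


drafter = TrigramDrafter()
drafter.update("a b c a b c a b d".split())
context = ["a", "b"]
proposal = drafter.draft(context, 3)

# Toy target model: always continues the context with "c", "a", "x".
target_next = lambda ctx: ["c", "a", "x"][len(ctx) - len(context)]
accepted = verify(context, proposal, target_next)
drafter.update(context + accepted)  # feed accepted text back into the counts
```

Every accepted token matches what the target would have produced on its own, so the output is unchanged; the speedup comes from accepting several drafted tokens per verification round.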

MerryQuery: A Trustworthy LLM-Powered Tool Providing Personalized Support for Educators and Students

Benyamin Tabarsi, Aditya Basarkar, Xukun Liu, Dongkuan Xu, Tiffany Barnes, 2024

The potential of Large Language Models (LLMs) in education is not trivial, but concerns about academic misconduct, misinformation, and overreliance limit their adoption. To address these issues, we introduce MerryQuery, an AI-powered educational assistant using Retrieval-Augmented Generation (RAG), to provide contextually relevant, course-specific responses. MerryQuery features guided dialogues and source citation to ensure trust and improve student learning. Additionally, it enables instructors to monitor student interactions, customize response granularity, and input multimodal materials without compromising data fidelity. By meeting both student and instructor needs, MerryQuery offers a responsible way to integrate LLMs into educational settings.

talks

teaching

Teaching experience 1

Undergraduate course, University 1, Department, 2014

This is a description of a teaching experience. You can use markdown like any other post.

Teaching experience 2

Workshop, University 1, Department, 2015

This is a description of a teaching experience. You can use markdown like any other post.