zain ali

Product Engineer @ Chatly · Islamabad, Pakistan

I build AI swarms: multi-agent systems that perceive, decide, and act on their own. Researcher and builder: I read the paper, then ship the system, working where rigorous maths and distributed systems meet agentic AI. Right now I'm building two things: Chatly Make, the app builder inside Vyro.ai's Omni-Agent, and Conoid, the Internet of Evolving Agents.

Agentic AI · Multi-Agent Systems · Distributed Systems · Applied Maths

Experience

2026 · Chatly (Vyro.ai) · Product Engineer

2026 · Conoid · R&D Engineer

2025 · Victreat · Agentic Systems Engineer

2024 · MachVIS Lab, NUST · ML Research Engineer

2022 · NUST SEECS · B.Sc. Computer Science

Selected Work

Conoid

Conoid is the Internet of Evolving Agents: a drop-in SDK that turns any stateless agent stack (LangGraph, CrewAI, AutoGen) into a persistent organization where agents carry per-skill reputation, three-tier memory, and a collaboration history that builds across tasks.

Every framework spins up a crew, solves a task, and throws the crew away. Nothing learns. Nothing remembers. Nothing earns trust. Conoid is the first protocol to update four channels of agent identity (profile, memory, social edges, reputation) from a single judge call. On AgentsNet it beats the published baseline by +0.24 at N=100.
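The Tools list for this project names Beta-Bernoulli, so here is a minimal sketch of how per-skill reputation could be tracked that way. It assumes the judge hands back a pass/fail verdict per skill; `BetaReputation`, `AgentRecord`, and `record_verdict` are hypothetical names for illustration, not Conoid's SDK, and the uniform Beta(1, 1) prior is my assumption.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class BetaReputation:
    """Beta(alpha, beta) belief about an agent's success rate on one skill."""

    alpha: float = 1.0  # pseudo-count of successes (assumed uniform prior)
    beta: float = 1.0   # pseudo-count of failures (assumed uniform prior)

    def update(self, success: bool) -> None:
        # Conjugate Beta-Bernoulli update: one verdict adds one pseudo-count.
        if success:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    @property
    def mean(self) -> float:
        # Posterior mean, used here as the reputation score in [0, 1].
        return self.alpha / (self.alpha + self.beta)


@dataclass
class AgentRecord:
    """Hypothetical persistent record: one Beta belief per skill."""

    reputation: dict = field(
        default_factory=lambda: defaultdict(BetaReputation)
    )

    def record_verdict(self, skill: str, success: bool) -> float:
        # Fold a single judge verdict into the skill's belief.
        self.reputation[skill].update(success)
        return self.reputation[skill].mean
```

Keeping a separate belief per skill means an agent that is strong at one task type does not inherit that trust on another; a skill with no verdicts yet stays at the prior mean of 0.5.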

Year: 2026

Role: R&D Engineer

Scope: Multi-Agent Systems, Reputation & Trust, Memory, Evaluation

Device: Agent Infrastructure · SDK

Tools: Python, Beta-Bernoulli, LLM-as-Judge, LangGraph

Link: Read case study

Chatly Make

Chatly Make is the Lovable-style app builder inside Chatly's Omni-Agent: one prompt becomes generated code, a GitHub commit, a live deploy, and a preview thumbnail. I designed its architecture, first on ByteDance's DeerFlow, then on NousResearch's Hermes.

Both were long-horizon research/agent harnesses, the wrong shape for shipping code fast. I replaced the open-ended loop with a pipeline over warm Modal sandboxes, generated independent files in parallel, and overlapped the long-tail steps. End to end (code, commit, deploy, thumbnail), full-stack builds dropped from ~45 min to ~13 min (about 3.5×) and frontend builds from ~20 min to ~5 min (about 4×).
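The shape of that pipeline can be sketched with `asyncio`. This is an illustration under stated assumptions, not the production code: the step functions are hypothetical stubs whose sleeps stand in for LLM and sandbox calls, nothing here touches Modal's API, and overlapping deploy with thumbnail rendering is my guess at what "long-tail steps" covers.

```python
import asyncio


async def generate_file(path: str) -> tuple[str, str]:
    await asyncio.sleep(0.01)  # stand-in for one LLM codegen call
    return path, f"// generated contents of {path}"


async def commit(files: dict[str, str]) -> str:
    await asyncio.sleep(0.01)  # stand-in for a Git push
    return f"commit with {len(files)} files"


async def deploy(commit_ref: str) -> str:
    await asyncio.sleep(0.01)  # stand-in for the deploy step
    return f"deployed {commit_ref}"


async def render_thumbnail(files: dict[str, str]) -> str:
    await asyncio.sleep(0.01)  # stand-in for a preview screenshot
    return "thumbnail.png"


async def build(paths: list[str]) -> dict[str, str]:
    # Independent files are generated concurrently rather than one by one.
    pairs = await asyncio.gather(*(generate_file(p) for p in paths))
    files = dict(pairs)

    # The commit needs every file, so it waits for the whole batch.
    commit_ref = await commit(files)

    # Assumed overlap: deploy and thumbnail run side by side at the tail.
    deployed, thumbnail = await asyncio.gather(
        deploy(commit_ref), render_thumbnail(files)
    )
    return {"commit": commit_ref, "deploy": deployed, "thumbnail": thumbnail}
```

Calling `asyncio.run(build(["app.tsx", "api.ts"]))` runs the whole chain. With fixed stages like these, wall-clock time is set by the slowest item in each stage rather than by the sum of all items.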

Year: 2026

Role: Product Engineer

Scope: Agentic Codegen, Full-Stack Generation, Build Pipeline, Modal Sandboxes

Device: Web · Chatly (Vyro.ai)

Tools: TypeScript, Modal, LLM Orchestration, Pipelines, Deploy

Link: Read case study

A full portfolio site generated, committed, and deployed by Chatly Make

VisionTactical

Two off-the-shelf LLMs play a 3D first-person shooter on the same inputs a person gets: pixels in, keys and mouse out. A vision model reads the frame and sets strategy; a text model reads the game state and picks the next move. No reinforcement learning, no training of any kind.

I built the whole stack: the FPS in Ursina, the JSON state schema, and the perceive-decide-act loop on top. Screen in through MSS, keys and mouse out through pynput. In one clip a single enemy demolishes it. In the next, with the same models and no retraining, it reads the fight differently and wins.
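A minimal sketch of that loop, with every I/O edge injected as a callable so it runs without a game or a display. In the real stack `capture` would wrap an MSS screen grab and `act` would wrap pynput key and mouse calls; here they are placeholders. The parameter names and the `replan_every` cadence are assumptions, since nothing above says how often the vision model runs.

```python
import json
from typing import Callable


def run_loop(
    capture: Callable[[], bytes],              # grabs the current frame
    read_state: Callable[[], dict],            # returns the game state
    plan: Callable[[bytes], str],              # vision model: frame -> strategy
    choose_action: Callable[[str, str], str],  # text model: strategy + state
    act: Callable[[str], None],                # sends the keys or mouse move
    steps: int,
    replan_every: int = 5,                     # assumed cadence, not sourced
) -> list[str]:
    strategy = ""
    taken: list[str] = []
    for step in range(steps):
        # Perceive: refresh the strategy from pixels every few steps.
        if step % replan_every == 0:
            strategy = plan(capture())
        # Decide: the text model sees the strategy and the JSON game state.
        state_json = json.dumps(read_state())
        action = choose_action(strategy, state_json)
        # Act: push the chosen input out to the game.
        act(action)
        taken.append(action)
    return taken
```

Injecting the callables keeps the two models swappable and lets the loop be exercised with stubs before it is pointed at a live game window.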

Year: 2025

Role: Agentic AI

Scope: Vision-Language Models, Multi-Agent Systems, Game AI

Device: Autonomous Agent

Tools: Python, Ursina, Mistral 3.1, DeepSeek V3

Link: Read case study

TVFace

Most public face datasets are small, curated, or demographically skewed. TVFace is television: 2,609,210 faces mined from broadcasts across 22 networks, so the data carries the real long tail of pose, lighting, expression, and the same person aging over time.

Built at NUST's MachVIS Lab, where I worked as an ML Research Engineer: 2,609,210 faces across 28,955 identities at 224×224, with probabilistic age, gender, ethnicity, expression, and head-pose labels. It ships with a PyTorch loader and a research-only license. Published in Springer's Pattern Analysis and Applications, 2025.

Year: 2024

Role: ML Research

Scope: Computer Vision, Dataset, Facial Recognition, Fairness

Device: Dataset · Springer

Tools: Python, PyTorch, Clustering

Link: Read case study

TVFace: a montage of faces from the 2.6M-image dataset