zain ali

Product Engineer @ Chatly · Islamabad, Pakistan

I build AI swarms: multi-agent systems that perceive, decide, and act on their own. I'm a researcher and a builder: I read the paper, then ship the system, working where rigorous maths and distributed systems meet agentic AI. Right now I'm building two things: Chatly Make, the app builder inside Vyro.ai's Omni-Agent, and Conoid, the Internet of Evolving Agents.

Agentic AI · Multi-Agent Systems · Distributed Systems · Applied Maths

Experience

2026 · Chatly (Vyro.ai) · Product Engineer
2026 · Conoid · R&D Engineer
2025 · Victreat · Agentic Systems Engineer
2024 · MachVIS Lab, NUST · ML Research Engineer
2022 · NUST SEECS · B.Sc. Computer Science

Selected Work

Conoid

Conoid is the Internet of Evolving Agents: a drop-in SDK that turns any stateless agent stack (LangGraph, CrewAI, AutoGen) into a persistent organization where agents carry per-skill reputation, three-tier memory, and a collaboration history that builds across tasks.

Every framework spins up a crew, solves a task, and throws the crew away. Nothing learns. Nothing remembers. Nothing earns trust. Conoid is the first protocol to update four channels of agent identity (profile, memory, social edges, reputation) from a single judge call. On AgentsNet it beats the published baseline by +0.24 at N=100.
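
The Tools list for this project names Beta-Bernoulli, which suggests per-skill reputation can be tracked as a Beta posterior over binary judge verdicts. Here is a minimal sketch of that idea. The names (`SkillReputation`, `AgentReputation`, `record`, `score`) and the uniform Beta(1, 1) prior are assumptions for illustration, not Conoid's actual API.

```python
from dataclasses import dataclass, field


@dataclass
class SkillReputation:
    """Beta posterior over one agent's success rate on one skill."""
    alpha: float = 1.0  # pseudo-count of successes (assumed uniform prior)
    beta: float = 1.0   # pseudo-count of failures (assumed uniform prior)

    def record(self, success: bool) -> None:
        # Bernoulli observation: a pass bumps alpha, a fail bumps beta.
        if success:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    @property
    def mean(self) -> float:
        # Posterior mean of the success probability.
        return self.alpha / (self.alpha + self.beta)


@dataclass
class AgentReputation:
    """Hypothetical per-agent container: one Beta posterior per skill."""
    skills: dict[str, SkillReputation] = field(default_factory=dict)

    def record(self, skill: str, success: bool) -> None:
        self.skills.setdefault(skill, SkillReputation()).record(success)

    def score(self, skill: str) -> float:
        # Unseen skills fall back to the prior mean.
        return self.skills.get(skill, SkillReputation()).mean


rep = AgentReputation()
for verdict in (True, True, True, False):  # stand-in judge verdicts
    rep.record("sql", verdict)
print(round(rep.score("sql"), 3))
```

Keeping a pseudo-count pair per skill means a new agent starts at the prior mean and moves toward its observed pass rate as verdicts accumulate.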

Year: 2026
Role: R&D Engineer
Scope: Multi-Agent Systems, Reputation & Trust, Memory, Evaluation
Device: Agent Infrastructure · SDK
Tools: Python, Beta-Bernoulli, LLM-as-Judge, LangGraph

Chatly Make

Chatly Make is the Lovable-style app builder inside Chatly's Omni-Agent: one prompt becomes generated code, a GitHub commit, a live deploy, and a preview thumbnail. I designed its architecture, first on ByteDance's DeerFlow, then on NousResearch's Hermes.

Both were long-horizon research/agent harnesses, the wrong shape for shipping code fast. I replaced the open-ended loop with a pipeline over warm Modal sandboxes, generated independent files in parallel, and overlapped the long-tail steps. End to end (code, commit, deploy, thumbnail), build time dropped from ~45 min to ~13 min for full-stack and from ~20 min to ~5 min for frontend, roughly 3.5× to 4× faster.
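
The pipeline shape described above (parallel file generation, then overlapped tail steps) can be sketched with `asyncio`. This is an illustration only, written in Python rather than the project's TypeScript. Every function here (`generate_file`, `commit`, `deploy`, `thumbnail`, `build`) is a hypothetical stub, and the choice to overlap the thumbnail with commit and deploy is an assumption, since the text does not say which steps overlap.

```python
import asyncio


async def generate_file(path: str) -> tuple[str, str]:
    # Stub for one model call that writes a single independent file.
    await asyncio.sleep(0.01)
    return path, f"// generated contents of {path}"


async def commit(files: dict[str, str]) -> str:
    # Stub for pushing the generated files as one commit.
    await asyncio.sleep(0.01)
    return f"commit with {len(files)} files"


async def deploy(commit_id: str) -> str:
    # Stub for the deploy step, which depends on the commit.
    await asyncio.sleep(0.01)
    return f"deployed {commit_id}"


async def thumbnail(files: dict[str, str]) -> str:
    # Stub for rendering a preview image (assumed not to need the deploy).
    await asyncio.sleep(0.01)
    return f"thumbnail of {len(files)} files"


async def build(paths: list[str]) -> dict[str, object]:
    # Stage 1: independent files are generated concurrently.
    files = dict(await asyncio.gather(*(generate_file(p) for p in paths)))

    async def commit_then_deploy() -> str:
        # These two steps stay sequential because deploy needs the commit.
        return await deploy(await commit(files))

    # Stage 2: the commit/deploy chain overlaps with thumbnail rendering.
    deployed, thumb = await asyncio.gather(commit_then_deploy(), thumbnail(files))
    return {"files": files, "deploy": deployed, "thumbnail": thumb}


result = asyncio.run(build(["index.html", "app.ts", "styles.css"]))
print(result["deploy"], "|", result["thumbnail"])
```

With a fixed pipeline, wall-clock time is bounded by the slowest file plus the longest tail chain, not by the sum of every step.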

Year: 2026
Role: Product Engineer
Scope: Agentic Codegen, Full-Stack Generation, Build Pipeline, Modal Sandboxes
Device: Web · Chatly (Vyro.ai)
Tools: TypeScript, Modal, LLM Orchestration, Pipelines, Deploy
A full portfolio site generated, committed, and deployed by Chatly Make

VisionTactical

Two off-the-shelf LLMs play a 3D first-person shooter on the same inputs a person gets: pixels in, keys and mouse out. A vision model reads the frame and sets strategy; a text model reads the game state and picks the next move. No reinforcement learning, no training of any kind.

I built the whole stack: the FPS in Ursina, the JSON state schema, and the perceive, decide, act loop on top. Screen in through MSS, keys and mouse out through pynput. In one clip a single enemy demolishes it. In the next, same models, no retraining, it reads the fight differently and wins.
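
A rough sketch of the perceive, decide, act loop follows, with every side effect injected as a callable so it runs without a game or a display. In the real stack `capture` would wrap an MSS screen grab and `act` would drive pynput. The function name `run_loop`, the state and action dictionaries, and the `strategy_every` cadence are assumptions for illustration, not the project's actual code.

```python
from typing import Any, Callable

State = dict[str, Any]
Action = dict[str, Any]


def run_loop(
    capture: Callable[[], Any],                 # perceive: grab the current frame
    read_state: Callable[[], State],            # perceive: read the JSON game state
    plan_strategy: Callable[[Any], str],        # vision model: frame -> strategy
    pick_action: Callable[[State, str], Action],  # text model: state -> next move
    act: Callable[[Action], None],              # act: send keys and mouse
    max_steps: int,
    strategy_every: int = 5,                    # assumed cadence for the vision model
) -> list[Action]:
    strategy = "default"
    history: list[Action] = []
    for step in range(max_steps):
        frame = capture()
        state = read_state()
        # Refresh the strategy only every few steps; pick a move every step.
        if step % strategy_every == 0:
            strategy = plan_strategy(frame)
        action = pick_action(state, strategy)
        act(action)
        history.append(action)
        if state.get("done"):
            break
    return history


# Usage with stand-in callables in place of the models and I/O libraries.
sent: list[Action] = []
history = run_loop(
    capture=lambda: b"fake-pixels",
    read_state=lambda: {"enemy_visible": True, "done": False},
    plan_strategy=lambda frame: "flank",
    pick_action=lambda state, strategy: {"key": "w", "strategy": strategy},
    act=sent.append,
    max_steps=3,
)
print(len(history), history[0])
```

Injecting the callables keeps the loop testable offline and makes it easy to swap either model without touching the control flow.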

Year: 2025
Role: Agentic AI
Scope: Vision-Language Models, Multi-Agent Systems, Game AI
Device: Autonomous Agent
Tools: Python, Ursina, Mistral 3.1, DeepSeek V3

TVFace

Most public face datasets are small, curated, or demographically skewed. TVFace is television: 2,609,210 faces mined from broadcasts across 22 networks, so the data carries the real long tail of pose, lighting, expression, and the same person aging over time.

Built at NUST's MachVIS Lab, where I worked as an ML Research Engineer. The dataset spans 28,955 identities at 224×224, with probabilistic age, gender, ethnicity, expression, and head-pose labels. It ships with a PyTorch loader and a research-only license. Published in Springer's Pattern Analysis and Applications, 2025.

Year: 2024
Role: ML Research
Scope: Computer Vision, Dataset, Facial Recognition, Fairness
Device: Dataset · Springer
Tools: Python, PyTorch, Clustering
TVFace: a montage of faces from the 2.6M-image dataset