Define product features and capabilities, and own the architecture for the compute, memory, interconnect, and high-speed interface subsystems in the AI inference chip.
Collaborate with various software teams to co-optimize hardware features for AI workloads.
Collaborate with RTL designers to identify complex technical issues and risks. Review and guide RTL implementation, ensuring consistency with architectural intent and timing/power goals.
Collaborate with physical-design teams on area/floorplan refinement, timing targets, etc.
Define and document interface specifications, control/status logic, and pipeline structures.
Lead PPA analysis and trade-off discussions across RTL and architecture.
AI Workloads & HW-SW Co-Design
Collaborate closely with various software teams to co-optimize hardware features for real-world AI inference workloads.
Incorporate considerations such as quantization, sparsity, dataflow, scheduling, and memory bandwidth into architectural decisions.
Guide hardware features that improve programmability, debuggability, and long-term software scalability.
Modelling & Analysis
Develop and maintain high-level architecture and performance models.
Use simulation and architectural models to guide RTL-level improvements.
Validate model predictions against RTL or emulation results and refine accordingly.
Strong understanding of AI inference workloads, dataflows, quantization, and memory/bandwidth bottlenecks in edge deployments.