About Architect
Architect is a frontier AI lab for chip design. We build AI models and tools for on-demand custom ASICs at scale. Our goal is to co-design custom ASICs alongside evolving ML workloads and enable a new era of domain-specific chips that unlock capabilities impossible with current hardware paradigms. Born out of Stanford Research, Architect blends AI with silicon, with a founding team from Anthropic, Google DeepMind, Meta SuperIntelligence, xAI, Apple, and Intel.
What You'll Do
As a Founding Member of the Technical Staff on the RTL Design team at Architect, you'll own the AI-driven microarchitecture and RTL design of mission-critical SoC blocks and sub-systems (SS) going into production silicon. You will define, drive, and revise the block-level microarchitecture specification for one of the fundamental HW accelerator blocks.
You are a fit for our current openings if you have hands-on design and block-owner experience with any of the following ASIC components: ML/AI accelerators, SIMD vector engines, DSPs, GPUs, on-chip memory SS, on-chip interconnect SS (NoCs, DMAs, etc.) with custom or standard protocols (AXI, etc.), IO and peripherals integration (PCIe, DDR, CXL, etc.), CPU/Host/controllers, Security, Compression.
As a lab, we are investing in building a world-class HW design team, so if you have relevant experience or a background that is not listed here, please still reach out to us!
- Own the AI-driven RTL design flow end-to-end on the front end: from code generation through incorporating feedback from lint, CDC, synthesis, and timing closure to close the design loop.
- Work directly with the principal architect to refine microarchitectural specs, resolve implementation trade-offs, and feed area/timing/power realities back into the architecture and internal AI systems.
- Define and maintain interface specifications (e.g. AXI, AXI-Stream, or custom-built) for block- and SS-level integration.
- Build and maintain RTL infrastructure for our in-house AI-driven flow: design automation scripts, regression flows, lint/CDC waivers, and integration collateral.
- Close collaboration with DV: Support DV bring-up with reference models, assertions, test plans, and architectural documentation for verification closure.
- Close collaboration with SW and ML: Support and guide our SW and ML experts in revising and improving our in-house AI flow based on your own experience.
- Support FPGA prototyping on Xilinx for early functional validation.
What We'd Like To See
Qualifications & Skills:
- Degree: Bachelor's, Master's, or PhD in Electrical Engineering, Computer Engineering, or a closely related field.
- Experience: 5+ years (10+ preferred) in RTL design, with at least one advanced-node tapeout.
- Domain Background: RTL design experience on specialized HW accelerators, such as SoCs/IPs integrating XPUs (NPU, GPU, AR/VR) or AI/ML accelerators. Ideally, you have worked on Apple Neural Engine, Qualcomm Hexagon NPU / AI Engine, Google Edge TPU, AMD XDNA, Samsung NPU, MediaTek APU, NVIDIA DLA blocks, or accelerators at Groq, Cerebras, MatX, d-Matrix, or similar.
- SystemVerilog: Clear, synthesizable, lint-clean RTL with strong design habits such as parameterization, modularity, reuse, and configurability.
- Block-Level Depth: Hands-on experience with block-specific compute datapaths and data movement, such as MAC arrays, vector units, accumulators, on-chip SRAM controllers and arbiters, DMA engines, scratchpad memory management, etc.
- SoC Methodology: Solid grasp of synthesis, timing constraints, clock domain crossings, reset strategies, AMBA protocols (AXI, AHB, APB), power management techniques, etc.
- Python: Strong skills for design automation, regression infrastructure, and tooling.
- PPA Ownership: Experience taking a block from RTL through synthesis and working with PD teams on timing/area/power closure.
- Leadership: Ability to lead RTL design efforts and grow into a team lead over time.
Bonus
- Low-power design techniques: clock gating, power gating, multi-voltage domains, UPF.
- FPGA prototyping experience (ideally Xilinx Vivado/Vitis).
- Familiarity with SIMD/VLIW execution pipelines or instruction-driven datapath design.
- Experience writing SVA assertions and functional coverage for design-side verification.
- Prior experience building and delivering IP in your block of expertise, such as DMA controllers, memory subsystems, interconnects, or similar SoC infrastructure blocks.
- Domain-specific expertise: Track record of research and development in energy-efficient, high-performance HW accelerators in your block of expertise.
What We Offer
- Competitive salary and meaningful equity stake
- Fast-paced startup with autonomy and visible impact
- Cutting-edge challenges at the intersection of AI and silicon design