Senior HPC Architect
U.S., Canada, UK, and elsewhere
Position: Senior HPC Architect
As a Senior HPC Architect at Qvelo, you will be responsible for leading the design, development, and implementation of high-performance computing (HPC) architectures that meet the complex and evolving needs of our clients. This role requires deep expertise in HPC systems, cloud integration, parallel computing, and advanced storage solutions. You will work closely with clients to assess their requirements, architect custom HPC solutions, and oversee their deployment to ensure optimal performance, scalability, and efficiency. As a senior member of the team, you will also mentor junior engineers and help shape the strategic direction of our HPC solutions.
Key Responsibilities:
- Lead the design and architecture of HPC systems tailored to specific client needs, including high-performance clusters, storage solutions, and network infrastructures.
- Develop scalable, high-performance computing architectures for large-scale simulations, AI/ML workloads, big data analytics, and scientific research.
- Integrate cloud-based and hybrid HPC environments, leveraging platforms such as AWS, Azure, and Google Cloud to extend on-premises capabilities.
- Oversee the deployment and configuration of HPC solutions, ensuring that they are optimized for performance, reliability, and cost-efficiency.
- Collaborate with clients and internal teams to gather requirements, conduct performance analyses, and deliver architectural recommendations for complex HPC use cases.
- Design and implement parallel computing strategies (e.g., MPI, OpenMP) that optimize application performance across multi-node environments.
- Evaluate and recommend technologies for HPC systems, including processors (CPUs, GPUs), accelerators, interconnects (InfiniBand, Ethernet), and storage architectures.
- Implement high-speed interconnects and network configurations to ensure low-latency, high-throughput communication between compute nodes and storage systems.
- Lead benchmarking and performance tuning efforts, ensuring that HPC systems meet or exceed performance expectations for specific workloads.
- Ensure security and compliance in HPC architectures by implementing best practices for data protection, system access, and secure networking.
- Provide technical leadership and mentorship to junior architects, engineers, and system administrators, fostering a collaborative and innovative team environment.
- Stay up to date with the latest trends and advancements in HPC technologies, ensuring that Qvelo remains a leader in delivering cutting-edge HPC solutions.
Requirements:
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related technical field. Ph.D. or advanced experience in HPC or a similar domain is a plus.
- 10+ years of experience in designing, architecting, and deploying HPC systems, including large-scale clusters and parallel computing environments.
- Extensive knowledge of high-performance computing technologies, including processors (x86, ARM, GPUs), memory hierarchies, storage systems, and networking.
- Proven expertise in parallel computing frameworks (MPI, OpenMP, CUDA) and workload management systems (SLURM, PBS, LSF).
- Experience with cloud-based HPC solutions and hybrid architectures, including expertise in deploying HPC workloads on AWS, Azure, or Google Cloud.
- Strong understanding of high-performance interconnects (InfiniBand, RoCE) and their impact on HPC performance and scalability.
- Knowledge of high-performance storage solutions, including parallel file systems (Lustre, GPFS) and advanced storage architectures for large-scale data management.
- Experience in benchmarking, performance analysis, and system tuning to optimize HPC systems for specific workloads.
- Strong understanding of security best practices in HPC environments, including data encryption, access control, and secure networking protocols.
- Excellent communication skills, with the ability to articulate complex technical concepts to both technical and non-technical stakeholders.
- Experience in providing technical leadership and mentorship, guiding teams in the design and implementation of high-performance solutions.
Preferred Qualifications:
- Experience in AI/ML workloads and optimizing HPC systems for deep learning, machine learning, and data analytics applications.
- Knowledge of quantum computing concepts and the ability to integrate quantum simulation workflows with existing HPC infrastructures.
- Familiarity with containerization technologies (Docker, Singularity) and their role in modern HPC environments.
- Certification in cloud platforms (AWS Certified Solutions Architect, Google Cloud Professional Cloud Architect, etc.) or related HPC technologies.
- Experience with DevOps practices for HPC, including automation and orchestration frameworks (Ansible, Terraform, Kubernetes).
Department
CTO Office
Employment Type
Contract
Location
Remote or Hybrid (depending on your flexibility)
Workplace Type
Hybrid/Remote
Compensation
Competitive, based on experience
Security Clearance
A Canadian, U.S., or NATO security clearance is desirable but not mandatory. Some projects will require applicants to obtain a Secret-level clearance or higher.
Why Join Us?
As a Senior HPC Architect at Qvelo, you’ll have the opportunity to lead large-scale, high-impact projects at the forefront of HPC and AI technologies. You will play a key role in shaping the architectural strategy for clients across a variety of industries, including scientific research, financial services, healthcare, and government. You’ll work with a team of talented professionals in a collaborative, innovative environment where your expertise will directly influence the future of high-performance computing solutions. We offer opportunities for professional growth, continuous learning, and the ability to work with some of the most advanced technologies in the field.