Join Oak's mission to enhance education with innovative tech as our new Platform Engineer
Overview
£65,460
100% remote
Expires 07-10-24 - Applications reviewed after this date
Oak is dedicated to transforming the educational landscape by providing high-quality curriculum and lesson resources for teachers and students. As a cutting-edge organisation leveraging the latest technologies, Oak presents a unique opportunity for a Site Reliability Engineer to spearhead the evolution of our processes and systems.
- Collaborate with cross-functional teams to enhance user experience through observability principles.
- Lead initiatives to improve application stability and automate processes.
- Champion site reliability engineering practices across the organisation.
- Proven experience with event-driven architectures and serverless technologies.
- Expertise in cloud infrastructure monitoring and observability tools, particularly Datadog.
- Proficiency in JavaScript/TypeScript for web application development.
- Familiarity with cloud platforms and Infrastructure as Code, specifically Terraform.
- Strong collaborative skills and a passion for continuous learning and improvement.
Enjoy a suite of benefits, including generous annual leave, pension contributions, remote working options, and regular team offsites. We celebrate diversity and encourage applications from all qualified individuals. If you're ready to make a significant impact in the education sector with your technical expertise, apply now!
Oak provides school teachers and pupils with the highest quality curriculum and lesson resources across all subjects and age groups. In this role, you will work with engineering, product and research colleagues to build confidence using observability principles that aid our understanding of our users and help us continually improve our products.
We work together in product squads alongside designers, researchers and education experts, regularly releasing new features and improvements to give teachers and their pupils quick and easy access to the highest quality learning resources.
As a young organisation, we have been able to leverage the latest technologies to rapidly build and deliver game-changing products. Now that we've proven ourselves and are established, we want to mature our processes to ensure we are getting the best out of the technology and remain able to respond quickly to business needs. We see this role as being a key part of that change.
You will be tasked with raising our monitoring and observability to a high standard across all our key applications, while working closely with engineering teams to help them improve the stability of their applications and give engineers a greater sense of ownership.
You will also champion site reliability engineering principles and be a key driver of automation by working alongside other members of the platform team, helping to improve the overall developer experience.
Candidates must have a good understanding of SRE principles and the value they bring to an organisation. While a good grounding in development practices, security fundamentals and infrastructure operation is key, specific technical skills are less important than a passion for automation, an ability to understand complex systems and a keenness to learn.
WE RESERVE THE RIGHT TO CLOSE THE ROLE EARLY
Responsibilities
- Lead the continuous improvement of the observability, performance, and reliability of our web applications (Next.js, JavaScript, TypeScript, Node) and serverless functions (Google Cloud Functions, Cloudflare), deployed on PaaS infrastructure (Netlify, Vercel, Cloudflare).
- Promote and nurture a culture of quality across the product and engineering department, enabling teams to use SLOs and SLAs to maintain a high quality of service delivery.
- Take ownership of our observability, monitoring, logging and reporting solutions to ensure they are easy to use and provide development teams with the information they need to understand service quality, resolve problems quickly, and get meaningful insights into application behaviour.
- Identify and implement ways in which automation can be used to speed up development, secure systems or improve the quality of the services we provide.
- As a member of the Oak Team, you will contribute to the wider success and culture of the organisation and support and role model our five values: create the right environment, be a great colleague, own your role but work for the team, make things happen, and keep getting better.
- Work in cross-functional and product-oriented squads with colleagues from across the organisation, as required. Oak has a strong focus on collaboration and mentoring.
- Deputise for other members of the Platform team and take on other general responsibilities as required.
Knowledge, skills, and experience
- The ideal candidate would have strong professional experience leading the continuous improvement of event-driven architectures using Serverless technologies such as Google Cloud Run, AWS Lambda or Azure Serverless.
- Considerable experience in designing and implementing monitoring, observability and reporting solutions for complex cloud infrastructures within a major cloud provider (GCP, AWS, Azure). In production we’re using Datadog as our main monitoring platform.
- Confident in understanding and maintaining web application code and able to design and build small apps, preferably using JavaScript/TypeScript.
- Experience working with cloud computing platforms and familiarity with Infrastructure as Code tools. We’ve chosen Terraform as our Infrastructure as Code tool.
- Comfortable promoting and leading a spirit of collaboration with a range of technical and non-technical stakeholders.
- The successful candidate will have a desire to contribute in all areas to ensure Oak is successful. You will be comfortable working at pace, with a range of digital systems (including proprietary ones as required) and you will continuously look at ways that the team can keep getting better. You will be excellent at working as part of a remote team, building relationships and managing your time effectively.
Benefits
Below is a selection of just some of our benefits:
- 25 days annual leave plus one day for each year in service (maximum of 28 days)
- Additional 'Oak closure days' at Christmas/New Year
- 11% employer contribution to pension with no minimum employee contribution
- Our full-time hours are 36 per week. We work a half day each Friday or take every other Friday off
- We are a fully remote organisation. We’ll help you get set up to work comfortably and provide options for working in a shared office space if preferable
- We are remotely based across the UK, yet come together in person for termly offsites to make great things happen and build stronger connections (and have fun!)
If Oak sounds like somewhere you could do your life’s best work, then we would love to hear from you. We use the Applied platform to support diversity in our recruitment.
Oak is an equal opportunities employer and welcomes applications from all suitably qualified persons regardless of their race, sex, disability, religion/belief, sexual orientation or age. We particularly encourage applications from Black and minority ethnic candidates who are currently under-represented.