Ministry of Programming specializes in building start-ups and growing them into established brands and companies. The company was voted the most innovative company in Bosnia by the Foreign Investors Council and was recognized by Deloitte in 2019 as the 21st fastest-growing tech company in Central Europe. MOP is the first company in Bosnia and Herzegovina to be ranked on the Financial Times FT1000 list of fastest-growing European companies, at position 187. Our vision is to bring massive and positive socio-economic change to the world through technology.
We are a supercharged team of 180+ creative people, and we are looking forward to hiring a new colleague who wants to help us achieve ambitious goals! At Ministry of Programming, you are more than just an employee. We are building next-generation web and mobile applications that have a real impact on people's lives. You will have significant responsibility from day one: your work will have a visible effect, and your opinions and ideas will matter.
Do you want to be a part of this journey and help shape the future? Then you may well be who we are looking for!
We are seeking a qualified Site Reliability Engineer to join our team and provide technical leadership in the management and scaling of our services. In this role, you will collaborate with product teams to build, manage, and deploy infrastructure as code within a virtual computing and storage environment for digital media delivery and supply chain management. Your responsibilities will include empowering and aligning with Software Engineering Teams, coordinating efforts to architect systems, establishing shared standards, and documenting designs and prototypes. Additionally, you will contribute to developing and maintaining the techniques required for observability, instrumentation, metrics, and monitoring, and to educating teams on the use of these systems.
Ensure that our Kubernetes clusters are reliable, scalable, and performant, and that they can be extended to support new requirements
Prescribe and enforce service-level objectives (SLOs) and error budgets for production systems
Automate the provisioning and management of infrastructure hosted in AWS and GCP
Create automated systems for repetitive tasks, including self-healing and auto-scaling capabilities
Enforce access controls
Automate and tune static and runtime analysis to improve service security
Contribute to software system architecture
Participate in an on-call rotation
Implement change controls
Craft plans and procedures for disaster recovery
Familiarity with Linux and the UNIX philosophy
Proficiency in a scripting language such as Python or Bash
Proficiency with observability tools such as Prometheus, Grafana, and Sentry
Experience in a DevOps or Software Engineering role
Familiarity with software engineering fundamentals, including the application of data structures and algorithms
Experience operating Kubernetes or orchestrated containers (OCI) in a production environment
Familiarity with building and maintaining continuous delivery systems
Experience working with at least one of the major cloud providers (AWS or GCP preferred)
A background in building and managing highly available distributed systems
Ability to write infrastructure as code (e.g. Terraform, Ansible, Puppet, or Chef)
Comfort with networking concepts such as TCP/IP, DNS, and HTTP
A basic understanding of relational and non-relational database technologies and how to administer these systems in a production environment (e.g. MariaDB, MySQL, Elasticsearch)