View on GitHub

sustainable-software-enginnering

List of resources on Sustainable Software Engineering

Roles, Responsibilities & Sustainability

User perspective

User procurement decisions unlikely affected by individual cost of energy consumption for updates
Update availability (cadence) and user experience (update duration, background installation vs. requiring restarts) are paramount
Better quality software -> less need for patch updates
Higher performance update process -> quicker installation__ times
Demand sustainability by leveraging exisiting forms of functional and non functional requirements
Communicate requirement transparency to vendors

As a user
I want
my updates to be

quick
- Network usage (download time)
- CPU + IOPS usage (installation time)
lightweight
- My device remains usable for other tasks while the update installs
- Mobile data usage
- energy consumption heat, battery usage, fan noise
frequent
- early availability of new features
- prevent premature end of life (compatiblility & security patches)
not too frequent
- “Participants report common unexpected restarts, while half also reported growing concern about the state of the device if an update took a long time”
Case Study Windows Update
- “Back to the envelope” estimation of the resource consumption impact. Assumptions
- Average installation duration: 10 minutes
- Average energy consumption: 50 Watt * 10 mins = 8.33 Wh
- Update requires a restart, PC is unusable for other tasks: exclusive energy consumption, 100% attributable to update process
- One update per month: 8.33 Wh *12 = 100 Wh per year
- Devices running Windows 10: 100 Wh 1.310^9 = 130 Gwh per year
CO2 equivalent:
- 92.129 t CO2e
- 20.036 (approx.) passenger vehicle cars driven for a year
- Energy consumption: 100 Wh*30ct/kWh = 3ct
- Lost productivity per year: 1210 min = 2h=> 2h15 (EU)/h = 30 EU

Architect perspective

As an Architect
I want
to set the principles so the software fulfils

Functional requirements
- Correct computation of updates available to a client
Non functional requirements
- 99.9% availability of the update services
- Scalability: Cost efficient use of infrastructure
- Performance target: maximum update installation time (incl. request, download and installation on client)

Architects can include sustainability in non functional requirements of the architecture

Scalability: Scale the service up and down in line with demand
Performance: Response time in SLA
Techniques:
- Break up services to have clear scaling factors allows independently scaling each factor and achieving better resource utlization
- Leverage optimized hardware

Using CDN for Sustainable software engineering

Advantages of using CDN
- Packets travel shorter distances (faster, less energy)
- Network requires less total bandwidth and therefore less infrastructure
- CDN provider maintains high utilization through caching of most popular content

Bin packing problem

Pack items of different sizes into finite number of bins of a fixed given capacity while minimizing the number of bins used
Applied to infrastructure: pack virtual machines or containers on physical hardware to maximise utilization
Computational complexity of the Bin packing problem is NP hard
Smaller/more fine granular items lead fragmentation and more optimal solutions
Delegating bin packing decisions to lower levels can increase overall efficiency
Requires collaboration of all involved stakeholders
- Architects: service architecture (e.g. monoliths vs. microservices)
- Developers: Tech Stacks and runtime behaviour
- Operators: Packaging Solutions (serverless, containers, VMs)
- Infrastruture Providers: Optimize hardware utilization
Solution:
- Define non-functional requirements that help achieve better sustainability [3]
- Design an energy efficient IT-architecture
  - Break up services by resource scaling factors (potentially: micro-services)
  - Ensure services can scale up and down
  - Collaborate for efficient bin packing

Developer perspective

As a Developer
I want to fulfil

Functional requirements & Non functional requirements same as architects
Set a high bar for non-functional requirements
High performance typically implies better energy efficiency
- programming language, e.g. C++ over Python
- Libraries/ frameworks with better algorithms e.g. O(log N) vs O(N)
- protocols and data-formats, e.g. Protobuf over JSON
- efficient patterns, e.g. event-driven logic vs polling
Consider impact on infrastructure when choosing technolgies and application services
- Common example: leveraging an existing database service as a lightweight message queue
- Smaller/ more fine granular items lead to less fragmentation and more optimal solutions
- Delegating bin packing decisions to lower levels can increase overall efficiency
- Requires collaboration of all involved stakeholders
  - Architects: Service Architecture (e.g. monoliths vs. microservices)
  - Developers: Tech stack and runtime behaviour
  - Operators: Packaging Solutions (serverless, containers, VMs)
  - Infrastructure Providers: Optimize hardware utilization
Tech stack example:
- Java Virtual Machine (JVM)
  - Just-in-time (JIT) compilation benefits from larger process granularity (code caches, optimization passes)
  - Efficient garbage collector implementations enable large heap sizes
  - Native threading model allows taking advantages of multiple CPU cores easily
- Python
  - Interpreted language, very small memory overhead possible
  - Memory management uses reference counting
  - Global Interpreter Lock (GIL) limits possible concurrency
Tech stack influence runtime behaviour - critical enabling lower levels to make efficient infrastructure decisions, e.g. serverless
Decoupling applications from infrastructure enables better sustainability by taking advantage of independent advancenments in infrastructure
Applications adhering to 12 Factor principles https://12factor.net/ give operators freedom to design efficient infrastructure (e.g. process model enables auto-scaling)
Practical examples multi-cloud strategy: applications targeting Kubernetes can be easily moved to a new cluster
- Move to a green cloud provider or re-patriate workloads to a private cloud
“You can’t manage what you can’t measure”
Sustainability is hard to measure - but adherence to functional and non functional requirements is not
Instrumentation: adding measurements like metrics and logs to software as a runtime behaviour (not just dedicated tests)
Developers make instrumentation decisions: what to measure, how to measure
- Make functional metrics available (e.g. number of updates served)
- Make non-functional metrics available (e.g. total update download time)
- Good KPIs are ratios, e.g. of infrastructure and functional metrics
  - CPU hours used/ # of updates served
Software consumes resource not only when used, but also during developement
Techniques to reduce waste
- Build the right thing, and build it right
- Anticipate maintenance concerns (Toolchains, Dependency Management)
- CI/CD: incremental builds and efficient tests
Beware of Developer Experience vs runtime tradeoffs

Operator perspective

Engineering applications to make efficient use of physical infrastructure is a pillar of sustainable software engineering (like DevOps)
Provide infrastructure using platforms that support tight bin packing (e.g. Virtualization, Container Orchestration)
Track infrastructure KPIs (CPU Usage, Memory, Storage..) vs. Business KPIs (e.g. RTT) and automate scaling auto-scaling
Designing Storage Hierarchy I
- Leverage the storage hierarchy - minimize work by keeping data as local as possible
  - Ensures higher performance
  - Reduced hardware footprint
  - Reduced energy consumption
Designing Storage Hierarchy II
- Scale out for performance & availability using a traditional database on virtual infrastructure
- Data durability maintained at the storage layer
- Database topology maintained externally (sharding, replication, node loss etc.)
Designing Storage Hierarchy III
- Scale out for performance & availability using a cloud native database
- Data durability maintained at the database cluster layer using replication
- Database is topology aware and handles changes dynamically
Lean Operations
- Backups
  - Simple backup strategies (e.g. daily full backup) can easily eclipse operational data processing volume, approaching O(N2).
- Logs
  - Maintain high signal to noise ratios
  - Leverage structured logging to avoid unnecessary [format -> transport/store -> parse] processing
- Cron Jobs
  - Periodic tasks are prone to overuse (does the job need to run on weekends?)

Leader perspective

As a Product Owner
I’m responsible for
the product and the process it gets built by

Proxy for the customers and users of the product
Make decisions on priority (e.g. by sorting a backlog of user stories)
Budget (time & cost)
Ensure good requirements engineering process that enables the team to meet functional and non-functional requirements
Leaders need to resolve Software Engineering goal conflicts by creating goal alignment between stakeholders
Example
- Designing for scalibility saves cost, improves quality to users and is more sustainable by freeing unused resources
Methods
- Fostering transparency of KPIs (Infrastructure used vs. Business Performance delivered) enables better engineering decision making

Pitfals

Example: IT as a Cost Center
- Annual or quaterly chargeback cycles leads to lack of feedback for software engineering teams
- Split responsibility for incurring vs. bearing cost leads to “shared irresponsibility”
- No incentive to save cost and resource consumption (apart from annual budget setting)
- Incentivizing sustainability requires adding a new goal dimension
Example: IT as a Profit Center
- Monthly chargeback cycles with daily report previews leads to fast feedback cycles for software engineering teams on actual resource spend
- Unified responsibility for incurring and bearing cost established sense of responsibility
- Teams are incentivized to save resources and cost at every opportunity
- Incentivizing sustainability aligned with the exisiting “cost” goal dimension

Selecting infrastructure Responsibly

Impact: A/B testing experiment for displaying carbon info in Cloud Console region picker
Among all users: 19%
Among new users, more than 50% more likely to select a “low carbon” region when exposed to “Low CO2” in picker

Key Actions for Leaders

Create goal alignment good software -> sustainable software
Involve all stakeholders and set clear goals for your team
Review incentives and empower sustainable software engineering

References

Sustinable Software Enginnering, Johannes Rudolph
Morris, J, Becker, I: Parkin, S: (2019) In Control with No Control
Raturi, A, Penzenstadler, B., Tomlinson, B. & Richardson, D. (2014, June)