ITIL-Main
ITIL is an acronym for Information Technology Infrastructure Library. It is a collection of best practices for IT Service Management. ITIL was first published by the OGC (Office of Government Commerce) of the Government of the United Kingdom in the late 1980s. All the best practice guidelines outlined in ITIL were put together by Subject Matter Experts who were approached and requested by the OGC to do so. This was because of the increasing trend of companies to use IT or IT related services to run their business and also to provide service to their customers. Updates to the original ITIL library were done in 2001 and 2002 which included updates related to the internet and e-commerce.
ITIL, which started off as a mere collection of books that outlined best practices for the use of IT in Service Management, has grown by leaps and bounds to become an industry in its own right! There are now many companies which offer ITIL related services such as Training, Consultancy and other Management resources.
ITIL consists of seven volumes. They are:
- Service Delivery – Deals with customer agreements and monitors compliance to these agreements by the IT service provider.
- Service Support – Deals with any failures to meet the levels of service agreed in the customer agreements.
- Business Perspective – Business Perspective is made up of a group of people who serve as liaisons between the business and IT. It (BP) clearly indicates how they (the business and IT) will interact with each other. The people in the Business Perspective group are usually those who have a deep understanding of the business, its current goals and future direction, and can quickly get new products and services in place when there is a business requirement for them.
- Infrastructure Management – Guides business users through the planning, delivery and management of high quality IT services.
- Application Management – Manages the relationships between different projects. ITIL considers that every application is made up of different projects or groups of different projects.
- Planning to Implement Service Management – Used by Project Managers while implementing ITIL.
- Security Management – Provides security related information and management about the ICT infrastructure.
The CENTRAL Part of ITIL only consists of the first five volumes mentioned above. But we will only focus on CORE ITIL which consists of Service Delivery and Service Support.
CORE ITIL is made up of 11 disciplines: five of them come under Service Delivery and the other six come under Service Support. Let’s take a closer look:
SERVICE DELIVERY DISCIPLINES:
1. Service Level Management – Deals with providing high quality services to the customer at the right costs.
2. IT Services Financial Management – Deals with providing maximum business value with the minimum financial outlay.
3. Availability Management – Manages the availability of services by optimizing the capability of the ICT infrastructure (hardware, software and peopleware) to deliver a cost-effective level of (service) availability to meet business objectives.
4. Capacity Management – Provides information on available capacity at any location or site.
5. IT Service Continuity Management (BCM) – Manages the minimum level of support required to keep the business or service free from any interruption in the event of a disaster, for example floods, earthquakes etc.
SERVICE SUPPORT DISCIPLINES:
1. Service Desk – is the single point of contact for end-users who need help with the service(s) they are receiving from an IT service provider.
2. Incident Management – deals with restoring a service by providing a solution, “quick-fix” or a workaround as soon as possible.
3. Problem Management – is responsible to identify the underlying cause of incidents and provide long term solutions to them.
4. Change Management – is responsible to ensure that changes are handled promptly and efficiently using standardized procedures in order to minimize the impact of any related incident(s) on the IT service.
5. Release Management – deals with implementing new (software or hardware) releases into the operational (live) environment, using the controlling processes of Configuration Management and Change Management.
6. Configuration Management – manages the (ICT) assets of an organization and the relationship between these assets.
All these disciplines are known as “Processes” whereas, only Service Desk is known as a “Function”.
Service Delivery
SERVICE DELIVERY DISCIPLINES:
- Service Level Management – Deals with providing high quality services to the customer at the right costs.
- IT Services Financial Management – Deals with providing maximum business value with the minimum financial outlay.
- Availability Management – Manages the availability of services by optimizing the capability of the ICT infrastructure (hardware, software and peopleware) to deliver a cost-effective level of (service) availability to meet business objectives.
- Capacity Management – Provides information on available capacity at any location or site.
- IT Service Continuity Management (BCM) – Manages the minimum level of support required to keep the business or service free from any interruption in the event of a disaster, for example floods, earthquakes etc.
IT Services Financial Management
IT SERVICES FINANCIAL MANAGEMENT:
The aim of IT Services Financial Management (ITSFM) is to provide an efficient and cost effective management of IT and financial resources in an enterprise. It also helps to extract maximum business value for minimum financial investments.
ITSFM helps to achieve maximum business value by:
- Calculating the Return on Investment (ROI) for the ITSM (IT Service Management) team and thereby helping in important decision-making.
- Financial Forecasting.
- Identifying, managing and controlling all costs incurred, both internally and externally (this can include costs incurred through any contracts with external suppliers). This helps to assess the Total Cost of Ownership, which is the sum of the total cost of developing a service and the total cost of supporting it during its lifetime.
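The two calculations mentioned in the list above can be sketched as follows. This is only an illustration: the function names and the sample figures are made up, and ROI is computed with the common definition of net benefit divided by cost, which the text itself does not spell out.

```python
def total_cost_of_ownership(development_cost: float, lifetime_support_cost: float) -> float:
    """TCO as described above: development cost plus lifetime support cost."""
    return development_cost + lifetime_support_cost


def return_on_investment(benefit: float, cost: float) -> float:
    """ROI as a fraction of cost: (benefit - cost) / cost (assumed definition)."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (benefit - cost) / cost


# Hypothetical figures, for illustration only.
tco = total_cost_of_ownership(development_cost=100_000, lifetime_support_cost=50_000)
roi = return_on_investment(benefit=180_000, cost=tco)
print(tco)  # 150000
print(roi)  # 0.2
```

An ROI of 0.2 here means the service is expected to return 20% more than it costs over its lifetime.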
For any enterprise, it is important to strike the right balance between the quality and the cost of providing a service. There might be times when customers (or users) demand close to 100% availability of a service, but it is the ITSFM team who will work out and provide the cost/benefit analysis to the ITSM team, based on which they (the ITSM team) can make a decision.
ITSFM is also responsible for recovering the cost of a service or services from those who use them. Though this would primarily be a decision of the higher management, ITSFM can suggest the use of “differential charging”, taking into consideration the demand for the service and the cost of providing that service.
ITSFM is concerned with the following functions:
- Budgeting: It deals with forecasting the money required to provide a particular service and tries to secure that money from the business for that purpose. It also monitors and controls the expenditure against the budgeted amounts.
- Accounting: It deals with keeping a track of where the money goes from that budget.
- Charging: deals with recovering the cost of providing a service from a customer or user. The charges for each service that is being provided to a customer should be accurately documented in the SLA (Service Level Agreement).
This is how it works – budgeting would tell you how much money has been allocated to provide a certain type of service. Accounting will tell you how that money is being/has been spent. And charging will tell you how much of the money spent on providing the service is being recovered.
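The relationship between the three functions can be pictured with a small sketch. The class name, field names and figures below are hypothetical; the point is only that each function answers a different question about the same service.

```python
from dataclasses import dataclass


@dataclass
class ServiceFinances:
    budgeted: float   # Budgeting: money allocated to provide the service
    spent: float      # Accounting: money actually spent so far
    recovered: float  # Charging: money recovered from customers or users

    def budget_remaining(self) -> float:
        """How much of the budget is still unspent."""
        return self.budgeted - self.spent

    def recovery_ratio(self) -> float:
        """Share of the money spent that has been recovered through charging."""
        return self.recovered / self.spent if self.spent else 0.0


# Hypothetical figures for an e-mail service.
email = ServiceFinances(budgeted=500_000, spent=400_000, recovered=300_000)
print(email.budget_remaining())  # 100000
print(email.recovery_ratio())    # 0.75
```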
Budgeting and accounting work hand-in-hand to identify and evaluate all the costs incurred to provide a service and how the money is being spent.
Budgeting, accounting and charging have a hierarchical relationship. While ITIL recommends implementation of at least budgeting and accounting, charging is optional. Accounting is required to calculate the ROI (Return on Investment) and Cost/Benefit calculations. These calculations might be necessary whenever new services are introduced or whenever there is a change to an existing service.
Types of costs:
Breaking down and classifying all the costs is crucial to creating a budget. The different types of costs in ITIL are as follows:
- Hardware (like computers, servers, printers, routers, hubs etc.)
- Software (like Operating Systems, proprietary applications, third-party applications etc.)
- People (costs incurred towards salaries and other benefits etc.)
- Accommodation (like offices, utility and storage spaces etc.)
- External Services (these refer to work that might be outsourced like security, development, facilities, disaster recovery, ISP etc.)
- Transfers (costs incurred due to cross charging within the business.)
Classification of costs:
According to ITIL, costs must be classified at least into the following two categories:
- Capital costs.
- Operational costs.
Capital costs are usually associated with the purchase of fixed assets such as land and buildings, and hardware such as computers and servers. They usually increase the value of the company. Operational costs, by contrast, are incurred in the day-to-day running of the company and include salaries, rents for buildings and other equipment, software licenses etc. Operational costs do not increase the value of the company because they are recurring in nature.
Depreciation: Sometimes it is necessary to track capital purchases which lose their value over a certain period of time. Suppose an item was bought for Rs.100,000 and was expected to last three years. Every year its value will reduce by one-third of the initial cost of Rs.100,000, so at the end of three years the item will have zero value. This is how depreciation is calculated.
Capitalization: This is just the opposite of depreciation. Here, operational costs are sometimes treated as capital costs so that they too can be allowed to depreciate. Let’s say a company spends Rs.100,000 to develop a software application. When the application is ready, it adds Rs.100,000 in value to the company. But if its life span is, say, five years, then it will depreciate accordingly and, at the end of five years, it will have zero value.
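The two examples above (the three-year item and the five-year software application) both lose an equal share of their initial cost every year. A minimal sketch of that calculation, with a hypothetical function name, might look like this:

```python
def straight_line_schedule(initial_cost: float, life_years: int) -> list[float]:
    """Remaining value at the end of each year, losing an equal share yearly."""
    if life_years <= 0:
        raise ValueError("life_years must be positive")
    return [
        round(initial_cost * (life_years - year) / life_years, 2)
        for year in range(1, life_years + 1)
    ]


# The Rs.100,000 item that lasts three years.
print(straight_line_schedule(100_000, 3))  # [66666.67, 33333.33, 0.0]
# The capitalized Rs.100,000 software application with a five-year life span.
print(straight_line_schedule(100_000, 5))  # [80000.0, 60000.0, 40000.0, 20000.0, 0.0]
```

In both cases the last entry is zero, matching the statement that the asset has no value left at the end of its life span.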
Costs can also be classified as :
- Direct costs.
- Indirect costs.
Direct costs can be attributed directly to a customer or a number of customers. Example: The purchase of a server exclusively for the use of the payroll department.
Indirect costs are those costs that are shared among a group of customers or groups. They cannot be attributed to any single customer or group.
Indirect costs can be of two types:
- Absorbed costs – here it is possible to track usage by different customers or groups.
- Unabsorbed costs – here, it is not possible to track usage by different customers or groups. Example: The Service Desk. Here the total cost of running the Service Desk is distributed among all user groups.
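The text does not say how an unabsorbed cost such as the Service Desk is distributed among user groups. One possible approach, shown here purely as an assumption, is to split it in proportion to the headcount of each group. The function name, group names and figures are hypothetical.

```python
def distribute_cost(total_cost: float, headcount_by_group: dict[str, int]) -> dict[str, float]:
    """Split a shared cost across groups in proportion to headcount (assumed rule)."""
    total_heads = sum(headcount_by_group.values())
    if total_heads == 0:
        raise ValueError("at least one group must have a non-zero headcount")
    return {
        group: total_cost * heads / total_heads
        for group, heads in headcount_by_group.items()
    }


# Hypothetical yearly Service Desk cost shared by three user groups.
shares = distribute_cost(120_000, {"payroll": 10, "sales": 30, "engineering": 20})
print(shares)  # {'payroll': 20000.0, 'sales': 60000.0, 'engineering': 40000.0}
```

An equal split per group, or a split by number of calls logged, would be equally valid choices; the organization decides which rule is fair.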
Fixed costs and Variable costs:
Costs can also be classified as:
- Fixed costs – costs that remain the same irrespective of usage. Example: cost of a leased line.
- Variable costs – costs that increase or decrease according to usage. Example: cost of telephone usage.
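Put together, the cost of a service for a period is its fixed part plus a variable part that grows with usage. The sketch below uses made-up figures and a hypothetical function name to show the difference.

```python
def period_cost(fixed_cost: float, rate_per_unit: float, units_used: float) -> float:
    """Fixed cost plus a variable cost that scales with usage."""
    return fixed_cost + rate_per_unit * units_used


# Hypothetical month: a leased line at 2,000 plus calls billed at 0.5 per minute.
print(period_cost(2_000, 0.5, 0))      # 2000.0 (no usage: only the fixed cost remains)
print(period_cost(2_000, 0.5, 1_000))  # 2500.0
```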
After Budgeting and Accounting comes Charging:
Firstly, it is up to the higher management to decide whether or not to charge for a particular service. ITSFM does not decide this.
Sometimes it can be decided that certain services will not be charged at all because of the high costs involved in the charging activities themselves – like invoicing, printing out bills, despatch and delivery of bills etc.
In other instances, a company can decide to charge back from users just what has been spent to provide a particular service. This is called “zero balance” policy.
There is also a “cost plus” policy where the amount recovered is more than what has been spent on providing the service.
The “cost minus” policy only recovers part of the amount spent on providing a particular service. How much that “part” will be is up to the higher management to decide.
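The three policies above differ only in how the amount charged relates to the amount spent. A small sketch, assuming the management decision is expressed as a fraction of the cost (the `adjustment` parameter is hypothetical):

```python
from enum import Enum


class ChargingPolicy(Enum):
    ZERO_BALANCE = "zero balance"
    COST_PLUS = "cost plus"
    COST_MINUS = "cost minus"


def amount_to_recover(cost: float, policy: ChargingPolicy, adjustment: float = 0.0) -> float:
    """Amount charged back to users; adjustment is a fraction of cost set by management."""
    if adjustment < 0:
        raise ValueError("adjustment must not be negative")
    if policy is ChargingPolicy.ZERO_BALANCE:
        return cost                      # recover exactly what was spent
    if policy is ChargingPolicy.COST_PLUS:
        return cost * (1 + adjustment)   # recover more than was spent
    return cost * (1 - adjustment)       # cost minus: recover only part


print(amount_to_recover(100_000, ChargingPolicy.ZERO_BALANCE))      # 100000
print(amount_to_recover(100_000, ChargingPolicy.COST_PLUS, 0.25))   # 125000.0
print(amount_to_recover(100_000, ChargingPolicy.COST_MINUS, 0.25))  # 75000.0
```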
Different approaches to charging:
- A “going rate” approach is based on what other departments or groups within the organization charge for the same kind of service.
- The “market rate” approach is based on what other companies charge for a similar type of service.
- A “fixed price” approach is based on an agreed price with the customer or user group.
ITIL recommends that companies should have a charging policy which is fair, easy to understand and easy to control.
Service Level Management
SERVICE LEVEL MANAGEMENT:
Firstly, Service Level Management or SLM is the heart of Service Management.
This process is responsible to make sure that Service Level Agreements (SLAs), Operational Level Agreements(OLAs) and Underpinning Contracts (UPCs) are met.
Service Level Management also makes sure that Service Targets, like availability of services and response time are all agreed upon in advance and accurately documented. It also manages the SLAs and provides targets which can be used to judge the performance of the service provider.
Some people like to say the goal of Service Level Management is to improve the quality of IT Services provided through regularly monitoring, reporting and reviewing the performance of the services, while at the same time working towards removing service or performance bottlenecks whenever possible.
Service Level Management is made up of four important stages. Let’s quickly take a look at each of these:
Stage 1: Create the Service Management Catalog.
Stage 2: Identify the SLRs (Service Level Requirements): This basically means identifying exactly what kind of service your customers are looking for and what they are willing to pay for.
Stage 3: Based on the SLRs (Service Level Requirements) go ahead and create the Operational Level Agreements (OLAs) and Underpinning Contracts (UPCs).
Stage 4: Create the SLA. Here, you can modify any existing SLAs you may have with any other client to suit your own company’s or organization’s business requirements.
Once you have created the SLA this way, you need to get it formally agreed by the customer. Once the agreement has been formalized, the SLAs need to be implemented and all concerned parties have to be informed about it.
Okay, so what happens after the SLAs are all agreed upon and implemented and you start to actually provide the service? Let’s take a look:
The next stage in SLM is to constantly monitor the services, provide accurate reports to the customer and at the same time, constantly review and modify any specific areas that may be seen as a Service or Performance bottleneck.
The next activity that needs to be carried out is to update the Service Catalog appropriately.
Another crucial activity is a review of the Service Level Management process as a whole. You should be able to point out the Critical Success Factors so that Key Performance Indicators (KPIs) can be established.
Monitoring and Reporting form an important part of Service Level Management. SLAs, OLAs and UPCs need to be monitored constantly so that whenever there is a breach in any of these, the same can not only be rectified at the earliest, but steps can also be taken to avoid or prevent future occurrences.
Reports should be simple and clear. They should clearly state what happened and why. They can be Internal or External.
Internal Reporting covers the SLAs, OLAs and UPCs.
External Reporting covers Exception Reports, which are used to indicate why there was a breakdown in service or why it came close to one.
The other important terms used for reporting in Service Level Management are SLAM or Service Level Agreement Monitoring Chart (another example of External Reporting). The color code used here is called RAG or Red, Amber & Green – which is used to quickly display the Service Levels and/or breaches, if any.
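A SLAM chart can be thought of as a grid of RAG colors, one per service and reporting period. The text does not define the cut-offs, so the sketch below assumes that Green means the target was met, Amber means the service came within a chosen margin of breaching, and Red means the target was breached by more than that margin. The function name, margin and figures are all hypothetical.

```python
def rag_status(measured: float, target: float, warning_margin: float = 0.5) -> str:
    """Map a measured service level to Red, Amber or Green (assumed thresholds)."""
    if measured >= target:
        return "Green"
    if measured >= target - warning_margin:
        return "Amber"
    return "Red"


# Hypothetical monthly availability (%) against an agreed target of 99.5%.
monthly_availability = {"Jan": 99.9, "Feb": 99.2, "Mar": 98.0}
slam_row = {month: rag_status(value, target=99.5) for month, value in monthly_availability.items()}
print(slam_row)  # {'Jan': 'Green', 'Feb': 'Amber', 'Mar': 'Red'}
```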
Trend Graphs are also quite popular as a Reporting Tool. They show the Consistency of Service over a defined period of time.
Service Support
SERVICE SUPPORT DISCIPLINES:
- Service Desk – is the single point of contact for end-users who need help with the service(s) they are receiving from an IT service provider.
- Incident Management – deals with restoring a service by providing a solution, “quick-fix” or a workaround as soon as possible.
- Problem Management – is responsible to identify the underlying cause of incidents and provide long term solutions to them.
- Change Management – is responsible to ensure that changes are handled promptly and efficiently using standardized procedures in order to minimize the impact of any related incident(s) on the IT service.
- Release Management – deals with implementing new (software or hardware) releases into the operational (live) environment, using the controlling processes of Configuration Management and Change Management.
- Configuration Management – manages the (ICT) assets of an organization and the relationship between these assets.
All these disciplines are known as “Processes” whereas, only Service Desk is known as a “Function”.
Change Management
CHANGE MANAGEMENT
What is a Change? - Change is the process of moving from one defined state to another.
Change Management is responsible to ensure that standardized methods and procedures are used for the efficient and prompt handling of all changes in order to minimize the impact of any related incident(s) on the (IT) service.
Change Management implements all the changes in an organization with minimum disruption to the IT services. It also carries out appropriate Impact Analyses before the implementation of the change and has a backout plan in place, in case the change does not work out in line with the expectations of the organization.
Change Management balances the need for the change against the risks to the IT infrastructure and will proceed only if:
- The impact is manageable.
- The cost is reasonable.
- The benefits to the business are worth it.
Change Management authorizes all changes to the IT infrastructure through the Change Advisory Board (CAB). The CAB is formed by a team of experts within the organization.
It is not necessary to approach the CAB for approval of all RFCs. In some cases, this responsibility can also be given to the Problem Management team or even the Operations Team. This is discussed in more detail in the later sections.
ITIL mandates that end-users are kept informed of any changes much in advance – this is done through Forward Schedules for Change – which lists out all the details of the change, including when it would occur and all the services and components that would be affected as a result of it.
All changes are released into the organization through the Release Management process.
The Change Management process starts with a Request for Change (RFC).
Some of the important sources of RFCs are:
- From the Service Desk.
- From Problem Management.
- When a new CI (Configuration Item) is introduced into the organization.
- Whenever there is a requirement for a new or changed IT service.
- From a customer or end-user.
- Any new legislation or laws.
The Change Advisory Board (CAB) is made up of:
- A Change Manager (The Change Manager chairs all CAB meetings.)
- Representatives of the IT Service Management team.
- Representatives of the customer.
- Representatives of the users.
- Representatives of developers, other consultants and other experts.
The CAB is responsible to:
- Review all RFCs and approve those that meet the business requirements.
- Reject the RFCs that do not.
- Keep a record of all RFCs, irrespective of whether they have been accepted or not.
- Advise on the grouping of changes into “Releases” so that there is little or no disruption to the organization.
The CAB EC:
- Stands for Change Advisory Board Emergency Committee.
- Consists of the Change Manager, a senior IT representative and a senior representative from the organization.
- Usually assembles at short notice to review and authorize any urgent RFCs.
The Change Management Process:
- An RFC is generated to trigger the Change Management process.
- The Change Manager receives the RFC and approves or rejects it, as appropriate.
- Appropriate entries are made into the CMDB and the RFC is then indicated as a Change Record.
- The Change Manager allocates a priority to the change, after assessing the Impact and Urgency.
- The priority can either be “Standard Change” or “Urgent Change”.
- The change is Categorized as “Standard”, “Minor”, “Significant” or “Major”.
- “Standard” changes are usually the low-risk and frequently occurring ones and do not require authorization by the CAB. Example: upgrading a user’s computer or replacing a piece of hardware in a user’s computer.
- “Minor” changes are usually authorized by the Change Manager, who then informs the CAB later.
- CAB authorization is needed for “Significant” and “Major” changes.
- If the RFC is approved, it is then implemented through the Release Management process after circulating a Forward Schedule for Change.
- If the Change is successfully implemented, it is then reviewed by the Change Manager and closed.
- If the Change is not successful, the Change Manager initiates the back out plan to get back to the previously working state.
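The authorization rules in the process above can be summarized as a simple lookup. This is a sketch rather than a prescribed ITIL algorithm: the function name is hypothetical, and routing urgent Significant or Major changes to the CAB EC is an assumption based on the description of the Emergency Committee given earlier.

```python
AUTHORIZER_BY_CATEGORY = {
    "Standard": "No CAB authorization required",
    "Minor": "Change Manager",
    "Significant": "CAB",
    "Major": "CAB",
}


def authorizer_for(category: str, urgent: bool = False) -> str:
    """Return who authorizes a change of the given category."""
    if category not in AUTHORIZER_BY_CATEGORY:
        raise ValueError(f"unknown change category: {category}")
    # Assumption: urgent changes that would normally go to the CAB go to the CAB EC.
    if urgent and AUTHORIZER_BY_CATEGORY[category] == "CAB":
        return "CAB EC"
    return AUTHORIZER_BY_CATEGORY[category]


print(authorizer_for("Minor"))               # Change Manager
print(authorizer_for("Major"))               # CAB
print(authorizer_for("Major", urgent=True))  # CAB EC
```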
Metrics for measurement of the Change Management process:
- Total number of changes in the defined period.
- Total number of Urgent Changes.
- Total number of changes implemented.
- Total cost for each change as against estimated costs.
- Number of rejected changes.
Change Management Audits should check:
- for compliance to all Change Management procedures.
- all Software releases to ensure they have been through the proper authorization process.
- all Incident records selected randomly through the change records.
- minutes of CAB meetings.
- Forward Schedules for Change.
- change review records.
Configuration Management
CONFIGURATION MANAGEMENT
Configuration Management is defined in ITIL as “Asset Management plus Relationships” (with other Configuration Items [CIs]). It is important to consider the relationships between the CIs as making changes to one component can affect another CI.
Configuration Management underpins all delivery and support processes and defines IT assets and services as Configuration Items.
Configuration Item(CI) – A Configuration Item is a part of the ICT infrastructure (which can be a hardware, software, documentation or peopleware component). This term can be used to indicate whole systems or just a single hardware/software component. (In other words, a CI is “any component that has to be managed in order to deliver an IT service”.) CIs are under the control of the Change Management process.
Configuration Management Database(CMDB) – This is a database of all Configuration Items (CIs) in the organization. It not only contains full details of individual components but also contains details of the relationships between them.
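Storing relationships is what lets a CMDB answer the question "what else is affected if this CI changes?". The sketch below models a CMDB as a plain dictionary from each CI to the CIs that depend on it; the CI names and the function name are made up for illustration, and a real CMDB would hold far more detail per item.

```python
from collections import deque

# Each CI maps to the CIs that depend on it (hypothetical sample data).
DEPENDENTS = {
    "db-server-01": ["payroll-app", "hr-app"],
    "payroll-app": ["payroll-service"],
    "hr-app": [],
    "payroll-service": [],
}


def impacted_cis(changed_ci: str, dependents: dict[str, list[str]]) -> list[str]:
    """Breadth-first walk returning every CI that depends, directly or not, on changed_ci."""
    seen = {changed_ci}
    queue = deque([changed_ci])
    impacted = []
    while queue:
        ci = queue.popleft()
        for dependent in dependents.get(ci, []):
            if dependent not in seen:
                seen.add(dependent)
                impacted.append(dependent)
                queue.append(dependent)
    return impacted


print(impacted_cis("db-server-01", DEPENDENTS))
# ['payroll-app', 'hr-app', 'payroll-service']
```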
Configuration Structure is the hierarchy of all CIs in any configuration.
Configuration Management Plan – is a document that lays out details of organization and procedures for the Configuration Management of any particular product or service.
Configuration Management has the responsibility to ensure that:
- the organization has an accurate record of its ICT (Hardware,Software, & Peopleware) assets.
- all details of the services offered by the organization, their related ICT components and any relevant supporting documentation are recorded.
- any changes to the IT service(s) are done with the least risk to the business.
- it provides a sound basis for Incident Management, Problem Management, Change Management & Release Management.
- configuration records are verified against the existing infrastructure and any exceptions are corrected.
Configuration Management has the following sub-processes:
Planning - consists of five sub-processes:
- Strategy, Policy, Scope and Objectives – to establish an effective Configuration Management System.
- Processes, Procedures, Guidelines & Responsibilities – to manage and control the ICT assets.
- Relationship with other ITIL Processes – define how Configuration Management will interact with other processes or vice-versa.
- Relationship with other Configuration Management teams – to exchange information with CMDBs of suppliers, external vendors, developers etc.
- Tools & Resource Requirements – to connect the CMDB to the system and network management tools to enable the automatic addition of CIs to the CMDB.
Identification – accurately identifying and labeling the CIs with IDs, versions, types and their relationships to other CIs.
Control - this has three sub-processes:
- Register – registering a CI when it enters the IT infrastructure.
- Update – updating the status of the CI to reflect its most current status.
- Archive – removing the CI from the CMDB and archiving it in a secure location.
Status Accounting – is concerned with reporting of all current & historical data about each CI throughout its life cycle.
Verification – (Verification & Audit) is responsible to ensure that the information contained in the CMDB exactly matches the live environment.
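Verification amounts to comparing what the CMDB records with what is actually found in the live environment and listing the exceptions. A minimal sketch, assuming both sides can be reduced to a mapping from CI identifier to version (the function name, identifiers and versions are hypothetical):

```python
def audit_cmdb(cmdb: dict[str, str], live: dict[str, str]) -> dict[str, list[str]]:
    """Compare CMDB records with the live environment and list the exceptions."""
    return {
        "missing_from_cmdb": sorted(live.keys() - cmdb.keys()),
        "missing_from_live": sorted(cmdb.keys() - live.keys()),
        "version_mismatch": sorted(
            ci for ci in cmdb.keys() & live.keys() if cmdb[ci] != live[ci]
        ),
    }


# Hypothetical records: CI identifier -> installed version.
cmdb_records = {"web-01": "2.4", "db-01": "9.6", "mail-01": "1.0"}
live_scan = {"web-01": "2.4", "db-01": "10.1", "cache-01": "5.0"}
print(audit_cmdb(cmdb_records, live_scan))
# {'missing_from_cmdb': ['cache-01'], 'missing_from_live': ['mail-01'], 'version_mismatch': ['db-01']}
```

An empty result in all three lists would mean the CMDB matches the live environment for the attributes checked.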
The Configuration Management Database (CMDB) - Configuration Management monitors the relationships between Configuration Items. It stores all details about these relationships (between CIs) in a CMDB.
A typical CMDB should contain the following information – refer to the image below:
- Details about the ICT components (Hardware, Software, Peopleware and related documents).
- All the services offered by the organization and related CIs and the relationships between these CIs.
- Details of all Incidents, Problems and Known errors.
- Details about all Changes & Releases.
[Image: ITIL – CMDB]
The CMDB contains the DHS (Definitive Hardware Store) and DSL (Definitive Software Library), which are managed by the Release Management process.
Incident Management
INCIDENT MANAGEMENT
What is an Incident?
An incident is any event which is not a part of the Standard Operation of a Service and which causes or may cause an interruption to or reduction in the quality of that Service.
The aim of Incident Management is to restore normal services as quickly as possible.
Some best practices:
- All inquiries should be recorded as incidents.
- Service Requests (requests for a standard operational item, e.g. password resets) should be recorded as incidents.
- A request for a new product or service should be recorded as a Request for Change (RFC).
- Automatically generated incidents (such as hardware or network failure) should also be recorded as incidents.
The Incident Life-Cycle
DETECTION & RECORDING:
- Provide a unique ID for each incident, even if it is a known issue.
- Record how the incident was reported – what were the Services and Configuration Items affected?
- Classify the incidents – like Hardware, Software or Service Requests.
- Match the current incident against previously reported incidents.
- Assign a priority to the incident. (Priority of an incident is determined by the Impact, Urgency, Availability of resources and the existence of certain parameters in the Service Level Agreement [SLA]).
- Provide initial support to the incident or provide a workaround. If it is a new workaround provided by the IT Service Desk, record it for future use.
- If the incident cannot be resolved, escalate the incident functionally.
INVESTIGATION & DIAGNOSIS:
- This may lead to resolution of the Incident right away or to having it functionally escalated (to Level 2 support). If that process is taking too much time, it might also get hierarchically escalated.
RESOLUTION & RECOVERY:
- This can be done by raising an RFC and getting it implemented. Recovery just means “restoring a service or an ICT component back to its previously working condition”.
INCIDENT CLOSURE:
- This happens upon confirmation by the user that the incident has been resolved.
Note:
- Impact is the measure of the level of effect the incident has on the business, for example: number of users affected or amount of revenue lost because of the incident.
- Urgency indicates the timescale within which the incident needs to be resolved.
For an incident to be considered High Priority – both the Impact & Urgency should be high.
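A priority rule based on Impact and Urgency can be sketched as a small function. Only the "both high means High Priority" case comes from the text; the three-level scale and the way the remaining combinations are mapped to Medium and Low are assumptions made for illustration, and the function name is hypothetical. Resource availability and SLA parameters, which also influence priority, are left out.

```python
LEVELS = {"low": 1, "medium": 2, "high": 3}


def incident_priority(impact: str, urgency: str) -> str:
    """Derive a priority from impact and urgency (mapping below is assumed)."""
    score = LEVELS[impact] + LEVELS[urgency]
    if score == 6:   # both impact and urgency are high
        return "High"
    if score >= 4:   # assumption: mixed or medium combinations
        return "Medium"
    return "Low"


print(incident_priority("high", "high"))  # High
print(incident_priority("high", "low"))   # Medium
print(incident_priority("low", "low"))    # Low
```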
Problem Management
PROBLEM MANAGEMENT
What is a Problem? – A Problem is “an unknown, underlying cause of one or more Incidents”.
Did you know? 80% of incidents are caused by 20% of ICT infrastructure components!
Problem Management is responsible to minimize the adverse effects of the Incidents and Problems caused by errors in the (ICT) Infrastructure on the business and to proactively prevent the occurrence of such errors, incidents and problems.
Problem Management looks for the underlying causes of Incidents and Problems and provides long-term (permanent) resolutions. It functions both Proactively and Reactively:
- Proactively: by trying to prevent the occurrence of issues by intelligently analyzing problem trends and available statistics.
- Reactively: by identifying underlying problems which are causing the incidents and finding a permanent resolution or an immediate workaround.
When Problem Management successfully identifies a problem and a suitable resolution to it, the resolution is implemented through the Change Management process.
Prioritization of problems is generally done by the “Pain Factor (PF)”. (The Pain Factor is determined by the number of people affected by the problem and the impact it is having on the business.) So, the higher the PF, the higher the priority.
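The text names the two ingredients of the Pain Factor but not how they are combined. The sketch below assumes a simple product of the number of people affected and a business impact score on a 1 to 5 scale; the class, the scale and the sample problems are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Problem:
    problem_id: str
    people_affected: int
    business_impact: int  # assumed scale: 1 (negligible) to 5 (severe)

    @property
    def pain_factor(self) -> int:
        """Assumed formula: people affected multiplied by business impact."""
        return self.people_affected * self.business_impact


problems = [
    Problem("PRB-1", people_affected=200, business_impact=1),
    Problem("PRB-2", people_affected=50, business_impact=5),
    Problem("PRB-3", people_affected=10, business_impact=2),
]

# Highest Pain Factor first.
work_queue = sorted(problems, key=lambda p: p.pain_factor, reverse=True)
print([p.problem_id for p in work_queue])  # ['PRB-2', 'PRB-1', 'PRB-3']
```

Note how PRB-2 outranks PRB-1 even though it affects fewer people, because its business impact is much higher.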
Responsibilities of the Problem Management team:
- Problem Control: Transform Problems into Known Errors by identifying the root cause of the problem and providing a temporary workaround. (This converts a Problem into a Known Error.)
- Error Control: Resolves the Known Errors under the control of Change Management as soon as possible and whenever it is financially justifiable.
- Proactive Prevention of Problems: Carry out trend analyses and provide support to the organization.
- Providing Management Information from Problem Data: Carry out trend analyses and provide support to the organization.
- Conducting Major Problem Reviews: This is done after a major problem has been resolved so that future problems can be prevented.
The Problem Management process consists of the following stages:
- Identification: The first step is to identify a new Problem. If there are no matching records in the existing Problem or Known Errors database, then it is classified as a new Problem.
- Recording: A new record is created and a unique ID is assigned. All related Configuration Items are linked to it as well as all related Incidents/Known Errors.
- Classification: The Problem is classified appropriately and the impact of the Problem on the Service Levels is determined so that relevant resources can be assigned to resolve it.
- Investigation: The Problem is investigated so that a resolution is identified and it can be classified as a Known Error.
- Diagnosis: Techniques such as Kepner-Tregoe analysis and Ishikawa Fishbone analysis are used. The end result again is the identification of a resolution or a temporary workaround to the problem so that it is converted into a Known Error.
- Review & Closure: After every Problem is resolved – it is thoroughly reviewed so that the following questions can be answered:
1. What was done right?
2. What was not done right?
3. What could have been done better?
4. How can we prevent it from happening again?
Release Management
RELEASE MANAGEMENT
What is a Release? A Release is a collection of authorized changes to an IT service.
Release Management is responsible for the implementation of all new and existing Hardware and Software releases (along with the related documents) into the live (operational) environment, under the controlling processes of Change Management and Configuration Management. This process is concerned with protecting the live environment from any disruption, and the Release Management activities are usually performed under the supervision of the Change Manager.
Releases can be classified as:
- Major Software Releases and Hardware Upgrades – These usually contain large amounts of new functionality and override all preceding minor releases and upgrades.
- Minor Software Releases and Hardware Upgrades – These usually contain small amounts of new functionality and override all preceding emergency releases and upgrades.
- Emergency Fixes – These usually contain fixes to a small number of issues.
All changes are released as “Roll Outs”. A Roll Out includes distributing all the Configuration Items to wherever they are used. This can be done in many ways, for example: over the internet, by email, remotely or even by sending them on CDs. But when there are large releases to be rolled out over a vast geographical and cultural area, the use of automated scripts is a great help. However, these scripts might need passwords to activate them.
Release Management has to maintain traceability of all the releases. We need to know where a particular version has come from and what changes it contains.
The Release Management process covers the following three areas:
- Development area.
- Release Management’s own pre-production area.
- Operational environment (live/production area.)
Migration from one area to another is subject to results from reviews, tests and other relevant quality checks.
Before a Release is Rolled Out into the live environment, Operational and Customer Acceptance tests are carried out. Operational tests ensure that anything that goes into the live environment is supportable, maintainable and robust. All existing and planned Backout Plans should also be fully tested.
The contents of each Release are decided by Change Management, but the Release Management team is always kept fully informed.
Hardware and Software Releases go through the following stages before they are Released into the live environment:
- Distribute.
- Build or Rebuild in the Live environment.
- Implementation.
It is important that each of these stages is carried out accurately before the release progresses to the next one.
Release Management is also responsible for:
- DHS - This is the Definitive Hardware Store. It is a secure location or a number of locations where authorized versions of all hardware spares (Configuration Items in the live environment that exist in the CMDB) are stored.
- DSL - This is the Definitive Software Library. Again, this is a secure location or a number of locations where copies of all authorized versions of software CIs are stored. (CI stands for Configuration Item.) It can also be defined as a physical library or repository where master copies of all software versions are stored.
Information about the DHS & DSL exists in the CMDB (Configuration Management Database) and Configuration Management is responsible for keeping it up to date.
Adequate protection/security should be provided to both the DHS and DSL against eventualities like floods, earthquakes, fire and of course theft. In case of the DSL, it should also be protected from viruses, data corruption etc.
Release Unit: A Release Unit is the portion of the IT infrastructure that is normally released together.
Release Type: There are three Release types which are as given below:
- Full Release – In a Full Release, all components of the release unit are built, tested, distributed and released together. This is suitable for major changes and is very expensive.
- Delta Release – In a Delta Release, only those components that have changed since the last release are distributed. This type of release is best suited for fixes and emergency changes and is less expensive.
- Package Release – A Package Release is a group of Delta/Full release(s) which are released simultaneously. This type of release is suitable in situations where changes in one system may require changes to another.
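As a rough illustration of the difference between a Full and a Delta Release, the sketch below compares the component versions of a Release Unit against the previous release. The component names, the version strings and the function name `delta_release` are hypothetical, and real release tooling would work from CMDB records rather than plain dictionaries.

```python
def delta_release(previous, current):
    """Return only the components that are new or changed since the last release."""
    return {
        name: version
        for name, version in current.items()
        if previous.get(name) != version
    }


# Hypothetical release unit: component name -> version
previous = {"payroll-app": "1.0", "payroll-db": "1.0", "report-module": "1.0"}
current = {"payroll-app": "1.1", "payroll-db": "1.0", "report-module": "1.0"}

full = dict(current)                      # Full Release: every component
delta = delta_release(previous, current)  # Delta Release: changed components only
print(delta)  # {'payroll-app': '1.1'}
```

A Package Release could then be pictured as a list of such Full and Delta sets that are rolled out together.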
Release Identification: Each release has to be identified. Usually, a numeric format is used. The specific release identification policy is generally decided by the Release Manager, after consulting with the Change Manager and the CAB. Example: A new application can be assigned an ID like v.1.0 and a later, minor release with some changes to its components can be identified as v.1.1. There is really no limit to the number of such levels that can be used to identify each release.
Roll Out Types: Releases can be rolled out in any one, or a combination, of the following ways:
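A numeric scheme like this can be sketched in a few lines of Python. The function name `next_release_id` and the rule of resetting lower levels to zero are assumptions made for illustration, and the sketch leaves out the "v." prefix; the actual policy is whatever the Release Manager decides.

```python
def next_release_id(current, level):
    """Increment one level of a dotted numeric release ID and reset the levels below it.

    Level 0 is the first number, level 1 the second, and so on.
    """
    parts = [int(part) for part in current.split(".")]
    # Add missing levels so that deeper levels can be introduced at any time
    while len(parts) <= level:
        parts.append(0)
    parts[level] += 1
    for lower in range(level + 1, len(parts)):
        parts[lower] = 0
    return ".".join(str(part) for part in parts)


print(next_release_id("1.0", 1))  # 1.1
print(next_release_id("1.1", 0))  # 2.0
print(next_release_id("1.1", 2))  # 1.1.1
```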
- Big Bang Roll-outs - where all sites in the enterprise receive the releases simultaneously.
- Phased Roll-outs – where all sites receive some functionality at the same time and the remaining functionality at a later time.
- Pilot Roll-outs – where a single site receives all the functionality at one time, ahead of the others.
Service Desk
THE SERVICE DESK
When a company provides an IT Service to its customers, they are bound to have questions or might run into problems. They need a place where they can get quick answers, quick resolutions or, at least, a quick workaround to their problems so that they can carry on with their work and with their lives.
Customers are easily frustrated if they cannot get help just when they need it.
An IT Service Desk is meant to be a single point of contact for customers who have a problem with the services they are receiving from the IT Service Provider. This is where they will report Incidents, RFCs (Requests for Change) or just any other problem.
Conversely, the IT Service Provider can also use the Service Desk as a channel through which it can communicate with its customers.
Service Desks are also known as Customer Help Desk, Help Desk, Hotline, Call Center, Customer Hotline etc. But we will just call it the Service Desk, which here means an IT Service Desk. (You need to remember this.)
Why should you have a Service Desk?
- A good Service Desk helps to reduce customer complaints.
- It increases customer satisfaction.
- It reduces Service downtime and wastage of manpower.
- It increases customer retention.
The Service Level Agreement (SLA) contains specific details about the hours of availability of the service, the time taken to resolve an issue and the time within which the user must receive a response from the Service Desk. It is important for the Service Desk personnel to be aware of the SLA parameters.
Some organizations have a Service Desk which is a single point of contact for any issue they might face – from IT Problems to the-lift-is-not-working issues. But we are going to consider a Service Desk which deals only with IT related issues.
An IT Service Desk is responsible to:
- Monitor incidents and user’s queries.
- Keep the user updated about the progress.
- Follow up with any second level team and push for a quicker resolution.
- Make sure that SLAs are not breached.
A good IT Service Desk must have:
- Well trained Service Desk personnel (with good people skills).
- A properly organized system to record and track incidents.
- A Service Desk system that can identify similar incidents, even if they were previously reported or fixed.
- A proper knowledge base which can be used as an important reference point.
- A Service Desk Team that is technically competent to deal with users’ issues as and when they arise.
- Proper channels of communication with the other disciplines like Problem Management (for help when there is a major problem), Service Level Management (so that SLAs can be adhered to), Configuration Management (so that the user’s equipment can be identified whenever required) & Availability Management (for analysis of Service Desk related data which could help improve Services).
Types of IT Service Desks:
- Local Service Desk – to take care of regional/local users.
- Centralized Service Desk – which serves users from all geographical regions.
- Virtual Service Desk – is a collection of many local Service Desks, where calls are routed to the most appropriate Service Desk, based on the issue, time of day, location of user and so on.
How to contact an IT Service Desk?
An IT Service Desk gets its inputs from two major sources – Human Sources and Machine Sources.
- Humans can contact the IT Service Desk by Telephone, Fax, Email etc.
- Machines send out system generated alerts to the IT Service Desk.
Escalations:
An escalation is the process where an incident is forwarded to a higher level or a more expert team for better resolution. Escalation can be of two types:
- Functional Escalation: Where an issue is passed on to a more competent team (Example: Escalating an issue to the Level 2 support team due to lack of enough expertise to solve a certain problem).
- Hierarchical Escalation: Where an issue is passed up the management chain because of lack of authority to do something. (Example: Escalating to the Service Desk Manager).
Service Delivery
SERVICE DELIVERY DISCIPLINES:
- Service Level Management – Deals with providing high quality services to the customer at the right costs.
- IT Services Financial Management – Deals with providing maximum business value with the minimum financial outlay.
- Availability Management – Manages the availability of services by optimizing the capability of the ICT infrastructure (hardware, software and peopleware) to deliver a cost-effective level of (service) availability to meet business objectives.
- Capacity Management – Provides information on available capacity at any location or site.
- IT Service Continuity Management (ITSCM) – Manages the minimum level of support required to keep the business or service free from any interruption in the event of any disaster, for example floods or earthquakes.
IT Services Financial Management
IT SERVICES FINANCIAL MANAGEMENT:
The aim of IT Services Financial Management (ITSFM) is to provide an efficient and cost effective management of IT and financial resources in an enterprise. It also helps to extract maximum business value for minimum financial investments.
ITSFM helps to achieve maximum business value by:
- Calculating the Return on Investment (ROI) for the ITSM (IT Service Management) team and thereby helping in important decision-making.
- Financial Forecasting.
- Identifying, managing and controlling all costs incurred, both internally and externally (this can include costs incurred through any contracts with external suppliers). This helps to assess the Total Cost of Ownership, which is the sum of the total cost of developing a service and the total cost of supporting it during its lifetime.
For any enterprise, it is important to strike the right balance between the quality and the cost of providing a service. There might be times when customers (or users) demand close to 100% availability of a service, but it is the ITSFM team who will work out and provide the cost/benefit analysis to the ITSM team, based on which they (the ITSM team) can make a decision.
ITSFM is also responsible for recovering the cost of a service or services from those who use them. Though this would primarily be a decision of the higher management, ITSFM can suggest the use of “differential charging”, taking into consideration the demand for the service and the cost of providing that service.
ITSFM is concerned with the following functions:
- Budgeting: It deals with forecasting the money required to provide a particular service and tries to secure that money from the business for that purpose. It also monitors and controls the expenditure against the budgeted amounts.
- Accounting: It deals with keeping a track of where the money goes from that budget.
- Charging: deals with recovering the cost of providing a service from a customer or user. The charges for each service that is being provided to a customer should be accurately documented in the SLA (Service Level Agreement).
This is how it works – budgeting would tell you how much money has been allocated to provide a certain type of service. Accounting will tell you how that money is being/has been spent. And charging will tell you how much of the money spent on providing the service is being recovered.
Budgeting and accounting work hand-in-hand to identify and evaluate all the costs incurred to provide a service and how the money is being spent.
Budgeting, accounting and charging have a hierarchical relationship. While ITIL recommends implementation of at least budgeting and accounting, charging is optional. Accounting is required for Return on Investment (ROI) and Cost/Benefit calculations. These calculations might be necessary whenever new services are introduced or whenever there is a change to an existing service.
Types of costs:
Breaking down and classifying all the costs is crucial to creating a budget. The different types of costs in ITIL are as follows:
- Hardware (like computers, servers, printers, routers, hubs etc.)
- Software (like Operating Systems, proprietary applications, third-party applications etc.)
- People (costs incurred towards salaries and other benefits etc.)
- Accommodation (like offices, utility and storage spaces etc.)
- External Services (these refer to work that might be outsourced like security, development, facilities, disaster recovery, ISP etc.)
- Transfers (costs incurred due to cross charging within the business.)
Classification of costs:
According to ITIL, costs must be classified at least into the following two categories:
- Capital costs.
- Operational costs.
Capital costs are usually associated with the purchase of fixed assets like land and buildings, and hardware such as computers and servers. They usually increase the value of the company. Operational costs, however, are those costs incurred in the day-to-day running of the company and include salaries, rents of buildings and other equipment, software licenses etc. Operational costs do not increase the value of the company because they are recurring in nature.
Depreciation: Sometimes it is necessary to track capital purchases which lose their value over a certain period of time. Suppose an item was bought for Rs.100,000 and it was supposed to last three years. This means that every year its value will reduce by one-third of the initial cost of Rs.100,000. This way, at the end of three years this item will have zero value. This is how depreciation is calculated.
Capitalization: This is just the opposite of depreciation. Here, operational costs are sometimes treated as capital costs so that they too can be allowed to depreciate. Let’s say a company spends Rs.100,000 to develop a software application. When this application is ready, it adds Rs.100,000 of value to the company. But if its life span is, say, five years, then it will depreciate accordingly and, at the end of five years, it will have zero value.
Costs can also be classified as:
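The two worked examples above use what is commonly called straight-line depreciation: the value falls by an equal amount each year. A minimal Python sketch of that arithmetic follows. The function name `straight_line_schedule` is illustrative and not an ITIL term, and fractions are used so that the three-year case reaches exactly zero.

```python
from fractions import Fraction


def straight_line_schedule(initial_cost, lifespan_years):
    """Return the remaining value at the end of each year, assuming equal yearly depreciation."""
    yearly = Fraction(initial_cost, lifespan_years)
    return [initial_cost - yearly * year for year in range(1, lifespan_years + 1)]


# Item bought for Rs.100,000 with a three-year life
print([round(float(v), 2) for v in straight_line_schedule(100_000, 3)])
# [66666.67, 33333.33, 0.0]

# Capitalized software worth Rs.100,000 with a five-year life
print([int(v) for v in straight_line_schedule(100_000, 5)])
# [80000, 60000, 40000, 20000, 0]
```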
- Direct costs.
- Indirect costs.
Direct costs can be attributed directly to a customer or a number of customers. Example: The purchase of a server exclusively for the use of the payroll department.
Indirect costs are those costs that are shared among a number of customers or groups. They cannot be attributed to any single customer or group.
Indirect costs can be of two types:
- Absorbed costs – here it is possible to track usage by different customers or groups.
- Unabsorbed costs – here, it is not possible to track usage by different customers or groups. Example: The Service Desk. Here the total cost of running the Service Desk is distributed among all user groups.
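One way to picture the distribution of an unabsorbed cost is to split it across user groups in proportion to some agreed measure. The sketch below assumes headcount as that measure. This allocation key, the group names and the figures are hypothetical, since the text above does not prescribe how the Service Desk cost should be divided.

```python
def allocate_unabsorbed_cost(total_cost, headcount_by_group):
    """Split a shared cost across groups in proportion to their headcount (an assumed key)."""
    total_headcount = sum(headcount_by_group.values())
    return {
        group: total_cost * headcount / total_headcount
        for group, headcount in headcount_by_group.items()
    }


# Hypothetical yearly Service Desk cost and user groups
shares = allocate_unabsorbed_cost(120_000, {"payroll": 20, "sales": 50, "operations": 30})
print(shares)  # {'payroll': 24000.0, 'sales': 60000.0, 'operations': 36000.0}
```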
Fixed costs and Variable costs:
Costs can also be classified as:
- Fixed costs – costs remain the same irrespective of usage. Example: cost of a leased line.
- Variable costs – costs increase or decrease according to usage. Example: cost of telephone usage.
After Budgeting and Accounting comes Charging:
Firstly, it is up to the higher management to decide whether or not to charge for a particular service. ITSFM does not decide this.
Sometimes it can be decided that certain services will not be charged at all because of the high costs involved in the charging activities themselves – like invoicing, printing out bills, despatch and delivery of bills etc.
In other instances, a company can decide to charge back from users just what has been spent to provide a particular service. This is called “zero balance” policy.
There is also a “cost plus” policy where the amount recovered is more than what has been spent on providing the service.
The “cost minus” policy only recovers part of the amount spent on providing a particular service. How much that “part” will be is up to the higher management to decide.
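The three policies differ only in how the recovered amount relates to the cost of providing the service. The following sketch expresses that relationship. The function name `amount_to_recover` and the `margin_percent` parameter are assumptions for illustration; in practice the margin is set by the higher management.

```python
def amount_to_recover(cost, policy, margin_percent=0):
    """Return the amount charged back for a service under a given charging policy.

    `margin_percent` is added to the cost under "cost plus" and
    subtracted from it under "cost minus".
    """
    if policy == "zero balance":
        return cost
    if policy == "cost plus":
        return cost * (100 + margin_percent) / 100
    if policy == "cost minus":
        return cost * (100 - margin_percent) / 100
    raise ValueError(f"unknown charging policy: {policy}")


cost = 50_000  # hypothetical cost of providing the service
print(amount_to_recover(cost, "zero balance"))                   # 50000
print(amount_to_recover(cost, "cost plus", margin_percent=10))   # 55000.0
print(amount_to_recover(cost, "cost minus", margin_percent=25))  # 37500.0
```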
Different approaches to charging:
- A “going rate” approach is based on what other departments or groups in the organization charge for the same kind of service.
- The “market rate” approach is based on what other companies charge for a similar type of service.
- A “fixed price” approach is based on an agreed price with the customer or user group.
ITIL recommends that companies should have a charging policy which is fair, easy to understand and easy to control.
Service Level Management
SERVICE LEVEL MANAGEMENT:
Firstly, Service Level Management or SLM is the heart of Service Management.
This process is responsible for making sure that Service Level Agreements (SLAs), Operational Level Agreements (OLAs) and Underpinning Contracts (UPCs) are met.
Service Level Management also makes sure that Service Targets, like availability of services and response time are all agreed upon in advance and accurately documented. It also manages the SLAs and provides targets which can be used to judge the performance of the service provider.
Some people like to say the goal of Service Level Management is to improve the quality of IT Services provided through regularly monitoring, reporting and reviewing the performance of the services, while at the same time working towards removing service or performance bottlenecks whenever possible.
Service Level Management is made up of four important stages. Let’s quickly take a look at each of these:
Stage 1: Create the Service Catalog.
Stage 2: Identify the SLRs (Service Level Requirements): This is basically identifying exactly what kind of service your customers are looking for and what they are willing to pay for.
Stage 3: Based on the SLRs (Service Level Requirements) go ahead and create the Operational Level Agreements (OLAs) and Underpinning Contracts (UPCs).
Stage 4: Create the SLA. Here, you can modify any existing SLAs you may have with any other client to suit your own company’s or organization’s business requirements.
Once you have created the SLA this way, you need to get it formally agreed by the customer. Once the agreement has been formalized, the SLAs need to be implemented and all concerned parties have to be informed about it.
Okay, so what happens after the SLAs are all agreed upon and implemented and you start to actually provide the service? Let’s take a look:
The next stage in SLM is to constantly monitor the services, provide accurate reports to the customer and at the same time, constantly review and modify any specific areas that may be seen as a Service or Performance bottleneck.
The next activity that needs to be carried out is to update the Service Catalog appropriately.
Another crucial activity is a review of the Service Level Management process as a whole. You should be able to point out the Critical Success Factors so that Key Performance Indicators (KPIs) can be established.
Monitoring and Reporting form an important part of Service Level Management. SLAs, OLAs and UPCs need to be monitored constantly so that whenever there is a breach in any of these, the same can not only be rectified at the earliest, but steps can also be taken to avoid or prevent future occurrences.
Reports should be simple and clear. They should clearly state what happened and why. They can be Internal or External.
Internal Reporting covers the SLAs, OLAs and UPCs.
External Reporting covers Exception Reports, which are used to indicate why there was a breakdown in service or why it came close to one.
The other important terms used for reporting in Service Level Management are SLAM or Service Level Agreement Monitoring Chart (another example of External Reporting). The color code used here is called RAG or Red, Amber & Green – which is used to quickly display the Service Levels and/or breaches, if any.
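A RAG classification for a SLAM chart could be computed along these lines. The thresholds are an assumption made for illustration: here Red means the target was breached, Amber means the target was met but only narrowly, and Green means it was met comfortably. The function name `rag_status`, the `warning_margin` parameter and the sample figures are hypothetical.

```python
def rag_status(measured, target, warning_margin=0.2):
    """Classify a measured service level (in percent) against its SLA target.

    Red: below target (breach). Amber: at or above target, but by less than
    `warning_margin` percentage points. Green: otherwise.
    """
    if measured < target:
        return "Red"
    if measured - target < warning_margin:
        return "Amber"
    return "Green"


# Hypothetical monthly availability figures against a 99.5% target
for month, availability in [("Jan", 99.9), ("Feb", 99.6), ("Mar", 99.2)]:
    print(month, rag_status(availability, target=99.5))
```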
Trend Graphs are also quite popular as a Reporting Tool. They show the Consistency of Service over a defined period of time.
Service Support
SERVICE SUPPORT DISCIPLINES:
- Service Desk – is the single point of contact for end-users who need help with the service(s) they are receiving from an IT service provider.
- Incident Management – deals with restoring a service by providing a solution, “quick-fix” or a workaround as soon as possible.
- Problem Management – is responsible to identify the underlying cause of incidents and provide long term solutions to them.
- Change Management – is responsible to ensure that changes are handled promptly and efficiently using standardized procedures in order to minimize the impact of any related incident(s) on the IT service.
- Release Management – deals with implementing new (software or hardware) releases into the operational (live) environment, using the controlling processes of Configuration Management and Change Management.
- Configuration Management – manages the (ICT) assets of an organization and the relationship between these assets.
All of these disciplines are known as “Processes”, except the Service Desk, which is known as a “Function”.
Change Management
CHANGE MANAGEMENT
What is a Change? - Change is the process of moving from one defined state to another.
Change Management is responsible to ensure that standardized methods and procedures are used for the efficient and prompt handling of all changes in order to minimize the impact of any related incident(s) on the (IT) service.
Change Management implements all the changes in an organization with minimum disruption to the IT services. It also carries out appropriate Impact Analyses before the implementation of the change and has a backout plan in place, in case the change does not work out in line with the expectations of the organization.
Change Management balances the need for the change against the risks to the IT infrastructure and will proceed only if:
- The impact is manageable.
- The cost is reasonable.
- The benefits to the business are worth it.
Change Management authorizes all changes to the IT infrastructure through the Change Advisory Board (CAB). The CAB is formed by a team of experts within the organization.
It is not necessary to approach the CAB for approval of all RFCs. In some cases, this responsibility can also be given to the Problem Management team or even the Operations Team. This is discussed in more detail in the later sections.
ITIL mandates that end-users are kept informed of any changes much in advance – this is done through Forward Schedules for Change – which lists out all the details of the change, including when it would occur and all the services and components that would be affected as a result of it.
All changes are released into the organization through the Release Management process.
The Change Management process starts with a Request for Change (RFC).
Some of the important sources of RFCs are:
- From the Service Desk.
- From Problem Management.
- When a new CI (Configuration Item) is introduced into the organization.
- Whenever there is a requirement for a new or changed IT service.
- From a customer or end-user.
- Any new legislation or laws.
The Change Advisory Board (CAB) is made up of:
- A Change Manager (The Change Manager chairs all CAB meetings.)
- Representatives of the IT Service Management team.
- Representatives of the customer.
- Representatives of the users.
- Representatives of developers, other consultants and other experts.
The CAB is responsible to:
- Review all RFCs and approve them if they meet the business requirements.
- Reject the RFCs that do not.
- Keep a record of all RFCs, irrespective of whether they have been accepted or not.
- Advise on the grouping of changes into “Releases” so that there is little or no disruption to the organization.
The CAB EC:
- Stands for Change Advisory Board Emergency Committee.
- Consists of the Change Manager, a senior IT representative and a senior representative from the organization.
- Usually assembles at a short notice to review and authorize any urgent RFCs.
The Change Management Process:
- An RFC is generated to trigger the Change Management process.
- The Change Manager receives the RFC and approves or rejects it, as appropriate.
- Appropriate entries are made into the CMDB and the RFC is then indicated as a Change Record.
- The Change Manager allocates a priority to the change, after assessing the Impact and Urgency.
- The priority can either be “Standard Change” or “Urgent Change”.
- The change is Categorized as “Standard”, “Minor”, “Significant” or “Major”.
- “Standard” changes are usually the low-risk and frequently occurring ones and do not require authorization by the CAB. Example: upgrading a user’s computer or replacing a piece of hardware in a user’s computer.
- “Minor” changes are usually authorized by the Change Manager himself who then informs the CAB later.
- CAB authorization is needed for “Significant” and “Major” changes.
- If the RFC is approved, it is then implemented through the Release Management process after circulating a Forward Schedule for Change.
- If the Change is successfully implemented, it is then reviewed by the Change Manager and closed.
- If the Change is not successful, the Change Manager initiates the back out plan to get back to the previously working state.
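The authorization rules in the steps above can be summarized as a small lookup. This is only a sketch: the function name `change_authority` and the returned labels are illustrative, and routing every urgent change to the CAB EC is a simplification of the text.

```python
def change_authority(category, urgent=False):
    """Return who authorizes a change, following the categories described above."""
    if urgent:
        # Urgent RFCs are reviewed and authorized by the emergency committee
        return "CAB EC"
    authorities = {
        "Standard": "No CAB authorization required",
        "Minor": "Change Manager (CAB informed later)",
        "Significant": "CAB",
        "Major": "CAB",
    }
    if category not in authorities:
        raise ValueError(f"unknown change category: {category}")
    return authorities[category]


print(change_authority("Minor"))               # Change Manager (CAB informed later)
print(change_authority("Major"))               # CAB
print(change_authority("Major", urgent=True))  # CAB EC
```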
Metrics for measurement of the Change Management process:
- Total number of changes in the defined period.
- Total number of Urgent Changes.
- Total number of changes implemented.
- Total cost for each change as against estimated costs.
- Number of rejected changes.
Change Management Audits should check:
- for compliance to all Change Management procedures.
- all Software releases to ensure they have been through the proper authorization process.
- all Incident records selected randomly through the change records.
- minutes of CAB meetings.
- Forward Schedules for Change.
- change review records.
Configuration Management
CONFIGURATION MANAGEMENT
Configuration Management is defined in ITIL as “Asset Management plus Relationships” (with other Configuration Items [CIs]). It is important to consider the relationships between the CIs, as making changes to one component can affect another CI.
Configuration Management underpins all delivery and support processes and defines IT assets and services as Configuration Items.
Configuration Item (CI) – A Configuration Item is a part of the ICT infrastructure (which can be a hardware, software, documentation or peopleware component). This term can be used to indicate whole systems or just a single hardware/software component. (In other words, a CI is “any component that has to be managed in order to deliver an IT service”.) CIs are under the control of the Change Management process.
Configuration Management Database (CMDB) – This is a database of all Configuration Items (CIs) in the organization. It not only contains full details of individual components but also contains details of the relationships between them.
Configuration Structure is the hierarchy of all CIs in any configuration.
Configuration Management Plan – is a document that lays out details of organization and procedures for the Configuration Management of any particular product or service.
Configuration Management has the responsibility to ensure that:
- the organization has an accurate record of its ICT (Hardware, Software & Peopleware) assets.
- all details of the services offered by the organization, their related ICT components and any relevant supporting documentation are recorded.
- any changes to the IT service(s) are done with the least risk to the business.
- it provides a sound basis for Incident Management, Problem Management, Change Management & Release Management.
- configuration records are verified against the existing infrastructure and any exceptions are corrected.
Configuration Management has the following sub-processes:
Planning - consists of five sub-processes:
- Strategy, Policy, Scope and Objectives – to establish an effective Configuration Management System.
- Processes, Procedures, Guidelines & Responsibilities – to manage and control the ICT assets.
- Relationship with other ITIL Processes – define how Configuration Management will interact with other processes or vice-versa.
- Relationship with other Configuration Management teams – to exchange information with CMDBs of suppliers, external vendors, developers etc.
- Tools & Resource Requirements – to connect the CMDB to the system and network management tools to enable the automatic addition of CIs to the CMDB.
Identification – accurately identifying and labeling the CIs with IDs, versions, types and their relationships to other CIs.
Control - this has three sub-processes:
- Register – registering a CI when it enters the IT infrastructure.
- Update – updating the status of the CI to reflect its most current status.
- Archive – removing the CI from the CMDB and archiving it in a secure location.
Status Accounting – is concerned with reporting of all current & historical data about each CI throughout its life cycle.
Verification – (Verification & Audit) is responsible to ensure that the information contained in the CMDB exactly matches the live environment.
The Configuration Management Database (CMDB) - Configuration Management monitors the relationships between Configuration Items. It stores all details about these relationships (between CIs) in a CMDB.
A typical CMDB should contain the following information – refer to the image below:
- Details about the ICT components (Hardware, Software, Peopleware and related documents).
- All the services offered by the organization and related CIs and the relationships between these CIs.
- Details of all Incidents, Problems and Known errors.
- Details about all Changes & Releases.
[Image: ITIL – CMDB]
The CMDB contains information about the DHS (Definitive Hardware Store) and the DSL (Definitive Software Library), which are managed by the Release Management process.
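Because the CMDB records relationships as well as components, it can be used to work out which CIs might be affected when one of them changes. The sketch below models the relationships as a plain dictionary and walks them breadth-first. The CI names, the dictionary layout and the function name `affected_cis` are hypothetical; a real CMDB would be queried through its own tooling.

```python
from collections import deque


def affected_cis(depends_on_me, changed_ci):
    """Return every CI that directly or indirectly depends on `changed_ci`."""
    affected = set()
    queue = deque([changed_ci])
    while queue:
        ci = queue.popleft()
        for dependant in depends_on_me.get(ci, []):
            if dependant not in affected:
                affected.add(dependant)
                queue.append(dependant)
    affected.discard(changed_ci)  # guard against circular relationships
    return affected


# Hypothetical relationships: CI -> CIs that rely on it
relationships = {
    "server-01": ["payroll-db"],
    "payroll-db": ["payroll-app"],
    "payroll-app": ["payroll-service"],
}
print(sorted(affected_cis(relationships, "server-01")))
# ['payroll-app', 'payroll-db', 'payroll-service']
```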
Incident Management
INCIDENT MANAGEMENT
What is an Incident?
An incident is any event which is not a part of the Standard Operation of a Service and which causes or may cause an interruption to or reduction in the quality of that Service.
The aim of Incident Management is to restore normal services as quickly as possible.
Some best practices:
- All inquiries should be recorded as incidents.
- Service Requests (request for a standard operational item, eg: password resets) should be recorded as incidents.
- A request for a new product or service should be recorded as a Request for Change (RFC).
- Automatically generated incidents (such as hardware or network failure) should also be recorded as incidents.
The Incident Life-Cycle
DETECTION & RECORDING:
- Provide a unique ID for each incident, even if it is a known issue.
- Record how the incident was reported – what were the Services and Configuration Items affected?
- Classify the incidents – like Hardware, Software or Service Requests.
- Match the current incident against previously reported incidents.
- Assign a priority to the incident. (Priority of an incident is determined by the Impact, Urgency, Availability of resources and the existence of certain parameters in the Service Level Agreement [SLA]).
- Provide initial support to the incident or provide a workaround. If it is a new workaround provided by the IT Service Desk, record it for future use.
- If the incident cannot be resolved, escalate the incident functionally.
INVESTIGATION & DIAGNOSIS:
- This may lead to resolution of the Incident right away or to it being functionally escalated (to Level 2 support). If that process is taking too much time, it might also get hierarchically escalated.
RESOLUTION & RECOVERY:
- This can be done by raising an RFC and getting it implemented. Recovery just means “restoring a service or an ICT component back to its previously working condition“.
INCIDENT CLOSURE:
- This happens upon confirmation of resolution of the problem by the user.
Note:
- Impact is the measure of the level of effect the incident has on the business, for example: number of users affected or amount of revenue lost because of the incident.
- Urgency indicates the timescale within which the incident needs to be resolved.
For an incident to be considered High Priority – both the Impact & Urgency should be high.
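As a rough illustration of the rule above, the sketch below derives a priority label from Impact and Urgency. Only the "High" rule comes from the text; the three-level scale, the names `Level` and `incident_priority`, and the Medium/Low split are assumptions made for the example. A real SLA would define its own matrix and may also weigh the availability of resources.

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical three-step scale used for both Impact and Urgency."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def incident_priority(impact: Level, urgency: Level) -> str:
    """Return a priority label for an incident.

    Rule taken from the text: High priority needs BOTH high impact
    and high urgency. The Low and Medium rules are assumptions.
    """
    if impact == Level.HIGH and urgency == Level.HIGH:
        return "High"
    if impact == Level.LOW and urgency == Level.LOW:
        return "Low"
    return "Medium"


# Many users affected, but a fix can wait: not High priority.
print(incident_priority(Level.HIGH, Level.LOW))
```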
Problem Management
PROBLEM MANAGEMENT
What is a Problem? – A Problem is “an unknown, underlying cause of one or more Incidents“.
Did you know? 80% of incidents are caused by 20% of ICT infrastructure components!
Problem Management is responsible to minimize the adverse effects of the Incidents and Problems caused by errors in the (ICT) Infrastructure on the business and to proactively prevent the occurrence of such errors, incidents and problems.
Problem Management looks for the underlying causes of Incidents and Problems and provides long-term (permanent) resolutions. It functions both Proactively and Reactively:
- Proactively: by trying to prevent the occurrence of issues by intelligently analyzing problem trends and available statistics.
- Reactively: by identifying underlying problems which are causing the incidents and finding a permanent resolution or an immediate workaround.
When Problem Management successfully identifies a problem and a suitable resolution to it – the resolution is implemented through the Change Management process.
Prioritization of problems is generally done by the “Pain Factor (PF)”. (The Pain Factor is nothing but the number of people affected by the problem and the impact it is having on the business.) So, higher the PF, higher the priority.
Responsibilities of the Problem Management team:
- Problem Control: Transform Problems into Known Errors by identifying the root cause of the problem and providing a temporary workaround. (This converts a Problem into a Known Error.)
- Error Control: Resolves the Known Errors under the control of Change Management as soon as possible and whenever it is financially justifiable.
- Proactive Prevention of Problems: Carry out trend analyses and provide support to the organization.
- Providing Management Information from Problem Data: Carry out trend analyses and provide support to the organization.
- Conducting Major Problem Reviews: This is done after a major problem has been resolved so that future problems can be prevented.
The Problem Management process consists of the following stages:
- Identification: The first step is to identify a new Problem. If there are no matching records in the existing Problem or Known Errors database, then it is classified as a new Problem.
- Recording: A new record is created and a unique ID is assigned. All related Configuration Items are linked to it as well as all related Incidents/Known Errors.
- Classification: The Problem is classified appropriately and the impact of the Problem on the Service Levels is determined so that relevant resources can be assigned to resolve it.
- Investigation: The Problem is investigated so that a resolution is identified and it can be classified as a Known Error.
- Diagnosis: Techniques such as Kepner-Tregoe analysis and Ishikawa (fishbone) analysis are used. The end result again is the identification of a resolution or a temporary workaround to the problem so that it is converted into a Known Error.
- Review & Closure: After every Problem is resolved – it is thoroughly reviewed so that the following questions can be answered:
1. What was done right?
2. What was not done right?
3. What could have been done better?
4. How can we prevent it from happening again?
Release Management
RELEASE MANAGEMENT
What is a Release? A Release is a collection of authorized changes to an IT service.
Release Management is responsible for the implementation of all new and existing Hardware and Software releases (along with the related documents) into the live (operational) environment, under the controlling processes of Change Management and Configuration Management. This process is concerned with protecting the live environment from any disruption, and the Release Management activities are usually performed under the supervision of the Change Manager.
Releases can be classified as:
- Major Software Releases and Hardware Upgrades – These usually contain large amounts of new functionality and override all preceding minor releases and upgrades.
- Minor Software Releases and Hardware Upgrades – These usually contain small amounts of new functionality and override all preceding emergency releases and upgrades.
- Emergency Fixes – usually contains fixes to a small number of issues.
All changes are released as “Roll Outs“. A Roll Out includes distributing all the Configuration Items to wherever they are used. This can be done in many ways, for example: by internet, email, remotely or even by sending them on CDs. But when there are large releases to be rolled out over a vast geographical and cultural area, the use of automated scripts is a great help. However, these scripts might need passwords to activate them.
Release Management has to maintain traceability of all the releases. We need to know where a particular version has come from and what are the changes it contains.
The Release Management process covers the following three areas:
- Development area.
- Release Management’s own pre-production area.
- Operational environment (live/production area.)
Migration from one area to another is subject to results from reviews, tests and other relevant quality checks.
Before a Release is Rolled Out into the live environment, Operational and Customer Acceptance tests are carried out. Operational tests ensure that anything that goes into the live environment is supportable, maintainable and robust. All existing and planned Backout Plans should also be fully tested.
The Contents of each Release are decided by Change Management but the Release Management team is always kept fully informed.
Hardware and Software Releases go through the following stages before they are Released into the live environment:
- Distribute.
- Build or Rebuild in the Live environment.
- Implementation.
It is important that each of these stages is carried out accurately before the Release progresses to the next one.
Release Management is also responsible for:
- DHS - This is the Definitive Hardware Store. It is a secure location or a number of locations where authorized versions of all hardware spares (Configuration Items in the live environment that exist in the CMDB) are stored.
- DSL - This is the Definitive Software Library. Again, this is a secure location or a number of locations where copies of all authorized versions of software CIs are stored. (CI stands for Configuration Item.) It can also be defined as a physical library or repository where master copies of all software versions are stored.
Information about the DHS & DSL exists in the CMDB (Configuration Management Database) and Configuration Management is responsible to keep it always updated.
Adequate protection/security should be provided to both the DHS and DSL against eventualities like floods, earthquakes, fire and of course theft. In case of the DSL, it should also be protected from viruses, data corruption etc.
Release Unit: A Release Unit is the portion of the IT infrastructure that is normally released together.
Release Type: There are three Release types which are as given below:
- Full Release – In a Full Release, all components of the release unit are built, tested, distributed and released together. This is suitable for major changes and is very expensive.
- Delta Release – In a Delta Release, only those components that have changed since the last release are distributed. This type of release is best suited for fixes and emergency changes and is less expensive.
- Package Release – A Package Release is a group of Delta/Full release(s) which are released simultaneously. This type of release is suitable in situations where changes in one system may require changes to another.
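The difference between a Full Release and a Delta Release can be sketched as a comparison of component versions. This is only an illustration: the component names, the version strings and the helper `delta_release` are hypothetical, and a real release tool would work from CMDB records rather than plain dictionaries.

```python
from typing import Dict


def delta_release(previous: Dict[str, str], current: Dict[str, str]) -> Dict[str, str]:
    """Return only the components that are new or changed since the last release."""
    return {
        name: version
        for name, version in current.items()
        if previous.get(name) != version
    }


# Hypothetical release unit: component name -> version.
last_release = {"payroll-app": "1.0", "db-schema": "1.0", "reports": "1.0"}
this_release = {"payroll-app": "1.1", "db-schema": "1.0", "reports": "1.0"}

# A Full Release distributes every component of the release unit.
print("Full release: ", sorted(this_release))
# A Delta Release distributes only what changed.
print("Delta release:", sorted(delta_release(last_release, this_release)))
```

A Package Release could then be modelled as a list of such Full or Delta releases that are rolled out together.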
Release Identification: Each release has to be identified. Usually, a numeric format is used. The specific release identification policy is generally decided by the Release Manager, after consulting with the Change Manager and the CAB. Example: A new application can be assigned an ID like v. 1.0 and a later, minor release with some changes to its components can be identified as v.1.1. There is really no limit to the number of such levels that can be used to identify each release.
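To show how numeric release IDs of any depth can be ordered, here is a minimal sketch. The accepted format (an optional "v", an optional dot or space, then dot-separated numbers) is an assumption based on the two IDs in the example above; `parse_release_id` is a hypothetical helper, not part of ITIL.

```python
def parse_release_id(release_id: str) -> tuple:
    """Turn an ID such as 'v. 1.0' or 'v.1.1' into a tuple of integers."""
    text = release_id.strip()
    if text.lower().startswith("v"):
        text = text[1:]
    # Drop the dot and/or space that may follow the 'v'.
    text = text.lstrip(". ")
    return tuple(int(part) for part in text.split("."))


# Tuples compare level by level, so any number of levels works.
releases = ["v.1.1", "v. 1.0", "v.1.0.3"]
print(sorted(releases, key=parse_release_id))
```

Comparing tuples rather than strings avoids the usual trap where "1.10" sorts before "1.9".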
Roll Out Types: Releases can be rolled out in any/or a combination of the following ways:
- Big Bang Roll-outs - where all sites in the enterprise receive the releases simultaneously.
- Phased Roll-outs – where all sites receive some functionality at the same time and the remaining functionality at a later time.
- Pilot Roll-outs – where a single site receives all the functionality at one time, ahead of the others.
Service Desk
THE SERVICE DESK
When a company provides an IT Service to its customers, they are bound to have questions or might run into problems. They need a place where they can get quick answers, quick resolutions or, at least, a quick workaround so that they can carry on with their work and with their lives.
Customers get easily frustrated if they cannot get help – just when they need it.
An IT Service Desk is meant to be a single point of contact for customers who have a problem with the services they are receiving from the IT Service Provider. This is where they will report Incidents, RFCs (Requests for Change) or just any other problem.
Conversely, the IT Service Provider can also use the Service Desk as a channel through which he can communicate to his customers.
Service Desks are also known as Customer Help Desk, Help Desk, Hotline, Call Center, Customer Hotline etc. But we will just call it the Service Desk, which is nothing but an IT Service Desk. (You need to remember this.)
Why should you have a Service Desk?
- A good Service Desk helps to reduce Customer’s complaints.
- It increases Customer’s satisfaction.
- It reduces Service downtime and wastage of manpower.
- It increases Customer retention.
The Service Level Agreement (SLA) contains specific details about the hours of availability of the service, the time taken to resolve an issue and the time within which the user must receive a response from the Service Desk. It is important for the Service Desk personnel to be aware of the SLA parameters.
Some organizations have a Service Desk which is a single point of contact for any issue they might face – from IT Problems to the-lift-is-not-working issues. But we are going to consider a Service Desk which deals only with IT related issues.
An IT Service Desk is responsible to:
- Monitor incidents and user’s queries.
- Keep the user updated about the progress.
- Follow up with any second level team and push for a quicker resolution.
- Make sure that SLAs are not breached.
A good IT Service Desk must have:
- Well trained Service Desk personnel (with good people skills).
- A properly organized system to record and track incidents.
- The Service Desk system should be able to identify similar incidents, even if previously reported/fixed.
- Should have a proper knowledge base which can be used as an important reference point.
- Service Desk Team should be technically competent to deal with users’ issues as and when they arise.
- Proper channels of communication with the other disciplines like Problem Management (for help when there is a major problem), Service Level Management (so that SLAs can be adhered to), Configuration Management (so that the user’s equipment can be identified whenever required) & Availability Management (for analysis of Service Desk related data which could help improve Services).
Types of IT Service Desks:
- Local Service Desk – to take care of regional/local users.
- Centralized Service Desk – which serves users from all geographical regions.
- Virtual Service Desk – is a collection of many local Service Desks, where calls are routed to the most appropriate Service Desk, based on the issue, time of day, location of user and so on.
How to contact an IT Service Desk?
An IT Service Desk gets its inputs from two major sources – Human Sources and Machine Sources.
- Humans can contact the IT Service Desk by Telephone, Fax, Email etc.
- Machines send out system generated alerts to the IT Service Desk.
Escalations:
An escalation is the process where an incident is forwarded to a higher level or a more expert team for better resolution. Escalation can be of two types:
- Functional Escalation: Where an issue is passed on to a more competent team (Example: Escalating an issue to the Level 2 support team due to lack of enough expertise to solve a certain problem).
- Hierarchical Escalation: Where an issue is passed up the management chain because of lack of authority to do something. (Example: Escalating to the Service Desk Manager).
IT Services Financial Management
IT SERVICES FINANCIAL MANAGEMENT:
The aim of IT Services Financial Management (ITSFM) is to provide an efficient and cost effective management of IT and financial resources in an enterprise. It also helps to extract maximum business value for minimum financial investments.
ITSFM helps to achieve maximum business value by:
- Calculating the Return on Investment (ROI) for the ITSM (IT Service Management) team and thereby helping in important decision-making.
- Financial Forecasting.
- Identifying, managing and controlling all costs incurred, both internally and externally (this can include costs incurred through any contracts with external suppliers). This helps to assess the Total Cost of Ownership, which is the sum of the total cost of development and the total cost of supporting it during its lifetime.
For any enterprise, it is important to strike the right balance between quality and cost of providing a service. There might be times when customers (or users) might demand close to 100% availability of a service, but it is the ITSFM team who will work out and provide the cost/benefit analysis to the ITSM team, based on which they (the ITSM team) can make a decision.
ITSFM is also responsible to recover the cost of a service /services from those who use them. Though this would primarily be a decision of the higher management, ITSFM can suggest the use of “differential charging” taking into consideration the demand for the service and the cost of providing that service.
ITSFM is concerned with the following functions:
- Budgeting: It deals with forecasting the money required to provide a particular service and tries to secure that money from the business for that purpose. It also monitors and controls the expenditure against the budgeted amounts.
- Accounting: It deals with keeping a track of where the money goes from that budget.
- Charging: deals with recovering the cost of providing a service from a customer or user. The charges for each service that is being provided to a customer should be accurately documented in the SLA (Service Level Agreement).
This is how it works – budgeting would tell you how much money has been allocated to provide a certain type of service. Accounting will tell you how that money is being/has been spent. And charging will tell you how much of the money spent on providing the service is being recovered.
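The relationship between the three functions can be sketched with a few lines of arithmetic. All figures and the `financial_summary` helper are hypothetical; the point is only which question each function answers.

```python
def financial_summary(budgeted: int, expenses: list, charges: list) -> dict:
    """Summarize one service: budgeting, accounting and charging figures."""
    spent = sum(expenses)        # Accounting: where the money went
    recovered = sum(charges)     # Charging: what was billed back to users
    return {
        "budgeted": budgeted,                # Budgeting: what was allocated
        "spent": spent,
        "remaining": budgeted - spent,
        "recovered": recovered,
        "unrecovered": spent - recovered,
    }


# Hypothetical figures for a single service.
summary = financial_summary(
    budgeted=100_000,
    expenses=[40_000, 25_000],
    charges=[30_000],
)
for key, value in summary.items():
    print(f"{key:12} {value:>8}")
```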
Budgeting and accounting work hand-in-hand to identify and evaluate all the costs incurred to provide a service and how the money is being spent.
Budgeting, accounting and charging have a hierarchical relationship. While ITIL recommends implementation of at least budgeting and accounting, charging is optional. Accounting is required to calculate the ROI (Return on Investment) and Cost/Benefit calculations. These calculations might be necessary whenever new services are introduced or whenever there is a change to an existing service.
Types of costs:
Breaking down and classifying all the costs is crucial to creating a budget. The different types of costs in ITIL are as follows:
- Hardware (like computers, servers, printers, routers, hubs etc.)
- Software (like Operating Systems, proprietary applications, third-party applications etc.)
- People (costs incurred towards salaries and other benefits etc.)
- Accommodation (like offices, utility and storage spaces etc.)
- External Services (these refer to work that might be outsourced like security, development, facilities, disaster recovery, ISP etc.)
- Transfers (costs incurred due to cross charging within the business.)
Classification of costs:
According to ITIL, costs must be classified at least into the following two categories:
- Capital costs.
- Operational costs.
Capital costs are usually associated with the purchase of fixed assets like land and buildings, and hardware such as computers and servers. They usually increase the value of the company. Operational costs, on the other hand, are those incurred in the day-to-day running of the company and include salaries, rents of buildings and other equipment, software licenses etc. Operational costs do not increase the value of the company because they are recurring in nature.
Depreciation: Sometimes it is necessary to track capital purchases which lose their value over a certain period of time. Suppose an item was bought for Rs.100,000 and it was supposed to last three years. It just means that every year its value will reduce by one-third of the initial cost of Rs.100,000. This way, at the end of three years this item will have zero value. This is how depreciation is calculated.
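The calculation described above (the same amount written off every year, commonly called straight-line depreciation) can be worked through in a short sketch. The function name `straight_line_book_values` is hypothetical; the figures are the ones from the example.

```python
def straight_line_book_values(cost: float, life_years: int) -> list:
    """Return the remaining value at the end of each year of the item's life."""
    if life_years <= 0:
        raise ValueError("life_years must be positive")
    return [
        round(cost * (life_years - year) / life_years, 2)
        for year in range(1, life_years + 1)
    ]


# The item from the example: Rs.100,000 over three years.
for year, value in enumerate(straight_line_book_values(100_000, 3), start=1):
    print(f"End of year {year}: Rs.{value:,.2f}")
# End of year 1: Rs.66,666.67
# End of year 2: Rs.33,333.33
# End of year 3: Rs.0.00
```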
Capitalization: This is just the opposite of depreciation. Here, operational costs are sometimes put up as capital costs so that they too can be allowed to depreciate. Let’s say a company spends Rs.100,000 to develop a software application. When this application is ready, it adds Rs.100,000 as value to the company. But if its life span is, say, five years, then it will depreciate accordingly and finally, at the end of five years, it will have zero value.
Costs can also be classified as :
- Direct costs.
- Indirect costs.
Direct costs can be attributed directly to a customer or a number of customers. Example: The purchase of a server exclusively for the use of the payroll department.
Indirect costs are those costs that are shared among a group of customers or groups. They cannot be attributed to any single customer or group.
Indirect costs can be of two types:
- Absorbed costs – here it is possible to track usage by different customers or groups.
- Unabsorbed costs – here, it is not possible to track usage by different customers or groups. Example: The Service Desk. Here the total cost of running the Service Desk is distributed among all user groups.
Fixed costs and Variable costs:
Costs can also be classified as:
- Fixed costs – costs remain the same irrespective of usage. Example: cost of a leased line.
- Variable costs – increases or decreases according to usage. Example: cost of telephone usage.
After Budgeting and Accounting comes Charging:
Firstly, it is up to the higher management to decide whether or not to charge for a particular service. ITSFM does not decide this.
Sometimes it can be decided that certain services will not be charged at all because of the high costs involved in the charging activities themselves – like invoicing, printing out bills, despatch and delivery of bills etc.
In other instances, a company can decide to charge back from users just what has been spent to provide a particular service. This is called “zero balance” policy.
There is also a “cost plus” policy where the amount recovered is more than what has been spent on providing the service.
The “cost minus” policy only recovers part of the amount spent on providing a particular service. Now, how much that “part” will be is up to the higher management to decide.
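The three charging policies differ only in how much of the cost is recovered, which the following sketch makes explicit. Expressing "plus" and "minus" as a fraction of the cost (the `rate` parameter) is an assumption made for the example; management could just as well set a fixed amount.

```python
def amount_to_charge(cost: float, policy: str, rate: float = 0.0) -> float:
    """Return the amount to recover from users under a charging policy.

    `rate` is a hypothetical fraction of the cost: the markup for
    "cost plus" or the share left unrecovered for "cost minus".
    """
    if policy == "zero balance":
        return cost                  # recover exactly what was spent
    if policy == "cost plus":
        return cost * (1 + rate)     # recover more than what was spent
    if policy == "cost minus":
        return cost * (1 - rate)     # recover only part of what was spent
    raise ValueError(f"unknown charging policy: {policy!r}")


service_cost = 1000
for policy in ("zero balance", "cost plus", "cost minus"):
    print(policy, amount_to_charge(service_cost, policy, rate=0.25))
```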
Different approaches to charging:
- A “going rate” approach is based on what departments or groups charge for the same kind of service provided.
- The “market rate” approach is based on what other companies charge for a similar type of service.
- A “fixed price” approach is based on an agreed price with the customer or user group.
ITIL recommends that companies should have a charging policy which is fair, easy to understand and easy to control.
Service Level Management
SERVICE LEVEL MANAGEMENT:
Firstly, Service Level Management or SLM is the heart of Service Management.
This process is responsible to make sure that Service Level Agreements (SLAs), Operational Level Agreements(OLAs) and Underpinning Contracts (UPCs) are met.
Service Level Management also makes sure that Service Targets, like availability of services and response time are all agreed upon in advance and accurately documented. It also manages the SLAs and provides targets which can be used to judge the performance of the service provider.
Some people like to say the goal of Service Level Management is to improve the quality of IT Services provided through regularly monitoring, reporting and reviewing the performance of the services, while at the same time working towards removing service or performance bottlenecks whenever possible.
Service Level Management is made up of four important stages. Let’s quickly take a look at each of these:
Stage 1: Create the Service Management Catalog.
Stage 2: Identify the SLRs (Service Level Requirements): This is basically identifying exactly what kind of service your customers are looking for and what they are willing to pay for.
Stage 3: Based on the SLRs (Service Level Requirements) go ahead and create the Operational Level Agreements (OLAs) and Underpinning Contracts (UPCs).
Stage 4: Create the SLA. Here, you can modify any existing SLAs you may have with any other client to suit your own company’s or organization’s business requirements.
Once you have created the SLA this way, you need to get it formally agreed by the customer. Once the agreement has been formalized, the SLAs need to be implemented and all concerned parties have to be informed about it.
Okay, so what happens after the SLAs are all agreed upon and implemented and you start to actually provide the service? Let’s take a look:
The next stage in SLM is to constantly monitor the services, provide accurate reports to the customer and at the same time, constantly review and modify any specific areas that may be seen as a Service or Performance bottleneck.
The next activity that needs to be carried out is to update the Service Catalog appropriately.
A crucial activity that also needs to be carried out is the review of the Service Level Management process as a whole. You should be able to point out the Critical Success Factors so that Key Performance Indicators (KPIs) can be established.
Monitoring and Reporting form an important part of Service Level Management. SLAs, OLAs and UPCs need to be monitored constantly so that whenever there is a breach in any of these, the same can not only be rectified at the earliest, but steps can also be taken to avoid or prevent future occurrences.
Reports should be simple and clear. They should clearly state what happened and why. They can be Internal or External.
Internal Reporting covers the SLAs, OLAs and UPCs.
External Reporting covers Exception reports, which are used to indicate why there was a breakdown in service or why it came close to one.
The other important terms used for reporting in Service Level Management are SLAM or Service Level Agreement Monitoring Chart (another example of External Reporting). The color code used here is called RAG or Red, Amber & Green – which is used to quickly display the Service Levels and/or breaches, if any.
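A RAG status for one service target could be derived along the following lines. The text does not say where Amber begins, so the `warning_margin` threshold, the function name `rag_status` and the availability figures are all assumptions for illustration.

```python
def rag_status(measured: float, target: float, warning_margin: float = 0.5) -> str:
    """Return Red, Amber or Green for a 'higher is better' measure.

    Red   : the target was breached.
    Amber : the target was met, but by less than `warning_margin`.
    Green : the target was met comfortably.
    """
    if measured < target:
        return "Red"
    if measured < target + warning_margin:
        return "Amber"
    return "Green"


# Hypothetical monthly availability (percent) against a 99.0% target.
for month, availability in [("Jan", 99.8), ("Feb", 99.2), ("Mar", 98.5)]:
    print(month, rag_status(availability, target=99.0))
```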
Trend Graphs are also quite popular as a Reporting Tool. They show the Consistency of Service over a defined period of time.
Service Support
SERVICE SUPPORT DISCIPLINES:
- Service Desk – is the single point of contact for end-users who need help with the service(s) they are receiving from an IT service provider.
- Incident Management – deals with restoring a service by providing a solution, “quick-fix” or a workaround as soon as possible.
- Problem Management – is responsible to identify the underlying cause of incidents and provide long term solutions to them.
- Change Management – is responsible to ensure that changes are handled promptly and efficiently using standardized procedures in order to minimize the impact of any related incident(s) on the IT service.
- Release Management – deals with implementing new (software or hardware) releases into the operational (live) environment, using the controlling processes of Configuration Management and Change Management.
- Configuration Management – manages the (ICT) assets of an organization and the relationship between these assets.
All these disciplines except the Service Desk are known as “Processes“; the Service Desk alone is known as a “Function“.
Change Management
CHANGE MANAGEMENT
What is a Change? - Change is the process of moving from one defined state to another.
Change Management is responsible to ensure that standardized methods and procedures are used for the efficient and prompt handling of all changes in order to minimize the impact of any related incident(s) on the (IT) service.
Change Management implements all the changes in an organization with minimum disruption to the IT services. It also carries out appropriate Impact Analyses before the implementation of the change and has a backout plan in place, in case the change does not work out in line with the expectations of the organization.
Change Management balances the need for the change against the risks to the IT infrastructure and will proceed only if:
- The impact is manageable.
- The cost is reasonable.
- The benefits to the business are worth it.
Change Management authorizes all changes to the IT infrastructure through the Change Advisory Board (CAB). The CAB is formed by a team of experts within the organization.
It is not necessary to approach the CAB for approval of all RFCs. In some cases, this responsibility can also be given to the Problem Management team or even the Operations Team. This is discussed in more detail in the later sections.
ITIL mandates that end-users are kept informed of any changes much in advance – this is done through Forward Schedules for Change – which lists out all the details of the change, including when it would occur and all the services and components that would be affected as a result of it.
All changes are released into the organization through the Release Management process.
The Change Management process starts with a Request for Change (RFC).
Some of the important sources of RFCs are:
- From the Service Desk.
- From Problem Management.
- When a new CI (Configuration Item) is introduced into the organization.
- Whenever there is a requirement for a new or changed IT service.
- From a customer or end-user.
- Any new legislation or laws.
The Change Advisory Board (CAB) is made up of:
- A Change Manager (The Change Manager chairs all CAB meetings.)
- Representatives of the IT Service Management team.
- Representatives of the customer.
- Representatives of the users.
- Representatives of developers, other consultants and other experts.
The CAB is responsible to:
- Review all RFCs and approve them if they meet the business requirements.
- Else the RFCs will be rejected.
- Keep a record of all RFCs, irrespective of whether they have been accepted or not.
- Advise on the grouping of changes into “Releases” so that there is little or no disruption to the organization.
The CAB EC:
- Stands for Change Advisory Board Emergency Committee.
- Consists of the Change Manager, a senior IT representative and a senior representative from the organization.
- Usually assembles at a short notice to review and authorize any urgent RFCs.
The Change Management Process:
- An RFC is generated to trigger the Change Management process.
- The Change Manager receives the RFC and approves or rejects it, as appropriate.
- Appropriate entries are made into the CMDB and the RFC is then indicated as a Change Record.
- The Change Manager allocates a priority to the change, after assessing the Impact and Urgency.
- The priority can either be “Standard Change” or “Urgent Change”.
- The change is Categorized as “Standard”, “Minor”, “Significant” or “Major”.
- “Standard” changes are usually the low-risk and frequently occurring ones and do not require authorization by the CAB. Example: upgrading a user’s computer or replacing a piece of hardware in a user’s computer.
- “Minor” changes are usually authorized by the Change Manager himself who then informs the CAB later.
- CAB authorization is needed for “Significant” and “Major” changes.
- If the RFC is approved, it is then implemented through the Release Management process after circulating a Forward Schedule for Change.
- If the Change is successfully implemented, it is then reviewed by the Change Manager and closed.
- If the Change is not successful, the Change Manager initiates the back out plan to get back to the previously working state.
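The authorization rules in the process above can be summarized as a small lookup. This is a sketch only: the function name `who_authorizes` is hypothetical, and sending urgent Significant or Major changes to the CAB EC is an interpretation of the CAB EC description rather than a rule stated in the process steps.

```python
# Who authorizes a change, by category, as described in the process above.
AUTHORIZER_BY_CATEGORY = {
    "Standard": "No CAB authorization required",
    "Minor": "Change Manager (CAB informed later)",
    "Significant": "CAB",
    "Major": "CAB",
}


def who_authorizes(category: str, urgent: bool = False) -> str:
    """Return who authorizes a change of the given category."""
    if category not in AUTHORIZER_BY_CATEGORY:
        raise ValueError(f"unknown change category: {category!r}")
    # Assumption: urgent changes that would normally go to the full CAB
    # are handled by the CAB Emergency Committee instead.
    if urgent and category in ("Significant", "Major"):
        return "CAB EC"
    return AUTHORIZER_BY_CATEGORY[category]


print(who_authorizes("Minor"))
print(who_authorizes("Major"))
print(who_authorizes("Major", urgent=True))
```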
Metrics for measurement of the Change Management process:
- Total number of changes in the defined period.
- Total number of Urgent Changes.
- Total number of changes implemented.
- Total cost for each change as against estimated costs.
- Number of rejected changes.
Change Management Audits should check:
- for compliance to all Change Management procedures.
- all Software releases to ensure they have been through the proper authorization process.
- all Incident records selected randomly through the change records.
- minutes of CAB meetings.
- Forward Schedules for Change.
- change review records.
Configuration Management
CONFIGURATION MANAGEMENT
Configuration Management is defined in ITIL as “Asset Management plus Relationships” (with other Configuration Items [CIs]). It is important to consider the relationships between the CIs, as making changes to one component can affect another CI.
Configuration Management underpins all delivery and support processes and defines IT assets and services as Configuration Items.
Configuration Item(CI) – A Configuration Item is a part of the ICT infrastructure (which can be a hardware, software, documentation or peopleware component). This term can be used to indicate whole systems or just a single hardware/software component. (In other words, a CI is “any component that has to be managed in order to deliver an IT service”.) CIs are under the control of the Change Management process.
Configuration Management Database(CMDB) – This is a database of all Configuration Items (CIs) in the organization. It not only contains full details of individual components but also contains details of the relationships between them.
Configuration Structure is the hierarchy of all CIs in any configuration.
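Because the CMDB stores relationships as well as the CIs themselves, it can be used to find out what a change to one CI might affect. The sketch below assumes a very simple model in which each CI lists the CIs that depend on it; the CI names, the `DEPENDENTS` mapping and the `affected_cis` helper are all hypothetical.

```python
from collections import deque

# Hypothetical CMDB extract: CI -> CIs that depend directly on it.
DEPENDENTS = {
    "server-01": ["database", "payroll-app"],
    "database": ["payroll-app", "reporting"],
    "payroll-app": ["payroll-service"],
    "reporting": [],
    "payroll-service": [],
}


def affected_cis(changed_ci: str, dependents: dict) -> set:
    """Return every CI that depends, directly or indirectly, on changed_ci."""
    seen = set()
    queue = deque([changed_ci])
    while queue:
        ci = queue.popleft()
        for dependent in dependents.get(ci, []):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen


# Which CIs could a change to the database affect?
print(sorted(affected_cis("database", DEPENDENTS)))
```

A breadth-first walk is used so that each CI is visited once even when several paths lead to it.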
Configuration Management Plan – a document that lays out the organization and procedures for the Configuration Management of a particular product or service.
Configuration Management has the responsibility to ensure that:
- the organization has an accurate record of its ICT (Hardware, Software & Peopleware) assets.
- all details of the services offered by the organization, their related ICT components and any relevant supporting documentation are recorded.
- any changes to the IT service(s) are made with the least risk to the business.
- it provides a sound basis for Incident Management, Problem Management, Change Management & Release Management.
- configuration records are verified against the existing infrastructure and any exceptions are corrected.
Configuration Management has the following sub-processes:
Planning - consists of five sub-processes:
- Strategy, Policy, Scope and Objectives – to establish an effective Configuration Management System.
- Processes, Procedures, Guidelines & Responsibilities – to manage and control the ICT assets.
- Relationship with other ITIL Processes – define how Configuration Management will interact with other processes or vice-versa.
- Relationship with other Configuration Management teams – to exchange information with CMDBs of suppliers, external vendors, developers etc.
- Tools & Resource Requirements – to connect the CMDB to the system and network management tools to enable the automatic addition of CIs to the CMDB.
Identification – accurately identifying and labeling the CIs with IDs, versions, types and their relationships to other CIs.
Control - this has three sub-processes:
- Register – registering a CI when it enters the IT infrastructure.
- Update – updating the status of the CI to reflect its most current status.
- Archive – removing the CI from the CMDB and archiving it in a secure location.
Status Accounting – is concerned with reporting all current & historical data about each CI throughout its life cycle.
Verification – (Verification & Audit) is responsible for ensuring that the information contained in the CMDB exactly matches the live environment.
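The Verification step can be pictured as a comparison between what the CMDB records and what is actually found in the live environment. The sketch below is a minimal illustration and makes several assumptions: the `audit_cmdb` function, the shape of the two dictionaries (CI ID mapped to a version string) and the sample data are all hypothetical.

```python
def audit_cmdb(cmdb_records, discovered):
    """Compare CMDB records with what was found in the live environment.

    Both arguments map a CI ID to its recorded or discovered version.
    """
    recorded_ids = set(cmdb_records)
    live_ids = set(discovered)
    return {
        # In the CMDB but not found live.
        "missing_from_live": sorted(recorded_ids - live_ids),
        # Found live but never registered.
        "unregistered": sorted(live_ids - recorded_ids),
        # Present in both, but the details disagree.
        "mismatched": sorted(
            ci for ci in recorded_ids & live_ids
            if cmdb_records[ci] != discovered[ci]
        ),
    }


cmdb_records = {"SRV-01": "v2", "APP-01": "1.0", "PRN-07": "v1"}
discovered = {"SRV-01": "v2", "APP-01": "1.1", "LAP-42": "v1"}

exceptions = audit_cmdb(cmdb_records, discovered)
print(exceptions)
```

Each non-empty list is an exception that would need to be corrected, either in the CMDB or in the infrastructure.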
The Configuration Management Database (CMDB) - Configuration Management monitors the relationships between Configuration Items. It stores all details about these relationships (between CIs) in a CMDB.
A typical CMDB should contain the following information – refer to the image below:
- Details about the ICT components (Hardware, Software, Peopleware and related documents).
- All the services offered by the organization, the related CIs and the relationships between these CIs.
- Details of all Incidents, Problems and Known errors.
- Details about all Changes & Releases.
[Image: ITIL – CMDB]
The CMDB contains information about the DHS (Definitive Hardware Store) and the DSL (Definitive Software Library), both of which are managed by the Release Management process.
Incident Management
What is an Incident?
An incident is any event which is not a part of the Standard Operation of a Service and which causes or may cause an interruption to or reduction in the quality of that Service.
The aim of Incident Management is to restore normal services as quickly as possible.
Some best practices:
- All inquiries should be recorded as incidents.
- Service Requests (requests for a standard operational item, e.g. password resets) should be recorded as incidents.
- A request for a new product or service should be recorded as a Request for Change (RFC).
- Automatically generated incidents (such as hardware or network failure) should also be recorded as incidents.
The Incident Life-Cycle
DETECTION & RECORDING:
- Provide a unique ID for each incident, even if it is a known issue.
- Record how the incident was reported and which Services and Configuration Items were affected.
- Classify the incidents – like Hardware, Software or Service Requests.
- Match the current incident against previously reported incidents.
- Assign a priority to the incident. (Priority of an incident is determined by the Impact, Urgency, Availability of resources and the existence of certain parameters in the Service Level Agreement [SLA]).
- Provide initial support to the incident or provide a workaround. If it is a new workaround provided by the IT Service Desk, record it for future use.
- If the incident cannot be resolved, escalate the incident functionally.
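The recording steps above (a unique ID, a classification, and a match against earlier incidents) can be sketched as follows. This is a rough illustration only: the `Incident` and `IncidentLog` names, the `INC-` ID format and the matching rule (same category and same affected CI) are assumptions made for the example, not something ITIL specifies.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Incident:
    incident_id: str
    category: str      # e.g. "Hardware", "Software", "Service Request"
    affected_ci: str
    summary: str


class IncidentLog:
    """A hypothetical in-memory incident log."""

    def __init__(self):
        self._next_number = count(1)
        self.incidents = []

    def record(self, category, affected_ci, summary):
        """Record an incident under a new unique ID; return it with earlier matches."""
        # Match against previously reported incidents before adding the new one.
        matches = [
            old for old in self.incidents
            if old.category == category and old.affected_ci == affected_ci
        ]
        incident_id = f"INC-{next(self._next_number):05d}"
        incident = Incident(incident_id, category, affected_ci, summary)
        self.incidents.append(incident)
        return incident, matches


log = IncidentLog()
first, first_matches = log.record("Software", "APP-01", "Login page times out")
second, second_matches = log.record("Software", "APP-01", "Cannot log in")
print(second.incident_id, [m.incident_id for m in second_matches])
```

Note that the second report still gets its own ID even though it matches a known issue, in line with the first bullet above.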
INVESTIGATION & DIAGNOSIS:
- This may resolve the Incident right away, or the Incident may be functionally escalated (to Level 2 support). If that process takes too much time, it might also be hierarchically escalated.
RESOLUTION & RECOVERY:
- This can be done by raising an RFC and getting it implemented. Recovery simply means “restoring a service or an ICT component back to its previously working condition”.
INCIDENT CLOSURE:
- This happens upon confirmation of resolution of the problem by the user.
Note:
- Impact is the measure of the level of effect the incident has on the business, for example: number of users affected or amount of revenue lost because of the incident.
- Urgency indicates the timescale within which the incident needs to be resolved.
For an incident to be considered High Priority – both the Impact & Urgency should be high.
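The relationship between Impact, Urgency and Priority can be sketched as a small lookup. The only rule taken from the text is that High Priority requires both Impact and Urgency to be high. The three-level scale, the "Medium" and "Low" bands and the `incident_priority` function are assumptions for illustration, and the sketch ignores the other factors mentioned earlier (availability of resources and SLA parameters).

```python
LEVELS = {"low": 1, "medium": 2, "high": 3}


def incident_priority(impact, urgency):
    """Derive a priority from impact and urgency (each "low", "medium" or "high")."""
    score = LEVELS[impact] + LEVELS[urgency]
    if score == 6:   # only reachable when both impact and urgency are high
        return "High"
    if score >= 4:   # assumed band for mixed or medium cases
        return "Medium"
    return "Low"


print(incident_priority("high", "high"))
print(incident_priority("high", "low"))
```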
Problem Management
What is a Problem? – A Problem is “an unknown, underlying cause of one or more Incidents”.
Did you know? As a rule of thumb, about 80% of incidents are caused by about 20% of ICT infrastructure components!
Problem Management is responsible for minimizing the adverse effects on the business of Incidents and Problems caused by errors in the ICT infrastructure, and for proactively preventing such errors, Incidents and Problems from occurring.
Problem Management looks for the underlying causes of Incidents and Problems and provides long-term (permanent) resolutions. It functions both proactively and reactively:
- Proactively: by trying to prevent issues from occurring through intelligent analysis of problem trends and available statistics.
- Reactively: by identifying the underlying problems that are causing incidents and finding a permanent resolution or an immediate workaround.
When Problem Management successfully identifies a problem and a suitable resolution to it, the resolution is implemented through the Change Management process.
Prioritization of problems is generally done by the “Pain Factor (PF)”. (The Pain Factor reflects the number of people affected by the problem and the impact it is having on the business.) So, the higher the PF, the higher the priority.
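Ranking problems by Pain Factor can be sketched as a simple sort. The text does not give a formula, so the one used here (users affected multiplied by a 1 to 5 business impact weight) is purely an assumption for illustration, as are the `pain_factor` function and the sample problem records.

```python
def pain_factor(users_affected, business_impact):
    """Hypothetical Pain Factor: users affected times a 1-5 business impact weight."""
    return users_affected * business_impact


problems = [
    {"id": "PRB-1", "users_affected": 200, "business_impact": 1},
    {"id": "PRB-2", "users_affected": 15, "business_impact": 5},
    {"id": "PRB-3", "users_affected": 120, "business_impact": 4},
]

# Highest Pain Factor first.
ranked = sorted(
    problems,
    key=lambda p: pain_factor(p["users_affected"], p["business_impact"]),
    reverse=True,
)
print([p["id"] for p in ranked])
```

With these sample numbers, the problem affecting the most users is not the top priority, because its business impact weight is low.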
Responsibilities of the Problem Management team:
- Problem Control: Transform Problems into Known Errors by identifying the root cause of the problem and providing a temporary workaround. (This converts a Problem into a Known Error.)
- Error Control: Resolve the Known Errors, under the control of Change Management, as soon as possible and whenever it is financially justifiable.
- Proactive Prevention of Problems: Carry out trend analyses and provide support to the organization.
- Providing Management Information from Problem Data: Carry out trend analyses and provide support to the organization.
- Conducting Major Problem Reviews: This is done after a major problem has been resolved so that future problems can be prevented.
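The move from Problem to Known Error described under Problem Control can be sketched as a record whose status is derived from what is known about it. This is a loose illustration only: the `ProblemRecord` class, its fields and the three status labels are hypothetical, and the sample root cause and workaround are invented.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ProblemRecord:
    problem_id: str
    related_incidents: list = field(default_factory=list)
    root_cause: Optional[str] = None
    workaround: Optional[str] = None
    resolved: bool = False

    @property
    def status(self):
        if self.resolved:
            return "Closed"
        # A known root cause plus a workaround turns a Problem into a Known Error.
        if self.root_cause and self.workaround:
            return "Known Error"
        return "Problem"


record = ProblemRecord("PRB-1", ["INC-00001", "INC-00002"])
print(record.status)

record.root_cause = "Connection pool exhausted"   # invented example value
record.workaround = "Restart the service nightly"  # invented example value
print(record.status)
```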
The Problem Management process consists of the following stages:
- Identification: The first step is to identify a new Problem. If there are no matching records in the existing Problem or Known Errors database, then it is classified as a new Problem.
- Recording: A new record is created and a unique ID is assigned. All related Configuration Items are linked to it as well as all related Incidents/Known Errors.
- Classification: The Problem is classified appropriately and its impact on the Service Levels is determined so that relevant resources can be assigned to resolve it.
- Investigation: The Problem is investigated so that a resolution is identified and it can be classified as a Known Error.
- Diagnosis: Techniques such as Kepner-Tregoe analysis and Ishikawa (fishbone) analysis are used. The end result, again, is the identification of a resolution or a temporary workaround to the problem so that it is converted into a Known Error.
- Review & Closure: After every Problem is resolved – it is thoroughly reviewed so that the following questions can be answered:
1. What was done right?
2. What was not done right?
3. What could have been done better?
4. How can we prevent it from happening again?
Release Management
What is a Release? A Release is a collection of authorized changes to an IT service.
Release Management is responsible for the implementation of all new and existing Hardware and Software releases (along with the related documents) into the live (operational) environment, under the controlling processes of Change Management and Configuration Management. This process is concerned with protecting the live environment from disruption, and Release Management activities are usually performed under the supervision of the Change Manager.
Releases can be classified as:
- Major Software Releases and Hardware Upgrades – These usually contain large amounts of new functionality and override all preceding minor releases and upgrades.
- Minor Software Releases and Hardware Upgrades – These usually contain small amounts of new functionality and override all preceding emergency releases and upgrades.
- Emergency Fixes – These usually contain fixes to a small number of issues.
All changes are released as “Roll Outs”. A Roll Out includes distributing all the Configuration Items to wherever they are used. This can be done in many ways, for example: over the internet, by email, remotely or even by sending them on CDs. But when large releases have to be rolled out over a vast geographical and cultural area, automated scripts are a great help. However, these scripts might need passwords to activate them.
Release Management has to maintain traceability of all the releases. We need to know where a particular version has come from and what changes it contains.
The Release Management process covers the following three areas:
- Development area.
- Release Management’s own pre-production area.
- Operational environment (live/production area.)
Migration from one area to another is subject to results from reviews, tests and other relevant quality checks.
Before a Release is Rolled Out into the live environment, Operational and Customer Acceptance tests are carried out. Operational tests ensure that anything that goes into the live environment is supportable, maintainable and robust. All existing and planned Backout Plans should also be fully tested.
The contents of each Release are decided by Change Management, but the Release Management team is always kept fully informed.
Hardware and Software Releases go through the following stages before they are Released into the live environment:
- Distribute.
- Build or Rebuild in the Live environment.
- Implementation.
It is important that each of these stages is carried out accurately before the Release progresses to the next one.
Release Management is also responsible for:
- DHS - This is the Definitive Hardware Store. It is a secure location or a number of locations where authorized versions of all hardware spares (Configuration Items in the live environment that exist in the CMDB) are stored.
- DSL - This is the Definitive Software Library. Again, this is a secure location or a number of locations where copies of all authorized versions of software CIs (Configuration Items) are stored. It can also be defined as a physical library or repository where master copies of all software versions are stored.
Information about the DHS & DSL is held in the CMDB (Configuration Management Database), and Configuration Management is responsible for keeping it up to date.
Adequate protection/security should be provided to both the DHS and DSL against eventualities like floods, earthquakes, fire and of course theft. In case of the DSL, it should also be protected from viruses, data corruption etc.
Release Unit: A Release Unit is the portion of the IT infrastructure that is normally released together.
Release Type: There are three Release types which are as given below:
- Full Release – In a Full Release, all components of the release unit are built, tested, distributed and released together. This is suitable for major changes and is very expensive.
- Delta Release – In a Delta Release, only those components that have changed since the last release are distributed. This type of release is best suited for fixes and emergency changes and is less expensive.
- Package Release – A Package Release is a group of Delta/Full release(s) which are released simultaneously. This type of release is suitable in situations where changes in one system may require changes to another.
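The difference between a Full Release and a Delta Release can be sketched by comparing component versions. This is a minimal illustration under assumptions: the `delta_release` function, the component names and the idea of describing a release unit as a name-to-version mapping are all hypothetical.

```python
def delta_release(previous, current):
    """Return the components that are new or whose version changed since the last release."""
    return {
        name: version
        for name, version in current.items()
        if previous.get(name) != version
    }


previous_release = {"web-ui": "1.0", "billing": "1.0", "reports": "1.0"}
current_build = {"web-ui": "1.1", "billing": "1.0", "reports": "1.0", "audit-log": "1.0"}

# Full Release: every component of the release unit is distributed.
full = dict(current_build)

# Delta Release: only the components that changed since the last release.
delta = delta_release(previous_release, current_build)
print(delta)
```

In this sketch a Package Release would simply be several such Full or Delta sets shipped together.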
Release Identification: Each release has to be identified. Usually, a numeric format is used. The specific release identification policy is generally decided by the Release Manager, after consulting with the Change Manager and the CAB. Example: A new application can be assigned an ID like v.1.0, and a later, minor release with some changes to its components can be identified as v.1.1. There is really no limit to the number of such levels that can be used to identify each release.
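Multi-level numeric release IDs are easiest to order when each level is compared as a number rather than as text. The sketch below assumes IDs written in the "v.1.1" style of the example above (with or without a space after "v."); the `parse_release_id` function is a hypothetical helper, not part of any ITIL tool.

```python
def parse_release_id(release_id):
    """Turn an ID such as "v.1.1" or "v. 1.0" into a tuple of integers."""
    # Strip the leading "v", dot and any spaces, then split on the level separator.
    digits = release_id.lower().lstrip("v. ")
    return tuple(int(part) for part in digits.split("."))


releases = ["v.1.10", "v. 1.0", "v.1.2", "v.1.1.3"]

# Sorting by the parsed tuple orders 1.2 before 1.10, which plain text sorting would not.
ordered = sorted(releases, key=parse_release_id)
print(ordered)
```

Because tuples of any length compare level by level, the sketch works however many levels the identification policy uses.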
Roll Out Types: Releases can be rolled out in any one, or a combination, of the following ways:
- Big Bang Roll-outs - where all sites in the enterprise receive the releases simultaneously.
- Phased Roll-outs – where all sites receive some functionality at the same time and the remaining functionality at a later time.
- Pilot Roll-outs – where a single site receives all the functionality at one time, ahead of the others.
Service Desk
When a company provides an IT Service to its customers, those customers are bound to have questions or might run into problems. They need a place where they can get quick answers, quick resolutions or, at least, a quick workaround so that they can carry on with their work and with their lives.
Customers get easily frustrated if they cannot get help – just when they need it.
An IT Service Desk is meant to be a single point of contact for customers who have a problem with the services they are receiving from the IT Service Provider. This is where they will report Incidents, RFCs (Requests for Change) or just any other problem.
Conversely, the IT Service Provider can also use the Service Desk as a channel through which to communicate with its customers.
Service Desks are also known as Customer Help Desk, Help Desk, Hotline, Call Center, Customer Hotline etc. But we will just call it the Service Desk, meaning an IT Service Desk. (You need to remember this.)
Why should you have a Service Desk?
- A good Service Desk helps to reduce customer complaints.
- It increases customer satisfaction.
- It reduces Service downtime and wastage of manpower.
- It increases customer retention.
The Service Level Agreement (SLA) contains specific details about the hours of availability of the service, the time taken to resolve an issue and the time within which the user must receive a response from the Service Desk. It is important for the Service Desk personnel to be aware of the SLA parameters.
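Checking an incident against the response and resolution times in an SLA can be sketched as below. The target values, the `SLA` dictionary and the `sla_breaches` function are assumptions for illustration; real targets come from the agreed SLA, and this sketch ignores the agreed hours of availability by counting elapsed clock time only.

```python
from datetime import datetime, timedelta

# Hypothetical SLA targets; real values come from the agreed SLA.
SLA = {
    "response": timedelta(minutes=30),
    "resolution": timedelta(hours=8),
}


def sla_breaches(reported, first_response, resolved):
    """Return the names of the SLA targets that were missed for one incident."""
    breaches = []
    if first_response - reported > SLA["response"]:
        breaches.append("response")
    if resolved - reported > SLA["resolution"]:
        breaches.append("resolution")
    return breaches


reported = datetime(2024, 1, 8, 9, 0)
result = sla_breaches(
    reported,
    first_response=reported + timedelta(minutes=45),
    resolved=reported + timedelta(hours=6),
)
print(result)
```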
Some organizations have a Service Desk which is a single point of contact for any issue they might face – from IT Problems to the-lift-is-not-working issues. But we are going to consider a Service Desk which deals only with IT related issues.
An IT Service Desk is responsible to:
- Monitor incidents and users’ queries.
- Keep the user updated about the progress.
- Follow up with any second level team and push for a quicker resolution.
- Make sure that SLAs are not breached.
A good IT Service Desk must have:
- Well trained Service Desk personnel (with good people skills).
- A properly organized system to record and track incidents.
- A Service Desk system that can identify similar incidents, even if previously reported/fixed.
- A proper knowledge base which can be used as an important reference point.
- A Service Desk team that is technically competent to deal with users’ issues as and when they arise.
- Proper channels of communication with the other disciplines like Problem Management (for help when there is a major problem), Service Level Management (so that SLAs can be adhered to), Configuration Management (so that the user’s equipment can be identified whenever required) & Availability Management (for analysis of Service Desk related data which could help improve Services).
Types of IT Service Desks:
- Local Service Desk – to take care of regional/local users.
- Centralized Service Desk – which serves users from all geographical regions.
- Virtual Service Desk – is a collection of many local Service Desks, where calls are routed to the most appropriate Service Desk, based on the issue, time of day, location of user and so on.
How to contact an IT Service Desk?
An IT Service Desk gets its inputs from two major sources – Human Sources and Machine Sources.
- Humans can contact the IT Service Desk by Telephone, Fax, Email etc.
- Machines send out system generated alerts to the IT Service Desk.
Escalations:
An escalation is the process by which an incident is forwarded to a higher level or a more expert team for better resolution. Escalation can be of two types:
- Functional Escalation: Where an issue is passed on to a more competent team (Example: Escalating an issue to the Level 2 support team due to lack of enough expertise to solve a certain problem).
- Hierarchical Escalation: Where an issue is passed up the management chain because of lack of authority to do something. (Example: Escalating to the Service Desk Manager).
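The choice between the two escalation types can be sketched as a simple decision: lack of expertise calls for a functional escalation, lack of authority for a hierarchical one. The `escalations_needed` function and its two boolean inputs are hypothetical simplifications; in practice the decision also depends on factors such as how long the incident has been open.

```python
def escalations_needed(has_expertise, has_authority):
    """Return which escalation types apply; an empty list means none is needed."""
    needed = []
    if not has_expertise:
        # Pass the issue to a more expert team, e.g. Level 2 support.
        needed.append("functional")
    if not has_authority:
        # Pass the issue up the management chain, e.g. to the Service Desk Manager.
        needed.append("hierarchical")
    return needed


print(escalations_needed(has_expertise=False, has_authority=True))
print(escalations_needed(has_expertise=True, has_authority=False))
```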