Agile and Markit EDM

In Brief

Many organisations want to adopt Agile methodology for their IT projects, including Markit EDM projects. Because Agile was defined for application projects, applying the same practice to data projects like Markit EDM brings its own practical challenges, especially in a BAU environment. In this blog, I will discuss the various challenges of implementing EDM projects with Agile methodology.

Before discussing the challenges in Agile methodology, I will touch upon two cultural challenges that impact the use of Agile in EDM projects.

  1. Developers
  2. Project Management and Agile Definition

Developers

For years, Markit EDM developers have treated a data load or export as one single project. Naturally, they assume that Markit EDM projects cannot be divided into modules small enough to complete in one sprint, so they are reluctant to adopt Agile methodology. Even when they do adopt it, they tend to develop in a waterfall style, which is explained below under 'Agile Version of Waterfall'.

Project Management and Agile Definition

DevOps culture originated in application projects. Managers and scrum masters of those projects have seen the benefits of the DevOps practice and encourage the same process in Markit EDM projects as well. It is very important for managers to know the basic difference between application development using programming languages and Markit EDM development. In reality, very few managers know the difference, and they push for the same process, which usually ends in failure. On the other hand, if the manager has good experience of Markit EDM projects but does not know DevOps practice, a DevOps culture will never get started.

Challenges of using Agile in Data Projects

  1. Breaking down the work
  2. Definition of Done
  3. Sprint Length
  4. Team Structure
  5. Documentation and User Sign off
  6. DevOps

Breaking Down the work

One of the main agile principles is to complete the user story by the end of every sprint. To complete a user story in a short sprint, the story needs to be very small but should still add value for the users. The big question is how to break a Markit EDM requirement into pieces small enough to develop, test and get user sign off in the same sprint.

EDM Work can be mainly divided into two work streams

  1. Data Import, Matching and Mastering/ Data Export
  2. UI

In many cases, the UI can be developed independently of the core components, provided the data structure is in place. The challenge is breaking down the work in core development.

In the waterfall method, the developer would develop Import, Matching and Mastering, including Exceptions, and complete unit testing before handing over to the testing team for integration testing.

In Agile, there are a couple of approaches to breaking down the work, each with its own pros and cons.

When companies move to agile methodology, they tend to start with an Agile version of waterfall. In time, however, they come to appreciate the benefits of developing in a truly agile way and move to 80-20 development.

Agile Version of Waterfall

  1. Sometimes projects are developed in the waterfall method, but each step in the project is encapsulated in one sprint.
  2. Let us assume the solution consists of 1000 Load, 2000 Matching, 2500 Enrichment, 3000 Validation and 4000 Solution Mastering, and that each takes one week to complete in a two-week sprint.
  3. Complete 1000 in sprint 1 and hand it over to the testing team, then continue with 2000 in the next sprint.
  4. As soon as testing is signed off, move the code to Prod without enabling it in Prod.
  5. Even though we are nominally following Agile, we are actually doing waterfall development without adding business value each sprint.
  6. Since 1000 and 2000 populate only staging tables, there is little for testers to test at the end of the 1000 and 2000 development, so testing at that point largely wastes their time.
  7. Since we develop step by step and do not change the components developed in previous sprints, regression testing is not required.
  8. Users are not involved and will not see any benefit until sprint 4, which does not encourage project stakeholders.

80 – 20 Development

  1. 80 percent of the work in a project is usually completed in 20 percent of the time, and the remaining 20 percent takes 80 percent of the time.
  2. Discuss with users and prioritise the important aspects of the work.
  3. For example, a source file may have 100 fields, of which users need 20 immediately while the rest are required only for future use. In this case, complete the 20 fields end to end in sprint one and master the other fields in future sprints.
  4. In some cases, users know the source system is very clean and validations are required only for auditing purposes. In this case, loading and mastering can be developed in sprint one and Exceptions in a future sprint.
  5. In both cases, at the end of sprint one, users will have the prioritised work in production.
  6. Since the same components may be touched again, the challenge is to make sure the code does not regress.
  7. User involvement is crucial, which is a basic aspect of Agile.
  8. Mapping specs should be versioned correctly to reflect the agile development.
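
The field-prioritisation example above can be sketched in code. This is only a minimal illustration, assuming the mapping spec is available as a simple list of fields flagged by the users; the `FieldMapping` class, the `FIELD_nnn` names and the `fields_per_sprint` capacity are all hypothetical and not part of Markit EDM.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldMapping:
    source_field: str   # hypothetical column name in the source file
    needed_now: bool    # True if users asked for the field immediately


def plan_sprints(mappings, fields_per_sprint):
    """Put immediately needed fields in sprint 1 and spread the rest over later sprints."""
    if fields_per_sprint < 1:
        raise ValueError("fields_per_sprint must be at least 1")
    urgent = [m for m in mappings if m.needed_now]
    later = [m for m in mappings if not m.needed_now]
    plan = {1: urgent}
    for start in range(0, len(later), fields_per_sprint):
        sprint_no = 2 + start // fields_per_sprint
        plan[sprint_no] = later[start:start + fields_per_sprint]
    return plan


# 100 source fields, of which the first 20 are needed immediately.
spec = [FieldMapping(f"FIELD_{i:03d}", needed_now=(i <= 20)) for i in range(1, 101)]
plan = plan_sprints(spec, fields_per_sprint=40)
for sprint_no, fields in plan.items():
    print(f"Sprint {sprint_no}: {len(fields)} fields")
```

Keeping the plan next to the versioned mapping spec makes it easier to see which fields each sprint delivered end to end.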

Definition of Done

One of the main agile principles is to complete the user story by the end of the sprint. But how do we define ‘completion’, or in Agile terms, what is the Definition of ‘Done’?

In an ideal world, the Definition of ‘Done’ is that the code is deployed in production and the data is ready for users. But in the Markit EDM world, there can be many outside factors that stop the team from deploying into production.

Agile teams succeed when they are self-sufficient from the start of the work until it is ‘Done’. It is better to define ‘Done’ as the point beyond which the team cannot control the proceedings. For example, the organisation may have a release cycle that does not fit into the sprint. In this case, the Definition of Done can be when UAT is signed off and the code is ready for Live.

Sprint Length

The length of the sprint is very important for a successful EDM implementation using Agile methodology. Traditionally, application projects use a two-week sprint, and the same length is then forced upon EDM projects. But it is very important to understand the bottlenecks of EDM development before defining the sprint length. If the team thinks that increasing or reducing the sprint length will increase its productivity, there should be no reason to stop it. It is also very important to start development only when developers are clear about the requirements.

For example, let us assume a two-week sprint. In traditional EDM development, teams plan as below.

Sprint Length – 10 days

Day    Dev team                  Testing Team
1-3    Development               Testing Preparation
4-5    -                         Testing Execution
6-7    Bug Development           -
8      Release Preparation       Bug Testing
9      Demo/UAT sign off         Demo/UAT sign off
10     Next sprint estimation    Next sprint estimation

Since the testing team will not have much work in the first week while the development team will not have much in the second week, this will not work. The challenge is to divide the user story into multiple user stories that the development and testing teams can work on in parallel.

In the Agile world, it will look like this:

Sprint Length – 10 days

Day    Dev team                  Testing Team
1-2    User Story (US) 1         US1 Testing Preparation
3-4    US2                       US1 Execution
5-6    US1 Bug                   US2 Testing Preparation
7      US2 Bug                   US1 Bug and US1 sign off
8      Release Preparation       US2 Bug sign off
9      Demo/UAT sign off         Demo/UAT sign off
10     Next sprint estimation    Next sprint estimation

Team Structure

Traditionally, Markit EDM teams are managed horizontally: EDM developers are grouped into one team that provides a shared service to various projects. But the success of an Agile implementation depends on the self-sufficiency of the team from start to Live.

For example, when a new Fund is introduced, Agile team will have

  1. Product owner
  2. BA
  3. OMS Developers
  4. EDM
  5. Data Management team
  6. Testing Team
  7. Releasing Team (Or team should have control over the release)

Documentation and User Sign off

Agile methodology suggests less documentation and mostly defines the requirement in user stories. Realistically, this may not be possible for EDM projects, as they involve multiple transformations for many fields. So, unlike other projects, EDM projects will have mapping specs attached to the user story, irrespective of Agile or Waterfall.

It is better to have the users who need to sign off the story as part of the team. Since users also have their BAU work to do, this may not be possible. It is very important to define a process to get UAT sign off smoothly without impacting the team’s productivity.

DevOps

Most organisations have a well-defined existing release process that was designed for the Waterfall method followed in earlier days. Those releases may be infrequent and involve manual steps. In some places, releases are managed and controlled by outside teams, and the Agile team (including Product owners) has no say in release times.

It is very important to implement DevOps concepts like Test Automation, Continuous Integration and Release Automation to speed up the EDM implementation. It is suggested to release as soon as UAT is signed off. But if you have constraints in the release schedule, you will need to release once in a sprint for a successful EDM Agile implementation.

For more on Markit EDM DevOps, see:

https://etlops.home.blog/category/markit-edm/

In Short

Success of Agile implementation of EDM projects depends on

  • A change in the development culture, to see a requirement as a collection of requirements rather than treating each requirement as a project
  • Self-sufficient teams
  • How the work is broken down into small chunks that can provide value to business users in a short time
  • Providing DevOps tools around the development, testing and release process to remove any bottlenecks

 

DevOps Ideas for Markit EDM

In Brief

Markit EDM has introduced many new features in both core Markit EDM and the UI in its newer versions. This enables us to use Markit beyond Asset Mastering. With the Markit EDM Warehouse, many organisations have started to view Markit not only as a mastering tool but also as an ETL tool. The scope of Markit development in an organisation is growing fast, and developers are under pressure to deliver faster so that users can enjoy the new features as soon as possible.

But…

  • Is the current development practice keeping pace with the actual development?
  • Are we applying the right kind of DevOps practice?
  • What are the stages of the development cycle where changes can improve productivity?

I will not discuss any out-of-box solution, but I will try to answer some of these questions. I will also touch upon various issues and ways to introduce DevOps culture in Markit EDM development, which is becoming the basis of any good development practice. I hope this triggers a conversation in the Markit EDM development community about taking Markit EDM development and automation to the next level.

What is DevOps?

DevOps is not a technology or a framework. It is an organisational culture. Even though DevOps has become popular in recent years, it is not a new practice in the IT development cycle. It has existed for a long time under different names; most of the time we called it ‘Automation’. The term combines Development and Operations, and in this blog I treat it as nothing but ‘DEVelopment for OPerationS’. That is:

  • Automation of the Development
  • Automation of the testing
  • Automation of Code Packaging
  • Automation of Releases
  • Automation of BAU support

In short, DevOps is the automation of each and every step along the project life cycle to optimise project delivery and operations.

Challenges of bringing DevOps in Data or Markit EDM Projects

I look at the DevOps challenges from three different perspectives

  1. Technology
  2. Project Management
  3. Developers

Technology

Most existing DevOps frameworks and technologies were developed for programming technologies like Java and .Net. But developing jobs in Markit EDM is completely different from Java or .Net development. The developer cannot control how the underlying Markit component XML is generated and stored, nor how the code is released; Markit EDM uses its own method to deploy the code. So when we try to use a DevOps framework developed for Java or .Net in Markit EDM, it will not meet our objective.

We end up developing a workaround for each step, and the effort outweighs the benefits. When effort outweighs benefits, stakeholders will not appreciate it, and eventually we go back to manual tasks. It is better to develop a DevOps framework that suits your Markit EDM needs rather than simply following the practice in application projects.

Project Management

DevOps culture originated in application projects. Managers of those projects have seen the benefits of the DevOps practice and encourage the same process in Markit EDM projects as well. It is very important for managers to know the basic difference between application development using programming languages and Markit EDM development. In reality, very few managers know the difference, and they push for the same process, which usually ends in failure.

On the other hand, if the manager has good experience of Markit EDM projects but does not know DevOps practice, a DevOps culture will never get started.

Developers

Most DevOps or automation tasks are implemented through scripting languages. But most Markit EDM developers are used to working in EDM and may have no experience in scripting languages. Markit EDM developers need to come out of their comfort zone and learn the scripting languages required for automation.

Agile Vs Waterfall Methodology

Since the topic has been discussed a lot at Markit EDM conferences, I assume Agile methodology is used for Markit EDM development. However, some DevOps tasks, like automated testing, can be used with any project management method.

DevOps Ideas

DevOps can be introduced in each and every phase of Markit EDM development. We will touch upon what is available in Markit and what can be developed. Please note that some of the ideas may not be approved or agreed by Markit.

DevOps in Development

Markit EDM development usually has two phases

Markit EDM Development

Companies are looking to automate as much as possible, not only to reduce development time but also to reduce manual errors and maintain consistency. Most Markit jobs follow the same template: a Data Porter or Data Flow to populate the ‘In’ table, Inspectors for inspection and Constructors for mastering.

Figure 1 : Automate Development

Instead of manually creating similar Data Flows again and again with different column and table names, we can create a job or a program (using Markit or any other language) that generates the underlying XML from the mapping spec and then imports the XML into Markit. This may look a touch too far, but the method is already used with popular ETL tools. There is no reason why it cannot be done for Markit.
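
As a rough illustration of the idea, the sketch below turns a small mapping spec into an XML document. Please note the assumptions: the real Markit component XML schema is not shown in this blog, so the element and attribute names here (`DataFlow`, `ColumnMapping`, `targetTable` and so on), the CSV layout of the mapping spec and the table name are all made up for the example.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical mapping spec; in practice this would be read from the versioned spec file.
MAPPING_SPEC_CSV = """source_column,target_column,data_type
ISIN,SECURITY_ISIN,string
PX_LAST,PRICE,decimal
"""


def build_dataflow_xml(spec_csv, flow_name, target_table):
    """Generate an illustrative XML definition with one mapping element per spec row."""
    # Element and attribute names are placeholders, not the real Markit schema.
    root = ET.Element("DataFlow", {"name": flow_name, "targetTable": target_table})
    for row in csv.DictReader(io.StringIO(spec_csv)):
        ET.SubElement(
            root,
            "ColumnMapping",
            {
                "source": row["source_column"],
                "target": row["target_column"],
                "type": row["data_type"],
            },
        )
    return ET.tostring(root, encoding="unicode")


xml_text = build_dataflow_xml(MAPPING_SPEC_CSV, "LoadPrices", "T_PRICE_IN")
print(xml_text)
```

A real generator would start from an exported component and reproduce its structure, so that the generated XML can be imported without manual changes.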

Automate Markit EDM code Review

Code review is a complicated step, especially in the Markit EDM environment. The definition of code review differs from person to person, environment to environment and organisation to organisation. One way to standardise code review is to automate it. Markit comes with a few out-of-box reviews during code development, such as the number of inputs, but developers can simply dismiss them and they are never reported.

Figure 2 : Automate Markit Code Review

Proposed Steps to automate the code review

  1. Define the standards in a metadata table
  2. Taking the component XML as input, write inspections against the ‘Standard’ metadata table (the metadata table used to define standards)
  3. Report the inspection failures into a table
  4. Write a UI report to show how many reviews were done and how many passed
  5. Mark any critical review failures in a different colour
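
The steps above can be sketched outside Markit as follows. This is a simplified illustration under several assumptions: the standards are held in a Python list standing in for the metadata table, the two rules (a naming prefix and a maximum number of inputs) are examples only, and the component XML layout is hypothetical rather than the real Markit format.

```python
import re
import xml.etree.ElementTree as ET

# Stand-in for the 'Standard' metadata table; rule names and limits are examples.
STANDARDS = [
    {"rule": "name_prefix", "pattern": r"^DF_", "critical": False},
    {"rule": "max_inputs", "limit": 2, "critical": True},
]

# Hypothetical component XML; the real Markit layout will differ.
COMPONENT_XML = """
<Component name="LoadPrices">
  <Input table="T_VENDOR_A"/>
  <Input table="T_VENDOR_B"/>
  <Input table="T_VENDOR_C"/>
</Component>
"""


def review_component(xml_text, standards):
    """Inspect one component against the standards and return the failures."""
    root = ET.fromstring(xml_text)
    name = root.get("name", "")
    input_count = len(root.findall("Input"))
    failures = []
    for std in standards:
        if std["rule"] == "name_prefix" and not re.search(std["pattern"], name):
            failures.append({"component": name, "rule": std["rule"], "critical": std["critical"]})
        elif std["rule"] == "max_inputs" and input_count > std["limit"]:
            failures.append({"component": name, "rule": std["rule"], "critical": std["critical"]})
    return failures


failures = review_component(COMPONENT_XML, STANDARDS)
for failure in failures:
    flag = "CRITICAL" if failure["critical"] else "warning"
    print(f"{failure['component']}: {failure['rule']} ({flag})")
```

In a real setup the failures would be written to a table so that the UI report can show how many reviews ran and how many passed.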

DevOps in Testing

Testing is another area that can make or break any development project. There are many frameworks available in the market for data testing. You can use either the out-of-box Markit Testing Automation tool or an existing framework such as FitNesse. Most frameworks simply follow the “expected result versus actual result” rule, so you can create your own framework if required.

Figure 3 : Automate Testing

Testing automation can be done in three phases of the project

  1. Unit testing
  2. System Testing or Integration testing
  3. Regression Testing
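
If you decide to build your own framework, the “expected result versus actual result” rule is simple to sketch. The example below is a minimal illustration that compares two sets of rows by a key column; the column names and sample rows are invented, and in practice the actual rows would be queried from the target table after the job has run.

```python
def compare_results(expected, actual, key):
    """Compare expected and actual rows by key and report the differences."""
    exp = {row[key]: row for row in expected}
    act = {row[key]: row for row in actual}
    return {
        "missing": sorted(exp.keys() - act.keys()),        # expected but not loaded
        "unexpected": sorted(act.keys() - exp.keys()),     # loaded but not expected
        "mismatched": sorted(k for k in exp.keys() & act.keys() if exp[k] != act[k]),
    }


# Invented sample data for one test case.
expected_rows = [
    {"isin": "XS0000000001", "price": 101.5},
    {"isin": "XS0000000002", "price": 99.0},
    {"isin": "XS0000000003", "price": 100.0},
]
actual_rows = [
    {"isin": "XS0000000001", "price": 101.5},
    {"isin": "XS0000000002", "price": 98.0},
    {"isin": "XS0000000004", "price": 100.0},
]

report = compare_results(expected_rows, actual_rows, key="isin")
print(report)
```

The same comparison can be reused for unit, integration and regression tests; only the set-up data and the expected rows change.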

DevOps in Packaging and Deployment

Automating Markit EDM code packaging and deployment is complicated and may depend on the version control used for development. The latest version of Markit lets developers package directly from version control using labelling. When ‘database’ version control is used, SQL can be used to find the components that need to be released, and a PowerShell script can call DOS commands to do the actual packaging and code release. When automatic packaging is implemented using labels, the labelling convention becomes important. In some places, packaging is done manually in a specified format but the deployment is automated. You also have the option of packaging and deploying SQL tables and Markit components separately.
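
To show why the labelling convention matters, here is a small sketch that selects components for a release by their label and rejects labels that break the convention. The convention itself (`REL-<release>-<ticket>`), the component names and the way labels are obtained are all assumptions for illustration; Markit does not prescribe this format.

```python
import re

# Assumed convention: REL-<major.minor>-<ticket>, for example REL-2.1-EDM-101.
LABEL_PATTERN = re.compile(r"^REL-(?P<release>\d+\.\d+)-(?P<ticket>[A-Z]+-\d+)$")


def components_for_release(labelled_components, release):
    """Split components into those labelled for the release and those with bad labels."""
    selected, rejected = [], []
    for component, label in labelled_components:
        match = LABEL_PATTERN.match(label)
        if match is None:
            rejected.append((component, label))   # cannot be packaged automatically
        elif match.group("release") == release:
            selected.append(component)
    return selected, rejected


# Hypothetical (component, label) pairs, e.g. read from version control.
labelled = [
    ("DF_LoadPrices", "REL-2.1-EDM-101"),
    ("INS_PriceChecks", "REL-2.1-EDM-102"),
    ("DF_LoadFunds", "REL-2.2-EDM-110"),
    ("CON_MasterPrice", "release 2.1"),
]

selected, rejected = components_for_release(labelled, "2.1")
print("Package:", selected)
print("Fix labels:", rejected)
```

A component with a free-text label such as "release 2.1" is silently left out of an automated package unless something reports it, which is why the rejected list is returned as well.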

Figure 4 : Automate Packaging and Deployment

There are various tools available in the market for building (e.g. VSTS) and releasing the package (e.g. Octopus).

Continuous Integration

Due to the nature of Markit development, more than one development can update the same component at the same time. When multiple developments run in parallel, there is a high chance of deploying untested code in production, in addition to regression issues.

Continuous integration helps to identify regression issues early in the development life cycle. It plays an important role in the smooth transition of code from Development to Test to Production, especially in BAU scenarios. Automated testing, packaging and deployment are prerequisites for implementing Continuous Integration.

Points to remember before implementing continuous integration

  • Because package deployment is slow, it is not realistic to deploy the full set of code every time
  • Since both business data and Markit component data are stored in the same database, some reference data may be lost when restoring the database from the last backup
  • Always set up the data required for testing at the test or suite level
  • If multiple packages are submitted for continuous integration testing, decide what will happen to the other packages when one of them fails testing

Figure 5 : Continuous Integration

Figure 5 shows how Continuous Integration can be implemented in Markit EDM projects for one development cycle. In the real world it can be more complicated than this, because there can be more than one release per sprint and many developments can impact the same component. However, the principle is the same.

A build can be done in two ways.

  1. Release the full set every time (for new projects)
    1. Each build will have the code for the entire system
    2. ‘Build n+1’ will contain the code from both ‘Build n’ and ‘Build n+1’, so only ‘Build n+1’ needs to be released. New developments can follow this method. If it fails, either revert the code to ‘Build n’ and test, or fix the code and build version n+2
  2. Release incremental code (for BAU and Maintenance)
    1. Each build will have only incremental code
    2. ‘Build n’ and ‘Build n+1’ are two different builds, independent of each other. Both builds will be released to subsequent environments. If one of them fails, the other will still be released. This is usually followed in BAU scenarios
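
The difference between the two build styles can be shown with a small sketch. It is only an illustration of the release decision described above, assuming each build has a single pass or fail test result; the function name and the "full" and "incremental" strategy labels are my own.

```python
def builds_to_release(results, strategy):
    """Decide which builds move to the next environment.

    results is a list of (build_name, passed) tuples in build order.
    """
    if not results:
        return []
    if strategy == "full":
        # Each build contains the whole system, so only the latest one matters.
        # If it failed, nothing is released until it is reverted or fixed.
        latest_name, latest_passed = results[-1]
        return [latest_name] if latest_passed else []
    if strategy == "incremental":
        # Builds are independent, so every passing build is released.
        return [name for name, passed in results if passed]
    raise ValueError(f"unknown strategy: {strategy}")


test_results = [("Build n", True), ("Build n+1", False)]
print(builds_to_release(test_results, "full"))         # latest build failed
print(builds_to_release(test_results, "incremental"))  # Build n still goes ahead
```

With incremental builds a failure blocks only the failing package, which is why this style suits BAU, but it relies on the builds really being independent of each other.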

In Short

DevOps may not solve any business problems. However, it will improve the speed and quality of Markit EDM project delivery by

  • Finding issues up front
  • Reducing regression issues
  • Minimising rework
  • Minimising errors by minimising manual tasks
  • Freeing time and effort for actual development rather than process steps like code reviews and deployment
  • Avoiding the boredom of doing the same tasks again and again

 

ETLOps

 

Most existing DevOps frameworks and technologies were developed for programming technologies like Java and .Net. But ETL development is completely different from Java or .Net development. The developer cannot control how the underlying code or package is generated and stored, nor how the code is released; each ETL tool uses its own method to deploy the code. So when we try to use a DevOps framework developed for Java or .Net with ETL tools, it will not meet our objective. We end up developing a workaround for each step, and the effort outweighs the benefits. When effort outweighs benefits, stakeholders will not appreciate it, and eventually we go back to manual tasks. It is better to develop a DevOps framework that suits your ETL needs rather than simply following the practice in application projects.