Synthetic data generator and test data automation using intelligent OSS and online productivity tools

Test data is critical to the success of software products that are developed, maintained, and enhanced. Yet how many of us really give it the importance it deserves?

If a software product has the required volume of realistic test data, most issues or bugs can be identified, or avoided outright, before the product goes live. The cost of fixing defects stays low and controlled, and the final product can be delivered on schedule, i.e. the project can be completed on time or earlier and within budget.

Having realistic test data also motivates developers to code better. Otherwise, they may enter data like ‘Test 1’, ‘Name abcd’, ‘lorem ipsum’, ‘123456789’ for phone numbers, ‘1234567898762345’ for the credit card, ‘Address 1, 3, test city, mystate’, and so on, which looks very odd and irritating to product owners, stakeholders, or anyone else. And many times we go into demos with such junk data…OMG…we should stop using junk data from the very beginning of software development.

The solution to the above is to generate synthetic data, using tools that cut the overall development and testing effort multifold. With realistic, lookalike data in the required volume (in the thousands), we can address both the functional and performance aspects easily. The process of generating test data involves developers more than testers, or testers who can program (Google calls them Software Engineers in Test, SET; Microsoft calls them SDET).

Synthetic data generation and integration into development and testing workflow

How can we generate thousands of realistic test records (also called SYNTHETIC DATA) in various combinations, matching the domain model and industry vertical of the software you are building? It is challenging, and can require more intelligence (artificial intelligence and deep learning) than the mainstream product or project you are developing. But there are many free and paid synthetic data generation tools available in the market, which can be leveraged in your test strategy and workflow early in the project development life cycle.


I am using Mockaroo to generate synthetic data, and the benefit I am getting is huge in terms of effort reduction; the quality of the final deliverable is also stupendous. Mockaroo has a free tier and a paid tier, but the free one will suffice for most of your needs. It also exposes a REST API, with which you can easily integrate repeatable test data generation into your test automation workflow. I usually use Node.js to stitch together my test data generation strategy.



Another product I like is free and open source, and its code can be forked for many purposes.

A live use case from one of my projects is given here with a fictitious data model to understand the process and benefits.

I had a geo-based (lat/lon) application for which I needed to inject thousands of test records, spread equally in all 8 directions (north, south, east, west, north-east, north-west, south-east, south-west). Each data point had to be plotted X meters from the previous one (using the Haversine formula). You can see the complete working code in my GitHub (TBD, please stay tuned). Follow the screenshots for how I set it up:

Step # 1 Fictitious table for which test data needs to be generated


Step # 2 Signup or login

Step # 3 Define your test data schema as below



Step # 4 Preview the test data for the newly created schema [I chose JSON, as it is easy to manipulate programmatically]




Data attributes shaded in yellow in the screenshot below will be manipulated later through another program stitched into a Node.js application [this process deserves a separate blog post]. For latitude and longitude, the Haversine formula is used to plot a new lat/lon for a given lat/lon and a distance in meters. For the thumbnail source, the Flickr API is used to resolve unique photo thumbnails for given tags and a text search, with parental controls, etc.
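The lat/lon manipulation described above can be sketched as a small destination-point helper. This is a minimal sketch, not the actual GitHub code: the `lat`/`lon` field names and the 8-bearing spread are assumptions based on the description above.

```javascript
// Given a starting point, a distance in meters, and a bearing in degrees,
// compute the destination point on a spherical Earth (Haversine-family math).
const EARTH_RADIUS_M = 6371000;
const toRad = (deg) => (deg * Math.PI) / 180;
const toDeg = (rad) => (rad * 180) / Math.PI;

function destinationPoint(lat, lon, distanceM, bearingDeg) {
  const delta = distanceM / EARTH_RADIUS_M; // angular distance
  const theta = toRad(bearingDeg);
  const phi1 = toRad(lat);
  const lambda1 = toRad(lon);
  const phi2 = Math.asin(
    Math.sin(phi1) * Math.cos(delta) +
    Math.cos(phi1) * Math.sin(delta) * Math.cos(theta)
  );
  const lambda2 = lambda1 + Math.atan2(
    Math.sin(theta) * Math.sin(delta) * Math.cos(phi1),
    Math.cos(delta) - Math.sin(phi1) * Math.sin(phi2)
  );
  return { lat: toDeg(phi2), lon: toDeg(lambda2) };
}

// Spread points X meters from a center in all 8 compass directions.
function spreadInAllDirections(center, distanceM) {
  const bearings = [0, 45, 90, 135, 180, 225, 270, 315]; // N, NE, E, ..., NW
  return bearings.map((b) => destinationPoint(center.lat, center.lon, distanceM, b));
}
```

Calling `spreadInAllDirections` repeatedly with growing distances produces the evenly spaced rings of test points described above.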



Step # 5 Save your schema



Step # 7 Copy the REST API endpoint, which can be accessed programmatically from any REST client.

Step # 8 Optional step: test the REST API (generate data) from the Postman REST client (Postman is my choice)


Step # 9 Optional step: automate the data load by integrating the Mockaroo-generated synthetic data with the data transformations for lat/lon and thumbnail source. Store the final result as an xls/csv file (json2xls) or insert it directly into a SQL table using any programming language of your choice.

I have used Node.js to weave all of the above tasks together. The complete application can be viewed online in my public Cloud9 IDE @ [Note: to access the editor, you need a Cloud9 login, which comes with free registration @]

Alternatively, you can download or clone the code from my Git repo.


My choice of wireless mirroring & streaming software for iOS & Android phones during demos, presentations and screen recordings

Screen sharing, a.k.a. mirroring, of iOS and Android devices is crucial during mobile application demos, presentations, walkthrough recordings, and training. So having reliable tools in your tool belt is important. Free and paid versions exist; here are my tool choices, which I frequently use with my teams at remote locations and for training, presentations, and debugging sessions.


TeamViewer Remote Control





This is free and good for debugging Android apps. But I will not recommend it for client demos, as the streaming quality is poor.






I have never used it, but it is worth a try. It is USD 15 for a single license.





I have never used this one either, but it is worth a try. It costs an additional USD 7 when you purchase the PC/Mac version, which is USD 15 for a single license.


LonelyScreen




A free AirPlay receiver for PC & Mac. I have used it, but it is not reliable, so I will not recommend it for client demos; it is still good to have in your tool belt for interacting with your internal team for debugging, etc. When it works, the performance is really good.





My de facto tool for iPhone/iPad screen mirroring, which I use very often with my team and in client demos and presentations. It is USD 15 for a single license. It is really a must-have tool.

Note: I recommend buying the licensed version; it is worth it, because we cannot afford to miss client demos just due to tool misbehavior. Getting 10–30 minutes of a client's time is critical for converting a sales lead into an order (opportunity to order).


A quick, simple and easy-to-understand introduction to Augmented Reality

Augmented Reality (AR) is one of the fastest-growing technologies, and it can be easily experienced through a smartphone. But what exactly it is has still not clearly reached the general public.

What is Augmented Reality, or AR for short?



Augmented reality (AR) is a live direct or indirect view of a physical, real-world environment whose elements are augmented (or supplemented) by computer-generated sensory input such as sound, video, graphics or GPS data.

General Definition

Augmenting (blending) rich multimedia content (also known as virtual content: 3D, video, audio, images, text, call-to-action buttons) with the real physical objects around us, viewed through a custom-developed mobile application in camera view, termed the “3rd eye,” to bring an immersive user experience.

Technology behind Augmented Reality

Computer vision, image & pattern recognition, machine learning, deep learning, artificial intelligence, neural networks, genetic algorithms…. OMG! It sounds highly technical, doesn’t it? Yes, it is. But relax; these are easily accessible as SDKs and APIs on a free or subscription basis. So it is super simple to learn and build AR apps quickly with these SDKs; the rest is one’s smartness in building a customer-centric product or solution. If you know or are familiar with these fields of study, it is a big plus and differentiates you from other beginners.

“Talk is cheap. Show me the code” — a quote from Linus Torvalds, who wrote the Linux kernel!!!! Yes, it is DEMO time!!!

Augmented Reality is easier to comprehend when you see it in action. So I would like to introduce some of the coolest AR apps that exist in the app store. These are the apps that inspired me, and I always suggest them to my friends and colleagues when introducing AR.



I recommend downloading the above apps from the app store: install them, play, and experience the augmented effects. Some apps are time-bound, so you may not find the image markers to download; but I have collected those and kept them in my Google Drive for download. I have also attempted to screen-capture these apps so you can see them in action on video.

Demo Markers [download here]

anatomy4D01, Body_Target_Update_v2, Appletizer-Bottle-2, Appletizer-Bottle-3, aurasma-ups-christmas-campaign, dominos-blippar-augmented-reality-promo, layar-ceo, liesje

Demo Apps Screencast [Click on the video thumbnail to see it in action]

Magic Genius: one of the most astonishing implementations of augmented reality in consumer products, under the fashion and lifestyle category. This is one of my favorite AR apps. I asked my daughter Ashwini to help me take this screencast, which is available on YouTube. Click on the video link below to see it in action.






Quiver – TBD


Flow –TBD

Recommended websites and Blogs

Popular Augmented Reality Videos


The story of our BPM engine’s birth (an embeddable human workflow service) and the role of Elasticsearch in it

In 2013, we bagged a development project from one of the largest fresh produce companies in the world, based in the USA, to build half a dozen custom workflows and incrementally add more in the future. One of our senior architects went onsite to their location for a requirements, environment, and systems-dependency study, and came back with a solution design and implementation plan, including project sizing, effort, cost, and timeline.

The client wanted to build the workflow application on the Force.com platform, because their Trade Promotions Management (TPM) system was built on it. But we decided not to go with that: it is very proprietary, and a lot of their custom-built applications were already on the Microsoft .NET stack, so it made sense to stick with .NET, where support from Microsoft is guaranteed and good.

During the study, our team learnt that the details of each workflow request (the data captured in the request, i.e. its attributes) are completely different. Typical examples of workflow requests are (for security reasons, I cannot reveal the actual request forms, but these are similar):

  1. Expense claims approval request
  2. Hiring approval request (Hiring request)
  3. Leave approval request
  4. Travel approval request
  5. Loan approval request

With my experience of full-stack BPM solution implementations (product evaluation, procurement, training, and implementation) at Infosys back in 2006, I realized that we needed to build a lightweight BPM engine. A commercial off-the-shelf BPM engine was out of the question due to the complexity it brings in: licensing cost, infrastructure requirements, consultants and their exorbitant rates, etc.! I also ruled out Windows Workflow Foundation, which IMHO is a failed framework from Microsoft, and SharePoint is more for document workflows. Hence, a custom build in .NET was the proposal we put forward, and we got their IT team’s approval to move ahead.

The Design Challenge

Being an ALT.NET advocate, I wanted the design based on SOLID principles (loose coupling, plug & play, scalability, etc.), and the client gave us reasonable time for the project and was very cooperative. More than the BPM engine itself, the design challenge was the workflow request domain model (data model). I wanted to keep one single code base and one single domain model for all the workflow requests, so the natural choice of database was a schema-less document database (NoSQL). But the customer felt this was a risk: it would be a first-time implementation in their enterprise, not time-tested in their environment, and seeking approval from the various groups involved might take a long time, or fail, given the lengthy approval processes and policies. As an alternative, I suggested storing each request in a single table, in a single column, as JSON (as in NoSQL), with the compromise that we could not query the request object, since that would involve string parsing or loading the domain model as POCO collections in the .NET client, which is not advisable and will not scale as time passes and the data grows.

(Note: the above design goes against RDBMS principles, but “it just works(ed),” as Jason Fried says in his book Rework.)

Note: we didn’t have any need to query the workflow requests, so it was easy to go with the above approach. But what if the need came at a later stage? The answer was NoSQL: we planned to import the workflow data into a document database (a reporting server / de-normalized store), fire JSON-based queries, and present the results. Even then, importing into a document database like RavenDB would demand strongly typed objects (a domain model), which we didn’t have in our design (we used C# dynamics and leveraged the DLR), whereas MongoDB accepts any JSON/schema, with a strongly typed domain model being optional. [This is where Elasticsearch came into the picture, and our initial design embraced it without any hassle.]

I was very specific in the design about cutting down the number of lines of code, encouraging code reuse, and leveraging the Dynamic Language Runtime (DLR) capabilities of .NET, which Microsoft released to make C# behave like a dynamic language such as Python. We took the following decisions:

  1. Use C# dynamics, with no separate domain model or view model for each workflow request, or for any domain per se.
  2. All data has to be a first-class JSON citizen, so that in the future we can migrate to a document database (NoSQL, like MongoDB or RavenDB).
  3. Use the Massive micro-ORM layer, which is built on C# dynamics (DLR) and cuts the data access code down to almost zero.
  4. For workflow status querying, design a standard workflow table with dedicated fields for direct querying, while the entire request object is stored as JSON in a single text field, so we are free to store anything. This will also help us import into a schema-less DB like MongoDB in the future.
  5. Develop in ASP.NET MVC for the first time, making our controllers act as RESTful endpoints using content negotiation.
  6. Build a generic audit trail engine based on the Event Sourcing design pattern, with all data stored as JSON.
  7. Build the workflow engine by leveraging proven design patterns (Chain of Responsibility, Strategy, Specification).
  8. Plus all our standard architecture, tools, and frameworks: Spring.NET for DI & AOP, JSON.NET for JSON serialization & deserialization, client-side templates, and ASP.NET controllers for view rendering and RESTifying.
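Decision 4 above, in miniature: a few fixed, queryable columns plus one opaque JSON column for the request payload. A language-neutral sketch in JavaScript (the actual implementation used C# dynamics); the table and field names are illustrative, not our real schema.

```javascript
// Flatten a workflow request into a row: dedicated, queryable columns
// plus the full request serialized into one JSON text column.
function toWorkflowRow(workflowId, status, request) {
  return {
    workflow_id: workflowId,               // dedicated column: direct querying
    status: status,                        // dedicated column: direct querying
    request_json: JSON.stringify(request), // schema-less payload, any shape
  };
}

// Rehydrate the schema-less payload when (rarely) needed.
function fromWorkflowRow(row) {
  return JSON.parse(row.request_json);
}
```

Any of the request types above (expense claim, hiring, leave, travel, loan) round-trips through the same two functions, which is what keeps the code base single.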

We finished the development and thoroughly tested our design; all worked as expected, though we kept fine-tuning continuously.

What is the role of Elasticsearch?

We realized the potential of the work we did for this client and thought: why not create it as a product? From our pre-sales pipeline we had learnt that many companies live with manual Excel-based workflows, or rudimentary custom-developed workflow tools with tightly coupled business rules that are hard to modify. Workflow products cost a fortune, which SMEs cannot afford, and many don’t want cloud-based tools because they feel their data is not secure. All of this encouraged us to turn our work into a generic workflow engine!

When we approached prospects, most wanted the solution on an open-source platform, not the MS .NET stack, and they also needed to query the workflow requests. So we decided to re-write it in Java on the Play framework, with MongoDB & MySQL as the backend and Elasticsearch as the querying and reporting engine.

For Elasticsearch querying, we run a scheduled job that pushes the workflow requests into a RabbitMQ queue; a listener picks them up from the queue, pushes them into the Elasticsearch server, and builds up the index. We wrote a bunch of Elasticsearch JSON-based queries to pull the results and weaved them into our UI layer. This is up and running in-house @ Vmoksha: nearly 10 internal workflows are built on this design and rolled out to production, catering to 150 of our employees on a daily basis. The results from Elasticsearch are lightning fast.

Where else do we use Elasticsearch @ Vmoksha?

Apart from BPM, we use Elasticsearch heavily to store a custom index of the data from our Applicant Tracking System (ATS), which is built on Redmine, an open-source Rails-based tool; on top of it we wrote Elasticsearch scalar queries to power our dashboards. We also use Elasticsearch to pull our P&L data from a custom SQL Server database and present it in our executive and financial dashboards, and our CRM metrics dashboard is powered by Elasticsearch as well, with the data coming from SQL Server.

Architecture of our (Vmoksha’s) vBPM, Embeddable Human Workflow Service

We have it in two flavors: MS .NET and Java (Play). vBPM exposes its functionality as RESTful services that can be easily accessed from any client (mobile, web, Windows, etc.); it is technology agnostic. Both the input to vBPM and the output from it are JSON, which is very lightweight and easy to consume from any type of client. Security is implemented using JSON Web Tokens (JWT).

TBD: Elasticsearch & RabbitMQ are optional components and hence not depicted in the diagram. I am working on a new diagram and will update this once it is done.


Workflow Designer : vBPM persists the workflow definitions in MySQL tables (the workflow & actors tables). It will have a rudimentary web-based user interface to configure the workflows and steps; alternatively, one can set up a workflow by entering the data directly into the MySQL tables.

Workflow Execution Engine : a Java class that implements the Chain of Responsibility (CoR) design pattern. All of the workflow’s functionality is exposed as RESTful APIs. Once a workflow instance is kicked off, the engine reads the workflow metadata and executes the instance accordingly. The actors for the workflow steps are identified using CoR and the workflow rules.
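The Chain of Responsibility idea behind the execution engine, sketched in JavaScript for brevity (the real engine is a Java class; the step names and rule shape below are illustrative): each handler either resolves the actor for a request or passes it along the chain.

```javascript
// A workflow step handler: claims a request if its rule matches,
// otherwise passes it to the next handler in the chain.
function makeHandler(name, matches, resolveActor) {
  return {
    name,
    next: null,
    handle(request) {
      if (matches(request)) return { step: name, actor: resolveActor(request) };
      return this.next ? this.next.handle(request) : null;
    },
  };
}

// Wire handlers into a chain and return the head.
function chain(...handlers) {
  handlers.reduce((prev, h) => ((prev.next = h), h));
  return handlers[0];
}

// Example: route approval requests by amount.
const manager = makeHandler('manager-approval', (r) => r.amount <= 1000, () => 'manager');
const director = makeHandler('director-approval', (r) => r.amount > 1000, () => 'director');
const head = chain(manager, director);
```

Because each handler is self-contained, new steps or rules can be spliced into the chain without touching the others, which is the point of the pattern.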

Business Rules Engine : a Java class implemented using the Strategy design pattern, tied together with CoR through Spring DI & AOP.

Activity Monitor : during workflow execution, the state of the workflow can be queried through the Activity Monitor and presented in the client layer. This is exposed as a REST API.

Elasticsearch & RabbitMQ are optional components for high-performance searching & querying.

SLA Manager : SLAs can be configured at the workflow level and at the workflow-step level. The workflow execution engine monitors SLA breaches and raises the corresponding alerts and notifications, which again are configurable templates at the workflow and step level.

Alerts & Notifications : custom alerts can be sent to the interested parties. The alert template is a configurable parameter for every workflow and workflow action. Email & SMS alerts can be sent, and the system has a provision to configure any email & SMS gateway through the provider-model design pattern.

External API Invoker : the system has a provision to invoke external APIs and use their results as input for workflow routing.

Attended NY Tech Meetup, my first meetup: a huge gathering and a wonderful experience

I had heard a lot about NY Tech Meetup from my directors and sales heads in the US. Yesterday I got an opportunity to attend it, and the experience was great. These kinds of meetups help build networks, which is crucial for people in technology, sales & marketing, etc. You find new people and connections that can be converted into win-win business, friendships, etc.

Some pictures from my Nexus phone camera roll


At yesterday’s meetup the auditorium was full, even though the weather was bad (snow, wind, and rain). I learnt that NY Tech Meetup is always full; it is not unusual for them. NY Tech Meetup is the largest tech meetup group in the world.

True to the name, I met a lot of interesting people (engineers, salespeople, students, school kids) from around the world, including a student from Pakistan named Junaid, who sat next to me in the auditorium. He came to NY to do a project at Microsoft (making a smartphone work as a hearing-aid device using Bluetooth). His interest area is hardware–software interfacing & IoT. Interesting!

The following companies, individuals, and NGOs presented their ideas, products, MVPs, etc. [content courtesy of]. Of all the demos, I liked Lyft and the Peloton bike.

Digital Natives Group


Whisper from Digital Natives Group: The most user-friendly and flexible K-12 school communications platform.



Kollecto helps people buy affordable art by pairing them with a virtual Art Advisor & personalized art recommendations.



The free NYC marketplace for leasebreaks and short-term rentals.



ListenLoop is the evolution of B2B retargeting, providing advertising automation that displays personalized ads based on visitor behavior and demographics.



Lyft is a ride within minutes, right at your doorstep. Simply request a ride and get picked up by a friendly driver.

Peloton Cycle


Peloton is a company in Manhattan that makes an indoor bike equipped with a touch screen that allows riders to participate in live-stream and on-demand cycling classes.

AngularJS – a Datatable directive hack for the nonSortable property

We use the AngularJS Datatable directive. As per our UX standard for data grids, all the action buttons should be at the extreme left, without a title or sort icon.

(See the “Before” screenshot.) The Datatable directive exposes a property called “nonSortable,” but it does not work as expected for the column that appears first (see the “Before” screenshot). To overcome this, I added an invisible column as the first column, making the original one the second column (see the “After” screenshot). Hope this is useful.






Disclaimer: this is by no means an elegant solution, but it works.
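The hack, as column definitions: an invisible, non-sortable dummy first column so that the real actions column (now second) renders without a sort icon. This is a sketch; the option names follow the legacy DataTables API (`aoColumnDefs`/`aTargets`, `bSortable`, `bVisible`), so adjust them to whatever your directive version expects.

```javascript
// DataTables column definitions implementing the invisible-first-column
// hack. Index 0 is the dummy column; index 1 holds the action buttons.
function buildColumnDefs() {
  return [
    { aTargets: [0], bSortable: false, bVisible: false }, // dummy column
    { aTargets: [1], bSortable: false },                  // action buttons
  ];
}
```

The returned array is passed into the directive's DataTables options alongside your real column definitions.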

How iBeacon can be leveraged in a real-life scenario – beep beep, the bus has arrived

Apple iBeacons (BLE-based proximity sensors) are gaining huge attention and have opened up a lot of use cases in our day-to-day activities.

Bus Notifier

In the middle of last year, I sourced dozens of Gimbal proximity sensors from the US, all the way via Japan to our India office (there was no direct shipment to India, so I requested help from an acquaintance in Japan), for R&D on proximity sensors at Vmoksha Labs. For various reasons they sat idle for many months, until three engineering students met me in December 2014 asking me to guide their academic projects in mobility. I proposed a use case using the Gimbal proximity sensor (“beep beep, the bus has arrived”), which our architect Princeton Paul Arokyaraj published on our official blog (image designed by our UX architect Archana). Please read it; it is really interesting.

As of today, the engineering students have converted the use case into a real-life Android app, which they are going to present as part of their final-year project, and we are showcasing it to our prospects and technology geeks.

Note: our IoT division handles many client projects that interface with hardware and sensors. If you are looking for a partner who develops mobile and IoT apps with hands-on hardware knowledge, then reach us @

We @ Vmoksha Technologies implemented Google MDM solution to distribute corporate apps to our employees



Last week we rolled out the Google Mobile Device Management (MDM) solution @ Vmoksha Technologies Pvt. Ltd. to host all of our corporate Android mobile applications securely, in compliance with the ISO 27001 information security standard. With this, we enabled corporate-owned, personally enabled (COPE) devices and employee-owned devices as part of Vmoksha’s bring-your-own-device (BYOD) program.

We were actually evaluating the open-source MDM product WSO2 and had almost finalized it for implementation, but then we understood that Google itself provides MDM, and it made sense for us: we already use Google Apps for Business [email, Calendar, Drive, etc.], so there are no integration & setup overheads (loading users, defining policies, etc.), and the critical mass of our users are on Android.

How to upload corporate Android apps to the Play Store?

Uploading enterprise Android apps to the Play Store is the same as uploading public apps, except that app distribution is restricted to your Google Apps domain [email domain]. This is enabled by checking “Only make this application available to users of my Google Apps domain name (

A sample screen of ours is given below



How to download and install apps?

Once an app is uploaded by the administrator to the Google Play Store, employees can download and install it directly from the Play Store on their Android device, PROVIDED THE DEVICE IS REGISTERED AND APPROVED BY YOUR GOOGLE BUSINESS APPS ADMINISTRATOR.

The device can be registered through the “Device Policy” Android app, as shown below. Once the device is registered, an email goes to your Google Business Apps administrator to approve the device (and user), which ensures that the device can be remotely administered and controlled. If a device is stolen, or an employee leaves the organization, the administrator can uninstall the apps, wipe the data (restrict access), etc., from the Google MDM control panel, over the air.



Once the above setup is complete, a new category appears under APPS -> CATEGORIES -> <your organization name>, as shown below. Clicking on <your organization name> lists all the apps [as per the access control defined for that user in the policies]. Clicking on each app, one can see its details, install or uninstall it, view total downloads, etc., like any other app in the Google Play Store.


Inbound Email processor using Node.JS Mail-Notifier

I have written a simple but full-fledged inbound email processor application using the mail-notifier npm package, which internally uses imap and mailparser. The complete source code is in my Git repository for download or cloning.


[Note: the image was created using the Lucidchart online diagramming tool]

In this application, I have taken care of the following business functionality:

  1. Continuously look for all incoming email messages
  2. When an email arrives, process it and form a modified email JSON object with the following properties
    • From
    • To
    • Subject
    • Plain Text Body
    • Message ID
    • Date of the email
    • Attachments
  3. Keep the configuration in a separate config file (email credentials, Mongo connection string, etc.)
  4. Save the email object into the MongoDB store
  5. Save the email object as a .txt/.json file in the “uploads” folder
  6. Iterate through the attachments, if any, and store them in the “uploads” folder [TODO: MongoDB GridFS]
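The listed functionality, condensed: a pure transformation from the parsed mail to the stored JSON shape, plus the mail-notifier wiring. The config fields follow mail-notifier's IMAP options, and the property names on the parsed mail object are those mailparser commonly exposes, but treat both as assumptions and check against the package docs; the wiring only starts when credentials are present in the environment.

```javascript
// Shape the parsed mail into the email object described above.
function toEmailObject(mail) {
  return {
    from: mail.from,
    to: mail.to,
    subject: mail.subject,
    text: mail.text,              // plain text body
    messageId: mail.messageId,
    date: mail.date,
    attachments: mail.attachments || [],
  };
}

// IMAP wiring: runs only when credentials are configured, so the
// transformation above stays usable and testable on its own.
if (process.env.IMAP_USER) {
  const notifier = require('mail-notifier'); // npm install mail-notifier
  notifier({
    user: process.env.IMAP_USER,
    password: process.env.IMAP_PASSWORD,
    host: process.env.IMAP_HOST || 'imap.gmail.com',
    port: 993,
    tls: true,
  })
    .on('mail', (mail) => console.log(JSON.stringify(toEmailObject(mail))))
    .start();
}
```

In the full application, the `mail` callback is where the Mongo insert and the “uploads” folder writes hang off.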


How to run the application?

  1. Clone the repository or download the zip and extract into a folder
  2. Set the IMAP configuration in “config.js” – use your email credentials
  3. Set the MongoDB configuration – this is optional; if you don’t need it, comment out the corresponding line in “mailprocessor.js”
  4. Set the Mongo collection to “emails” – applicable only if step 3 is used
  5. Run “npm update” – this downloads all the dependent node_modules recognized from package.json
  6. run “node mailprocessor.js”


Why I created this?

  • Business motive – we have a home-grown helpdesk/ticketing system for our employees to raise any kind of support ticket online; however, employees are more comfortable sending helpdesk tickets through email to ; our process requires the helpdesk/IS team to create a ticket entry for every request received through email. This application was built to automate ticket creation from email.
  • Passion – I wanted to try this out in Node.js, as it is very simple thanks to the load of ready-made npm packages, though choosing the correct one is the key
  • No full-fledged sample – during the course of development I could not easily find complete references and samples (mail parsing, attachments, etc.); I collected a lot of references from Stack Overflow Q&As

Scope for improvement (or TBD)

  1. Email filters and rules to process only those emails that match a specific condition
  2. Store the attachments in the MongoDB GridFS


  • JavaScript IDE – WebMatrix; it is free from Microsoft and simple to use, with the ability to integrate with a Git repo
  • MongoDB query analyzer – MongoVUE