September Golang Bangalore Meetup

The September Golang Bangalore Meetup was conducted on Saturday, September 16, 2017 at DoSelect, Bengaluru. Around 25-30 people attended the meetup.

The meetup started at 10:15 with the first talk by Baiju Muthukadan who works at Red Hat India Pvt. Ltd., Bengaluru. He talked about “Testing techniques in Golang”.


Karthikeyan Annamalai gave a lightning talk about “Building microservice with gRPC”. The slides related to his talk can be found here.


Dinesh Kumar gave an awesome talk about “Gotcha’s in Golang”. The slides related to his talk can be found here and the code explained during the demo is here.


The last lightning talk of the meetup was by Akshat who works at Go-Jek. Akshat talked about “Building an asynchronous http client with retries and hystrix in golang”.


I thank Sanket Saurav and Mohommad Rafy for helping us organize the September Golang Bangalore Meetup by providing the venue and food at DoSelect. I also thank Sudipta Sen for helping us out with the meetup preparation.


Serverless Architecture

I attended the Serverless Architecture Meetup organized by Hasgeek on Saturday, September 23, which made me curious to learn more about serverless architecture. The meetup was held at Walmart Labs, Bengaluru.

The first talk was by Akhilesh Singh, who is a Senior Technical Consultant at Google. Akhilesh talked about:

  • What is Serverless Architecture?
  • Evolution of serverless
  • Serverless vs IaaS model

Akhilesh not only explained clearly what serverless architecture is but also shared his own point of view on this trend.

The second talk was by Ganesh Samarthyam, Co-founder of CodeOps Technologies and Srushith Repakula, Software Engineer at CodeOps Technologies. Ganesh talked about how serverless architecture is applied in practice. Srushith showed a demo application for auto-retweeting written in Python which uses Apache OpenWhisk.

The most interesting part of the meetup was the Panel Discussion. The panel members were:

  • Akhilesh Singh
  • Ganesh Samarthyam
  • Joydeep Sen Sarma (Co-founder & CTO, Qubole)
  • Rishu Mehrotra (SRE Manager, LinkedIn)

During the meetup, a lot of questions were raised around:

  • Security in serverless architecture
  • How resources are utilized
  • Role of DevOps in serverless architecture, etc.

These are my notes on serverless architecture:

Servers

Conventionally, servers:

  • have fixed resources
  • are supposed to run all the time
  • are managed by system administrators

 

Problem with Servers

  1. When traffic increased, servers could not handle the enormous number of requests and would crash.

 

PaaS

  1. To handle the above problem, PaaS came into existence, which offered scaling.
  2. This can be considered the first iteration of serverless.
  3. You still think about servers, but you don't have to manage them.

 

What does Server-less mean?

The word “server-less” does not mean that there are no servers at all. It simply means that you no longer have to manage the servers yourself.

 

What is Serverless?

  1. Serverless computing is a cloud computing execution model in which the cloud
    • manages allocation of machine resources
    • bills based on actual amount of resources consumed by application (rather than billing on pre-purchased units of capacity)

 

 

What problem does Serverless architecture solve?

  1. We build our applications around VMs. We have a VM for each:
    • database
    • web
    • application
  2. If a VM fails, a layer of our application fails.
  3. Even if we break the application down into smaller containers or microservices, when these microservices or the infrastructure fail, our application fails.

 

Advantages of Serverless architecture

1. Focus on application development rather than managing servers.

2. Serverless provisioning is completely managed by providers using automated systems, which eliminates the need for system administrators.

 

Stateless Nature of Serverless architecture

1. Serverless architectures are event driven.

2. This means that for each event or request to the server, a state is created.
After the request is served, the state is destroyed.

 

Problem with Statelessness

  1. There are different use cases for a stateless architecture. So, your application architecture needs to be redesigned according to the use case.
  2. State can be kept across multiple requests with:
    • an in-memory database like Redis
    • simple object storage
  3. This is slower than keeping state locally on the server in:
    • cache
    • RAM
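
The idea of keeping state outside the function can be sketched in Go as below. This is only an illustration under assumptions: the Store interface, memoryStore and handleVisit are hypothetical names made up for this example, and memoryStore is a toy stand-in for a real external service such as Redis or object storage.

```go
package main

import (
	"fmt"
	"sync"
)

// Store is a hypothetical interface standing in for an external service
// such as an in-memory database or object storage.
type Store interface {
	Get(key string) int
	Set(key string, value int)
}

// memoryStore is a toy Store used only so that the sketch runs on its own.
type memoryStore struct {
	mu   sync.Mutex
	data map[string]int
}

func newMemoryStore() *memoryStore {
	return &memoryStore{data: make(map[string]int)}
}

func (m *memoryStore) Get(key string) int {
	m.mu.Lock()
	defer m.mu.Unlock()
	return m.data[key]
}

func (m *memoryStore) Set(key string, value int) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.data[key] = value
}

// handleVisit plays the role of the function: it holds no state of its own,
// so everything it needs to remember is read from and written to the store.
func handleVisit(s Store, user string) int {
	visits := s.Get(user) + 1
	s.Set(user, visits)
	return visits
}

func main() {
	store := newMemoryStore()
	fmt.Println(handleVisit(store, "asha")) // 1
	fmt.Println(handleVisit(store, "asha")) // 2
}
```

Note that the read followed by the write is not atomic here; a real store would usually offer an atomic increment for this kind of counter.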

 

Function As A Service (FaaS)

  1. A way to implement Serverless architecture
  2. What is a function?
    • Function is a small program that does one small thing
  3. Short-lived functions are invoked upon each request, and the provider bills the client for running each individual function.

 

Popular FaaS Services

  1. AWS Lambda
  2. Google Cloud Functions
  3. IBM BlueMix OpenWhisk
  4. hook.io

 

FaaS vs Managed Servers

1. Similarity:
You don't have to manage the servers.

2. Fundamental difference:
In FaaS, you don't need to manage the server applications either.

 

Advantages of FaaS

  1. Two FaaS functions written in different languages can interact with each other easily.
  2. Multiple functions can be connected and chained together to implement reusable components.

 

FaaS vs PaaS

Consider an e-commerce website. On a normal day, the traffic is average. But during holidays, we can expect a sudden surge in traffic. In those cases, the server will not be able to serve so many requests and will eventually crash. This can be solved by scaling the server resources.

In PaaS, scaling is provided. But you need to estimate how many resources you will need and then provision them accordingly. The problem with this is that you might overestimate or underestimate. If you overestimate, then even on normal days you pay for unused resources. If you underestimate, then your server will crash when traffic increases.

In FaaS, the biggest USP is ‘automatic scaling’. You don't have to think about scaling. Automatic horizontal scaling is managed by the provider and is completely elastic.

 

Backend As A Service (BaaS)

  1. It integrates into FaaS architecture
  2. BaaS provides entire application components as a service, such as:
    • DB storage
    • push notifications
    • analytics

 

FaaS Cold Start Problem

  1. Cold starting a function on a serverless platform takes a considerable amount of time.
  2. This is bad in cases where certain functions are accessed infrequently.
  3. This can be overcome by a process called ‘warming’, wherein functions are invoked periodically.

 

FaaS Time Limit Problem

  1. FaaS functions have a time limit within which they have to finish running.
  2. If they exceed it, they are automatically killed.
  3. So, the application should be redesigned to divide a long-lived function into multiple coordinated functions.
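
One way to picture dividing a long-lived function is to process the work in batches and hand a cursor from one invocation to the next. The Go sketch below is an assumption of mine, not something shown at the meetup: processBatch and the batch size of 3 are hypothetical, and the loop in main only simulates the repeated invocations that a real platform would trigger.

```go
package main

import "fmt"

// processBatch handles at most batchSize items starting at cursor.
// It returns the cursor for the next invocation and whether work remains.
// It assumes 0 <= cursor <= len(items).
func processBatch(items []int, cursor, batchSize int, handle func(int)) (next int, more bool) {
	end := cursor + batchSize
	if end > len(items) {
		end = len(items)
	}
	for _, item := range items[cursor:end] {
		handle(item)
	}
	return end, end < len(items)
}

func main() {
	items := []int{1, 2, 3, 4, 5, 6, 7}
	sum, invocations := 0, 0

	// Each loop iteration stands in for one short-lived function invocation.
	cursor, more := 0, true
	for more {
		cursor, more = processBatch(items, cursor, 3, func(v int) { sum += v })
		invocations++
	}
	fmt.Println(sum, invocations) // 28 3
}
```

Each invocation stays small enough to fit inside the time limit, and only the cursor has to be carried between invocations.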

 

Vendor lock-in

  1. This is the major disadvantage of FaaS.
  2. When you move from one provider to another, you will need to change your code accordingly.

 

Serverless Architecture

  1. Serverless goes a step further: you don't even have to think about capacity in advance.
  2. You would generally run a monolithic application on a PaaS.
  3. Serverless lets you break your application into small self-contained programs (functions).
    • Example:
      • Each API endpoint can be a separate function
  4. From an operations perspective, the reason you would break down your app into functions is to scale and deploy them separately.
    • Example:
      • If one of your API endpoints gets 90% of the traffic, then that one bit of code/function can be distributed and scaled much more easily than your entire application.

August’17 Golang Bangalore Meetup

The August Golang Bangalore Meetup was conducted on Saturday, August 26, 2017 at Red Hat India Pvt. Ltd. Since the event took place around the holidays, fewer people turned up for the event.

The meetup started at 10:30 with the first talk by Nurali Virani who works at SAP Labs, Bengaluru. He talked about “Understanding Slice & Map in Golang”. Nurali’s talk was beginner friendly. He explained the concepts in great detail through live coding and addressed every question raised by the participants. The code written by Nurali during his demo can be found here.

The next talk was given remotely by Steve Manuel. Steve (@nilslice) lives in Boulder, Colorado. He is the co-founder of Boss Sauce Creative. Steve talked about his open source project Ponzu. Ponzu is a headless CMS with an automatic JSON API, featuring auto HTTPS, HTTP/2 Server Push, and a flexible server framework written in Go. The slides related to his talk can be found here. Other related resources: Github, Docs, Addons. To know more about Ponzu, join #ponzu on gophers.slack.com. You can get an invitation to join the Slack from here: https://invite.slack.golangbridge.org

Fortunately, Steve’s talk was recorded. The recording can be found here.

I thank Udayakumar Chandrashekhar and Red Hat India Pvt. Ltd. for helping us organize the August Golang Bangalore Meetup by providing the venue and food.


June ’17 Golang Bangalore Meetup

The June Golang Bangalore Meetup was conducted on Saturday, June 17th, 2017. There were around 35-40 people who attended the meetup.


The meetup started at 10:15 with the first talk by Nurali Virani who works at SAP Labs, Bengaluru. He talked about “Understanding Type System In Go”.


Saifi gave an awesome talk about “Working with C code and plugins in Go”. The slides related to his talk can be found here.


Umasankar Mukkara is the Co-Founder and CEO at CloudByte Inc. He talked about OpenEBS and their experience with Golang. The slides of his talk can be found here. OpenEBS is an open source project written in Golang. You can find the source code of OpenEBS here.

Satyam Zode, who is also a fellow Golang programmer at OpenEBS, presented a talk about “Package oriented design in Go”. The slides related to his talk can be found here.


I thank Uma, OpenEBS and Nexus Ventures Partners for helping us organize the June Golang Bangalore Meetup by providing the venue and food. OpenEBS also gave a few goodies to the participants. Fortunately, this event was recorded and we will share the recorded videos as soon as they are ready.

 

 


Working on Jira and Bugzilla issue in Project Almighty

I joined Red Hat on 15/06/2016.

I was assigned two issues on project Almighty to work on:

  1. Create simple example of fetching Issues from Jira in Go
  2. Create simple example of fetching Issues from Bugzilla in Go

I started with the Jira issue. This was my first issue in Go. While solving it, I learnt how to use the “go get” command, which downloads and installs an upstream package. For example, this is the upstream package which I used for the Jira issue –> github.com/andygrunwald/go-jira

So, I downloaded and installed the package using the following command:

go get github.com/andygrunwald/go-jira

The package got installed.

I learnt how to take input from command-line flags using “flag” package. To do this:

  • import flag package using:

import "flag"

  • Declare the variable in which you want to store the input

var username string
flag.StringVar(&username, "uname", "", "Username")

The first argument, &username, is a pointer to the variable which stores the input value.
The second argument is the flag name which you use on the command line, here “-uname”.
The third argument is the default value of the flag input.
The fourth argument is the description of the flag.

  • Parse the flag:

flag.Parse()
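
Putting the three steps together, a minimal runnable sketch could look like the one below. It uses a separate flag.FlagSet instead of the package-level functions only so that the parsing can be exercised with a hand-written argument list; the parseUsername helper, the "jira-example" set name and the sample value "alice" are made up for this example.

```go
package main

import (
	"flag"
	"fmt"
)

// parseUsername parses the -uname flag from args and returns its value.
// A dedicated FlagSet is used so that any argument slice can be parsed,
// not only os.Args.
func parseUsername(args []string) (string, error) {
	fs := flag.NewFlagSet("jira-example", flag.ContinueOnError)

	var username string
	fs.StringVar(&username, "uname", "", "Username")

	if err := fs.Parse(args); err != nil {
		return "", err
	}
	return username, nil
}

func main() {
	// Stand-in for os.Args[1:]; "alice" is a made-up value.
	username, err := parseUsername([]string{"-uname", "alice"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("username:", username)
}
```

In a real command-line program you would pass os.Args[1:] to the helper, or simply call flag.StringVar and flag.Parse as shown above.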

Next, I learnt to work with the “net/http” package and how to parse an HTTP response in Go. Although I didn't use the package for this issue, I did experiment with it.

I used the “reflect” package to find out the data type of the parsed HTTP response. The two expressions I used were:

reflect.TypeOf(result)
reflect.TypeOf(result).Kind()

I found that the parsed response was a slice and that each slice element was a struct.

To access a value from a struct, I had to find out its field name. For that, I used the “reflect” package on a single struct element of the result:
reflect.Indirect(reflect.ValueOf(element)).FieldByName(field)
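
The reflection calls above can be tried out with a small self-contained sketch. The Issue struct and the fieldString helper are hypothetical stand-ins, since the real response type is not shown here; the sketch only assumes that the result is a slice of structs.

```go
package main

import (
	"fmt"
	"reflect"
)

// Issue is a hypothetical stand-in for one element of the parsed response.
type Issue struct {
	Key     string
	Summary string
}

// fieldString returns the named field of a struct (or a pointer to a struct)
// formatted as a string. It returns "" if v is not a struct or the field
// does not exist.
func fieldString(v interface{}, field string) string {
	val := reflect.Indirect(reflect.ValueOf(v))
	if val.Kind() != reflect.Struct {
		return ""
	}
	f := val.FieldByName(field)
	if !f.IsValid() {
		return ""
	}
	return fmt.Sprint(f.Interface())
}

func main() {
	result := []Issue{{Key: "DEMO-1", Summary: "Sample issue"}}

	fmt.Println(reflect.TypeOf(result).Kind())    // slice
	fmt.Println(reflect.TypeOf(result[0]).Kind()) // struct
	fmt.Println(fieldString(&result[0], "Summary"))
}
```

reflect.Indirect is what lets the helper accept either a struct or a pointer to one, and the IsValid check avoids a panic when the field name is misspelt.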

The next issue was the Bugzilla issue. The Go packages for Bugzilla are listed here.

I used this package to get the results –> github.com/kolo/xmlrpc

XML-RPC is a protocol for making remote procedure calls over HTTP, with the calls encoded in XML.

The endpoint for the XML-RPC interface is the xmlrpc.cgi script in the Bugzilla installation. So, for Red Hat Bugzilla, the endpoint is https://bugzilla.redhat.com/xmlrpc.cgi

You can find my code here.


Machine Learning

Learning

To gain knowledge or understanding or skill through:

  • study
  • instruction
  • experience

 

Machine Learning

The field of study that gives computers the ability to learn without being explicitly programmed.

The goal is to devise programs that learn and improve their performance with experience, without human intervention.

 

Training Data

  1. Set of examples (input -> output) for learning
  2. Used to build model

 

Test data

  1. Used to:
    • test how well your model can predict
    • estimate model properties
  2. It always lies outside the training data set but follows the same probability distribution as the training data

 

Feature

  1. also called predictor
  2. It is a meaningful attribute
  3. Internal representation of data
  4. quantity describing an instance
  5. property of an instance

 

Tuple

  1. A record in a database
Features are columns and tuples are rows.

 

If we increase the number of records and attributes in a data set, then a machine learning problem also becomes a Big Data problem.

 

Supervised learning

  1. It uses training data set consisting of input -> correct output to train the model
  2. Example:
    • Page Ranking Algorithm
    • Next word recommendation in Instant Messaging Application/ Whatsapp/ SMS

 

Unsupervised learning

  1. No training data set with known correct outputs exists
  2. Unsupervised learning algorithms are the most difficult because there is no “fixed” objective.
  3. Used in Exploratory Data Analysis (EDA)
  4. Example:
    • Used in recommendation systems to find users who are similar to me in an existing database

 

Machine Learning types

 

Classification | Clustering
We have a set of pre-defined classes and we want to know which class a new object belongs to. | Group a set of objects and find whether there is some relationship between the objects.
It is predictive modelling. We give pre-defined groups and predict the group of new data. | It is descriptive modelling. We try to find groups which occur naturally in the data.

 

Classification

There are 6 items categorised in 2 classes:

Example of Classification

 

Each category has a label, e.g. Eatables and Non-Eatables. If we have to predict the class of a new item “strawberry”, then it will be assigned the label “Eatables”.

 

Clustering

There are 6 items categorised in 2 groups:

Example of Clustering

 

Each group is unnamed, i.e. there is no label attached to the group. If we have to predict the group of a new item “strawberry”, then it will be placed in the first group.

 

Accuracy

  1. How often is the prediction correct?
  2. Accuracy is not a reliable metric for the real performance of a model because it yields misleading results if the training data set is unbalanced (i.e. the number of samples in different classes varies greatly).
  3. Example:
    1. Let the number of cats be 95 and the number of dogs be 5
    2. A classifier can easily become biased towards classifying all samples as cats
    3. Overall accuracy = 95%
    4. BUT 100% recognition rate for cats and 0% recognition rate for dogs
  4. One of the ways to improve accuracy is to provide more balanced data.
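
The cats and dogs example can be checked with a few lines of Go. This is a minimal sketch: the evaluate function is a made-up helper, and the "classifier" is simply a list of predictions that says "cat" for every sample. It assumes that both slices are non-empty and of equal length.

```go
package main

import "fmt"

// evaluate compares predictions with the true labels and returns the overall
// accuracy plus the recognition rate for each class.
func evaluate(actual, predicted []string) (float64, map[string]float64) {
	correct := 0
	total := map[string]int{}
	hits := map[string]int{}
	for i, label := range actual {
		total[label]++
		if predicted[i] == label {
			correct++
			hits[label]++
		}
	}
	rates := map[string]float64{}
	for label, n := range total {
		rates[label] = float64(hits[label]) / float64(n)
	}
	return float64(correct) / float64(len(actual)), rates
}

func main() {
	var actual, predicted []string
	for i := 0; i < 100; i++ {
		if i < 95 {
			actual = append(actual, "cat")
		} else {
			actual = append(actual, "dog")
		}
		// A biased classifier that calls everything a cat.
		predicted = append(predicted, "cat")
	}

	accuracy, rates := evaluate(actual, predicted)
	fmt.Printf("accuracy=%.2f cat=%.2f dog=%.2f\n", accuracy, rates["cat"], rates["dog"])
	// accuracy=0.95 cat=1.00 dog=0.00
}
```

Looking at the per-class rates next to the overall accuracy makes the imbalance visible straight away.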

 

This is one of the interesting things explained by Satish Patil at the Pune Python Meetup:

There is no right or wrong model. There is no best or worst model. There are ONLY useful and non-useful models.

Nobody knows how much percentage of accuracy is good. How much accuracy is needed depends on Business Context.

Consider a company which wants to launch a new product and they want the probability of success of the product using Machine Learning. So, it is the company which DECIDES that if they get probability below 60%, then they will not launch the product. So, this is not something that the developer decides. This totally depends on the business context.

 

Market Basket Analysis

  1. Also called affinity analysis
  2. Association Rule:
    • discovering interesting relation/connection/association between specific objects
  3. Sometimes, certain products are typically purchased together like:
    • beer and chips
    • beer and diapers
    • bread and eggs
    • shampoo and conditioner
  4. So, market basket analysis tells a retailer that a promotion involving just one of the items from the set would likely drive sales of the others
  5. This technique is used by retailers to:
    • improve product placement
    • marketing
    • new product development
    • making discount plans
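
Association rules are usually scored with measures such as support and confidence. These two measures are not named in the notes above, so treat the Go sketch below as my own illustration; the baskets are invented data and the function names are hypothetical.

```go
package main

import "fmt"

// contains reports whether basket holds every one of the given items.
func contains(basket []string, items ...string) bool {
	for _, want := range items {
		found := false
		for _, have := range basket {
			if have == want {
				found = true
				break
			}
		}
		if !found {
			return false
		}
	}
	return true
}

// support is the fraction of baskets that contain all the given items.
func support(baskets [][]string, items ...string) float64 {
	n := 0
	for _, b := range baskets {
		if contains(b, items...) {
			n++
		}
	}
	return float64(n) / float64(len(baskets))
}

// confidence is the fraction of baskets containing the antecedent that
// also contain the consequent.
func confidence(baskets [][]string, antecedent, consequent string) float64 {
	return support(baskets, antecedent, consequent) / support(baskets, antecedent)
}

func main() {
	// Invented shopping baskets, for illustration only.
	baskets := [][]string{
		{"beer", "chips"},
		{"beer", "chips", "bread"},
		{"bread", "eggs"},
		{"beer", "diapers"},
	}
	fmt.Printf("support(beer, chips) = %.2f\n", support(baskets, "beer", "chips"))
	fmt.Printf("confidence(beer -> chips) = %.2f\n", confidence(baskets, "beer", "chips"))
}
```

With these baskets, half of all purchases contain both beer and chips, and two out of three beer purchases also contain chips.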

 

Titanic Data Set

The titanic data set was used in the machine learning talk in Pune Python Meetup. It can be downloaded here.

There are some features in the data set which can be ignored as they are not important like:

  • Passenger ID
  • Name
  • Ticket Number
  • Cabin

and there are some important features which help in classifying like:

  • Survived
  • Gender

 

Impurity Measure

  1. Measures how well the classes are separated
  2. Should be 0 when all data belong to one class

 

Entropy

  1. Entropy can be a measure of quality of model
  2. It is a measure of how spread out the probabilities are.
  3. The more equally the probability is shared among the classes, the higher the entropy. The more skewed the share among the classes, the lower the entropy.
  4. The goal in machine learning is to get a very low entropy in order to make the most accurate decisions and classifications
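
To make this concrete, here is a small Go sketch that computes the Shannon entropy (in bits) of a class distribution. The formula H = -Σ p·log2(p) is the standard definition and not something specific to the talk; the sample distributions are made up.

```go
package main

import (
	"fmt"
	"math"
)

// entropy returns the Shannon entropy, in bits, of a probability
// distribution. Classes with probability 0 contribute nothing.
func entropy(probs []float64) float64 {
	h := 0.0
	for _, p := range probs {
		if p > 0 {
			h -= p * math.Log2(p)
		}
	}
	return h
}

func main() {
	fmt.Printf("%.3f\n", entropy([]float64{0.5, 0.5}))   // 1.000 (equal share, highest entropy for 2 classes)
	fmt.Printf("%.3f\n", entropy([]float64{0.95, 0.05})) // 0.286 (skewed share, lower entropy)
	fmt.Printf("%.3f\n", entropy([]float64{1.0}))        // 0.000 (all data in one class)
}
```

The last case matches the impurity measure described earlier: when all the data belongs to one class, the value is 0.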

 

Decision Tree

  1. A way of graphically representing a sequential decision process
  2. Non-leaf nodes are labelled with attribute/ question
  3. Leaf nodes are labelled with class
Decision tree based on the Titanic data set

 

Pruning

  1. Data can contain noise:
    • instance can contain error
    • wrong classification
    • wrong attribute value
  2. If a particular feature is not used by any tuple or does not influence the outcome, then it is removed.

 

Data Preprocessing

  1. Converting data into interval form
  2. Machine learning algorithms learn from data, so it's important to feed them the right data
  3. Data preprocessing basically involves:
    • correcting mistakes
    • handle missing values
    • handle outliers
    • normalize values
    • nominal values

 

Missing Value

  1. The value of an attribute which is not known or does not exist
  2. Example:
    • value was not measured
    • instrument malfunction
    • attribute does not apply
  3. If a column contains “Not Available”, then it is NOT considered as a missing value.

 

Outliers

  1. samples which are far away from other samples
  2. They can be mistake/ noise or represent a special behaviour
  3. Outliers are generally removed

 

Questions that were asked in meetup

  1. Can data be extended to multiple dimension?
  2. Can distance be other than Euclidean?
    • Yes, Manhattan distance
  3. Are there online courses that teach ML intro?
    • Yes
  4. What is “k” in k-means?
    • k is no. of clusters
  5. Can we use ML for trading?
    • Yes
  6. Any daily life clustering example
  7. Any software product based on unsupervised learning?
    • Google Maps
    • Matrimony/ Dating websites
    • Red Coupon (real estate)
    • Amazon recommendation
    • Netflix
  8. Is the order in which features are given important?
    • No
  9. Why do we say that one model is better than the other?
  10. What if accuracy is not the concern?
    • Accuracy is one way of looking at prediction
  11. Do you think that if model changes, something in feature has changed?
  12. We have tools like WEKA, so why would anyone prefer Python or R?
    • depends on the language available or language the company uses
  13. How do we know that a particular feature is important or not?
  14. What if some features are more influential than others? How will the decision tree be affected?
  15. How to handle outliers in a decision tree?
  16. Will the algorithm figure out the relationship between input and output?
    • This is possible through Regression
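
For the question about distances, the difference between the Euclidean and Manhattan distance is easy to see in code. The Go sketch below is only an illustration with made-up points; both functions assume that the two points have the same number of dimensions, and they work for any number of dimensions.

```go
package main

import (
	"fmt"
	"math"
)

// euclidean returns the straight-line distance between points a and b.
func euclidean(a, b []float64) float64 {
	sum := 0.0
	for i := range a {
		d := a[i] - b[i]
		sum += d * d
	}
	return math.Sqrt(sum)
}

// manhattan returns the sum of the absolute differences along each dimension.
func manhattan(a, b []float64) float64 {
	sum := 0.0
	for i := range a {
		sum += math.Abs(a[i] - b[i])
	}
	return sum
}

func main() {
	a := []float64{0, 0}
	b := []float64{3, 4}
	fmt.Println(euclidean(a, b)) // 5
	fmt.Println(manhattan(a, b)) // 7
}
```

A clustering algorithm such as k-means needs some distance like these to decide which cluster centre a sample is closest to.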

 


Event Report: April Pune Python Meetup

The April Pune Python Meetup (@PythonPune) was conducted on April 30, 2016 at Red Hat, Pune. Around 70 people registered for the meetup but the turnout was around 72-73. A few people registered on the spot.

Python Pune Meetups are organised by Chandan Kumar (@ciypro), who is a fellow Red Hat employee, a Python programmer and a FOSS enthusiast who has contributed to many upstream projects.

The meetup started around 10:45 with introductions, where everybody introduced themselves. Almost everybody knew Python; only 1-2 people did not. A few people were experienced in machine learning and some were completely new to it. I had a course on machine learning in college, where I learnt the theory and did some practical assignments in R. The crowd was diverse, consisting of students, data scientists and professors, with ages ranging from 18 to 70.

The speakers of this meetup were Satish Patil (@DataGeekSatish) and Sudarshan Gadhave (@sudarshan1989), who took a session on Introduction to Machine Learning.

Satish Patil in Pune Python Meetup

 

Sudarshan Gadhave in Pune Python Meetup

Satish Patil is the Founder and Chief Data Scientist of Lemoxo Technologies, Pune where he advises companies large and small on their data strategy. He has 10+ years of research experience in the field of drug discovery and development. He told a few real life machine learning examples from his field in the meetup!

Satish is passionate about applying technology, artificial intelligence, design thinking and cognitive science to better understand, predict and improve business functions. He has a great interest in Machine Learning, Artificial Intelligence, Data Visualisation, Big Data.

Satish covered the following topics:

  • What is Machine Learning
  • The Black Box of Machine Learning
  • features
  • training and test data set
  • classification
  • clustering
  • pure and impure states
  • entropy
  • decision tree
  • supervised and unsupervised learning
  • market basket analysis
  • data pre-processing
  • Titanic data set
  • K means algorithm

Although Machine Learning is a vast subject and definitely requires more sessions to grasp, Satish made a remarkable effort to help us understand all the above topics in layman's terms.

There are a lot of books, courses and material available online for Machine Learning, so why this meetup? Well, the best part about this meetup was the way Satish explained the BUSINESS CONTEXT of MACHINE LEARNING. This was something new for me to learn. Getting to know real-life examples from an entrepreneur-cum-data scientist was really interesting.

The Machine Learning Workshop in Pune Python Meetup

The details of his talk will be in my next blog.

Chandan Kumar talked about Fedora Labs. The Fedora science spin comes pre-installed with essential tools for scientific and numerical work, like an IDE and tools and libraries for programming in Python, C, C++, Java and R. It basically eliminates the need to download a bunch of scientific packages yourself.

If you need any help regarding the spin, you can get help from #fedora-science channel on Freenode on IRC.

As Chandan Kumar ALWAYS encourages us to contribute to open source, he introduced us to WHAT CAN I DO FOR FEDORA?. Pune Python meetups and Devsprint are a great platform to seek help if you want to contribute to open source.

Chandan Kumar in Pune Python Meetup

 

Thanks to Satish Patil and Sudarshan Gadhave for conducting an awesome workshop! We hope to see more such workshops by you in the meetups.

Thanks to Red Hat for the food, beverages and venue.

Thanks to Chandan Kumar, Pravin Kumar (@kumar_pravin), Amol Kahat, Sudhir Verma for organising such interesting meetups where we always learn something new 🙂

 

 
