Tag Archives: python

DIY Web Analytics

I’ve long established that adding heavy-duty analytics and tracking scripts to my blog pages isn’t the right thing to do. Personally, it is also a bit liberating to not know which article of mine is getting a lot of traffic and which isn’t, because then I’m not biased by what the internet is searching for and can write about pretty much anything that I feel like writing about. 10 programming languages you should learn in 2020 has exactly the same weight as Let me tell you a funny story from last night, so which one do you think I’d write about?

The analytics and tracking world has come a long way and is viewed very negatively in the light of recent internet incidents. But it started off very simple and had a very simple and non-malicious idea at its core: Getting to know your user better so that you can serve them better.

That thought made me search for a simple analytics solution that I could run on my blog for a couple of weeks and get enough insights to make informed decisions regarding the frontend design changes while not compromising on the privacy of the visitors. If I’m completely honest, I was also just curious to know these things with no agenda behind it.

I looked into Simple Analytics, a nice solution that does exactly what I needed (perhaps a bit more than that), but a little expensive for me at USD 19 a month. There are also self hosted analytics solutions like Plausible, but that was too much work for realizing this simple thought. So I decided to put something together quickly and the following is what I ended up implementing.

Client side JavaScript

On the client side, I needed to get the data that interested me. It was details like the browsers used by my visitors, platform, width of their screens etc. More technically, the user agent, platform, screen width, referrer and the current page’s url (although I don’t plan on using it for this article. Spoiler: One of my lowest effort articles is pulling more than half of all pageviews which is a bit saddening).

if (!('doNotTrack' in navigator) || !(navigator.doNotTrack === '1')) {
  let analytics = {}

  analytics["href"] = window.location.href
  analytics["userAgent"] = navigator.userAgent
  analytics["width"] = window.innerWidth
  analytics["referrer"] = document.referrer
  analytics["platform"] = navigator.platform

  navigator.sendBeacon(ANALYTICS_ENDPOINT, JSON.stringify(analytics))
}

There’s not much happening here. Just checking if the user prefers to not be tracked, else get the desired data and POST it to our analytics endpoint using the navigator.sendBeacon API.

Server

We need to implement the endpoint that’s listening for the POST requests from our client browsers. I decided to go with Firebase’s functions for handling the request and Firebase’s realtime database to store the data.

const functions = require('firebase-functions')
const admin = require('firebase-admin')

admin.initializeApp()

const cors = require('cors')({
  origin: true,
})

exports.handler = functions.https.onRequest(async (req, res) => {
  if(req.method === 'POST') {
    const snapshot = await admin.database().ref('/hit').push(JSON.parse(req.body))
    return cors(req, res, () => {
      res.json({ message: 'success' })
    })
  }
  else {
    res.json({ message: 'have a good day!' })
  }
})
Now this is super bad code for a variety of reasons, but it worked for my temporary needs. I deployed this, waited for a couple of weeks and had some data to answer some basic questions about my blog’s visitors.

Parsing data

So at this point I had let this code run long enough to have accumulated a couple of hundred entries. It was time to analyze. Firebase allows you to easily export the database in JSON format. Using some basic Python-fu, I created lists of each dimension, passed these lists to Python’s builtin collections.Counter (which is perfect since I’m only interested in aggregated stats), and then took the top 5 most frequent items using the .most_common method. Finally, I plotted bar charts for these top 5 values across each dimension using Matplotlib to visualize the results.
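
To see the Counter step in isolation, here is a tiny sketch with made-up platform values (they are placeholders, not numbers from my export). It only shows what .most_common hands back before anything gets plotted.

```python
from collections import Counter

# Made-up sample values, standing in for one column of the exported data.
platforms = ["Win32", "MacIntel", "Win32", "Linux x86_64", "Win32", "MacIntel"]

ctr = Counter(platforms)

# most_common(n) returns (value, count) pairs, highest count first.
top_two = ctr.most_common(2)
print(top_two)  # [('Win32', 3), ('MacIntel', 2)]
```

Each pair is (value, count), which is why the plotting helper splits item[0] and item[1] into the x and y lists.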

import json
from collections import Counter, defaultdict
from user_agents import parse
import matplotlib.pyplot as plt

analytics_data = defaultdict(list)


def plot_chart_from_ctr(ctr):
    most_common = ctr.most_common(5)
    x, y = [item[0] for item in most_common], [item[1] for item in most_common]
    x = [str(i) for i in x]
    plt.bar(x, y)
    plt.show()


def driver():
    with open('export.json', 'r') as output:
        data = json.load(output)
        hit = data['hit']
        entries = []
        for item in hit:
            entries.append(hit[item])

        analytics_data['height'] = []
        for item in entries:
            analytics_data['width'].append(item['width'])
            analytics_data['href'].append(item['href'])
            analytics_data['platform'].append(item['platform'])
            analytics_data['referrer'].append(item['referrer'])
            analytics_data['userAgent'].append(item['userAgent'])

    browser_family = []
    for agent in analytics_data['userAgent']:
        user_agent = parse(agent)
        browser_family.append(user_agent.browser.family)

    ctr_browser_family = Counter(browser_family)
    plot_chart_from_ctr(ctr_browser_family)

    ctr_platform = Counter(analytics_data['platform'])
    plot_chart_from_ctr(ctr_platform)

    ctr_referrer = Counter(analytics_data['referrer'])
    plot_chart_from_ctr(ctr_referrer)

    ctr_width = Counter(analytics_data['width'])
    plot_chart_from_ctr(ctr_width)


if __name__ == '__main__':
    driver()

The questions

What are the most common browsers?

I would’ve guessed this to be the case for browsers, but not in Chrome’s favour to this extent 🙁

What are the most common platforms?

Win32 is still the majority platform among my visitors and there’s some Apple action going on. There’s also a healthy chunk of Linux x86_64 visitors. The armv7l and armv8l entries may contain Android devices, but it’s hard to tell from the platform alone.
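
One way to narrow this down might be to cross-check the stored user agent for the word Android. The following is only a sketch: looks_like_android is a hypothetical helper, and the two sample hits are invented strings shaped like typical user agents, not entries from my data.

```python
def looks_like_android(entry):
    """Guess whether a hit came from Android, using platform plus user agent."""
    platform = entry.get("platform", "")
    user_agent = entry.get("userAgent", "")
    # An ARM platform alone is ambiguous, so also require "Android" in the UA.
    return "arm" in platform.lower() and "Android" in user_agent


# Invented sample hits with the same keys the client script sends.
sample_hits = [
    {
        "platform": "Linux armv8l",
        "userAgent": "Mozilla/5.0 (Linux; Android 10; Pixel 3) AppleWebKit/537.36 "
                     "(KHTML, like Gecko) Chrome/83.0.4103.106 Mobile Safari/537.36",
    },
    {
        "platform": "Linux armv7l",
        "userAgent": "Mozilla/5.0 (X11; Linux armv7l) AppleWebKit/537.36 "
                     "(KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36",
    },
]

android_hits = [hit for hit in sample_hits if looks_like_android(hit)]
print(len(android_hits))  # 1
```

This is a heuristic at best, since user agents can be spoofed or trimmed by the browser.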

What are the most common referrers?

Unsurprisingly, most of the traffic comes through Google.

What are the most common screen widths?

There’s a healthy mix of screen resolutions with the most frequent being 360px.
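
Since exact pixel widths fragment into many small groups, it can help to bucket them before counting. This sketch assumes three size classes with cut-offs at 600px and 1024px; both the class names and the breakpoints are arbitrary choices for illustration, and the widths are made up.

```python
from collections import Counter


def width_bucket(width):
    """Map a viewport width in pixels to a coarse, hypothetical size class."""
    if width < 600:
        return "small"
    if width < 1024:
        return "medium"
    return "large"


# Made-up widths, not values from the real export.
sample_widths = [360, 360, 375, 768, 1366, 1920, 360]

buckets = Counter(width_bucket(w) for w in sample_widths)
print(buckets.most_common())  # [('small', 4), ('large', 2), ('medium', 1)]
```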

In closing

So that’s it for this little article. I’m happy with the outcome given how little effort went into this whole assignment. I hope you enjoyed reading it. As always, write me an email in case you have any comments!

Thank you for reading.

Concurrency In Python For Network I/O – Synchronous, Threading, Multiprocessing and Asynchronous IO

In this article, let’s look at some of the ways to do batch HTTP requests in Python and some of the tools at our disposal. Mainly, we’ll look at the following ways:

  1. Synchronous with requests module
  2. Parallel with multiprocessing module
  3. Threaded with threading module
  4. Event loop based with asyncio module

I was inspired to do this because I had a task at hand which required me to write code for a superset of this problem. As someone who can write basic Python, I implemented the most straightforward solution with for loops and the requests module. The task was to perform HTTP HEAD requests on a bunch of URLs, around 20,000 of them. The goal was to figure out which of the URLs in a bunch of markdown files are non-existent (404s, timeouts and so on). Averaging a couple of URLs a second on my home network and my personal computer, it would’ve taken a good 3-5 hours.
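
For context, collecting the URLs out of markdown files could look roughly like this. It is a sketch, not the code I used: the regular expression is deliberately simple, and extract_urls and urls_from_markdown_dir are hypothetical helper names.

```python
import re
from pathlib import Path

# Deliberately simple pattern: a URL ends at whitespace or a closing ), ] or >.
URL_PATTERN = re.compile(r"https?://[^\s)\]>]+")


def extract_urls(text):
    """Return the unique URLs found in a piece of markdown, in first-seen order."""
    return list(dict.fromkeys(URL_PATTERN.findall(text)))


def urls_from_markdown_dir(root):
    """Collect unique URLs from every .md file below root (assumed layout)."""
    found = {}
    for path in sorted(Path(root).rglob("*.md")):
        for url in extract_urls(path.read_text(encoding="utf-8")):
            found[url] = None
    return list(found)


sample = (
    "See [the docs](https://example.com/docs) and <https://example.org>. "
    "Again: https://example.com/docs"
)
print(extract_urls(sample))  # ['https://example.com/docs', 'https://example.org']
```

A pattern this naive will mangle URLs that legitimately contain parentheses, so treat it as a starting point.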

I knew there had to be a better way. The search for one led to me learning a few things and to this blog post. So let’s get started with each solution. Do note that depending on the hardware and network, the timings might go up and down quite a bit, but what’s interesting is the relative improvements we make going from one method to the next.

Synchronous code with requests module

import time
from helper.urls import urls
import requests


def make_request(url):
    try:
        res = requests.head(url, timeout=3)
        print(res.status_code)
    except Exception as e:
        print(e.__class__.__name__)


def driver():
    for url in urls:
        make_request(url)


def main():
    start = time.perf_counter()
    driver()
    end = time.perf_counter()
    print(f'Synchronous: {end - start:.2f}')


if __name__ == '__main__':
    main()

If you think about what you’d do if you had to check whether 10 URLs work, and write that process down as pseudocode, this approach is pretty much what you’d get. Single threaded and synchronous, just like a human.

This script fetches 800 random URLs from a different file, makes an HTTP HEAD request to each, and then times the whole operation using time.perf_counter().

I ran all the tests on a Raspberry Pi 3 running freshly installed Ubuntu 20.04 LTS, but I don’t intend on making this scientific or reproducible, so don’t take the results at face value and test it yourself. Better yet, correct me and tell me how I could’ve done it better!

Synchronous code took 536 seconds to hit 800 URLs

With that we have our baseline. 536 seconds for 800 URLs that we can use to compare our other methods against.

Multiprocessing with multiprocessing module

import time
from helper.urls import urls
import requests
from multiprocessing import Pool
from multiprocessing import cpu_count


def make_request(url):
    try:
        res = requests.head(url, timeout=3)
        print(res.status_code)
    except Exception as e:
        print(e.__class__.__name__)


def driver():
    with Pool(cpu_count()) as p:
        p.map(make_request, urls)


def main():
    start = time.perf_counter()
    driver()
    end = time.perf_counter()
    print(f'Multiprocessing: {end - start:.2f}')


if __name__ == '__main__':
    main()

My Raspberry Pi is a quad core board, so it can have 4 processes running in parallel. Python conveniently provides us with a multiprocessing module that can help us do exactly that. Theoretically, we should see everything done in about 25% of the time (4x the processing power).

Multiprocessing solution took 121 seconds

So a bit less than 25% but if I run both the scripts over and over again, the average converges to a ratio of roughly 4:1 as we’d expect.

Threading with threading module

import time
from helper.urls import urls
import requests
import threading


def make_request(url):
    try:
        res = requests.head(url, timeout=3)
        print(res.status_code)
    except Exception as e:
        print(e.__class__.__name__)


def driver():
    threads = []
    for url in urls:
        t = threading.Thread(target=make_request, args=(url,))
        threads.append(t)

    for t in threads:
        t.start()

    for t in threads:
        t.join()


def main():
    start = time.perf_counter()
    driver()
    end = time.perf_counter()
    print(f'Threading: {end - start:.2f}')


if __name__ == '__main__':
    main()

With threading, we essentially use just one process but offload the work to a number of threads that run concurrently (along with each other but not technically parallel).

Threading code runs in 41 seconds

Threading runs much faster than the multiprocessing, but that’s expected as threading is the right tool for network and I/O bound workload while multiprocessing suits CPU intensive workloads better.
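
The script above starts one thread per URL, which worked for 800 URLs but may get heavy for much longer lists. A bounded variant can be sketched with the standard library’s concurrent.futures.ThreadPoolExecutor. Note that this is not what I benchmarked: check_all is a hypothetical helper, max_workers=32 is an arbitrary guess, and fake_check stands in for make_request so the example runs without touching the network.

```python
from concurrent.futures import ThreadPoolExecutor


def check_all(urls, check, max_workers=32):
    """Run check(url) for every URL on a bounded pool of worker threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() keeps the input order and blocks until every result is in.
        return list(pool.map(check, urls))


def fake_check(url):
    """Stand-in for a real HEAD request; returns the URL and its length."""
    return (url, len(url))


results = check_all(["https://a.example", "https://bb.example"], fake_check, max_workers=2)
print(results)  # [('https://a.example', 17), ('https://bb.example', 18)]
```

In real use, check would be a function like make_request that returns a status instead of printing it.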

Short detour: visualizing threading and multiprocessing

If you look into your system monitor, or use htop like I have in the following images, you’ll see how multiprocessing differs from threading. On my dual core (with 4 threads) personal computer, multiprocessing creates four processes (with only the default one thread per process), while the threading solution creates a much larger number of threads all spawned from one process.


Code running the multiprocessing [above] and threading [below] solution as viewed through htop

This helped me better understand the difference between threads and processes on a high level and why threading solution is much faster for this particular workload.

Asynchronous code with asyncio module

import time
from helper.urls import urls
import httpx
import asyncio

async_client = httpx.AsyncClient()

async def make_request(url):
    try:
        res = await async_client.head(url, timeout=3)
        print(res.status_code)
    except Exception as e:
        print(e.__class__.__name__)


async def driver():
    await asyncio.gather(*[make_request(url) for url in urls])


def main():
    start = time.perf_counter()
    asyncio.run(driver())
    end = time.perf_counter()
    print(f'Async IO: {end - start:.2f}')


if __name__ == '__main__':
    main()

Finally, we reach Async IO, sort of the right tool for the job given its event driven nature. It was also what sparked my curiosity in the subject in the first place as it is quite fascinating, coming from the event driven JavaScript land, to find an event loop in Python. [Read: Callbacks And Event Loop Explained and Event Driven Programming]
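
To get a feel for the event loop without any network involved, here is a small sketch where asyncio.sleep stands in for the HTTP call. fake_request and the 0.2 second delay are made up for the demonstration.

```python
import asyncio
import time


async def fake_request(name, delay):
    """Stand-in for an HTTP call: waits without blocking the event loop."""
    await asyncio.sleep(delay)
    return name


async def run_all():
    # gather() runs all coroutines on the same event loop and collects
    # their results in the order they were passed in.
    return await asyncio.gather(*(fake_request(f"url-{i}", 0.2) for i in range(5)))


start = time.perf_counter()
names = asyncio.run(run_all())
elapsed = time.perf_counter() - start

print(names)
print(f"{elapsed:.2f}s")
```

Five waits of 0.2 seconds each would take about a second back to back. Here they overlap, so the total should land close to 0.2 seconds.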

Asynchronous IO code ran in 11 seconds

Clearly this task is made for this approach, and the code looks surprisingly simple to understand which is a plus.
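
Putting the four timings from this post side by side, the speedups relative to the synchronous baseline work out as follows. The numbers are the single runs reported above, so read them as rough ratios.

```python
# Seconds taken for 800 URLs, as reported in this post.
timings = {
    "synchronous": 536,
    "multiprocessing": 121,
    "threading": 41,
    "asyncio": 11,
}

baseline = timings["synchronous"]
speedups = {name: round(baseline / secs, 1) for name, secs in timings.items()}

for name, factor in speedups.items():
    print(f"{name}: {factor}x")
# synchronous: 1.0x
# multiprocessing: 4.4x
# threading: 13.1x
# asyncio: 48.7x
```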

In closing

That’s it for this little adventure in Python land. I sincerely hope you enjoyed it. If you find anything to correct or improve in the above text, please write to me. It will help me get better at Python (and writing).

Thank you for reading.

Backend Development With Flask

Context

I distinctly remember how much I liked writing backends (well, I tried). I also remember thinking, I will never become a frontend engineer. Ever. Like many software developers in my circle, I hated writing CSS, and JS hadn’t clicked until then. And these were the days before I knew anything about automated deployments. Deployment, for me, was spinning up a Digital Ocean droplet, installing everything manually, setting up the database (and copy pasting the db creds into code), and then running the development server and keeping it running. Stop laughing, please.

Motivations

So I decided to pick up some backend skills again. The main reason for this was that I’ll need a full-time job soon, and it would be much better if I get to write the full stack of the web instead of just the frontend. Secondly, knowing the full stack is a superpower that I wouldn’t want to be without. It comes in incredibly handy when working on personal projects that require the web as an enabler. Thirdly, I wanted to refresh my Python skills. I liked Python a lot back in the day, but I had lost touch over the last couple of years. So with those goals, I started with Python and Flask.

Python & Flask

I liked flask, but I’m still struggling with some basics, especially working with app instances for writing tests. Yes, there are a few differences working with the backend this time versus like three years ago. I’m following (or trying to follow) the best practices, writing tests for the code, I have a nice CI pipeline which starts with testing of the code and ends with deploying the app on Heroku. Most importantly, I have an excellent mentor for my backend adventures who’s a badass python programmer.

Flask is fun to work with. The framework is minimal, kinda like React. There’s a lot of support online, and there are great plugins already available for most common functionality. Databases are one of my weakest points in web engineering, and I’m trying to experiment with a lot of things on the model layer with SQLAlchemy and a Postgres backend. One more novelty for me was asynchronous programming. In JavaScript, you had to beg for things to be synchronous. But here you face a different problem: if something is slow, then the entire thread is blocked. For taking care of things that are slow, say sending an email, one could use Celery with a RabbitMQ backend. All of this is given to us ready-made by Heroku. So no more manual DevOps work, and fewer variables to worry about.

The other motive was to learn quality Python 3. In python, you have a pythonic and many non-pythonic ways of doing things. There’s no point in writing Python like C. I wish to learn the philosophy of the language so that I can make the right choices when deciding how to solve a problem. There’s nothing like writing clean and elegant code that others can appreciate. In the last week, I also got exposed to a lot of different data structures that the python library provides for specific use cases. In the right scenario, using an appropriate data structure can be the best optimization you can do to your code. I am looking forward to getting a hold of this as well.
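
As a small illustration of the data structure point, here are two containers from the standard collections module. The post doesn’t name which structures I ran into, so deque and defaultdict are just my picks for the example, and the values are made up.

```python
from collections import defaultdict, deque

# deque with maxlen keeps only the most recent items and drops the oldest.
recent = deque(maxlen=3)
for page in ["/a", "/b", "/c", "/d"]:
    recent.append(page)
print(list(recent))  # ['/b', '/c', '/d']

# defaultdict removes the "is the key there yet?" check when grouping.
words_by_length = defaultdict(list)
for word in ["flask", "python", "heroku", "redis"]:
    words_by_length[len(word)].append(word)
print(dict(words_by_length))  # {5: ['flask', 'redis'], 6: ['python', 'heroku']}
```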

Lastly, one thing that I never thought I’d do, but I’m doing, is trying to learn object-oriented programming. I had done some OO Python in the past, but it never clicked. Later, with JS, it was all about functional programming. Now, I’ve rediscovered OO programming and wish to relearn it, apply it and try to make it click. I like the contrasts in both paradigms of programming, and it could not be better explained than this StackOverflow answer.

  • Object-oriented languages are good when you have a fixed set of operations on things, and as your code evolves, you primarily add new things. This can be accomplished by adding new classes which implement existing methods, and the existing classes are left alone.
  • Functional languages are good when you have a fixed set of things, and as your code evolves, you primarily add new operations to existing things. This can be accomplished by adding new functions which compute with existing data types, and the existing functions are left alone.
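
The two bullets above can be sketched in Python, which allows both styles. The shapes and function names below are invented for the illustration and are not taken from the StackOverflow answer.

```python
import math


# Object-oriented: a new "thing" is a new class; existing classes stay untouched.
class Circle:
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2


class Square:
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2


# Functional: a new operation is a new function over the same plain data.
def area(shape):
    kind, size = shape
    if kind == "circle":
        return math.pi * size ** 2
    if kind == "square":
        return size ** 2
    raise ValueError(f"unknown shape: {kind}")


def perimeter(shape):  # added later, area() is left alone
    kind, size = shape
    if kind == "circle":
        return 2 * math.pi * size
    if kind == "square":
        return 4 * size
    raise ValueError(f"unknown shape: {kind}")


print(Square(2).area(), area(("square", 2)), perimeter(("square", 2)))  # 4 4 8
```

Adding a Triangle is one new class in the first style but an edit to every function in the second. Adding perimeter is one new function in the second style but an edit to every class in the first.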

Overall, I feel I’m becoming a little (very little indeed) more mature with programming. Instead of sticking to paradigms and trying to defend the one that I’m most comfortable with, I’m trying to see why those paradigms exist and what problems they are helping me solve. And since Python supports both object-oriented as well as functional programming, it will be fun to work on any such problems.

I will write some technical articles on the subject in the near future, once I feel confident enough. Just wanted to give you an update on what’s happening on my front in this one. Hope you found it useful. I’ll leave you with an interesting video about ‘Duck typing and asking forgiveness, not permission’ which is a design pattern in Python. Thank you for reading.
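
For readers who prefer code to video, here is a minimal sketch of the ‘asking forgiveness, not permission’ idea. The config dictionary, both function names and the fallback port 5000 are made up for the example.

```python
def port_lbyl(config):
    """Look before you leap: check everything first, then act."""
    value = config.get("port")
    if isinstance(value, str) and value.isdigit():
        return int(value)
    return 5000


def port_eafp(config):
    """Ask forgiveness: just try it, and handle the failure if it happens."""
    try:
        return int(config["port"])
    except (KeyError, ValueError, TypeError):
        return 5000


print(port_eafp({"port": "8080"}), port_eafp({}), port_eafp({"port": "abc"}))  # 8080 5000 5000
```

The second version also accepts anything int() can convert, such as an integer 8080, without an extra type check. That is the duck typing half of the idea.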

Advice From An Old Programmer – Zed Shaw

The first article on my site is about me starting with Python. That was four years ago and around that time I had read this book called ‘Learn Python The Hard Way’. I really, really liked it. It was amongst one of those earlier pieces of memories that I’d probably never forget for my entire life, similar to the kind of impact reading The Hacker Manifesto and Sir Eric Raymond’s ‘How To Become A Hacker’ had on me.

I decided to give the Python 3 edition of Learn Python The Hard Way a read. The last section of the book is titled ‘Advice from an old programmer’ and in that, Zed Shaw shares with us some of his zoomed-out thoughts on programming and the career one makes out of it. Although it is very subjective and very blunt, just like the rest of the book (and I really like the rawness in his writing), for me personally it refreshed the old memories associated with the book.

I had read this exact chapter in the previous edition, but this time it made so much more sense. And not just this chapter, but in the entire book, the subtle pieces of well targeted humor and strong opinions held by the author were something of a delight to read even if you didn’t believe in the exact same thing.

I’m copy pasting the section of that book that I think I’ll come back to and re-read again and again. I think many of you will appreciate it as well.

Advice From An Old Programmer

You’ve finished this book and have decided to continue with programming. Maybe it will be a career for you, or maybe it will be a hobby. You’ll need some advice to make sure you continue on the right path and get the most enjoyment out of your newly chosen activity.

I’ve been programming for a very long time. So long that it’s incredibly boring to me. At the time that I wrote this book, I knew about 20 programming languages and could learn new ones in about a day to a week depending on how weird they were. Eventually, though, this just became boring and couldn’t hold my interest anymore. This doesn’t mean I think programming is boring, or that you will think it’s boring, only that I find it uninteresting at this point in my journey.

What I discovered after this journey of learning is that it’s not the languages that matter but what you do
with them. Actually, I always knew that, but I’d get distracted by the languages and forget it periodically.
Now I never forget it, and neither should you.

Which programming language you learn and use doesn’t matter. Do not get sucked into the religion
surrounding programming languages as that will only blind you to their true purpose of being your tool
for doing interesting things.

Programming as an intellectual activity is the only art form that allows you to create interactive art. You
can create projects that other people can play with, and you can talk to them indirectly. No other art form
is quite this interactive. Movies flow to the audience in one direction. Paintings do not move. Code goes
both ways.

Programming as a profession is only moderately interesting. It can be a good job, but you could make
about the same money and be happier running a fast food joint. You’re much better off using code as
your secret weapon in another profession.

People who can code in the world of technology companies are a dime a dozen and get no respect.
People who can code in biology, medicine, government, sociology, physics, history, and mathematics
are respected and can do amazing things to advance those disciplines.

Of course, all of this advice is pointless. If you liked learning to write software with this book, you should try
to use it to improve your life any way you can. Go out and explore this weird, wonderful, new intellectual
pursuit that barely anyone in the last 50 years has been able to explore. Might as well enjoy it while you
can.

Finally, I’ll say that learning to create software changes you and makes you different. Not better or
worse, just different. You may find that people treat you harshly because you can create software, maybe
using words like “nerd.” Maybe you’ll find that because you can dissect their logic they hate arguing
with you. You may even find that simply knowing how a computer works makes you annoying and weird
to them.

To this I have just one piece of advice: they can go to hell. The world needs more weird people who know
how things work and who love to figure it all out. When they treat you like this, just remember that this is
your journey, not theirs. Being different is not a crime, and people who tell you it is are just jealous that
you’ve picked up a skill they never in their wildest dreams could acquire.

You can code. They cannot. That is pretty damn cool.

Beautiful, isn’t it? Thank you for reading!

Social Share Counts Python Implementation

I wrote this little script that grabs the count of shares on popular social networks, using their APIs. I have listed the documentation on the project page on Github. I will copy the relevant pieces of readme here.

Social Network APIs

Facebook

Request:
https://graph.facebook.com/?id=https://www.github.com

Response:

{
   "id": "https://www.github.com",
   "shares": 31684
}

Twitter

Request:
https://cdn.api.twitter.com/1/urls/count.json?url=https://github.com

Response:

{"count":14,"url":"http://github.com/"}

Google Plus

Request:
https://plusone.google.com/_/+1/fastbutton?url=https://github.com

This returns the +1 button. I extracted the counts using the regex window.__SSR = {c: ([\d]+)
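
Applied with Python’s re module, the extraction could look like the sketch below. The HTML fragment is invented to match the shape the regex expects, and the dot and brace are escaped so they match literally.

```python
import re

# Invented fragment shaped like what the regex expects; the real page is far larger.
sample_html = '<script>window.__SSR = {c: 1234.0 ,si: "..."};</script>'

match = re.search(r"window\.__SSR = \{c: (\d+)", sample_html)

# Fall back to 0 when the marker is missing from the page.
count = int(match.group(1)) if match else 0
print(count)  # 1234
```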

LinkedIn

Request:
https://www.linkedin.com/countserv/count/share?url=https://github.com&format=json

Response:

{"count":0,"fCnt":"0","fCntPlusOne":"1","url":"https://github.com"}

StumbleUpon

Request:
https://www.stumbleupon.com/services/1.01/badge.getinfo?url=https://github.com

Response too large.

Pinterest

Request:
https://api.pinterest.com/v1/urls/count.json?url=https://github.com

Response:

receiveCount({"url":"https://github.com","count":0}) 
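
This response is JSONP, so the JSON has to be unwrapped from the callback before parsing. A small sketch of that step, where parse_jsonp is a hypothetical helper name:

```python
import json
import re


def parse_jsonp(text):
    """Strip a JSONP wrapper like callback({...}) and parse the JSON inside."""
    match = re.fullmatch(r"\s*[\w.$]+\((.*)\)\s*;?\s*", text, flags=re.DOTALL)
    if match is None:
        raise ValueError("not a JSONP response")
    return json.loads(match.group(1))


payload = parse_jsonp('receiveCount({"url":"https://github.com","count":0})')
print(payload["count"])  # 0
```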

Reddit

Request:
https://www.reddit.com/api/info.json?url=https://github.com

This returns a lot of data, which can easily be used to extract counts.
Note that the Reddit guys don’t really like automated requests and may cause the call to return HTTP 429.

Vkontakte

Request:
https://vk.com/share.php?act=count&index=1&url=https://github.com

Response:

VK.Share.count(1, 419);

This marks the end of the documentation. I hope that serves some purpose. The reason I wrote this was that I wanted social sharing buttons on this blog. I could not afford any of the mainstream buttons like Addthis and Sharethis. They kill the load time, and I’m already struggling with Disqus and Adsense overload. More on that soon. I hope to create a ‘dynamic looking’ static set of buttons for this blog, and I’ll update you if I get time to complete it.

A Day Of Struggle With Python IDEs

Yesterday, I gave up on doing my next web application project with Qt. I knew C++ was never meant to be a language of the web, but I really had some hopes for Qt. It actually is good, and apparently it would have run faster than any other platform or language for the web. The problem is that developing anything in it, especially web apps, is very time-consuming. I really don’t have all that time. So I decided to do it in Python or Ruby. After reading some articles, it became clear that they are not much different: Python is a more general programming language, while Ruby leans more towards the web with its Rails framework.

I chose Python, simply because I already knew how to write code in it. It was time to go shopping for some good frameworks to develop this thing. No doubt, finally I had to decide between Django and Flask. I chose Flask. It was damn simple, or at least it seemed like that. I tried the simple hello.py script which displays “hello world” on localhost port 5000.

I tried that in emacs, but immediately felt the need for an IDE in this foreign land. I looked in my ‘Downloads’ folder, and luckily there was Aptana Studio 3 still sitting there. I used to have it installed when I was into PHP last year. Since then, it got removed and thrown into a corner. I installed it. I really loved Aptana back then, for its usability. But now, it started to act like a stubborn child, refusing to detect Flask. I googled and googled, but alas, no way. Many people seemed to be having the same problem, and the only solution I saw didn’t work.

Seeing no way, I uninstalled Aptana, and googled for other good IDEs. PyCharm was what most were recommending. I decided to give it a try. Turned out, it was a memory hogger. Both my CPUs were doing a constant 100% and other windows turned sluggish too. About half a tonne of RAM was what it was utilizing, with a single .py file open with 4 lines of text in it. No way, again. Removed it, and went to eat some food. Damn.

I was not ready to go to the sluggish Eclipse again, nor Netbeans for the same reason. Finally I settled for Komodo Edit, free lite version of the commercial Komodo IDE. It lacks many things that you will ask for in an IDE, and it is a little better than using bare emacs or vim. Still, for now, I am using it. Configured it to execute python script right inside the window following this tutorial, https://stackoverflow.com/questions/21686395/how-to-run-the-first-python-program-in-komodo-edit-8-5

Life’s good, but just hoping to learn flask for my next project as fast as I can.

Simple group messenger/chat in Python

Aaaaannd here is another post of mine on sockets, client/server models and stuff. I suppose you have started to get bored by the same stuff every day, right? Seems like I just can’t get enough of this thing. But trust me, I had no intention of doing this today. A friend of mine instigated me to write a messenger for a project we were working on, and it turned out that writing this simple messaging program was more interesting than I thought.

So, instead of creating a straight forward chat program that was ‘actually’ required by my friend, I created this ‘one-server-multiple-clients’ program, which is more like the real world chat apps we use everyday.

To run the program, execute the ‘server.py’ (after changing the bind() address to the address of the server on which it is supposed to run). Then, execute the ‘client.py’ (guess what? same thing here. But you will have to add the server address in the connect() function here). Of course, you can have all of them running on the same system, but there ain’t any fun in it, right?

Last but not the least, I have commented as much as I could, unlike my last article, so I expect everyone reading this, with a bit of programming knowledge, will understand this straight forward code.

server.py

# Import the necessary libraries
import socket
import sys
import select
# Close a client socket and forget about it (safe to call twice).
def drop(sock):
  sock.close()
  if sock in LIST:
    LIST.remove(sock)
# Take message from a host and send it to all others
def shout(sock, message):
  # Loop over a copy, because drop() may shrink LIST while we iterate.
  for client in list(LIST):
    try:
      # Don't send it back to server and yourself!
      if client != serv and client != sock:
        client.send(message)
    except socket.error:
      # Assume client has got disconnected and remove it.
      drop(client)
# Declare variables required later.
# To store list of sockets of clients as well as server itself.
LIST = []
# Common buffer for all purposes
buff = 1024
# Declaration of Server socket.
serv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
serv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
serv.bind(("192.168.1.10", 1356))
# Queue up to 6 pending connections. Increase if you need more.
serv.listen(6)
# Add server socket to the LIST
LIST.append(serv)
while 1:
  # Monitor all clients simultaneously
  reads, writes, err = select.select(LIST, [], [])
  for sock in reads:
    # A new client connected?
    if sock == serv:
      sockfd, addr = serv.accept()
      LIST.append(sockfd)
    # Naah, just a new message!
    else:
      try:
        # Get his shitty message.
        data = sock.recv(buff)
      except socket.error:
        # Shit just got real. Client kicked by server :3
        drop(sock)
        # Do this till the end of time.
        continue
      if data:
        # If he wrote something, send it to shout() function for broadcast.
        shout(sock, data)
      else:
        # An empty read means the client hung up.
        drop(sock)
serv.close()

client.py

# Import the necessary libraries
import socket
import select
import sys

# Socket variable declaration
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.settimeout(2)

# Connect to server. Change this for remote servers.
s.connect(("192.168.1.10", 1356))

# A prompt asking client to enter something.
sys.stdout.write(">")
sys.stdout.flush()

while 1:
  # These are the possible events.
  # sys.stdin --> Client has typed something through keyboard.
  # s --> Server has relayed a new message from some other client.
  streams = [sys.stdin, s]

  # Monitor both the streams simultaneously for inputs.
  readable, writable, err = select.select(streams, [], [])

  # If server has sent something, readable will fill up.
  for sock in readable:
    if sock == s:

      # Receive data in our variable. Check if it is empty.
      data = sock.recv(1024)
      if not data:
        sys.exit()
      else:

        # Sockets carry bytes, so decode before writing to stdout.
        # Then give client prompt back.
        sys.stdout.write(data.decode())
        sys.stdout.write(">")
        sys.stdout.flush()

    # No, it's not the server. Our client has typed something in.
    else:

      # Read message. Encode it to bytes and send it to server.
      # Give prompt back to client.
      msg = sys.stdin.readline()
      s.send(msg.encode())
      sys.stdout.write(">")
      sys.stdout.flush()

So that was the code. Here is a glimpse of the output. The server is running off my ‘raspberrypi’ and all the clients are running on my computer. Looks cool, right?

If you look closely, one of the clients missed a message sent by another client, LOL, so expect that to happen now and then. If you have any suggestions, edits or corrections, drop them in the comments below. 🙂
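Since both scripts lean on select.select(), here is a minimal sketch of it that runs on its own. It is not part of the chat code: it uses socket.socketpair() (two sockets already connected to each other) so that no server is needed, and the names left and right and the 0.1 second timeout are made up for the demo.

```python
import select
import socket

# Two connected sockets in one process, standing in for a client and a server.
left, right = socket.socketpair()

# Nothing has been sent yet, so select() times out and reports nothing readable.
readable, _, _ = select.select([right], [], [], 0.1)
nothing_ready = (readable == [])

# Once one side sends, the other side shows up in the readable list.
left.send(b"hello")
readable, _, _ = select.select([right], [], [], 0.1)
message = readable[0].recv(1024)

# After the sender closes, the socket becomes readable again, but recv()
# returns empty bytes. This is what the client's "if not data" check looks for.
left.close()
readable, _, _ = select.select([right], [], [], 0.1)
closed_read = readable[0].recv(1024)
right.close()

print(nothing_ready, message, closed_read)
```

The point to take away is that "readable" does not always mean "has a message": a closed connection is reported as readable too, and the empty read is the only hint you get.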

Python vs C – How simple is it to write a pair of communicating sockets?

Lately I have been reading a lot of articles online that compare Python to other languages. It is no secret that the Python community is growing, and along with it the number of people who promote or recommend the language, of course.

I don't want to add to the already large pile of those articles by boasting about Python’s usability, speed and practicality. Instead, I will compare the two languages by writing a small socket client/server pair in each of them.

But first, let me give you some of my personal opinions about both languages, since I know them well enough. C is very dear to me, not only because it was the first language I ever learnt, but also because most of GNU is written in it, and GNU is, well, very dear to me! C is also my second language of choice, after Python (I know bits of Java, but I prefer not to use it; not sure why, but I hate it).

I have been programming in Python for only the last couple of months and I am really impressed. I solve HackerEarth and CodeChef problems as a pastime. Although I could have done all of those problems in C, doing them in Python took like 1/10th of the time (literally!) and 1/10th of the typing effort. I will admit C is much more fun to write than Python, simply because you ‘feel’ the code is yours, and I love to code in C whenever I am free. But will I use it in an environment where time is the priority? Probably not. Maybe when C is the only way out, but most of the time I am better off writing it in Python.

That being said, C is not getting any less popular, and it is going to stay that way for as long as, well, maybe the Internet itself. Here’s something I found.

https://www.tiobe.com/index.php/content/paperinfo/tpci/index.html
You see the thing on top there? Yes, it is there for a reason. To make it short, C is powerful, very powerful. C gives you access to things you cannot really imagine in other languages. On the other hand, Python is practical, flexible, and easy to learn. Web apps, sockets, Raspberry Pi, Arduino, Android or anything else you can imagine, there has to be a library made for it by someone, somewhere.

The code part.

I am giving the client and server code in both languages here as is. No explanations and stuff, because that’s not the topic here. Note that all the source code was tested and runs OK on Kali 1.0.6 with gcc and all stock stuff, so it should not be much trouble to get it running. Windows users, find your gcc directory and run it from the command prompt. It won’t run from any IDE.

Python

Writing a pair of communicating TCP sockets takes around 30 minutes, including the time to understand it, if you have some background in networking. Python does most of the work for you: you just create a socket variable, supply the host and port, and that’s it. The rest is left to your imagination (or not, I got too carried away!). Here comes the code:

client.py

import socket

s = socket.socket()
host = socket.gethostname()
port = 1356
s.connect((host, port))
# recv() returns bytes, so decode before printing.
message = s.recv(1024)
print(message.decode())
s.close()

server.py

import socket

s = socket.socket()
host = socket.gethostname()
s.bind((host, 1356))
s.listen(5)
while True:
    c, addr = s.accept()
    # send() expects bytes, hence the b prefix.
    c.send(b"Message from server")
    c.close()

And that is it. Even if it looks lame (which it is), it is maybe the simplest thing that qualifies to be called a server/client.
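If juggling two terminals feels like too much for a quick test, the same exchange can be tried in a single file. This is only a sketch under my own assumptions: the server runs in a background thread, it handles exactly one client, and it binds to port 0 on 127.0.0.1 so the OS picks a free port instead of 1356. The names serve_once, worker and chunks are made up for the demo.

```python
import socket
import threading

def serve_once(server_sock):
    # Accept a single client, send the greeting and hang up.
    conn, addr = server_sock.accept()
    conn.sendall(b"Message from server")
    conn.close()

# Port 0 asks the OS for any free port, so this never clashes with 1356.
server_sock = socket.socket()
server_sock.bind(("127.0.0.1", 0))
server_sock.listen(1)
port = server_sock.getsockname()[1]

worker = threading.Thread(target=serve_once, args=(server_sock,))
worker.start()

# The client side: connect, then read until the server closes the connection.
client = socket.socket()
client.connect(("127.0.0.1", port))
chunks = []
while True:
    chunk = client.recv(1024)
    if not chunk:
        break
    chunks.append(chunk)
client.close()
received = b"".join(chunks)

worker.join()
server_sock.close()
print(received.decode())
```

Reading in a loop until recv() returns nothing is a bit more careful than the single recv(1024) in client.py, because TCP does not promise that a message arrives in one piece.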

C

Now let’s write the same in C. This is around 4 times the size of the Python code, and much of the work is done by hand (nothing new for C, I suppose). This code is the shortest I could cut it down to, and it does just one simple task: it sends the “Client talking loud!\n” message to the server over port 1356 on localhost. The parameters can be edited to suit testing across a network, but that’s the most this code will do. Nevertheless, this is a TCP client/server model.

client.c

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
#include <unistd.h>

int main(int argc, char **argv) {
    int sock, port;
    struct sockaddr_in serv_addr;
    struct hostent *server;
    char buffer[256];

    port = atoi("1356");
    sock = socket(AF_INET, SOCK_STREAM, 0);
    server = gethostbyname("127.0.0.1");
    bzero((char *)&serv_addr, sizeof(serv_addr));
    serv_addr.sin_family = AF_INET;
    bcopy((char *)server->h_addr, (char *)&serv_addr.sin_addr.s_addr, server->h_length);
    serv_addr.sin_port = htons(port);
    if (connect(sock, (struct sockaddr *)&serv_addr, sizeof(serv_addr)) < 0) {
        perror("connect");
        return 1;
    }
    bzero(buffer, 256);
    strcpy(buffer, "Client talking loud!\n");
    write(sock, buffer, strlen(buffer));
    close(sock);
    return 0;
}

server.c

#include <stdio.h>
#include <strings.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

int main(int argc, char **argv) {
    int sock, nsock, port;
    socklen_t clilen;
    char buffer[256];
    struct sockaddr_in serv_addr, cli_addr;

    sock = socket(AF_INET, SOCK_STREAM, 0);
    bzero((char *)&serv_addr, sizeof(serv_addr));
    port = atoi("1356");
    serv_addr.sin_family = AF_INET;
    serv_addr.sin_addr.s_addr = INADDR_ANY;
    serv_addr.sin_port = htons(port);
    if (bind(sock, (struct sockaddr *)&serv_addr, sizeof(serv_addr)) < 0) {
        perror("bind");
        return 1;
    }
    listen(sock, 2);
    clilen = sizeof(cli_addr);
    nsock = accept(sock, (struct sockaddr *)&cli_addr, &clilen);
    bzero(buffer, 256);
    read(nsock, buffer, 255);
    printf("%s\n", buffer);
    close(sock);
    close(nsock);
    return 0;
}

Here is the expected output:

Sorry, there are no comments in the above code, and it really needs some explanation. I would’ve written them, but then the code would have grown threefold (LOL)! It will take another article to explain everything in client.c and server.c. I will conclude here. Thank you for reading 🙂

Update: If you happen to run any of the above code, make sure you run the server first!
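To see why the order matters, here is a small Python sketch that tries to connect before and after something is listening. It is an illustration, not part of the original programs: the helper can_connect, the 2 second timeout and the OS-chosen port are all my own assumptions.

```python
import socket

def can_connect(host, port):
    # Try a TCP connection and report whether anything answered.
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(2)
    try:
        s.connect((host, port))
        return True
    except OSError:
        # Covers a refused connection and a timeout alike.
        return False
    finally:
        s.close()

# Grab a free port, then close the socket so nothing is listening on it.
probe = socket.socket()
probe.bind(("127.0.0.1", 0))
free_port = probe.getsockname()[1]
probe.close()
before = can_connect("127.0.0.1", free_port)

# Now start a listener on that same port and try again.
server = socket.socket()
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(("127.0.0.1", free_port))
server.listen(1)
after = can_connect("127.0.0.1", free_port)
server.close()

print(before, after)
```

With no server running, connect() normally fails straight away with "connection refused", which is the error you would hit by starting a client first.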

Object Oriented Programming: Concepts and explanation using Python

Object oriented programming, or OOP, is a new concept for me. Till now, all I had done was C and some scripting languages, all of them ‘procedural’ languages. I knew nothing about OOP, and I really hate knowing nothing about anything. So I decided to try it out. I could have started with C++ or Java, but I chose Python. The reason is its flexibility and, more importantly, that it is a language of choice when it comes to creating simple utilities and exploits, which is a field of interest of mine.

So it went well. I studied Python in both the procedural and the object oriented way. I am still a beginner in Python, but I studied the OOP concepts well. Then my college started and I had a subject called OOPM, which deals with the same OOP stuff plus Java. The teacher who teaches us the subject, I don’t really know what is wrong with her, but her examples are confined to banks and ATMs. That is pretty funny and annoying. One example is good, two are okay, but when someone spends an entire week on the same examples to teach something like OOP, your brain dies.

I have attended all the lectures and I am still trying to figure out what exactly she teaches during them. Just a reminder, I have recently finished with OOP concepts in Python, so it shouldn’t be that hard for me. I can only imagine the state of all those who are actually doing it for the first time.

Sick of it, I am writing this post on OOP concepts and will try my best to keep the terms and explanation simple and practical.

Starting with Object Oriented Programming 

OOP, as the name suggests, is Object Oriented, meaning that objects are the significant em…objects in it. But what exactly are objects? I won’t answer that right away, because it requires some understanding first. For now, treat everything, from variables to functions and everything else, as an object. First we will see some of the terms used.

Class

A class is like a code template for creating objects of a similar type that differ in their details. For example, if I had to classify the Animal Kingdom again, I would create a class called Animals, which would be the parent class.

    class Animals(object):
        pass

    dog = Animals()
    dog.legs = 4

    cat = Animals()
    cat.legs = 4

And to use the class, we simply create an instance of it, which can be called an object as well. dog and cat are the instances here, for example. Since the class is empty (see the ‘pass’), we can set attributes directly on the instances. ‘legs’ is an attribute of the objects dog and cat, which are instances of Animals.
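A quick way to convince yourself that each instance carries its own attributes is to set one on dog only and then look for it on cat. The ‘name’ attribute and the value "Rex" below are made up for this sketch.

```python
class Animals(object):
    pass

dog = Animals()
dog.legs = 4
dog.name = "Rex"  # hypothetical attribute, set on dog only

cat = Animals()
cat.legs = 4

# Both are instances of the same class...
both_animals = isinstance(dog, Animals) and isinstance(cat, Animals)
# ...but they are separate objects with separate attributes.
same_object = dog is cat
cat_has_name = hasattr(cat, "name")

print(both_animals, same_object, cat_has_name)  # True False False
```

Setting attributes from outside like this works, but it gets messy fast, which is why classes usually set them up in one place.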

Object

Objects are all around you. Take a look. Everything that can have an individual significance can be considered an object. All objects have two characteristics: state (like color or size) and behavior (the tasks they do, like eating or sleeping). In programming, objects are very similar. They have state (stored in variables) and they have behavior (provided by methods). In the above ‘class’ example, cat and dog are objects. Let me write something that will make it clear.

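    class Foobar(object):
        def __init__(self, x, y):
            # State: the two sides of the rectangle.
            self.x = x
            self.y = y

        def area(self):
            # Behavior: computed from the object's own state.
            return self.x * self.y

    rectangle1 = Foobar(4, 6)
    area = rectangle1.area()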

So rectangle1 is the object I created. I use the method area to find the area of that rectangle.
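The state and behavior idea is easier to see when the behavior actually changes the state. Here is a small sketch with a made-up Pet class (the names Pet, energy, play and sleep, and the numbers, are all just for illustration).

```python
class Pet(object):
    def __init__(self, name, energy):
        # State: every pet remembers its own name and energy level.
        self.name = name
        self.energy = energy

    def play(self):
        # Behavior that uses up some energy.
        self.energy -= 3

    def sleep(self):
        # Behavior that restores some energy.
        self.energy += 5

tom = Pet("Tom", 10)
jerry = Pet("Jerry", 10)

tom.play()
jerry.sleep()

# Same class, same starting state, different state after different behavior.
print(tom.energy, jerry.energy)  # 7 15
```

Both objects come from the same class, yet each one keeps track of its own energy.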

Inheritance

Inheritance means taking an existing class and creating a new class from it. The new class inherits, that is, gets all the features of, the old class and can add some new features of its own. So, if one were to organize reptiles and cats separately, one could do this.

    class Reptile(Animals):
        pass

    class Cat(Animals):
        pass

This creates classes which can be called daughter classes (more commonly, child classes or subclasses) of the parent class Animals. They inherit all the properties of Animals, plus they can have new properties of their own. It results in code reuse, but read any good book and it will tell you why you should avoid inheritance. Since this is not a guide, I won’t.
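Here is a minimal sketch of what “inherits everything plus new properties” looks like once the parent actually has something to inherit. The describe and speak methods and the variable names are my own additions for illustration.

```python
class Animals(object):
    def __init__(self, legs):
        self.legs = legs

    def describe(self):
        # type(self).__name__ is the name of the actual class of the object.
        return "%s with %d legs" % (type(self).__name__, self.legs)

class Reptile(Animals):
    # Nothing new here, everything comes from Animals.
    pass

class Cat(Animals):
    # A new feature that only cats have.
    def speak(self):
        return "Meow"

lizard = Reptile(4)
kitty = Cat(4)

print(lizard.describe())         # Reptile with 4 legs
print(kitty.describe())          # Cat with 4 legs
print(kitty.speak())             # Meow
print(hasattr(lizard, "speak"))  # False
```

Neither Reptile nor Cat defines describe or __init__, yet both can use them, and speak stays private to Cat.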

Methods

When you write a function inside a class, it is called a method. Why? I don’t know. Some books I have read insisted on calling them functions, while some didn’t. It’s up to you. For the sake of an example, I will write a class ‘Shapes’ with daughter classes, and methods in them for finding their respective areas.

    import math

    class Shapes(object):
        pass
       
    class Circle(Shapes):
        def area(self, radius):
            return radius * radius * math.pi
           
    class Triangle(Shapes):
        def area(self, base, height):
            return 0.5 * base * height

The ‘area’ method in each class calculates the area of that particular shape and returns it. You need to call it with arguments. For example,

    import math

    class Shapes(object):
        pass
      
    class Circle(Shapes):
        def area(self, radius):
            return radius * radius * math.pi
          
    class Triangle(Shapes):
        def area(self, base, height):
            return 0.5 * base * height
          
    mytriangle = Triangle() # Creating an instance
    area_of_triangle = mytriangle.area(5,10) # 5, 10 are arguments

If you are wondering, the first parameter ‘self’ is the object itself.
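If that sounds abstract, this little sketch shows it in action. Calling a method on an object is the same as calling the function on the class and passing the object in yourself. The ‘me’ method is a made-up helper that only exists to hand ‘self’ back.

```python
class Triangle(object):
    def area(self, base, height):
        return 0.5 * base * height

    def me(self):
        # Hypothetical helper: returns whatever self is.
        return self

mytriangle = Triangle()

# The usual way: Python passes mytriangle as self behind the scenes.
usual = mytriangle.area(5, 10)
# The long way: call the function on the class and pass the object yourself.
explicit = Triangle.area(mytriangle, 5, 10)

print(usual, explicit)                # 25.0 25.0
print(mytriangle.me() is mytriangle)  # True
```

That is also why every method needs ‘self’ as its first parameter, even though you never pass it when calling the method on an object.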

So that was my brief scattered explanation on OOP concepts. I am pretty sure there are some mistakes in the above snippets of code. Please let me and others know the mistakes so I can correct them.